Hi Bob,
> There are about 6000 entries in one of the documents. When
> accessed as follows:
> fn:doc("lookup.xml")/document/entry[code = $code]/name
> profiling shows 6000 code = $code tests are executed. Since
> there are hundreds of lookups while processing a single
> document, a significant amount of time is devoted to this test.
Adding a range index won't help if all 6000 entries are in just one fragment.
The index would return the document node, and the evaluator would still have
to walk the whole tree to find the matching entries.
I have been mailing with David Lee offline. He was facing a very similar
problem, and both his and yours resemble a few of my own. You could follow
his approach, which simply means setting a fragment root on 'entry' and then
adding a range index on 'code' to optimize the predicate.
My approach would have been to split the entire document into 6000 documents,
each holding one entry. You can put them in a separate directory or in a
collection. Then you could do something like: //entry[code = $code]/name, or
prepend this with a call to collection(...).
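
A rough sketch of that split, to be run once as a separate update request.
The URI pattern and the collection name "lookup-entries" are made up for
illustration, not something from your setup:

```xquery
xquery version "1.0-ml";

(: Sketch: store every entry of lookup.xml as its own document.
   The URI pattern and the "lookup-entries" collection are assumptions. :)
for $entry at $i in fn:doc("lookup.xml")/document/entry
return
  xdmp:document-insert(
    fn:concat("/lookup/entries/", $i, ".xml"),  (: hypothetical URI scheme :)
    $entry,
    xdmp:default-permissions(),
    "lookup-entries")
```

After that, the lookup would read something like
fn:collection("lookup-entries")/entry[code = $code]/name.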
If you still have the feeling that it is performing suboptimally (you should
get subsecond results, more like one hundredth of a second at the most,
depending on your hardware), try writing the expression as a cts:search. That
way you depend less on the query optimizer.
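
Something along these lines, as a sketch only. It assumes each entry is its
own document (or fragment) in a collection that I call "lookup-entries"
here, and that $code is passed in as a string; adjust both to your situation:

```xquery
xquery version "1.0-ml";

(: Assumption: the caller supplies the code to look up. :)
declare variable $code as xs:string external;

(: Let the index pick the matching entry fragments instead of testing
   code = $code against each entry. "lookup-entries" is a hypothetical
   collection name. :)
cts:search(
  fn:collection("lookup-entries")/entry,
  cts:element-value-query(xs:QName("code"), $code, "exact")
)/name
```

If you go for the range index on 'code' instead, a cts:element-range-query
with the "=" operator should serve the same purpose.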
> As a workaround I've been able to significantly reduce
> processing time by loading the lookup document entry nodes
> into a map keyed by the code value and declaring that map as
> a variable at the top of my module, but I'm concerned that
> this map gets loaded only at server startup and any updates
> to the lookup.xml document would require a server reboot.
> While I could code an "init" function to reload the maps I'd
> rather avoid this complexity if there is a built in way that
> I can avoid the 6000 tests per value resolution.
I am pretty certain that even though query modules can be cached, they are
executed from beginning to end on each request, which means the map is loaded
into memory at each call. If this is meant to optimize loading or updating
data, this approach will do just fine, I guess. If you are searching, you want
as little overhead as possible, so I think you would be better off optimizing
your expression.
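
For reference, this is roughly how I picture your workaround. It is my guess
at the shape, not your actual code; the variable name $names is invented and
I assume every entry has exactly one code and one name:

```xquery
xquery version "1.0-ml";

(: Assumption: the caller supplies the code to look up. :)
declare variable $code as xs:string external;

(: Build a code -> name map from lookup.xml. As far as I know this
   initializer runs whenever the module is evaluated, so it would also
   pick up changes to lookup.xml. :)
declare variable $names as map:map :=
  let $m := map:map()
  let $_ :=
    for $entry in fn:doc("lookup.xml")/document/entry
    return map:put($m, fn:string($entry/code), fn:string($entry/name))
  return $m;

map:get($names, $code)
```

Within one request, every further lookup is then a map:get rather than 6000
comparisons, which fits the speedup you are seeing.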
Kind regards,
Geert
Drs. G.P.H. Josten
Consultant
http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984