I've run into an interesting case of a very slow query.
The DB holds about 600,000 fragments (in about 6000 files).
These are small fragments (about 300 bytes) with about 10 very short
elements containing a short string or nothing.
Most searches return in about 100ms, but this one takes
about 60 seconds:

    cts:search(
      xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO,
      cts:element-query(
        xs:QName("STR"),
        cts:word-query("ENG",
          ("case-insensitive", "diacritic-sensitive",
           "punctuation-insensitive", "whitespace-insensitive",
           "unstemmed", "wildcarded"))))[1 to 10]

What I think is going on here is that the term "ENG" occurs in every
single fragment (it's a language code), so the search finds all 600,000
fragments as candidates. But I'm constructing the search to match only
within "STR" elements, and none of those contain "ENG".
My guess as to what is happening is that ML finds a "hit" in every
fragment, but then has to open each fragment and search it to discover
that the hit was in the wrong element. The result is the empty
sequence, but it takes a minute to get there.
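
A rough way to test that guess is to compare the index-only match count
with the filtered result. The sketch below assumes xdmp:estimate, which
counts the fragments the indexes select without opening any of them; the
variable name $query is only for illustration, and the query itself is
the same one as in the search above.

```xquery
xquery version "1.0-ml";

(: Same query as in the slow search, bound to a variable
   (the name $query is illustrative). :)
let $query :=
  cts:element-query(
    xs:QName("STR"),
    cts:word-query("ENG",
      ("case-insensitive", "diacritic-sensitive",
       "punctuation-insensitive", "whitespace-insensitive",
       "unstemmed", "wildcarded")))
return
  (: xdmp:estimate resolves the search from the indexes alone and
     returns a fragment count; no fragment is opened or filtered. :)
  xdmp:estimate(
    cts:search(xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO, $query))
```

If this returns something close to 600,000 while the filtered search
returns the empty sequence, then the minute is most likely being spent
filtering out false positives, fragment by fragment.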
If I change the query to something where the search term actually occurs
in the requested element, then I get results in about 100ms.
Is there any recommended technique to avoid this degenerate case?
----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
812-482-5224