I've run into an interesting case of a very slow query.

In the DB I have about 600,000 fragments (in about 6,000 files).

These are small fragments (about 300 bytes) with about 10 very short
elements containing a short string or nothing.

Most searches return in about 100 ms, but this one takes about
60 seconds:

cts:search(
    xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO,
    cts:element-query(xs:QName("STR"),
        cts:word-query("ENG",
            ("case-insensitive", "diacritic-sensitive",
             "punctuation-insensitive", "whitespace-insensitive",
             "unstemmed", "wildcarded"))))[1 to 10]

What I think is going on is that the term "ENG" occurs in every single
fragment (it's a language code), so the search finds all 600,000
fragments. But I'm constructing the search to match only within "STR"
elements, none of which contain "ENG".

My guess is that ML finds a "hit" in every fragment, but then has to
open each fragment and search it, only to discover that the hit was in
the wrong element. The result is the empty sequence, but it takes a
minute to get there.
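
One way to check this guess is to compare the number of fragments the
indexes select with the number of results that survive filtering. The
sketch below is only a diagnostic, not a fix. It assumes xdmp:estimate
reports the index-only (unfiltered) fragment count for this search, and
the variable name $q is just for illustration.

```xquery
xquery version "1.0-ml";

(: Diagnostic sketch: the same query as in the search, bound to a
   variable ($q is an illustrative name). :)
let $q :=
    cts:element-query(xs:QName("STR"),
        cts:word-query("ENG",
            ("case-insensitive", "diacritic-sensitive",
             "punctuation-insensitive", "whitespace-insensitive",
             "unstemmed", "wildcarded")))
return (
    (: Fragments selected from the indexes alone, before any filtering. :)
    xdmp:estimate(
        cts:search(xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO, $q)),
    (: Results that remain after each candidate fragment is filtered.
       This is the expensive step, so expect it to be slow. :)
    fn:count(
        cts:search(xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO, $q)),
    (: Total elapsed time for this request. :)
    xdmp:elapsed-time()
)
```

If the guess is right, the first number should be close to the total
fragment count while the second is 0.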

 

If I change the query to one where the search term actually occurs in
the requested element, I get results in about 100 ms.

 

Is there a recommended technique for avoiding this degenerate case?
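
One possible rewrite to try, as an untested sketch only: ask for the
word inside STR directly with cts:element-word-query instead of nesting
cts:word-query inside cts:element-query. This assumes the per-element
word indexes can rule out non-matching fragments without opening them,
which I have not verified on this data. As I understand it,
cts:element-word-query looks only at the immediate text children of STR,
so it is not an exact equivalent if STR ever has child elements.

```xquery
xquery version "1.0-ml";

(: Untested sketch: match "ENG" as a word directly inside STR, using
   the same options as the original cts:word-query. :)
cts:search(
    xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO,
    cts:element-word-query(xs:QName("STR"), "ENG",
        ("case-insensitive", "diacritic-sensitive",
         "punctuation-insensitive", "whitespace-insensitive",
         "unstemmed", "wildcarded")))[1 to 10]
```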

 

 

 

 

 

 

----------------------------------------

David A. Lee

Senior Principal Software Engineer

Epocrates, Inc.

[email protected] <mailto:[email protected]> 

812-482-5224

 

 

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
