On Dec 20, 2009, at 8:32 PM, Lee, David wrote:

> I’ve run into an interesting case of a very slow query.
> In the DB I have about 600,000 fragments (in about 6000 files).
> These are small fragments (about 300 bytes) with about 10 very short elements 
> containing a short string or nothing.
> In MOST searches I get result times of about 100ms, but this one takes about 
> 60 seconds.

You can capture a query trace of it to see how (and how well) the filtering is 
being applied.
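
A minimal sketch of that, assuming the query quoted below is run as a single 
request (for example from CQ) and that you can read the server's error log, 
which is where the trace output goes:

```xquery
xquery version "1.0-ml";

(: Turn tracing on for the rest of this request. The trace shows which
   index terms were used and how many candidate fragments were filtered. :)
xdmp:query-trace(fn:true()),

(: The slow search, unchanged from the original post. :)
cts:search(
  xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO,
  cts:element-query(xs:QName("STR"),
    cts:word-query("ENG", ("case-insensitive", "diacritic-sensitive",
      "punctuation-insensitive", "whitespace-insensitive",
      "unstemmed", "wildcarded")))
)[1 to 10],

(: Turn tracing back off. :)
xdmp:query-trace(fn:false())
```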

> cts:search(
>     xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO,
>     cts:element-query( xs:QName("STR"),
>        cts:word-query( "ENG", ("case-insensitive", "diacritic-sensitive",
>           "punctuation-insensitive", "whitespace-insensitive",
>           "unstemmed", "wildcarded") ) ) )[1 to 10]
>  
> What I think is going on here is that the term “ENG” is in every single 
> fragment (it's a language code), so it's finding 600,000 fragments,
> but I’m constructing a search to limit the search to only “STR” elements, of 
> which none contain “ENG”.
> My guess as to what is happening is that ML is finding a “hit” in every 
> fragment, but has to open up the fragment
> and search to discover that the hit was in the wrong element. The result is 
> the empty sequence,
> but it takes a minute to get to that.
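
One way to check that guess, sketched here on the assumption that the search 
expression is one xdmp:estimate() accepts: xdmp:estimate() counts the 
fragments selected from the indexes alone, without opening and filtering 
them. A count close to the total number of fragments, set against the empty 
filtered result, would support the false-positive explanation.

```xquery
xquery version "1.0-ml";

(: Index-only count of candidate fragments for the same search.
   No filtering is done, so false positives are included in the count. :)
xdmp:estimate(
  cts:search(
    xdmp:directory("/RxNorm/rxnconso/")//RXNCONSO,
    cts:element-query(xs:QName("STR"),
      cts:word-query("ENG", ("case-insensitive", "diacritic-sensitive",
        "punctuation-insensitive", "whitespace-insensitive",
        "unstemmed", "wildcarded")))))
```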

The docs on cts:element-query() explain which indexes can help it do its job:

"Enabling both the word position and element position indexes ("word position" 
and "element word position" in the database configuration screen of the Admin 
Interface) will speed up query performance for many queries that use 
cts:element-query. The position indexes enable MarkLogic Server to eliminate 
many false-positive results, which can reduce disk I/O and processing, thereby 
speeding the performance of many queries. The amount of benefit will vary 
depending on your data."

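That sounds like the most likely candidate: you presumably don't have these 
indexes enabled, so the index can only say that "ENG" occurs somewhere in a 
fragment, not that it occurs inside STR, and the query is seeing those false 
positives that appear in other elements.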
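
If you would rather script the change than click through the Admin Interface, 
something along these lines should work with the Admin API. This is a sketch, 
not a tested recipe: "Documents" is a placeholder database name, and the 
request has to run as a user with admin privileges.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "Documents" is a placeholder; substitute the name of your database. :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()
(: "word positions" in the database configuration screen :)
let $config := admin:database-set-word-positions($config, $db, fn:true())
(: "element word positions" in the database configuration screen :)
let $config :=
  admin:database-set-element-word-positions($config, $db, fn:true())
return admin:save-configuration($config)
```

Existing content has to be reindexed before the new indexes can help, and 
position indexes make the on-disk indexes larger, so it is worth checking the 
reindexing status and database size afterwards.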

-jh-
