Hi Ryan,

Even though it says the path is "fully searchable," that doesn't necessarily mean it's using the indexes to find all the appropriate results. An expression using not() will definitely depend on filtering, because you can't search for the absence of something using the indexes.

Compare the output of xdmp:plan() or xdmp:query-trace() between these two expressions:

  /stuff[child]
  /stuff[not(child)]

In the first case, the index resolution should be very close to 100%, since the Server can look up all documents that have both <stuff> and <child>. In the second case, it will find all documents that have <stuff>, and the predicate [not(child)] will have to be resolved at the filtering stage (reading each <stuff> document to see whether <child> is absent).

If you see "Step 2 predicate 1 contributed 1 constraint: child", that's an encouraging sign that the Server is making use of the index to evaluate the predicate.

If you need to squeeze out more performance, you can consider using cts:not-query() or cts:and-not-query(), but be very careful with these, because a false positive in the negated query will result in a false negative in the result (missing results).

Evan Lenz
Software Developer, Community
MarkLogic Corporation

email [email protected]
web http://developer.marklogic.com/

From: [email protected]
Reply-To: General MarkLogic Developer Discussion <[email protected]>
Date: Tue, 31 May 2011 14:24:28 -0700
To: [email protected]
Subject: [MarkLogic Dev General] using fn:not() in queries

I'm hoping someone can validate my understanding. From what I can tell through performance measuring, using fn:not() does not automatically make a query unsearchable (i.e., unable to go against the index).

Given XML files in the DB that look like this:

  <stuff>
    <child>Alan</child>
  </stuff>

where there will always be a child element but there may or may not be a value. If I write an XPath expression like this:

  /stuff[fn:not(child/text())]

then, according to xdmp:plan(), the XPath is fully searchable and, according to the profiler, it runs without a performance penalty from the fn:not().

So my questions are:

Is querying for the absence of an element value as fast (or nearly as fast) as querying for an existing value, in terms of using the indexes?

Is there a faster way to query for an absent or empty value? I could change the data so that there would be no "child" element if there is no value for the element. Would that matter in terms of performance?

Would it be faster to have "<child values-exists='no'></child>" and use that attribute in a positive query rather than just <child/> with a negative query?

From my testing, it seems like using fn:not() in this case is just as good as anything else. But I suspect there's more to the story.

Thanks,
Ryan
