Hi, Paul:
Syntactically, the XPath expressions:
* retrieve a sequence of documents from the database
* extract a sequence of nodes from the sequence of documents (in case 2)
* filter to produce a final sequence by applying the predicate to each item
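As a rough illustration only, the three steps can be spelled out for expression 2 as below. The collection URI "mycol" is an assumed placeholder, and binding each step to a variable is done purely to label the steps; it is not a recommendation, and it may not be optimized the same way as the single path expression.
    xquery version "1.0-ml";
    let $mycol := "mycol"   (: assumed collection URI, for illustration :)
    (: step 1: retrieve a sequence of documents from the database :)
    let $docs := fn:collection($mycol)
    (: step 2: extract a sequence of nodes from those documents :)
    let $nodes := $docs/aaa
    (: step 3: apply the predicate to each item to get the final sequence :)
    return $nodes[.//myelem = "myval"]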
The engine tries to optimize XPath expressions by executing as much of
the expression as possible as a query against the indexes. Not every expression
can be optimized as a query, and the optimizer may also miss some cases that
could be optimized.
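One way to see how much of an expression is resolved from the indexes, besides the query-trace output, is xdmp:plan. A minimal sketch, assuming "mycol" as a placeholder collection URI:
    xquery version "1.0-ml";
    let $mycol := "mycol"   (: assumed collection URI, for illustration :)
    (: returns an XML report of how the expression would be resolved
       against the indexes, without returning the matching nodes :)
    return xdmp:plan(fn:collection($mycol)/aaa[.//myelem = "myval"])
Comparing the plans for the three expressions should show which constraints were turned into index lookups and which were left to the filtering step.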
The best practice is to use explicit queries instead of XPath expressions when
retrieving documents from the database. That way, the use of indexes is
unambiguous. In addition, you have access to index mechanisms (such as fields)
that aren't available in predicates.
To put it another way, the best practice is to use XPath expressions only to
traverse nodes after they have been retrieved from the database.
In this particular case, the equivalent query for XPath expression 2 would
resemble the following:
let $ait := cts:search(fn:doc(), cts:and-query((
    cts:collection-query($mycol),
    cts:element-query(xs:QName("aaa"),
        cts:element-value-query(xs:QName("myelem"), "myval"))
    )))
While the query is more verbose, it declares a carefully considered use of the
indexes. That's a good thing for scalability and maintainability with most
production databases.
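Putting the two practices together, a complete module might look like the sketch below: the query retrieves the documents, and XPath is used only to traverse them afterwards. The collection URI "mycol" is an assumed placeholder; someelem comes from your original return clause.
    xquery version "1.0-ml";
    let $mycol := "mycol"   (: assumed collection URI, for illustration :)
    (: retrieve the matching documents with an explicit query :)
    let $ait := cts:search(fn:doc(), cts:and-query((
        cts:collection-query($mycol),
        cts:element-query(xs:QName("aaa"),
            cts:element-value-query(xs:QName("myelem"), "myval"))
        )))
    (: traverse the retrieved documents with XPath :)
    return $ait//someelem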
For more information, see:
http://docs.marklogic.com/cts:collection-query
http://docs.marklogic.com/cts:element-query
http://docs.marklogic.com/cts:element-value-query
Hoping that helps,
Erik Hennum
________________________________________
From: [email protected]
<[email protected]> on behalf of Paul M <[email protected]>
Sent: Tuesday, May 8, 2018 11:47:35 AM
To: [email protected]
Subject: [MarkLogic Dev General] collection function searching
I have the following three queries I am comparing
declare variable $my:col as xs:string...
(:
let $ait := collection()[.//myelem="myval"]
:)
(:
let $ait := collection($mycol)/aaa[.//myelem="myval"]
:)
let $ait := collection($mycol)[.//myelem="myval"]
return $ait//someelem
There may be 10 million total documents in the repository.
There may be 1 million documents that have a root element of aaa.
Note: myelem can/should only be in documents with a root element of aaa.
There may be 3 million total documents in mycol.
There may be at most 100k documents with a root element of aaa that are in
mycol.
The following are the query-trace results for the above three statements:
1st one: roughly 3000 fragments to filter. Seems reasonable: aaa documents that
have myelem = myval
2nd statement: roughly 1000 fragments to filter. Seems reasonable - narrowing
the search
Last statement: roughly 8 mil fragments to filter. Not certain why this occurs.
Any explanation to shed some light?