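XPath requires evaluation of every predicate for every context item. If you are
spending too much time in a predicate, refactor to remove the constant terms.
Here $x/@localisedtextid does not change while the text elements are tested, so
bind its string value once per $x: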
for $x in
collection('/db/content')//(elementa|elementb|elementc|elementd|elemente|elementf)
let $xid := $x/@localisedtextid/string()
let $e := doc('/db/language/lang_en.xml')//text[@id=$xid]
order by $x/@id
return <localisedtext quest="{$e}"/>
As Mike mentioned, a range index might help with the "order by" portion. Note
that I did not use $xid for the order-by, because that might interfere with
range index utilization.
Also, it's best to avoid '//' when possible, and instead state the paths
explicitly.
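
To illustrate the point about '//', here is a minimal sketch of the same query
with an explicit path into the language file. The query meters quoted below
show contents as the root of lang_en.xml; that text is a direct child of
contents is an assumption, so adjust the steps to match the real structure.
The content files keep '//' because their layout is not shown.

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumes lang_en.xml looks like
   <contents><text id="...">...</text>...</contents>;
   the /contents/text steps are hypothetical. :)
for $x in
  collection('/db/content')//(elementa|elementb|elementc|elementd|elemente|elementf)
(: Bind the lookup key once per $x, outside the predicate. :)
let $xid := $x/@localisedtextid/string()
(: Explicit child steps replace the descendant search for text. :)
let $e := doc('/db/language/lang_en.xml')/contents/text[@id = $xid]
order by $x/@id
return <localisedtext quest="{$e}"/>
```

Running it with xdmp:query-trace(true()), as in the original test, should show
whether the rewritten path is still reported as fully searchable.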
-- Mike
On 16 Mar 2012, at 12:26, Nick Tuckett wrote:
> I'm evaluating MarkLogic as a possible way to store and access around 25Mb
> (and growing) of fairly complex XML data. For one particular type of common
> query for my application, I'm seeing drastically different performance
> between MarkLogic and eXist. I would be very grateful for any feedback or
> advice on how to improve this performance.
>
> One common feature of this data is the use of attributes containing identifying
> values that reference other elements in the collection - an example of this is for
> referencing localised text from a common XML file. I have been using a fairly
> simple query to benchmark performance that looks like this:
>
> for $x in
> collection('/db/content')//(elementa|elementb|elementc|elementd|elemente|elementf)
> let $e := doc('/db/language/lang_en.xml')//text[@id=$x/@localisedtextid]
> order by $x/@id
> return
> <localisedtext quest="{$e}"/>
>
> The benchmark content has around 2500 instances for this particular case.
> With everything else constant (hardware, OS, content) I see drastically
> different performance between MarkLogic and eXist. The former takes around 59
> seconds to return the data for all instances, the latter takes 8 seconds.
>
> As I understand it, MarkLogic sets up indexing automatically, including
> indexing on element-attribute pairs. To match this, I created an explicit
> equivalent index for eXist for the text/@id pair for use in this case.
>
> For MarkLogic, running the query with the profiler showed that around 75% of
> the execution time went on '@id = $x/@localisedtextid', and query tracing
> produced the following output:
>
> Initial part of query:
>
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10:
> xdmp:eval("xdmp:query-trace(true()), for $x in
> collection('/db/content...", (), <options
> xmlns="xdmp:eval"><database>1488253557778688591</database><modules>148825355777868...</options>)
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Analyzing path for $x:
> fn:collection("/db/content")/descendant-or-self::node()/(elementa|elementb|elementc|elementd|elemente|elementf)
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 1 is searchable:
> fn:collection("/db/content")
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 2 does not use
> indexes: descendant-or-self::node()
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 3 is searchable:
> (elementa|elementb|elementc|elementd|elemente|elementf)
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Path is fully searchable.
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Gathering constraints.
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 1 contributed 1
> constraint: fn:collection("/db/content")
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:48: Step 3 contributed 1
> constraint: elementa
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:62: Step 3 contributed 1
> constraint: elementb
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:81: Step 3 contributed 1
> constraint: elementc
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:110: Step 3 contributed 1
> constraint: elementd
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:122: Step 3 contributed 1
> constraint: elemente
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:132: Step 3 contributed 1
> constraint: elementf
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Executing search.
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Selected 8 fragments to
> filter.
>
> Iterated part of query (repeat N times...)
>
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48:
> xdmp:eval("xdmp:query-trace(true()), for $x in
> collection('/db/content...", (), <options
> xmlns="xdmp:eval"><database>1488253557778688591</database><modules>148825355777868...</options>)
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Analyzing path:
> fn:doc("/db/language/lang_en.xml")/descendant::text[@id =
> xs:untypedAtomic("ElementName143001")]
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 1 is searchable:
> fn:doc("/db/language/lang_en.xml")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 is searchable:
> descendant::text[@id = xs:untypedAtomic("ElementName143001")]
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Path is fully searchable.
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Gathering constraints.
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:10: Step 1 contributed 1
> constraint: fn:doc("/db/language/lang_en.xml")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:54: Comparison contributed
> hash value constraint: text/@id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 predicate 1
> contributed 1 constraint: @id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:54: Comparison contributed
> hash value constraint: text/@id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 predicate 1
> contributed 1 constraint: @id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 contributed 2
> constraints: descendant::text[@id = xs:untypedAtomic("ElementName143001")]
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Executing search.
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Selected 1 fragment to
> filter
>
> Query meters:
>
> <qm:query-meters xsi:schemaLocation="http://marklogic.com/xdmp/query-meters
> query-meters.xsd" xmlns:qm="http://marklogic.com/xdmp/query-meters"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> <qm:elapsed-time>PT56.655659S</qm:elapsed-time>
> <qm:requests>0</qm:requests>
> <qm:list-cache-hits>1043</qm:list-cache-hits>
> <qm:list-cache-misses>0</qm:list-cache-misses>
> <qm:in-memory-list-hits>0</qm:in-memory-list-hits>
> <qm:expanded-tree-cache-hits>519</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> <qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
> <qm:compressed-tree-cache-misses>0</qm:compressed-tree-cache-misses>
> <qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
> <qm:value-cache-hits>6672643</qm:value-cache-hits>
> <qm:value-cache-misses>6673683</qm:value-cache-misses>
> <qm:regexp-cache-hits>0</qm:regexp-cache-hits>
> <qm:regexp-cache-misses>0</qm:regexp-cache-misses>
> <qm:link-cache-hits>0</qm:link-cache-hits>
> <qm:link-cache-misses>0</qm:link-cache-misses>
> <qm:filter-hits>0</qm:filter-hits>
> <qm:filter-misses>0</qm:filter-misses>
> <qm:fragments-added>0</qm:fragments-added>
> <qm:fragments-deleted>0</qm:fragments-deleted>
> <qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
> <qm:fs-program-cache-misses>1</qm:fs-program-cache-misses>
> <qm:db-program-cache-hits>0</qm:db-program-cache-hits>
> <qm:db-program-cache-misses>0</qm:db-program-cache-misses>
> <qm:env-program-cache-hits>0</qm:env-program-cache-hits>
> <qm:env-program-cache-misses>0</qm:env-program-cache-misses>
> <qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
> <qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
> <qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
> <qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
> <qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
> <qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
> <qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
> <qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
> <qm:fragments>
> <qm:fragment>
> <qm:root>contents</qm:root>
> <qm:expanded-tree-cache-hits>511</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:fragment>
> <qm:fragment>
> <qm:root>database</qm:root>
> <qm:expanded-tree-cache-hits>8</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:fragment>
> </qm:fragments>
> <qm:documents>
> <qm:document>
> <qm:uri>/db/content/file1.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file2.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/language/lang_en.xml</qm:uri>
> <qm:expanded-tree-cache-hits>511</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file3.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file4.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file5.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file6.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file7.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> <qm:document>
> <qm:uri>/db/content/file8.xml</qm:uri>
> <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
> <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
> </qm:document>
> </qm:documents>
> <qm:hosts/>
> </qm:query-meters>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general