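XPath evaluates every predicate once for every context item. If a predicate 
is taking too much time, hoist the terms that are constant across context 
items out of it. Here $x/@localisedtextid does not depend on the text element 
being tested, so bind it to a variable first: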

for $x in 
collection('/db/content')//(elementa|elementb|elementc|elementd|elemente|elementf)
let $xid := $x/@localisedtextid/string()
let $e := doc('/db/language/lang_en.xml')//text[@id=$xid]
order by $x/@id
return <localisedtext quest="{$e}"/>

As Mike mentioned, a range index might help with the "order by" portion. Note 
that I did not use $xid for the order-by, because that might interfere with 
range index utilization.

Also, it's best to avoid '//' when possible, and instead state the paths 
explicitly.
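
As a rough sketch of that last point: the snippet below assumes, purely for 
illustration, that the text elements are direct children of a contents root 
element. The real layout of lang_en.xml is not shown in this thread, so the 
inline $lang document and the ids t1/t2 are hypothetical stand-ins that let 
the snippet run on its own.

```xquery
xquery version "1.0";

(: Hypothetical stand-in for lang_en.xml; the real structure may differ. :)
let $lang :=
  document {
    <contents>
      <text id="t1">Hello</text>
      <text id="t2">World</text>
    </contents>
  }
(: Constant term bound once, outside the predicate. :)
let $xid := "t2"
return (
  (: '//' form: considers every descendant of the document node. :)
  $lang//text[@id = $xid]/string(),
  (: Explicit form: names each step. Only valid if text really is a
     direct child of contents, which is an assumption here. :)
  $lang/contents/text[@id = $xid]/string()
)
```

Both expressions select the same element in this toy document. Against the 
real data, the explicit steps would have to match the actual element 
nesting.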

-- Mike

On 16 Mar 2012, at 12:26 , Nick Tuckett wrote:

> I'm evaluating MarkLogic as a possible way to store and access around 25 MB 
> (and growing) of fairly complex XML data. For one particular type of common 
> query for my application, I'm seeing drastically different performance 
> between MarkLogic and eXist. I would be very grateful for any feedback or 
> advice on how to improve this performance.
> 
> One common feature of this data is attributes containing identifying values 
> that reference other elements in the collection. An example of this is 
> referencing localised text from a common XML file. I have been using a fairly 
> simple query to benchmark performance that looks like this:
> 
> for $x in 
> collection('/db/content')//(elementa|elementb|elementc|elementd|elemente|elementf)
> let $e := doc('/db/language/lang_en.xml')//text[@id=$x/@localisedtextid]
> order by $x/@id
> return 
> <localisedtext quest="{$e}"/> 
> 
> The benchmark content has around 2500 instances for this particular case. 
> With everything else constant (hardware, OS, content) I see drastically 
> different performance between MarkLogic and eXist. The former takes around 59 
> seconds to return the data for all instances; the latter takes 8 seconds.
> 
> As I understand it, MarkLogic sets up indexing automatically, including 
> indexing on element-attribute pairs. To match this, I created an explicit 
> equivalent index for eXist for the text/@id pair for use in this case.
> 
> For MarkLogic, running the query with the profiler showed that around 75% of 
> the execution time went on '@id = $x/@localisedtextid', and query tracing 
> produced the following output:
> 
> Initial part of query:
> 
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: 
> xdmp:eval("xdmp:query-trace(true()),&#10;for $x in 
> collection('/db/content...", (), <options 
> xmlns="xdmp:eval"><database>1488253557778688591</database><modules>148825355777868...</options>)
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Analyzing path for $x: 
> fn:collection("/db/content")/descendant-or-self::node()/(elementa|elementb|elementc|elementd|elemente|elementf)
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 1 is searchable: 
> fn:collection("/db/content")
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 2 does not use 
> indexes: descendant-or-self::node()
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 3 is searchable: 
> (elementa|elementb|elementc|elementd|elemente|elementf)
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Path is fully searchable.
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Gathering constraints.
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Step 1 contributed 1 
> constraint: fn:collection("/db/content")
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:48: Step 3 contributed 1 
> constraint: elementa
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:62: Step 3 contributed 1 
> constraint: elementb
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:81: Step 3 contributed 1 
> constraint: elementc
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:110: Step 3 contributed 1 
> constraint: elementd
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:122: Step 3 contributed 1 
> constraint: elemente
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:132: Step 3 contributed 1 
> constraint: elementf
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Executing search.
> 2012-03-16 11:35:35.743 Info: App-Services: at 2:10: Selected 8 fragments to 
> filter.
> 
> Iterated part of query (repeat N times...)
> 
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: 
> xdmp:eval("xdmp:query-trace(true()),&#10;for $x in 
> collection('/db/content...", (), <options 
> xmlns="xdmp:eval"><database>1488253557778688591</database><modules>148825355777868...</options>)
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Analyzing path: 
> fn:doc("/db/language/lang_en.xml")/descendant::text[@id = 
> xs:untypedAtomic("ElementName143001")]
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 1 is searchable: 
> fn:doc("/db/language/lang_en.xml")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 is searchable: 
> descendant::text[@id = xs:untypedAtomic("ElementName143001")]
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Path is fully searchable.
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Gathering constraints.
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:10: Step 1 contributed 1 
> constraint: fn:doc("/db/language/lang_en.xml")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:54: Comparison contributed 
> hash value constraint: text/@id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 predicate 1 
> contributed 1 constraint: @id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:54: Comparison contributed 
> hash value constraint: text/@id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 predicate 1 
> contributed 1 constraint: @id = xs:untypedAtomic("ElementName143001")
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Step 2 contributed 2 
> constraints: descendant::text[@id = xs:untypedAtomic("ElementName143001")]
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Executing search.
> 2012-03-16 11:35:35.743 Info: App-Services: at 3:48: Selected 1 fragment to 
> filter
> 
> Query meters:
> 
>   <qm:query-meters xsi:schemaLocation="http://marklogic.com/xdmp/query-meters 
> query-meters.xsd" xmlns:qm="http://marklogic.com/xdmp/query-meters" 
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>     <qm:elapsed-time>PT56.655659S</qm:elapsed-time>
>     <qm:requests>0</qm:requests>
>     <qm:list-cache-hits>1043</qm:list-cache-hits>
>     <qm:list-cache-misses>0</qm:list-cache-misses>
>     <qm:in-memory-list-hits>0</qm:in-memory-list-hits>
>     <qm:expanded-tree-cache-hits>519</qm:expanded-tree-cache-hits>
>     <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>     <qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
>     <qm:compressed-tree-cache-misses>0</qm:compressed-tree-cache-misses>
>     <qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
>     <qm:value-cache-hits>6672643</qm:value-cache-hits>
>     <qm:value-cache-misses>6673683</qm:value-cache-misses>
>     <qm:regexp-cache-hits>0</qm:regexp-cache-hits>
>     <qm:regexp-cache-misses>0</qm:regexp-cache-misses>
>     <qm:link-cache-hits>0</qm:link-cache-hits>
>     <qm:link-cache-misses>0</qm:link-cache-misses>
>     <qm:filter-hits>0</qm:filter-hits>
>     <qm:filter-misses>0</qm:filter-misses>
>     <qm:fragments-added>0</qm:fragments-added>
>     <qm:fragments-deleted>0</qm:fragments-deleted>
>     <qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
>     <qm:fs-program-cache-misses>1</qm:fs-program-cache-misses>
>     <qm:db-program-cache-hits>0</qm:db-program-cache-hits>
>     <qm:db-program-cache-misses>0</qm:db-program-cache-misses>
>     <qm:env-program-cache-hits>0</qm:env-program-cache-hits>
>     <qm:env-program-cache-misses>0</qm:env-program-cache-misses>
>     <qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
>     <qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
>     <qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
>     <qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
>     <qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
>     <qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
>     <qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
>     <qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
>     <qm:fragments>
>       <qm:fragment>
>       <qm:root>contents</qm:root>
>       <qm:expanded-tree-cache-hits>511</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:fragment>
>       <qm:fragment>
>       <qm:root>database</qm:root>
>       <qm:expanded-tree-cache-hits>8</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:fragment>
>     </qm:fragments>
>     <qm:documents>
>       <qm:document>
>       <qm:uri>/db/content/file1.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file2.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/language/lang_en.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>511</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file3.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file4.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file5.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file6.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file7.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>       <qm:document>
>       <qm:uri>/db/content/file8.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>1</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>0</qm:expanded-tree-cache-misses>
>       </qm:document>
>     </qm:documents>
>     <qm:hosts/>
>   </qm:query-meters>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
