Hi Joshi,

Some observations at first glance. Don't loop over both old and new; loop only 
over new, and fetch the matching mid element from old with a match on 
element a. That eliminates the quadratic (old x new) number of comparisons. You 
can use cts functions to make sure indexes are used to find the matching mid 
in old.
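
A minimal, untested sketch of that idea, assuming each mid has exactly one a 
child and that the key is unique within old.xml. It uses 
cts:element-value-query, which is resolved from the universal index; a 
cts:element-range-query against the range index on a would be an alternative, 
provided the value is cast to the index's type.

```xquery
xquery version "1.0-ml";

(: Sketch only: loop over the new mids and look up the single old mid
   that shares the same key, instead of pairing every old with every new. :)
for $new in doc("/documents/new.xml")/base/mid
let $old :=
  cts:search(
    doc("/documents/old.xml")/base/mid,
    cts:element-value-query(xs:QName("a"), string($new/a))
  )[1]
where exists($old) and not(deep-equal($new, $old))
return <updated><old>{$old}</old><new>{$new}</new></updated>
```

Index resolution happens per fragment, so as long as old.xml is stored as a 
single fragment the filter step still has to walk its mids to find the match.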

It might also help to declare mid as a fragment root. That will likely create 
a lot of fragments in your database and can have side effects on existing 
search code, but it saves loading old and new into memory as one big fragment 
each and parsing them there just to reach the mids.
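
Fragment roots can be set in the Admin interface; as a rough sketch, the same 
can be scripted with the Admin API. The database name "Documents" is an 
assumption here, and mid is taken to be in no namespace, as in the samples.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Sketch only: "Documents" is an assumed database name. :)
let $config := admin:get-configuration()
let $dbid := xdmp:database("Documents")
(: "" means no namespace; "mid" is the element that becomes a fragment root. :)
let $root := admin:database-fragment-root("", "mid")
return
  admin:save-configuration(
    admin:database-add-fragment-root($config, $dbid, $root)
  )
```

Existing documents only pick up the new fragmentation after they are reindexed 
or reloaded, so try this on a test database first.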

deep-equal is also rather expensive; if you can swap it for something simpler, 
that might speed things up as well.
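
One possible simpler check, as a sketch: compare the text of the child elements 
position by position. local:changed is a hypothetical helper, not a built-in. 
It ignores attributes and nested structure, and it can be fooled by values that 
contain the "|" separator, so it is weaker than deep-equal; whether it is 
actually faster would need measuring.

```xquery
xquery version "1.0";

(: Hypothetical helper: two mids count as changed when the string values
   of their child elements differ, compared in document order. :)
declare function local:changed(
  $old as element(mid),
  $new as element(mid)
) as xs:boolean
{
  string-join(for $c in $old/* return string($c), "|")
    ne string-join(for $c in $new/* return string($c), "|")
};

(: Inline copies of the third mid from the sample documents. :)
let $old := <mid><a>3</a><b>b</b><c>c</c><d>d</d></mid>
let $new := <mid><a>3</a><b>b333</b><c>c</c><d>d</d></mid>
return local:changed($old, $new)
```

For the sample pair above this returns true, because b differs.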

Kind regards,
Geert

drs. G.P.H. (Geert) Josten
Consultant

Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk

T +31 (0)10 850 1200
F +31 (0)10 850 1199

mailto:geert.jos...@daidalos.nl
http://www.daidalos.nl/

KvK 27164984


The information sent in or with this e-mail message originates from 
Daidalos BV and is intended solely for the addressee. If you have received 
this message unintentionally, we ask that you delete it. No rights can be 
derived from this message.

> From: general-boun...@developer.marklogic.com
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of
> Joshi, Utsav (LNG-CON)
> Sent: Monday, 21 June 2010 22:17
> To: general@developer.marklogic.com
> Subject: [MarkLogic Dev General] xml comparsion
>
>
>
> I am comparing two versions of an XML document (old.xml and
> new.xml, for reference) to check whether anything has changed.
>
> base/mid/a is my key for comparing old.xml and new.xml.
>
>
>
> It takes more than a minute for 3,000 "mid" elements, and
> once it goes beyond 100,000 "mid" elements it takes forever.
>
> I want to reduce the execution time to a sub-second response;
> can you please advise?
>
>
>
> old xml
>
> <base>
>  <mid>
>   <a>1</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>2</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>3</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
> </base>
>
>
>
> new xml
>
> <base>
> <top>xxx</top>
>  <mid>
>   <a>1</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>2</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
>  <mid>
>   <a>3</a>
>   <b>b333</b>
>   <c>c</c>
>   <d>d</d>
>  </mid>
> </base>
>
>
>
> xquery
>
> xdmp:query-trace(true()),
> for $old in doc("/documents/old.xml")/base/mid
> for $new in doc("/documents/new.xml")/base/mid
> return
>   if ($new/a = $old/a and not(deep-equal($new, $old)))
>   then <updated><old>{$old}</old><new>{$new}</new></updated>
>   else (),
> xdmp:query-meters()
>
>
>
>
>
> I have created an element range index on local name 'a';
> below is an excerpt from the log file.
>
>
>
> 2010-06-17 14:51:27.282 Info: Docs: line 3: xdmp:eval("xdmp:query-trace(true()),&#13;&#10;for $old in doc(&quot;/docume...", (), <options xmlns="xdmp:eval"><isolation>different-transaction</isolation></options>)
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Analyzing path for $new: fn:doc("/documents/new.xml")/base/mid
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 1 is searchable: fn:doc("/documents/new.xml")
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 2 is searchable: base
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 3 is searchable: mid
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Path is fully searchable.
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Gathering constraints.
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 1 contributed 1 constraint: fn:doc("/documents/new.xml")
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 2 test contributed 1 constraint: base
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Step 3 test contributed 1 constraint: mid
> 2010-06-17 14:51:27.282 Info: Docs: line 3: Executing search.
> 2010-06-17 14:51:27.298 Info: Docs: line 3: Selected 1 fragment to filter
>
>
>
>
>
> cq output
>
>
>
> <updated><old><mid>
>   <a>3000</a>
>   <b>b</b>
>   <c>c</c>
>   <d>d</d>
>   <details>
>     <a1>a1</a1>
>     <b1>b1</b1>
>     <c1>c1</c1>
>     <d1>d1</d1>
>   </details>
> </mid></old><new><mid>
>   <a>3000</a>
>   <b>b3000</b>
>   <c>c</c>
>   <d>d</d>
>   <details>
>     <a1>a1</a1>
>     <b1>b1</b1>
>     <c1>c1</c1>
>     <d1>d1</d1>
>   </details>
> </mid></new></updated>
>
> <qm:query-meters
>   xsi:schemaLocation="http://marklogic.com/xdmp/query-meters query-meters.xsd"
>   xmlns:qm="http://marklogic.com/xdmp/query-meters"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>   <qm:elapsed-time>PT1M4.059S</qm:elapsed-time>
>   <qm:requests>0</qm:requests>
>   <qm:list-cache-hits>26978</qm:list-cache-hits>
>   <qm:list-cache-misses>4</qm:list-cache-misses>
>   <qm:in-memory-list-hits>0</qm:in-memory-list-hits>
>   <qm:expanded-tree-cache-hits>2996</qm:expanded-tree-cache-hits>
>   <qm:expanded-tree-cache-misses>2</qm:expanded-tree-cache-misses>
>   <qm:compressed-tree-cache-hits>0</qm:compressed-tree-cache-hits>
>   <qm:compressed-tree-cache-misses>2</qm:compressed-tree-cache-misses>
>   <qm:in-memory-compressed-tree-hits>0</qm:in-memory-compressed-tree-hits>
>   <qm:value-cache-hits>0</qm:value-cache-hits>
>   <qm:value-cache-misses>8985006</qm:value-cache-misses>
>   <qm:regexp-cache-hits>0</qm:regexp-cache-hits>
>   <qm:regexp-cache-misses>0</qm:regexp-cache-misses>
>   <qm:link-cache-hits>0</qm:link-cache-hits>
>   <qm:link-cache-misses>0</qm:link-cache-misses>
>   <qm:filter-hits>0</qm:filter-hits>
>   <qm:filter-misses>0</qm:filter-misses>
>   <qm:fragments-added>0</qm:fragments-added>
>   <qm:fragments-deleted>0</qm:fragments-deleted>
>   <qm:fs-program-cache-hits>0</qm:fs-program-cache-hits>
>   <qm:fs-program-cache-misses>0</qm:fs-program-cache-misses>
>   <qm:db-program-cache-hits>0</qm:db-program-cache-hits>
>   <qm:db-program-cache-misses>0</qm:db-program-cache-misses>
>   <qm:fs-main-module-sequence-cache-hits>0</qm:fs-main-module-sequence-cache-hits>
>   <qm:fs-main-module-sequence-cache-misses>0</qm:fs-main-module-sequence-cache-misses>
>   <qm:db-main-module-sequence-cache-hits>0</qm:db-main-module-sequence-cache-hits>
>   <qm:db-main-module-sequence-cache-misses>0</qm:db-main-module-sequence-cache-misses>
>   <qm:fs-library-module-cache-hits>0</qm:fs-library-module-cache-hits>
>   <qm:fs-library-module-cache-misses>0</qm:fs-library-module-cache-misses>
>   <qm:db-library-module-cache-hits>0</qm:db-library-module-cache-hits>
>   <qm:db-library-module-cache-misses>0</qm:db-library-module-cache-misses>
>   <qm:fragments>
>     <qm:fragment>
>       <qm:root xmlns="">base</qm:root>
>       <qm:expanded-tree-cache-hits>2996</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>2</qm:expanded-tree-cache-misses>
>     </qm:fragment>
>   </qm:fragments>
>   <qm:documents>
>     <qm:document>
>       <qm:uri>/documents/new.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>2996</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>1</qm:expanded-tree-cache-misses>
>     </qm:document>
>     <qm:document>
>       <qm:uri>/documents/old.xml</qm:uri>
>       <qm:expanded-tree-cache-hits>0</qm:expanded-tree-cache-hits>
>       <qm:expanded-tree-cache-misses>1</qm:expanded-tree-cache-misses>
>     </qm:document>
>   </qm:documents>
> </qm:query-meters>
>
>
>
>
_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general
