Arthur, I'm sorry to see that you wrote that much code while looking for 
a solution. Sometimes it's helpful to search for an error message if 
it's new to you: http://www.google.com/search?q=XDMP-EXPNTREECACHEFULL 
and http://marklogic.markmail.org/search/?q=XDMP-EXPNTREECACHEFULL are 
good places to start. The short story is that the query's working set 
has to fit in the expanded tree cache.

Moving on to remedies, tuning the in-memory tree size will not affect 
XDMP-EXPNTREECACHEFULL. If you want to try tuning the server, then tune 
the expanded tree cache size. However, it's usually better to tune the 
query. XDMP-EXPNTREECACHEFULL usually means that your query is 
over-ambitious. The query might not be using indexes efficiently, or it 
might simply be a query with a gigantic working set.
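
As a rough illustration of letting the indexes do the lookup, here is a 
minimal sketch that asks for one product by ID through cts:search instead 
of walking the whole collection. The function name local:find-product and 
the value "EXAMPLE_ID" are hypothetical, and I am assuming adbl:ID holds a 
single value per product, as the where clause in the quoted code suggests. 
It uses 1.0-ml syntax rather than the "define function" style below.

```xquery
xquery version "1.0-ml";
declare namespace adbl = "http://www.audible.com/publisherToRepository";

(: Hypothetical helper: return the catalog product whose adbl:ID equals $id.
   The element-value query lets the server narrow the candidate documents
   from its indexes before any trees are expanded. :)
declare function local:find-product($id as xs:string) as element(product)?
{
  cts:search(
    fn:collection("Internal")/product,
    cts:element-value-query(xs:QName("adbl:ID"), $id, "exact")
  )[1]
};

(: "EXAMPLE_ID" is a placeholder, not a real product id. :)
local:find-product("EXAMPLE_ID")
```

This only addresses the lookup. It does not by itself shrink the working 
set of a query that still touches thousands of documents in one request.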

I see that this is an update query. ACID properties require a read-lock 
on every document read by the query, and a write-lock on every document 
that is updated. If you expect to have 300k of these documents for the live 
system, then I would recommend breaking the work up into smaller 
transactions. While it is possible to modify 300k (or more) documents in 
one transaction, it is usually more efficient to modify a batch of 
documents at a time. I like to use batches of 500 or 1000. Besides 
performance concerns, this technique is helpful when you encounter an 
error in the 299,999th document: you only have to reprocess that batch.
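
A minimal sketch of that batching idea, under some assumptions: the 
driver below is a read-only query that hands each batch to a separate 
module, so every batch commits in its own transaction. The module name 
update-batch.xqy, the external variables START and SIZE, and the 
variable $BATCH-SIZE are all hypothetical. The module itself is not 
shown; it would declare those two external variables and apply the 
update logic to fn:subsequence() of the control list.

```xquery
xquery version "1.0-ml";

(: Hypothetical batch size; 500 or 1000 are the values suggested above. :)
declare variable $BATCH-SIZE as xs:integer := 500;

let $items := fn:doc("DevWithoutExclProduct.xml")/prods/product
let $batch-count :=
  xs:integer(fn:ceiling(fn:count($items) div $BATCH-SIZE))
for $b in 1 to $batch-count
let $start := ($b - 1) * $BATCH-SIZE + 1
return
  (: Each invocation runs and commits separately from this driver.
     update-batch.xqy is an assumed module that declares
     $START and $SIZE as external xs:integer variables. :)
  xdmp:invoke(
    "update-batch.xqy",
    (xs:QName("START"), $start, xs:QName("SIZE"), $BATCH-SIZE),
    <options xmlns="xdmp:eval">
      <isolation>different-transaction</isolation>
    </options>
  )
```

If one batch fails, only that batch rolls back, and you can rerun the 
driver from that starting position.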

Finally, you might be interested in http://marklogic.github.com/corb/ 
which is intended to help automate this sort of bulk-update.

-- Mike

On 2010-09-23 20:14, Zegarek, Arthur wrote:
> I am getting an XDMP-EXPNTREECACHEFULL error – not sure how to get around it.
>
> Trying to write an xquery that reads through a control list to obtain a list 
> of catalog elements that require updating, along with a start_date/end_date 
> value to update in main catalog.
>
> When I load the control xml (DevWithoutExclProduct.xml ) up with more than 
> 2000 items or so, I get XDMP-EXPNTREECACHEFULL.  In memory tree size is set 
> to 1Gb
>
> I have 3 versions of the code – details below.  I would think the 2nd  or 3rd 
> versions would not incur the problem, since in this version I isolate the 
> logic in a function that is called with just a single node each time.  I 
> understand the issue is too many nodes being kept in scope, but how can you 
> get around this in a single xquery call, without breaking up the data to make 
> multiple calls to ML Server?  If I limit DevWithoutExclProduct.xml  to 1500 
> products or so, it runs through without the exception.
>
> Currently running this in our dev environment where we have approx 54000 
> products in the Internal collection . In Prod it is more like 300000.
>
> Version 1:
> declare namespace RSUITE="http://www.reallysi.com";
> declare namespace adbl="http://www.audible.com/publisherToRepository";
>
> for $excl in doc("DevWithoutExclProduct.xml")/prods/product, $rsuite in 
> collection("Internal")/product
>
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
> let $prod_id := $excl/prods/product/prod_id
>
> let $excl_content := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_CONTENT
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
>
> let $exi := exists($excl_product )
>
> let $start := $excl/start
> let $end := $excl/end
>
> let $repl_node 
> :=<adbl:EXCLUSIVE_PRODUCT><adbl:START_DATE>{$start/text()}</adbl:START_DATE><adbl:END_DATE>{$end/text()}</adbl:END_DATE></adbl:EXCLUSIVE_PRODUCT>
>
> where $rsuite/adbl:METADATA/adbl:CORE/adbl:ID/text() = $excl/prod_id/text()
>
>
> return
> <a>
> {
> if( $exi = true() )
> then
>     xdmp:node-replace($excl_product, $repl_node)
> else
>   xdmp:node-insert-after($excl_content, $repl_node)
>
> }
> </a>
>
> In this version, the control list, DevWithoutExclProduct.xml, is joined in 
> the same for loop as the main catalog
> Error returned is:
> XDMP-EXPNTREECACHEFULL: for $rsuite as item()* in 
> collection("Internal")/child::product -- Expanded tree cache full on host 
> rsuite.ofc.dev.ewr.audible.com
> line 4
> =
> /use-cases/eval2.xqy line 2
>
>
> Version 2 – Here I tried isolating the functionality in a function, and call 
> the function in a separate for loop that reads through the control. So I am 
> unclear why I am still getting the tree cache full, given that the function 
> is called with just a single node each time.   Note the difference in the 
> error reported – here there are 2 lines mentioned.
>
> declare namespace RSUITE="http://www.reallysi.com";
> declare namespace adbl="http://www.audible.com/publisherToRepository";
>
> define function update_excl_prod( $excl as node() ) as element()   (: Call 
> with just 1 node !  :)
> {
>
> for $rsuite in collection("Internal")/product
>
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
> let $prod_id := $excl/prods/product/prod_id
>
> let $excl_content := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_CONTENT
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
>
> let $exi := exists($excl_product )
>
> let $start := $excl/start
> let $end_dt := $excl/end
>
> let $repl_node 
> :=<adbl:EXCLUSIVE_PRODUCT><adbl:START_DATE>{$start/text()}</adbl:START_DATE><adbl:END_DATE>{$end_dt/text()}</adbl:END_DATE></adbl:EXCLUSIVE_PRODUCT>
>
> where $rsuite/adbl:METADATA/adbl:CORE/adbl:ID/text() = $excl/prod_id/text()
>
> return
> if( $exi = true() )
> then
>     xdmp:node-replace($excl_product, $repl_node)
> else
>   xdmp:node-insert-after($excl_content, $repl_node)
> }
>
> <result>{
> for $excl in doc("DevWithoutExclProduct.xml")/prods/product
> return update_excl_prod( $excl )
> }
>
> </result>
>
> Error returned here is:
> XDMP-EXPNTREECACHEFULL: for $rsuite as item()* in 
> collection("Internal")/child::product -- Expanded tree cache full on host 
> rsuite.ofc.dev.ewr.audible.com
> line 7
> =
> line 34
> =
> /use-cases/eval2.xqy line 2
>
> Version 3 – Here I tried limiting the for expression to search for the item, 
> and removed the where clause. Same exception.
>
> declare namespace RSUITE="http://www.reallysi.com";
> declare namespace adbl="http://www.audible.com/publisherToRepository";
>
> define function update_excl_prod( $excl as node() ) as element()   (: Call 
> with just 1 node !  :)
> {
>
> for $rs  in 
> fn:collection('Internal')/product/adbl:METADATA/adbl:CORE/adbl:ID[.= 
> $excl/prod_id/text() ]
>
> let $rsuite   := doc(xdmp:node-uri($rs  ))/product
>
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
> let $prod_id := $excl/prods/product/prod_id
>
> let $excl_content := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_CONTENT
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
>
> let $exi := exists($excl_product )
>
> let $start := $excl/start
> let $end_dt := $excl/end
>
> let $repl_node 
> :=<adbl:EXCLUSIVE_PRODUCT><adbl:START_DATE>{$start/text()}</adbl:START_DATE><adbl:END_DATE>{$end_dt/text()}</adbl:END_DATE></adbl:EXCLUSIVE_PRODUCT>
>
> (: where $rsuite/adbl:METADATA/adbl:CORE/adbl:ID/text() = 
> $excl/prod_id/text()   :)
>
> return
> if( $exi = true() )
> then
>     xdmp:node-replace($excl_product, $repl_node)
> else
>   xdmp:node-insert-after($excl_content, $repl_node)
> }
>
> <result>{
> for $excl in doc("DevWithoutExclProduct.xml")/prods/product
> return update_excl_prod( $excl )
> }
>
> </result>
>
>
> Art
>
>
> Art Zegarek  |  Director of Data Architecture
> T: 973.820.0396    F: 973.820.0505    C: 732-735-2592
>
> audible.com
> 1 Washington Park, 16th Floor, Newark, NJ 07102
>
>


_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
