Arthur, I'm sorry to see that you wrote that much code while looking for a solution. Sometimes it's helpful to search for an error message if it's new to you: http://www.google.com/search?q=XDMP-EXPNTREECACHEFULL and http://marklogic.markmail.org/search/?q=XDMP-EXPNTREECACHEFULL are good places to start. The short story is that the query's working set has to fit in the expanded tree cache.
Moving on to remedies, tuning the in-memory tree size will not affect XDMP-EXPNTREECACHEFULL. If you want to try tuning the server, then tune the expanded tree cache size.

However, it's usually better to tune the query. XDMP-EXPNTREECACHEFULL usually means that your query is over-ambitious. The query might not be using indexes efficiently, or it might simply be a query with a gigantic working set.

I see that this is an update query. ACID properties require a read-lock on every document read by the query, and a write-lock on every document that is updated. If you expect to have 300k of these documents for the live system, then I would recommend breaking the work up into smaller transactions. While it is possible to modify 300k (or more) documents in one transaction, it is usually more efficient to modify a batch of documents at a time. I like to use batches of 500 or 1000. Besides performance concerns, this technique is helpful when you encounter an error in the 299,999th document: you only have to reprocess that batch.

Finally, you might be interested in http://marklogic.github.com/corb/ which is intended to help automate this sort of bulk-update.

-- Mike

On 2010-09-23 20:14, Zegarek, Arthur wrote:
> I am getting an XDMP-EXPNTREECACHEFULL error – not sure how to get around it.
>
> Trying to write an xquery that reads through a control list to obtain a list
> of catalog elements that require updating, along with a start_date/end_date
> value to update in main catalog.
>
> When I load the control xml (DevWithoutExclProduct.xml ) up with more than
> 2000 items or so, I get XDMP-EXPNTREECACHEFULL. In memory tree size is set
> to 1Gb
>
> I have 3 versions of the code – details below. I would think the 2nd or 3rd
> versions would not incur the problem, since in this version I isolate the
> logic in a function that is called with just a single node each time.
> I understand the issue is too many nodes being kept in scope, but how can you
> get around this in a single xquery call, without breaking up the data to make
> multiple calls to ML Server? If I limit DevWithoutExclProduct.xml to 1500
> products or so, it runs through without the exception.
>
> Currently running this in our dev environment where we have approx 54000
> products in the Internal collection. In Prod it is more like 300000.
>
> Version 1:
> declare namespace RSUITE="http://www.reallysi.com"
> declare namespace adbl="http://www.audible.com/publisherToRepository"
>
> for $excl in doc("DevWithoutExclProduct.xml")/prods/product, $rsuite in
> collection("Internal")/product
>
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
> let $prod_id := $excl/prods/product/prod_id
>
> let $excl_content := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_CONTENT
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
>
> let $exi := exists($excl_product )
>
> let $start := $excl/start
> let $end := $excl/end
>
> let $repl_node
> :=<adbl:EXCLUSIVE_PRODUCT><adbl:START_DATE>{$start/text()}</adbl:START_DATE><adbl:END_DATE>{$end/text()}</adbl:END_DATE></adbl:EXCLUSIVE_PRODUCT>
>
> where $rsuite/adbl:METADATA/adbl:CORE/adbl:ID/text() = $excl/prod_id/text()
>
>
> return
> <a>
> {
> if( $exi = true() )
> then
> xdmp:node-replace($excl_product, $repl_node)
> else
> xdmp:node-insert-after($excl_content, $repl_node)
>
> }
> </a>
>
> In this version, the control list, DevWithoutExclProduct.xml, is joined in
> the same for loop as the main catalog
> Error returned is:
> XDMP-EXPNTREECACHEFULL: for $rsuite as item()* in
> collection("Internal")/child::product -- Expanded tree cache full on host
> rsuite.ofc.dev.ewr.audible.com
> line 4
> =
> /use-cases/eval2.xqy line 2
>
>
> Version 2 – Here I tried isolating the functionality in a function, and call
> the function in a separate for loop that reads through the control.
> So I am unclear why I am still getting the tree cache full, given that the function
> is called with just a single node each time. Note the difference in the
> error reported – here there are 2 lines mentioned.
>
> declare namespace RSUITE="http://www.reallysi.com"
> declare namespace adbl="http://www.audible.com/publisherToRepository"
>
> define function update_excl_prod( $excl as node() ) as element() (: Call
> with just 1 node ! :)
> {
>
> for $rsuite in collection("Internal")/product
>
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
> let $prod_id := $excl/prods/product/prod_id
>
> let $excl_content := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_CONTENT
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
>
> let $exi := exists($excl_product )
>
> let $start := $excl/start
> let $end_dt := $excl/end
>
> let $repl_node
> :=<adbl:EXCLUSIVE_PRODUCT><adbl:START_DATE>{$start/text()}</adbl:START_DATE><adbl:END_DATE>{$end_dt/text()}</adbl:END_DATE></adbl:EXCLUSIVE_PRODUCT>
>
> where $rsuite/adbl:METADATA/adbl:CORE/adbl:ID/text() = $excl/prod_id/text()
>
> return
> if( $exi = true() )
> then
> xdmp:node-replace($excl_product, $repl_node)
> else
> xdmp:node-insert-after($excl_content, $repl_node)
> }
>
> <result>{
> for $excl in doc("DevWithoutExclProduct.xml")/prods/product
> return update_excl_prod( $excl )
> }
>
> </result>
>
> Error returned here is:
> XDMP-EXPNTREECACHEFULL: for $rsuite as item()* in
> collection("Internal")/child::product -- Expanded tree cache full on host
> rsuite.ofc.dev.ewr.audible.com
> line 7
> =
> line 34
> =
> /use-cases/eval2.xqy line 2
>
> Version 3 – Here I tried limiting the for expression to search for the item,
> removed the where clause. Same exception
>
> declare namespace RSUITE="http://www.reallysi.com"
> declare namespace adbl="http://www.audible.com/publisherToRepository"
>
> define function update_excl_prod( $excl as node() ) as element() (: Call
> with just 1 node !
> :)
> {
>
> for $rs in
> fn:collection('Internal')/product/adbl:METADATA/adbl:CORE/adbl:ID[.=
> $excl/prod_id/text() ]
>
> let $rsuite := doc(xdmp:node-uri($rs ))/product
>
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
> let $prod_id := $excl/prods/product/prod_id
>
> let $excl_content := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_CONTENT
> let $excl_product := $rsuite/adbl:METADATA/adbl:CORE/adbl:EXCLUSIVE_PRODUCT
>
> let $exi := exists($excl_product )
>
> let $start := $excl/start
> let $end_dt := $excl/end
>
> let $repl_node
> :=<adbl:EXCLUSIVE_PRODUCT><adbl:START_DATE>{$start/text()}</adbl:START_DATE><adbl:END_DATE>{$end_dt/text()}</adbl:END_DATE></adbl:EXCLUSIVE_PRODUCT>
>
> (: where $rsuite/adbl:METADATA/adbl:CORE/adbl:ID/text() =
> $excl/prod_id/text() :)
>
> return
> if( $exi = true() )
> then
> xdmp:node-replace($excl_product, $repl_node)
> else
> xdmp:node-insert-after($excl_content, $repl_node)
> }
>
> <result>{
> for $excl in doc("DevWithoutExclProduct.xml")/prods/product
> return update_excl_prod( $excl )
> }
>
> </result>
>
>
> Art
>
>
> Art Zegarek | Director of Data Architecture
> T: 973.820.0396 F: 973.820.0505 C: 732-735-2592
>
> audible.com
> 1 Washington Park, 16th Floor, Newark, NJ 07102
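
For what it's worth, the batching idea from the reply can be sketched roughly as follows. This is an untested sketch under several assumptions, not a drop-in fix: it uses 1.0-ml syntax rather than the 0.9-ml "define function" style in the post, the batch size and the <batch> wrapper element are made up for illustration, and it assumes each matching product has at most one EXCLUSIVE_PRODUCT and does have an EXCLUSIVE_CONTENT element. The outer query only reads the control document. Each slice of the control list is handed to xdmp:eval, which runs it as a separate transaction, so only one slice's documents need to be locked and held in the expanded tree cache at a time.

```xquery
xquery version "1.0-ml";

(: Hypothetical batch size; 500 or 1000 as suggested in the reply. :)
declare variable $BATCH-SIZE as xs:integer := 500;

(: Update code for ONE batch. It is evaluated in its own transaction,
   so its locks and expanded trees are released when the batch commits. :)
declare variable $UPDATE-QUERY as xs:string := '
  xquery version "1.0-ml";
  declare namespace adbl = "http://www.audible.com/publisherToRepository";

  (: <batch> is an illustrative wrapper holding a slice of the control list :)
  declare variable $batch as element(batch) external;

  for $excl in $batch/product
  let $id := fn:string($excl/prod_id)
  (: the predicate lets the lookup be resolved per product id
     rather than by walking the whole collection :)
  for $core in fn:collection("Internal")/product/adbl:METADATA/adbl:CORE[adbl:ID = $id]
  let $old := $core/adbl:EXCLUSIVE_PRODUCT
  let $new :=
    <adbl:EXCLUSIVE_PRODUCT>
      <adbl:START_DATE>{ fn:string($excl/start) }</adbl:START_DATE>
      <adbl:END_DATE>{ fn:string($excl/end) }</adbl:END_DATE>
    </adbl:EXCLUSIVE_PRODUCT>
  return
    if (fn:exists($old))
    then xdmp:node-replace($old, $new)
    else xdmp:node-insert-after($core/adbl:EXCLUSIVE_CONTENT, $new)
';

(: Outer query: reads the control list and makes no update calls itself. :)
let $items := fn:doc("DevWithoutExclProduct.xml")/prods/product
let $batch-count := (fn:count($items) + $BATCH-SIZE - 1) idiv $BATCH-SIZE
for $n in 1 to $batch-count
let $batch :=
  <batch>{ fn:subsequence($items, ($n - 1) * $BATCH-SIZE + 1, $BATCH-SIZE) }</batch>
return (
  xdmp:eval(
    $UPDATE-QUERY,
    (xs:QName("batch"), $batch),
    <options xmlns="xdmp:eval">
      <isolation>different-transaction</isolation>
    </options>
  ),
  fn:concat("batch ", $n, " done")
)
```

If one batch fails, the batches before it have already committed, so only the failed slice and the ones after it need to be rerun. For very large control lists, xdmp:spawn or CoRB may be a better fit than a single long-running outer request.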
