Thanks Mike. Looking at your code was very helpful.

Gary
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Monday, May 06, 2013 7:12 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Need help with mass updates

If you want to see a complete example of using xdmp:spawn for mass updates, take a look at https://github.com/mblakele/task-rebalancer

That code is somewhat specialized for rebalancing forests, but much of the design would be the same for any mass update task. In some ways the task server is better than CoRB for this, because the work can be done closer to each forest.

I've put some work into a more general-purpose version, basically a rewrite of CoRB using xdmp:spawn. But it isn't quite ready yet.

-- Mike

On 6 May 2013, at 13:53, Danny Sokolsky <[email protected]> wrote:

> You can also use xdmp:spawn to update a batch at a time. You would then need two modules, the xdmp:spawn module, which typically would have an external variable that you would use to pass in the URLs to process, and another module that figures out the batches and then passes them off to the spawn module.
>
> -Danny
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Brent Hartwig
> Sent: Monday, May 06, 2013 1:46 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Need help with mass updates
>
> Hi, Gary,
>
> When this is all in one transaction, it doesn't matter how you break it up. CORB is built for this purpose. You provide two queries. One selects the documents to process. The other processes the documents, one at a time. Each document is processed in a transaction of its own.
>
> For the first query, it's good to come up with a way to only select unprocessed documents, unless you wish to reprocess all. This allows for the process to be interrupted but pick up where it left off, later.
>
> CORB is a Java program. You get to configure the number of threads.
>
> I couldn't say if there's now a standard feature that supersedes CORB.
>
> -Brent
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Gary Larsen
> Sent: Monday, May 06, 2013 4:36 PM
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Need help with mass updates
>
> Hi,
>
> I have a query to update documents, but when there are many I get the dreaded XDMP-EXPNTREECACHEFULL error. I've had luck avoiding this error when returning large result sets by processing the docs in segments [$start to $end], but it does not seem to help with the updates.
>
> Is there a trick to performing mass updates? Any advice would be appreciated.
>
> xquery version "1.0-ml";
> declare default element namespace
>   'http://developer.envisn.com/xmlns/envisn/netvisn/';
>
> let $cq := cts:collection-query('audit_history')
>
> let $incr := 100
> let $size := xdmp:estimate(cts:search(doc(), $cq, 'unfiltered'))
> let $segs := ceiling($size div $incr)
> return
>
> for $x in (1 to $segs)
> let $start := (($x -1) * $incr) +1
> let $end := $start + $incr -1
>
> for $d in cts:search(doc(), $cq, 'unfiltered')[$start to $end]
> let $lk := $d/auditHistory/lookupInfo
> let $loc := element auditParentDisplayPath {$lk/parentDisplayPath/text() },
>     $name := element auditDefaultName {$lk/defaultName/text() },
>     $class := element auditObjectClass {$lk/objectClass/text() }
> return
>
> (xdmp:node-replace($lk/parentDisplayPath, $loc),
>  xdmp:node-replace($lk/defaultName, $name),
>  xdmp:node-replace($lk/objectClass, $class),
>
>  for $u in $d/auditHistory//Action/user
>  let $uname := element auditUserName {$u/username/text() }
>  return xdmp:node-replace($u/username, $uname)
> )
>
> Thanks,
>
> Gary Larsen
> Envisn Inc.
> 508-259-6465
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
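
For anyone reading this thread later, here is a minimal sketch of the two-module xdmp:spawn approach Danny describes, applied to Gary's update. It is not taken from task-rebalancer or CoRB, and it has not been run against Gary's data. Assumptions: the URI lexicon is enabled (cts:uris needs it), the worker is installed at the hypothetical path /update-batch.xqy under the modules root, the external variable name URIS is made up for this example, and no document URI contains a newline (a batch is passed as one newline-delimited string).

The driver only lists URIs and queues tasks, so it never loads the documents themselves:

```xquery
xquery version "1.0-ml";

(: Driver (sketch): split the collection's URIs into batches and
   spawn one task per batch. Requires the URI lexicon. :)
let $batch-size := 100
let $uris := cts:uris((), (), cts:collection-query('audit_history'))
let $batches := xs:integer(ceiling(count($uris) div $batch-size))
for $b in 1 to $batches
let $batch := subsequence($uris, ($b - 1) * $batch-size + 1, $batch-size)
return
  (: /update-batch.xqy and URIS are hypothetical names :)
  xdmp:spawn(
    '/update-batch.xqy',
    (xs:QName('URIS'), string-join($batch, '&#10;')))
```

The worker applies the same node replacements as the original query, but only to the documents in its batch, and each spawned task runs as its own transaction:

```xquery
xquery version "1.0-ml";
declare default element namespace
  'http://developer.envisn.com/xmlns/envisn/netvisn/';

(: Worker (sketch), installed as /update-batch.xqy.
   URIS is a newline-delimited list of document URIs. :)
declare variable $URIS as xs:string external;

for $uri in tokenize($URIS, '&#10;')
let $d := doc($uri)
let $lk := $d/auditHistory/lookupInfo
return (
  xdmp:node-replace($lk/parentDisplayPath,
    element auditParentDisplayPath { $lk/parentDisplayPath/text() }),
  xdmp:node-replace($lk/defaultName,
    element auditDefaultName { $lk/defaultName/text() }),
  xdmp:node-replace($lk/objectClass,
    element auditObjectClass { $lk/objectClass/text() }),
  for $u in $d/auditHistory//Action/user
  return
    xdmp:node-replace($u/username,
      element auditUserName { $u/username/text() })
)
```

Following Brent's advice, the driver query could be narrowed so that it selects only documents that have not been updated yet, which would let an interrupted run pick up where it left off.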
