Thanks Mike.  Looking at your code was very helpful.

Gary

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Monday, May 06, 2013 7:12 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Need help with mass updates

If you want to see a complete example of using xdmp:spawn for mass updates,
take a look at https://github.com/mblakele/task-rebalancer

That code is somewhat specialized for rebalancing forests, but much of the
design would be the same for any mass update task. In some ways the task
server is better than CoRB for this, because the work can be done closer to
each forest.

I've put some work into a more general-purpose version, basically a rewrite
of CoRB using xdmp:spawn. But it isn't quite ready yet.

-- Mike

On 6 May 2013, at 13:53, Danny Sokolsky <[email protected]> wrote:

> You can also use xdmp:spawn to update a batch at a time. You would then
> need two modules: the module that gets spawned, which typically has an
> external variable that you use to pass in the URIs to process, and
> another module that figures out the batches and passes them off to the
> spawned module.
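A minimal sketch of the two-module arrangement described above, not a tested implementation. The module path /batch-update.xqy, the variable name $URIS, the newline-delimited encoding of each batch, and the batch size are assumptions made for illustration. The driver also assumes the URI lexicon is enabled, which cts:uris requires. Only one of the element renames from the original query is shown.

```xquery
xquery version "1.0-ml";

(: Driver module: splits the matching URIs into batches and spawns one
   Task Server task per batch. /batch-update.xqy is a hypothetical path. :)

declare variable $BATCH-SIZE as xs:integer := 100;

let $uris := cts:uris((), (), cts:collection-query('audit_history'))
let $count := count($uris)
for $i in 0 to ($count - 1) idiv $BATCH-SIZE
let $batch := subsequence($uris, $i * $BATCH-SIZE + 1, $BATCH-SIZE)
where exists($batch)
return
  xdmp:spawn(
    '/batch-update.xqy',
    (: External variables are passed as alternating QName/value pairs.
       The batch is flattened to one newline-delimited string. :)
    (xs:QName('URIS'), string-join($batch, '&#10;')))
```

```xquery
xquery version "1.0-ml";

(: /batch-update.xqy (hypothetical): updates one batch of documents.
   Each spawned task runs as its own transaction. :)

declare default element namespace
  'http://developer.envisn.com/xmlns/envisn/netvisn/';

declare variable $URIS as xs:string external;

for $uri in tokenize($URIS, '&#10;')
for $old in doc($uri)/auditHistory/lookupInfo/defaultName
return
  xdmp:node-replace($old, element auditDefaultName { $old/text() })
```

Because each batch commits separately, no single transaction has to hold every document at once, which is the point of splitting the work this way.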
>  
> -Danny
>  
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Brent 
> Hartwig
> Sent: Monday, May 06, 2013 1:46 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Need help with mass updates
>  
> Hi, Gary,
>  
> When this is all in one transaction, it doesn't matter how you break it
> up.  CORB is built for this purpose.  You provide two queries.  One selects
> the documents to process.  The other processes the documents, one at a time.
> Each document is processed in a transaction of its own.
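As an illustration of the second (per-document) query, here is a hedged sketch that applies the element renames from the original query to a single document. It assumes CORB supplies the document URI through an external variable named $URI; confirm the variable name against the README of the CORB version you use.

```xquery
xquery version "1.0-ml";

(: Per-document module: invoked once per URI, so each document is
   updated in its own transaction. $URI is assumed to be set by CORB. :)

declare default element namespace
  'http://developer.envisn.com/xmlns/envisn/netvisn/';

declare variable $URI as xs:string external;

let $history := doc($URI)/auditHistory
let $lk := $history/lookupInfo
return (
  for $old in $lk/parentDisplayPath
  return xdmp:node-replace(
    $old, element auditParentDisplayPath { $old/text() }),

  for $old in $lk/defaultName
  return xdmp:node-replace(
    $old, element auditDefaultName { $old/text() }),

  for $old in $lk/objectClass
  return xdmp:node-replace(
    $old, element auditObjectClass { $old/text() }),

  for $old in $history//Action/user/username
  return xdmp:node-replace(
    $old, element auditUserName { $old/text() })
)
```

Iterating with "for $old in ..." means a document that lacks one of the elements is simply skipped for that rename rather than raising an error.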
>  
> For the first query, it's good to come up with a way to select only
> unprocessed documents, unless you wish to reprocess all.  This allows the
> process to be interrupted and to pick up later where it left off.
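One possible way to select only unprocessed documents, sketched under assumptions: a document counts as unprocessed if it still contains a defaultName element in the original query's namespace (processed documents would contain auditDefaultName instead), and the URI lexicon is enabled. Returning the count followed by the URIs is the convention I understand CORB's selection query to use; verify that against your CORB version.

```xquery
xquery version "1.0-ml";

(: Selection module: returns only documents that still carry the old
   element name, so an interrupted run can resume where it stopped. :)

declare variable $NS as xs:string :=
  'http://developer.envisn.com/xmlns/envisn/netvisn/';

let $unprocessed :=
  cts:and-query((
    cts:collection-query('audit_history'),
    (: cts:and-query(()) matches anything, so this tests only that a
       defaultName element exists somewhere in the document. :)
    cts:element-query(fn:QName($NS, 'defaultName'), cts:and-query(()))
  ))
let $uris := cts:uris((), (), $unprocessed)
return (count($uris), $uris)
```

If defaultName can legitimately appear elsewhere in a processed document, the existence test would need a more specific marker, such as a collection added by the update module.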
>  
> CORB is a Java program.  You get to configure the number of threads.
>  
> I couldn't say if there's now a standard feature that supersedes CORB.
>  
> -Brent
>  
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Gary 
> Larsen
> Sent: Monday, May 06, 2013 4:36 PM
> To: General MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Need help with mass updates
>  
> Hi,
>  
> I have a query to update documents, but when there are many I get the
> dreaded XDMP-EXPNTREECACHEFULL error.  I've had luck avoiding this error
> when returning large result sets by processing the docs in segments [$start
> to $end], but it does not seem to help with the updates.
>  
> Is there a trick to performing mass updates?  Any advice would be
> appreciated.
>  
> xquery version "1.0-ml";
> declare default element namespace 
> 'http://developer.envisn.com/xmlns/envisn/netvisn/';
> 
> let $cq := cts:collection-query('audit_history')
> 
> let $incr := 100
> let $size := xdmp:estimate(cts:search(doc(), $cq, 'unfiltered'))
> let $segs := ceiling($size div $incr)
> return
>     
> for $x in (1 to $segs)
>      let $start :=  (($x -1) * $incr) +1 
>      let $end := $start + $incr -1
>  
>      for $d in cts:search(doc(), $cq, 'unfiltered')[$start to $end]
>          let $lk := $d/auditHistory/lookupInfo
>          let $loc := element auditParentDisplayPath {$lk/parentDisplayPath/text()},
>              $name := element auditDefaultName {$lk/defaultName/text()},
>              $class := element auditObjectClass {$lk/objectClass/text()}
>          return
>      
>          (xdmp:node-replace($lk/parentDisplayPath, $loc),
>           xdmp:node-replace($lk/defaultName, $name),
>           xdmp:node-replace($lk/objectClass, $class),
>       
>           for $u in $d/auditHistory//Action/user
>             let $uname := element auditUserName {$u/username/text()}
>             return
>             xdmp:node-replace($u/username, $uname)
>           )
> 
> Thanks,
>  
> Gary Larsen
> Envisn Inc.
> 508-259-6465
>  
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general

