MarkLogic 7 includes support for anonymous functions, plus a new builtin called
xdmp:spawn-function. I've been using these to put the Task Manager to work, and
https://github.com/mblakele/taskbot is the result.

Taskbot is basically a map-reduce utility. Start with a list of items:
document URIs, or anything else. Taskbot splits the list into segments of a
size you specify, and spawns a task for each segment. You provide an anonymous
function that processes each segment. The Task Manager queue and thread pool
manage the work, providing as much data-driven parallelism as the configuration
and the workload allow.
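
To make the segment-and-spawn idea concrete, here is a minimal sketch that
calls xdmp:spawn-function directly. It is not taskbot's implementation: the
helper local:spawn-segments is a hypothetical name made up for illustration,
and it leaves out the labels, options, and safety checks that taskbot adds.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper, not part of taskbot.
 : Splits $list into segments of at most $size items
 : and spawns one task per segment.
 :)
declare function local:spawn-segments(
  $list as item()*,
  $size as xs:integer,
  $fn as function(item()+) as item()*)
as item()*
{
  (: Number of segments, rounding up. An empty list yields no tasks. :)
  for $i in 0 to (count($list) + $size - 1) idiv $size - 1
  let $segment := subsequence($list, 1 + $i * $size, $size)
  (: The zero-argument closure captures $fn and this segment. :)
  return xdmp:spawn-function(function() { $fn($segment) })
};

(: Example: each task logs the size of its own segment. :)
local:spawn-segments(
  1 to 10,
  4,
  function($segment as item()+) { xdmp:log(count($segment)) })
```

With ten items and a segment size of four, this queues three tasks: two with
four items each and one with the remaining two.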

If the anonymous function updates the database, your work is done. If your
function returns results, supply $tb:OPTIONS-SYNC and reduce the results
however you like.
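
As a rough picture of the synchronous case, the sketch below uses
xdmp:spawn-function directly with the result option, rather than taskbot and
$tb:OPTIONS-SYNC. The list, the segment size, and the choice of sum() for both
the map and reduce steps are assumptions picked to keep the example small.

```xquery
xquery version "1.0-ml";

(: Sketch only: calls xdmp:spawn-function directly, not taskbot. :)
let $list := 1 to 100
let $size := 25
(: Ask each spawned task to hand its result back to the caller. :)
let $options :=
  <options xmlns="xdmp:eval">
    <result>true</result>
  </options>
let $results :=
  for $i in 0 to (count($list) + $size - 1) idiv $size - 1
  let $segment := subsequence($list, 1 + $i * $size, $size)
  (: Map step: each task sums its own segment. :)
  return xdmp:spawn-function(function() { sum($segment) }, $options)
(: Reduce step: using the results waits for the tasks to finish. :)
return sum($results)
```

The reduce step here is another sum(), so the query should return the same
total as sum($list), but any function over the collected results would do.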

All that might sound a little too abstract, so here's a quick example.
Inserting 1M documents in a single transaction can be painful, but it's easy
with tasks of 500 documents each.

(: This inserts 1M simple test documents,
 : in segments of 500 documents each.
 : Extend as needed.
 :)
tb:list-segment-process(
  (: Total size of the job. :)
  1 to 1000 * 1000,
  (: Size of each segment of work. :)
  500,
  "test/",
  (: This anonymous function will be called for each segment. :)
  function($list as item()+, $opts as map:map?) {
    (: Any chainsaw should have a safety. Check it here. :)
    tb:maybe-fatal(),
    for $i in $list return xdmp:document-insert(
      "test/"||$i,
      element test { attribute id { 'test-'||$i }, $i }),
    (: This is an update, so be sure to commit. :)
    xdmp:commit() },
  (: options - not used in this example. :)
  map:new(map:entry('testing', '123...')),
  (: This is an update, so be sure to say so. :)
  $tb:OPTIONS-UPDATE)

There are more examples in the README at https://github.com/mblakele/taskbot,
plus xray test cases.

I hope it's useful.

-- Mike