Hi Andreas,

Interesting slides, good find!

If you are talking about more ad hoc processing, you could look into tools like https://github.com/mblakele/taskbot and https://github.com/marklogic/corb2. These tools batch up the work very well, but they won’t spread load across a cluster automatically. You could, however, try to split the load somehow and run multiple instances in parallel, each against a different host. That works best if each instance targets the host that actually holds the data you want to touch, which is difficult to arrange. MLCP does that with its -fastload option. Would the MLCP copy feature with a transform perhaps work?

MarkLogic also provides Hadoop integration, so maybe that is also worth looking at?

Cheers,
Geert

From: [email protected] on behalf of Andreas Hubmer <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Thursday, July 30, 2015 at 8:56 AM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Distributing Tasks

Hi Geert,

Thanks for the update. Triggers and the CPF aren't exactly what I'm looking for. What I want to do is distribute one-time tasks, such as adding new elements to all existing documents.

I've found some slides <http://developer.marklogic.com/media/mlw12/Distributed-Content-Processing-in-MarkLogic.pdf> from a ML consultant on "Distributed Content Processing in MarkLogic", but the code builds on ML 4.

Probably I'll create a lightweight library myself, using either one-time scheduled tasks or an HTTP server to distribute the tasks.

Regards,
Andreas

2015-07-29 17:56 GMT+02:00 Geert Josten <[email protected]>:

Hi Andreas,

I haven’t heard about anything in this direction recently, but FWIW I added a +1 to the RFE. Could post-commit triggers or CPF help out in some way?
They should run on the host that holds the forest containing the document at hand, from what I have heard.

Cheers,
Geert

From: [email protected] on behalf of Andreas Hubmer <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Tuesday, July 28, 2015 at 5:20 PM
To: MarkLogic Developer Discussion <[email protected]>
Subject: [MarkLogic Dev General] Distributing Tasks

Hello,

In this Knowledgebase article <https://help.marklogic.com/knowledgebase/article/View/112/0/techniques-for-dividing-tasks-between-hosts-in-a-cluster> there is talk about an RFE (2763) that would make it possible to pass options into xdmp:spawn() to allow the execution of code on a specific host in a cluster.

Are there still any plans for this feature?

Thanks and cheers,
Andreas

--
Andreas Hubmer
IT Consultant

_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general

--
Andreas Hubmer
IT Consultant

EBCONT enterprise technologies GmbH
Millennium Tower
Handelskai 94-96
A-1200 Vienna

Mobile: +43 664 60651861
Fax: +43 2772 512 69-9
Email: [email protected]
Web: http://www.ebcont.com

OUR TEAM IS YOUR SUCCESS

UID-Nr. ATU68135644
HG St.Pölten - FN 399978 d
