With 5.0, xdmp:spawn gained a 'priority' option. I am not sure exactly how it 
behaves, but it might suit your needs. The options are documented under 
xdmp:eval:

  http://developer.marklogic.com/pubs/5.0/apidocs/Ext-6.html#xdmp:eval
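
As a rough, untested sketch of what that might look like: the module path and 
the external variable name below are made up for illustration, and the only 
priority values I know of in the 5.0 documentation are "normal" (the default) 
and "higher".

```xquery
xquery version "1.0-ml";

(: Hypothetical module path and variable; substitute your own.
   The spawned module would need: declare variable $uri external; :)
xdmp:spawn(
  "/tasks/process-upload.xqy",
  (xs:QName("uri"), "/uploads/example.xml"),
  <options xmlns="xdmp:eval">
    <!-- Assumed values: "normal" (default) or "higher" -->
    <priority>higher</priority>
  </options>
)
```

The idea would be to spawn user-upload tasks with "higher" and leave the feed 
and producer tasks at the default.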

The only way to have multiple task servers is to have multiple hosts. If you 
put them in separate groups, you can also give them different thread counts and 
queue limits.
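
If you go the separate-groups route, the per-group task server settings can 
probably be scripted with the Admin API. This is an untested sketch: the group 
name "background" and the numbers are placeholders, and a thread-count change 
may trigger a restart of the hosts in that group.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "background" is a hypothetical group holding the feed-processing hosts. :)
let $config := admin:get-configuration()
let $group  := admin:group-get-id($config, "background")
(: Placeholder values: fewer threads and a smaller queue for background work. :)
let $config := admin:taskserver-set-threads($config, $group, 4)
let $config := admin:taskserver-set-queue-size($config, $group, 20000)
return admin:save-configuration($config)
```

Since xdmp:spawn queues the task on the host that evaluates the request, you 
would then send the feed and producer traffic to the hosts in that group.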

-- Mike

On 23 Jan 2012, at 12:22 , Josh Warner-Burke wrote:

> So I'm writing to get advice or technical pointers on managing the priority 
> of task queue items, basically.  Some background on the challenge we're 
> having with our system:
> 
> We have a MarkLogic 4.2-5 system holding a fair amount of data that has been 
> processed from a 'raw' format into our own enriched 'processed' format. It 
> lets users upload documents (which are then processed), processes automated 
> feeds every day, and lets other data producers upload directly to it.  
> Each uploaded file creates some work for the task server as it runs through 
> our custom code.
> 
> What we've found is that the system never has a problem with all the 
> user-uploaded data.  Even with a multiple file uploader, people just can't 
> really bog down the system.  However, now that we're processing feeds and 
> allowing data producers to upload data directly to us, we've created the 
> possibility that users will encounter an unresponsive system because of the 
> automated feeds / producers.
> 
> What I am wondering is which, if any, of these strategies (or one I haven't 
> listed) can best deal with this:
> 
> 1. Is it possible to have two task servers?  For us, if we had one task 
> server on user stuff and one task server for background stuff, that would 
> neatly solve our problem.
> 
> 2. Is it possible to assign a higher priority to either a pipeline or a task? 
>  So that we could prioritize tasks related to user uploaded documents and 
> de-prioritize our background stuff.
> 
> 3. Is my only option to set the maximum tasks for the task server to a low 
> enough value that we never create a huge backlog?  And then deal, on some 
> schedule, with the error states that result from the 'too many tasks' error 
> message?  This seems undesirable.  To give some sense of the numbers we're 
> working with, our task server max is set at 200,000 - and if it were to ever 
> get this high, it would take probably 2 or 3 days to clear itself out.  We 
> can tell users their data will not always be processed right away, but the 
> expectation is it should never take more than maybe 15 - 30 minutes I think.
> 
> 
> In general, am I looking at this the wrong way?  If there is no MarkLogic way 
> to deal with this, I can basically 'throttle' our input from feeds or from 
> API access.  But I'd prefer not to have to get into that if I can help it.  
> Thanks in advance for any help.
> 
> -- 
> Josh Warner-Burke
> 
> 42SIX Solutions
> (e): jwbu...@42six.com
> 
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general
