Using the `timeout` command sounds simple and effective.
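
A minimal sketch of such a wrapper, assuming GNU coreutils `timeout`. The helper name `run_with_limit` is hypothetical, and `sleep` stands in for whatever java command line your script already uses to start corb. GNU `timeout` exits with status 124 when it has to stop the command, so the script can tell a timed-out run from a normal finish.

```shell
#!/bin/sh
# run_with_limit LIMIT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under GNU timeout and reports
# whether the time limit was hit. LIMIT is in seconds, or uses a
# suffix that timeout understands, such as 3h.
run_with_limit() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # 124 is the status GNU timeout returns when the limit expires.
        echo "time limit of $limit reached; job stopped" >&2
    fi
    return "$status"
}

# Stand-in for the real corb invocation: a 5-second job with a
# 1-second limit. Replace "sleep 5" with your own java command
# and "1" with something like "3h".
run_with_limit 1 sleep 5 || echo "exit status: $?"
```

As Dave notes below, the transactions corb has already completed are not rolled back when the process is stopped this way; the URIs still in the queue are simply never processed.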

Another approach: write your corb task module to check for a server field and 
test its dateTime value against current-dateTime. If the job has timed out 
according to this check, throw an error. Before starting corb, arrange to set 
that server field in the Task Server. It's important to set the server field on 
all hosts, and to set it in the Task Server, not in some other app server. When 
the deadline is reached and the errors start to come back, corb should stop.

Parenthetically, when I wrote corb I was feeling frustrated by all the extra 
options in RecordLoader and XQSync - so corb took a minimalist approach. 
Arguably that doesn't give users enough flexibility.

With https://github.com/mblakele/taskbot I went in the opposite direction. 
Instead of an external tool with some fixed number of options, it's an XQuery 
library with an API. This gives users quite a bit of flexibility. The API may 
take some getting used to, but I think it's worth the effort.

To the extent that taskbot can replace corb, this time-limited job could be 
done inside taskbot. One way would be to start the batch job, then schedule a 
one-time task to call `tb:fatal-set(true())` three hours later. If the job is 
still running, all the tasks will stop as soon as they check `tb:maybe-fatal`. 
The library makes that check itself, and user task code can also call it.

Note that taskbot's "fatal" flag is a server field, so it's per-appserver and 
per-host. It affects all taskbot jobs on the Task Server, so you'll have to 
remember to unset it before starting the next job.

Another approach would be close to what I suggested for corb, above. Pass the 
deadline as a dateTime into `tb:list-segment-process` or 
`tb:forests-uris-process`. You'd probably want a copy of the deadline dateTime in 
both the fn-options and eval-options. Then write the user segment function and 
user task code to check for it. Each would check `fn:current-dateTime` against 
the deadline, and throw an error if needed. This is similar to 
`tb:maybe-fatal`, but checking a dateTime instead of a boolean server field.

-- Mike

> On 4 Nov 2014, at 06:40, Dave Cassel <[email protected]> wrote:
> 
> What you describe is a 4 hour process from the shell script's perspective, 
> but from MarkLogic's perspective, it's a series of independent requests. That 
> means this is something you'll need to do outside of MarkLogic. 
> 
> Looks like the Linux timeout command will do what you need. 
> 
> -- 
> Dave Cassel
> Developer Community Manager
> MarkLogic Corporation
> Cell:  +1-484-798-8720
> 
> 
> From: Rashmi Ranjan Acharya <[email protected]>
> Reply-To: MarkLogic Developer Discussion <[email protected]>
> Date: Tuesday, November 4, 2014 at 9:31 AM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] Executing CORB with a time limit
> 
>> Dave,
>> 
>> Thank you for quick reply!!!
>> 
>> In this case, CORB is dealing with a huge number of document URIs, whereas 
>> the module transaction time is very small. 
>> 
>> The requirement is to terminate the process if it takes more than a 
>> particular time, and we are running this using a shell script. Suppose the 
>> job takes 4 hours to process all URIs but we have a limit of 3 hours, and we 
>> don't want to roll back the operation for the processed URIs. Whatever URIs 
>> are not processed but still in the queue, we are fine to skip. 
>> 
>> Please suggest a way to implement this scenario. 
>> 
>> Regards,
>> 
>> Rashmi Ranjan Acharya
>> TATA Consultancy Services,
>> Kolkata, India,
>> Cell: +91-9874844188
>> 
>> 
>> Sent from my iPhone
>> 
>> On 04-Nov-2014, at 7:03 pm, Dave Cassel <[email protected]> wrote:
>> 
>>> Rashmi,
>>> 
>>> Does this mean that if the job does not finish within a specified period, 
>>> it should be terminated? 
>>> 
>>> It's useful to remember that Corb is not doing one big transaction; it is 
>>> serializing a bunch of small transactions to accomplish some goal. With 
>>> that in mind, if the overall Corb process were terminated because it took 
>>> too long, it would not roll back the transactions that it had completed. 
>>> 
>>> Corb runs a particular module on a series of URIs; if your goal is to make 
>>> sure that the module doesn't spend too much time on one particular URI, you 
>>> can lower the default time limit of the XDBC app server. Suppose you set 
>>> that to 30 seconds; then any URI that required more than 30 seconds of 
>>> processing would fail, thus achieving an upper limit on a per-document 
>>> basis. Note that you'd be left with some documents successfully updated 
>>> (those that took < 30 seconds) and some that failed (those that took > 30 
>>> seconds). 
>>> 
>>> Dave.
>>> 
>>> -- 
>>> Dave Cassel
>>> Developer Community Manager
>>> MarkLogic Corporation
>>> Cell:  +1-484-798-8720
>>> 
>>> 
>>> From: Rashmi Ranjan Acharya <[email protected]>
>>> Reply-To: MarkLogic Developer Discussion <[email protected]>
>>> Date: Tuesday, November 4, 2014 at 6:13 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: [MarkLogic Dev General] Executing CORB with a time limit
>>> 
>>>> Hi All,
>>>> 
>>>> Is there any way to run a CORB with a particular time limit?
>>>> 
>>>> I am executing CORB through a shell script with a Java classpath.
>>>> 
>>>> Thanks in advance!!!
>>>> 
>>>> Rashmi Ranjan Acharya
>>>> TATA Consultancy Services,
>>>> Kolkata, India
>>>> Cell:+91-9874844188
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
>>>> 
