Here is my current bash script (still using the timeout approach, for lack
of alternative suggestions):
timeout 600 julia -p $(nproc) juliacode.jl >>results.log 2>&1
killall -9 -v julia >>cleanup.log 2>&1
Does that seem reasonable? Can Linux experts think of any scenarios where
this would not be sufficient to clean up runaway or non-responding
processes?
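
One possible refinement, as a rough sketch only: GNU timeout accepts -k to follow up with SIGKILL if the command ignores the initial SIGTERM, and it exits with status 124 when the limit is hit, so the script can tell a timeout apart from a normal finish. The helper name run_limited and the 10-second grace period are my own placeholders, and restricting the cleanup to my own user with pkill is an assumption about what is wanted. Whether the signal from timeout actually reaches the worker processes should be checked on the target system, which is why the cleanup line stays.

```shell
#!/usr/bin/env bash
# Sketch: run a command under a time limit and report whether it timed out.

# run_limited LIMIT GRACE CMD...   (hypothetical helper name)
#   LIMIT: seconds before SIGTERM is sent
#   GRACE: extra seconds before SIGKILL is sent if CMD is still alive
run_limited() {
    local limit=$1 grace=$2
    shift 2
    timeout -k "$grace" "$limit" "$@"
    local status=$?
    # 124 means the limit was hit and the command exited after SIGTERM.
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s" >&2
    fi
    return "$status"
}

# Intended use (not executed here; the 10 s grace period is a placeholder):
#   run_limited 600 10 julia -p "$(nproc)" juliacode.jl >>results.log 2>&1
#   pkill -9 -u "$(id -u)" -x julia >>cleanup.log 2>&1
```

The pkill line plays the same role as killall above, but only matches processes named exactly julia that belong to the current user.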
On Thursday, April 2, 2015 at 12:15:33 PM UTC-7, Pavel wrote:
>
> What would be a good way to limit the total runtime of a multicore process
> managed by pmap?
>
> I have pmap processing a collection of optimization runs (with fminbox)
> and most of the time everything runs smoothly. On occasion, however, 1-2 out
> of e.g. 8 CPUs take too long to complete one optimization, and
> fminbox/conjugate gradient does not have a way to limit run time, as recently
> discussed:
>
> http://julia-programming-language.2336112.n4.nabble.com/fminbox-getting-quot-stuck-quot-td12163.html
>
> To deal with this in a crude way, at the moment I call Julia from a shell
> (bash) script with timeout:
>
> timeout 600 julia -p 8 juliacode.jl
>
> When doing this, is there anything to help find and stop zombie processes
> (if any) after timeout forces a multicore pmap run to terminate? Anything
> within Julia related to how the processes are spawned? Any alternatives to
> shell timeout? I know NLopt has a time limit option, but that is implemented
> in the underlying C library rather than in Julia itself.
>
>