Re: parallel bug: Warning: No more file handles. when one job is delayed (reproducible with a test case)
On Sun, 29 Oct 2017 00:26:15 +0200 Ole Tange wrote:

> On Sat, Oct 28, 2017 at 12:44 PM, Shlomi Fish wrote:
>
> > I see. So what you are saying is that parallel will work fine despite the
> > warning and will continue running?
>
> Yep. But if you run into this problem, it might be a better idea to
> remove -k and use something like --results instead.
>
> /Ole

Thanks, Ole! I'll investigate.

--
Shlomi Fish http://www.shlomifish.org/
http://www.shlomifish.org/humour/ways_to_do_it.html

If a million Shakespeares had to write together, they would write like a monkey.
— based on Stephen Wright, via Nadav Har’El.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
Re: parallel bug: Warning: No more file handles. when one job is delayed (reproducible with a test case)
On Sat, Oct 28, 2017 at 12:44 PM, Shlomi Fish wrote:

> I see. So what you are saying is that parallel will work fine despite the
> warning and will continue running?

Yep. But if you run into this problem, it might be a better idea to
remove -k and use something like --results instead.

/Ole
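
Ole's suggestion of `--results` in place of `-k` could look roughly like the sketch below. It is an illustration under assumptions, not something posted in the thread: the temporary directory `outdir`, the trivial `echo {}` job, and the guard that skips the run when GNU Parallel is not installed are all hypothetical. The point is only that each job's output is stored in a file named after its input, so the order can be recovered afterwards from the directory tree.

```shell
# Sketch only: store each job's output under a results directory instead of
# relying on -k. "outdir" and the echo job are hypothetical stand-ins.
outdir=$(mktemp -d)

# Only run when GNU Parallel (not another tool named "parallel") is available.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  seq 0 20 | parallel --results "$outdir" echo {} >/dev/null
  results_status=ran
  # With a single input source, stdout of the job for input 5 is stored here:
  cat "$outdir/1/5/stdout"
else
  results_status=skipped
fi
```

The stored files can then be read back in whatever order is needed, for example by looping over `seq 0 20` and printing `"$outdir/1/$i/stdout"` for each value.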
Re: parallel bug: Warning: No more file handles. when one job is delayed (reproducible with a test case)
Hi Ole,

On Sat, 28 Oct 2017 00:28:02 +0200 Ole Tange wrote:

> On Fri, Oct 27, 2017 at 9:23 AM, Shlomi Fish wrote:
>
> > Thanks for your work.
>
> Good to know it is appreciated.
>
> > Attached are two files to reproduce a bug I ran into with GNU parallel
> > including the latest one on Mageia v7 x86-64:
> >
> >     shlomif@telaviv1:~$ bash run-range.bash
> >     parallel: Warning: No more file handles.
> >     parallel: Warning: Raising ulimit -n or /etc/security/limits.conf may help.
> >     ^CCompleted!
> >
> > The run-single.bash script is delayed for n=1 and meanwhile other jobs
> > accumulate which may explain the problem. This problem caused me to lose one
> > night of uptime on an AWS instance because "parallel" got stuck, so I'd
> > appreciate an investigation and a fix.
>
> Your problem can be illustrated with:
>
>     seq 0 1000 | parallel -k -t sleep '{= $_ = $_ ? 0 : 10 =};echo {}'
>
> This will run 'sleep 10' followed by 1000 jobs of 'sleep 0'. -t causes
> the command to be printed as soon as it is started.
>
> Because of -k GNU Parallel must keep the order of the output. It does
> that by having open files to the temporary output files of jobs run.
> What happens here, is that before we can close any of the files, we
> will have to wait for the first job to complete. Because the other
> jobs are very fast to complete, then GNU Parallel runs out of file
> handles, and thus warns you:
>
>     parallel: Warning: No more file handles.
>     parallel: Warning: Raising ulimit -n or /etc/security/limits.conf may help.
>
> But it is just a warning: As soon as the first job completes, it
> completes the remaining jobs.
>
> > Also see
> > https://lists.gnu.org/archive/html/parallel/2017-07/msg6.html .
>
> If you use -k in that, then we have the explanation: GNU Parallel does
> not stop. It waits for one of the jobs to complete before it can close
> more filehandles.
>

I see. So what you are saying is that parallel will work fine despite the
warning and will continue running?

Since it still got stuck, then the problem is likely elsewhere.

Thanks!

>
> /Ole
>

--
Shlomi Fish http://www.shlomifish.org/
Parody of "The Fountainhead" - http://shlom.in/towtf

A kid always wishes they were older until they are 18. Afterwards, they always
wish they were younger.

Please reply to list if it's a mailing list post - http://shlom.in/reply .
Re: parallel bug: Warning: No more file handles. when one job is delayed (reproducible with a test case)
On Fri, Oct 27, 2017 at 9:23 AM, Shlomi Fish wrote:

> Thanks for your work.

Good to know it is appreciated.

> Attached are two files to reproduce a bug I ran into with GNU parallel
> including the latest one on Mageia v7 x86-64:
>
>     shlomif@telaviv1:~$ bash run-range.bash
>     parallel: Warning: No more file handles.
>     parallel: Warning: Raising ulimit -n or /etc/security/limits.conf may help.
>     ^CCompleted!
>
> The run-single.bash script is delayed for n=1 and meanwhile other jobs
> accumulate which may explain the problem. This problem caused me to lose one
> night of uptime on an AWS instance because "parallel" got stuck, so I'd
> appreciate an investigation and a fix.

Your problem can be illustrated with:

    seq 0 1000 | parallel -k -t sleep '{= $_ = $_ ? 0 : 10 =};echo {}'

This will run 'sleep 10' followed by 1000 jobs of 'sleep 0'. -t causes
the command to be printed as soon as it is started.

Because of -k GNU Parallel must keep the order of the output. It does
that by having open files to the temporary output files of jobs run.
What happens here, is that before we can close any of the files, we
will have to wait for the first job to complete. Because the other
jobs are very fast to complete, then GNU Parallel runs out of file
handles, and thus warns you:

    parallel: Warning: No more file handles.
    parallel: Warning: Raising ulimit -n or /etc/security/limits.conf may help.

But it is just a warning: As soon as the first job completes, it
completes the remaining jobs.

> Also see https://lists.gnu.org/archive/html/parallel/2017-07/msg6.html .

If you use -k in that, then we have the explanation: GNU Parallel does
not stop. It waits for one of the jobs to complete before it can close
more filehandles.

/Ole
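
The ordering constraint Ole describes can be seen in a toy model written in plain shell. This is not GNU Parallel's code, and it does not hold file handles open the way Parallel does; it only assumes that each job writes to its own temporary file and that output must be released in input order. The function `run_job`, the five jobs, and the one-second delay are invented for the illustration.

```shell
# Toy model of order-preserving output, NOT GNU Parallel's actual code.
# Each job writes to its own temporary file; output is released in input order.
tmpdir=$(mktemp -d)

run_job() {
  # Job 0 is the slow one, like 'sleep 10' in the example above (shortened here).
  if [ "$1" -eq 0 ]; then sleep 1; fi
  echo "job $1"
}

for i in 0 1 2 3 4; do
  run_job "$i" > "$tmpdir/$i.out" &
  echo "$!" > "$tmpdir/$i.pid"      # remember which process produces which file
done

ordered=""
for i in 0 1 2 3 4; do
  # Nothing can be printed or cleaned up until job 0 is done, even though
  # jobs 1..4 finished almost immediately.
  wait "$(cat "$tmpdir/$i.pid")"
  ordered="$ordered$(cat "$tmpdir/$i.out")
"
  rm -f "$tmpdir/$i.out" "$tmpdir/$i.pid"
done

printf '%s' "$ordered"
rm -rf "$tmpdir"
```

While job 0 sleeps, the output files of jobs 1 to 4 already exist but cannot be released. Scaled up to a thousand fast jobs behind one slow job, that backlog is what runs into the open-file limit in Ole's explanation.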
parallel bug: Warning: No more file handles. when one job is delayed (reproducible with a test case)
Hi all!

Thanks for your work.

Attached are two files to reproduce a bug I ran into with GNU parallel
including the latest one on Mageia v7 x86-64:

    shlomif@telaviv1:~$ bash run-range.bash
    parallel: Warning: No more file handles.
    parallel: Warning: Raising ulimit -n or /etc/security/limits.conf may help.
    ^CCompleted!

The run-single.bash script is delayed for n=1 and meanwhile other jobs
accumulate which may explain the problem. This problem caused me to lose one
night of uptime on an AWS instance because "parallel" got stuck, so I'd
appreciate an investigation and a fix.

Also see https://lists.gnu.org/archive/html/parallel/2017-07/msg6.html .

Regards,

Shlomi Fish

--
Shlomi Fish http://www.shlomifish.org/
My Aphorisms - http://www.shlomifish.org/humour.html

Chuck Norris has 99 problems including a bitch.
— http://www.shlomifish.org/humour/bits/facts/Chuck-Norris/

Please reply to list if it's a mailing list post - http://shlom.in/reply .

run-range.bash  Description: Binary data
run-single.bash  Description: Binary data
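
The warning in the report points at `ulimit -n`. A quick way to see what that limit is for the current shell, and whether it can be raised without privileges, might look like the sketch below. The variable names are made up for the example, and the attempt to raise the limit runs in a subshell so the calling shell stays unchanged. Whether a higher limit is enough depends on how many jobs pile up behind the delayed one.

```shell
# Inspect the per-process limit on open file descriptors for this shell.
soft_limit=$(ulimit -Sn)   # current (soft) limit, the one that is actually hit
hard_limit=$(ulimit -Hn)   # ceiling up to which the soft limit may be raised
echo "soft=$soft_limit hard=$hard_limit"

# Try raising the soft limit to the hard limit inside a subshell, so the
# calling shell is left unchanged. Some systems may refuse this.
raised_limit=$( (ulimit -Sn "$hard_limit" 2>/dev/null && ulimit -Sn) || echo "$soft_limit" )
echo "limit inside subshell: $raised_limit"
```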