> On Mar 8, 2022, at 3:35 PM, Ole Tange <o...@tange.dk> wrote:
> 
> On Tue, Mar 8, 2022 at 11:22 PM Anderson, Stuart B. <s...@caltech.edu> wrote:
>> 
>> parallel version 20190922 from EPEL 8 running on a Rocky Linux 8 system
>> occasionally gets into an I/O spin loop writing 8193 bytes of "x" to a
>> deleted TMPDIR file and then immediately truncating it, e.g.,
> 
> This is by design:
> 
> https://www.gnu.org/software/parallel/parallel_design.html#disk-full

Any reason not to call statvfs() to see if there is free disk space?
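
Something like this (Python sketch; the threshold and function name are
mine) would report free space without issuing any writes at all, though I
realize statvfs numbers can be stale or misleading on some network
filesystems, which may be why an actual write is used:

```python
import os

def tmpdir_has_room(path, min_free=8192):
    """Check free space via statvfs instead of a write+truncate probe.
    f_bavail is the block count available to unprivileged users and
    f_frsize the fragment (block) size."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize >= min_free
```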

> GNU Parallel does an exponential back off:
> 
> https://www.gnu.org/software/parallel/parallel_design.html#exponentially-back-off

There seems to be a problem with this for some long-running jobs,
perhaps limited to nested parallel (see below).
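
For reference, my understanding of the intended back-off pattern (a Python
sketch, not parallel's Perl; the ~1 second cap for long jobs is taken from
your description below):

```python
import time

class BackedOffCheck:
    """Run an expensive probe at most once per interval, doubling the
    interval after each probe up to a cap, so long-running jobs settle
    at roughly one probe per second."""
    def __init__(self, probe, max_interval=1.0):
        self.probe = probe
        self.interval = 0.0        # doubles after each actual probe
        self.next_due = 0.0        # monotonic time of next allowed probe
        self.max_interval = max_interval

    def maybe_check(self):
        now = time.monotonic()
        if now < self.next_due:
            return None            # skipped: probed too recently
        self.interval = min(self.interval * 2 or 0.001, self.max_interval)
        self.next_due = now + self.interval
        return self.probe()
```

The behavior I see below looks as if this throttle is not engaging: the
probe runs on every pass through the main loop.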

> 
> So for long running jobs, this test is run once per second. For short
> jobs (e.g. parallel true ::: {1..1000}) it will be run much more
> often.
> 
> I find it unlikely that the CPU usage is solely caused by the checking
> of disk full. You can verify this by changing:
> 
>    sub exit_if_disk_full() {
>        return;
>        ...
>   }
> 
> This will, however, remove the protection that checks if $TMPDIR runs
> full during a run (and in this case output may be wrong).
> 
> I ran strace -tt:
> 
> [pid 1000762] 23:54:08.486547 wait4(-1, [{WIFEXITED(s) &&
> WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 1000775
> [pid 1000762] 23:54:08.486782 write(7,
> "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"..., 8192) = 8192
> [pid 1000762] 23:54:08.486875 write(7, "b", 1) = 1
> [pid 1000762] 23:54:08.486956 ftruncate(7, 0) = 0
> 
> so the write+truncate+seek takes at most 406 microseconds. When run on
> my system `vmstat 1` shows that the write never reaches the disk,
> which is the goal.
> 
>> Any thoughts on how to avoid this seemingly useless spin loop with high CPU 
>> and I/O resource consumption?
> 
> Can you show that there really is an I/O consumption? Do the writes
> actually reach your disk?

I can confirm this is not throttled to once per second and leads to maxing
out the I/O on a fast NVMe device with very high CPU utilization. The
following strace excerpt shows the probe firing roughly twice per millisecond.

[root@zfs1 ~]# ps auxww | grep 836133
root      836133 98.2  0.0  55488 17672 pts/6    R+   Mar08 1277:14 
/usr/bin/perl /usr/bin/parallel --delay 2 -P /tmp/parallel.zfs.backup3 parallel 
-I// syncoid {} // ::: $LIST

[root@zfs1 ~]# top
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND 
 741105 root      20   0   55508  17676   4388 R  98.7   0.0   1276:10 parallel
 836133 root      20   0   55488  17672   4388 R  98.4   0.0   1278:08 parallel

[root@zfs1 ~]# strace -tt -p 836133 |& head -20
strace: Process 836133 attached
10:17:53.974396 write(10, "x", 1)       = 1
10:17:53.974599 ftruncate(10, 0)        = 0
10:17:53.974620 lseek(10, 0, SEEK_SET)  = 0
10:17:53.974637 lseek(10, 0, SEEK_CUR)  = 0
10:17:53.974654 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:17:53.974679 rt_sigaction(SIGCHLD, {sa_handler=0x7fe81f000dc0, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x7fe81ed29c20}, {sa_handler=SIG_DFL, 
sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fe81ed29c20}, 8) = 0
10:17:53.974701 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:17:53.974722 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
10:17:53.974746 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:17:53.974763 rt_sigaction(SIGCHLD, {sa_handler=SIG_DFL, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x7fe81ed29c20}, {sa_handler=0x7fe81f000dc0, 
sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fe81ed29c20}, 8) = 0
10:17:53.974783 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:17:53.974810 wait4(-1, 0x7ffc539e8a14, WNOHANG, NULL) = 0
10:17:53.974829 write(10, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
10:17:53.974849 write(10, "x", 1)       = 1
10:17:53.974867 ftruncate(10, 0)        = 0
10:17:53.974884 lseek(10, 0, SEEK_SET)  = 0
10:17:53.974898 lseek(10, 0, SEEK_CUR)  = 0
10:17:53.974915 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:17:53.974932 rt_sigaction(SIGCHLD, {sa_handler=0x7fe81f000dc0, sa_mask=[], 
sa_flags=SA_RESTORER, sa_restorer=0x7fe81ed29c20}, {sa_handler=SIG_DFL, 
sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fe81ed29c20}, 8) = 0

Thanks.

--
Stuart Anderson
s...@caltech.edu



