Subject: /bin/run-parts: also gets stuck regularly (possible fix)
Package: debianutils
Version: 3.4
File: /bin/run-parts
Severity: normal

I'm also seeing this on a number of systems. The relevant process tree
is this:

root     27042  0.0  0.0   1940   728 ?        Ss   07:30   0:00 /usr/sbin/anacron -s
root     28325  0.0  0.0   1792   504 ?        S    07:35   0:00  \_ /bin/sh -c nice run-parts --report /etc/cron.daily
root     28326  0.0  0.0   1716   484 ?        SN   07:35   0:00      \_ run-parts --report /etc/cron.daily
root     28335  0.0  0.0      0     0 ?        ZN   07:35   0:00          \_ [apt] <defunct>

Doing an strace:

# strace -p 28326
Process 28326 attached - interrupt to quit
select(8, [4 7], NULL, NULL, NULL

File descriptors 4 and 7 are the read ends of two pipes (stdout and
stderr of the sub-command):

# lsof -p 28326
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
run-parts 28326 root  cwd    DIR  253,0     4096        2 /
run-parts 28326 root  rtd    DIR  253,0     4096        2 /
run-parts 28326 root  txt    REG  253,0    12112   557134 /bin/run-parts
run-parts 28326 root  mem    REG  253,0  1323460   510463 /lib/i686/cmov/libc-2.11.2.so
run-parts 28326 root  mem    REG  253,0   113964   508326 /lib/ld-2.11.2.so
run-parts 28326 root    0r   CHR    1,3      0t0      602 /dev/null
run-parts 28326 root    1u   REG  253,0       76   581655 /tmp/filesOEb8G (deleted)
run-parts 28326 root    2u   REG  253,0       76   581655 /tmp/filesOEb8G (deleted)
run-parts 28326 root    3r   DIR  253,0     4096        2 /
run-parts 28326 root    4r  FIFO    0,8      0t0 15051342 pipe
run-parts 28326 root    5u   REG  253,0        0   581652 /tmp/tmpf8q0vRX (deleted)
run-parts 28326 root    7r  FIFO    0,8      0t0 15051343 pipe

I'm seeing this regularly (about once a week) with /etc/cron.daily/apt
on different systems. I use unattended-upgrades, and in the cases where
run-parts hangs it seems that some updates were installed. Perhaps the
hang is a side effect of the file descriptor mangling that debconf
does?

Giving it some more thought, I think I figured out what causes this. If
unattended-upgrades restarts a service (on this system MySQL was
restarted), the write ends of the pipes are leaked to the daemon.
This is the relevant part of the MySQL process tree that was restarted:

root      2227  0.0  0.0   1792   536 ?        SN   08:00   0:00 /bin/sh /usr/bin/mysqld_safe
mysql     2338  0.1  1.8 133176 18792 ?        SNl  08:00   0:26  \_ /usr/sbin/mysqld --basedir=/usr --data
root      2339  0.0  0.0   3708   584 ?        SN   08:00   0:00  \_ logger -t mysqld -p daemon.error

As can be seen with lsof, the pipes that were created by run-parts
(nodes 15051342 and 15051343) are open for writing in the mysqld_safe
script (file descriptors 44 and 45):

# lsof -p 2227 
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
mysqld_sa 2227 root  cwd    DIR  253,0     4096        2 /
mysqld_sa 2227 root  rtd    DIR  253,0     4096        2 /
mysqld_sa 2227 root  txt    REG  253,0    84144   557072 /bin/dash
mysqld_sa 2227 root  mem    REG  253,0  1323460   510463 /lib/i686/cmov/libc-2.11.2.so
mysqld_sa 2227 root  mem    REG  253,0   113964   508326 /lib/ld-2.11.2.so
mysqld_sa 2227 root    0r   CHR    1,3      0t0      602 /dev/null
mysqld_sa 2227 root    1w   CHR    1,3      0t0      602 /dev/null
mysqld_sa 2227 root    2w   CHR    1,3      0t0      602 /dev/null
mysqld_sa 2227 root    3w  FIFO    0,8      0t0 15066405 pipe
mysqld_sa 2227 root   10r   REG  253,0    16894   280621 /usr/bin/mysqld_safe
mysqld_sa 2227 root   41r   REG  253,0   990701   222770 /var/lib/dpkg/status (deleted)
mysqld_sa 2227 root   42u   CHR    1,3      0t0      602 /dev/null
mysqld_sa 2227 root   43u   REG  253,0     1869   205572 /var/log/unattended-upgrades/unattended-upgrades-dpkg_2010-11-08_07:59:40.806298.log
mysqld_sa 2227 root   44w  FIFO    0,8      0t0 15051342 pipe
mysqld_sa 2227 root   45w  FIFO    0,8      0t0 15051343 pipe
mysqld_sa 2227 root   46w   REG  253,0     1354   205554 /var/log/apt/term.log
mysqld_sa 2227 root   47w   REG  253,0     2224   205556 /var/log/apt/history.log

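As long as any process holds a write end open, the read ends never see
end-of-file, so the select() in run-parts blocks even though the child
is already a zombie. Perhaps run-parts should be modified to stop
reading and close the pipes when the child exits (instead of waiting
until all write ends of the pipes are closed).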

I think this will require installing a signal handler that sets a flag
that is tested in the select() loop. If you like I can prepare a patch
for this.
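
As a rough sketch of what I have in mind (this is not the actual
run-parts code; run_collect(), append_output() and on_sigchld() are
names I made up for illustration, and only stdout is captured to keep
it short): SIGCHLD stays blocked except while waiting in pselect(), so
the handler's flag cannot be missed between the test and the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t child_exited;

static void on_sigchld(int sig) { (void)sig; child_exited = 1; }

/* Append whatever one read() returns to buf; returns read()'s result. */
static ssize_t append_output(int fd, char *buf, size_t cap, size_t *len)
{
    char tmp[512];
    ssize_t n = read(fd, tmp, sizeof tmp);
    for (ssize_t i = 0; i < n; i++)
        if (*len + 1 < cap) buf[(*len)++] = tmp[i];
    buf[*len] = '\0';
    return n;
}

/* Run cmd via /bin/sh, collect its stdout in buf and return the wait
 * status (-1 on error). Reading stops when the child exits, even if
 * some other process still holds the write end of the pipe. */
int run_collect(const char *cmd, char *buf, size_t cap)
{
    int fds[2], status = -1;
    size_t len = 0;
    sigset_t block, orig, waitmask;
    struct sigaction sa, old_sa;
    struct timeval zero;
    fd_set rd;
    pid_t pid;

    assert(buf != NULL && cap > 0);
    buf[0] = '\0';
    if (pipe(fds) < 0) return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, &old_sa);
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &orig);   /* deliver only in pselect() */
    waitmask = orig;
    sigdelset(&waitmask, SIGCHLD);
    child_exited = 0;

    pid = fork();
    if (pid == 0) {
        sigprocmask(SIG_SETMASK, &orig, NULL);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }
    close(fds[1]);

    /* Main loop: wait for output, but give up once the child is gone. */
    while (pid > 0 && !child_exited) {
        FD_ZERO(&rd);
        FD_SET(fds[0], &rd);
        int r = pselect(fds[0] + 1, &rd, NULL, NULL, NULL, &waitmask);
        if (r < 0 && errno != EINTR) break;
        if (r > 0 && append_output(fds[0], buf, cap, &len) <= 0) break;
    }
    /* Drain what is already buffered in the pipe, without blocking. */
    for (;;) {
        FD_ZERO(&rd);
        FD_SET(fds[0], &rd);
        zero.tv_sec = 0;
        zero.tv_usec = 0;
        if (select(fds[0] + 1, &rd, NULL, NULL, &zero) <= 0) break;
        if (append_output(fds[0], buf, cap, &len) <= 0) break;
    }
    close(fds[0]);

    if (pid > 0)
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
            ;
    sigprocmask(SIG_SETMASK, &orig, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return status;
}
```

Calling this with a command that leaves a background process holding
the pipe, such as "(sleep 2 2>/dev/null &); echo hi", should return as
soon as the shell exits instead of waiting for the sleep to finish.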

What do you think?

PS: I think this is also a bug in the mysqld_safe script or the init
    script (daemons should close all inherited file descriptors), but I
    think run-parts should also behave sanely.
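
    For the daemon side, a minimal sketch of closing inherited
    descriptors before starting the long-running process could look
    like this. close_fds_from() is a hypothetical helper, not something
    from mysqld_safe or the init script, and the /proc/self/fd walk
    assumes Linux (with a plain loop as fallback):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Close every open descriptor >= lowfd; returns how many were closed. */
int close_fds_from(int lowfd)
{
    int closed = 0;
    DIR *d = opendir("/proc/self/fd");      /* Linux-specific listing */

    assert(lowfd >= 0);
    if (d != NULL) {
        int self = dirfd(d);                /* keep the directory's own fd */
        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
            char *end;
            long fd = strtol(e->d_name, &end, 10);
            if (end == e->d_name || *end != '\0')
                continue;                   /* skips "." and ".." */
            if (fd >= lowfd && fd != self && close((int)fd) == 0)
                closed++;
        }
        closedir(d);
        return closed;
    }

    /* Fallback without /proc: try every possible descriptor number. */
    long max = sysconf(_SC_OPEN_MAX);
    if (max < 0)
        max = 1024;                         /* assumed default limit */
    for (long fd = lowfd; fd < max; fd++)
        if (close((int)fd) == 0)
            closed++;
    return closed;
}
```

    Calling close_fds_from(3) after redirecting stdin, stdout and
    stderr would have dropped descriptors 44 and 45 from the listing
    above.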

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (900, 'testing'), (800, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.35-rc5-686 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages debianutils depends on:
ii  libc6                         2.11.2-7   Embedded GNU C Library: Shared lib
ii  sensible-utils                0.0.4      Utilities for sensible alternative

-- 
-- arthur de jong - [email protected] - west consulting b.v. --



