Package: procps
Version: 1:3.3.3-3

When 'ps' tries to determine the tty of a process, it first finds the
tty's major and minor device numbers and then tries to translate them
back into a name. To do that translation, it falls back
through several methods, of which the first is to look up the major
number in /proc/tty/drivers, guess the probable pathname in /dev based
on that and the minor number, and then stat that pathname to see if it
does indeed have the right details.
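To make that first method concrete, here is a rough sketch of the
guess-and-verify step. It is not the procps code: the function names
stat_matches and guess_tty_path are made up for illustration, and the
driver name is assumed to have been read from /proc/tty/drivers already.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on Linux/glibc */
#include <sys/types.h>

/* Hypothetical helper: 1 if `path` can be stat()ed and its device
 * numbers equal (maj, min), otherwise 0. */
static int stat_matches(const char *path, unsigned maj, unsigned min)
{
    struct stat sb;

    if (stat(path, &sb) < 0)
        return 0;
    return major(sb.st_rdev) == maj && minor(sb.st_rdev) == min;
}

/* Hypothetical helper: guess "/dev/<name><min>" (e.g. "/dev/ttyS0")
 * and report whether that node really has the expected numbers. */
static int guess_tty_path(char *buf, size_t len, const char *name,
                          unsigned maj, unsigned min)
{
    int n;

    assert(buf != NULL && name != NULL);
    n = snprintf(buf, len, "/dev/%s%u", name, min);
    if (n < 0 || (size_t)n >= len)
        return 0;                      /* name did not fit in buf */
    return stat_matches(buf, maj, min);
}
```

For name "pts" and minor 3 this builds "/dev/pts3", which is exactly
the guess that fails in the pty case described next.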

For the common case of a pty device such as /dev/pts/0, this method
guesses wrong, and 'ps' therefore falls back to its second method,
which is to readlink the process's /proc/NNN/fd/2 and stat whatever it
finds to see if _that's_ a device file with the right numbers (based
on the empirical observation that processes often have fd 2 open on
their own terminal device).
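A minimal sketch of that second method, again with a made-up function
name (fd2_tty_path) rather than anything taken from procps, might look
like this:

```c
#define _DEFAULT_SOURCE      /* for readlink() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on Linux/glibc */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the target of /proc/<pid>/fd/2 into buf
 * and return 1 if that target has device numbers (maj, min). */
static int fd2_tty_path(char *buf, size_t len, int pid,
                        unsigned maj, unsigned min)
{
    char link[64];
    struct stat sb;
    ssize_t n;

    assert(buf != NULL);
    if (len < 2)
        return 0;
    snprintf(link, sizeof link, "/proc/%d/fd/2", pid);
    n = readlink(link, buf, len - 1);  /* readlink() does not add '\0' */
    if (n < 0)
        return 0;
    buf[n] = '\0';
    /* This stat() goes wherever fd 2 happens to point, which need not
     * be a terminal, or even be under /dev. */
    if (stat(buf, &sb) < 0)
        return 0;
    return major(sb.st_rdev) == maj && minor(sb.st_rdev) == min;
}
```

The point to notice is that the path handed to stat() comes from the
target process, not from anything ps chose itself.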

When the process's fd 2 is not pointing at a terminal, this causes ps
to stat a potentially arbitrary node in the VFS. That's normally
harmless, but it can be irritating if one of those nodes is on an
unresponsive filesystem, e.g. an NFS mount whose server is
malfunctioning. And that situation is precisely one in which your
processes start to lock up and you start running 'ps' to find out
what's going on, so you'd quite like 'ps' not to hang as well for the
same reason.

So it would be desirable, if possible, for ps's original approach of
guessing the device name via /proc/tty/drivers to get the right answer
in the first place. And in fact this is very easy. Looking at what
happens with strace in a typical case, we see:

open("/proc/tty/drivers", O_RDONLY)     = 6
read(6, "/dev/tty             /dev/tty   "..., 9999) = 574
close(6)                                = 0
stat("/dev/pts3", 0x7fff3e4adaa0)       = -1 ENOENT (No such file or directory)
stat("/dev/pts", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
readlink("/proc/1555/fd/2", "/dev/pts/3", 127) = 10
stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0

Having worked out from /proc/tty/drivers that the major number
corresponds to /dev/pts, ps constructs the filenames /dev/pts3 and
/dev/pts to test. Neither works (the first doesn't exist and the
second is a directory), so ps falls back to looking at the process's
fd 2. But if it had tried /dev/pts/3 as well, that would have been
fine!
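To spell out the order of guesses I have in mind, here is a small
sketch that only builds the candidate names and doesn't stat anything.
The helper tty_candidate is hypothetical, and it ignores the
devfs_type special case that the real code has.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the idx-th candidate pathname for driver
 * `name` and minor number `min` into buf. Returns 1 on success, 0 if
 * idx is out of range or the name does not fit. */
static int tty_candidate(char *buf, size_t len, int idx,
                         const char *name, unsigned min)
{
    int n;

    assert(buf != NULL && name != NULL);
    switch (idx) {
    case 0:  /* existing first guess, like "/dev/pts3" */
        n = snprintf(buf, len, "/dev/%s%u", name, min);
        break;
    case 1:  /* the extra guess, like "/dev/pts/3" */
        n = snprintf(buf, len, "/dev/%s/%u", name, min);
        break;
    case 2:  /* existing last guess, like "/dev/pts" */
        n = snprintf(buf, len, "/dev/%s", name);
        break;
    default:
        return 0;
    }
    return n >= 0 && (size_t)n < len;
}
```

A caller would try idx 0, 1, 2 in turn and stop at the first name whose
stat() result has the right major and minor numbers.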

The attached file procps_driver_name_fix.diff is a small patch against
procps which fixes this for me, and causes the corresponding piece of
strace output to look like this:

open("/proc/tty/drivers", O_RDONLY)     = 7
read(7, "/dev/tty             /dev/tty   "..., 9999) = 574
close(7)                                = 0
stat("/dev/pts3", 0x7fff7c1b58e0)       = -1 ENOENT (No such file or directory)
stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0

so that the right tty device is identified immediately and there's no
need to go via the process's /proc/NNN/fd/2 at all.

Cheers,
Simon
-- 
for k in [pow(x,37,0x1a1298d262b49c895d47f) for x in [0x50deb914257022de7fff,
0x213558f2215127d5a2d1, 0x90c99e86d08b91218630, 0x109f3d0cfbf640c0beee7,
0xc83e01379a5fbec5fdd1, 0x19d3d70a8d567e388600e, 0x534e2f6e8a4a33155123]]:
 print "".join([chr(32+3*((k>>x)&1))for x in range(79)]) # <ana...@pobox.com>
diff --git a/proc/devname.c b/proc/devname.c
index 0066b46..a28bc1b 100644
--- a/proc/devname.c
+++ b/proc/devname.c
@@ -133,9 +133,12 @@ static int driver_name(char *restrict const buf, unsigned maj, unsigned min){
   }
   sprintf(buf, "/dev/%s%d", tmn->name, min);  /* like "/dev/ttyZZ255" */
   if(stat(buf, &sbuf) < 0){
-    if(tmn->devfs_type) return 0;
-    sprintf(buf, "/dev/%s", tmn->name);  /* like "/dev/ttyZZ255" */
-    if(stat(buf, &sbuf) < 0) return 0;
+    sprintf(buf, "/dev/%s/%d", tmn->name, min);  /* like "/dev/pts/255" */
+    if(stat(buf, &sbuf) < 0){
+      if(tmn->devfs_type) return 0;
+      sprintf(buf, "/dev/%s", tmn->name);  /* like "/dev/ttyZZ" */
+      if(stat(buf, &sbuf) < 0) return 0;
+    }
   }
   if(min != MINOR_OF(sbuf.st_rdev)) return 0;
   if(maj != MAJOR_OF(sbuf.st_rdev)) return 0;
