[ilugd] Strange behaviour of device files on diskless Linux

Shuvam Misra Fri, 19 Sep 2003 04:27:22 -0700

Dear all,

I saw some very strange behaviour of device files on Linux recently,
thanks to a colleague. I can partly explain it, after all these years
of working with Unix, but my explanation has holes.


I'll explain the phenomenon not as I experienced it, but in brief, so
that you don't have to recreate all the experimentation we did at our
end to confine and reproduce the problem. :)

It appears that if you have a diskless Linux machine, mounting all its
file systems over NFS, then the device files that you see on the
diskless machine do not hold their modification timestamp, even if a
process holds the file open. And this may not be true of all device
files, but is true of /dev/tty[0-9].

If you log in on /dev/tty1, for instance, and keep the shell running on
that terminal, and do an

        ls -l /dev/tty1

from another terminal, then you'll see the modification timestamp of
/dev/tty1 reflecting the last time you either typed on the keyboard or
anything was displayed on the screen (i.e. the last I/O on tty1). At
least, this is what should happen, and this is what indeed happens on
all disk-ful Linux/Unix boxes. (This is how "w" gives you the idle
time for each terminal session, for instance.) And this happens on a
diskless Linux box too, with the /dev/pts/* devices. However, for the
/dev/tty[0-9] devices on a diskless Linux box, the timestamp resets
itself, after about a second or less.
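
The idle-time idea mentioned above can be sketched roughly as follows.
This is only an illustration, assuming GNU coreutils "stat" and "date";
idle_seconds is a made-up helper name, and the real "w" may well look at
a different timestamp field (such as the access time) or differ in other
details.

```shell
#!/bin/sh
# idle_seconds PATH [NOW]
# Print how many seconds ago PATH was last modified.
# NOW (epoch seconds) is optional and defaults to the current time.
# Assumes GNU stat (-c %Y prints the mtime as epoch seconds).
idle_seconds() {
    path=$1
    now=${2:-$(date +%s)}
    mtime=$(stat -c %Y "$path") || return 1
    echo $((now - mtime))
}

# Example: idle time of the first virtual console.
# idle_seconds /dev/tty1
```

On a machine showing the problem described here, a helper like this
would report an idle time of years for /dev/tty1 whenever it samples the
reset timestamp.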

We tried to understand what the timestamp resets itself to. We found
that it resets itself to the timestamp of the device inode for that
device on the NFS server. This timestamp has not changed in years,
because these devices are not accessed by anything on the NFS server;
they're just left there for the diskless desktops to mount and use. For
instance, /dev/tty1 on our desktop called "dc33" is actually mapped onto
nfs:/export/linux/dc33/dev/tty1 which is a device file on the local hard
disk of the machine called "nfs".

This situation is quite absurd, and we made the absurdity plain by
doing the following:

1. Log in on tty1 and tty2 of a local diskless desktop.

2. While remaining logged in on tty1, run the following script on tty2:

        while :; do
            sleep 1
            ls -l /dev/tty1
        done

3. Occasionally do something small on tty1, such as running "echo hello"
    or "ls", and watch the continuous output on tty2.

And this is what you get:

crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Sep 19 12:43 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Sep 19 12:43 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Sep 19 12:44 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1
crw--w----   1 sanjog   tty        4,   3 Aug  6  2000 /dev/tty1

As you can see, each time there's any I/O on tty1, the timestamp
changes, but it resets to the old value within one second.
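
For anyone who wants to reproduce this, a variant of the loop that
prints only when the timestamp actually changes makes the flip-flop
easier to spot. This is a rough sketch, assuming GNU "stat" and GNU
"date"; watch_mtime is a made-up name, and the loop is bounded by a
sample count so that it terminates on its own.

```shell
#!/bin/sh
# watch_mtime PATH COUNT [INTERVAL]
# Sample PATH's mtime COUNT times, INTERVAL seconds apart (default 1),
# and print a line only when the value differs from the previous sample.
# Assumes GNU stat (-c %Y) and GNU date (-d @EPOCH).
watch_mtime() {
    path=$1
    count=$2
    interval=${3:-1}
    prev=
    i=0
    while [ "$i" -lt "$count" ]; do
        cur=$(stat -c %Y "$path") || return 1
        if [ "$cur" != "$prev" ]; then
            # Print the raw epoch value and a readable form of it.
            echo "$cur $(date -d "@$cur" '+%Y-%m-%d %H:%M:%S')"
            prev=$cur
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example: watch tty1 for about a minute from tty2.
# watch_mtime /dev/tty1 60
```

Since it samples only once per interval, it can still miss a change
that appears and is reset between two samples.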

I can understand that NFS is stateless, hence the timestamps in
the inode on the NFS server are not changed when they change on the NFS
client. I also understand that the NFS server will see the vfs_open()
NFS call, but will not see any of the vfs_read() or vfs_write() calls
because, this being a device inode, those calls will be directed to the
NFS client's device driver layer and will not go across the wire to the
NFS server. However:

1.  The NFS client flushes data to the NFS server once every few
    seconds, hence the timestamp "must" get updated on the device file
    on the NFS server. Why doesn't it? This works fine for "normal"
    files and directories. The /dev/tty* files on the NFS server have
    not had their timestamps updated in three years (that was when we
    set up our office infrastructure). Why?

    Does this mean that an NFS client does not even bother to write back
    its in-core inode copies for device files?

2.  The "ls" running on tty2 and the login session on tty1 are both
    running on the same diskless machine, hence both are doing the
    stat() system call on the same inode through the same kernel. The
    kernel will have its in-core copy of the inode which will have the
    correct timestamp, irrespective of what is flushed out to the NFS
    server. Therefore, that in-core copy should show the correct
    timestamp right through the login session at least. Remember, this
    inode is held open through the login session because the shell (at
    least) is a continuously running process on tty1, hence the in-core
    copy of the inode will not leave the kernel.

3.  This problem does not appear with the /dev/pts/* devices. Their
    timestamps don't get reset every second.
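
One thing that may be worth checking for point 3 is whether the two
kinds of device node even live on the same file system. On many Linux
systems /dev/pts is a separate "devpts" mount rather than part of the
NFS-mounted /dev, but whether that is the case on these diskless
machines is an assumption, not something verified here. A small sketch,
assuming GNU "stat"; fs_type is a made-up name.

```shell
#!/bin/sh
# fs_type PATH
# Print the type of the file system that PATH lives on.
# Assumes GNU stat (-f queries the file system, %T prints its type name).
fs_type() {
    stat -f -c %T "$1"
}

# Example: compare the two kinds of terminal device.
# for dev in /dev/tty1 /dev/pts/0; do
#     printf '%s: %s\n' "$dev" "$(fs_type "$dev")"
# done
```

If the two report different types, the difference in behaviour would at
least line up with which nodes are served over NFS.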

Hence, while I suspect this phenomenon is something to do with NFS
(after all, it doesn't happen with non-NFS device files, e.g. on my
laptop), my own explanation has plenty of holes in it.

Any ideas?

Shuvam

PS: I don't have any reason to believe kernel version has anything to do
    with this, so unless you can point me to prior art that this is a
    kernel-version-specific bug, I won't get into "which RedHat version
    are you running?" And I'm _not_ running RedHat, so there. :)

