Launchpad has imported 7 comments from the remote bug at
http://sourceware.org/bugzilla/show_bug.cgi?id=14.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2004-02-09T15:07:14+00:00 Dan-tsafrir wrote:

- This bug report was already posted in [email protected]
  I'm resubmitting it here at the request of Andreas Jaeger.

- All the following relates to the file:

     glibc-2.3.2/sysdeps/unix/sysv/linux/getdents.c

  I report more than one bug in getdents() so please bare with me :> (the
  description is actually much longer than the patch). In a nutshell, the
  following program might (nondeterministically) enter an endless-loop:

      DIR           *dir = opendir("some-autofs-directory");
      struct dirent *e;
      while( (e = readdir(dir)) != NULL ) // might never return null
          printf("ino=%d name=%s\n", e->d_ino, e->d_name)

- In the implementation of getdents(), the `d_off' field (belonging to the
  linux kernel's dirent structure) is falsely assumed to contain the *byte*
  offset to the next dirent.
  Note that the linux manual of the readdir system-call states that d_off
  is the "offset to *this* dirent" while glibc's getdents treats it as the
  offset to the *next* dirent.

- In practice, both of the above are wrong/misleading. The `d_off' field
  may contain illegal negative values, 0 (should also never happen as the
  "next" dirent's offset must always be bigger then 0), or positive values
  that are bigger than the size of the directory-file itself:

  o We're not sure what the Linux kernel intended to place in this field,
    but our experience shows that on "real" file systems (that actually
    reside on some disk) the offset seems to be a simple (not necessarily
    continuous) counter: e.g. first entry may have d_off=1, second: d_off=2,
    third: d_off=4096, fourth=d_off=4097 etc. We conjecture this is the
    serial of the dirent record within the directory (and so, this is indeed
    the "offset", but counted in records out of which some were already
    removed).

  o For file systems that are maintained by the amd automounter (automount,
    directories) the d_off seems to be arbitrary (and may be negative, zero
    or beyond the scope of a 32bit integer). We conjecture the amd doesn't
    assign this field and the received values are simply garbage.

- Fortunately, there is actually no need to use d_off as all the needed
  information is embedded in the d_reclen field (dirents are consecutive
  and the "next" dirent is found exactly after the "current" dirent).
  And indeed, glibc's getdents() uses the above technique most of the time.
  But, there are certain exceptional-conditions that might trigger
  getdents() to stop using on the dependable d_reclen and start using
  the unreliable d_off (Furthermore, these exceptional-conditions are
  identified based on the unreliable information held by d_off).

- One aspect of this bug was actually reported in 2001:
      http://mail.gnu.org/archive/html/bug-glibc/2001-03/msg00048.html
  but was wrongly (IMHO) dismissed by Ulrich Drepper who claimed that the
  bug is actually in the linux-kernel. The linux-kernel folks do not intend
  to change the data of their 'struct dirent'. They want it just the way it
  is and don't care if the glibc folks "wrongly" interpret the `d_off' field.

- The bug in glibc only occurs when an "overflow" is *wrongly* detected
  (see lines: 176-183 in getdents.c).
  Overflow might happen when the kernel uses `kernel_dirent64' that contains
  64bit wide fields (d_ino & d_off), while the user uses `struct dirent'
  through readdir(3) (rather than `struct dirent64' through readdir64).
  in which the corresponding fields are 32bit wide.
  [ This is of course a legal scenario: the user shouldn't know nor care
    how the kernel implements its readdir ]

- In such an "overflow" situation, getdents() tries to lseek (the
  directory-fd) to the end of the last-legal-dirent that getdents()
  has successfully read.
  This offset is supposedly held by the local variable `last_offset'.
  But, since `last_offset' is assigned with `d_off' on each iteration,
  and since as mentioned above `d_off' usually holds an incorrect value,
  the lseek is not performed to the correct place.
  getdents() then returns the number of bytes successfully read.

- This situation of course causes the next invocation of readdir() to
  return erroneous data due to the incorrect offset associated with
  the fd.
  Furthermore, it often happens (for autofs directories that serve as
  automount points and are maintained by the amd) that the first dirent-entry
  has d_off=0 and the second dirent causes a wrong "overflow".
  In this case getdents() will return the number of bytes composing
  the first successfully read dirent, and will wrongly lseek to 0: the
  beginning of the file.
  This means that the next invocation of readdir(3) will return (again)
  the first dirent, and so on (never returning NULL), causing any
  program that uses readdir(3) to enter into an endless loop!


How to fix (the above bug + another two):
########################################

1:---------------------------------------------------------------------
Instead of line 193:
        last_offset = d_off;
write:
        if( last_offset == -1 )
                last_offset = 0;
        last_offset += old_reclen;

    [  This makes `last_offset' hold the *real* last offset, and prevents
       the wrong lseek() when an "overflow" occurs. It works since the
       `d_reclen' is reliable and correct, as reflected by the manner
       getdents() uses `old_reclen'  ]


2:---------------------------------------------------------------------
The condition that identifies an "overflow" situation should also be
changed:
Instead of lines 176-179:
        if ((sizeof (outp->u.d_ino) != sizeof (inp->k.d_ino)
             && outp->u.d_ino != d_ino)
            || (sizeof (outp->u.d_off) != sizeof (inp->k.d_off)
                && outp->u.d_off != d_off))
only write:
        if ( sizeof (outp->u.d_ino) != sizeof (inp->k.d_ino)
             && outp->u.d_ino != d_ino )

    [  I removed the right-side of the `or': since the `d_off' field is
       buggy and unusable, don't include it in the check for overflow.
       The alterative is that programs reading the content of a directory
       maintained by the amd using readdir(3), will usually fail due to
       overflow, even though the directory is prefectly fine.  ]


3:---------------------------------------------------------------------
The following is another bug I think you have (which is unrelated
to the above).
Instead of line 120:
        && nbytes <= sizeof (DIRENT_TYPE))
write:
        && nbytes <= sizeof (struct kernel_dirent64))

    [  The above condition decides whether the user gave a large enough
       buffer in which `struct kernel_dirent64' will fit (`nbytes' is the
       size of the user's `buf'). If `buf' is too small, a bigger buffer
       is allocated (lines 122-124) so as to be able to hold `struct
       kernel_dirent64'  ]


4:---------------------------------------------------------------------
I think the following is also an (unrelated) bug:
In lines 216-221:

        red_nbytes = MIN (nbytes
                      - ((nbytes / (offsetof (DIRENT_TYPE, d_name) + 14))
                         * size_diff),
                      nbytes - size_diff);

        skdp = kdp = __alloca (red_nbytes);

It is obvious that `red_nbytes' may be negative (or rather, wrapped-around),
since `nbytes' is a value given by the user (and may even be 0).
-----------------------------------------------------------------------


The above patch is given below (your interface doesn't allow attachments),
it was generated using diff -u.
This patch was applied to our installation of glibc and is successfully
used for a few months now by hundreds of of machines.
Thanks,
    --Dan 


--- getdents.c.orig     2003-12-04 19:11:38.000000000 +0200
+++ getdents.c  2003-12-04 19:38:06.000000000 +0200
@@ -117,7 +117,7 @@
       size_t kbytes = nbytes;
       if (offsetof (DIRENT_TYPE, d_name)
          < offsetof (struct kernel_dirent64, d_name)
-         && nbytes <= sizeof (DIRENT_TYPE))
+         && nbytes <= sizeof (kernel_dirent64))
        {
          kbytes = nbytes + offsetof (struct kernel_dirent64, d_name)
                   - offsetof (DIRENT_TYPE, d_name);
@@ -175,8 +175,7 @@
              outp->u.d_off = d_off;
              if ((sizeof (outp->u.d_ino) != sizeof (inp->k.d_ino)
                   && outp->u.d_ino != d_ino)
-                 || (sizeof (outp->u.d_off) != sizeof (inp->k.d_off)
-                     && outp->u.d_off != d_off))
+                 )
                {
                  /* Overflow.  If there was at least one entry
                     before this one, return them without error,
@@ -190,7 +189,10 @@
                  return -1;
                }
 
-             last_offset = d_off;
+             if( last_offset == -1 )
+               last_offset = 0;
+             last_offset += old_reclen;
+
              outp->u.d_reclen = new_reclen;
              outp->u.d_type = d_type;
 
@@ -213,6 +215,7 @@
     const size_t size_diff = (offsetof (DIRENT_TYPE, d_name)
                              - offsetof (struct kernel_dirent, d_name));
 
+    /* bug? (nbytes might be smaller than right side of minus) */
     red_nbytes = MIN (nbytes
                      - ((nbytes / (offsetof (DIRENT_TYPE, d_name) + 14))
                         * size_diff),

Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/0

------------------------------------------------------------------------
On 2006-02-21T16:27:00+00:00 Decimal wrote:

This patch applies cleanly to sysdeps/unix/sysv/linux/getdents.c in the
trunk.

I have created a first pass at a testcase but it requires an autofs filesystem
and is non-deterministic according to the bug comments so I have not actually
reproduced yet.


Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/11

------------------------------------------------------------------------
On 2006-02-21T16:33:04+00:00 Decimal wrote:

Created attachment 876
Patch taken from bug comments.

Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/12

------------------------------------------------------------------------
On 2006-02-21T16:33:44+00:00 Decimal wrote:

Created attachment 877
Preliminary testcase adapted from bug comments.

Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/13

------------------------------------------------------------------------
On 2006-02-21T16:47:56+00:00 Decimal wrote:

Created attachment 878
Standalone preliminary testcase adapted from bug comments.

Here is a standalone testcase as well if someone happens to have a system with
autofs to reproduce this and to refine the testcase.

Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/14

------------------------------------------------------------------------
On 2010-06-01T03:23:03+00:00 Petr Baudis wrote:

AFAICS, current kernel does put_user(file->f_pos, &lastdirent->d_off), thus
d_off should have well-defined meaning nowadays.


Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/15

------------------------------------------------------------------------
On 2010-06-01T09:15:15+00:00 Dan-tsafrir wrote:

Petr, your reason for closing this bug is invalid:

(1) If you read the original bug report you'll notice that it's not
enough for d_off to have a "well defined meaning". What you really
need is for that meaning to be the same in both the kernel and in
libc. This bug report specifically points out the fact that the
meaning is different.

(2) Also, if you search for put_user(file->f_pos,&lastdirent->d_off)
in the kernel code you'd see that it existed many, many years before
this bug was submitted (the line is there since Linux-1.3.0, no
less). Thus, having this line in the kernel doesn't prove or disprove
anything.


Reply at: https://bugs.launchpad.net/glibc/+bug/11685/comments/16


** Changed in: glibc
   Importance: Unknown => Critical

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is a direct subscriber.
https://bugs.launchpad.net/bugs/11685

Title:
  libc6: Bug (+fix) in readdir() due to getdents()

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to