Package: linux-image-2.6.24-etchnhalf.1-amd64
Version: 2.6.24-6~etchnhalf.4
Severity: normal


epoll_wait sometimes returns spurious readiness notifications: when a
file descriptor is closed and a new one with the same number is created
and added to the epoll set, epoll_wait sometimes returns a readiness
notification for the previous fd:

   connect(11, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("129.42.56.216")}, 16) = -1 EINPROGRESS (Operation now in progress)
   epoll_ctl(4, EPOLL_CTL_MOD, 11, {EPOLLOUT, {u32=11, u64=11}}) = -1 ENOENT (No such file or directory)
   epoll_ctl(4, EPOLL_CTL_ADD, 11, {EPOLLOUT, {u32=11, u64=11}}) = 0
   epoll_wait(4, {{EPOLLIN, {u32=10, u64=10}}, {EPOLLHUP, {u32=11, u64=11}}}, 64, 59743) = 2
   epoll_ctl(4, EPOLL_CTL_MOD, 10, {EPOLLOUT, {u32=10, u64=10}}) = 0
   epoll_ctl(4, EPOLL_CTL_MOD, 11, {EPOLLOUT, {u32=11, u64=11}}) = 0
   getpeername(11, 0xe33390, [63018818183627008]) = -1 ENOTCONN (Transport endpoint is not connected)
   write(2, "a Transport endpoint is not conne"..., 90a Transport endpoint is not connected at /opt/perl/lib/perl5/AnyEvent/Socket.pm line 770.
   ) = 90
   fcntl(11, F_SETFL, O_RDONLY)            = 0
   read(11, 0xe51af0, 1)                   = ? ERESTARTSYS (To be restarted)
   --- SIGCHLD (Child exited) @ 0 (0) ---
   write(5, "\1\0\0\0\0\0\0\0"..., 8)      = 8
   rt_sigreturn(0x5)                       = 0
   read(11,  <unfinished ...>

Note how the MOD fails, indicating that the fd is not yet in the set.
After the ADD, epoll_wait reports EPOLLHUP for fd 11, yet getpeername says
the socket is not (yet) connected, and the following read blocks (because
the socket did NOT yet receive a HUP, contrary to what epoll_wait
indicated).

How do I know the epoll_wait u64 data really refers to fd 11 in the above
example? The event library in question is libev, which uses epoll_ctl in only 
one place:

  ev.data.u64 = fd; /* use u64 to fully initialise the struct, for nicer strace etc. */
  if (expect_true (!epoll_ctl (backend_fd, oev ? EPOLL_CTL_MOD : EPOLL_CTL_ADD, fd, &ev)))

So libev ALWAYS registers interest in an fd with the u64 data set to the fd 
itself.
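
For readers unfamiliar with the epoll user-data field, here is a minimal
sketch (not libev code) of the mechanism relied on above: whatever u64 is
passed to EPOLL_CTL_ADD is what epoll_wait hands back for that
registration. The function name epoll_tag_roundtrips and the use of a pipe
are assumptions made purely for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register the read end of a pipe with data.u64 set to the fd number,
   make it readable, and check that epoll_wait reports the same u64.
   Returns 1 if the tag round-trips, 0 if it does not, -1 on setup failure. */
static int epoll_tag_roundtrips(void)
{
    int p[2];
    int ok = -1;
    int ep = epoll_create(64);          /* size is only a hint */

    if (ep < 0)
        return -1;
    if (pipe(p) < 0) {
        close(ep);
        return -1;
    }

    struct epoll_event ev;
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.u64 = (uint64_t)p[0];       /* same idea as libev: tag == fd */

    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0
        && write(p[1], "x", 1) == 1) {
        struct epoll_event out;
        memset(&out, 0, sizeof out);
        if (epoll_wait(ep, &out, 1, 1000) == 1)
            ok = (out.data.u64 == (uint64_t)p[0]) ? 1 : 0;
    }

    close(p[0]);
    close(p[1]);
    close(ep);
    return ok;
}
```

The kernel treats the u64 as opaque, so a mismatch between the registered
value and the returned value can only come from a different registration.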

I modified libev to include a generation counter:

  ev.data.u64 |= (long long)debug_gencount++ << 32;//D

And an strace of the same issue now looks like this:

   connect(9, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("129.42.56.216")}, 16) = -1 EINPROGRESS (Operation now in progress)
   epoll_ctl(4, EPOLL_CTL_ADD, 9, {EPOLLOUT, {u32=9, u64=26787711025161}}) = 0
   epoll_wait(4, {{EPOLLHUP, {u32=9, u64=26766236188681}}, {EPOLLIN, {u32=10, u64=26749056319498}}, {EPOLLHUP, {u32=12, u64=26774826123276}}}, 64, 59743) = 3
   epoll_ctl(4, EPOLL_CTL_MOD, 9, {EPOLLOUT, {u32=9, u64=26766236188681}}) = 0
   epoll_ctl(4, EPOLL_CTL_DEL, 10, {0, {u32=10, u64=26749056319498}}) = -1 EBADF (Bad file descriptor)
   epoll_ctl(4, EPOLL_CTL_MOD, 12, {EPOLLIN, {u32=12, u64=26774826123276}}) = 0
   read(12, ""..., 65536)                  = 0
   close(12)                               = 0
   wait4(4920, 0x7fff048ec4dc, 0, NULL)    = -1 ECHILD (No child processes)
   getpeername(9, 0x2140ee0, [137077042347770112]) = -1 ENOTCONN (Transport endpoint is not connected)
   write(2, "a Transport endpoint is not conne"..., 90a Transport endpoint is not connected at /opt/perl/lib/perl5/AnyEvent/Socket.pm line 770.
   ) = 90
   fcntl(9, F_SETFL, O_RDONLY)             = 0
   read(9, 0xd01df0, 1)                    = ? ERESTARTSYS (To be restarted)
   --- SIGCHLD (Child exited) @ 0 (0) ---
   write(5, "\1\0\0\0\0\0\0\0"..., 8)      = 8
   rt_sigreturn(0x5)                       = 0
   read(9, 

As you can see, the EPOLL_CTL_ADD uses 26787711025161 (gencount 0x185d in
the upper 32 bits, fd 9 in the lower), but epoll_wait returns an event with
u64 set to 26766236188681 (gencount 0x1858, from an earlier EPOLL_CTL_ADD,
and fd 9).
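
To make the arithmetic above easy to check, here is a small sketch of the
packing scheme (generation in the upper 32 bits, fd in the lower 32). The
helper names tag_pack, tag_gen and tag_fd are hypothetical and not part of
libev; they only mirror what the one-line debug patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a generation count and an fd into one 64-bit epoll data value:
   generation in the upper 32 bits, fd in the lower 32 bits. */
static uint64_t tag_pack(uint32_t gen, int fd)
{
    assert(fd >= 0);
    return ((uint64_t)gen << 32) | (uint32_t)fd;
}

/* Extract the generation count (upper 32 bits). */
static uint32_t tag_gen(uint64_t tag)
{
    return (uint32_t)(tag >> 32);
}

/* Extract the fd (lower 32 bits). */
static int tag_fd(uint64_t tag)
{
    return (int)(uint32_t)tag;
}
```

With these helpers, tag_pack(0x185d, 9) gives 26787711025161, the value
passed to EPOLL_CTL_ADD, while tag_gen(26766236188681) gives 0x1858 and
tag_fd(26766236188681) gives 9, the value returned by epoll_wait.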

This proves that epoll_wait sometimes returns events for fds that are not
currently registered in the epoll set.

This bug occurs only rarely, only under load, and is somewhat hard to
reproduce, so I can't give a small example program. The analysis here is
probably enough to hunt down and fix this bug, however.

When a socket gets closed, epoll *must* also discard all pending
notifications for that fd when it removes the fd from the set.

-- Package-specific info:
** Version:
Linux version 2.6.24-etchnhalf.1-amd64 (Debian 2.6.24-6~etchnhalf.4) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Mon Jul 21 10:36:02 UTC 2008

** Tainted: P (1)

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.24-etchnhalf.1-amd64 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.24-etchnhalf.1-amd64 depends on:
ii  debconf [debconf-2.0]        1.5.11etch1 Debian configuration management sy
ii  initramfs-tools [linux-initr 0.92b       tools for generating an initramfs
ii  module-init-tools            3.3-pre4-2  tools for managing Linux kernel mo

linux-image-2.6.24-etchnhalf.1-amd64 recommends no packages.

Versions of packages linux-image-2.6.24-etchnhalf.1-amd64 suggests:
ii  grub                          0.97-27    GRand Unified Bootloader
ii  lilo                          1:22.8-4   LInux LOader - The Classic OS load
pn  linux-doc-2.6.24              <none>     (no description available)

-- debconf information excluded


