Dear Debian folks,

Am Sonntag, den 08.07.2012, 12:03 +0200 schrieb Paul Menzel:

> Am Dienstag, den 03.07.2012, 16:50 +0200 schrieb Paul Menzel:
> > Package: klogd
> > Version: 1.5-6.2
> > Severity: important
> 
> > klogd suddenly started to consume 100 % of one CPU core.
> > 
> >         $ ps aux | grep klogd
> >         klog      1731 98.4  0.0   2860  1580 ?        Rs   12:50 194:41 
> > /sbin/klogd -P /var/run/klogd/kmsg
> > 
> > Due to missing debugging symbols attaching with GDB does not seem to be
> > that useful. But maybe it is.
> > 
> >         $ sudo gdb --pid 1731
> >         GNU gdb (GDB) 7.4.1-debian
> >         Copyright (C) 2012 Free Software Foundation, Inc.
> >         License GPLv3+: GNU GPL version 3 or later 
> > <http://gnu.org/licenses/gpl.html>
> >         This is free software: you are free to change and redistribute it.
> >         There is NO WARRANTY, to the extent permitted by law.  Type "show 
> > copying"
> >         and "show warranty" for details.
> >         This GDB was configured as "i486-linux-gnu".
> >         For bug reporting instructions, please see:
> >         <http://www.gnu.org/software/gdb/bugs/>.
> >         Attaching to process 1731
> >         Reading symbols from /sbin/klogd...(no debugging symbols 
> > found)...done.
> >         Reading symbols from 
> > /lib/i386-linux-gnu/i686/cmov/libc.so.6...Reading symbols from 
> > /usr/lib/debug/lib/i386-linux-gnu/i686/cmov/libc-2.13.so...done.
> >         done.
> >         Loaded symbols for /lib/i386-linux-gnu/i686/cmov/libc.so.6
> >         Reading symbols from /lib/ld-linux.so.2...(no debugging symbols 
> > found)...done.
> >         Loaded symbols for /lib/ld-linux.so.2
> >         0xffffe424 in __kernel_vsyscall ()
> >         (gdb) bt
> >         #0  0xffffe424 in __kernel_vsyscall ()
> >         #1  0xb76f0f53 in __read_nocancel () at 
> > ../sysdeps/unix/syscall-template.S:82
> >         #2  0x0804a404 in ?? ()
> >         #3  0x080496a5 in ?? ()
> >         #4  0xb7646e46 in __libc_start_main (main=0x80492e0, argc=3, 
> > ubp_av=0xbf87aee4, init=0x804bff0, fini=0x804bfe0, rtld_fini=0xb77cd590, 
> >             stack_end=0xbf87aedc) at libc-start.c:228
> >         #5  0x08049949 in ?? ()
> >         (gdb) c
> >         Continuing.
> >         ^C     
> >         Program received signal SIGINT, Interrupt.
> >         0xffffe424 in __kernel_vsyscall ()
> >         (gdb) bt
> >         #0  0xffffe424 in __kernel_vsyscall ()
> >         #1  0xb76f0f53 in __read_nocancel () at 
> > ../sysdeps/unix/syscall-template.S:82
> >         #2  0x0804a404 in ?? ()
> >         #3  0x080496a5 in ?? ()
> >         #4  0xb7646e46 in __libc_start_main (main=0x80492e0, argc=3, 
> > ubp_av=0xbf87aee4, init=0x804bff0, fini=0x804bfe0, rtld_fini=0xb77cd590, 
> >             stack_end=0xbf87aedc) at libc-start.c:228
> >         #5  0x08049949 in ?? ()
> >         (gdb) c
> >         Continuing.
> >         ^C  
> >         Program received signal SIGINT, Interrupt.
> >         0xffffe424 in __kernel_vsyscall ()
> >         (gdb) bt
> >         #0  0xffffe424 in __kernel_vsyscall ()
> >         #1  0xb76f0f53 in __read_nocancel () at 
> > ../sysdeps/unix/syscall-template.S:82
> >         #2  0x0804a404 in ?? ()
> >         #3  0x080496a5 in ?? ()
> >         #4  0xb7646e46 in __libc_start_main (main=0x80492e0, argc=3, 
> > ubp_av=0xbf87aee4, init=0x804bff0, fini=0x804bfe0, rtld_fini=0xb77cd590, 
> >             stack_end=0xbf87aedc) at libc-start.c:228
> >         #5  0x08049949 in ?? ()
> >         (gdb) c
> >         Continuing.
> >         ^C
> >         Program received signal SIGINT, Interrupt.
> >         0xffffe424 in __kernel_vsyscall ()
> >         (gdb) quit
> >         A debugging session is active.
> > 
> >     Inferior 1 [process 1731] will be detached.
> > 
> >         Quit anyway? (y or n) y
> > 
> > It seems to want to read something. The following output was printed
> > continuously to the terminal when attaching with `strace`.
> > 
> >         $ sudo strace -p 1731
> >         […]
> >         read(0, "", 4095)                       = 0
> >         read(0, "", 4095)                       = 0
> >         read(0, "", 4095)                       = 0
> >         read(0, "", 4095)                       = 0
> >         read(0, "", 4095)                       = 0
> >         [several more times]
> > 
> > Building klogd with debugging symbols and installing the rogue(?)
> > process was killed.
> > 
> >         $ export DEB_BUILD_OPTIONS="debug"
> >         $ dpkg-buildpackage -us -uc -B
> >         $ sudo dpkg -i ../klogd_1.5-6.2_i386.deb 
> >         (Lese Datenbank ... 494344 Dateien und Verzeichnisse sind derzeit 
> > installiert.)
> >         Vorbereitung zum Ersetzen von klogd 1.5-6.2 (durch 
> > ../klogd_1.5-6.2_i386.deb) ...
> >         Ersatz für klogd wird entpackt ...
> >         klogd (1.5-6.2) wird eingerichtet ...
> >         [....] Stopping kernel log daemon...start-stop-daemon: warning: 
> > failed to kill 1729: No such process
> >         1 pids were not killed
> >         No process in pidfile '/var/run/klogd/kmsgpipe.pid' found running; 
> > none killed.
> >         . ok 
> >         [ ok ] Starting kernel log daemon....
> >         Trigger für man-db werden verarbeitet ...
> > 
> > Is that a known problem?
> 
> First this might be the same issue as #308580 [1].
> 
> I know of one user also experiencing this problem currently. One theory
> is that it is related to running the latest Linux kernel. In my case
> Linux 3.5-rc5+. Running the default Debian Linux kernel 3.2 I have not
> yet experienced this problem I think.

Testing it again, with 3.5-rc{5,6} and probably older versions klogd
uses 100 % right away when it is started reliably.

Here is some output also asked for in #308580.

        $ ps aux | grep klog
        klog      1785 98.7  0.0   2988  1652 ?        Rs    0:10   2:32 
/sbin/klogd -P /var/run/klogd/kmsg
        joey      4406  0.0  0.0   3520   760 pts/3    D+    0:11   0:00 grep 
klog
        $ sudo service klogd stop
        [....] Stopping kernel log daemon...start-stop-daemon: warning: failed 
to kill 1783: No such process
        1 pids were not killed
        No process in pidfile '/var/run/klogd/kmsgpipe.pid' found running; none 
killed.
        . ok

> Furthermore the other user also figured out that it loops
> on /run/klogd/kmsg and that this pipe is closed from one end. When
> dd’ing /proc/kmsg into that pipe is finished, klogd does not handle the
> EOF. The current theory is that the construct
> 
>         if ((rdcnt = read(…)) < 0) {
>         […]
>         }
> 
> is buggy. I have no idea how to fix it though.
> 
> > I also do not know why and how `/etc/init.d/klogd` was changed.
> 
> […]


Thanks,

Paul


> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=308580

Attachment: signature.asc
Description: This is a digitally signed message part

Reply via email to