Public bug reported:

Note: An equivalent bug report is filed as Debian Bug#483186 at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483186

Every now and then, we come across a machine that is unable to mount
the root filesystem for one reason or another and gets stuck in the
busybox initrd environment, from which we can run dmesg to diagnose
what went wrong.

To our dismay, in recent months (or years?), the dmesg output has been
coming out like this, with lots of missing digits.  For example, from a
test machine booting Ubuntu 8.04 hardy (with an upgraded kernel):

    [    0.000] Linux version 2.6.2-1-generic ([EMAIL PROTECTED]) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 2 0:0:4 UTC 20 (Ubuntu 2.6.2-1.2ubuntu6-generic)
    [    0.000] BIOS-provided physical RAM map:
    [    0.000]  BIOS-e80: 00000000 - 000000e00 (usable)
    [    0.000]  BIOS-e80: 000000e00 - 000000a00 (reserved)

But it is supposed to look like this:

    [    0.000000] Linux version 2.6.25-1-generic ([EMAIL PROTECTED]) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu May 22 05:01:49 UTC 2008 (Ubuntu 2.6.25-1.2ubuntu6-generic)
    [    0.000000] BIOS-provided physical RAM map:
    [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
    [    0.000000]  BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)

This causes quite a few problems when we are trying to diagnose kernel
oopses or panics, since the addresses are all wrong.

Initially, we thought it had something to do with memory corruption from
the kernel oops.  But later, we noticed that it also happens in cases
without a kernel oops, for example when we simply got root=/dev/sda7
written wrong.

So we decided to investigate, and eventually realized that the dmesg in
initrd.img in Ubuntu (and Debian) nowadays comes not from busybox but
from klibc-utils, and that running /usr/lib/klibc/bin/dmesg on a fully
booted system exhibits the same bug.

Checking the source code, we found that the code used to strip out the
<[0-7]> that prefixes every kernel message (see klogd(8)) is incorrect.
So, with a bit of hacking, we got that fixed.  :-)  A patch is attached.
Just drop it in debian/patches/20_dmesg_dropped-digits.patch and
repackage!  :-)
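
For illustration only, here is a minimal sketch of what the stripping is
meant to do.  It is written in Python rather than C for brevity and is
not the attached patch.  The function name strip_log_levels and the <6>
levels in the sample are assumptions made for this example.  The point
is that a <[0-7]> tag is removed only at the very start of a line, so
digits anywhere else in the message are left alone:

```python
import re

# Matches a priority tag such as "<6>" only at the start of a line.
_LEVEL_PREFIX = re.compile(r"^<[0-7]>", re.MULTILINE)


def strip_log_levels(raw: str) -> str:
    """Remove the leading <0>..<7> tag from each kernel log line.

    Hypothetical helper for illustration; not code from klibc.
    """
    return _LEVEL_PREFIX.sub("", raw)


# Illustrative raw buffer contents; the <6> levels are assumed.
sample = (
    "<6>[    0.000000] BIOS-provided physical RAM map:\n"
    "<6>[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009e000 (usable)\n"
)
print(strip_log_levels(sample), end="")
```

Anchoring the match to the start of a line is what keeps timestamps
and addresses intact in this sketch.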

We have verified that the output of the fixed dmesg is identical to that
of util-linux dmesg.

Further thoughts:

We checked out klibc source using:
    git clone git://git.kernel.org/pub/scm/libs/klibc/klibc.git

And noticed that it is an upstream bug, present since dmesg.c was first
added on Mon Aug 20 19:57:50 2007 +0200 in commit
9c5a7acda064daa7482148b5a45ee3b7ed39356c

As to why this bug wasn't discovered sooner... I don't know.  Perhaps
very few people use the tiny dmesg in klibc-utils for diagnostic
purposes?  And before that, Ubuntu (and Debian) used the dmesg module in
busybox, which exhibits no such bug?

Cheers,

Anthony Fok <anthony dot fok at thizgroup dot com>
ThizLinux Software Co., Ltd. - A member of Thiz Technology Group
Debian GNU/Linux Developer

** Affects: klibc (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: dmesg initrd klibc

-- 
Every second number disappears in (tiny) dmesg output (in initrd.img)
https://bugs.launchpad.net/bugs/235282
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
