Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
> You are editing in the wrong place. The patch needs to be applied in
> debian/build/source_amd64_none.

Ta. I applied the patch to every copy of tun.c other than the one in source_amd64_openvz_amd64, and now my trace is printed as it should be. And with the patch applied properly the problem disappears, so it does indeed fix the problem.

> The debian/bin/test-patches script can handle this all for you.

Unfortunately the patch doesn't apply to source_amd64_openvz_amd64, and test-patches dies as soon as that fails. That is why I was doing it manually.

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1283844141.4098.113.ca...@russell-laptop
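The manual procedure described above (apply the patch to every source tree where it fits, and leave the others alone) might be sketched as follows. This is only an illustration, not part of the Debian tooling: the function name apply_where_possible is hypothetical, and it assumes GNU patch, whose --dry-run option tests a patch without changing any files.

```shell
#!/bin/sh
# apply_where_possible PATCH TREE...
# Applies PATCH (with -p1) to each TREE in which it applies cleanly and
# prints one status line per tree. A tree where the dry run fails is
# left untouched, so one failing tree does not stop the whole run.
apply_where_possible() {
    patchfile=$1
    shift
    for tree in "$@"; do
        # -s: quiet, -f: never prompt, -d: change into TREE first.
        if patch -p1 -s -f --dry-run -d "$tree" < "$patchfile" >/dev/null 2>&1; then
            patch -p1 -s -f -d "$tree" < "$patchfile" >/dev/null
            echo "applied: $tree"
        else
            echo "skipped: $tree"
        fi
    done
}
```

It could be called as `apply_where_possible /path/to/fix.patch debian/build/source_*`, where the glob is an assumption based on the two tree names mentioned in this thread. Any tree reported as skipped still needs the patch ported by hand.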
Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
On Sun, 2010-09-05 at 01:45 +0100, Ben Hutchings wrote:
> Are you quite sure you used the modified kernel?

Nope.

> 'cat /proc/version' will tell you for sure which version you are running.

    $ cat /proc/version
    Linux version 2.6.32-5-amd64 (Debian 2.6.32-20.2) (russell-deb...@stuart.id.au) (gcc version 4.3.5 (Debian 4.3.5-2) ) #1 SMP Mon Sep 6 09:14:09 EST 2010

So that looks like I am running the right kernel. But because the symptoms didn't change much I had my doubts from the beginning, and so I have been trying to prove I was running the kernel with your patch applied. Doing that in various ways is why it took me a while to respond to your request to test the patch.

One way I tried to confirm it was by adding trace:

--- x/linux-2.6-2.6.32/drivers/net/tun.c	2009-12-03 13:51:21.0 +1000
+++ linux-2.6-2.6.32/drivers/net/tun.c	2010-09-06 08:09:44.068458190 +1000
@@ -36,7 +36,7 @@
 
 #define DRV_NAME	"tun"
 #define DRV_VERSION	"1.6"
-#define DRV_DESCRIPTION	"Universal TUN/TAP device driver"
+#define DRV_DESCRIPTION	"Universal TUN/TAP device driver + 0001-tun-Don-t-add-sysfs-attributes-to-devices-without-sy.patch applied"
 #define DRV_COPYRIGHT	"(C) 1999-2004 Max Krasnyansky <m...@qualcomm.com>"
 
 #include <linux/module.h>
@@ -1006,7 +1006,9 @@
 	if (err < 0)
 		goto err_free_sk;
 
-	if (device_create_file(&tun->dev->dev, &dev_attr_tun_flags) ||
+	printk(KERN_INFO "0001-tun-Don-t-add-sysfs-attributes-to-devices-without-sy.patch\n");
+	if (!net_eq(dev_net(tun->dev), &init_net) ||
+	    device_create_file(&tun->dev->dev, &dev_attr_tun_flags) ||
 	    device_create_file(&tun->dev->dev, &dev_attr_owner) ||
 	    device_create_file(&tun->dev->dev, &dev_attr_group))
 		printk(KERN_ERR "Failed to create tun sysfs files\n");

None of the trace ever appears in /var/log/kern.log, which is to say

    grep 0001-tun /var/log/kern.log

prints nothing. Yet I am evidently running a different kernel, as I never hit the BUG the buildd kernel generates. I don't have a clue what is going on.
The steps I used to generate the kernel are:

    $ apt-get source linux-2.6
    $ cd linux-2.6-2.6.32
    $ fakeroot debian/rules source
    $ fakeroot debian/rules setup
    $ patch -p1 < .../0001-tun-Don-t-add-sysfs-attributes-to-devices-without-sy.patch
    $ ed drivers/net/tun.c      # add trace
    $ ed debian/changelog       # add rev 2.6.32-20.2
    $ fakeroot debian/rules binary
    $ sudo dpkg -i ../linux-image-2.6.32-5-amd64_2.6.32-20.2_amd64.deb
    $ sudo ed /boot/grub/grub.cfg   # set the default kernel
    $ sudo reboot -f

Archive: http://lists.debian.org/1283773622.4264.65.ca...@russell-laptop
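A mismatch like this can be caught before a long build by checking which copies of the edited file actually contain the trace string. The following is a minimal sketch; the helper name list_unpatched is hypothetical, and the example path in the usage note assumes the per-flavour trees live under debian/build/, as named elsewhere in this thread.

```shell
#!/bin/sh
# list_unpatched MARKER FILE...
# Prints the name of each FILE that does NOT contain the literal
# string MARKER, i.e. each copy the edit has not reached.
list_unpatched() {
    marker=$1
    shift
    for f in "$@"; do
        # -q: no output, -F: fixed string, -e: allow a leading dash.
        grep -q -F -e "$marker" "$f" || echo "$f"
    done
}
```

For example, `list_unpatched 0001-tun-Don-t-add-sysfs debian/build/source_*/drivers/net/tun.c` would print every copy of tun.c that is missing the trace. Empty output means every listed copy has it.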
Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
On Mon, 2010-09-06 at 21:47 +1000, Russell Stuart wrote:
[...]
> The steps I used to generate the kernel are:
>
>     $ apt-get source linux-2.6
>     $ cd linux-2.6-2.6.32
>     $ fakeroot debian/rules source
>     $ fakeroot debian/rules setup
>     $ patch -p1 < .../0001-tun-Don-t-add-sysfs-attributes-to-devices-without-sy.patch
>     $ ed drivers/net/tun.c      # add trace
[...]

You are editing in the wrong place. The patch needs to be applied in debian/build/source_amd64_none.

The debian/bin/test-patches script can handle this all for you.

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
On Sat, 2010-09-04 at 09:27 +1000, Russell Stuart wrote:
> On Mon, 2010-08-30 at 14:48 +0100, Ben Hutchings wrote:
> > On Mon, 2010-08-30 at 17:34 +1000, Russell Stuart wrote:
> > > The problem disappears in
> > > linux-image-2.6.35-trunk-amd64_2.6.35-1~experimental.2.
> >
> > Yes, as I expected. Can you please test the attached patch against the
> > version in unstable? Directions for rebuilding an official kernel
> > package are at
> > http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official.
>
> Applied that. It changed the problem. Before I got a nice repeatable
> BUG. Now the openvpn instance unconditionally segfaults and normally
> nothing appears on the console or in kern.log. Once I got lucky and
> this appeared on the console:
[...]

Are you quite sure you used the modified kernel? This message matches your original report:

> kernel:[52062.330671] Code: 74 0f 48 89 ef e8 24 07 00 00 eb 05 bb fe ff ff ff 89 d8 5b 5d 41 5c c3 48 85 ff 74 0e 48 8b 7f 30 48 85 ff 74 05 48 85 f6 75 04 0f 0b eb fe ba 02 00 00 00 e9 5d ff ff ff 55 53 48 89 fb 48 c7

'cat /proc/version' will tell you for sure which version you are running.

> The machine appears to freeze in various ways - eg you can't get a
> login prompt to have a sniff around and the first command you type at
> an existing shell prompt that requires disk IO freezes, and a
> "sleep 300; sudo reboot -f" doesn't do anything. On the other hand a
> "for f in $(seq 1000); do echo $f; sleep 1; done" continues on as
> though nothing has happened. Disk IO is probably borked.

The crash occurs at a point where the tun driver is holding the 'RTNL' lock, which controls access to network device configuration. Any operation that must acquire that lock (and it's surprising how many operations do) will hang.

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
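If a shell is still responsive in this state, one rough way to see which processes are stuck is to list tasks in uninterruptible sleep (state D in ps output), which is where a task blocked on a kernel mutex normally sits. The sketch below is an assumption-laden illustration rather than something used in this thread: the filter name blocked_tasks is hypothetical, it expects a header line followed by rows whose second column is the process state, and ps itself may hang on a machine that is wedged badly enough.

```shell
#!/bin/sh
# blocked_tasks
# Reads ps-style output on stdin (header line first, process state in
# the second column) and prints only the rows whose state starts with
# "D", i.e. tasks in uninterruptible sleep.
blocked_tasks() {
    awk 'NR > 1 && $2 ~ /^D/'
}
```

A possible invocation with procps is `ps -eo pid,stat,wchan:32,comm | blocked_tasks`. The wchan column, where available, hints at what each task is waiting on.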
Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
On Mon, 2010-08-30 at 14:48 +0100, Ben Hutchings wrote:
> On Mon, 2010-08-30 at 17:34 +1000, Russell Stuart wrote:
> > The problem disappears in
> > linux-image-2.6.35-trunk-amd64_2.6.35-1~experimental.2.
>
> Yes, as I expected. Can you please test the attached patch against the
> version in unstable? Directions for rebuilding an official kernel
> package are at
> http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official.

Applied that. It changed the problem. Before I got a nice repeatable BUG. Now the openvpn instance unconditionally segfaults and normally nothing appears on the console or in kern.log. Once I got lucky and this appeared on the console:

    Message from sysl...@toby at Sep 4 09:11:57 ...
     kernel:[52062.327222] [ cut here ]

    Message from sysl...@toby at Sep 4 09:11:57 ...
     kernel:[52062.329577] invalid opcode: [#1] SMP

    Message from sysl...@toby at Sep 4 09:11:57 ...
     kernel:[52062.330671] last sysfs file: /sys/devices/virtual/misc/tun/uevent

    Message from sysl...@toby at Sep 4 09:11:57 ...
     kernel:[52062.330671] Stack:

    Message from sysl...@toby at Sep 4 09:11:57 ...
     kernel:[52062.330671] Call Trace:

    Message from sysl...@toby at Sep 4 09:11:57 ...
     kernel:[52062.330671] Code: 74 0f 48 89 ef e8 24 07 00 00 eb 05 bb fe ff ff ff 89 d8 5b 5d 41 5c c3 48 85 ff 74 0e 48 8b 7f 30 48 85 ff 74 05 48 85 f6 75 04 0f 0b eb fe ba 02 00 00 00 e9 5d ff ff ff 55 53 48 89 fb 48 c7

The machine appears to freeze in various ways - eg you can't get a login prompt to have a sniff around and the first command you type at an existing shell prompt that requires disk IO freezes, and a "sleep 300; sudo reboot -f" doesn't do anything. On the other hand a "for f in $(seq 1000); do echo $f; sleep 1; done" continues on as though nothing has happened. Disk IO is probably borked.

Archive: http://lists.debian.org/1283556425.23992.14.ca...@russell-laptop
Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
The problem disappears in linux-image-2.6.35-trunk-amd64_2.6.35-1~experimental.2.

Archive: http://lists.debian.org/1283153685.4366.2.ca...@russell-laptop
Bug#594845: Acknowledgement (linux-image-2.6.32-5-amd64: kernel BUG at /build/buildd-linux-2.6_2.6.32-20-amd64-lNUT1p/..../fs/sysfs/file.c:539)
On Mon, 2010-08-30 at 17:34 +1000, Russell Stuart wrote:
> The problem disappears in
> linux-image-2.6.35-trunk-amd64_2.6.35-1~experimental.2.

Yes, as I expected. Can you please test the attached patch against the version in unstable? Directions for rebuilding an official kernel package are at http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official.

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

From e5e7a7a14da22681abd96a305753e8cdcf898d40 Mon Sep 17 00:00:00 2001
From: Ben Hutchings <b...@decadent.org.uk>
Date: Mon, 30 Aug 2010 14:38:14 +0100
Subject: [PATCH] tun: Don't add sysfs attributes to devices without sysfs directories

Prior to Linux 2.6.35, net devices outside the initial net namespace
did not have sysfs directories. Attempting to add attributes to them
will trigger a BUG().

Reported-by: Russell Stuart <russell-deb...@stuart.id.au>
Signed-off-by: Ben Hutchings <b...@decadent.org.uk>
---
 drivers/net/tun.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4fdfa2a..0f77aca 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1006,7 +1006,8 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 	if (err < 0)
 		goto err_free_sk;
 
-	if (device_create_file(&tun->dev->dev, &dev_attr_tun_flags) ||
+	if (!net_eq(dev_net(tun->dev), &init_net) ||
+	    device_create_file(&tun->dev->dev, &dev_attr_tun_flags) ||
 	    device_create_file(&tun->dev->dev, &dev_attr_owner) ||
 	    device_create_file(&tun->dev->dev, &dev_attr_group))
 		printk(KERN_ERR "Failed to create tun sysfs files\n");
--
1.7.1