Bug#1040198: Turris Omnia kernel warns about "i2c i2c-0: mv64xxx: I2C bus locked" and "pca953x 8-0071: failed reading register"
Package: linux-image-6.1.0-9-armmp 6.1.27-1

I recently upgraded a Turris Omnia device (https://docs.turris.cz/hw/omnia/omnia/) to debian bookworm. The newer kernel now produces a pair of repeated error messages a few times a minute:

Jul 03 05:48:02 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:48:02 kernel: pca953x 8-0071: failed reading register
Jul 03 05:48:19 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:48:19 host kernel: pca953x 8-0071: failed reading register
Jul 03 05:48:57 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:48:57 host kernel: pca953x 8-0071: failed reading register
Jul 03 05:49:25 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:49:25 host kernel: pca953x 8-0071: failed reading register

This appears to be a regression, because the earlier kernel (5.10.0) did not produce these error messages.

If there is any additional debugging information you'd find useful, i'm happy to try to gather it. These log messages are annoying because they mean additional writes to the logfiles.
--dkg -- Package-specific info: ** Version: Linux version 6.1.0-9-armmp (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP Debian 6.1.27-1 (2023-05-08) ** Command line: earlyprintk console=ttyS0,115200 root=/dev/mapper/host--vg-root cfg80211.freg=$regdomain pcie_aspm=off ** Not tainted ** Kernel log: Unable to read kernel log; any relevant messages should be attached ** Model information Hardware: Marvell Armada 380/385 (Device Tree) Revision nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink binfmt_misc sit tunnel4 ip_tunnel wireguard libchacha20poly1305 chacha_neon poly1305_arm ath9k curve25519_neon libcurve25519_generic ip6_udp_tunnel udp_tunnel ath9k_common ath9k_hw ath10k_pci ath10k_core x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic dm_crypt dm_mod gpio_pca953x mv88e6xxx dsa_core bridge stp llc selftests ehci_orion ehci_hcd ahci_mvebu i2c_mux_pca954x xhci_plat_hcd libahci_platform i2c_mux libahci xhci_hcd marvell libata mvmdio mvneta mdio_devres of_mdio fixed_phy phylink fwnode_mdio sdhci_pxav3 usbcore scsi_mod scsi_common sdhci_pltfm sdhci libphy spi_orion i2c_mv64xxx aes_arm_bs crypto_simd cryptd ** PCI devices: 00:01.0 PCI bridge [0604]: Marvell Technology Group Ltd. 88F6820 [Armada 385] ARM SoC [11ab:6820] (rev 04) (prog-if 00 [Normal decode]) Subsystem: Marvell Technology Group Ltd. 88F6820 [Armada 385] ARM SoC [11ab:11ab] Device tree node: /sys/firmware/devicetree/base/soc/pcie-controller/pcie@1,0 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: 00:02.0 PCI bridge [0604]: Marvell Technology Group Ltd. 88F6820 [Armada 385] ARM SoC [11ab:6820] (rev 04) (prog-if 00 [Normal decode]) Subsystem: Marvell Technology Group Ltd. 
88F6820 [Armada 385] ARM SoC [11ab:11ab] Device tree node: /sys/firmware/devicetree/base/soc/pcie-controller/pcie@2,0 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 002 Device 002: ID 067b:2303 Prolific Technology, Inc. PL2303 Serial Port / Mobile Action MA-8910P Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub -- System Information: Debian Release: 12.0 APT prefers stable-security APT policy: (500, 'stable-security'), (500, 'stable') Architecture: armhf (armv7l) Kernel: Linux 6.1.0-9-armmp (SMP w/2 CPU threads) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set Shell: /bin/sh linked to /usr/bin/dash Init: systemd (via /run/systemd/system) Versions of packages linux-image-6.1.0-9-armmp depends on: ii initramfs-tools [linux-initramfs-tool] 0.142 ii kmod30+20221128-1 ii linux-base 4.9 Versions of packages linux-image-6.1.0-9-armmp recommends: pn apparmor ii firmware-linux-free 20200122-1 Versions of packages linux-image-6.1.0-9-armmp suggests: pn debian-kernel-handbook pn linux-doc-6.1 Versions of packages linux-image-6.1.0-9-armmp is related to: pn firmware-amd-graphics ii firmware-atheros 20230210-5 pn firmware-bnx2 pn firmware-bnx2x pn
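To put a number on "a few times a minute" in the Turris Omnia report above, the repeated message pair can be tallied from saved log text. This is only a rough sketch: the sample lines are copied from the report, while the function name `tally` and the regular expression are illustrative assumptions, and real input would have to come from wherever the kernel log is stored on the affected machine.

```python
import re
from collections import Counter

# Sample lines copied from the report; the second line really does lack a hostname.
SAMPLE_LOG = """\
Jul 03 05:48:02 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:48:02 kernel: pca953x 8-0071: failed reading register
Jul 03 05:48:19 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:48:19 host kernel: pca953x 8-0071: failed reading register
Jul 03 05:48:57 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:48:57 host kernel: pca953x 8-0071: failed reading register
Jul 03 05:49:25 host kernel: i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Jul 03 05:49:25 host kernel: pca953x 8-0071: failed reading register
"""

# Capture the timestamp down to the minute and the text after "kernel: ".
LINE_RE = re.compile(
    r"^(?P<minute>\w{3} \d{2} \d{2}:\d{2}):\d{2} .*?kernel: (?P<msg>.*)$"
)


def tally(log_text):
    """Return (count per message text, count of kernel lines per minute)."""
    per_message = Counter()
    per_minute = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a kernel log line
        per_message[match["msg"]] += 1
        per_minute[match["minute"]] += 1
    return per_message, per_minute


if __name__ == "__main__":
    messages, minutes = tally(SAMPLE_LOG)
    for text, count in messages.items():
        print(count, text)
    for minute, count in sorted(minutes.items()):
        print(minute, count)
```

Comparing the same tally on a 5.10 boot and a 6.1 boot would be one way to back up the regression claim with numbers.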
Bug#953569:
On Tue 2020-03-10 22:23:18 -0600, Jason A. Donenfeld wrote:
> https://data.zx2c4.com/wireguard-5.5.8-20a586ec4f5acf195f71caea55c5a33c574078cb69712da591467ffc08dd8b72.zip

Thanks, Jason!

> - A user is on stock Debian and runs `apt install wireguard`: only
>   wireguard-tools is pulled in.
> - A user is on weird Debian (say, some AWS kernel) and runs `apt
>   install wireguard`: wireguard-tools and wireguard-dkms are pulled in.

Are you thinking about kernel 5.5 or kernel 5.6 or later? if wg is mainlined in 5.6 and the user is on a 5.6 or later kernel, then i don't think we want wireguard-dkms at all, right?

> I was under the impression that the "Provides" mechanism does a good
> job at that. But perhaps there's another good way you have in mind.

The automatic uninstallation of packages with legacy kernel ABIs is what Ben has identified as a sticking point for putting a Provides: on the "real" kernel package. But it doesn't sound like that's a problem for the kernel metapackages.

--dkg
Bug#953569:
On Wed 2020-03-11 03:50:00 +0000, Ben Hutchings wrote:
> If some of the packages providing a virtual package are explicitly
> installed, and some auto-installed, it could reasonably auto-remove the
> latter group (though I don't think it does). But if all of them are
> auto-installed, which will be the case for kernel packages, it can't
> tell which should be kept and which removed.

hm, right, i can see how that would be a problem. thanks for the explanation.

> But nothing will auto-remove the wireguard package. So you would have
> to keep it as a transitional package for one release cycle, even when
> it depends on just wireguard-tools.

I don't think it would be a problem to have a transitional package for one release cycle. do you?

> Yes, this is annoyingly complicated. That's why I need there to be a
> plan.

I'm open to any suggestions that you think would work better than what we've talked about so far. Let me know what you prefer.

--dkg
Bug#953569:
Thanks to Ben and Jason for following up here.

On Wed 2020-03-11 02:52:43 +0000, Ben Hutchings wrote:
> We definitely can't add a Provides on "real" kernel packages, because
> this breaks auto-removal of old packages.

I'm not sure i understand this. by "real" kernel packages i think you mean something like linux-image-5.4.0-4-amd64.

If linux-image-5.5.0-1-amd64 were installed, with such a Provides:, and then linux-image-5.5.0-2-amd64 were installed, also with such a Provides:, then why wouldn't autoremoval of linux-image-5.5.0-1-amd64 still work? The system would still have the Provides: satisfied.

Feel free to point me at some piece of Apt or dpkg documentation if i'm missing something obvious.

> We could possibly add it to the meta-packages, but there would have to
> be a plan for how we can drop it later (and have the Wireguard
> user-space just assume the kernel supports it).

When we can just assume that the kernel supports it, we might just drop the "wireguard" package entirely, and supply only the "wireguard-tools" package (maybe at that point, we make "wireguard-tools" itself Provide: wireguard). At that point, we certainly wouldn't need the Provides: on the kernel.

> We definitely shouldn't accumulate Provides for every component that
> was previously packaged out-of-tree.

I can see how that would be problematic :)

--dkg
Bug#953569: linux: please cherry-pick wireguard patches from 5.6
Package: src:linux
Severity: wishlist
Control: affects -1 src:wireguard-linux-compat src:wireguard

Hi Debian kernel folks--

Please cherry-pick the wireguard patches from Linux's 5.6 development branch into future debian builds of the 5.5 (and 5.4?) Linux kernel.

The Wireguard VPN mechanism is due to be released in upstream kernel 5.6. Debian unstable and testing currently have all the tooling needed to configure and control wireguard interfaces as long as the kernel module is available.

In particular, systemd-networkd is capable of configuring wireguard interfaces in some standard configurations, and the "wireguard-tools" source package offers fine-grained control via /usr/bin/wg and a "wg-quick@" systemd unit template.

The kernel module itself has been available for a few years now for people with machines that can afford to compile source code via wireguard-dkms, but the dkms footprint and failure modes make it more heavyweight than just having the module available directly.

I was looking into how to simplify this, and part of it involved packaging a kernel module, but i've been dissuaded by folks on the #debian-kernel IRC channel from trying to keep such a thing in debian. bwh suggested cherry-picking the wireguard patches into src:linux 5.5, which would be great.

the wireguard metapackage currently has:

    Depends: wireguard-dkms (>= 0.0.20200121-2) | wireguard-modules (>= 0.0.20191219)

so it would be nice if you could add the following metadata to any generated binary packages:

    Provides: wireguard-modules (= 0.0.20200121-2)

(you can replace the version number with whatever version info is present in the upstream series that you cherry-pick, of course)

I understand from some folks on #debian-kernel that they don't think this sort of dependency resolution mechanism is good to do, but it doesn't appear to hurt anything, and it should smooth out the default use case, so it seems like a win to me.
Thanks for considering this,

--dkg
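The dependency line quoted in the request above can be read as "either alternative is enough, provided its version is new enough". The following is a minimal sketch of that reading, not how apt or dpkg actually resolve dependencies: `parse_version` handles only the simple `A.B.C-R` shape seen in this thread rather than the full Debian version-ordering rules, and the "installed" dictionaries are hypothetical systems made up for illustration.

```python
def parse_version(version):
    """Simplified parse of 'A.B.C-R' into ((A, B, C), R).

    Assumption: purely numeric components, at most one '-'. This is NOT the
    complete Debian version comparison algorithm.
    """
    upstream, _, revision = version.partition("-")
    return tuple(int(part) for part in upstream.split(".")), int(revision or 0)


def satisfies(alternatives, available):
    """Return the first alternative met by `available`, or None.

    alternatives: list of (package name, minimum version)
    available:    dict mapping a name (a real package or a versioned
                  Provides) to the version on offer
    """
    for name, minimum in alternatives:
        have = available.get(name)
        if have is not None and parse_version(have) >= parse_version(minimum):
            return name
    return None


# The Depends line of the wireguard metapackage, as quoted in the report.
WIREGUARD_DEPENDS = [
    ("wireguard-dkms", "0.0.20200121-2"),
    ("wireguard-modules", "0.0.20191219"),
]

if __name__ == "__main__":
    # Hypothetical system whose kernel package carries the requested Provides.
    kernel_with_provides = {"wireguard-modules": "0.0.20200121-2"}
    # Hypothetical system with neither alternative available.
    bare_system = {}
    print(satisfies(WIREGUARD_DEPENDS, kernel_with_provides))
    print(satisfies(WIREGUARD_DEPENDS, bare_system))
```

In the first case the second alternative is met without wireguard-dkms, which is the "smooth default" the request is after; in the second, a resolver would have to install one of the alternatives.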
Re: Bug#943555: wireguard-dkms: Kernel modules don't build with kernel 5.3.0-1-arm64 on Raspberry Pi3
On Tue 2019-11-12 09:16:37 +0100, Christian Haul wrote:
> On 11.11.19 15:33, Daniel Kahn Gillmor wrote:
>> control: affects 943555 + dkms
>>
>> On Sun 2019-11-10 18:09:33 +0100, Christian Haul wrote:
>>> On 10.11.19 14:51, Daniel Kahn Gillmor wrote:
>>>> On Sat 2019-10-26 12:51:47 +0000, Chris. wrote:
>
>> However, that package looks like it's about to be superseded by
>> linux-headers-5.3.0-2-arm64 because of a recent ABI bump to the kernel
>> in unstable.
>>
>> I know this is a lot to ask, but can you take the following steps?
>>
>>  * upgrade your system to linux-image-5.3.0-2-arm64 and reboot into the
>>    new kernel
>>  * make sure you have linux-headers-5.3.0-2-arm64 installed
>>  * retry building the various kernel modules that were failing?
>
> Kernel package just arrived. Builds fine. All good - bug can be closed.
>
> Thanks for your patience.

Great! thanks for your followup here. This e-mail should close the bug report.

--dkg
Re: Bug#943555: wireguard-dkms: Kernel modules don't build with kernel 5.3.0-1-arm64 on Raspberry Pi3
control: affects 943555 + dkms

On Sun 2019-11-10 18:09:33 +0100, Christian Haul wrote:
> On 10.11.19 14:51, Daniel Kahn Gillmor wrote:
>> On Sat 2019-10-26 12:51:47 +0000, Chris. wrote:
>>> on Raspberry Pi3 kernel module stops building since updating to kernel
>>> 5.3.0.1.
>
>> I'm not sure this is a wireguard-specific issue...
>
> I have added another DKMS package (iptables-netflow-dkms) and it runs
> into the same issue.
>
> # cat /var/lib/dkms/ipt-netflow/2.4/build/make.log
> DKMS make.log for ipt-netflow-2.4 for kernel 5.3.0-1-arm64 (aarch64)
> Sun Nov 10 14:16:57 UTC 2019
> Compiling for kernel 5.3.7
> make -C /lib/modules/5.3.0-1-arm64/build
> M=/var/lib/dkms/ipt-netflow/2.4/build modules CONFIG_DEBUG_INFO=y
> make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent
> make rule.
> make[1]: Entering directory '/usr/src/linux-headers-5.3.0-1-arm64'
> arch/arm64/Makefile:58: *** arm-linux-gnueabihf-gcc not found, check
> CROSS_COMPILE_COMPAT. Stop.
> make[1]: *** [/usr/src/linux-headers-5.3.0-1-common/Makefile:179:
> sub-make] Error 2
> make[1]: Leaving directory '/usr/src/linux-headers-5.3.0-1-arm64'
> make: *** [Makefile:25: ipt_NETFLOW.ko] Error 2
>
>
>> This looks to me like you don't have the arm64-specific compiler
>> installed, which ought to have been installed correctly by
>> linux-headers-5.3.0-1-arm64.
>
> Not an expert here, but I can still run
>
> root@rpi3:~# dkms build wireguard/0.0.20191012 -k 5.2.0-3-arm64
>
> Kernel preparation unnecessary for this kernel. Skipping...
>
> Building module:
> cleaning build area
> make -j4 KERNELRELEASE=5.2.0-3-arm64 -C /lib/modules/5.2.0-3-arm64/build
> M=/var/lib/dkms/wireguard/0.0.20191012/build.
>
> cleaning build area.
>
> DKMS: build completed.
>
> I.E. compilation works in general. Something in kernel 5.3 changed that
> breaks DKMS on Raspberry Pi3. Looking for a gcc-9 for abihf I can only
> find packages marked "crosscompile" (plus it's missing a dependency so I
> can't install)
>
> Should I file a bug against the linux-headers package instead?

I think if anything this bug report itself should be reassigned to linux-headers-5.3.0-1-arm64.

However, that package looks like it's about to be superseded by linux-headers-5.3.0-2-arm64 because of a recent ABI bump to the kernel in unstable.

I know this is a lot to ask, but can you take the following steps?

 * upgrade your system to linux-image-5.3.0-2-arm64 and reboot into the
   new kernel
 * make sure you have linux-headers-5.3.0-2-arm64 installed
 * retry building the various kernel modules that were failing?

If they are still failing for you on 5.3.0-2-arm64, i recommend reassigning this bug report to linux-headers-5.3.0-2-arm64.

Sorry to not have the equipment up and running to test these things myself. I appreciate your persevering on this bug report, let's get it figured out (and hopefully, fixed)!

All the best,

--dkg
Bug#929938: linux: please enable CONFIG_XFRM_STATISTICS=y
On Mon 2019-06-03 12:35:45 -0400, Daniel Kahn Gillmor wrote:
> 0 dkg@alice:~$ grep CONFIG_XFRM_STATISTICS /boot/config-4.19.0-5-amd64
> # CONFIG_XFRM_STATISTICS is not set
> 0 dkg@alice:~$
>
> Paul Wouters, Libreswan upstream developer says:
>
>> Still this kernel option is the only way to get IPsec kernel error
>> counters, which are the only diagnostic available for kernel IPsec, so
>> they should really enable it.

Uwe Kleine-König put this into a merge request on salsa:

https://salsa.debian.org/kernel-team/linux/merge_requests/159

Thanks, Uwe!

--dkg
Bug#929938: linux: please enable CONFIG_XFRM_STATISTICS=y
X-Debbugs-Cc: Paul Wouters
Package: linux
Version: 4.19.37-3
Control: affects -1 libreswan

0 dkg@alice:~$ grep CONFIG_XFRM_STATISTICS /boot/config-4.19.0-5-amd64
# CONFIG_XFRM_STATISTICS is not set
0 dkg@alice:~$

Paul Wouters, Libreswan upstream developer says:

> Still this kernel option is the only way to get IPsec kernel error
> counters, which are the only diagnostic available for kernel IPsec, so
> they should really enable it.

Regards,

--dkg
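The grep in the report above distinguishes two spellings a kernel config file can use for an option: a commented `# ... is not set` line, or an `OPTION=value` assignment. A small sketch of the same check follows; the function name `config_option_state` and the sample strings are illustrative, and only the "is not set" line is taken from the report.

```python
def config_option_state(config_text, option):
    """Report how `option` appears in kernel config text.

    Returns "not set" for the commented-out form, the assigned value
    (such as "y" or "m") for an assignment, and "absent" otherwise.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == f"# {option} is not set":
            return "not set"
        # Require the "=" so CONFIG_XFRM does not match CONFIG_XFRM_STATISTICS.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return "absent"


if __name__ == "__main__":
    reported = "# CONFIG_XFRM_STATISTICS is not set\n"  # as seen in the report
    requested = "CONFIG_XFRM_STATISTICS=y\n"            # what the bug asks for
    print(config_option_state(reported, "CONFIG_XFRM_STATISTICS"))
    print(config_option_state(requested, "CONFIG_XFRM_STATISTICS"))
```

On a real system the text would be read from a file such as the `/boot/config-4.19.0-5-amd64` used in the report.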
Bug#929280: linux-image-4.19.0-5-powerpc-smp: warning when loading ecdh_generic: "alg: ecdh: Party A: generate public key test failed. Invalid output"
Package: src:linux Version: 4.19.37-3 Severity: normal Control: found -1 5.0.2-1~exp1 If, on this 32-bit powerpc machine, i do: # modprobe -v ecdh_generic then the kernel produces two lines of output: alg: ecdh: Party A: generate public key test failed. Invalid output alg: ecdh: test failed on vector 1, err=-22 This also happens on kernel 5.0.2-1~exp1 (from linux-image-5.0.0-trunk-powerpc-smp). I'm happy to run more tests on this machine if it would be useful. fwiw, this report seems to come from crypto/testmgr.c --dkg -- Package-specific info: ** Version: Linux version 4.19.0-5-powerpc-smp (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-7)) #1 SMP Debian 4.19.37-3 (2019-05-15) ** Command line: BOOT_IMAGE=/boot/vmlinux-4.19.0-5-powerpc-smp root=/dev/mapper/vg_tyr0-root ro quiet radeon.modeset=1 video=radeonfb:off radeon.agpmode=1 init=/bin/systemd ** Not tainted ** Kernel log: Unable to read kernel log; any relevant messages should be attached ** Model information revision: 1.2 (pvr 8003 0102) platform: PowerMac model : PowerBook5,7 machine : PowerBook5,7 motherboard : PowerBook5,7 MacRISC3 Power Macintosh Device Tree model: PowerBook5,7 ** Loaded modules: ecdh_generic uinput binfmt_misc hfs arc4 b43 radeon bcma mac80211 ttm drm_kms_helper joydev cfg80211 drm rfkill rng_core drm_panel_orientation_quirks yenta_socket syscopyarea appletouch sysfillrect sysimgblt fb_sys_fops evdev pcmcia_rsrc sg rack_meter therm_adt746x snd_powermac snd_pcm snd_timer snd soundcore loop ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb hid_apple hid_generic usbhid hid dm_mod firewire_ohci firewire_core sungem sungem_phy crc_itu_t sd_mod ohci_pci ehci_pci ohci_hcd ehci_hcd usbcore ssb sr_mod cdrom usb_common mmc_core pcmcia pcmcia_core ** PCI devices: :00:0b.0 Host bridge [0600]: Apple Inc. 
UniNorth 2 AGP [106b:0034] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: agpgart-uninorth :00:10.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RV350/M10 / RV360/M11 [Mobility Radeon 9600 (PRO) / 9700] [1002:4e50] (prog-if 00 [VGA controller]) Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] RV350/M10 / RV360/M11 [Mobility Radeon 9600 (PRO) / 9700] [1002:4e50] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: radeon Kernel modules: radeonfb, radeon 0001:10:0b.0 Host bridge [0600]: Apple Inc. UniNorth 2 PCI [106b:0035] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- Kernel driver in use: b43-pci-bridge Kernel modules: ssb 0001:10:13.0 CardBus bridge [0607]: Texas Instruments PCI1510 PC card Cardbus Controller [104c:ac56] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Reset+ 16bInt+ PostWrite+ 16-bit legacy interface ports at 0001 Capabilities: Kernel driver in use: yenta_cardbus Kernel modules: yenta_socket 0001:10:17.0 Unassigned class [ff00]: Apple Inc. 
KeyLargo/Intrepid Mac I/O [106b:003e] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- Kernel driver in use: ohci-pci Kernel modules: ohci_pci 0001:10:1b.1 USB controller [0c03]: NEC Corporation OHCI USB Controller [1033:0035] (rev 43) (prog-if 10 [OHCI]) Subsystem: NEC Corporation USB Controller [1033:0035] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: ohci-pci Kernel modules: ohci_pci 0001:10:1b.2 USB controller [0c03]: NEC Corporation uPD72010x USB 2.0 Controller [1033:00e0] (rev 04) (prog-if 20 [EHCI]) Subsystem: NEC Corporation uPD72010x USB 2.0 Controller [1033:00e0] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: ehci-pci Kernel modules: ehci_pci 0002:24:0b.0 Host bridge [0600]: Apple Inc. UniNorth 2 Internal PCI [106b:0036]
Bug#911768: pinentry-gnome3 fails to open a window with 'No Gcr System Prompter available, falling back to curses'
Control: affects 911768 - gpg-agent
Control: affects 911768 + gcr

On Fri 2018-12-21 07:28:22 -0500, Theodore Y. Ts'o wrote:
> On Thu, Dec 20, 2018 at 03:17:03PM -0500, Daniel Kahn Gillmor wrote:
>>
>> I wonder whether we can rule out any interaction with gpg-agent itself
>> -- does "echo getpin | pinentry-gnome3" itself fall back to curses on
>> your system when nfs-kernel-server is installed?
>
> I can confirm that I did this experiment before I uninstalled
> nfs-kernel-server --- and it fell back to curses.

thanks! I think that takes gpg-agent out of the picture (a nice simplification in terms of debugging) but also implicates gcr itself, so i'm adjusting the "affects" tag.

> The next experiment to do would be to reinstall nfs-kernel-server and
> reboot --- and see if it falls back to curses again.

Please report back if you do that experiment! thanks for following up, Ted.

--dkg
Bug#878614: linux-image-4.13.0-1-amd64: unexpected IRQ trap at vector e8 on Intel NUC H26998-401
On Sat 2017-10-14 22:17:29 -0400, Daniel Kahn Gillmor wrote:
> Trying to boot this machine into 4.13.0-1-amd64 results in several kernel
> messages like:
>
> unexpected IRQ trap at vector e8
>
> per second, and basic system services take ages to start (including
> journald, which means i don't have a clear log of the error messages
> at the moment).
>
> rebooting into 4.11.0-2-amd64 produces no such error. The machine is
> already running irqbalance 1.1.0-2.3, fwiw.

fwiw, this bug went away once i upgraded the BIOS of the machine.

the old bios was: TYBYT10H.86A.0019.2014.0327.1516
The new bios is: TYBYT10H.86A.0061.2017.1011.1904

After this upgrade, new kernels boot just fine.

The bios file i installed was TY10H0061.bio, with sha256 2b841a072ce65b1b73cb494b7d291e11c245b7373c601559f5547981a10c9543

This apparently also improved the onboard DRAM controller so that it's now operating at 1600MHz (0.6 ns) instead of 1066MHz (0.9 ns).

So, i think this was a hardware problem, and not a problem with the kernel, so i'm closing this bug report. I don't know why the newer kernels had the problem with this hardware but older ones didn't. ah well.

hope this info helps,

--dkg
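The nanosecond figures quoted next to the two DRAM rates above look like cycle times, i.e. the reciprocal of the rate. That is an assumption about where the numbers come from (the message does not say which tool printed them), but the arithmetic is easy to check with a short sketch:

```python
def cycle_time_ns(rate_mhz):
    """Cycle time in nanoseconds for a rate given in MHz (1 / f)."""
    return 1000.0 / rate_mhz


if __name__ == "__main__":
    for rate in (1066, 1600):
        # Shown to three decimals; the message quotes one decimal place.
        print(f"{rate} MHz -> {cycle_time_ns(rate):.3f} ns")
```

Under that assumption the values are roughly 0.938 ns and 0.625 ns, which match the quoted 0.9 ns and 0.6 ns if the source truncated to one decimal.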
Bug#886662: wireguard-dkms should depend on libelf-dev
On Mon 2018-01-08 13:34:53 -0500, Robert Edmonds wrote:
> You may want to hold off on fixing this in wireguard. It looks like this
> is a regression in src:linux (#886474). Given this failure is coming
> from the kernel build system apparently before the module itself even
> starts building, it would seem to affect all out-of-tree kernel module
> packages.

thx for the heads-up, Robert. I'll hold off for the moment and keep an eye on src:linux.

--dkg
Bug#878614: linux-image-4.13.0-1-amd64: unexpected IRQ trap at vector e8 on Intel NUC H26998-401
Package: src:linux
Version: 4.13.4-1
Severity: important

Trying to boot this machine into 4.13.0-1-amd64 results in several kernel messages like:

unexpected IRQ trap at vector e8

per second, and basic system services take ages to start (including journald, which means i don't have a clear log of the error messages at the moment).

rebooting into 4.11.0-2-amd64 produces no such error. The machine is already running irqbalance 1.1.0-2.3, fwiw.

Let me know if there's more information i can give you (or other kernels i should try) that would help to narrow this down.
--dkg -- Package-specific info: ** Kernel log: boot messages should be attached ** Model information sys_vendor: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff product_name: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff product_version: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff chassis_vendor: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff chassis_version: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff bios_vendor: Intel Corp. bios_version: TYBYT10H.86A.0019.2014.0327.1516 board_vendor: Intel Corporation board_name: DE3815TYKH board_version: H26998-401 ** PCI devices: 00:00.0 Host bridge [0600]: Intel Corporation Atom Processor Z36xxx/Z37xxx Series SoC Transaction Register [8086:0f00] (rev 0c) Subsystem: Intel Corporation Atom Processor Z36xxx/Z37xxx Series SoC Transaction Register [8086:2056] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- Kernel driver in use: i915 Kernel modules: i915 00:13.0 SATA controller [0106]: Intel Corporation Atom Processor E3800 Series SATA AHCI Controller [8086:0f23] (rev 0c) (prog-if 01 [AHCI 1.0]) Subsystem: Intel Corporation Atom Processor E3800 Series SATA AHCI Controller [8086:2056] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: ahci Kernel modules: ahci 00:14.0 USB controller [0c03]: Intel Corporation Atom Processor 
Z36xxx/Z37xxx, Celeron N2000 Series USB xHCI [8086:0f35] (rev 0c) (prog-if 30 [XHCI]) Subsystem: Intel Corporation Atom Processor Z36xxx/Z37xxx, Celeron N2000 Series USB xHCI [8086:2056] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- Kernel driver in use: xhci_hcd Kernel modules: xhci_pci 00:17.0 SD Host controller [0805]: Intel Corporation Atom Processor E3800 Series eMMC 4.5 Controller [8086:0f50] (rev 0c) (prog-if 01) Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Kernel driver in use: sdhci-pci Kernel modules: sdhci_pci 00:18.0 DMA controller [0801]: Intel Corporation Atom Processor Z36xxx/Z37xxx Series LPIO2 DMA Controller [8086:0f40] (rev 0c) (prog-if 02 [EISA DMA]) Subsystem: Intel Corporation Atom Processor Z36xxx/Z37xxx Series LPIO2 DMA Controller [8086:0f40] Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 00:18.1 Serial bus controller [0c80]: Intel Corporation Atom Processor Z36xxx/Z37xxx Series LPIO2 I2C Controller #1 [8086:0f41] (rev 0c) Subsystem: Intel Corporatio
Bug#878307: usbip: enable use of unix-domain sockets, not just network traffic
Package: usbip
Severity: wishlist

Some tools (like Gnuk) offer USB device emulation, exported to the host for testing via usbip. However, when using usbip in this way, any local user account with packet-sniffing privilege (e.g. members of group "wireshark" in a common debian convention) gets access to the traffic over the USB port.

It would be great to enable usbip-style device exposure over a SOCK_STREAM unix-domain socket as well as a TCP-based socket, since this would make it possible to provide emulated USB devices without routing traffic through the networking stack.

--dkg
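To illustrate the transport the wishlist item above is asking for, here is a minimal sketch of a byte stream carried over a SOCK_STREAM unix-domain socket whose path is restricted by filesystem permissions. It is not usbip code and does not speak the usbip protocol; the socket file name, the echo server and the function names are all hypothetical stand-ins.

```python
import os
import socket
import tempfile
import threading


def echo_once(server_sock):
    """Accept one connection and send back whatever arrives first."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(4096))


def roundtrip_over_unix_socket(payload):
    """Send a small payload through an AF_UNIX stream socket and return the echo."""
    # TemporaryDirectory creates a directory only its owner can enter.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "usb-emulation.sock")  # hypothetical name
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            os.chmod(path, 0o600)  # access is governed by file permissions
            server.listen(1)
            worker = threading.Thread(target=echo_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(payload)
                reply = client.recv(4096)
            worker.join()
    return reply


if __name__ == "__main__":
    print(roundtrip_over_unix_socket(b"emulated USB traffic"))
```

The stream semantics are the same as with a TCP socket, but access is decided by who can open the socket path rather than by who can capture packets on a network interface, which is the property the request relies on.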
Bug#854421: systemd: "systemctl --user cat dirmngr.socket" produced garbage beyond # /dev/null
Control: retitle 854421 [CVE-2017-5550] kernel dumps arbitrary memory when splice()ing from /dev/null

On Tue 2017-02-07 20:21:31 -0500, Ben Hutchings wrote:
> Control: reassign -1 src:linux 4.9.2-2
> Control: close -1 4.9.6-3
> Control: severity -1 serious
> Control: tag -1 security
>
> On Tue, 2017-02-07 at 11:14 -0500, Daniel Kahn Gillmor wrote:
>> On Tue 2017-02-07 10:49:39 -0500, Daniel Kahn Gillmor wrote:
>> > git clone https://0xacab.org/dkg/debian-bug-854421
>> > cd debian-bug-854421
>> > make
>>
>> interestingly, on at least one machine i try this on, getting it to
>> reproduce is very infrequent with plain "make", even with the 20 tries
>> on kernel version 4.9.2-2.
>
> It's much less likely to happen if there's only one CPU.
>
>> however, "make strace" seems to tickle the bug further, and makes it
>> much more likely to reproduce on 4.9.2-2, even though it's only one
>> try.
>>
>> with kernel 4.9.6-3 i haven't been able to reproduce it with either
>> "make" or "make strace".
>
> This is CVE-2017-5550, fixed by:
> https://git.kernel.org/linus/b9dc6f65bc5e232d1c05fe34b5daadc7e8bbf1fb

Thanks for tracking that down, Ben.

I can confirm that it's an infoleak of the worst kind, unfortunately -- i filled the RAM of a root-owned userspace process with an arbitrary string, and then triggered the dump from a non-privileged process and managed to get copies of the arbitrary string :(

--dkg
Bug#854421: systemd: "systemctl --user cat dirmngr.socket" produced garbage beyond # /dev/null
On Tue 2017-02-07 10:49:39 -0500, Daniel Kahn Gillmor wrote:
> git clone https://0xacab.org/dkg/debian-bug-854421
> cd debian-bug-854421
> make

interestingly, on at least one machine i try this on, getting it to reproduce is very infrequent with plain "make", even with the 20 tries on kernel version 4.9.2-2.

however, "make strace" seems to tickle the bug further, and makes it much more likely to reproduce on 4.9.2-2, even though it's only one try.

with kernel 4.9.6-3 i haven't been able to reproduce it with either "make" or "make strace".

--dkg
Bug#852740: WARNING: …/fs/sysfs/group.c:237 device_del : sysfs group 'power' not found for kobject 'event18'
Package: src:linux Version: 4.9.2-2 Severity: normal Using the same USB webcam as from the crash reported in https://bugs.debian.org/852738, i plugged it in again after rebooting into 4.9.0-1-amd64 (version 4.9.2-2). While i didn't get the null pointer dereference, i got several unusual messages ("Warning! Unlikely big volume range (=3072), cval->res is probably wrong") on insertion, and then 5 minutes later i yanked the camera and got a series of warnings on removal that all seem to be related to sysfs groups: Jan 26 14:53:14 alice kernel: usb 1-1.2: new high-speed USB device number 5 using ehci-pci Jan 26 14:53:14 alice kernel: usb 1-1.2: New USB device found, idVendor=046d, idProduct=0990 Jan 26 14:53:14 alice kernel: usb 1-1.2: New USB device strings: Mfr=0, Product=0, SerialNumber=2 Jan 26 14:53:14 alice kernel: usb 1-1.2: SerialNumber: 1B292B19 Jan 26 14:53:14 alice kernel: uvcvideo: Found UVC 1.00 device (046d:0990) Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 4 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 10 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 12 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 8 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 11 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 9 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Processing 2 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Extension 13 was not initialized! Jan 26 14:53:15 alice kernel: uvcvideo 1-1.2:1.0: Entity type for entity Camera 1 was not initialized! 
Jan 26 14:53:15 alice kernel: input: UVC Camera (046d:0990) as /devices/pci:00/:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/input/input93 Jan 26 14:53:15 alice kernel: usb 1-1.2: Warning! Unlikely big volume range (=3072), cval->res is probably wrong. Jan 26 14:53:15 alice kernel: usb 1-1.2: [5] FU [Mic Capture Volume] ch = 1, val = 4608/7680/1 Jan 26 14:53:15 alice kernel: usbcore: registered new interface driver snd-usb-audio Jan 26 14:53:36 alice kernel: usb 1-1.2: reset high-speed USB device number 5 using ehci-pci Jan 26 14:58:31 alice kernel: usb 1-1.2: USB disconnect, device number 5 Jan 26 14:58:31 alice kernel: usb 1-1.2: cannot submit urb (err = -19) Jan 26 14:58:31 alice kernel: uvcvideo: Failed to resubmit video URB (-19). Jan 26 14:58:31 alice kernel: usb 1-1.2: 3:1: cannot set freq 16000 to ep 0x86 Jan 26 14:58:31 alice kernel: [ cut here ] Jan 26 14:58:31 alice kernel: WARNING: CPU: 0 PID: 1917 at /build/linux-fgnWKv/linux-4.9.2/fs/sysfs/group.c:237 device_del+0x54/0x260 Jan 26 14:58:31 alice kernel: sysfs group 'power' not found for kobject 'event18' Jan 26 14:58:31 alice kernel: Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device iptable_filter cpufreq_conservative cpufreq_powersave cpufreq_userspace md_mod joydev nls_ascii nls_cp437 vfat fat iTCO_wdt iTCO_vendor_support arc4 iwldvm mac80211 intel_rapl snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp iwlwifi snd_hda_codec_conexant snd_hda_codec_generic cfg80211 kvm_intel kvm irqbypass intel_cstate intel_uncore intel_rapl_perf efi_pstore i2c_i801 psmouse evdev pcspkr efivars i2c_smbus sg lpc_ich mfd_core uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media i915 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep thinkpad_acpi shpchp snd_pcm snd_timer nvram snd soundcore thermal rfkill wmi drm_kms_helper e1000e battery drm ac fjes mei_me mei ptp Jan 26 14:58:31 alice kernel: pps_core i2c_algo_bit video button tpm_tis tpm_tis_core tpm ip_tables 
x_tables ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache dm_crypt algif_skcipher af_alg sd_mod mmc_block crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw ahci gf128mul ablk_helper cryptd libahci libata serio_raw scsi_mod ehci_pci ehci_hcd sdhci_pci usbcore sdhci mmc_core usb_common dm_mod coretemp loop efivarfs autofs4 Jan 26 14:58:31 alice kernel: CPU: 0 PID: 1917 Comm: AudioThread Not tainted 4.9.0-1-amd64 #1 Debian 4.9.2-2 Jan 26 14:58:31 alice kernel: Hardware name: LENOVO 42875TU/42875TU, BIOS 8DET58WW (1.28 ) 02/14/2012 Jan 26 14:58:31 alice kernel: ac128b84 b90c01afbd18 Jan 26 14:58:31 alice kernel: abe76dbe 8a35cee82000 b90c01afbd70 8a35d3141240 Jan 26 14:58:31 alice kernel: 8a35ceb021e0 8a35d41d99c0 8a35d18fa8f8 abe76e3f Jan 26 14:58:31 alice kernel: Call Trace: Jan 26 14:58:31 alice kernel: [] ? dump_stack+0x5c/0x78 Jan 26 14:58:31 alice kernel: [] ?
Bug#852738: linux-image-4.8.0-2-amd64: NULL pointer dereference in usb_destroy_configuration+0xb7/0x120
Package: src:linux Version: 4.8.15-2 Severity: normal Dear Maintainer, I plugged in a USB webcam, device ID 046d:0990 into my Thinkpad X220. The kernel froze with a NULL pointer dereference. here's the full log. I had to shut the machine down hard and restart it. Later, i rebooted into 4.9.0-1-amd64 (version 4.9.2-2) and tried the device again, with different errors, which i'll follow up with in another bug report. here's the NULL pointer dereference: Jan 26 09:10:34 alice kernel: IPv6: ADDRCONF(NETDEV_CHANGE): enp0s25: link becomes ready Jan 26 14:50:11 alice kernel: usb 1-1.2: new high-speed USB device number 17 using ehci-pci Jan 26 14:50:11 alice kernel: usb 1-1.2: config index 0 descriptor too short (expected 1433, got 0) Jan 26 14:50:11 alice kernel: usb 1-1.2: invalid descriptor for config index 0: type = 0xF8, length = 59 Jan 26 14:50:11 alice kernel: usb 1-1.2: can't read configurations, error -22 Jan 26 14:50:11 alice kernel: BUG: unable to handle kernel NULL pointer dereference at 0043 Jan 26 14:50:11 alice kernel: IP: [] usb_destroy_configuration+0xb7/0x120 [usbcore] Jan 26 14:50:11 alice kernel: PGD 0 Jan 26 14:50:11 alice kernel: Oops: 0002 [#1] SMP Jan 26 14:50:11 alice kernel: Modules linked in: xt_nat xt_policy iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack drbg ansi_cprng seqiv rmd160 ip_vti ip_tunnel af_key ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm6_tunnel tunnel6 xfrm_ipcomp sha1_ssse3 salsa20_x86_64 poly1305_x86_64 poly1305_generic des3_ede_x86_64 chacha20_x86_64 chacha20_generic chacha20poly1305 camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 cast6_avx_x86_64 cast6_generic cast5_avx_x86_64 cast5_generic cast_common deflate cts gcm serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common twofish_generic 
twofish_avx_x86_64 twofish_x86_64_3way Jan 26 14:50:11 alice kernel: xts twofish_x86_64 twofish_common xcbc cbc sha512_ssse3 sha512_generic xfrm_user xfrm_algo nls_utf8 cifs sha256_ssse3 cmac md4 hmac des_generic dns_resolver fscache cpuid uas usb_storage ctr ccm iptable_filter cpufreq_conservative cpufreq_powersave cpufreq_userspace md_mod joydev nls_ascii nls_cp437 vfat fat arc4 iTCO_wdt iTCO_vendor_support iwldvm mac80211 iwlwifi snd_hda_codec_hdmi cfg80211 snd_hda_codec_conexant snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass intel_cstate intel_uncore efi_pstore intel_rapl_perf evdev psmouse pcspkr efivars i2c_i801 i2c_smbus sg i915 lpc_ich mfd_core uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev snd_hda_intel media snd_hda_codec shpchp snd_hda_core drm_kms_helper snd_hwdep Jan 26 14:50:11 alice kernel: snd_pcm snd_timer thermal wmi thinkpad_acpi nvram snd e1000e drm soundcore rfkill ac battery fjes ptp mei_me pps_core i2c_algo_bit mei video button tpm_tis tpm_tis_core tpm ip_tables x_tables ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache dm_crypt algif_skcipher af_alg sd_mod mmc_block crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci libahci aesni_intel libata aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd scsi_mod sdhci_pci sdhci ehci_pci mmc_core ehci_hcd serio_raw usbcore usb_common dm_mod coretemp loop efivarfs autofs4 Jan 26 14:50:11 alice kernel: CPU: 0 PID: 28066 Comm: kworker/0:3 Not tainted 4.8.0-2-amd64 #1 Debian 4.8.15-2 Jan 26 14:50:11 alice kernel: Hardware name: LENOVO 42875TU/42875TU, BIOS 8DET58WW (1.28 ) 02/14/2012 Jan 26 14:50:11 alice kernel: Workqueue: usb_hub_wq hub_event [usbcore] Jan 26 14:50:11 alice kernel: task: 9983d65dc000 task.stack: 998381028000 Jan 26 14:50:11 alice kernel: RIP: 0010:[] [] usb_destroy_configuration+0xb7/0x120 [usbcore] Jan 26 14:50:11 alice kernel: RSP: 0018:99838102bcc8 EFLAGS: 00010206 Jan 26 14:50:11 
alice kernel: RAX: 003f RBX: 004f RCX: 998495801b07 Jan 26 14:50:11 alice kernel: RDX: 003d RSI: 998494b91200 RDI: 0043 Jan 26 14:50:11 alice kernel: RBP: 99834d50e810 R08: 0003 R09: 3740 Jan 26 14:50:11 alice kernel: R10: R11: 0173 R12: 99848f065000 Jan 26 14:50:11 alice kernel: R13: R14: 99834d50e400 R15: Jan 26 14:50:11 alice kernel: FS: () GS:99849e20() knlGS: Jan 26 14:50:11 alice kernel: CS: 0010 DS: ES: CR0: 80050033 Jan 26 14:50:11 alice kernel: CR2: 0043 CR3: 08e06000 CR4: 000406f0 Jan 26 14:50:11 alice kernel: Stack: Jan 26 14:50:11 alice kernel: 99848f065098 99848f065000 998491e62800 99848f0650a8 Jan 26 14:50:11 alice kernel: 998491b4e000
Bug#833231: initramfs-tools: during initramfs: "/init: line 1: logsave: not found" when e2fsprogs is not installed
Control: severity 833231 wishlist

On Tue 2016-08-02 02:29:27 -0400, Julien Cristau wrote:
> e2fsprogs is Essential: yes. Removing things from Essential is probably
> nontrivial. I'd say at most this is "wishlist" territory.

sure, wishlist is fine with me.

--dkg
Bug#833231: initramfs-tools: during initramfs: "/init: line 1: logsave: not found" when e2fsprogs is not installed
Source: initramfs-tools
Version: 0.125
Severity: normal
Control: affects -1 e2fsprogs

I have a minimal system with only btrfs filesystems, and e2fsprogs is not needed on it. It would be nice to be able to uninstall it.

however, when i uninstall e2fsprogs and reboot into an initramfs built by initramfs-tools, i see the following error and then i get dropped back into the initramfs shell:

    (initramfs) exit
    /init: line 1: logsave: not found
    The root filesystem on /dev/vda1 requires a manual fsck

    BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
    Enter 'help' for a list of built-in commands.

    (initramfs)

even if i remove the logsave and "requires a manual fsck" lines from /scripts/functions in the initramfs, these errors repeat themselves and i can't get the system booted again.

It would be nice to remove the need for e2fsprogs for systems that aren't using the extX family of filesystems.

--dkg

-- System Information:
Debian Release: stretch/sid
APT prefers testing-debug
APT policy: (500, 'testing-debug'), (500, 'testing'), (200, 'unstable-debug'), (200, 'unstable'), (1, 'experimental-debug'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 4.6.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Bug#821400: brcmfmac_sdio should read its configs from efi vars, where applicable
Over in https://bugs.debian.org/821400, Ben Hutchings makes the reasonable suggestion that, when brcm/brcmfmac43340-sdio.txt (or the equivalent configuration file for other brcm chipsets) is not available, brcmfmac_sdio should try looking for its configuration directly in the EFI variables where possible.

That would make the common use case much simpler, and the devices easier to use.

This is an attempt to report this suggestion upstream, where it has a chance of being implemented. Feel free to redirect me if you think it should be posted elsewhere.

Regards,

--dkg
Bug#821400: firmware-brcm80211: brcmfmac_sdio wants brcm/brcmfmac43340-sdio .bin and .txt, only .bin supplied
On Mon 2016-04-18 18:47:25 -0400, Ben Hutchings wrote:
> Ah, I dimly remembered that this information could be stashed in the
> system firmware somewhere. It seems like the driver ought to look
> there first if EFI support is enabled (there is an in-kernel API for
> reading EFI variables). That will have to be developed upstream
> though.

Agreed that this would make sense as an upstream change. I'm asking about it over on #linux-wireless on freenode, to see if i can interest any upstream developers.

regards,

--dkg
Bug#821400: firmware-brcm80211: brcmfmac_sdio wants brcm/brcmfmac43340-sdio .bin and .txt, only .bin supplied
Version: 20160110-1

Thanks for the quick response, Ben.

On Mon 2016-04-18 09:49:27 -0400, Ben Hutchings wrote:
>> It looks to me like the brcmfmac_sdio kernel module is expecting this
>> .txt file to be shipped alongside the .bin, but it isn't present.
>
> That's board-specific configuration, not really firmware. I don't know
> where you're supposed to get it from.

hm, https://wiki.debian.org/InstallingDebianOn/Asus/X205TA#WiFi suggests:

    cp /sys/firmware/efi/efivars/nvram-74b00bd9-805a-4d61-b51f-43268123d113 /lib/firmware/brcm/brcmfmac43340-sdio.txt

and indeed, after doing that, and then removing and re-loading the kernel module with:

    modprobe -v -r brcmfmac
    modprobe -v brcmfmac

then the device is found correctly.

--dkg
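The workaround above can be wrapped in a slightly more defensive form. This is only a sketch: `install_nvram` is a hypothetical helper name, and it assumes the EFI variable and firmware paths are the ones quoted from the wiki page. It refuses to overwrite an existing file and reports a missing EFI variable instead of failing silently.

```shell
#!/bin/sh
# Sketch only: install_nvram is a hypothetical helper, not part of any package.
# Copies the board-specific NVRAM configuration from SRC to DEST,
# but only if SRC is readable and DEST does not exist yet.
install_nvram() {
    src=$1
    dest=$2
    if [ ! -r "$src" ]; then
        echo "no readable EFI variable at $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists; leaving it alone" >&2
        return 0
    fi
    cp -- "$src" "$dest"
}
```

On the X205TA this would presumably be run as root, followed by the module reload from the message above:

    install_nvram /sys/firmware/efi/efivars/nvram-74b00bd9-805a-4d61-b51f-43268123d113 /lib/firmware/brcm/brcmfmac43340-sdio.txt
    modprobe -v -r brcmfmac
    modprobe -v brcmfmac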
Bug#821400: firmware-brcm80211: brcmfmac_sdio wants brcm/brcmfmac43340-sdio .bin and .txt, only .bin supplied
Package: firmware-brcm80211
Version: 20160110-1
Severity: normal

On an Asus X205T, dmesg says:

    brcmfmac_sdio mmc1:0001:1: firmware: direct-loading firmware brcm/brcmfmac43340-sdio.bin
    brcmfmac_sdio mmc1:0001:1: firmware: failed to load brcm/brcmfmac43340-sdio.txt (-2)
    brcmfmac_sdio mmc1:0001:1: firmware: Direct firmware load for brcm/brcmfmac43340-sdio.txt failed with error -2
    brcmfmac: brcmf_sdio_htclk: HT Avail timeout (100): clkctl 0x50
    brcmfmac: brcmf_sdio_htclk: HT Avail timeout (100): clkctl 0x50

and no network interface appears for the device.

It looks to me like the brcmfmac_sdio kernel module is expecting this .txt file to be shipped alongside the .bin, but it isn't present.

--dkg

-- System Information:
Debian Release: stretch/sid
APT prefers unstable-debug
APT policy: (500, 'unstable-debug'), (500, 'testing'), (200, 'unstable'), (1, 'experimental-debug'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 4.4.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Bug#784368: fsck.btrfs is still in /bin, not /sbin
https://bugs.debian.org/784368 shows that initramfs-tools is looking for /sbin/fsck.btrfs, but btrfs-tools ships it in /bin/.

Ben Hutchings suggested back in May that btrfs-tools was going to put fsck.btrfs back into /sbin in the "next version", though i can't tell from the bug log which version that was. 4.1.2-1 came out in August, and fsck.btrfs is still in /bin.

Maybe we could add the following line to debian/btrfs-tools.links:

    bin/fsck.btrfs sbin/fsck.btrfs

Regards,

--dkg
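For illustration, the effect that links entry is meant to have can be reproduced by hand in a staging directory. This is a rough sketch under assumptions: `stage_fsck_link` is a hypothetical helper, and the exact form of the symlink that dh_link would generate (relative or absolute) is not something the message states.

```shell
#!/bin/sh
# Sketch only: stage_fsck_link is a hypothetical helper.
# Creates, inside a staging tree, a /sbin/fsck.btrfs symlink that points
# at /bin/fsck.btrfs, which is what the proposed links entry asks for.
stage_fsck_link() {
    staging=$1
    mkdir -p "$staging/sbin"
    # Absolute target: once the staging tree becomes the root of a system,
    # /sbin/fsck.btrfs resolves to /bin/fsck.btrfs.
    ln -s /bin/fsck.btrfs "$staging/sbin/fsck.btrfs"
}
```

With such a link in place, a lookup of /sbin/fsck.btrfs would succeed without moving the binary itself.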
Bug#692324: Oops and hang at boot
On Tue 2015-05-26 04:54:47 -0400 about https://bugs.debian.org/692324, Mathieu Malaterre wrote:
> Any update with more recent kernel ?

Sorry, i don't have easy access to this machine (colddeadhands) right now. I know it was running the recent kernel with no problems as of about a year ago (early 2014), but i don't recall whether it crashed when loading snd-powermac, or whether i had functional audio device access or not.

--dkg
Bug#780818: lvm2 should never try to access /dev/mmcblk0rpmb (avoid hangs)
On Mon 2015-04-06 15:29:23 -0400, Ben Hutchings wrote:
> Ideally the driver would avoid doing whatever it is that results in a
> hang, or would expose the RPMB only if it's really accessible. I
> suspect that where access to the RPMB hangs this is because the system
> firmware (BIOS/EFI) has intentionally disabled access before handing
> over to the OS.

That seems like a plausible explanation of the situation to me, but i don't know enough to tell if that's the case on the machinery i've tested with.

> Hiding RPMBs seems to be a popular workaround, and I doubt that it
> will be useful to access them under a Debian system or other free
> operating system, so that's what I intend to do. If we need to change
> that later we can add a module parameter to override this.

Thanks, this seems like a sensible workaround for the moment.

--dkg

-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/87vbh958o7@alice.fifthhorseman.net
Bug#780818: lvm2 should never try to access /dev/mmcblk0rpmb (avoid hangs)
Control: affects 780818 lvm2 partman-base

On Sat 2015-03-21 07:26:46 -0400, Bastian Blank wrote:
> I don't think it is a bug in lvm to try reading a device.
>
>> lvm is a prime culprit of trying to scan this device, which is just not
>> going to be helpful to users, since it won't contain a PV.
>
> How does it know that as a fact?
>
> Re-assigning to linux, as it allows read requests on devices not
> supporting it.

I understand that the issue might be caused by the kernel itself. I'm noting here that this bug affects both lvm2 (because it causes vgscan to hang for quite a while, at least) and partman-base (because it causes the partition manager during d-i to take ages on any system that has this kind of device).

--dkg
Re: Bug#751339: RFP: ath9k-htc-firmware -- free firmware for Atheros AR7010/AR9271 wireless adapters
On Fri 2015-01-23 07:46:14 -0500, Raphael Hertzog wrote:
> On Wed, 11 Jun 2014, Daniel Kahn Gillmor wrote:
>> ath9k-htc-firmware provides a free implementation of firmware for two
>> available wireless chipsets.
> [...]
>> It would be great to ship these firmware modules in debian now that we
>> have the source. If we can make it easier for people to improve them,
>> that would be great.
>>
>> Unfortunately, the build process appears to be a little bit involved:
>>
>> https://trisquel.info/en/forum/how-install-ath9k-htc-firmware-atheros-communications-inc-ar9271
> [...]
>
> While this looks like complicated to package independently, the
> firmwares are available in
> http://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/tree/
>
> Could they thus not be included as-is in firmware-free or at least
> firmware-non-free ? (ccing kernel team for this question)

shipping in firmware-non-free would be a shortcut to ensure that we have them available for users (which would be good), but it would feel a little bit ironic to label them non-free just because we haven't sorted out how to manage the build chain yet, and seems like it might risk upsetting the good people who've done the work to make them free (which would be bad). seems like that route might need to be done delicately and with very clear communication about why it's being done, and with a plan for how to transition it into the main archive.

I don't know what the rules are for inclusion in firmware-free. looking in the LICENSE file for firmware-free, i see:

    --
    Driver: dsp56k - Atari DSP56k support
    File: dsp56k/bootstrap.bin
    Source: dsp56k/bootstrap.asm
    Licence: GPLv2 or later
    Copyright Frederik Noring
    DSP56001 assembler, possibly buildable with a56 from http://www.zdomain.com/a56.html
    --

and while a56 is in the debian archive, it's not a build-dep of firmware-free (bootstrap.bin is shipped directly in the source package).

So maybe ath9k-htc-firmware is no worse than this?

> It would be a first step and a big service for users who opted to buy
> such freedom-respecting devices.

yes, agreed!

--dkg
Bug#742055: linux-image-3.13-1-powerpc: please enable r8712u.ko as a module on powerpc
Package: src:linux Version: 3.13.5-1 Severity: normal I'm using a powerbook g4. In addition to the onboard wireless, I have an ASUS USB-N10 NIC, which reports itself as: Bus 002 Device 002: ID 0b05:1786 ASUSTek Computer, Inc. USB-N10 802.11n Network Adapter [Realtek RTL8188SU] It does not work with the debian kernel because r8712u.ko is only enabled as a module for x86, afaict. 0 dkg@tyr:/tmp$ grep 8712 /boot/config-3.13-1-powerpc # CONFIG_R8712U is not set 0 dkg@tyr:/tmp$ Please enable this module for powerpc. thanks! --dkg -- Package-specific info: ** Version: Linux version 3.13-1-powerpc (debian-kernel@lists.debian.org) (gcc version 4.8.2 (Debian 4.8.2-14) ) #1 Debian 3.13.5-1 (2014-03-04) ** Command line: BOOT_IMAGE=/boot/vmlinux-3.13-1-powerpc root=/dev/mapper/vg_tyr0-root ro quiet radeon.modeset=1 video=radeonfb:off radeon.agpmode=1 init=/bin/systemd ** Not tainted ** Kernel log: [ 1567.439429] gem 0002:24:0f.0 eth0: Link down [ 2433.770153] wlan0: authenticate with 10:bd:18:3c:2a:a1 [ 2433.790848] wlan0: send auth to 10:bd:18:3c:2a:a1 (try 1/3) [ 2433.792150] wlan0: authenticated [ 2433.798738] wlan0: associate with 10:bd:18:3c:2a:a1 (try 1/3) [ 2433.801618] wlan0: RX AssocResp from 10:bd:18:3c:2a:a1 (capab=0x431 status=0 aid=68) [ 2433.801920] wlan0: associated [ 2433.801976] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready [ 2433.802098] cfg80211: Calling CRDA for country: US [ 2434.269406] cfg80211: Regulatory domain changed to country: US [ 2434.269420] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp) [ 2434.269424] cfg80211: (2402000 KHz - 2472000 KHz @ 4 KHz), (N/A, 3000 mBm) [ 2434.269427] cfg80211: (517 KHz - 525 KHz @ 8 KHz), (N/A, 1700 mBm) [ 2434.269430] cfg80211: (525 KHz - 533 KHz @ 8 KHz), (N/A, 2300 mBm) [ 2434.269432] cfg80211: (5735000 KHz - 5835000 KHz @ 8 KHz), (N/A, 3000 mBm) [ 2434.269435] cfg80211: (5724 KHz - 6372 KHz @ 216 KHz), (N/A, 4000 mBm) [ 2815.192353] wlan0: deauthenticating from 
10:bd:18:3c:2a:a1 by local choice (reason=3) [ 2815.193679] cfg80211: Calling CRDA to update world regulatory domain [ 2815.281425] cfg80211: World regulatory domain updated: [ 2815.281437] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp) [ 2815.281441] cfg80211: (2402000 KHz - 2472000 KHz @ 4 KHz), (N/A, 2000 mBm) [ 2815.281444] cfg80211: (2457000 KHz - 2482000 KHz @ 4 KHz), (N/A, 2000 mBm) [ 2815.281446] cfg80211: (2474000 KHz - 2494000 KHz @ 2 KHz), (N/A, 2000 mBm) [ 2815.281449] cfg80211: (517 KHz - 525 KHz @ 8 KHz), (N/A, 2000 mBm) [ 2815.281452] cfg80211: (5735000 KHz - 5835000 KHz @ 8 KHz), (N/A, 2000 mBm) [ 2815.281455] cfg80211: (5724 KHz - 6372 KHz @ 216 KHz), (N/A, 0 mBm) [ 2861.096651] b43-phy0: Loading firmware version 666.2 (2011-02-23 01:15:07) [ 2861.223862] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready [ 2861.316979] sungem_phy: PHY ID: 1410cc2, addr: 0 [ 2861.317038] gem 0002:24:0f.0 eth0: Found Marvell 88E PHY [ 2861.317566] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 2909.109902] wlan0: authenticate with 10:bd:18:3c:2a:a1 [ 2909.112753] wlan0: send auth to 10:bd:18:3c:2a:a1 (try 1/3) [ 2909.115168] wlan0: authenticated [ 2909.11] wlan0: associate with 10:bd:18:3c:2a:a1 (try 1/3) [ 2909.126786] wlan0: RX AssocResp from 10:bd:18:3c:2a:a1 (capab=0x431 status=0 aid=68) [ 2909.127093] wlan0: associated [ 2909.127150] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready [ 2909.127266] cfg80211: Calling CRDA for country: US [ 2909.182091] cfg80211: Regulatory domain changed to country: US [ 2909.182104] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp) [ 2909.182108] cfg80211: (2402000 KHz - 2472000 KHz @ 4 KHz), (N/A, 3000 mBm) [ 2909.182112] cfg80211: (517 KHz - 525 KHz @ 8 KHz), (N/A, 1700 mBm) [ 2909.182114] cfg80211: (525 KHz - 533 KHz @ 8 KHz), (N/A, 2300 mBm) [ 2909.182117] cfg80211: (5735000 KHz - 5835000 KHz @ 8 KHz), (N/A, 3000 mBm) [ 2909.182120] cfg80211: (5724 KHz - 
6372 KHz @ 216 KHz), (N/A, 4000 mBm) [ 2958.021358] wlan0: deauthenticating from 10:bd:18:3c:2a:a1 by local choice (reason=3) [ 2958.031401] cfg80211: Calling CRDA to update world regulatory domain [ 2958.109990] cfg80211: World regulatory domain updated: [ 2958.110003] cfg80211: (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp) [ 2958.110007] cfg80211: (2402000 KHz - 2472000 KHz @ 4 KHz), (N/A, 2000 mBm) [ 2958.110009] cfg80211: (2457000 KHz - 2482000 KHz @ 4 KHz), (N/A, 2000 mBm) [ 2958.110012] cfg80211: (2474000 KHz - 2494000 KHz @ 2 KHz), (N/A, 2000 mBm) [ 2958.110015] cfg80211: (517 KHz - 525 KHz @ 8 KHz), (N/A, 2000 mBm) [ 2958.110018] cfg80211: (5735000 KHz - 5835000 KHz @ 8 KHz), (N/A, 2000 mBm) [ 2958.110021]
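The grep check in the report above can be generalized a little. The following is a minimal sketch, assuming the usual /boot/config-* format (`CONFIG_FOO=m`, `CONFIG_FOO=y`, or `# CONFIG_FOO is not set`); `config_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: config_state is a hypothetical helper.
# Prints "module", "builtin" or "unset" for kernel option OPT (given
# without the CONFIG_ prefix) as recorded in kernel config file FILE.
config_state() {
    file=$1
    opt=$2
    if grep -q "^CONFIG_${opt}=m\$" "$file"; then
        echo module
    elif grep -q "^CONFIG_${opt}=y\$" "$file"; then
        echo builtin
    else
        echo unset
    fi
}
```

For the kernel in this report, `config_state /boot/config-3.13-1-powerpc R8712U` would be expected to print `unset`, matching the `# CONFIG_R8712U is not set` line shown by grep.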
Bug#713943: Same problem with linux-image-3.12-1-powerpc64
On 02/22/2014 02:51 AM, Erik de Castro Lopo wrote:
> I run debian testing on a dual G5 powermac. Just upgraded from
> linux-image-3.4-trunk-powerpc64 to linux-image-3.12-1-powerpc64 and
> found the same issue. The windfarm modules are loading but then about
> 30 seconds to a couple of minutes after booting to 3.12-1-powerpc64 the
> fans speed up to full speed.
>
> lsmod says the windfarm modules are being loaded, and it's the same
> modules being loaded under the 3.4 kernel. I have however found a
> difference. On 3.12 I get:
>
>     dmesg | grep windfarm
>     [4.358299] windfarm: initializing for dual-core desktop G5
>
> whereas on 3.4 I get:
>
>     dmesg | grep windfarm
>     [4.791589] windfarm: initializing for dual-core desktop G5
>     [9.077416] windfarm: CPUs control loops started.
>     [ 12.440957] windfarm: Backside control loop started.
>     [ 12.491701] windfarm: Slots control loop started.
>     [ 12.592933] windfarm: Drive bay control loop started.
>
> Definitely something wonky there.

My experience is that loading the i2c drivers on kernel 3.12 on a G5 makes the fans calm back down. I don't remember the name of the module exactly, but it's something like powermac-i2c.ko or i2c-powermac.ko (i don't have the machine in front of me), so

    modprobe -v powermac-i2c

or

    modprobe -v i2c-powermac

should do it.

hth,

--dkg
Bug#728668: linux-image-3.11-1-powerpc: nouveau kernel message every 10 seconds: E[ DRM] DDC responded, but no EDID for TV-1
On 11/04/2013 10:07 AM, Bastian Blank wrote:
> On Mon, Nov 04, 2013 at 09:11:30AM -0500, Daniel Kahn Gillmor wrote:
>> On 11/04/2013 05:01 AM, Bastian Blank wrote:
>>> Do you have something connected to the TV-1 output?
>>
>> As far as i know, there is no TV-1 output. It's this style of machine:
>>
>> https://upload.wikimedia.org/wikipedia/commons/thumb/5/58/IMac_G4_sunflower7.png/250px-IMac_G4_sunflower7.png
>>
>> I don't think the built-in monitor (1440×900px) would be considered
>> connected to TV-1 for a machine like this.
>
> Nope. It is most likely connected via LVDS.
>
> There is a workaround available: Add the parameter tv_disable=1 to
> nouveau,
> - either somewhere in /etc/modprobe.d
> | options nouveau tv_disable=1
> or
> - on the kernel command line
> | nouveau.tv_disable=1

hm. I have tried both of these, and the kernel is still printing an error message every 10 seconds:

    0 root@omega:~# cat /sys/module/nouveau/parameters/tv_disable
    1
    0 root@omega:~# dmesg | tail
    [ 19.586460] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 38.398826] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 48.446712] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 58.494666] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 68.542632] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 78.590642] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 88.638636] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 98.686654] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 108.734669] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    [ 118.782642] nouveau E[ DRM] DDC responded, but no EDID for TV-1
    0 root@omega:~#

any other suggestions? i note that there is also /sys/module/nouveau/parameters/tv_norm, though i don't know what it is supposed to do.

--dkg
Bug#728668: linux-image-3.11-1-powerpc: nouveau kernel message every 10 seconds: E[ DRM] DDC responded, but no EDID for TV-1
On 11/04/2013 05:01 AM, Bastian Blank wrote:
> On Sun, Nov 03, 2013 at 11:37:11PM +, Daniel Kahn Gillmor wrote:
>> As you can see from the attached kernel log, the nouveau module is
>> reporting the same error message every 10 seconds.
>
>> [ 319.038615] nouveau E[ DRM] DDC responded, but no EDID for TV-1
>> [ 329.086576] nouveau E[ DRM] DDC responded, but no EDID for TV-1
>
> Do you have something connected to the TV-1 output?

As far as i know, there is no TV-1 output. It's this style of machine:

https://upload.wikimedia.org/wikipedia/commons/thumb/5/58/IMac_G4_sunflower7.png/250px-IMac_G4_sunflower7.png

so it has a built-in monitor, and it has one of those apple-specific compact monitor ports that i think needs a custom dongle to connect it to anything (i do not have the custom dongle).

I don't think the built-in monitor (1440×900px) would be considered connected to TV-1 for a machine like this.

--dkg
Bug#728668: linux-image-3.11-1-powerpc: nouveau kernel message every 10 seconds: E[ DRM] DDC responded, but no EDID for TV-1
On 11/04/2013 10:07 AM, Bastian Blank wrote:
> Nope. It is most likely connected via LVDS.

that's what i would assume. I failed to check it when i was looking at the machine, though.

> There is a workaround available: Add the parameter tv_disable=1 to
> nouveau,
> - either somewhere in /etc/modprobe.d
> | options nouveau tv_disable=1
> or
> - on the kernel command line
> | nouveau.tv_disable=1

thanks, i will try this workaround when i next have physical access to the computer (later this week), and report back any results.

If there are kernel patches you would like me to try, I am happy to experiment and report back as well.

Regards,

--dkg
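The modprobe.d variant of the workaround quoted above could be applied and checked roughly as follows. This is a sketch under assumptions: the helper names and the file name nouveau-tv.conf are made up for illustration, and the directory is passed in so the helper can be tried outside /etc first.

```shell
#!/bin/sh
# Sketch only: helper names and the nouveau-tv.conf file name are hypothetical.

# Write the module option quoted in the thread into DIR/nouveau-tv.conf.
write_tv_disable() {
    dir=$1
    printf 'options nouveau tv_disable=1\n' > "$dir/nouveau-tv.conf"
}

# Print the current value of the tv_disable parameter, or "unknown" if the
# parameter file cannot be read (for example, nouveau is not loaded).
tv_disable_state() {
    param=${1:-/sys/module/nouveau/parameters/tv_disable}
    if [ -r "$param" ]; then
        cat "$param"
    else
        echo unknown
    fi
}
```

On a real system this would presumably mean running `write_tv_disable /etc/modprobe.d` as root, rebooting or reloading nouveau, and then calling `tv_disable_state`. Note that in this thread the reporter later found the message still appeared even with the parameter reading 1, so this only confirms that the option took effect, not that the warning is gone.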
Bug#728668: linux-image-3.11-1-powerpc: nouveau kernel message every 10 seconds: E[ DRM] DDC responded, but no EDID for TV-1
Package: src:linux Version: 3.11.6-2 Severity: normal As you can see from the attached kernel log, the nouveau module is reporting the same error message every 10 seconds. This machine (omega) is a 1GHz gooseneck powerpc G4 iMac. I got the same error messages and behavior from 3.10-3-powerpc. --dkg -- Package-specific info: ** Version: Linux version 3.11-1-powerpc (debian-kernel@lists.debian.org) (gcc version 4.8.2 (Debian 4.8.2-1) ) #1 Debian 3.11.6-2 (2013-11-01) ** Command line: BOOT_IMAGE=/boot/vmlinux-3.11-1-powerpc root=/dev/mapper/vg_omega0-root ro init=/bin/systemd quiet ** Not tainted ** Kernel log: [7.582461] ohci-pci 0001:10:18.0: new USB bus registered, assigned bus number 1 [7.582529] ohci-pci 0001:10:18.0: irq 27, io mem 0x80082000 [7.582795] Broadcom 43xx-legacy driver loaded [ Features: PLID ] [7.629639] Raw EDID: [7.629667] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629677] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629686] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629694] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629703] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629712] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629721] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629730] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [7.629750] nouveau :00:10.0: TV-1: EDID block 0 invalid. 
[7.629762] nouveau E[ DRM] DDC responded, but no EDID for TV-1 [7.657135] usb usb1: New USB device found, idVendor=1d6b, idProduct=0001 [7.657152] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [7.657158] usb usb1: Product: OHCI PCI host controller [7.657164] usb usb1: Manufacturer: Linux 3.11-1-powerpc ohci_hcd [7.657170] usb usb1: SerialNumber: 0001:10:18.0 [7.657566] hub 1-0:1.0: USB hub found [7.657588] hub 1-0:1.0: 2 ports detected [7.657888] ohci-pci 0001:10:19.0: OHCI PCI host controller [7.657908] ohci-pci 0001:10:19.0: new USB bus registered, assigned bus number 2 [7.657969] ohci-pci 0001:10:19.0: irq 28, io mem 0x80081000 [7.741352] usb usb2: New USB device found, idVendor=1d6b, idProduct=0001 [7.741370] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [7.741376] usb usb2: Product: OHCI PCI host controller [7.741382] usb usb2: Manufacturer: Linux 3.11-1-powerpc ohci_hcd [7.741388] usb usb2: SerialNumber: 0001:10:19.0 [7.741780] hub 2-0:1.0: USB hub found [7.741802] hub 2-0:1.0: 2 ports detected [7.744343] ohci-pci 0001:10:1a.0: OHCI PCI host controller [7.744373] ohci-pci 0001:10:1a.0: new USB bus registered, assigned bus number 3 [7.744433] ohci-pci 0001:10:1a.0: irq 29, io mem 0x8008 [7.754878] nouveau [ DRM] allocated 1440x900 fb: 0x9000, bo efbca800 [7.855517] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001 [7.855523] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [7.855526] usb usb3: Product: OHCI PCI host controller [7.855530] usb usb3: Manufacturer: Linux 3.11-1-powerpc ohci_hcd [7.855534] usb usb3: SerialNumber: 0001:10:1a.0 [7.858241] hub 3-0:1.0: USB hub found [7.858262] hub 3-0:1.0: 2 ports detected [7.885128] nouveau E[ DRM] Pixel clock comparison table not found [8.056740] Console: switching to colour frame buffer device 180x56 [8.059488] nouveau :00:10.0: fb0: nouveaufb frame buffer device [8.059495] nouveau :00:10.0: registered panic notifier [8.059987] [drm] 
Initialized nouveau 1.1.1 20120801 for :00:10.0 on minor 0 [8.073018] usb 1-2: new low-speed USB device number 2 using ohci-pci [8.264418] b43legacy ssb0:0: firmware: agent aborted loading b43legacy/ucode4.fw (not found?) [8.264652] b43legacy-phy0 ERROR: You must go to http://wireless.kernel.org/en/users/Drivers/b43#devicefirmware and download the correct firmware (version 3). [8.267190] usb 1-2: New USB device found, idVendor=1241, idProduct=1603 [8.267207] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [8.267213] usb 1-2: Product: USB Keyboard [8.267219] usb 1-2: Manufacturer: [8.451273] hidraw: raw HID events driver (C) Jiri Kosina [8.508850] usbcore: registered new interface driver usbhid [8.508868] usbhid: USB HID core driver [8.599782] input: USB Keyboard as /devices/pci0001:10/0001:10:18.0/usb1/1-2/1-2:1.0/input/input1 [8.600606] hid-generic 0003:1241:1603.0001: input,hidraw0: USB HID v1.10 Keyboard [ USB Keyboard] on usb-0001:10:18.0-2/input0 [8.606478] input: USB Keyboard as /devices/pci0001:10/0001:10:18.0/usb1/1-2/1-2:1.1/input/input2 [8.610631] hid-generic 0003:1241:1603.0002: input,hidraw1: USB HID v1.10 Device [ USB Keyboard] on usb-0001:10:18.0-2/input1 [9.125058] EXT4-fs (dm-1): mounted
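The "same error message every 10 seconds" pattern described in this report can be checked mechanically. The following is a minimal sketch, assuming kernel log lines of the form `[timestamp] message`; the function names and the sample log lines are hypothetical and are not taken from the attached log.

```python
import re
from collections import defaultdict

# Matches lines such as "[7.582461] message" or "[ 10.143580] message".
LINE_RE = re.compile(r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)")


def repeated_messages(lines, min_count=3):
    """Map each message seen at least min_count times to its timestamps."""
    seen = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            seen[match.group("msg")].append(float(match.group("ts")))
    return {msg: ts for msg, ts in seen.items() if len(ts) >= min_count}


def mean_interval(timestamps):
    """Average gap in seconds between consecutive timestamps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical log lines, for illustration only.
sample = [
    "[100.000000] example: repeated error",
    "[105.500000] usb 1-2: unrelated message",
    "[110.000000] example: repeated error",
    "[120.000000] example: repeated error",
]
result = repeated_messages(sample)
for msg, stamps in result.items():
    print(f"{msg!r} seen {len(stamps)} times, mean interval {mean_interval(stamps):.1f}s")
```

Feeding it the lines of a saved `dmesg` dump gives a quick list of candidates for "log spam" worth mentioning in a report.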
Bug#726759: linux-tools-3.11: uninstallable on jessie/sid (Depends: libperl5.14 (= 5.14.2) but it is not installable)
Package: linux-tools-3.11
Version: 3.11~rc4-1~exp1
Severity: normal

0 root@alice:~# apt-get install linux-tools-3.11
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 linux-tools-3.11 : Depends: libperl5.14 (= 5.14.2) but it is not installable
E: Unable to correct problems, you have held broken packages.
100 root@alice:~#

i suspect that the package hasn't been updated since experimental bumped to -trunk- and perl 5.18 migrated to jessie.

--dkg

-- System Information:
Debian Release: jessie/sid
APT prefers testing
APT policy: (500, 'testing'), (200, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.11-trunk-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Bug#720998: linux-image-3.2.0-4-amd64: in pid_nr_ns: unable to handle kernel NULL pointer dereference at 0000000000000013
Package: src:linux Version: 3.2.46-1 Severity: normal During a scheduled shutdown, the kernel crashed and locked up. the monitor showed several more failures after the one that got recorded below (i can attach a screenshot if that would be useful), but i believe this is the first visible symptom (the timestamps on the backtraces visible on the monitor are all 51010.*, about 10 minutes later than the first NULL pointer dereference). After a reboot, the machine comes up fine. the RAM has been tested via memtest86+, no errors shown. note that this is a 64-bit kernel with a 32-bit userland. Aug 21 07:00:45 hobbes kernel: [ 24.415401] sha1_ssse3: Using AVX optimized SHA-1 implementation Aug 21 07:00:47 hobbes kernel: [ 26.512023] eth0: no IPv6 routers present Aug 21 21:00:34 hobbes kernel: [50413.229680] BUG: unable to handle kernel NULL pointer dereference at 0013 Aug 21 21:00:34 hobbes kernel: [50413.229772] IP: [8105ce6c] pid_nr_ns+0xd/0x24 Aug 21 21:00:34 hobbes kernel: [50413.229837] PGD 1297e6067 PUD 12863a067 PMD 0 Aug 21 21:00:34 hobbes kernel: [50413.229894] Oops: [#1] SMP Aug 21 21:00:34 hobbes kernel: [50413.229936] CPU 3 Aug 21 21:00:34 hobbes kernel: [50413.229958] Modules linked in: sha1_ssse3 sha1_generic hmac cbc cts bnep rfcomm bluetooth rfkill crc16 uinput rpcsec_gss_krb5 nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc loop snd_hda_codec_realtek snd_hda_codec_hdmi powernow_k8 mperf snd_hda_intel crc32c_intel ghash_clmulni_intel aesni_intel snd_hda_codec i2c_piix4 snd_hwdep snd_pcm snd_page_alloc aes_x86_64 snd_seq snd_seq_device snd_timer snd soundcore processor psmouse tpm_tis tpm tpm_bios thermal_sys serio_raw pcspkr aes_generic cryptd evdev ext3 mbcache jbd dm_mod microcode usbhid hid sg sr_mod sd_mod cdrom crc_t10dif usb_storage ohci_hcd radeon xhci_hcd ahci libahci ehci_hcd power_supply i2c_algo_bit r8169 mii libata button usbcore scsi_mod usb_common ttm drm_kms_helper drm wmi i2c_core [last unloaded: scsi_wait_scan] Aug 21 21:00:34 hobbes 
kernel: [50413.230916] Aug 21 21:00:34 hobbes kernel: [50413.230936] Pid: 6813, comm: killall5 Not tainted 3.2.0-4-amd64 #1 Debian 3.2.46-1 LENOVO 4865A14/Annapurna CRB Aug 21 21:00:34 hobbes kernel: [50413.231045] RIP: 0010:[8105ce6c] [8105ce6c] pid_nr_ns+0xd/0x24 Aug 21 21:00:34 hobbes kernel: [50413.231126] RSP: :88012908bea8 EFLAGS: 00010206 Aug 21 21:00:34 hobbes kernel: [50413.231179] RAX: RBX: RCX: 0012 Aug 21 21:00:34 hobbes kernel: [50413.231249] RDX: RSI: 8161c060 RDI: 000f Aug 21 21:00:34 hobbes kernel: [50413.231317] RBP: 88012646c1c0 R08: R09: 00013780 Aug 21 21:00:34 hobbes kernel: [50413.231386] R10: 00013780 R11: 880126972300 R12: 0013 Aug 21 21:00:34 hobbes kernel: [50413.231458] R13: 0038 R14: 88012646c330 R15: 880126a7d1a0 Aug 21 21:00:34 hobbes kernel: [50413.231528] FS: () GS:88012fd8(0063) knlGS:f77378d0 Aug 21 21:00:34 hobbes kernel: [50413.231606] CS: 0010 DS: 002b ES: 002b CR0: 8005003b Aug 21 21:00:34 hobbes kernel: [50413.231663] CR2: 0013 CR3: 000129e86000 CR4: 000406e0 Aug 21 21:00:34 hobbes kernel: [50413.231732] DR0: DR1: DR2: Aug 21 21:00:34 hobbes kernel: [50413.231801] DR3: DR6: 0ff0 DR7: 0400 Aug 21 21:00:34 hobbes kernel: [50413.231870] Process killall5 (pid: 6813, threadinfo 88012908a000, task 880126a7d1a0) Aug 21 21:00:34 hobbes kernel: [50413.231950] Stack: Aug 21 21:00:34 hobbes kernel: [50413.231973] 8105cedc 81056538 e000 0013 Aug 21 21:00:34 hobbes kernel: [50413.232067] 1a9d 0001 810d292e Aug 21 21:00:34 hobbes kernel: [50413.232153] 88012908bf24 810d2a3d Aug 21 21:00:34 hobbes kernel: [50413.232239] Call Trace: Aug 21 21:00:34 hobbes kernel: [50413.232269] [8105cedc] ? __task_pid_nr_ns+0x46/0x47 Aug 21 21:00:34 hobbes kernel: [50413.232331] [81056538] ? sys_kill+0xc8/0x14c Aug 21 21:00:34 hobbes kernel: [50413.232387] [810d292e] ? __mlock_vma_pages_range+0x5e/0x65 Aug 21 21:00:34 hobbes kernel: [50413.232452] [810d2a3d] ? do_mlock_pages+0x108/0x11a Aug 21 21:00:34 hobbes kernel: [50413.232513] [81351523] ? 
ia32_do_call+0x13/0x13 Aug 21 21:00:34 hobbes kernel: [50413.232567] Code: e9 a1 f9 ff ff 65 48 8b 04 25 00 c7 00 00 48 8b 80 88 04 00 00 48 8b 70 20 e9 d8 ff ff ff 31 c0 48 85 ff 74 1c 8b 96 20 08 00 00 3b 57 04 77 11 48 c1 e2 05 48 8d 54 17 30 48 39 72 08 75 02 8b Aug 21 21:00:34 hobbes kernel: [50413.232985] RIP [8105ce6c] pid_nr_ns+0xd/0x24 Aug 21 21:00:34 hobbes kernel: [50413.233043]
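Each frame in the oops above has the form `symbol+0xoffset/0xsize`, where the offset is how far into the function execution had reached and the size is the length of the function. The following is a minimal sketch for pulling those fields out of such a log; the function name `parse_frames` is hypothetical, and the sample string is assembled from entries in the trace above.

```python
import re

# Matches kernel trace entries such as "pid_nr_ns+0xd/0x24".
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return (symbol, offset, size) tuples for every frame found in text."""
    return [
        (m.group("symbol"), int(m.group("offset"), 16), int(m.group("size"), 16))
        for m in FRAME_RE.finditer(text)
    ]


sample = (
    "IP: [8105ce6c] pid_nr_ns+0xd/0x24 "
    "[8105cedc] ? __task_pid_nr_ns+0x46/0x47 "
    "[81056538] ? sys_kill+0xc8/0x14c"
)
frames = parse_frames(sample)
for symbol, offset, size in frames:
    print(f"{symbol}: {offset} bytes into a {size}-byte function")
```

Here the faulting instruction is 13 bytes into `pid_nr_ns`, which is the kind of detail that helps when comparing two oopses from different machines of the same model.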
Bug#720432: linux-image-3.2.0-4-amd64: kernel BUG at mm/slab.c:3111 Invalid opcode: 0000 [#1] SMP
Package: src:linux Version: 3.2.46-1 Severity: normal shortly after boot, the kernel on this machine crashed with the following backtraces, after which it became unresponsive, and i had to hard power it off to get it to boot again. After the reboot, there were no problems. I have run memtest on this machine, and the memory appears to be fine. This is the same class of machine (the lenovo thinkcentre m78) about which i've reported issues with ipmi_si.ko (which was already blacklisted on this machine). Other instances of these machines (even with ipmi_si.ko) experience what appear to be kernel-related lockups much more frequently than other amd64 or i686-pae hardware that runs the same software in the same physical environment, so i'm hoping that these backtraces might provide some sort of hints at what's going on. The machine that generated these backtraces (crookshanks) has run and been properly/safely rebooted twice since this crash, so the bug is not immediately replicable, unfortunately. I'd be happy to run any additional tests or experiments on this hardware if they would help shed more light on the matter. I should probably note that i have the amd64-microcode 1.20120910-2 package installed on these machines, since it doesn't show up in the reportbug automatic data. Any ideas what i should try next to debug this? Regards, --dkg Jul 26 07:11:04 crookshanks kernel: [ 583.444382] [ cut here ] Jul 26 07:11:04 crookshanks kernel: [ 583.40] kernel BUG at /build/linux-dJLVDt/linux-3.2.46/mm/slab.c:3111! 
Jul 26 07:11:04 crookshanks kernel: [ 583.444507] invalid opcode: [#1] SMP Jul 26 07:11:04 crookshanks kernel: [ 583.444554] CPU 2 Jul 26 07:11:04 crookshanks kernel: [ 583.444575] Modules linked in: bnep sha1_ssse3 rfcomm bluetooth sha1_generic hmac cbc rfkill crc16 cts uinput fuse rpcsec_gss_krb5 nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc loop snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc powernow_k8 mperf crc32c_intel ghash_clmulni_intel i2c_piix4 aesni_intel aes_x86_64 snd_seq snd_seq_device snd_timer aes_generic snd soundcore processor psmouse cryptd pcspkr tpm_tis tpm serio_raw thermal_sys evdev tpm_bios ext3 mbcache jbd dm_mod microcode sg sd_mod sr_mod crc_t10dif cdrom usbhid hid usb_storage ohci_hcd radeon ahci libahci libata wmi power_supply i2c_algo_bit ttm r8169 mii drm_kms_helper drm xhci_hcd i2c_core scsi_mod ehci_hcd button usbcore usb_common [last unloaded: scsi_wait_scan] Jul 26 07:11:04 crookshanks kernel: [ 583.445438] Jul 26 07:11:04 crookshanks kernel: [ 583.445457] Pid: 4106, comm: urban-system-up Not tainted 3.2.0-4-amd64 #1 Debian 3.2.46-1 LENOVO 4865A14/Annapurna CRB Jul 26 07:11:04 crookshanks kernel: [ 583.445565] RIP: 0010:[810eab43] [810eab43] cache_alloc+0xed/0x1fa Jul 26 07:11:04 crookshanks kernel: [ 583.445656] RSP: :88012760bdc8 EFLAGS: 00010002 Jul 26 07:11:04 crookshanks kernel: [ 583.445708] RAX: 880126ad60c0 RBX: 88012f01e500 RCX: 0007 Jul 26 07:11:04 crookshanks kernel: [ 583.445776] RDX: 88012f02ccd0 RSI: dead00200200 RDI: 8801258fe140 Jul 26 07:11:04 crookshanks kernel: [ 583.445843] RBP: R08: 0005 R09: 88012f02fc00 Jul 26 07:11:04 crookshanks kernel: [ 583.445911] R10: 4000 R11: 4000 R12: 000492d0 Jul 26 07:11:04 crookshanks kernel: [ 583.445978] R13: 88012f02ccc0 R14: 88012b2a16c0 R15: 0004 Jul 26 07:11:04 crookshanks kernel: [ 583.446047] FS: () GS:88012fd0(0063) knlGS:f77176c0 Jul 26 07:11:04 crookshanks kernel: [ 583.446124] CS: 0010 DS: 002b ES: 002b CR0: 
8005003b Jul 26 07:11:04 crookshanks kernel: [ 583.446179] CR2: f762f2d0 CR3: 0001251a CR4: 000406e0 Jul 26 07:11:04 crookshanks kernel: [ 583.446250] DR0: DR1: DR2: Jul 26 07:11:04 crookshanks kernel: [ 583.446317] DR3: DR6: 0ff0 DR7: 0400 Jul 26 07:11:04 crookshanks kernel: [ 583.446389] Process urban-system-up (pid: 4106, threadinfo 88012760a000, task 8801258a83c0) Jul 26 07:11:04 crookshanks kernel: [ 583.446479] Stack: Jul 26 07:11:04 crookshanks kernel: [ 583.446500] 8801263ddcd0 88012f02ccd0 8801258fe140 88012f02cce0 Jul 26 07:11:04 crookshanks kernel: [ 583.446582] 00d0 01200011 88012f01e500 80d0 Jul 26 07:11:04 crookshanks kernel: [ 583.446662] 80d0 0246 f7717728 810ebdf2 Jul 26 07:11:04 crookshanks kernel: [ 583.446742] Call Trace: Jul 26 07:11:04 crookshanks kernel: [ 583.446772] [810ebdf2] ? kmem_cache_alloc+0x58/0xea Jul 26 07:11:04 crookshanks kernel: [ 583.446832] [81045675] ?
Bug#718546: linux-image-3.2.0-4-486: loading cs5535_mfgpt hangs alix machine
Package: src:linux Version: 3.2.46-1 Severity: normal booting this machine with the standard arguments results in an unrecoverable system hang as udev loads cs5535_mfgpt. This ALIX board has TinyBIOS 0.99. looking at the module itself, i see: 0 dkg@splat:/tmp$ /sbin/modinfo /lib/modules/3.2.0-4-486/kernel/drivers/misc/cs5535-mfgpt.ko filename: /lib/modules/3.2.0-4-486/kernel/drivers/misc/cs5535-mfgpt.ko alias: platform:cs5535-mfgpt license:GPL description:CS5535/CS5536 MFGPT timer driver author: Andres Salomon dilin...@queued.net depends: intree: Y vermagic: 3.2.0-4-486 mod_unload modversions 486 parm: mfgptfix:Reset the MFGPT timers during init; required by some broken BIOSes (ie, TinyBIOS 0.99). (int) 0 dkg@splat:/tmp$ so this machine does have TinyBIOS 0.99, but i can only get the system to boot if i blacklist the cs5535_mfgpt module. kernel 2.6.32 did not have cs5535_mfgpt, and was able to boot successfully without intervention. --dkg -- Package-specific info: ** Version: Linux version 3.2.0-4-486 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 Debian 3.2.46-1 ** Command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-4-486 root=/dev/mapper/vg_splat0-root ro console=ttyS0,115200n8 init=/sbin/runit-init cs5535-mfgpt.disable=true ** Not tainted ** Kernel log: [7.691204] Failure reading codec reg 0x3c,Last value=0x0 [7.696750] Failure writing to cs5535 codec [7.701072] Failure writing to cs5535 codec [7.705399] Failure reading codec reg 0x1c,Last value=0x0 [7.713340] Failure writing to cs5535 codec [7.717669] Failure reading codec reg 0x0,Last value=0x0 [7.723128] Failure writing to cs5535 codec [7.727461] Failure reading codec reg 0x7c,Last value=0x0 [7.733006] Failure writing to cs5535 codec [7.737349] Failure reading codec reg 0x7e,Last value=0x0 [7.742894] Failure writing to cs5535 codec [7.747218] Failure reading codec reg 0x3c,Last value=0x0 [7.752769] Failure writing to cs5535 codec [7.757099] Failure writing to cs5535 codec [7.761430] 
Failure reading codec reg 0x1c,Last value=0x0 [7.769348] Failure writing to cs5535 codec [7.773677] Failure reading codec reg 0x0,Last value=0x0 [7.779134] Failure writing to cs5535 codec [7.783459] Failure reading codec reg 0x7c,Last value=0x0 [7.789004] Failure writing to cs5535 codec [7.793348] Failure reading codec reg 0x7e,Last value=0x0 [7.798893] Failure writing to cs5535 codec [7.803218] Failure reading codec reg 0x3c,Last value=0x0 [7.808764] Failure writing to cs5535 codec [7.813096] Failure writing to cs5535 codec [7.817422] Failure reading codec reg 0x1c,Last value=0x0 [7.825357] Failure writing to cs5535 codec [7.829683] Failure reading codec reg 0x0,Last value=0x0 [7.835143] Failure writing to cs5535 codec [7.839466] Failure reading codec reg 0x7c,Last value=0x0 [7.845004] Failure writing to cs5535 codec [7.849345] Failure reading codec reg 0x7e,Last value=0x0 [7.854892] Failure writing to cs5535 codec [7.859227] Failure reading codec reg 0x3c,Last value=0x0 [7.864772] Failure writing to cs5535 codec [7.869104] Failure writing to cs5535 codec [7.873436] Failure reading codec reg 0x1c,Last value=0x0 [7.881364] Failure writing to cs5535 codec [7.885690] Failure reading codec reg 0x0,Last value=0x0 [7.891150] Failure writing to cs5535 codec [7.895485] Failure reading codec reg 0x7c,Last value=0x0 [7.901027] Failure writing to cs5535 codec [7.905369] Failure reading codec reg 0x7e,Last value=0x0 [7.910908] Failure writing to cs5535 codec [7.915233] Failure reading codec reg 0x3c,Last value=0x0 [7.920779] Failure writing to cs5535 codec [7.925101] Failure writing to cs5535 codec [7.929428] Failure reading codec reg 0x1c,Last value=0x0 [7.937371] Failure writing to cs5535 codec [7.941698] Failure reading codec reg 0x0,Last value=0x0 [7.947157] Failure writing to cs5535 codec [7.951493] Failure reading codec reg 0x7c,Last value=0x0 [7.957039] Failure writing to cs5535 codec [7.961370] Failure reading codec reg 0x7e,Last value=0x0 [7.966916] Failure writing 
to cs5535 codec [7.971248] Failure reading codec reg 0x3c,Last value=0x0 [7.978395] Failure writing to cs5535 codec [7.982747] Failure writing to cs5535 codec [7.987077] Failure reading codec reg 0x1c,Last value=0x0 [8.007310] Failure writing to cs5535 codec [8.011670] Failure reading codec reg 0x0,Last value=0x0 [8.017128] Failure writing to cs5535 codec [8.021461] Failure reading codec reg 0x7c,Last value=0x0 [8.027013] Failure writing to cs5535 codec [8.031340] Failure reading codec reg 0x7e,Last value=0x0 [8.036883] Failure writing to cs5535 codec [8.041216] Failure reading codec reg 0x3c,Last value=0x0 [8.047574] Failure
Bug#717547: nfs-common: simple remount returns an error
On 07/22/2013 12:50 PM, J. R. Okajima wrote:

I think I could see the scenario.
- on wheezy, /etc/mtab becomes a symlink to /proc/mounts.
- mount.nfs writes the given mount options to /etc/mtab, (but not to /proc/mounts.)
- I found the problematic option is sec=sys.
- when /etc/mtab is a regular file, sec=sys is not written since it was not specified explicitly at the mount-time.
- but /proc/mounts always shows sec=.
- when /etc/mtab is a symlink, mount(8) retrieves the option strings from /proc/mounts including sec=sys and passes it to mount.nfs.
- NFS in kernel-space parses the given options, compares them with the internally stored flags, and finds that
  + sec=sys is given explicitly.
  + but the internal status doesn't have sec=sys flag set.
- finally this mismatch is the cause of "mount.nfs: mount(2): Invalid argument".

In order to fix the problem, there may exist several ways. But I am not sure which is best.
- stop /etc/mtab symlink-ing /proc/mounts (by Debian developers).
- at the mount-time of NFS, set sec=sys (NFS_MOUNT_SECFLAVOUR) by default (by NFS developers).
- mount.nfs always writes the sec=sys string (by mount.nfs developers).

I suppose the best way should be decided after the discussion between NFS developers, mount.nfs developers and Debian developers.

Interesting. I can confirm that the problem does happen on a test wheezy instance using sec=sys with /etc/mtab symlinked to /proc/mounts, and that it does *not* happen on a wheezy instance (also with the /etc/mtab→/proc/mounts symlink) using sec=krb5p. This suggests to me that it might be a bug in the way that sec=sys in particular is handled by the NFS tools.

--dkg
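The scenario in the quoted analysis comes down to a difference between the options given at mount time and the options the kernel reports afterwards. The following is a minimal sketch of that comparison, assuming comma-separated option strings; the helper names and the sample option strings are hypothetical and only illustrate the idea, they are not output from the affected system.

```python
def parse_options(option_string):
    """Split a comma-separated mount option string into a dict."""
    options = {}
    for item in option_string.split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        options[key] = value
    return options


def implicit_options(requested, reported):
    """Options that are reported for a mount but were never requested."""
    req = parse_options(requested)
    rep = parse_options(reported)
    return {key: value for key, value in rep.items() if key not in req}


# Hypothetical values, for illustration only.
requested = "rw,hard"          # what was given explicitly at mount time
reported = "rw,hard,sec=sys"   # what a /proc/mounts-style listing shows
extra = implicit_options(requested, reported)
print(extra)
```

Any option that shows up in `extra` is one that a remount would pass back explicitly when the option list is read from /proc/mounts, which is the situation described for sec=sys.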
Bug#713972: linux-image-3.10-rc5-powerpc: fails to boot (screen shows setup_arch: bootmem and arch: exit)
Package: src:linux Version: 3.10~rc5-1~exp1 Severity: important linux 3.10-rc5 fails to boot on this machine. This machine boots 3.2.0-1-powerpc (from wheezy) just fine. when i try to boot 3.10-rc5, grub successfully loads the kernel and the initramfs, and appears to hand off control to the kernel, but then the machine hangs for several minutes, and ultimately reboots back to openfirmware. during the hang, the screen displays a little of the openfirmware info in reverse video (black on white): . Device tree strings 0x03525000 - 0x03525bca Device tree struct 0x03526000 - 0x0354 Calling quiesce... returning from prom_init and then overwritten over the top of that in normal video (white on black) is the following messages (i think from the kernel): setup_arch: bootmem arch: exit these same messages come up with kernel 3.2.0-1-powerpc, but that kernel also says something about hugetlb (before the above two messages, i think it's in the dmesg output below), but then continues to boot. I haven't yet tried kernel 3.9. --dkg -- Package-specific info: ** Kernel log: boot messages should be attached ** Model information revision: 131.0 (pvr 0008 8300) platform: PowerMac model : PowerBook2,1 machine : PowerBook2,1 motherboard : PowerBook2,1 MacRISC Power Macintosh ** PCI devices: :00:0b.0 Host bridge [0600]: Apple Inc. UniNorth AGP [106b:0020] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort+ SERR- PERR- INTx- Latency: 16, Cache Line Size: 32 bytes Capabilities: access denied Kernel driver in use: agpgart-uninorth :00:10.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
[AMD/ATI] Device [1002:4c4e] (rev 64) (prog-if 00 [VGA controller]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 255 (2000ns min), Cache Line Size: 32 bytes Interrupt: pin A routed to IRQ 48 Region 0: Memory at 9100 (32-bit, non-prefetchable) [size=16M] Region 1: I/O ports at 0400 [size=256] Region 2: Memory at 9000 (32-bit, non-prefetchable) [size=4K] Expansion ROM at 9002 [disabled] [size=128K] Capabilities: access denied Kernel driver in use: atyfb 0001:10:0b.0 Host bridge [0600]: Apple Inc. UniNorth PCI [106b:001f] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort+ SERR- PERR- INTx- Latency: 16, Cache Line Size: 32 bytes 0001:10:17.0 Unassigned class [ff00]: Apple Inc. KeyLargo Mac I/O [106b:0022] (rev 02) Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 16, Cache Line Size: 32 bytes Region 0: Memory at 8000 (32-bit, non-prefetchable) [size=512K] Kernel driver in use: macio 0001:10:18.0 USB controller [0c03]: Apple Inc. KeyLargo USB [106b:0019] (prog-if 10 [OHCI]) Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 16 (750ns min, 21500ns max) Interrupt: pin A routed to IRQ 27 Region 0: Memory at 8008 (32-bit, non-prefetchable) [size=4K] Kernel driver in use: ohci_hcd 0001:10:19.0 USB controller [0c03]: Apple Inc. 
KeyLargo USB [106b:0019] (prog-if 10 [OHCI]) Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx- Interrupt: pin A routed to IRQ 28 Region 0: Memory at unassigned (32-bit, non-prefetchable) [disabled] [size=4K] 0002:20:0b.0 Host bridge [0600]: Apple Inc. UniNorth Internal PCI [106b:001e] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort+ SERR- PERR- INTx- Latency: 16, Cache Line Size: 32 bytes 0002:20:0f.0 Ethernet controller [0200]: Apple Inc. UniNorth GMAC (Sun GEM) [106b:0021] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=slow TAbort- TAbort- MAbort- SERR- PERR+ INTx- Latency: 16 (16000ns
Bug#609747: snd-powermac
reopen 609747
reassign 609747 src:linux
found 609747 3.2.46-1
found 609747 3.9.6-1
thanks

It looks to me like snd-powermac still has this issue of not knowing to load automatically. I've confirmed it on the above two kernels.

--dkg
Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On Sun 2013-03-10 23:14:35 -0400, Daniel Kahn Gillmor wrote: On 03/10/2013 08:33 PM, Ben Hutchings wrote: You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel command line. thanks, i'll give that a try when i'm in front of the machine tomorrow. this changed the log message to the following (from a 3.2 boot), but the behavior of the system remained the same, including the null pointer dereference in pulseaudio: [0.588857] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline * just 3.2: [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0100) [1.700255] Failed to setup IBS, -22 * various hda-intel weirdnesses. Does it still crash when starting PulseAudio? kernel 3.2 (which has the above output) does have a null dereference within pulseaudio -- you can see it at 185.6 seconds into the first of the three boots. neither 3.7 and 3.8 have this null dereference. using all three kernels, there is still a hang just before setting preliminary keymap, which is in /etc/rcS.d/S05keyboard-setup. So it's presumably hanging in one of: S01mountkernfs.sh S02udev S03mountdevsubfs.sh S04bootlogd i'm not sure what userspace process is causing that hang (that is, which one is being terminated when i send ctrl-c through the console), but i can try to track it down. Ah. i discovered that if i just wait 3 minutes at that hang instead of pressing ctrl+C impatiently, the boot process does continue, showing the following error messages: udevd[416]: worker [495] unexpectedly returned with status 0x0100 ^M udevd[416]: worker [495] failed while handling '/devices/pci:00/:00:15.2/:03:00.3' ^M done. Setting preliminary keymap...done. So it is almost certainly udev that is failing in this way. Again, with the 3.2 kernel, pulseaudio crashes, but it is slightly different. 
it's no longer a null pointer dereference. (fwiw, uid 109 appears to be the Debian-gdm user) Here is the backtrace From the crash after letting the boot process timeout with the acpi_osi=Linux parameter: [ 201.019945] kernel tried to execute NX-protected page - exploit attempt? (uid: 109) [ 201.020006] BUG: unable to handle kernel paging request at f62b7940 [ 201.020006] IP: [f62b7940] 0xf62b793f [ 201.020006] *pdpt = 01484001 *pde = 8000362001e3 [ 201.020006] Oops: 0011 [#1] SMP [ 201.020006] Modules linked in: sha1_generic hmac cbc cts bridge stp bnep rfcomm bluetooth rfkill crc16 rpcsec_gss_krb5 uinput fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc loop snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm powernow_k8 snd_seq snd_timer snd_seq_device mperf crc32c_intel ipmi_si(+) i2c_piix4 i2c_core tpm_tis aesni_intel cryptd snd aes_i586 tpm ipmi_msghandler processor psmouse soundcore snd_page_alloc aes_generic serio_raw evdev tpm_bios pcspkr thermal_sys ext3 jbd mbcache dm_mod microcode usb_storage sg usbhid hid sd_mod sr_mod cdrom crc_t10dif ohci_hcd ehci_hcd xhci_hcd ahci libahci libata r8169 mii scsi_mod usbcore usb_common button [last unloaded: scsi_wait_scan] [ 201.052254] [ 201.052254] Pid: 3299, comm: pulseaudio Not tainted 3.2.0-4-686-pae #1 Debian 3.2.35-2 LENOVO 4865A14/Annapurna CRB [ 201.052254] EIP: 0060:[f62b7940] EFLAGS: 00010202 CPU: 3 [ 201.052254] EIP is at 0xf62b7940 [ 201.052254] EAX: f37f6e2c EBX: 0080 ECX: f365e4c0 EDX: f6ee5400 [ 201.052254] ESI: f843b05c EDI: f692ef00 EBP: f6ee5a4c ESP: f6f71d98 [ 201.052254] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 201.052254] Process pulseaudio (pid: 3299, ti=f6f7 task=f2e0ea80 task.ti=f6f7) [ 201.052254] Stack: [ 201.052254] f8438515 0080 0080 f2ea5400 f365e4c0 f6deee00 f37f6e2c f692ef00 [ 201.052254] f6deee38 f6f71e04 f843c518 f6c31b00 f84e9bcb f365e4c0 f367ea00 [ 201.052254] f2e0ea80 f84e9cbf f6f71e04 f367eb18 f367eb2c f2e0ea80 [ 201.052254] Call Trace: 
[ 201.052254] [f8438515] ? azx_pcm_open+0x171/0x1f7 [snd_hda_intel] [ 201.052254] [f84e9bcb] ? snd_pcm_open_substream+0x36/0x68 [snd_pcm] [ 201.052254] [f84e9cbf] ? snd_pcm_open+0xc2/0x1be [snd_pcm] [ 201.052254] [c10320e5] ? try_to_wake_up+0x155/0x155 [ 201.052254] [f84e9e33] ? snd_pcm_playback_open+0x31/0x46 [snd_pcm] [ 201.052254] [f83af49e] ? snd_open+0xf5/0x133 [snd] [ 201.052254] [c10cf2d7] ? chrdev_open+0xf3/0x111 [ 201.052254] [c10cafb3] ? __dentry_open+0x17a/0x253 [ 201.052254] [c10cbd01] ? nameidata_to_filp+0x3a/0x45 [ 201.052254] [c10cf1e4] ? cdev_put+0x17/0x17 [ 201.052254] [c10d5dac] ? do_last+0x4f8/0x513 [ 201.052254] [c10d606e] ? path_openat+0xa1/0x28b [ 201.052254] [c10d6301] ? do_filp_open+0x23/0x5c [ 201.052254] [c102a0e5] ? should_resched+0x5/0x1e
Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On Mon 2013-03-11 15:40:06 -0400, Daniel Kahn Gillmor wrote:

None of these warnings or backtraces show up on 3.8, and pulseaudio also does not crash on 3.8.

However, i should note that one of the udev threads does still hang/fail on 3.8 for just under 180 seconds:

--
[8.730656] sd 7:0:0:2: [sdd] Attached SCSI removable disk
udevd[472]: worker [538] unexpectedly returned with status 0x0100
udevd[472]: worker [538] failed while handling '/devices/pci:00/:00:15.2/:03:00.3'
done.
Setting preliminary keymap...done.
Checking root file system...fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/krazy-root: clean, 134546/249984 fil[ 186.797993] EXT3-fs (dm-0): using internal journal
es, 920069/999424 blocks (check in 4 mounts)
done.
Cleaning up ifupdown [ 186.935970] loop: module loaded
--

note that the device in question appears to be:

*-serial
  description: IPMI SMIC interface
  product: Realtek Semiconductor Co., Ltd.
  vendor: Realtek Semiconductor Co., Ltd.
  physical id: 0.3
  bus info: pci@:03:00.3
  version: 01
  width: 64 bits
  clock: 33MHz
  capabilities: pm msi pciexpress msix vpd cap_list
  configuration: driver=ipmi_si latency=0
  resources: irq:17 ioport:e000(size=256) memory:fea1-fea100ff memory:fea0-fea03fff

should i be telling this machine to blacklist some module, or tell udev to avoid trying to do anything with this device?

--dkg
Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On 03/10/2013 08:33 PM, Ben Hutchings wrote:

This means: the BIOS includes a quirk for Linux, but we ignored it because there's no way to know which versions it was intended to apply to. (This was changed in Linux 2.6.23, so I have no idea why there are new machines like this.) You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel command line.

thanks, i'll give that a try when i'm in front of the machine tomorrow.

* just 3.2:
[1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
[1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0100)
[1.700255] Failed to setup IBS, -22
* various hda-intel weirdnesses. Does it still crash when starting PulseAudio?

kernel 3.2 (which has the above output) does have a null dereference within pulseaudio -- you can see it at 185.6 seconds into the first of the three boots. neither 3.7 nor 3.8 has this null dereference.

using all three kernels, there is still a hang just before setting preliminary keymap, which is in /etc/rcS.d/S05keyboard-setup. So it's presumably hanging in one of:

S01mountkernfs.sh
S02udev
S03mountdevsubfs.sh
S04bootlogd

i'm not sure what userspace process is causing that hang (that is, which one is being terminated when i send ctrl-c through the console), but i can try to track it down.

I assume it still doesn't actually produce sound output.

I haven't yet been able to coax any sound out of the system.

The standard diagnostic script for ALSA is: http://www.alsa-project.org/alsa-info.sh If sound is still broken on 3.8 then please run this script there.

Will do.

Regards,

--dkg
Bug#698780: BUG in nfs_mark_delegation_referenced after network outages
On 02/27/2013 01:30 AM, Rik Theys wrote:

I've installed the 3.7 kernel from experimental on his system and the bug was no longer triggered, so it's probably fixed upstream.

hm, that's interesting to note. I haven't tried with a 3.7 kernel yet.

Unfortunately the user is not very responsive in helping me track down which upstream kernel version introduced the fix, and I've been unable to reproduce the problem myself. Do you have a way of (reliably) triggering this bug?

alas, i do not :( i've only observed it in the flooded-local-link scenario i described in my earlier post, and it didn't happen to every machine running that kernel, just one of them. And of course, this is not a failure mode i can really afford to deliberately re-introduce into a production environment either :/

--dkg
Bug#698780: linux-image-3.2.0-0.bpo.4-686-pae: BUG in nfs_mark_delegation_referenced after network outages
Package: src:linux Version: 3.2.35-2~bpo60+1 Severity: normal we're having some sort of network failure that i haven't been able to diagnose in full (one of the other machines on the network appears to occasionally flood the local link). during one of these floods, nfs (understandably) slows down. however, once the connectivity was restored (and the local link segment was working again), a subsequent NFS request triggered this same bug. --dkg -- Package-specific info: ** Version: Linux version 3.2.0-0.bpo.4-686-pae (debian-kernel@lists.debian.org) (gcc version 4.4.5 (Debian 4.4.5-8) ) #1 SMP Debian 3.2.35-2~bpo60+1 ** Command line: BOOT_IMAGE=/vmlinuz-3.2.0-0.bpo.4-686-pae root=/dev/mapper/birman-root ro quiet ** Tainted: D (128) * Kernel has oopsed before. ** Kernel log: [ 10.143580] kjournald starting. Commit interval 5 seconds [ 10.144058] EXT3-fs (dm-4): using internal journal [ 10.144067] EXT3-fs (dm-4): mounted filesystem with ordered data mode [ 10.169196] kjournald starting. Commit interval 5 seconds [ 10.169765] EXT3-fs (dm-2): using internal journal [ 10.169775] EXT3-fs (dm-2): mounted filesystem with ordered data mode [ 10.195736] kjournald starting. 
Commit interval 5 seconds [ 10.196089] EXT3-fs (dm-3): using internal journal [ 10.196098] EXT3-fs (dm-3): mounted filesystem with ordered data mode [ 11.321823] Bridge firewalling registered [ 11.360595] device eth0 entered promiscuous mode [ 11.362836] tg3 :3f:00.0: irq 42 for MSI/MSI-X [ 12.752639] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 12.755506] ADDRCONF(NETDEV_UP): br0: link is not ready [ 15.079958] tg3 :3f:00.0: eth0: Link is up at 1000 Mbps, full duplex [ 15.079968] tg3 :3f:00.0: eth0: Flow control is on for TX and on for RX [ 15.080318] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 15.080379] br0: port 1(eth0) entering forwarding state [ 15.080389] br0: port 1(eth0) entering forwarding state [ 15.080714] ADDRCONF(NETDEV_CHANGE): br0: link becomes ready [ 16.811499] RPC: Registered named UNIX socket transport module. [ 16.811508] RPC: Registered udp transport module. [ 16.811513] RPC: Registered tcp transport module. [ 16.811518] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 16.882085] FS-Cache: Loaded [ 16.949800] FS-Cache: Netfs 'nfs' registered for caching [ 16.964094] Installing knfsd (copyright (C) 1996 o...@monad.swb.de). [ 18.212279] fuse init (API version 7.17) [ 18.873226] kvm: Nested Virtualization enabled [ 18.873235] kvm: Nested Paging enabled [ 20.039046] input: ACPI Virtual Keyboard Device as /devices/virtual/input/input11 [ 20.996724] sshd (1601): /proc/1601/oom_adj is deprecated, please use /proc/1601/oom_score_adj instead. 
[ 21.436772] Bluetooth: Core ver 2.16 [ 21.436824] NET: Registered protocol family 31 [ 21.436831] Bluetooth: HCI device and connection manager initialized [ 21.436839] Bluetooth: HCI socket layer initialized [ 21.436844] Bluetooth: L2CAP socket layer initialized [ 21.436855] Bluetooth: SCO socket layer initialized [ 23.000913] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 23.000921] Bluetooth: BNEP filters: protocol multicast [ 23.001226] Bluetooth: RFCOMM TTY layer initialized [ 23.001235] Bluetooth: RFCOMM socket layer initialized [ 23.001240] Bluetooth: RFCOMM ver 1.11 [ 23.992379] lp: driver loaded but no devices found [ 23.999604] ppdev: user-space parallel port driver [ 25.792138] br0: no IPv6 routers present [ 25.816095] eth0: no IPv6 routers present [ 27.676641] ip_tables: (C) 2000-2006 Netfilter Core Team [ 28.049051] udev[1923]: starting version 164 [ 28.693023] ip6_tables: (C) 2000-2006 Netfilter Core Team [ 30.112026] br0: port 1(eth0) entering forwarding state [ 33.992404] tun: Universal TUN/TAP device driver, 1.6 [ 33.992411] tun: (C) 1999-2004 Max Krasnyansky m...@qualcomm.com [ 33.995142] device vnet0 entered promiscuous mode [ 33.995621] br0: port 2(vnet0) entering forwarding state [ 33.995634] br0: port 2(vnet0) entering forwarding state [ 35.128400] hda-intel: IRQ timing workaround is activated for card #1. Suggest a bigger bdl_pos_adj. 
[ 44.304075] vnet0: no IPv6 routers present [ 49.024072] br0: port 2(vnet0) entering forwarding state [15073.112099] nfs: server molly not responding, still trying [17572.459204] tg3 :3f:00.0: eth0: Link is down [17572.494368] br0: port 1(eth0) entering forwarding state [17697.602932] tg3 :3f:00.0: eth0: Link is up at 1000 Mbps, full duplex [17697.602943] tg3 :3f:00.0: eth0: Flow control is on for TX and on for RX [17697.603280] br0: port 1(eth0) entering forwarding state [17697.603304] br0: port 1(eth0) entering forwarding state [17712.608099] br0: port 1(eth0) entering forwarding state [18118.648453] nfs: server molly OK [18855.584888] BUG: unable to handle kernel paging request at ffdc [18855.584934] IP: [fafa5b7a] nfs_mark_delegation_referenced+0x6/0x6
Bug#701054: linux-image-3.2.0-4-686-pae: NULL pointer dereference in azx_pcm_open on ThinkCentre M78 hardware (ATI Technologies Inc Device 9902)
Package: src:linux Version: 3.2.35-2 Severity: normal When i let udev load snd_hda_* modules on this thinkcentre m78, pulseaudio somehow triggers a NULL pointer dereference in the kernel. I think that the reportbug hooks include all the baseline relevant info. i'm also happy to provide whatever followup information you think might be useful. please let me know. Regards, --dkg -- Package-specific info: ** Version: Linux version 3.2.0-4-686-pae (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2 ** Command line: BOOT_IMAGE=/vmlinuz-3.2.0-4-686-pae root=/dev/mapper/krazy-root ro quiet ** Tainted: D (128) * Kernel has oopsed before. ** Kernel log: [5.140318] hda-codec: out of range cmd 0:0:da88:708:da88 [5.140501] HDMI status: Codec=0 Pin=3 Presence_Detect=0 ELD_Valid=0 [5.140562] HDMI status: Codec=0 Pin=0 Presence_Detect=0 ELD_Valid=0 [5.140612] HDMI status: Codec=0 Pin=0 Presence_Detect=0 ELD_Valid=0 [5.140676] HDMI status: Codec=0 Pin=0 Presence_Detect=0 ELD_Valid=0 [5.140745] hda-codec: out of range cmd 0:0:da88:f00:c [5.140783] hda-codec: out of range cmd 0:0:da88:709:0 [5.140821] hda-codec: out of range cmd 0:0:da88:f09:0 [5.140858] HDMI status: Codec=0 Pin=55944 Presence_Detect=1 ELD_Valid=1 [5.140862] hda-codec: out of range cmd 0:0:da88:f2e:8 [5.140900] hda-codec: out of range cmd 0:0:da88:709:0 [5.140937] hda-codec: out of range cmd 0:0:da88:f09:0 [5.141023] HDMI status: Codec=0 Pin=0 Presence_Detect=0 ELD_Valid=0 [5.141113] hda_codec: cannot build controls for #0 (error -16) [5.198526] input: HD-Audio Generic Headphone as /devices/pci:00/:00:14.2/sound/card1/input4 [ 19.474300] EXT3-fs (dm-0): using internal journal [ 19.630645] loop: module loaded [ 19.790396] RPC: Registered named UNIX socket transport module. [ 19.790402] RPC: Registered udp transport module. [ 19.790405] RPC: Registered tcp transport module. [ 19.790409] RPC: Registered tcp NFSv4.1 backchannel transport module. 
[ 19.900704] FS-Cache: Loaded [ 19.931822] FS-Cache: Netfs 'nfs' registered for caching [ 19.951745] Installing knfsd (copyright (C) 1996 o...@monad.swb.de). [ 20.444134] Adding 499708k swap on /dev/mapper/krazy-swap_1. Priority:-1 extents:1 across:499708k [ 21.401917] kjournald starting. Commit interval 5 seconds [ 21.402160] EXT3-fs (sda1): using internal journal [ 21.402167] EXT3-fs (sda1): mounted filesystem with ordered data mode [ 21.449914] kjournald starting. Commit interval 5 seconds [ 21.450469] EXT3-fs (dm-4): using internal journal [ 21.450476] EXT3-fs (dm-4): mounted filesystem with ordered data mode [ 21.470324] kjournald starting. Commit interval 5 seconds [ 21.470574] EXT3-fs (dm-2): using internal journal [ 21.470582] EXT3-fs (dm-2): mounted filesystem with ordered data mode [ 21.479028] kjournald starting. Commit interval 5 seconds [ 21.479273] EXT3-fs (dm-3): using internal journal [ 21.479279] EXT3-fs (dm-3): mounted filesystem with ordered data mode [ 22.176889] fuse init (API version 7.17) [ 22.530616] r8169 :03:00.0: eth0: link down [ 22.530635] r8169 :03:00.0: eth0: link down [ 22.530886] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 24.193329] input: ACPI Virtual Keyboard Device as /devices/virtual/input/input5 [ 24.477874] sshd (1885): /proc/1885/oom_adj is deprecated, please use /proc/1885/oom_score_adj instead. 
[ 25.445846] r8169 :03:00.0: eth0: link up [ 25.446091] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 28.156661] Bluetooth: Core ver 2.16 [ 28.156695] NET: Registered protocol family 31 [ 28.156699] Bluetooth: HCI device and connection manager initialized [ 28.156705] Bluetooth: HCI socket layer initialized [ 28.156710] Bluetooth: L2CAP socket layer initialized [ 28.156720] Bluetooth: SCO socket layer initialized [ 28.195774] Bluetooth: RFCOMM TTY layer initialized [ 28.195785] Bluetooth: RFCOMM socket layer initialized [ 28.195790] Bluetooth: RFCOMM ver 1.11 [ 28.263947] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 28.263953] Bluetooth: BNEP filters: protocol multicast [ 28.303989] Bridge firewalling registered [ 35.125780] udev[2498]: starting version 164 [ 35.898189] BUG: unable to handle kernel NULL pointer dereference at (null) [ 35.898198] IP: [ (null)] (null) [ 35.898205] *pdpt = 331e7001 *pde = [ 35.898212] Oops: 0010 [#1] SMP [ 35.898218] Modules linked in: sha1_generic hmac cbc cts bridge stp bnep rfcomm bluetooth rfkill crc16 rpcsec_gss_krb5 uinput fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc loop snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_intel snd_hda_codec powernow_k8 mperf snd_hwdep snd_pcm crc32c_intel snd_seq snd_timer snd_seq_device snd psmouse i2c_piix4 i2c_core tpm_tis aesni_intel cryptd aes_i586 aes_generic soundcore snd_page_alloc ipmi_si(+) ipmi_msghandler
Bug#683111: linux-image-3.2.0-4-ixp4xx: same WARNING on armel (linux-3.2.32/block/genhd.c:1573 disk_clear_events+0xc8/0x110()) Followup-For: Bug #683111 Package: src:linux Version: 3.2.32-1
I'm seeing an OOPS from the same line of code on armel. You can see the backtrace in the dmesg output below. This is from an NSLU2, where the root filesystem is on a 2GiB USB stick, and the machine has 32MiB of RAM and a 265BogoMIPS XScale-IXP42x CPU. Regards, --dkg -- Package-specific info: ** Version: Linux version 3.2.0-4-ixp4xx (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-9) ) #1 Debian 3.2.32-1 ** Command line: console=ttyS0,115200 noirqdebug ** Tainted: W (512) * Taint on warning. ** Kernel log: [3.487706] hub 1-0:1.0: 5 ports detected [3.923091] usb 1-2: new high-speed USB device number 3 using ehci_hcd [4.074175] usb 1-2: New USB device found, idVendor=0781, idProduct=5406 [4.080986] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [4.088294] usb 1-2: Product: U3 Cruzer Micro [4.092747] usb 1-2: Manufacturer: SanDisk [4.096913] usb 1-2: SerialNumber: 08782307B150EF77 [4.246588] SCSI subsystem initialized [4.264784] usbcore: registered new interface driver uas [4.290730] Initializing USB Mass Storage driver... [4.296806] scsi0 : usb-storage 1-2:1.0 [4.307533] usbcore: registered new interface driver usb-storage [4.313759] USB Mass Storage support registered. 
[5.305988] scsi 0:0:0:0: Direct-Access SanDisk U3 Cruzer Micro 8.02 PQ: 0 ANSI: 0 CCS [5.316201] scsi 0:0:0:1: CD-ROMSanDisk U3 Cruzer Micro 8.02 PQ: 0 ANSI: 0 [5.475288] sd 0:0:0:0: [sda] 3907583 512-byte logical blocks: (2.00 GB/1.86 GiB) [5.503782] sd 0:0:0:0: [sda] Write Protect is off [5.508677] sd 0:0:0:0: [sda] Mode Sense: 45 00 00 08 [5.512609] sd 0:0:0:0: [sda] No Caching mode page present [5.518196] sd 0:0:0:0: [sda] Assuming drive cache: write through [5.535757] sd 0:0:0:0: [sda] No Caching mode page present [5.541347] sd 0:0:0:0: [sda] Assuming drive cache: write through [5.555094] sda: sda1 sda2 sda5 [5.571007] sd 0:0:0:0: [sda] No Caching mode page present [5.576699] sd 0:0:0:0: [sda] Assuming drive cache: write through [5.582925] sd 0:0:0:0: [sda] Attached SCSI removable disk [6.325444] [ cut here ] [6.330219] WARNING: at /build/buildd-linux_3.2.32-1-armel-KUCrxS/linux-3.2.32/block/genhd.c:1573 disk_clear_events+0xc8/0x110() [6.341920] Modules linked in: sd_mod crc_t10dif usb_storage uas scsi_mod ehci_hcd usbcore usb_common [6.351373] [c00137ac] (unwind_backtrace+0x0/0xe0) from [c0026f18] (warn_slowpath_common+0x4c/0x64) [6.360921] [c0026f18] (warn_slowpath_common+0x4c/0x64) from [c0026f48] (warn_slowpath_null+0x18/0x1c) [6.370733] [c0026f48] (warn_slowpath_null+0x18/0x1c) from [c015af4c] (disk_clear_events+0xc8/0x110) [6.380375] [c015af4c] (disk_clear_events+0xc8/0x110) from [c00ecc48] (check_disk_change+0x18/0x50) [6.389985] [c00ecc48] (check_disk_change+0x18/0x50) from [bf08d478] (sd_open+0x7c/0x138 [sd_mod]) [6.399495] [bf08d478] (sd_open+0x7c/0x138 [sd_mod]) from [c00edce4] (__blkdev_get+0x284/0x3a8) [6.408689] [c00edce4] (__blkdev_get+0x284/0x3a8) from [c00edf98] (blkdev_get+0x190/0x28c) [6.417451] [c00edf98] (blkdev_get+0x190/0x28c) from [c00c1300] (__dentry_open+0x224/0x33c) [6.426293] [c00c1300] (__dentry_open+0x224/0x33c) from [c00c2228] (nameidata_to_filp+0x50/0x5c) [6.435583] [c00c2228] (nameidata_to_filp+0x50/0x5c) from [c00cec50] 
(do_last.isra.26+0x670/0x6a8) [6.445043] [c00cec50] (do_last.isra.26+0x670/0x6a8) from [c00ced64] (path_openat+0xb4/0x3e0) [6.454062] [c00ced64] (path_openat+0xb4/0x3e0) from [c00cf174] (do_filp_open+0x2c/0x78) [6.462641] [c00cf174] (do_filp_open+0x2c/0x78) from [c00c2318] (do_sys_open+0xe4/0x17c) [6.471237] [c00c2318] (do_sys_open+0xe4/0x17c) from [c000dea0] (ret_fast_syscall+0x0/0x2c) [6.480049] ---[ end trace fb8a3c5ed96fd8b7 ]--- [6.839330] kjournald starting. Commit interval 5 seconds [6.845203] EXT3-fs (sda1): mounted filesystem with ordered data mode [ 10.362348] udevd[254]: starting version 175 [ 11.498005] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver [ 11.643850] ohci_hcd :00:01.0: OHCI Host Controller [ 11.695488] ohci_hcd :00:01.0: new USB bus registered, assigned bus number 2 [ 11.789395] ohci_hcd :00:01.0: irq 28, io mem 0x4800 [ 11.884793] usb usb2: New USB device found, idVendor=1d6b, idProduct=0001 [ 11.891695] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 11.899149] usb usb2: Product: OHCI Host Controller [ 11.904142] usb usb2: Manufacturer: Linux 3.2.0-4-ixp4xx ohci_hcd [ 11.910346] usb usb2: SerialNumber: :00:01.0 [ 11.976445] input: ixp4xx beeper as /devices/platform/ixp4xx-beeper.4/input/input0 [ 12.155663] IXP4xx Queue Manager initialized. [ 12.235505] hub 2-0:1.0: USB hub found [ 12.279355] hub 2-0:1.0: 3 ports detected [ 12.334303] ohci_hcd :00:01.1: OHCI Host Controller [
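The reports in this thread carry lines such as "Tainted: W (512)" and "Tainted: D (128)". The number is a bitmask, and the letter names the bit that is set. A small sketch for decoding such values follows; the table is a subset of the kernel's documented taint flags, and the function name is hypothetical:

```python
# Subset of the kernel taint flags: bit position -> (letter, meaning).
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force-loaded"),
    3: ("R", "module was force-unloaded"),
    4: ("M", "machine check exception occurred"),
    5: ("B", "bad page referenced"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel has oopsed before"),
    8: ("A", "ACPI table overridden"),
    9: ("W", "kernel issued a warning"),
    10: ("C", "staging driver was loaded"),
    12: ("O", "out-of-tree module was loaded"),
}


def decode_taint(value):
    """Return the taint letters set in a numeric taint value.

    Bits missing from TAINT_FLAGS are reported as '?<bit>' rather
    than guessed at.
    """
    letters = []
    bit = 0
    while value >> bit:
        if value & (1 << bit):
            letters.append(TAINT_FLAGS.get(bit, ("?%d" % bit, ""))[0])
        bit += 1
    return letters
```

With this, `decode_taint(512)` yields `["W"]` and `decode_taint(128)` yields `["D"]`, matching the letters printed in the reports.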
Bug#694028: [wheezy] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
On 11/26/2012 04:04 AM, Jonathan Nieder wrote: Perfect, thanks. I'll try to find time to dig into the log and acpixtract-ed and iasl -d-ed acpidump some time this week, though I can't promise anything. How often do oopses like this occur? Well, i only use wireless infrequently, and due to terrible battery life on this machine, i'm rarely unplugged from power -- so the conditions that seem to be related to this crash don't happen often enough for the crash to be frequent. i also tried (half-heartedly -- i don't like having this machine crash) once to repeat the sequence of events that appeared to lead to the crash immediately after my reboot, but got no effect. --dkg
Bug#694028: linux-image-3.2.0-4-686-pae: kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Package: src:linux Version: 3.2.32-1 Severity: normal As you can see from the dmesg output, i had a kernel oops related somehow to a failed paging request. This happened just about the time that i plugged this laptop (an Asus EeePC 900) into wall power, after a brief period of time running from the battery. The machine was using the wireless NIC the entire time. The machine was running X11, but a text-mode OOPS message (with the backtrace) was written over the screen, obscuring the X11 display (though the mouse pointer appeared over the text). I was able to restore X11 functionality with ctrl-alt-F1, followed by ctrl-alt-F7. I don't have strong data here, but my impression is that I've seen failures like this much more frequently around power events while the wireless is running. Note that when the wireless is not running, it is usually disabled with the rfkill switch. I don't know if this is enough info to debug, or if this is something that maybe should be chalked up to hardware failures; but i'd be happy to provide more information if it would be useful to anyone. Regards, --dkg -- Package-specific info: ** Version: Linux version 3.2.0-4-686-pae (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-12) ) #1 SMP Debian 3.2.32-1 ** Command line: BOOT_IMAGE=/vmlinuz-3.2.0-4-686-pae root=/dev/mapper/vg_pip0-root ro verbose ** Tainted: D (128) * Kernel has oopsed before. ** Kernel log: [411828.876751] PM: Saving platform NVS memory [411828.877463] Disabling non-boot CPUs ... 
[411828.877463] ACPI: Low-level resume complete [411828.877463] PM: Restoring platform NVS memory [411828.877463] Force enabled HPET at resume [411828.877463] ACPI: Waking up from system sleep state S3 [411828.953214] uhci_hcd :00:1d.0: wake-up capability disabled by ACPI [411828.953255] uhci_hcd :00:1d.1: wake-up capability disabled by ACPI [411828.953293] uhci_hcd :00:1d.2: wake-up capability disabled by ACPI [411828.953332] uhci_hcd :00:1d.3: wake-up capability disabled by ACPI [411828.953387] ehci_hcd :00:1d.7: wake-up capability disabled by ACPI [411828.953704] PM: early resume of devices complete after 0.846 msecs [411828.953893] i915 :00:02.0: setting latency timer to 64 [411829.013246] snd_hda_intel :00:1b.0: setting latency timer to 64 [411829.013310] snd_hda_intel :00:1b.0: irq 43 for MSI/MSI-X [411829.013489] uhci_hcd :00:1d.0: setting latency timer to 64 [411829.013518] usb usb2: root hub lost power or was reset [411829.013539] uhci_hcd :00:1d.1: setting latency timer to 64 [411829.013564] usb usb3: root hub lost power or was reset [411829.013584] uhci_hcd :00:1d.2: setting latency timer to 64 [411829.013609] usb usb4: root hub lost power or was reset [411829.013628] uhci_hcd :00:1d.3: setting latency timer to 64 [411829.013653] usb usb5: root hub lost power or was reset [411829.013675] ehci_hcd :00:1d.7: setting latency timer to 64 [411829.013714] pci :00:1e.0: setting latency timer to 64 [411829.013735] ata_piix :00:1f.2: setting latency timer to 64 [411829.018135] sd 0:0:0:0: [sda] Starting disk [411829.256044] usb 1-5: reset high-speed USB device number 3 using ehci_hcd [411829.500036] usb 1-8: reset high-speed USB device number 4 using ehci_hcd [411831.364309] ata1.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out [411831.364319] ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out [411831.380295] ata1.00: configured for UDMA/133 [411831.380962] PM: resume of devices complete after 2427.214 msecs [411831.444182] PM: 
Finishing wakeup. [411831.444187] Restarting tasks ... done. [411831.532148] video LNXVIDEO:00: Restoring backlight state [411832.112239] sd 2:0:0:0: [sdb] No Caching mode page present [411832.112251] sd 2:0:0:0: [sdb] Assuming drive cache: write through [411832.118091] sd 2:0:0:0: [sdb] No Caching mode page present [411832.118101] sd 2:0:0:0: [sdb] Assuming drive cache: write through [411832.119462] sdb: sdb1 [411832.803601] Atheros(R) L2 Ethernet Driver - version 2.2.3 [411832.803612] Copyright (c) 2007 Atheros Corporation. [411832.803907] atl2 :03:00.0: setting latency timer to 64 [411833.345379] atl2 :03:00.0: irq 44 for MSI/MSI-X [411833.346711] ADDRCONF(NETDEV_UP): eth0: link is not ready [411856.844620] pci :01:00.0: [168c:001c] type 0 class 0x000200 [411856.844659] pci :01:00.0: reg 10: [mem 0x-0x 64bit] [411856.844838] pci :01:00.0: BAR 0: assigned [mem 0xf800-0xf800 64bit] [411856.845043] ath5k :01:00.0: enabling device ( - 0002) [411856.845069] ath5k :01:00.0: setting latency timer to 64 [411856.845179] ath5k :01:00.0: registered as 'phy1' [411857.363325] ath: EEPROM regdomain: 0x60 [411857.36] ath: EEPROM indicates we should expect a direct regpair map [411857.363341] ath: Country alpha2 being used: 00 [411857.363346] ath: Regpair used: 0x60 [411857.366253] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
Bug#692324: linux-image-3.6-trunk-powerpc: Oops and hang at boot: Unable to handle kernel paging request for data at address 0x0000000c
Thanks for the explanation, ben. fwiw, i should have noted that this machine has snd-powermac in /etc/modules. when i remove snd-powermac from /etc/modules and reboot the machine, it boots fine into 3.6-trunk-powerpc. then, doing a modprobe -v snd-powermac causes the same crash. So that module is definitely the culprit. On Tue 2012-11-06 22:20:42 -0500, Ben Hutchings wrote: Otherwise, I think upstream will need some information from the OF device tree (we probably ought to gather that automatically in bug reports...) Maybe something like: find /proc/device-tree -path '*i2c*' would be a useful start. 0 root@colddeadhands:~/bugs/692324# find /proc/device-tree -path '*i2c*' /proc/device-tree/pci@f200/mac-io@17/i2c@18000 /proc/device-tree/pci@f200/mac-io@17/i2c@18000/name /proc/device-tree/pci@f200/mac-io@17/i2c@18000/linux,phandle /proc/device-tree/pci@f200/mac-io@17/i2c@18000/AAPL,driver-name /proc/device-tree/pci@f200/mac-io@17/i2c@18000/AAPL,i2c-rate /proc/device-tree/pci@f200/mac-io@17/i2c@18000/AAPL,address-step /proc/device-tree/pci@f200/mac-io@17/i2c@18000/AAPL,address /proc/device-tree/pci@f200/mac-io@17/i2c@18000/interrupt-parent /proc/device-tree/pci@f200/mac-io@17/i2c@18000/interrupts /proc/device-tree/pci@f200/mac-io@17/i2c@18000/built-in /proc/device-tree/pci@f200/mac-io@17/i2c@18000/#address-cells /proc/device-tree/pci@f200/mac-io@17/i2c@18000/#size-cells /proc/device-tree/pci@f200/mac-io@17/i2c@18000/compatible /proc/device-tree/pci@f200/mac-io@17/i2c@18000/reg /proc/device-tree/pci@f200/mac-io@17/i2c@18000/device_type /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/name /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/linux,phandle /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/default-country-code /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/slot-names /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/compatible 
/proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/modem-id /proc/device-tree/pci@f200/mac-io@17/i2c@18000/i2c-modem/device_type /proc/device-tree/pci@f200/mac-io@17/i2c@18000/deq@6a /proc/device-tree/pci@f200/mac-io@17/i2c@18000/deq@6a/name /proc/device-tree/pci@f200/mac-io@17/i2c@18000/deq@6a/linux,phandle /proc/device-tree/pci@f200/mac-io@17/i2c@18000/deq@6a/i2c-address /proc/device-tree/pci@f200/mac-io@17/i2c@18000/deq@6a/reg /proc/device-tree/pci@f200/mac-io@17/i2c@18000/deq@6a/device_type /proc/device-tree/pci@f200/mac-io@17/i2c@18000/cereal@1c0 /proc/device-tree/pci@f200/mac-io@17/i2c@18000/cereal@1c0/name /proc/device-tree/pci@f200/mac-io@17/i2c@18000/cereal@1c0/linux,phandle /proc/device-tree/pci@f200/mac-io@17/i2c@18000/cereal@1c0/device_type /proc/device-tree/pci@f200/mac-io@17/i2c@18000/cereal@1c0/reg /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/name /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/linux,phandle /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/#size-cells /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/#address-cells /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/compatible /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/device_type /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298 /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298/name /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298/linux,phandle /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298/platform-do-ivad-pwm /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298/compatible /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298/reg /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-pwm@298/device_type /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6 
/proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6/name /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6/linux,phandle /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6/platform-do-ivad-eeprom /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6/compatible /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6/reg /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad-eeprom@2a6/device_type /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad@28c /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad@28c/name /proc/device-tree/pci@f200/mac-io@17/via-pmu@16000/pmu-i2c/ivad@28c/linux,phandle
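The find listing above gives only the property paths. If upstream also needs the property values, something like the following could collect them. This is a rough sketch rather than an established tool: the function names are hypothetical, and it assumes property files are small enough to read whole.

```python
import fnmatch
import os


def dump_nodes(root, pattern="*i2c*"):
    """Yield (path, raw_bytes) for each file under root whose full
    path matches pattern, mirroring `find root -path pattern` but
    restricted to regular property files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if fnmatch.fnmatch(path, pattern):
                with open(path, "rb") as f:
                    yield path, f.read()


def render(value):
    """Show printable NUL-terminated strings as text and anything
    else (cells, binary blobs) as a hex string."""
    text = value.rstrip(b"\0")
    if text and all(32 <= b < 127 for b in text):
        return text.decode("ascii")
    return value.hex()
```

A report could then be produced with `for path, raw in dump_nodes("/proc/device-tree"): print(path, render(raw))`.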
Bug#687915: linux-image-3.2.0-3-686-pae: general protection fault when plugging in power on Asus EeePC 900
Package: src:linux Version: 3.2.23-1 Severity: normal I had the wireless card enabled for a couple minutes before noticing that the battery on this Asus EeePC 900 was rather low. I plugged it into wall power, and immediately as the battery started charging the following error message showed up on the screen (making my X11 session inaccessible): Sep 16 15:24:45 pip kernel: [35635.074942] general protection fault: 0004 [#1] SMP Sep 16 15:24:45 pip kernel: [35635.075059] Modules linked in: arc4 atl2 nfnetlink bnep rfcomm bluetooth crc16 binfmt_misc uinput fuse ath5k ath mac80211 cfg80211 loop joydev snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss i915 snd_pcm drm_kms_helper iTCO_wdt uvcvideo psmouse snd_page_alloc drm iTCO_vendor_support snd_timer i2c_algo_bit videodev snd media serio_raw evdev rng_core soundcore i2c_core battery eeepc_laptop ac video sparse_keymap power_supply rfkill processor button ext3 mbcache jbd btrfs crc32c libcrc32c zlib_deflate ohci_hcd sha256_generic cryptd aes_i586 aes_generic cbc usbhid hid usb_storage uas dm_crypt dm_mod raid1 md_mod sg sd_mod crc_t10dif ata_generic ata_piix ahci libahci libata scsi_mod uhci_hcd ehci_hcd usbcore thermal thermal_sys usb_common [last unloaded: atl2] Sep 16 15:24:45 pip kernel: [35635.076738] Sep 16 15:24:45 pip kernel: [35635.076771] Pid: 17665, comm: pidof Not tainted 3.2.0-3-686-pae #1 ASUSTeK Computer INC. 
900/900 Sep 16 15:24:45 pip kernel: [35635.076815] EIP: 0060:[c10caa62] EFLAGS: 00010206 CPU: 0 Sep 16 15:24:45 pip kernel: [35635.076815] EIP is at sys_close+0x18/0x89 Sep 16 15:24:45 pip kernel: [35635.076815] EAX: f25bc1b0 EBX: 0004 ECX: EDX: 00df Sep 16 15:24:45 pip kernel: [35635.076815] ESI: 0003 EDI: EBP: f25e8000 ESP: f25e9fa8 Sep 16 15:24:45 pip kernel: [35635.076815] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Sep 16 15:24:45 pip kernel: [35635.076815] Process pidof (pid: 17665, ti=f25e8000 task=f25bc1c0 task.ti=f25e8000) Sep 16 15:24:45 pip kernel: [35635.076815] Stack: Sep 16 15:24:45 pip kernel: [35635.076815] 0004 086715f0 c12c419f 0004 b76fcff4 086715f0 Sep 16 15:24:45 pip kernel: [35635.076815] bfd1cdc8 0006 007b 007b 0033 0006 b7726424 Sep 16 15:24:45 pip kernel: [35635.076815] 0073 0286 bfd1cd94 007b 0042904c 0042904c Sep 16 15:24:45 pip kernel: [35635.076815] Call Trace: Sep 16 15:24:45 pip kernel: [35635.076815] [c12c419f] ? sysenter_do_call+0x12/0x28 Sep 16 15:24:45 pip kernel: [35635.076815] Code: d8 e8 d2 ef 02 00 89 d8 e8 16 21 00 00 89 f0 5b 5e 5f c3 56 53 8b 74 24 0c 64 a1 0c 4f 47 c1 4a 04 a4 02 00 00 8d 43 00 8b d2 4e 1f 00 8b 53 04 3b 32 73 56 8b 42 04 8b 04 b0 85 c0 74 4c 8b ff Sep 16 15:24:45 pip kernel: [35635.076815] EIP: [c10caa62] sys_close+0x18/0x89 SS:ESP 0068:f25e9fa8 Sep 16 15:24:45 pip kernel: [35635.279291] ---[ end trace 5800fe6bc9a5d526 ]--- I was unable to even use ctrl-alt-f1 to switch vt's. However, ACPI triggers were working, and i could put the machine to sleep and wake it again. after wake, though, the laptop's screen still showed this spew and i couldn't change that with ctrl-alt-f1, or any trackpad activity. I did not have an external monitor handy to try out the VGA port. Using ACPI triggers, i was able to cleanly shut the machine down, and restart it. The graphics are fine after a restart. I'm happy to try to provide more details if you let me know what info would be useful. 
--dkg -- Package-specific info: ** Version: Linux version 3.2.0-3-686-pae (Debian 3.2.23-1) (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Mon Jul 23 03:50:34 UTC 2012 ** Command line: BOOT_IMAGE=/vmlinuz-3.2.0-3-686-pae root=/dev/mapper/vg_pip0-root ro verbose ** Not tainted ** Kernel log: [17078.216093] PM: suspend of devices complete after 347.687 msecs [17078.216407] ehci_hcd :00:1d.7: PME# enabled [17078.216427] ehci_hcd :00:1d.7: wake-up capability enabled by ACPI [17078.232098] uhci_hcd :00:1d.3: wake-up capability enabled by ACPI [17078.232134] uhci_hcd :00:1d.2: wake-up capability enabled by ACPI [17078.232169] uhci_hcd :00:1d.1: wake-up capability enabled by ACPI [17078.232204] uhci_hcd :00:1d.0: wake-up capability enabled by ACPI [17078.232272] PM: late suspend of devices complete after 16.171 msecs [17078.232393] ACPI: Preparing to enter system sleep state S3 [17078.256648] PM: Saving platform NVS memory [17078.257265] Disabling non-boot CPUs ... [17078.257265] ACPI: Low-level resume complete [17078.257265] PM: Restoring platform NVS memory [17078.257265] Force enabled HPET at resume [17078.257265] ACPI: Waking up from system sleep state S3 [17078.300990] snd_hda_intel :00:1b.0: restoring config space at offset 0x1 (was 0x16, writing 0x12) [17078.301030] pcieport :00:1c.0: restoring config space at offset 0x9 (was 0x1fff1, writing 0x7fe17fd1)
Bug#666121: linux-image-3.2.0-1-686-pae: please rate-limit NFS state manager error messages
Package: linux-2.6
Version: 3.2.7-1
Severity: wishlist
Tags: patch

Sometimes, the NFSv4 client's state manager gets into a bad state. This causes an enormous volume of error messages, which can quickly fill up /var in common syslog configurations (i've seen thousands of lines per second).

I brought this up with upstream and Trond Myklebust offered the attached patch to rate-limit the error messages:

  http://thread.gmane.org/gmane.linux.nfs/47832

It was included in Trond's pull request to Linus last Thursday for inclusion upstream:

  http://thread.gmane.org/gmane.linux.nfs/48150

Please consider this patch for inclusion in the debian kernel.

Thanks,

  --dkg

From 9a3ba432330e504ac61ff0043dbdaba7cea0e35a Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.mykleb...@netapp.com>
Date: Mon, 12 Mar 2012 18:01:48 -0400
Subject: [PATCH] NFSv4: Rate limit the state manager warning messages

Prevent the state manager from filling up system logs when recovery
fails on the server.

Signed-off-by: Trond Myklebust <trond.mykleb...@netapp.com>
Cc: sta...@vger.kernel.org
---
 fs/nfs/callback_xdr.c |    4 +++-
 fs/nfs/nfs4proc.c     |    2 +-
 fs/nfs/nfs4state.c    |    4 ++--
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index fd6cfdb..95bfc24 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -9,6 +9,8 @@
 #include <linux/sunrpc/svc.h>
 #include <linux/nfs4.h>
 #include <linux/nfs_fs.h>
+#include <linux/ratelimit.h>
+#include <linux/printk.h>
 #include <linux/slab.h>
 #include <linux/sunrpc/bc_xprt.h>
 #include "nfs4_fs.h"
@@ -167,7 +169,7 @@ static __be32 decode_compound_hdr_arg(struct xdr_stream *xdr, struct cb_compound
 	if (hdr->minorversion <= 1) {
 		hdr->cb_ident = ntohl(*p++); /* ignored by v4.1 */
 	} else {
-		printk(KERN_WARNING "NFS: %s: NFSv4 server callback with "
+		pr_warn_ratelimited("NFS: %s: NFSv4 server callback with "
 			"illegal minor version %u!\n",
 			__func__, hdr->minorversion);
 		return htonl(NFS4ERR_MINOR_VERS_MISMATCH);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 36a7cda..5e0961a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1876,7 +1876,7 @@ static struct nfs4_state *nfs4_do_open(struct inode *dir, struct dentry *dentry,
 		 * the user though...
 		 */
 		if (status == -NFS4ERR_BAD_SEQID) {
-			printk(KERN_WARNING "NFS: v4 server %s "
+			pr_warn_ratelimited("NFS: v4 server %s "
 					"returned a bad sequence-id error!\n",
 					NFS_SERVER(dir)->nfs_client->cl_hostname);
 			exception.retry = 1;
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 7c58607..cb708b2 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -984,7 +984,7 @@ static void nfs_increment_seqid(int status, struct nfs_seqid *seqid)
 		case -NFS4ERR_BAD_SEQID:
 			if (seqid->sequence->flags & NFS_SEQID_CONFIRMED)
 				return;
-			printk(KERN_WARNING "NFS: v4 server returned a bad"
+			pr_warn_ratelimited("NFS: v4 server returned a bad"
 					" sequence-id error on an"
 					" unconfirmed sequence %p!\n",
 					seqid->sequence);
@@ -1840,7 +1840,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
 	} while (atomic_read(&clp->cl_count) > 1);
 	return;
 out_error:
-	printk(KERN_WARNING "NFS: state manager failed on NFSv4 server %s"
+	pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s"
 			" with error %d\n", clp->cl_hostname, -status);
 	nfs4_end_drain_session(clp);
 	nfs4_clear_state_manager_bit(clp);
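For readers who have not met rate-limited logging before, here is a rough userspace sketch of the idea behind the patch above. It is not the kernel's code: the class name RateLimitedLogger and its parameters are made up for illustration, and the defaults (a burst of 10 messages per 5-second window) are only what i understand the kernel's default ratelimit state to be. Messages beyond the burst are counted instead of written, and the count is reported when the next window opens.

```python
import time


class RateLimitedLogger:
    """Illustrative only: pass at most `burst` messages per `interval` seconds."""

    def __init__(self, sink, interval=5.0, burst=10, clock=time.monotonic):
        self.sink = sink            # callable that actually writes a line
        self.interval = interval    # window length in seconds (assumed default)
        self.burst = burst          # messages allowed per window (assumed default)
        self.clock = clock          # injectable so the behaviour can be tested
        self.window_start = None
        self.printed = 0
        self.missed = 0

    def warn(self, message):
        """Write `message` if the budget allows; return True if it was written."""
        now = self.clock()
        if self.window_start is None:
            self.window_start = now
        if now - self.window_start >= self.interval:
            # A new window opens: report what was dropped, then reset.
            if self.missed:
                self.sink(f"{self.missed} messages suppressed")
            self.window_start = now
            self.printed = 0
            self.missed = 0
        if self.printed < self.burst:
            self.printed += 1
            self.sink(message)
            return True
        self.missed += 1
        return False
```

With something like this in front of syslog, a state manager stuck in a failure loop costs a handful of lines per window rather than thousands per second, which is the effect the patch is after.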
Bug#665413: BUG: unable to handle kernel paging request in mark_files_ro
On Fri, 23 Mar 2012 21:38:20 -0500, Jonathan Nieder jrnie...@gmail.com wrote: Daniel Kahn Gillmor wrote: I'm about to try to reboot it again to see if i can get it back to stability under the lenny hypervisor and kernel, but i'll need to do that with the rescue 2.6.32-5-486 image as well, so it's possible that i'll have another backtrace or crash to follow up with in a little bit. Ok, thanks again. I'd suggest blacklisting the i915 module to rule it out as a cause. Rebooted the machine again to 2.6.32-5-486 with i915 blacklisted, and got this crash during boot (even before the handoff to init): Begin: Running /scripts/local-bottom ... done. done. Begin: Running /scripts/init-bottom ... [6.202557] BUG: unable to handle kernel paging request at 04040f7c [6.204010] IP: [c107dc0c] pmd_none_or_clear_bad+0x0/0x27 [6.204010] *pde = [6.204010] Oops: [#1] [6.204010] last sysfs file: /sys/power/resume [6.204010] Modules linked in: ext3 jbd mbcache dm_mod raid1 md_mod sd_mod crc_t10dif ata_generic uhci_hcd tg3 thermal ata_piix libphy 3c59x mii tulip ehci_hcd thermal_sys libata scsi_mod usbcore nls_base [last unloaded: scsi_wait_scan] [6.204010] [6.204010] Pid: 44, comm: udevd Not tainted (2.6.32-5-486 #1) HP d530 SFF(DG784A) [6.204010] EIP: 0060:[c107dc0c] EFLAGS: 00010206 CPU: 0 [6.204010] EIP is at pmd_none_or_clear_bad+0x0/0x27 [6.204010] EAX: 04040f7c EBX: b7c0 ECX: 04040404 EDX: 04040f7c [6.204010] ESI: c321bce0 EDI: b7879000 EBP: f72ca8f0 ESP: f72e7eb4 [6.204010] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [6.204010] Process udevd (pid: 44, ti=f72e6000 task=f72f0820 task.ti=f72e6000) [6.204010] Stack: [6.204010] c107efe7 c1018d00 fffb2000 ec06b067 f72f0820 b7879fff edfe7065 c31d4ec0 [6.204010] 0 f72e7f44 0001 0039cede f72cec00 b787a000 04040f7c [6.204010] 0 04040f7c 0004 c1073d95 fffb21e4 f72cec00 c1338508 [6.204010] Call Trace: [6.204010] [c107efe7] ? unmap_vmas+0x1ba/0x5b4 [6.204010] [c1018d00] ? kmap_atomic_prot+0xbd/0xe0 [6.204010] [c1073d95] ? 
pagevec_lru_add+0xf4/0x102 [6.204010] [c108293e] ? exit_mmap+0x90/0xf9 [6.204010] [c1025ddd] ? jiffies_to_timeval+0x1c/0x33 [6.204010] [c1020c9d] ? mmput+0x32/0x92 [6.204010] [c1023b4a] ? exit_mm+0xaa/0xb1 [6.204010] [c104ac93] ? acct_collect+0x5b/0x109 [6.204010] [c1025146] ? do_exit+0x184/0x579 [6.204010] [c1025589] ? do_group_exit+0x4e/0x71 [6.204010] [c10255bd] ? sys_exit_group+0x11/0x14 [6.204010] [c100312c] ? syscall_call+0x7/0xb [6.204010] Code: 2d c1 68 98 05 2d c1 e8 84 71 1c 00 31 d2 89 d0 8d b6 00 00 00 00 89 44 24 14 89 d8 8b 54 24 14 e8 79 66 f9 ff 90 83 c4 18 5b c3 8b 10 89 c1 b8 01 00 00 00 85 d2 74 19 81 e2 fb 0f 00 00 31 c0 [6.204010] EIP: [c107dc0c] pmd_none_or_clear_bad+0x0/0x27 SS:ESP 0068:f72e7eb4 [6.204010] CR2: 04040f7c [6.406606] ---[ end trace e36674c63db8ef72 ]--- [6.411216] Fixing recursive fault but reboot is needed! done. INIT: version 2.88 booting Starting the hotplug events dispatcher: udevd[8.961493] udev[370]: starting version 164 Any ideas? I think this safely rules out i915 as the cause. Also, i need to retract my claim that it was running the lenny kernel for years until just recently, now that i've inspected the logs from the machine more closely. It looks like it was running as a lenny xen system up until Feb. 17, 2011, at which point it switched to running a squeeze stack based on xen-hypervisor-4.0-i386 (v. 4.0-2) and linux-image-2.6.32-5-xen-686 (v. 2.6.32-30). 
This combination ran on this hardware for over a year (i know, i know) without trouble, and crashed on March 9th with this message: Mar 9 08:04:17 monkey kernel: [1163.489222] BUG: unable to handle kernel paging request at 04247c8b Mar 9 08:04:17 monkey kernel: [1163.489252] IP: [04247c8b] 0x4247c8b Mar 9 08:04:17 monkey kernel: [1163.489273] *pdpt = 01469001 *pde = Mar 9 08:04:17 monkey kernel: [1163.489293] Oops: [#1] SMP Mar 9 08:04:17 monkey kernel: [1163.489310] last sysfs file: /sys/devices/virtual/block/md1/md/mismatch_cnt Mar 9 08:04:17 monkey kernel: [1163.489324] Modules linked in: xt_state iptable_mangle xt_physdev iptable_filter ipt_MASQUERADE ipt_REDIRECT xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables bridge stp xen_evtchn xenfs dummy loop snd_intel8x0 snd_ac97_codec i915 ac97_bus drm_kms_helper snd_pcm snd_timer i2c_i801 drm snd pl2303 i2c_algo_bit soundcore evdev parport_pc pcspkr psmouse processor usbserial parport video acpi_processor serio_raw shpchp rng_core snd_page_alloc i2c_core output pci_hotplug button ext3 jbd mbcache dm_mod raid1 md_mod sd_mod crc_t10dif ata_generic tg3 3c59x ata_piix floppy tulip uhci_hcd mii libphy libata ehci_hcd scsi_mod usbcore
Bug#665413: BUG: unable to handle kernel paging request
Package: linux-image-2.6.32-5-486 Version: 2.6.32-41 less than 10 minutes after booting to 2.6.32-5-486 on an HP d530 SFF workstation (model DG784A) with 4GiB of RAM, i got this kernel BUG and then panic: [ 574.852044] BUG: unable to handle kernel paging request at b4777dbf [ 574.856011] IP: [c109520e] mark_files_ro+0x27/0x6f [ 574.856011] *pde = [ 574.856011] Oops: 0002 [#1] [ 574.856011] last sysfs file: /sys/devices/virtual/block/md0/md/metadata_version [ 574.856011] Modules linked in: ext3 jbd mbcache raid1 md_mod dm_crypt dm_mod pl2303 usbserial sd_mod crc_t10dif ata_generic i915 tg3 3c59x drm_kms_helper tulip mii libphy uhci_hcd drm i2c_algo_bit snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer i2c_i801 ata_piix snd soundcore shpchp parport_pc button processor thermal parport libata i2c_core ehci_hcd rng_core snd_page_alloc evdev psmouse serio_raw pcspkr scsi_mod pci_hotplug usbcore nls_base video thermal_sys output [ 574.856011] [ 574.856011] Pid: 6349, comm: dpkg-deb Not tainted (2.6.32-5-486 #1) HP d530 SFF(DG784A) [ 574.856011] EIP: 0060:[c109520e] EFLAGS: 00010282 CPU: 0 [ 574.856011] EIP is at mark_files_ro+0x27/0x6f [ 574.856011] EAX: f3435c85 EBX: c134239c ECX: f3435c00 EDX: f3431800 [ 574.856011] ESI: f3435a80 EDI: 0008 EBP: c13423b4 ESP: f57bff40 [ 574.856011] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [ 574.856011] Process dpkg-deb (pid: 6349, ti=f57be000 task=f570e080 task.ti=f57be000) [ 574.856011] Stack: [ 574.856011] c105b8ef f57bff60 c1342278 0085 c1408a1c c140881c f3436580 c140841c [ 574.856011] 0 f57bff60 0046 0009 0024 c1407b20 c105b9be c1026dfa 0001 [ 574.856011] 0 000a 0100 0046 0010 092b0282 bfa5f828 c1026ed1 0010 [ 574.856011] Call Trace: [ 574.856011] [c105b8ef] ? __rcu_process_callbacks+0x292/0x352 [ 574.856011] [c105b9be] ? rcu_process_callbacks+0xf/0x1f [ 574.856011] [c1026dfa] ? __do_softirq+0x8e/0x135 [ 574.856011] [c1026ed1] ? do_softirq+0x30/0x3b [ 574.856011] [c1026f94] ? irq_exit+0x25/0x53 [ 574.856011] [c100e963] ? 
smp_apic_timer_interrupt+0x60/0x68 [ 574.856011] [c10037f1] ? apic_timer_interrupt+0x31/0x40 [ 574.856011] Code: f0 00 00 c3 57 56 89 c6 53 8d 78 74 8b 56 74 eb 54 8b 42 0c 8b 40 0c 0f b7 40 6e 25 00 f0 00 00 3d 00 80 00 00 75 3c 8b 42 14 85 c0 74 35 8b 42 1c a8 02 74 2e 8b 5a 08 83 e0 fd 89 42 1c 85 db [ 574.856011] EIP: [c109520e] mark_files_ro+0x27/0x6f SS:ESP 0068:f57bff40 [ 574.856011] CR2: b4777dbf [ 575.058358] ---[ end trace 31af091d3864bfb9 ]--- [ 575.062966] Kernel panic - not syncing: Fatal exception in interrupt [ 575.069310] Pid: 6349, comm: dpkg-deb Tainted: G D2.6.32-5-486 #1 [ 575.076170] Call Trace: [ 575.078607] [c1244ccb] ? panic+0x38/0xde [ 575.082790] [c1246a1c] ? oops_end+0x81/0x8d [ 575.087227] [c10151c8] ? no_context+0x104/0x10d [ 575.092011] [c1015318] ? __bad_area_nosemaphore+0x147/0x152 [ 575.097835] [c10a21b8] ? touch_atime+0x69/0xd9 [ 575.102530] [c109a309] ? pipe_read+0x32c/0x33b [ 575.107228] [c10069b9] ? sched_clock+0x5/0x7 [ 575.111754] [c10376eb] ? sched_clock_local+0x15/0x11c [ 575.117057] [c1247851] ? do_page_fault+0x0/0x26a [ 575.121927] [c101532d] ? bad_area_nosemaphore+0xa/0xc [ 575.127230] [c124612b] ? error_code+0x6b/0x70 [ 575.131840] [c109520e] ? mark_files_ro+0x27/0x6f [ 575.136709] [c105b8ef] ? __rcu_process_callbacks+0x292/0x352 [ 575.142618] [c105b9be] ? rcu_process_callbacks+0xf/0x1f [ 575.148095] [c1026dfa] ? __do_softirq+0x8e/0x135 [ 575.152964] [c1026ed1] ? do_softirq+0x30/0x3b [ 575.157574] [c1026f94] ? irq_exit+0x25/0x53 [ 575.162010] [c100e963] ? smp_apic_timer_interrupt+0x60/0x68 [ 575.167833] [c10037f1] ? apic_timer_interrupt+0x31/0x40 I normally don't use the -486 variant on this machine (due to the 4GiB of RAM), but i've been having other trouble with the machine (to be reported in due course), and i had booted into a -486 as a fallback. The only kernel boot parameter on this run was console=ttyS0,115200n8 -- i brought up the rest of the system by hand during this fallback attempt. 
I've run memtest86+ on this machine and the memory shows no errors in that program. Let me know if there are other details i can report that would help with this bug report; sorry i'm not able to get the machine to a stable point yet to run reportbug on it directly. --dkg -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/874ntfgcsj@pip.fifthhorseman.net
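To compare oopses like the one above across boots, it can help to reduce each trace to its list of symbols. The following is a minimal sketch, assuming frames look like "[c109520e] mark_files_ro+0x27/0x6f" (with or without the angle brackets around the address, and with an optional "?" marking an unreliable frame). The names Frame and parse_frames are invented for this example.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    address: int    # address printed in brackets
    symbol: str     # function name
    offset: int     # offset into the function
    size: int       # size of the function
    reliable: bool  # False when the kernel printed "?" before the symbol


# Timestamps such as "[ 574.856011]" do not match: they are not 8+ hex digits.
FRAME_RE = re.compile(
    r"\[<?(?P<addr>[0-9a-f]{8,16})>?\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<sym>[A-Za-z_][A-Za-z0-9_.]*)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text: str) -> List[Frame]:
    """Return every symbol+offset/size frame found in an oops, in order."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append(
            Frame(
                address=int(m.group("addr"), 16),
                symbol=m.group("sym"),
                offset=int(m.group("off"), 16),
                size=int(m.group("size"), 16),
                reliable=m.group("unreliable") is None,
            )
        )
    return frames
```

Two crashes can then be compared by checking whether `[f.symbol for f in parse_frames(a)]` equals the same list for `b`, which ignores timestamps and register contents.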
Bug#665413: BUG: unable to handle kernel paging request in mark_files_ro
On Fri, 23 Mar 2012 19:19:58 -0500, Jonathan Nieder jrnie...@gmail.com wrote: Daniel Kahn Gillmor wrote: [ 574.852044] BUG: unable to handle kernel paging request at b4777dbf [ 574.856011] IP: [c109520e] mark_files_ro+0x27/0x6f [ 574.856011] *pde = [ 574.856011] Oops: 0002 [#1] [ 574.856011] last sysfs file: /sys/devices/virtual/block/md0/md/metadata_version [ 574.856011] Modules linked in: ext3 jbd mbcache raid1 md_mod dm_crypt dm_mod pl2303 usbserial sd_mod crc_t10dif ata_generic i915 tg3 3c59x drm_kms_helper tulip mii libphy uhci_hcd drm i2c_algo_bit snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer i2c_i801 ata_piix snd soundcore shpchp parport_pc button processor thermal parport libata i2c_core ehci_hcd rng_core snd_page_alloc evdev psmouse serio_raw pcspkr scsi_mod pci_hotplug usbcore nls_base video thermal_sys output [ 574.856011] [ 574.856011] Pid: 6349, comm: dpkg-deb Not tainted (2.6.32-5-486 #1) HP d530 SFF(DG784A) [...] [ 574.856011] Call Trace: [ 574.856011] [c105b8ef] ? __rcu_process_callbacks+0x292/0x352 [ 574.856011] [c105b9be] ? rcu_process_callbacks+0xf/0x1f [ 574.856011] [c1026dfa] ? __do_softirq+0x8e/0x135 [ 574.856011] [c1026ed1] ? do_softirq+0x30/0x3b [ 574.856011] [c1026f94] ? irq_exit+0x25/0x53 [ 574.856011] [c100e963] ? smp_apic_timer_interrupt+0x60/0x68 [ 574.856011] [c10037f1] ? apic_timer_interrupt+0x31/0x40 [ 574.856011] Code: f0 00 00 c3 57 56 89 c6 53 8d 78 74 8b 56 74 eb 54 8b 42 0c 8b 40 0c 0f b7 40 6e 25 00 f0 00 00 3d 00 80 00 00 75 3c 8b 42 14 85 c0 74 35 8b 42 1c a8 02 74 2e 8b 5a 08 83 e0 fd 89 42 1c 85 db Is this reproducible? Is the IP and backtrace the same each time? Alas, no, it's not strictly reproducible. With this same kernel, i've also gotten machine freezes (no console output, hard reset required), and also this CPU lockup: [ 5297.844002] BUG: soft lockup - CPU#0 stuck for 61s! 
[apt-get:914] [ 5297.844002] Modules linked in: ext3 jbd mbcache raid1 md_mod dm_crypt dm_mod sd_mod crc_t10dif pl2303 usbserial ata_generic i915 drm_kms_helper snd_intel8x0 drm snd_ac97_codec ac97_bus i2c_algo_bit tg3 3c59x snd_pcm mii libphy tulip snd_timer snd i2c_i801 soundcore ata_piix uhci_hcd ehci_hcd shpchp parport_pc video floppy parport pcspkr processor snd_page_alloc thermal button i2c_core libata evdev psmouse serio_raw scsi_mod rng_core pci_hotplug usbcore nls_base thermal_sys output [ 5297.844002] [ 5297.844002] Pid: 914, comm: apt-get Not tainted (2.6.32-5-486 #1) HP d530 SFF(DG784A) [ 5297.844002] EIP: 0060:[f9123aa2] EFLAGS: 0246 CPU: 0 [ 5297.844002] EIP is at walk_page_buffers+0x1a/0x65 [ext3] [ 5297.844002] EAX: EBX: ECX: EDX: f3c2a7c0 [ 5297.844002] ESI: f3c2a7c0 EDI: EBP: f3c2a7c0 ESP: c59a3e14 [ 5297.844002] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068 [ 5297.844002] CR0: 8005003b CR2: b7696cbb CR3: 31372000 CR4: 0690 [ 5297.844002] DR0: DR1: DR2: DR3: [ 5297.844002] DR6: 0ff0 DR7: 0400 [ 5297.844002] Call Trace: [ 5297.844002] [f9124e49] ? ext3_ordered_writepage+0x74/0x13c [ext3] [ 5297.844002] [f9123aff] ? buffer_unmapped+0x0/0xc [ext3] [ 5297.844002] [c10723cb] ? __writepage+0x8/0x20 [ 5297.844002] [c1072957] ? write_cache_pages+0x1b2/0x2a2 [ 5297.844002] [c10723c3] ? __writepage+0x0/0x20 [ 5297.844002] [c1072a61] ? generic_writepages+0x1a/0x21 [ 5297.844002] [c106e549] ? __filemap_fdatawrite_range+0x63/0x6e [ 5297.844002] [c106e585] ? filemap_write_and_wait_range+0x31/0x67 [ 5297.844002] [c10ac0e8] ? vfs_fsync_range+0x4b/0x85 [ 5297.844002] [c10ac189] ? vfs_fsync+0x11/0x15 [ 5297.844002] [c1085259] ? sys_msync+0x101/0x164 [ 5297.844002] [c1003043] ? sysenter_do_call+0x12/0x28 Unfortunately, the machine is in a remote location, so performing the hard reset is difficult; My ultimate goal is also to use this machine with xen, since it has been running the lenny (and etch before that, iirc) xen kernel and hypervisor for years with no problem. 
Booting the machine into the squeeze xen hypervisor (4.0) and the squeeze xen kernel causes a separate series of errors (not yet reported because i haven't had a chance to formulate them cleanly). Here's an example output of running memtest86+ on the same machine, in case a demonstration that the RAM isn't faulty would be useful (or if you can glean more useful info from it than i can) Memtest86+ v4.10 | Pass 40% ### Pentium 4 (0.13) 2660 MHz | Test 59% ### L1 Cache:8K 20001 MB/s | Test #5 [Block move, 80 moves] L2 Cache: 512K 17387 MB/s | Testing: 188K - 2048M 3808M L3 Cache: None| Pattern: Memory : 3808M 1645 MB/s
Bug#665413: BUG: unable to handle kernel paging request
On Fri, 23 Mar 2012 19:51:37 -0700 (PDT), Will Set debiandu...@yahoo.com wrote:
> the mobo has an 865g chipset.

Yes, i believe that's correct.

> I know of 5 bug reports that confirm using boot parameter - processor.nocst=1 as a workaround for kernels < 2.6.38

Thanks, i will try this the next time i get a chance to restart this machine (it's currently crashed again and i don't have physical access right now).

Do you have a link to a couple of these bug reports so i could read them myself? I'd appreciate it if you do.

> linux-image-2.6.39 through linux-image 3.3.0-rc6-686-pae - on both of my 865g based boxes.

I'm not sure what this sentence means -- is it related to the sentence above? if so, how?

> Also, iirc, the bigmem kernel was swallowed by the 686-pae kernel, which might be a reason for the instability when using 486.

I wouldn't expect the -486 flavor to be able to fully address all 4GiB of RAM (i.e. i doubt it would make use of the physical address extensions). But i don't understand why this would cause instability. My workload during these crashes is definitely not memory-intensive.

This is the first i'm hearing that the -486 flavor would cause instability on highmem machines. Can you point me to some documentation so i could understand why that might be the case?

Thanks for your feedback,

  --dkg
Bug#665413: BUG: unable to handle kernel paging request
On 03/24/2012 12:25 AM, Will Set wrote:
> Friday, March 23, 2012 11:06 PM Daniel Kahn Gillmor wrote:
>> On Fri, 23 Mar 2012 19:51:37 -0700 (PDT), Will Set debiandu...@yahoo.com wrote:
>>> I know of 5 bug reports that confirm using boot parameter - processor.nocst=1 as a workaround for kernels < 2.6.38
>>
>> Thanks, i will try this the next time i get a chance to restart this machine (it's currently crashed again and i don't have physical access right now).
>
> Probably not necessary since you are using 2.6.32 (squeeze) kernels. Normally, kernels older than 2.6.39 don't need the extra boot parameter.

Do you mean that the less-than sign (<) in your earlier remark was meant to be a greater-than sign (>)?

> But 4 GiB of ram is max for the board and although HP has spec for 4 GiB of ram for your model, they also have notes recommending only 2 GiB of Ram, in another doc.
> http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c00072736/c00072736.pdf

i've run this machine with 4GiB of RAM for years now with no trouble until trying to upgrade to squeeze kernels or hypervisors :/

>> This is the first i'm hearing that the -486 flavor would cause instability on highmem machines. Can you point me to some documentation so i could understand why that might be the case?
>
> Here is a link to linux-image-2.6.32-5-686-bigmem
> http://packages.debian.org/squeeze/linux-image-2.6-686-bigmem.

This doesn't provide me with any information that suggests that -486 flavors would cause instability on machines with 4GiB; it only suggests that i want those kernels if i want to make full use of all 4GiB, AFAICT. Did you mean to point me to some other documentation? Am i reading it wrong?

  --dkg
Bug#651558: NFS client initscripts for rpc.svcgssd?
On Sun, 22 Jan 2012 15:21:57 +, ow...@bugs.debian.org (Debian Bug Tracking System) wrote:
> Version: 1:1.2.5-4
> Distribution: unstable
> Urgency: low
> Maintainer: Debian kernel team debian-kernel@lists.debian.org
> Changed-By: Luk Claes l...@debian.org
> Description:
>  nfs-common - NFS support files common to client and server
>  nfs-kernel-server - support for NFS kernel server
> Closes: 651558 651634
> Changes:
>  nfs-utils (1:1.2.5-4) unstable; urgency=low
>  .
> [...]
>  * Move rpc.svcgssd to nfs-common (Closes: #651558).

I'm happy to see rpc.svcgssd moved to nfs-common, but i'm not sure how a client is expected to have that launched and available, given that there is no initscript, configuration, or anything.

What suggestions would you have for clients attempting to ensure that rpc.svcgssd is running and able to receive delegations?

  --dkg
Bug#651558: NFS client initscripts for rpc.svcgssd?
reopen 651558
thanks

On 02/23/2012 01:50 PM, Daniel Kahn Gillmor wrote:
> I'm happy to see rpc.svcgssd moved to nfs-common, but i'm not sure how a client is expected to have that launched and available, given that there is no initscript, configuration, or anything.
>
> What suggestions would you have for clients attempting to ensure that rpc.svcgssd is running and able to receive delegations?

Also, the manual page for rpc.svcgssd should be shipped with the executable. Please move usr/share/man/man8/svcgssd.8.gz and the symlink at rpc.svcgssd.8.gz from nfs-kernel-server to nfs-common.

Thanks!

  --dkg
Bug#660039: rtc no longer available under linux 3.2.4-1
Package: linux-2.6 Version: 3.2.4-1 Subject: rtc no longer available under linux 3.2.4-1 I have an asus eeePC 900. lshw reports it as: description: Notebook product: 900 (90OAM09AB5312111U205Q) vendor: ASUSTeK Computer INC. version: 0704 When it wakes from sleep under 3.2.4-1, the system's clock is off by days, which caused me to look into the real-time clock. Apparently something changed between 3.2.1-1 and 3.2.4-1. When i booted it with 3.2.1-1, the kernel would record the following info about the rtc: [1.505626] rtc_cmos 00:03: RTC can wake from S4 [1.505886] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0 [1.505999] rtc0: alarms up to one month, 114 bytes nvram, hpet irqs [1.517779] rtc_cmos 00:03: setting system clock to 2012-01-22 18:22:12 UTC (1327256532) Booting it with 3.2.4-1, i see this info instead: [1.503620] rtc_cmos 00:03: RTC can wake from S4 [1.503887] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0 [1.503980] rtc_cmos 00:03: only 24-hr supported [1.515880] /build/buildd-linux-2.6_3.2.4-1-i386-61WrTr/linux-2.6-3.2.4/debian/build/source_i386_none/drivers/rtc/hctosys.c: unable to open rtc device (rtc0) Also, trying to talk to the hardware clock now gives me: 0 pip:~# hwclock --show --debug hwclock from util-linux 2.20.1 hwclock: Open of /dev/rtc failed: No such file or directory No usable clock interface found. hwclock: Cannot access the Hardware Clock via any known method. 70 pip:~# Whereas before it would report as expected. looking for the cause of the change, i see that: https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.2.2 suggests there's been a change in the rtc code: - commit 36a8176166397d103352670327e1b20d334b5c7d Author: Ben Hutchings b...@decadent.org.uk Date: Tue Jan 10 15:11:02 2012 -0800 drivers/rtc/interface.c: fix alarm rollover when day or month is out-of-range commit e74a8f2edb92cb690b467cea0ab652c509e9f624 upstream. Commit f44f7f96a20a (RTC: Initialize kernel state from RTC) introduced a potential infinite loop. 
If an alarm time contains a wildcard month and an invalid day (> 31), or a wildcard year and an invalid month (>= 12), the loop searching for the next matching date will never terminate. Treat the invalid values as wildcards. Fixes http://bugs.debian.org/646429, http://bugs.debian.org/653331 - however, /usr/share/doc/linux-image-3.2.0-1-686-pae/changelog.Debian.gz suggests that 3.1.8-1 had already introduced the same change by bwh: * rtc: Fix alarm rollover when day or month is out-of-range (Closes: #646429) So i'm not sure what to make of the situation, but i'm happy to provide any additional debugging info that would be useful. Regards, --dkg
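When comparing boot logs like the two quoted in this report, a quick sanity check is whether the kernel managed to set the system clock from the rtc at all, and whether the date it printed agrees with the epoch value next to it. A small sketch, assuming the log wording shown above ("setting system clock to ... UTC (...)" on success, "unable to open rtc device" on failure); the function names are hypothetical.

```python
import re
from datetime import datetime, timezone

CLOCK_RE = re.compile(
    r"setting system clock to "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC \((\d+)\)"
)


def parse_clock_line(line):
    """Return (utc_datetime, epoch, consistent) or None if the line is absent."""
    m = CLOCK_RE.search(line)
    if m is None:
        return None
    stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    stamp = stamp.replace(tzinfo=timezone.utc)
    epoch = int(m.group(2))
    return stamp, epoch, int(stamp.timestamp()) == epoch


def rtc_usable(boot_log):
    """True if the log shows the clock being set and no rtc open failure."""
    if "unable to open rtc device" in boot_log:
        return False
    return parse_clock_line(boot_log) is not None
```

Run over the two excerpts in this report, the first would count as usable and the second would not, which matches what hwclock showed.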
Bug#660039: rtc no longer available under linux 3.2.4-1
On 02/15/2012 06:00 PM, Jonathan Nieder wrote:
> Weird. Reproducible? Does Linus's master behave the same way? Can you bisect?

argh. It looks like this is not the fault of the kernel, so i'm closing this ticket.

I tried rolling back to 3.2.1-1 from snapshot.debian.net:

  linux-image-3.2.0-1-686-pae_3.2.1-1_i386.deb
  SHA1: fb5ca95149378def1b12d4c314af928ab4f8d180

and it turned out that this machine was having the same rtc problems after reboot to this older kernel (and on 3.1.8-2, which i tried as well). So something must have happened to my hardware that randomly coincided with my switching kernels :(

I tried removing power and batteries from the machine, and booting to different kernels, and the rtc still failed. On my sixth reboot, i went into the BIOS setup, manually changed the time of the clock by a little bit, and chose Exit and Save (or whatever its moral equivalent is). That must have reset something in the hardware, because now (rebooting into 3.2.4-1) the rtc is back to working as normal.

Apologies for the false alarm over what appears to be some kind of flakey hardware hiccup.

  --dkg
Bug#657802: nfs-kernel-server: NFSv4 kerberos mount stopped working after upgrade to 6.0.4 point release
On 01/31/2012 02:10 PM, Russ Allbery wrote:
> I personally have never used Kerberized NFS (we're an AFS site), so I'm not really the one to comment on what enctypes NFS requires. I don't track NFS development at all. But if NFS is no longer limited to DES, it's very likely that it now supports the full range of standard Kerberos enctypes, in which case the right thing to do is to just leave off the -e flag completely and let the Kerberos infrastructure use whatever its default configured enctype list is.

Recent versions of the nfs userland (1.2.5 and up, i think) rely on getting a report from the kernel about what enctypes the kernel supports. I think that data is usually reported by the kernel in /proc/fs/nfsd/supported_krb5_enctypes, where the enctypes are identified by number, like so:

  18,17,16,23,3,1,2

note that there has been some talk about moving the location of that file, but i'm not sure whether any decision has been made:

  http://thread.gmane.org/gmane.linux.nfs/40940

  --dkg
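In case it helps anyone reading that numeric list, here is a small sketch that turns it into names. The number-to-name table is from the registered Kerberos encryption type numbers as i understand them, spelled the way MIT Kerberos usually writes them; please treat it as an assumption and check it against your Kerberos documentation. The helper names are invented for this example, and the default path is simply the one mentioned above, which may move.

```python
# Assumed mapping of Kerberos enctype numbers to common names.
KRB5_ENCTYPE_NAMES = {
    1: "des-cbc-crc",
    2: "des-cbc-md4",
    3: "des-cbc-md5",
    16: "des3-cbc-sha1",
    17: "aes128-cts-hmac-sha1-96",
    18: "aes256-cts-hmac-sha1-96",
    23: "arcfour-hmac",
}


def parse_enctypes(text):
    """Parse a comma-separated list such as '18,17,16,23,3,1,2' into ints."""
    return [int(item) for item in text.strip().split(",") if item.strip()]


def describe_enctypes(text):
    """Return a readable name for each enctype number, in the order given."""
    return [
        KRB5_ENCTYPE_NAMES.get(number, f"unknown ({number})")
        for number in parse_enctypes(text)
    ]


def read_kernel_enctypes(path="/proc/fs/nfsd/supported_krb5_enctypes"):
    """Read the kernel's list; the path is the one quoted in this thread."""
    with open(path) as handle:
        return describe_enctypes(handle.read())
```

For the example list above, the first entries come out as the two AES types, so a kernel reporting it is not limited to DES.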
Bug#656911: linux-image-3.2.0-1-686-pae: kernel NULL pointer dereference in vsnprintf
Subject: linux-image-3.2.0-1-686-pae: kernel NULL pointer dereference in vsnprintf Package: linux-2.6 Version: 3.2.1-1 Severity: normal Hi debian kernel team-- i just upgraded to 3.2 from unstable on this Asus EeePC 900. The machine was only up for about 20 minutes (i was already logged in, though), when i got the OOPS recorded below. I'm happy to provide any additional information or to run tests. please let me know what would be useful. Thanks for your work on Linux in Debian! Regards, --dkg -- Package-specific info: ** Version: Linux version 3.2.0-1-686-pae (Debian 3.2.1-1) (b...@decadent.org.uk) (gcc version 4.6.2 (Debian 4.6.2-11) ) #1 SMP Thu Jan 19 10:56:51 UTC 2012 ** Command line: BOOT_IMAGE=/vmlinuz-3.2.0-1-686-pae root=/dev/mapper/vg_pip0-root ro verbose ** Tainted: D (128) * Kernel has oopsed before. ** Kernel log: [ 1202.032255] keyboard: can't emulate rawmode for keycode 240 [ 1202.096235] keyboard: can't emulate rawmode for keycode 240 [ 1202.096256] keyboard: can't emulate rawmode for keycode 240 [ 1202.140272] keyboard: can't emulate rawmode for keycode 240 [ 1202.140294] keyboard: can't emulate rawmode for keycode 240 [ 1202.184250] keyboard: can't emulate rawmode for keycode 240 [ 1202.184271] keyboard: can't emulate rawmode for keycode 240 [ 1225.053589] kbd_keycode: 4 callbacks suppressed [ 1225.053597] keyboard: can't emulate rawmode for keycode 240 [ 1225.053615] keyboard: can't emulate rawmode for keycode 240 [ 1237.556379] keyboard: can't emulate rawmode for keycode 240 [ 1237.556402] keyboard: can't emulate rawmode for keycode 240 [ 1238.036243] keyboard: can't emulate rawmode for keycode 240 [ 1238.036265] keyboard: can't emulate rawmode for keycode 240 [ 1238.096226] keyboard: can't emulate rawmode for keycode 240 [ 1238.096248] keyboard: can't emulate rawmode for keycode 240 [ 1249.237586] keyboard: can't emulate rawmode for keycode 240 [ 1249.237608] keyboard: can't emulate rawmode for keycode 240 [ 1389.225219] keyboard: can't emulate 
rawmode for keycode 240 [ 1389.225242] keyboard: can't emulate rawmode for keycode 240 [ 1389.246101] keyboard: can't emulate rawmode for keycode 240 [ 1389.246123] keyboard: can't emulate rawmode for keycode 240 [ 1389.288280] keyboard: can't emulate rawmode for keycode 240 [ 1389.288302] keyboard: can't emulate rawmode for keycode 240 [ 1389.604610] keyboard: can't emulate rawmode for keycode 240 [ 1389.604632] keyboard: can't emulate rawmode for keycode 240 [ 1389.664405] keyboard: can't emulate rawmode for keycode 240 [ 1389.664429] keyboard: can't emulate rawmode for keycode 240 [ 1399.237568] kbd_keycode: 8 callbacks suppressed [ 1399.237577] keyboard: can't emulate rawmode for keycode 240 [ 1399.237597] keyboard: can't emulate rawmode for keycode 240 [ 1399.309418] keyboard: can't emulate rawmode for keycode 240 [ 1399.309441] keyboard: can't emulate rawmode for keycode 240 [ 1400.396275] keyboard: can't emulate rawmode for keycode 240 [ 1400.396296] keyboard: can't emulate rawmode for keycode 240 [ 1825.436224] keyboard: can't emulate rawmode for keycode 240 [ 1825.436246] keyboard: can't emulate rawmode for keycode 240 [ 1825.584448] keyboard: can't emulate rawmode for keycode 240 [ 1825.584469] keyboard: can't emulate rawmode for keycode 240 [ 1825.628218] keyboard: can't emulate rawmode for keycode 240 [ 1825.628239] keyboard: can't emulate rawmode for keycode 240 [ 1837.029568] keyboard: can't emulate rawmode for keycode 240 [ 1837.029590] keyboard: can't emulate rawmode for keycode 240 [ 2032.796611] keyboard: can't emulate rawmode for keycode 240 [ 2032.796632] keyboard: can't emulate rawmode for keycode 240 [ 2032.844226] keyboard: can't emulate rawmode for keycode 240 [ 2032.844249] keyboard: can't emulate rawmode for keycode 240 [ 2032.920253] keyboard: can't emulate rawmode for keycode 240 [ 2032.920274] keyboard: can't emulate rawmode for keycode 240 [ 2044.293560] keyboard: can't emulate rawmode for keycode 240 [ 2044.293582] keyboard: can't 
emulate rawmode for keycode 240 [ 2146.849187] keyboard: can't emulate rawmode for keycode 240 [ 2146.849209] keyboard: can't emulate rawmode for keycode 240 [ 2146.854246] keyboard: can't emulate rawmode for keycode 240 [ 2146.854268] keyboard: can't emulate rawmode for keycode 240 [ 2146.896355] keyboard: can't emulate rawmode for keycode 240 [ 2146.896377] keyboard: can't emulate rawmode for keycode 240 [ 2147.212237] keyboard: can't emulate rawmode for keycode 240 [ 2147.212259] keyboard: can't emulate rawmode for keycode 240 [ 2158.013555] keyboard: can't emulate rawmode for keycode 240 [ 2158.013577] keyboard: can't emulate rawmode for keycode 240 [ 2158.263290] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2158.263455] IP: [c1161254] vsnprintf+0xb4/0x247 [ 2158.263561] *pdpt = 34bbe001 *pde = [ 2158.263682] Oops: 0002 [#1] SMP [ 2158.263759] Modules linked in: bnep bluetooth crc16 binfmt_misc uinput fuse arc4
Bug#656911: linux-image-3.2.0-1-686-pae: kernel NULL pointer dereference in vsnprintf
Hi Ben--

Thanks for the prompt followup!

On 01/22/2012 11:21 PM, Ben Hutchings wrote:
> On Sun, 2012-01-22 at 15:11 -0500, Daniel Kahn Gillmor wrote:
>
> It looks like we got to the memcpy() in vsnprintf() with str == NULL. Which seems to mean that seq_file is seriously broken. But it hasn't changed between 3.1 and 3.2, so I doubt it's really the source of the problem.

yes, agreed that this seems unlikely.

> Have you seen any more of these? Do you remember doing anything in particular before this crash (aside from running ps)?

well, i'd only just booted the machine and hadn't really done much regular work on it. I'd rebooted it because it had crashed earlier with a horrible graphics malfunction (which left me unable to get any good data for a backtrace), but i haven't seen any filesystem errors.

The graphics had crashed while i was futzing with the wireless on a moving train, though, so the one thing that i had done since this boot was to cycle the rfkill trigger on the machine a couple times, which disabled and enabled the wireless (you can see the ath and wlan0 business earlier in the log). But i hadn't done that in several minutes when it OOPSed. So i'm a bit at a loss.

I subsequently rebooted and tested all 2GiB RAM with memtest86+, and it showed no errors.

I've actually had the same graphics failure since then, but no more OOPSes. i don't know how to gather data to debug the graphics failure, though, or i'd send in a separate report for that one. Maybe i'll carry around a camera to snap a picture of the screen if it happens again.

If you have any other ideas about the OOPS, i'd be happy to investigate them.

  --dkg
Bug#651558: nfs-utils: NFSv4 sec=krb5 clients must install nfs-kernel-server to use rpc.svcgssd to receive delegations
Package: nfs-utils Version: 1.2.5-2 According to J. Bruce Fields on the linux-nfs mailing list [0], NFSv4 clients using any sec=krb5 variant will need to run rpc.svcgssd to receive delegations. On debian, this appears to mean that the clients will need to install nfs-kernel-server, even if they do not intend to act as a server. Should rpc.svcgssd get moved out to the nfs-common package (or, if the fragmentation isn't too much, to its own package)? It doesn't seem like encouraging clients to run nfsd when they have no intention of serving files is a good idea. Another alternative is to consider encouraging NFSv4.1 instead of NFSv4 (apparently the delegations in 4.1 happen over the client-initiated channels instead of establishing new connections back), but this has only been enabled in debian kernels since 3.1. If moving the daemon implementation between packages isn't the right idea, it would at least be good to document what's going on here and what the recommended configuration is for decently-performing cryptographically-secured NFS. I see no mention of the multi-daemon requirement for clients in /usr/share/doc/nfs-common/README.Debian.nfsv4, for example. If i wasn't stumbling my way through this setup myself, i'd offer to write improved documentation, but i'm not in deep enough to know best-practices or advise others at the moment. Thanks for maintaining nfs-utils in debian, --dkg [0] http://thread.gmane.org/gmane.linux.nfs/45498/focus=45502
Bug#651354: nfs-utils: needs build-dep on libgssglue-dev (>= 0.3)
Package: nfs-utils Version: 1.2.5-2 trying to build a backport of nfs-utils 1.2.5-2 on squeeze shows: checking for GSSGLUE... no configure: error: Package requirements (libgssglue >= 0.3) were not met: Requested 'libgssglue >= 0.3' but version of libgssglue is 0.1 Consider adjusting the PKG_CONFIG_PATH environment variable if you installed software in a non-standard prefix. Alternatively, you may set the environment variables GSSGLUE_CFLAGS and GSSGLUE_LIBS to avoid the need to call pkg-config. See the pkg-config man page for more details. make: *** [build-stamp] Error 1 I believe this means that the Build-Depends: on libgssglue-dev needs to be versioned to (>= 0.3). Regards, --dkg
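To illustrate why configure stops here, a minimal sketch of the comparison that pkg-config is effectively making, assuming the requirement is "0.3 or newer" and that squeeze ships libgssglue 0.1 as the configure output reports. The parse_version and satisfies helpers are hypothetical and only handle plain dotted numeric versions; this is not Debian's or pkg-config's real version-comparison algorithm.

```python
def parse_version(text):
    """Split a plain dotted version such as '0.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# squeeze has libgssglue 0.1, while this nfs-utils build asks for 0.3 or newer
print(satisfies("0.1", "0.3"))
print(satisfies("0.3", "0.3"))
```

A versioned Build-Depends makes the packaging tools report the same mismatch before the build starts, instead of partway through configure.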
Bug#648939: whoops!
reopen 648939
fixed 648938 2.67-0.1
thanks

whoops! I closed the wrong bug in the changelog for libio-socket-inet6-perl 2.67-0.1. I apologize for the confusion! --dkg of the off-by-one errors
Bug#631976: kernel BUG when mounting btrfs volume
On 11/24/2011 12:35 AM, Jonathan Nieder wrote: Hi dkg, Ben Hutchings wrote: On Tue, 2011-06-28 at 16:32 -0400, Daniel Kahn Gillmor wrote: I'm seeing a kernel bug when trying to mount a btrfs volume. [...] [ 277.859243] device fsid 79440663a654fc14-ff2c4fbce89e5eb5 devid 1 transid 90154 /dev/sda2 [ 279.295469] [ cut here ] [ 279.298762] kernel BUG at [...]/fs/btrfs/tree-log.c:809! [...] This code was changed in Linux 3.0-rc2 to accept failure of read_one_inode() where it was previously expected (and asserted) always to be successful. Please test the current package in experimental (linux-image-3.0.0-rc5-686-pae etc.). Now I'm in suspense. Were you able to recover the system and test this? alas, i haven't booted that machine in a while, because i don't have a newer kernel handy to boot into on that box. I hope to be able to rectify the newer-kernel-to-boot situation this week, which might let me pop this one off my stack as well. sorry for the suspense! --dkg signature.asc Description: OpenPGP digital signature
Bug#622146: nfs-kernel-server: error Encryption type not permitted
On 11/14/2011 01:19 PM, Russ Allbery wrote: The NFS machinery is going to need to support either arcfour-hmac or aes128, since Windows never supported 3DES, and you don't want to use plain DES any more (and it has to be specifically enabled on the Windows side, if they haven't dropped it entirely now). I'm not sure what enctypes the kernel-level support currently implements. You'll need the kernel from squeeze-backports or later to get enctypes other than des-cbc-crc. I can attest that 2.6.39-3~bpo60+1 works with aes128-cts with SHA-1 HMAC, as long as you're using the nfs-kernel-server from bpo or later. I haven't tried it against a win2k8 kdc, though. --dkg signature.asc Description: OpenPGP digital signature
Bug#648155: linux-image-3.xx nfs4 mount hangs when kerberos ticket expires. Squeeze used give EPERM
On 11/09/2011 08:44 AM, John Hughes wrote: This is a kernel bug not a nfs-common bug. Using the squeeze kernel (2.6.32-5) in place of the current unstable kernel what happens is that attempts to access the nfs4 mounted system get an EPERM instead of hanging and no horrid messages are written to the log. I can confirm that i'm seeing this bug in 2.6.39 (from backports.org) as well. It's particularly bad because if two users are connected to the kerberized mount, and the ticket of one of them expires, access for *both* users ends up hanging this way. Interesting that it's not present in 2.6.32; i haven't been able to test that because 2.6.32 doesn't support any modern encryption types for krb5, and the domain i'm administering is restricted to modern encryption types. I'd be happy to set up test systems to debug this, including trying kernel patches if anyone can suggest something worth trying. --dkg signature.asc Description: OpenPGP digital signature
Bug#636797: followup on debian bug #636797
Bjoern wrote: I just wanted to ask if the attached kernel oops is also related to this issue? I can't tell from your attached png because not enough of the oops is included. It looks like that screenshot is from a virtual machine emulated VGA console. To catch future issues like this, I recommend running virtual machines with a virtual serial console so that their kernel's textmode output can be cleanly recorded and transmitted in full. Regards, --dkg signature.asc Description: OpenPGP digital signature
Bug#622146: This is broken for me.
On 10/24/2011 09:42 AM, Rob Naccarato wrote: supported_enctypes = aes256-cts:normal arcfour-hmac:normal \ des3-hmac-sha1:normal des-cbc-crc:normal des:normal des:v4 des:norealm \ des:onlyrealm des:afs3 aes128-cts:normal Client (khan) attempting to use sec=krb5. root@khan:/# klist -e -k /etc/krb5.keytab Keytab name: WRFILE:/etc/krb5.keytab KVNO Principal -- 2 host/khan.some.domain...@naccy.org (AES-256 CTS mode with 96-bit SHA-1 HMAC) 2 host/khan.some.domain...@naccy.org (ArcFour with HMAC/md5) 2 host/khan.some.domain...@naccy.org (Triple DES cbc mode with HMAC/sha1) 2 host/khan.some.domain...@naccy.org (DES cbc mode with CRC-32) 2 nfs/khan.some.domain...@naccy.org (AES-256 CTS mode with 96-bit SHA-1 HMAC) 2 nfs/khan.some.domain...@naccy.org (ArcFour with HMAC/md5) 2 nfs/khan.some.domain...@naccy.org (Triple DES cbc mode with HMAC/sha1) 2 nfs/khan.some.domain...@naccy.org (DES cbc mode with CRC-32) this appears to have everything *but* aes128-cts:normal, fwiw. My example client has: 0 example:~# klist -e -k /etc/krb5.keytab Keytab name: WRFILE:/etc/krb5.keytab KVNO Principal -- 2 host/example.example@example.org (AES-128 CTS mode with 96-bit SHA-1 HMAC) 0 example:~# /etc/fstab: blackdog:/ /shares nfs4_netdev,auto,sec=krb5,acl 0 0 0 example:~# grep nfs /etc/fstab nfshost:/ /usr/local/data nfs4 sec=krb5p,fsc 0 0 0 example:~# i don't think the fsc is relevant to this discussion -- and i can't imagine that the difference between krb5 and krb5p is the issue. Server (blackdog), with kdc, exporting nfs4, when I attempt to mount the above: Oct 24 09:32:36 blackdog rpc.svcgssd[22979]: ERROR: GSS-API: error in handle_nullreq: gss_accept_sec_context(): GSS_S_FAILURE (Unspecified GSS failure. Minor code may provide more information) - Encryption type not permitted can you show the same klist on blackdog? 
here's what i've got on my server: 0 nfshost:~# klist -e -k /etc/krb5.keytab Keytab name: WRFILE:/etc/krb5.keytab KVNO Principal -- 8 nfs/nfshost.example@example.org (AES-128 CTS mode with 96-bit SHA-1 HMAC) 0 nfshost:~# Both machines, client and server have: linux-image-2.6.39-bpo.2-amd64 nfs-kernel-server 1:1.2.4-1~bpo60+1 you shouldn't need nfs-kernel-server on the client -- what version of nfs-common do you have on the client? Both machines, client and server have in krb5.conf: allow_weak_crypto = true A useful test might be to *reduce* the number of supported_enctypes to a select one or two, then change the keys for the client and the server (and for any user account using krb5 authentication) and re-try. hth, --dkg signature.asc Description: OpenPGP digital signature
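The by-eye comparison of the two keytabs above can be sketched as a small script. This is only an illustration: it assumes the usual `klist -e -k` layout of "KVNO principal (enctype description)" on each key line, and the principals in the sample text are hypothetical stand-ins rather than the real ones from this thread. It reports which enctype descriptions appear in each keytab and which appear in both; it does not claim to decide whether a given mount will work.

```python
import re

# Key lines of `klist -e -k` output look like: "<kvno> <principal> (<enctype>)"
KEY_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((.+)\)\s*$")


def all_enctypes(klist_output):
    """Collect the enctype descriptions of every key listed in the output."""
    found = set()
    for line in klist_output.splitlines():
        match = KEY_LINE.match(line)
        if match:
            found.add(match.group(3))
    return found


# Hypothetical sample output, shaped like the listings quoted in this thread
CLIENT = """\
Keytab name: WRFILE:/etc/krb5.keytab
KVNO Principal
---- ----------------------------------------
   2 nfs/client.example.org@EXAMPLE.ORG (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   2 nfs/client.example.org@EXAMPLE.ORG (ArcFour with HMAC/md5)
"""

SERVER = """\
Keytab name: WRFILE:/etc/krb5.keytab
KVNO Principal
---- ----------------------------------------
   8 nfs/server.example.org@EXAMPLE.ORG (AES-128 CTS mode with 96-bit SHA-1 HMAC)
"""

client_types = all_enctypes(CLIENT)
server_types = all_enctypes(SERVER)
common = client_types & server_types

print("client only:", sorted(client_types - server_types))
print("server only:", sorted(server_types - client_types))
print("in both:", sorted(common))
```

Feeding it the real output from both machines (for example saved with `klist -e -k /etc/krb5.keytab > keytab.txt` on each) makes a mismatch like the missing aes128-cts key easier to spot.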
Bug#622146: This is broken for me.
On 10/24/2011 03:09 PM, Rob Naccarato wrote: Fair enough, I now have this on the client: root@khan:/etc# klist -e -k /etc/krb5.keytab Keytab name: WRFILE:/etc/krb5.keytab KVNO Principal -- 4 nfs/khan.some.domain...@naccy.org (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 host/khan.some.domain...@naccy.org (AES-128 CTS mode with 96-bit SHA-1 HMAC) this looks reasonable to me (funnily, i also have a machine named khan!) I also have this on the server: blackdog:/etc# klist -e -k /etc/krb5.keytab Keytab name: WRFILE:/etc/krb5.keytab KVNO Principal -- 8 host/blackdog.some.domain...@naccy.org (AES-128 CTS mode with 96-bit SHA-1 HMAC) 7 nfs/blackdog.some.domain...@naccy.org (AES-128 CTS mode with 96-bit SHA-1 HMAC) this also looks reasonable to me (there's no need for the kvno to match between the credentials for the two different principals) you shouldn't need nfs-kernel-server on the client -- what version of nfs-common do you have on the client? nfs-common 1:1.2.4-1~bpo60+1 ok, that matches my setup. A useful test might be to *reduce* the number of supported_enctypes to a select one or two, then change the keys for the client and the server (and for any user account using krb5 authentication) and re-try. So, reduce the list to, say, just aes128-cts:normal? Should I also remove the allow_weak_crypto option? yes, that's what i would try -- it appears to be currently working for me. Perhaps someone more experienced with krb5 and nfs than i am can also weigh in with suggestions. Regards, --dkg signature.asc Description: OpenPGP digital signature
Bug#622146: This is broken for me.
On 10/23/2011 02:25 PM, Rob Naccarato wrote: On 11-10-23 01:18 PM, Sam Hartman wrote: "Rob" == Rob Naccarato <r...@naccy.org> writes: This doesn't appear to be fixed to me. I get the same problems. I have even installed backported kernel (2.6.39-bpo.2-amd64) and nfs-utils (1:1.2.4-1~bpo60+1) and I still get these: This requires fixes in krb5 and nfs-utils. krb5 has been fixed, but nothing gets better until the nfs-utils fix. So, nfs-utils 1.2.5, then? When's that suppose to be available? I imagine this is a pretty critical issue for people. It is for me, at least. I'm the current backporter of nfs-utils. I use 1:1.2.4-1~bpo60+1 with the squeeze-backports kernel (nfs server and nfs clients both use these versions) and a squeeze kdc configured with: supported_enctypes = aes128-cts:normal I'm able to use kerberized (sec=krb5p) nfsv4 mounts in this arrangement. Could you clarify how your configuration differs from what i've described above so i could be sure what might need changing? Regards, --dkg
Bug#646412: linux-image-3.0.0-2-powerpc does not load pata_macio from the initramfs, cannot find root filesystem
Package: linux-2.6 Version: 3.0.0-5 Severity: important This powermac G4 cube has been running fine with 2.6.38 + squeeze for a while. I just upgraded to 3.0.0 from sid, and found that booting the machine fails by dropping into an initramfs shell, unable to find the root filesystem. from the initramfs shell, i can work around this by doing: modprobe pata_macio exit at which point, the boot proceeds as usual. It seems to me that this module should be auto-loaded (or at least somehow detected for this particular hardware). If you need me to experiment with different versions of udev or initramfstools, or if there is a patch i can try applying, please let me know. I'd be happy to provide any other sort of debugging information needed. Regards, --dkg -- Package-specific info: ** Version: Linux version 3.0.0-2-powerpc (Debian 3.0.0-5) (b...@decadent.org.uk) (gcc version 4.5.3 (Debian 4.5.3-9) ) #1 Fri Oct 7 21:49:07 UTC 2011 ** Command line: BOOT_IMAGE=/boot/vmlinux-3.0.0-2-powerpc root=UUID=dca56d69-14fe-457b-90bb-95d49692ff60 ro quiet ** Not tainted ** Kernel log: [1.916501] hub 2-0:1.0: 2 ports detected [2.118857] Btrfs loaded [2.154177] device-mapper: uevent: version 1.0.3 [2.155887] device-mapper: ioctl: 4.20.0-ioctl (2011-02-02) initialised: dm-de...@redhat.com [2.223362] usb 1-1: new full speed USB device number 2 using ohci_hcd [2.251705] firewire_core: created device fw0: GUID 003065fffedbbc06, S400 [2.430375] usb 1-1: New USB device found, idVendor=05ac, idProduct=1002 [2.430395] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [2.430411] usb 1-1: Product: Hub in Apple Extended USB Keyboard [2.430424] usb 1-1: Manufacturer: Mitsumi Electric [2.432595] hub 1-1:1.0: USB hub found [2.434406] hub 1-1:1.0: 3 ports detected [2.724381] usb 1-1.1: new full speed USB device number 3 using ohci_hcd [2.834374] usb 1-1.1: New USB device found, idVendor=05ac, idProduct=0204 [2.834393] usb 1-1.1: New USB device strings: Mfr=1, Product=3, SerialNumber=0 
[2.834408] usb 1-1.1: Product: Apple Extended USB Keyboard [2.834421] usb 1-1.1: Manufacturer: Mitsumi Electric [2.894829] input: Mitsumi Electric Apple Extended USB Keyboard as /devices/pci0001:10/0001:10:18.0/usb1/1-1/1-1.1/1-1.1:1.0/input/input1 [2.895467] generic-usb 0003:05AC:0204.0001: input,hidraw0: USB HID v1.10 Keyboard [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:10:18.0-1.1/input0 [2.907924] input: Mitsumi Electric Apple Extended USB Keyboard as /devices/pci0001:10/0001:10:18.0/usb1/1-1/1-1.1/1-1.1:1.1/input/input2 [2.908392] generic-usb 0003:05AC:0204.0002: input,hidraw1: USB HID v1.10 Device [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:10:18.0-1.1/input1 [2.909149] usbcore: registered new interface driver usbhid [2.909163] usbhid: USB HID core driver [2.912457] usb 1-1.2: new low speed USB device number 4 using ohci_hcd [3.025382] usb 1-1.2: New USB device found, idVendor=05ac, idProduct=0306 [3.025404] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [3.025419] usb 1-1.2: Product: Apple Optical USB Mouse [3.025431] usb 1-1.2: Manufacturer: Fujitsu Component [3.034968] input: Fujitsu Component Apple Optical USB Mouse as /devices/pci0001:10/0001:10:18.0/usb1/1-1/1-1.2/1-1.2:1.0/input/input3 [3.035764] generic-usb 0003:05AC:0306.0003: input,hidraw2: USB HID v1.10 Mouse [Fujitsu Component Apple Optical USB Mouse] on usb-0001:10:18.0-1.2/input0 [4.091514] gem 0002:20:0f.0: eth0: Link is up at 100 Mbps, full-duplex [ 48.972281] SCSI subsystem initialized [ 49.007387] libata version 3.00 loaded. 
[ 50.031341] pata-macio 0.0001f000:ata-4: Activating pata-macio chipset KeyLargo ATA-4, Apple bus ID 2 [ 50.035703] scsi0 : pata_macio [ 50.037492] ata1: PATA max UDMA/66 irq 19 [ 51.055334] pata-macio 0.0002:ata-3: Activating pata-macio chipset KeyLargo ATA-3, Apple bus ID 0 [ 51.059623] scsi1 : pata_macio [ 51.059844] ata2: PATA max MWDMA2 irq 20 [ 52.079338] pata-macio 0.00021000:ata-3: Activating pata-macio chipset KeyLargo ATA-3, Apple bus ID 1 [ 52.084186] scsi2 : pata_macio [ 52.084446] ata3: PATA max MWDMA2 irq 21 [ 55.231349] ata1: link is slow to respond, please be patient (ready=0) [ 55.920034] ata1.00: ATA-5: QUANTUM FIREBALLP LM30, A35.0700, max UDMA/66 [ 55.920052] ata1.00: 58633344 sectors, multi 0: LBA [ 55.920086] ata1.01: ATAPI: MATSHITADVD-ROM SR-8186, F213, max UDMA/33 [ 55.935992] ata1.00: configured for UDMA/66 [ 55.951673] ata1.01: configured for UDMA/33 [ 55.953044] scsi 0:0:0:0: Direct-Access ATA QUANTUM FIREBALL A35. PQ: 0 ANSI: 5 [ 55.955601] scsi 0:0:1:0: CD-ROMMATSHITA DVD-ROM SR-8186 F213 PQ: 0 ANSI: 5 [ 55.993058] sd 0:0:0:0: [sda] 58633344 512-byte logical blocks: (30.0 GB/27.9 GiB) [ 55.993359] sd 0:0:0:0: [sda] Write Protect is off [
Bug#646412: linux-image-3.0.0-2-powerpc does not load pata_macio from the initramfs, cannot find root filesystem
On 10/23/2011 08:19 PM, Daniel Kahn Gillmor wrote: from the initramfs shell, i can work around this by doing:

  modprobe pata_macio
  exit

at which point, the boot proceeds as usual. It seems to me that this module should be auto-loaded (or at least somehow detected for this particular hardware). I've further entrenched my workaround by doing:

  echo pata_macio >> /etc/initramfs-tools/modules
  update-initramfs -k $(uname -r) -u

After which, the system boots as normal. Most users won't be able to figure this out as the Right Fix, though, so this probably needs to be detected automatically somehow. Feel free to reassign if you think this bug belongs in udev or initramfs-tools or some other package. Thanks for maintaining the linux kernel in debian, --dkg
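For anyone scripting the same workaround, here is a minimal sketch of adding the module name to the initramfs-tools modules list only when it is not already there, so that running it twice does not duplicate the entry. The ensure_module_listed helper is hypothetical, and the demo writes to a temporary file rather than /etc/initramfs-tools/modules; it assumes the usual format of one module name per line with '#' starting a comment.

```python
import os
import tempfile


def ensure_module_listed(modules_path, module):
    """Append module to the file unless a non-comment line already names it.

    Returns True if the file was changed, False if nothing needed doing.
    """
    try:
        with open(modules_path) as handle:
            content = handle.read()
    except FileNotFoundError:
        content = ""
    for line in content.splitlines():
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == module:
            return False
    with open(modules_path, "a") as handle:
        # Avoid gluing the new entry onto an unterminated last line
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(module + "\n")
    return True


with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "modules")
    print(ensure_module_listed(demo_path, "pata_macio"))  # first call appends
    print(ensure_module_listed(demo_path, "pata_macio"))  # second call changes nothing
```

On a real system the initramfs still has to be regenerated afterwards with update-initramfs, as in the message above.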
Bug#646025: linux-image-3.0.0-2-powerpc: VGA monitor unrecognized from DVI port on RV280 [Radeon 9200] on powerpc
Package: linux-2.6 Version: 3.0.0-5 Severity: normal I'm using an acer AL715 LCD monitor, connected via a full 15-pin VGA cable to the analog pins on the DVI port on this powerpc Mac Mini. the computer doesn't seem to properly detect the monitor, and chooses awkward/low resolution settings instead of the native 1280x1024. It used to give native 1280x1024 both on the virtual terminals and under X11 several weeks or months ago using sid, but unfortunately i don't have records of when it specifically changed :( The monitor is not properly detected under X11, and read-edid doesn't seem to work either: consoleuser@bigpuff:/tmp$ xrandr Screen 0: minimum 320 x 200, current 1152 x 864, maximum 1360 x 1360 DVI-0 disconnected 1152x864+0+0 (normal left inverted right x axis y axis) 0mm x 0mm S-video disconnected (normal left inverted right x axis y axis) 1152x864 (0x55) 81.6MHz h: width 1152 start 1216 end 1336 total 1520 skew0 clock 53.7KHz v: height 864 start 865 end 868 total 895 clock 60.0Hz consoleuser@bigpuff:/tmp$ get-edid | parse-edid Can't find EDID file in /proc/device-tree/pci@f000/ATY,RockHopper2Parent@10/ATY,RockHopper2_A@0 parse-edid: parse-edid version 2.0.0 parse-edid: IO error reading EDID consoleuser@bigpuff:/tmp$ ls -l /proc/device-tree/pci@f000/ATY,RockHopper2Parent@10/ATY,RockHopper2_A@0/ total 0 -r--r--r-- 1 root root 4 Oct 20 12:56 AAPL,gray-page -r--r--r-- 1 root root 4 Oct 20 12:56 address -r--r--r-- 1 root root 10 Oct 20 12:56 character-set -r--r--r-- 1 root root 16 Oct 20 12:56 compatible -r--r--r-- 1 root root 4 Oct 20 12:56 connector-type -r--r--r-- 1 root root 4 Oct 20 12:56 depth -r--r--r-- 1 root root 8 Oct 20 12:56 device_type -r--r--r-- 1 root root 4 Oct 20 12:56 display-type -r--r--r-- 1 root root 4 Oct 20 12:56 height -r--r--r-- 1 root root 0 Oct 20 12:56 iso6429-1983-colors -r--r--r-- 1 root root 4 Oct 20 12:56 linebytes -r--r--r-- 1 root root 0 Oct 20 12:56 linux,boot-display -r--r--r-- 1 root root 0 Oct 20 12:56 linux,opened -r--r--r-- 1 
root root 4 Oct 20 12:56 linux,phandle -r--r--r-- 1 root root 18 Oct 20 12:56 name -r--r--r-- 1 root root 4 Oct 20 12:56 reg -r--r--r-- 1 root root 4 Oct 20 12:56 width consoleuser@bigpuff:/tmp$ I've installed fbset, and i can use fbset 1280x1024-70 from a virtual terminal to get a native resolution instead of 640x480, but this used to happen automatically during boot, and i'm not sure why it doesn't any more. I'm happy to provide any other data that would be useful, and of course would be fine if there is a better package to reassign this to (libdrm? something else?) but i don't know what it would be. Regards, --dkg -- Package-specific info: ** Version: Linux version 3.0.0-2-powerpc (Debian 3.0.0-5) (b...@decadent.org.uk) (gcc version 4.5.3 (Debian 4.5.3-9) ) #1 Fri Oct 7 21:49:07 UTC 2011 ** Command line: BOOT_IMAGE=/vmlinux-3.0.0-2-powerpc root=/dev/mapper/bigpuff-root ro quiet ** Not tainted ** Kernel log: [2.456598] input: USB Keyboard as /devices/pci0001:10/0001:10:1b.0/usb3/3-1/3-1:1.1/input/input2 [2.456744] generic-usb 0003:1241:1603.0002: input,hidraw1: USB HID v1.10 Device [ USB Keyboard] on usb-0001:10:1b.0-1/input1 [2.456794] usbcore: registered new interface driver usbhid [2.456798] usbhid: USB HID core driver [2.572387] usb 4-1: new low speed USB device number 2 using ohci_hcd [2.786591] usb 4-1: New USB device found, idVendor=046d, idProduct=c401 [2.786599] usb 4-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [2.786605] usb 4-1: Product: USB-PS/2 Trackball [2.786610] usb 4-1: Manufacturer: Logitech [2.797297] input: Logitech USB-PS/2 Trackball as /devices/pci0001:10/0001:10:1b.1/usb4/4-1/4-1:1.0/input/input3 [2.797489] generic-usb 0003:046D:C401.0003: input,hidraw2: USB HID v1.00 Mouse [Logitech USB-PS/2 Trackball] on usb-0001:10:1b.1-1/input0 [2.884381] pata-macio 0.0002:ata-3: Activating pata-macio chipset KeyLargo ATA-3, Apple bus ID 0 [2.886502] scsi1 : pata_macio [2.886633] ata2: PATA max MWDMA2 irq 24 [3.058081] sd 0:0:0:0: [sda] 
39070080 512-byte logical blocks: (20.0 GB/18.6 GiB) [3.058191] sd 0:0:0:0: [sda] Write Protect is off [3.058198] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [3.058243] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [3.148302] sda: [mac] sda1 sda2 sda3 sda4 sda5 [3.157998] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray [3.158007] cdrom: Uniform CD-ROM driver Revision: 3.20 [3.158223] sr 0:0:1:0: Attached scsi CD-ROM sr0 [3.168255] sd 0:0:0:0: [sda] Attached SCSI disk [3.181702] sd 0:0:0:0: Attached scsi generic sg0 type 0 [3.182677] sr 0:0:1:0: Attached scsi generic sg1 type 5 [3.897790] device-mapper: uevent: version 1.0.3 [3.898712] device-mapper: ioctl: 4.20.0-ioctl (2011-02-02) initialised: dm-de...@redhat.com [
Bug#637461: nfs-common: it worked for a while, and now again
On 09/30/2011 01:12 AM, yellow wrote: again the station does not want to mount the nfs share at boot/reboot/power on. if I do su , mount /myshare it works what to do to test and check what is going on? Could you show the contents of /etc/fstab ? Do you have logs of your system's boot that show what is happening when the other filesystems are mounted (and possibly error messages from attempts to mount /myshare)? hopefully we can get this working for you, --dkg signature.asc Description: OpenPGP digital signature
Bug#636797: patch for WARN_OUT?
On 09/05/2011 11:38 PM, Ben Hutchings wrote: The code dump actually corresponds to this line in update_sg_lb_stats(), which has been compiled inline with find_busiest_group(): sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; OK, i'm happy to take your word for it. :) But i'd also really like to be able to make these inferences myself for next time i run into something like this. I don't see how to get to this conclusion from the backtrace+code dump. Do you have pointers to a doc or two that might help me make more sense of these backtraces+code dumps on my own in the future? Thanks, --dkg
Bug#639691: updating build-deps for nfs-utils to ease backporting to squeeze
On 08/30/2011 01:25 AM, Luk Claes wrote: On 08/30/2011 06:16 AM, Daniel Kahn Gillmor wrote: I concur with Sean Finney that nfs-utils should Build-Depend on libnfsidmap-dev >= 0.24 to ease backporting. I'm hoping to prepare nfs-utils 1.2.4 as a backport for squeeze, and it'd be nice to modify the source package as minimally as possible. What features do you need/want from 1.2.4? the version of nfs-utils in squeeze is only capable of using des-cbc-crc kerberos tickets. This is a poor choice for network security. No one setting up a modern system should be using plain DES for anything. from 1:1.2.3-1: - Try to use kernel function to determine supported Kerberos enctypes (258f10f) (Closes: #474037) Taking advantage of this appears to require the kernel from squeeze-backports as well, but i don't think that's an unreasonable tradeoff (and i have verified that it works with 2.6.39-bpo.2, currently in squeeze-backports). --dkg
Bug#636797: patch for WARN_OUT?
On 08/29/2011 02:40 PM, Ben Hutchings wrote: This is what I've added for 2.6.32-36. Any review would be appreciated. Thanks, Ben! Two crashes i have documentation for show the division-by-zero error happening in find_busiest_group, which was patched in the initial diff i submitted, but not in your patch below. Perhaps this is because the divisor there is sds.total_pwr instead of group->cpu_power. My diff also included cleanups to a possible division-by-zero in:

 * update_group_shares_cpu (divisor: sd_rq_weight), and
 * find_busiest_queue (divisor: power)

which are missing in your patch. Here are backtraces showing find_busiest_group as the innermost function at the time of the error: https://support.mayfirst.org/ticket/4423 https://support.mayfirst.org/ticket/4343 Could you cover at least find_busiest_group() in your patch? I'd propose it myself, but your patch introduced me to several C concepts i'm only starting to make sense of (sorry, i'm a kernel n00b), so i don't think my edits would be very useful to you. Thanks very much for sorting this out and for your sensible approach to trying to learn more about the problem rather than just papering it over. Regards, --dkg
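The "warn, then fix up the zero divisor" idea discussed in this thread can be sketched in user space as follows. This is an illustration only, not the kernel patch: the checked_divisor and avg_load helpers are hypothetical, the value 1024 for SCHED_LOAD_SCALE is an assumption, and Python's warnings module stands in for the kernel's WARN machinery. The point is just that the computation carries on with a substitute divisor while leaving a visible trace that something upstream produced a zero.

```python
import warnings

SCHED_LOAD_SCALE = 1024  # assumed value, for illustration only


def checked_divisor(value, name):
    """Return value unchanged, or warn and return 1 if value is zero."""
    if value == 0:
        warnings.warn(
            f"zero divisor {name!r}, using 1 instead",
            RuntimeWarning,
            stacklevel=2,
        )
        return 1
    return value


def avg_load(group_load, cpu_power):
    """Scaled load average of a group, guarding the cpu_power divisor."""
    return (group_load * SCHED_LOAD_SCALE) // checked_divisor(cpu_power, "cpu_power")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(avg_load(2048, 1024))  # normal case, no warning
    print(avg_load(2048, 0))     # zero divisor: warns and divides by 1
    print(len(caught), "warning(s) recorded")
```

The same guard would have to wrap each divisor named above (sds.total_pwr, sd_rq_weight, power) for the warning to cover every crash site.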
Bug#639691: updating build-deps for nfs-utils to ease backporting to squeeze
I concur with Sean Finney that nfs-utils should Build-Depend on libnfsidmap-dev >= 0.24 to ease backporting. I'm hoping to prepare nfs-utils 1.2.4 as a backport for squeeze, and it'd be nice to modify the source package as minimally as possible. I just got the thumbs-up from anibal to backport libnfsidmap-dev 0.24 to squeeze-backports; i've done it successfully for myself (it works fine), so i just need to review the build log to submit it to the backports repo. Once that's done, i'd like to put nfs-utils in the squeeze-backports repo as well, unless the regular maintainers want to do it (or have serious reservations about me doing it myself). Any objections or concerns i should be aware of? Regards, --dkg
Bug#614622: linux-image-2.6.37-1-686: atl2 NIC claims NO CARRIER after suspend/resume; rmmod+insmod fixes the problem
forwarded 614622 https://bugzilla.kernel.org/show_bug.cgi?id=40732
thanks

On 08/08/2011 01:26 PM, Moritz Mühlenhoff wrote: Please report this upstream at http://bugzilla.kernel.org, product Drivers and component Networking. OK, done. Hope this helps, --dkg
Bug#636797: linux-image-2.6.32-5-amd64: avoid divide-by-zero (divide error: 0000) in scheduler
Hi Ben-- Thanks for the quick followup! On 08/07/2011 12:36 PM, Ben Hutchings wrote: On Fri, 2011-08-05 at 18:36 -0400, Daniel Kahn Gillmor wrote: We've applied the attached patch (a simple workaround to ensure no division-by-zero) to the debian packages for several weeks in production (over a month on some machines) and haven't seen a recurrence of the problem. This doesn't really fix the bug - division by zero is just a symptom of a more fundamental problem which has yet to be identified. yep, that's why i called it a workaround :) As a result, it hasn't been accepted upstream and won't be accepted in Debian. That said, I would consider applying a variant that WARNs before 'fixing up' the zero divisor, as a *temporary* measure to aid in understanding the bug (more like https://bugzilla.kernel.org/show_bug.cgi?id=16991#c13). That sounds reasonable to me. Are you up for preparing such a patch or do you need me to do it? I notice your 'oops' messages show 'Tainted: G W' which indicates there was an earlier kernel warning. What was the previous warning? hmm, we've seen this on multiple machines, and they didn't all have a prior warning. in the referenced machine, though, it was 5 months previously, a netdev watchdog timeout. 
It doesn't seem related to me, but i'm happy to include the dump here in case anyone else can extract meaning from it: 2011-01-04_10:28:18.85061 [3129874.324489] [ cut here ] 2011-01-04_10:28:18.89235 [3129874.329286] WARNING: at /build/buildd-linux-2.6_2.6.32-28-amd64-EUJiNq/linux-2.6-2.6.32/debian/build/source_amd64_none/net/sched/sch_generic.c:261 dev_watchdog+0xe2/0x194() 2011-01-04_10:28:18.89236 [3129874.344808] Hardware name: PowerEdge R410 2011-01-04_10:28:18.89237 [3129874.348981] NETDEV WATCHDOG: eth0 (bnx2): transmit queue 1 timed out 2011-01-04_10:28:18.89238 [3129874.355561] Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs ext4 jbd2 crc16 ext2 bridge stp kvm_intel kvm tun loop snd_pcm snd_timer snd soundcore snd_page_alloc dcdbas pcspkr psmouse serio_raw evdev button power_meter processor ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 md_mod sd_mod crc_t10dif sg sr_mod cdrom ata_generic uhci_hcd mpt2sas ehci_hcd thermal ata_piix thermal_sys usbcore nls_base scsi_transport_sas libata scsi_mod bnx2 [last unloaded: scsi_wait_scan] 2011-01-04_10:28:18.89240 [3129874.408913] Pid: 0, comm: swapper Not tainted 2.6.32-5-amd64 #1 2011-01-04_10:28:18.89240 [3129874.415063] Call Trace: 2011-01-04_10:28:18.89242 [3129874.417740] IRQ [81261c12] ? dev_watchdog+0xe2/0x194 2011-01-04_10:28:18.89243 [3129874.424219] [81261c12] ? dev_watchdog+0xe2/0x194 2011-01-04_10:28:18.89244 [3129874.430018] [8104dd6c] ? warn_slowpath_common+0x77/0xa3 2011-01-04_10:28:18.89245 [3129874.436423] [81261b30] ? dev_watchdog+0x0/0x194 2011-01-04_10:28:18.89246 [3129874.442131] [8104ddf4] ? warn_slowpath_fmt+0x51/0x59 2011-01-04_10:28:18.89247 [3129874.448276] [81041b41] ? enqueue_task_fair+0x3e/0x82 2011-01-04_10:28:18.89248 [3129874.454420] [8103fbfa] ? task_rq_lock+0x46/0x79 2011-01-04_10:28:18.89249 [3129874.460132] [8104a252] ? 
try_to_wake_up+0x2a7/0x2b9 2011-01-04_10:28:18.89250 [3129874.466191] [81261b04] ? netif_tx_lock+0x3d/0x69 2011-01-04_10:28:18.89250 [3129874.471989] [8124c97c] ? netdev_drivername+0x3b/0x40 2011-01-04_10:28:18.89251 [3129874.478132] [81261c12] ? dev_watchdog+0xe2/0x194 2011-01-04_10:28:18.89252 [3129874.483930] [8103a9cd] ? __wake_up_common+0x44/0x72 2011-01-04_10:28:18.89253 [3129874.489992] [81057560] ? cascade+0x5f/0x77 2011-01-04_10:28:18.89253 [3129874.495278] [8105a337] ? run_timer_softirq+0x1c9/0x268 2011-01-04_10:28:18.89254 [3129874.501594] [81053aaf] ? __do_softirq+0xdd/0x1a2 2011-01-04_10:28:18.89256 [3129874.507398] [8102419a] ? lapic_next_event+0x18/0x1d 2011-01-04_10:28:18.89256 [3129874.513458] [81011cac] ? call_softirq+0x1c/0x30 2011-01-04_10:28:18.89257 [3129874.519166] [8101322b] ? do_softirq+0x3f/0x7c 2011-01-04_10:28:18.89261 [3129874.524774] [8105391e] ? irq_exit+0x36/0x76 2011-01-04_10:28:19.85162 [3129874.530164] [81024c68] ? smp_apic_timer_interrupt+0x87/0x95 2011-01-04_10:28:19.85163 [3129874.536911] [81011673] ? apic_timer_interrupt+0x13/0x20 2011-01-04_10:29:45.93714 x9d/0xb8 [processor] 2011-01-04_10:29:45.93717 [3129874.551277] [a01c024c] ? acpi_idle_enter_c1+0x78/0xb8 [processor] 2011-01-04_10:29:45.93718 [3129874.558550] [81238f62] ? cpuidle_idle_call+0x94/0xee 2011-01-04_10:29:45.93719 [3129874.564695] [8100feb1] ? cpu_idle+0xa2/0xda hth, --dkg
Bug#636797: linux-image-2.6.32-5-amd64: avoid divide-by-zero (divide error: 0000) in scheduler
Package: linux-2.6
Version: 2.6.32-35
Tags: patch

We've now seen multiple crashes during periods of heavy IO on amd64 architecture machines running 2.6.32-5-amd64 from stock squeeze installs. An example crash [0] yields a backtrace like this:

2011-06-26_12:46:14.63097 [62478.818625] divide error: [#1] SMP
2011-06-26_12:46:14.68003 [62478.822564] last sysfs file: /sys/devices/pci:00/:00:1e.0/:04:03.0/class
2011-06-26_12:46:14.68004 [62478.830287] CPU 0
2011-06-26_12:46:14.68005 [62478.832304] Modules linked in: rng_core btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs ext4 jbd2 crc16 ext2 bridge stp kvm_intel kvm tun loop snd_pcm snd_timer snd soundcore snd_page_alloc dcdbas pcspkr psmouse serio_raw evdev button power_meter processor ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 md_mod sd_mod crc_t10dif sg sr_mod cdrom ata_generic uhci_hcd mpt2sas ehci_hcd thermal ata_piix thermal_sys usbcore nls_base scsi_transport_sas libata scsi_mod bnx2 [last unloaded: scsi_wait_scan]
2011-06-26_12:46:14.68007 [62478.885126] Pid: 32653, comm: kvm Tainted: G W 2.6.32-5-amd64 #1 PowerEdge R410
2011-06-26_12:46:14.68008 [62478.893108] RIP: 0010:[81044d3a] [81044d3a] find_busiest_group+0x3d0/0x876
2011-06-26_12:46:14.68009 [62478.901803] RSP: 0018:8804c5a8ba68 EFLAGS: 00010046
2011-06-26_12:46:14.68010 [62478.907101] RAX: RBX: RCX: 8103a601
2011-06-26_12:46:14.68012 [62478.914219] RDX: RSI: RDI: 0200
2011-06-26_12:46:14.68013 [62478.921334] RBP: 88044e40fd50 R08: R09: 88083e4400b0
2011-06-26_12:46:14.68014 [62478.928449] R10: 880298c3a8b8 R11: a0253fb7 R12: 00015780
2011-06-26_12:46:14.68014 [62478.935565] R13: R14: 0001 R15: 88083e440060
2011-06-26_12:46:14.68015 [62478.942683] FS: 7f995a599700() GS:88044e40() knlGS:
2011-06-26_12:46:14.68016 [62478.950753] CS: 0010 DS: 002b ES: 002b CR0: 80050033
2011-06-26_12:46:14.68017 [62478.956483] CR2: 7f80157a6000 CR3: 000393285000 CR4: 26e0
2011-06-26_12:46:14.68018 [62478.963601] DR0: DR1: DR2:
2011-06-26_12:46:14.68019 [62478.970716] DR3: DR6: 0ff0 DR7: 0400
2011-06-26_12:46:14.68020 [62478.977833] Process kvm (pid: 32653, threadinfo 8804c5a8a000, task 88083e67a350)
2011-06-26_12:46:14.68021 [62478.985901] Stack:
2011-06-26_12:46:14.68021 [62478.987907] 00015788 00015780 0008 00015780
2011-06-26_12:46:14.68022 [62478.995142] 0 00015780 00015780 813cd8a8 8106fde3
2011-06-26_12:46:14.68027 [62479.002834] 0 88001d1e8e10 88044e410108 88044e40f9e0
2011-06-26_12:46:14.68027 [62479.010711] Call Trace:
2011-06-26_12:46:14.68028 [62479.013157] [8106fde3] ? tick_dev_program_event+0x2d/0x95
2011-06-26_12:46:14.68029 [62479.019496] [81067b20] ? __hrtimer_start_range_ns+0x22f/0x242
2011-06-26_12:46:14.68030 [62479.026183] [812f9b40] ? schedule+0x2bd/0x7cb
2011-06-26_12:46:14.68030 [62479.031491] [a02856ec] ? x86_emulate_insn+0x1f08/0x2fc4 [kvm]
2011-06-26_12:46:14.68031 [62479.038184] [a026d858] ? kvm_vcpu_block+0x94/0xb4 [kvm]
2011-06-26_12:46:14.68032 [62479.044349] [81064bee] ? autoremove_wake_function+0x0/0x2e
2011-06-26_12:46:14.68033 [62479.050781] [a0278127] ? kvm_arch_vcpu_ioctl_run+0x80b/0xa44 [kvm]
2011-06-26_12:46:14.68033 [62479.057901] [8104a252] ? try_to_wake_up+0x2a7/0x2b9
2011-06-26_12:46:14.68034 [62479.063719] [8107188f] ? wake_futex+0x31/0x4e
2011-06-26_12:46:14.68035 [62479.069024] [a026a9d1] ? kvm_vcpu_ioctl+0xf1/0x4e6 [kvm]
2011-06-26_12:46:14.68035 [62479.075275] [81067b20] ? __hrtimer_start_range_ns+0x22f/0x242
2011-06-26_12:46:14.68036 [62479.081964] [810fa492] ? vfs_ioctl+0x21/0x6c
2011-06-26_12:46:14.68038 [62479.087176] [810fa9e0] ? do_vfs_ioctl+0x48d/0x4cb
2011-06-26_12:46:14.68039 [62479.092822] [81073c0a] ? sys_futex+0x113/0x131
2011-06-26_12:46:14.68039 [62479.098210] [8451] ? block_llseek+0x75/0x81
2011-06-26_12:46:14.68040 [62479.103681] [810faa6f] ? sys_ioctl+0x51/0x70
2011-06-26_12:46:14.68041 [62479.108894] [81010b42] ? system_call_fastpath+0x16/0x1b
2011-06-26_12:46:14.68042 [62479.115056] Code: bc 24 a0 01 00 00 00 74 10 48 8b 94 24 a0 01 00 00 c7 02 00 00 00 00 eb 65 41 8b 77 08 48 8b 84 24 38 01 00 00 31 d2 48 c1 e0 0a 48 f7 f6 48 8b b4 24 40 01 00 00 48 89 84 24 30 01 00 00 31 c0
2011-06-26_12:46:14.68042 [62479.134577] RIP [81044d3a] find_busiest_group+0x3d0/0x876
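For readers triaging traces like the one above, the `symbol+0xOFFSET/0xSIZE` notation gives the byte offset of the instruction within the named function and the function's total size, optionally followed by the owning module in brackets. The following is a minimal Python sketch for pulling those fields out of one log entry at a time. The helper name `parse_frame` is hypothetical and not part of any kernel tooling, and the pattern assumes the archive-stripped format shown in this report rather than the full `[<address>]` form of a raw oops.

```python
import re

# One trace entry, e.g. "[a026d858] ? kvm_vcpu_block+0x94/0xb4 [kvm]".
# The trailing "[module]" is optional and must be the last thing on the line,
# so a bracketed address at the start of the entry is never mistaken for it.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def parse_frame(line):
    """Return (symbol, offset, size, module) for one trace entry, or None."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
        match.group("module"),
    )


# Entries copied from the report above, one per line.
entries = [
    "RIP [81044d3a] find_busiest_group+0x3d0/0x876",
    "[a026d858] ? kvm_vcpu_block+0x94/0xb4 [kvm]",
    "Call Trace:",
]
for entry in entries:
    frame = parse_frame(entry)
    if frame is not None:
        symbol, offset, size, module = frame
        where = f" (module {module})" if module else ""
        print(f"{symbol}: offset {offset:#x} of {size:#x}{where}")
```

Entries that carry no `symbol+offset/size` field, such as the `Call Trace:` header, are skipped.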
Bug#614622: linux-image-2.6.37-1-686: atl2 NIC claims NO CARRIER after suspend/resume; rmmod+insmod fixes the problem
On 07/29/2011 11:20 AM, Moritz Mühlenhoff wrote:

On Tue, Feb 22, 2011 at 09:06:44PM -0500, Daniel Kahn Gillmor wrote:

I run this machine every day, connect it to multiple wired networks, and have a usage pattern of suspend-to-ram at least twice a day. I never saw this problem until i was running 2.6.37-1. I don't think i was simply lucky with the previous versions.

Does this still occur with more recent kernels, e.g. 3.0?

I'm now running 3.0 (3.0.0-1-686-pae), and i see the same misbehavior. Plugging/unplugging the network cable does not resolve the no-carrier state; power cycling the peer switch does not resolve it. Removing and re-loading atl2.ko does resolve it.

Sorry to be the bearer of bad news :(

--dkg
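The workaround described in this report (removing and re-loading atl2.ko) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names are hypothetical, the interface name passed in is up to the caller, `modprobe -r` followed by `modprobe` stands in for the rmmod+insmod mentioned in the bug title, and the carrier state is read from the `carrier` attribute under `/sys/class/net/<interface>/`. It does not address the underlying driver problem.

```python
import subprocess
from pathlib import Path


def has_carrier(iface, sysfs_root="/sys"):
    """Return True if the kernel reports link carrier on iface."""
    carrier = Path(sysfs_root) / "class" / "net" / iface / "carrier"
    try:
        return carrier.read_text().strip() == "1"
    except OSError:
        # The interface does not exist, or it is administratively down
        # (reading the attribute fails in that case).
        return False


def reload_commands(module):
    """Commands that unload and then re-load a kernel module."""
    return [["modprobe", "-r", module], ["modprobe", module]]


def reload_if_no_carrier(iface, module, sysfs_root="/sys",
                         run=subprocess.check_call):
    """Re-load module when iface has no carrier; return True if it did."""
    if has_carrier(iface, sysfs_root):
        return False
    for command in reload_commands(module):
        run(command)  # needs root privileges when run for real
    return True
```

A call such as `reload_if_no_carrier("eth0", "atl2")` from a post-resume hook is one way this might be used, with `eth0` as a placeholder interface name. Note that the sketch cannot tell a stuck NO CARRIER state from a cable that really is unplugged, so it would also re-load the module in the latter case.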
Bug#632272: debian-kernel-handbook: http://kernel-handbook.alioth.debian.org/ is out-of-date
Package: debian-kernel-handbook
Version: 1.0.10
Severity: wishlist

It looks to me like http://kernel-handbook.alioth.debian.org/ is out-of-date. It says:

version 1.0.9, Tue Nov 23 17:56:29 GMT 2010

but http://kernel-handbook.alioth.debian.org/ch-scope.html says:

The latest released version is always available from http://kernel-handbook.alioth.debian.org.

and

The SGML source of the book may be checked out from the Debian subversion (svn) repository at svn://svn.debian.org/svn/kernel-handbook/trunk/.

and i think it's now under git.

Please update the web-facing version!

Thanks,

--dkg

-- System Information:
Debian Release: wheezy/sid
APT prefers testing
APT policy: (500, 'testing'), (200, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.39-1-686-pae (SMP w/1 CPU core)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

debian-kernel-handbook depends on no packages.

Versions of packages debian-kernel-handbook recommends:
ii chromium [www-brows 12.0.742.91~r87961-1 Chromium browser
ii iceape-browser [www 2.0.14-2 Iceape Navigator (Internet browser
ii iceweasel [www-brow 5.0-1 Web browser based on Firefox
ii konqueror [www-brow 4:4.6.3-1 advanced file manager, web browser
ii links [www-browser] 2.3~pre1-1+b1 Web browser running in text mode

debian-kernel-handbook suggests no packages.

-- no debconf information
Bug#631976: kernel BUG when mounting btrfs volume
Package: linux-image-2.6.32-5-486

I'm seeing a kernel bug when trying to mount a btrfs volume. I see the same bug when i boot with 2.6.39-2-686-pae, fwiw, though i have only been able to transcribe it thus far with 2.6.32-5-486.

This is on standard x86 hardware (a Dell Dimension 4500S desktop machine):

[ 277.859243] device fsid 79440663a654fc14-ff2c4fbce89e5eb5 devid 1 transid 90154 /dev/sda2
[ 279.295469] [ cut here ]
[ 279.298762] kernel BUG at /build/buildd-linux-2.6_2.6.32-30-i386-UYhWt7/linux-2.6-2.6.32/debian/build/source_i386_none/fs/btrfs/tree-log.c:809!
[ 279.298762] invalid opcode: [#1]
[ 279.298762] last sysfs file: /sys/devices/virtual/bdi/btrfs-1/uevent
[ 279.298762] Modules linked in: nls_utf8 nls_cp437 vfat fat usb_storage btrfs zlib_deflate crc32c libcrc32c ftdi_sio usbserial usbhid hid dm_crypt dm_mod sg sr_mod cdrom sd_mod crc_t10dif snd_intel8x0 ata_generic e100 i915 snd_ac97_codec tulip mii ac97_bus drm_kms_helper snd_pcm uhci_hcd snd_timer drm ata_piix i2c_algo_bit evdev snd libata shpchp i2c_i801 soundcore parport_pc video parport button dcdbas processor psmouse serio_raw ehci_hcd thermal pcspkr scsi_mod snd_page_alloc i2c_core rng_core pci_hotplug usbcore nls_base thermal_sys output
[ 279.298762]
[ 279.298762] Pid: 1215, comm: mount Not tainted (2.6.32-5-486 #1) Dimension 4500S
[ 279.298762] EIP: 0060:[f8aadbdf] EFLAGS: 00010246 CPU: 0
[ 279.298762] EIP is at add_inode_ref+0x4d/0x3a2 [btrfs]
[ 279.298762] EAX: EBX: 0097 ECX: c10a21af EDX: 008d
[ 279.298762] ESI: c5049d80 EDI: c5049ee8 EBP: c5651800 ESP: c5453c80
[ 279.298762] DS: 007b ES: 007b FS: GS: 00e0 SS: 0068
[ 279.298762] Process mount (pid: 1215, ti=c5452000 task=c550cc30 task.ti=c5452000)
[ 279.298762] Stack:
[ 279.298762] c5651400 c504e000 c505a56c c5453cb4 0011 c5049ee8 00ac
[ 279.298762] 0 f8a96649 0002 0097 c5049d80 c5049ee8 0002 f8aaeb9b c504b0e0
[ 279.298762] 0 c5049ee8 0002 c5453ce7 c5453d94 c5651400 0009 c504b0e0 c5651800
[ 279.298762] Call Trace:
[ 279.298762] [f8a96649] ? btrfs_item_size+0x93/0x9d [btrfs]
[ 279.298762] [f8aaeb9b] ? replay_one_buffer+0x1c9/0x24d [btrfs]
[ 279.298762] [f8aac688] ? walk_down_log_tree+0x13e/0x3a3 [btrfs]
[ 279.298762] [f8aac95f] ? walk_log_tree+0x72/0x178 [btrfs]
[ 279.298762] [f8aad6fc] ? btrfs_recover_log_trees+0x158/0x23e [btrfs]
[ 279.298762] [f8aae9d2] ? replay_one_buffer+0x0/0x24d [btrfs]
[ 279.298762] [f8a87770] ? open_ctree+0xddf/0x1045 [btrfs]
[ 279.298762] [c1113870] ? strlcpy+0x11/0x3d
[ 279.298762] [f8a712b5] ? btrfs_get_sb+0x22c/0x3c4 [btrfs]
[ 279.298762] [c1091845] ? __kmalloc_track_caller+0x11f/0x12b
[ 279.298762] [c107b3d6] ? kstrdup+0x24/0x3e
[ 279.298762] [c10958d0] ? vfs_kern_mount+0x85/0x11c
[ 279.298762] [c10959a5] ? do_kern_mount+0x2f/0xb8
[ 279.298762] [c10a5ba3] ? do_mount+0x588/0x5de
[ 279.298762] [c1115ebf] ? copy_from_user+0x27/0x10e
[ 279.298762] [c10a5c5f] ? sys_mount+0x66/0x97
[ 279.298762] [c100314c] ? syscall_call+0x7/0xb
[ 279.298762] Code: 44 24 08 b8 fe ff ff ff 83 7c 24 08 00 0f 84 65 03 00 00 8b 44 24 48 8b 10 8b 48 04 89 e8 e8 8e ee ff ff 85 c0 89 44 24 0c 75 04 0f 0b eb fe 6b 44 24 44 19 8d 58 65 8b 44 24 40 89 da e8 5d 8a
[ 279.298762] EIP: [f8aadbdf] add_inode_ref+0x4d/0x3a2 [btrfs] SS:ESP 0068:c5453c80
[ 279.563984] ---[ end trace 161cef534dc1a9ad ]---

Regards,

--dkg
Bug#631976: kernel BUG when mounting btrfs volume
Hi Ben---

On 06/28/2011 10:48 PM, Ben Hutchings wrote:

This code was changed in Linux 3.0-rc2 to accept failure of read_one_inode() where it was previously expected (and asserted) always to be successful. Please test the current package in experimental (linux-image-3.0.0-rc5-686-pae etc.).

Hmm, given that this is the root filesystem on this machine, even installing a new kernel is going to be tough. I'll probably try to make a debirf rescue image with a 3.0-rc kernel and pxe-boot it. It's going to take me a while to get that to happen, though, unfortunately.

Thanks for the quick response! I'm sorry i won't be able to do a quicker turnaround myself to get you an answer.

--dkg
Bug#624343: linux-image-2.6.38-2-amd64: frequent message bio too big device md0 (248 > 240) in kern.log
On 05/01/2011 08:00 PM, Ben Hutchings wrote:

On Sun, 2011-05-01 at 15:06 -0700, Jameson Graef Rollins wrote:

Hi, Ben. Can you explain why this is not expected to work? Which part exactly is not expected to work and why?

Adding another type of disk controller (USB storage versus whatever the SSD interface is) to a RAID that is already in use.

[...]

The normal state of a RAID set is that all disks are online. You have deliberately turned this on its head; the normal state of your RAID set is that one disk is missing. This is such a basic principle that most documentation won't mention it.

This is somewhat worrisome to me. Consider a fileserver with non-hotswap disks. One disk fails in the morning, but the machine is in production use, and the admin's goals are:

* minimize downtime,
* reboot only during off-hours, and
* minimize the amount of time that the array spends de-synced.

A responsible admin might reasonably expect to attach a disk via a well-tested USB or ieee1394 adapter, bring the array back into sync, and announce to the rest of the organization that there will be a scheduled reboot later in the evening. Then, at the scheduled reboot, move the disk from the USB/ieee1394 adapter to the direct ATA interface on the machine.

If this sequence of operations is likely (or even possible) to cause data loss, it should be spelled out in BIG RED LETTERS someplace. I don't think any of the above steps seem unreasonable, and the goals the admin is attempting to meet are certainly commonplace.

The error is that you changed the I/O capabilities of the RAID while it was already in use. But what I was describing as 'correct' was that an error code was returned, rather than the error condition only being logged. If the error condition is not properly propagated then it could lead to data loss.

How is an admin to know which I/O capabilities to check before adding a device to a RAID array? When is it acceptable to mix I/O capabilities? Can a RAID array which is not currently being used as a backing store for a filesystem be assembled of unlike disks? What if it is then (later) used as a backing store for a filesystem?

One of the advantages people tout for in-kernel software raid (over many H/W RAID implementations) is the ability to mix disks, so that you're not reliant on a single vendor during a failure. If this advantage doesn't extend across certain classes of disk, it would be good to be unambiguous about what can be mixed and what cannot.

Regards,

--dkg
Bug#624343: debian #624343 affects debian-installer
affects 624343 debian-installer
thanks

I note that debian-installer happily creates LVM-over-RAID and dmcrypt-over-RAID setups (and lvm-over-dmcrypt-over-RAID setups, for that matter), and provides no warnings to the admin that these RAID setups may not be re-syncable in the face of hardware failure without taking their associated filesystems offline (or obeying some other unnamed constraints).

I've marked this bug as affecting debian-installer because this seems potentially surprising to administrators using d-i to set up their systems.

Perhaps d-i should prefer alternate block device layerings that do not have these constraints?

Regards,

--dkg
Bug#624343: linux-image-2.6.38-2-amd64: frequent message bio too big device md0 (248 > 240) in kern.log
On 05/01/2011 08:22 PM, NeilBrown wrote:

However if there is another layer in between md and the filesystem - such as dm - then there can be problem. There is no mechanism in the kernel for md to tell dm that things have changed, so dm never changes its configuration to match any change in the config of the md device.

A filesystem always queries the config of the device as it prepares the request. As this is not an 'active' query (i.e. it just looks at variables, it doesn't call a function) there is no opportunity for dm to then query md.

Thanks for this followup, Neil.

Just to clarify, it sounds like any one of the following situations on its own is *not* problematic from the kernel's perspective:

0) having a RAID array that is more often in a de-synced state than in an online state.
1) mixing various types of disk in a single RAID array (e.g. SSD and spinning metal)
2) mixing various disk access channels within a single RAID array (e.g. USB and SATA)
3) putting other block device layers (e.g. loopback, dm-crypt, dm (via lvm or otherwise)) above md and below a filesystem
4) hot-adding a device to an active RAID array from which filesystems are mounted.

However, having any layers between md and the filesystem becomes problematic if the array is re-synced while the filesystem is online, because the intermediate layer can't communicate $SOMETHING (what specifically?) from md to the kernel's filesystem code.

As a workaround, would the following sequence of actions (perhaps impossible for any given machine's operational state) allow a RAID re-sync without the errors jrollins reports or requiring a reboot?

a) unmount all filesystems which ultimately derive from the RAID array
b) hot-add the device with mdadm
c) re-mount the filesystems

or would something else need to be done with lvm (or cryptsetup, or the loopback device) between steps b and c?

Coming at it from another angle: is there a way that an admin can ensure that the RAID array can be re-synced without unmounting the filesystems other than limiting themselves to exactly the same models of hardware for all components in the storage chain?

Alternately, is there a way to manually inform a given mounted filesystem that it should change $SOMETHING (what?), so that an aware admin could keep filesystems online by issuing this instruction before a raid re-sync?

From a modular-kernel perspective: Is this specifically a problem with md itself, or would it also be the case with other block-device layering in the kernel? For example, suppose an admin has (without md) lvm over a bare disk, and a filesystem mounted from an LV. The admin then adds a second bare disk as a PV to the VG, and uses pvmove to transfer the physical extents of the active filesystem to the new disk, while mounted. Assuming that the new disk doesn't have the same characteristics (which characteristics?), does the fact that LVM sits between the underlying disk and the filesystem cause the same problem? What if dm-crypt sits between the disk and lvm? Between lvm and the filesystem? What if the layering is disk-dm-md-fs instead of disk-md-dm-fs ?

Sorry for all the questions without having much concrete to contribute at the moment. If these limitations are actually well-documented somewhere, I would be grateful for a pointer. As a systems administrator, i would be unhappy to be caught out by some as-yet-unknown constraints during a hardware failure. I'd like to at least know my constraints beforehand.

Regards,

--dkg
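The message above leaves $SOMETHING unidentified. One candidate an admin could inspect, given the "bio too big" message in the bug title, is the maximum request size that each block device advertises through the `queue/max_sectors_kb` attribute in sysfs. The following Python sketch walks a stacked device downwards through its `slaves` directories and reports lower devices that advertise a smaller limit than the top of the stack. The function names are hypothetical, and whether this limit is the only characteristic that matters is an assumption, not something this thread confirms.

```python
from pathlib import Path


def device_node(dev, sysfs_root="/sys"):
    """Resolve the sysfs directory of a block device or partition."""
    return (Path(sysfs_root) / "class" / "block" / dev).resolve()


def read_limit(dev, name, sysfs_root="/sys"):
    """Read an integer queue attribute of dev, or None if unavailable."""
    node = device_node(dev, sysfs_root)
    if not (node / "queue").is_dir():
        node = node.parent  # a partition uses the queue of its parent disk
    try:
        return int((node / "queue" / name).read_text().strip())
    except (OSError, ValueError):
        return None


def slaves(dev, sysfs_root="/sys"):
    """Names of the devices directly underneath dev (e.g. RAID members)."""
    path = device_node(dev, sysfs_root) / "slaves"
    return sorted(p.name for p in path.iterdir()) if path.is_dir() else []


def walk_stack(dev, sysfs_root="/sys", depth=0):
    """Yield (depth, device, max_sectors_kb) for dev and all lower layers."""
    yield depth, dev, read_limit(dev, "max_sectors_kb", sysfs_root)
    for child in slaves(dev, sysfs_root):
        yield from walk_stack(child, sysfs_root, depth + 1)


def smaller_than_top(dev, sysfs_root="/sys"):
    """Lower devices whose max_sectors_kb is below that of dev itself."""
    entries = list(walk_stack(dev, sysfs_root))
    top = entries[0][2]
    return [
        (name, kb)
        for _, name, kb in entries[1:]
        if top is not None and kb is not None and kb < top
    ]
```

Calling `smaller_than_top("dm-0")` on a dm-over-md stack (with `dm-0` as a placeholder device name) would list any array member or intermediate layer that accepts smaller requests than the device the filesystem sits on. The values are in kilobytes, so a limit of 120 corresponds to 240 sectors of 512 bytes.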
Bug#620835: linux-image-2.6.37-2-686: kernel BUG in intel_tv_detect_type
Package: linux-2.6
Version: 2.6.37-2
Severity: normal

Here's a backtrace of a NULL pointer dereference in 2.6.37-2-686 on a machine with an intel chipset. This machine (an Asus EeePC 900) has no physical TV connector. The machine is regularly suspended to RAM, and gets different external VGA monitors plugged in (and sometimes operates only with the LVDS). At the time of the Oops, the machine was operated with no external monitor.

Apr 1 11:25:39 localhost kernel: imklog 5.7.8, log source = /proc/kmsg started.
Apr 1 11:25:39 localhost kernel: [0.00] Initializing cgroup subsys cpuset
Apr 1 11:25:39 localhost kernel: [0.00] Initializing cgroup subsys cpu
Apr 1 11:25:39 localhost kernel: [0.00] Linux version 2.6.37-2-686 (Debian 2.6.37-2) (b...@decadent.org.uk) (gcc version 4.4.5 (Debian 4.4.5-11) ) #1 SMP Sun Feb 27 10:51:32 UTC 2011
[...]
Apr 1 20:21:53 localhost kernel: [30919.095086] BUG: unable to handle kernel NULL pointer dereference at 0100
Apr 1 20:21:53 localhost kernel: [30919.095094] IP: [f930d019] intel_tv_detect_type+0xa2/0x203 [i915]
Apr 1 20:21:53 localhost kernel: [30919.095125] *pde =
Apr 1 20:21:53 localhost kernel: [30919.095130] Oops: [#1] SMP
Apr 1 20:21:53 localhost kernel: [30919.095135] last sysfs file: /sys/devices/pci:00/:00:02.0/drm/card0/card0-SVIDEO-1/status
Apr 1 20:21:53 localhost kernel: [30919.095142] Modules linked in: arc4 ecb pl2303 usbserial sco bnep l2cap crc16 bluetooth binfmt_misc uinput fuse ath5k ath mac80211 cfg80211 loop snd_hda_codec_realtek joydev snd_hda_intel snd_hda_codec i915 snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm drm_kms_helper snd_seq_midi drm snd_rawmidi snd_seq_midi_event snd_seq i2c_algo_bit uvcvideo snd_timer snd_seq_device snd videodev eeepc_laptop psmouse v4l1_compat sparse_keymap i2c_core rfkill tpm_tis video tpm evdev processor shpchp rng_core tpm_bios ac serio_raw output battery soundcore pci_hotplug power_supply snd_page_alloc button ext3 jbd mbcache sha256_generic aes_i586 aes_generic cbc dm_crypt dm_mod raid1 md_mod usb_storage uas sd_mod crc_t10dif ata_generic ahci libahci ata_piix libata uhci_hcd ehci_hcd scsi_mod usbcore thermal atl2 thermal_sys nls_base [last unloaded: scsi_wait_scan]
Apr 1 20:21:53 localhost kernel: [30919.095238]
Apr 1 20:21:53 localhost kernel: [30919.095243] Pid: 2866, comm: upowerd Not tainted 2.6.37-2-686 #1 ASUSTeK Computer INC. 900/900
Apr 1 20:21:53 localhost kernel: [30919.095253] EIP: 0060:[f930d019] EFLAGS: 00010246 CPU: 0
Apr 1 20:21:53 localhost kernel: [30919.095273] EIP is at intel_tv_detect_type+0xa2/0x203 [i915]
Apr 1 20:21:53 localhost kernel: [30919.095278] EAX: EBX: f7112000 ECX: f9668004 EDX: 00068004
Apr 1 20:21:53 localhost kernel: [30919.095284] ESI: f72fb800 EDI: 000c0c37 EBP: f711229c ESP: f1231e2c
Apr 1 20:21:53 localhost kernel: [30919.095289] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 1 20:21:53 localhost kernel: [30919.095295] Process upowerd (pid: 2866, ti=f123 task=f72d74d0 task.ti=f123)
Apr 1 20:21:53 localhost kernel: [30919.095299] Stack:
Apr 1 20:21:53 localhost kernel: [30919.095302] f9317da4 7000 000c0c30 f72fbc00 f1231e54 f1231f24 f72fb800
Apr 1 20:21:53 localhost kernel: [30919.095312] f930d203 0100 0003 4353544e
Apr 1 20:21:53 localhost kernel: [30919.095321] 30383420 0069
Apr 1 20:21:53 localhost kernel: [30919.095331] Call Trace:
Apr 1 20:21:53 localhost kernel: [30919.095354] [f930d203] ? intel_tv_detect+0x89/0x12d [i915]
Apr 1 20:21:53 localhost kernel: [30919.095376] [f9203cef] ? status_show+0x0/0x2f [drm]
Apr 1 20:21:53 localhost kernel: [30919.095392] [f9203d03] ? status_show+0x14/0x2f [drm]
Apr 1 20:21:53 localhost kernel: [30919.095403] [c11c390a] ? dev_attr_show+0x16/0x32
Apr 1 20:21:53 localhost kernel: [30919.095413] [c10fc020] ? sysfs_read_file+0x8c/0xf5
Apr 1 20:21:53 localhost kernel: [30919.095420] [c10fbf94] ? sysfs_read_file+0x0/0xf5
Apr 1 20:21:53 localhost kernel: [30919.095429] [c10ba3aa] ? vfs_read+0x7c/0xd6
Apr 1 20:21:53 localhost kernel: [30919.095436] [c10b8a34] ? do_sys_open+0xb5/0xbe
Apr 1 20:21:53 localhost kernel: [30919.095443] [c10ba497] ? sys_read+0x3c/0x60
Apr 1 20:21:53 localhost kernel: [30919.095451] [c1002f9f] ? sysenter_do_call+0x12/0x28
Apr 1 20:21:53 localhost kernel: [30919.095455] Code: d8 e8 30 f8 ff ff ba 04 80 06 00 89 d8 8b 4c 24 0c 81 c9 aa 00 00 0f e8 1a f8 ff ff ba 04 80 06 00 89 d8 e8 ea f7 ff ff 8b 46 20 8b 90 00 01 00 00 8b 06 e8 40 11 ff ff b8 14 00 00 00 8b 35 40
Apr 1 20:21:53 localhost kernel: [30919.095500] EIP: [f930d019] intel_tv_detect_type+0xa2/0x203 [i915] SS:ESP 0068:f1231e2c
Apr 1 20:21:53 localhost kernel: [30919.095523] CR2: 0100
Apr 1 20:21:53 localhost kernel: [30919.095528] ---[ end trace 63beda03e83c9f6c ]---
Bug#620835: linux-image-2.6.37-2-686: kernel BUG in intel_tv_detect_type
On 04/04/2011 11:20 AM, Ben Hutchings wrote:

On Tue, Apr 05, 2011 at 05:46:01AM -0400, Daniel Kahn Gillmor wrote:

Package: linux-2.6
Version: 2.6.37-2
[...]

Please test 2.6.38-2.

i'm currently running 2.6.38-2, and i have not yet seen this particular bug (NULL dereference in intel_tv_detect_type).

otoh, I only saw this bug after running with 2.6.37-2 since March 5th (nearly a month). So it's hard for me to say that switching to 2.6.38-2 fixed things.

--dkg
Bug#620374: followup for #620374
Over at https://bugs.freedesktop.org/show_bug.cgi?id=35936 , Chris Wilson asked about the inclusion of commit 29c5a587284195278e233eec5c2234c24fb2c204 in 2.6.38-2.

From looking at the changelogs, i don't think it was included, but i'd appreciate if someone from the kernel team could provide a more authoritative follow up.

Regards,

--dkg
Bug#617377: linux fails to functionally boot under EFI (using grub-efi-amd64)
Package: linux-image-2.6.37-2-amd64

I got grub-efi-amd64 working on a very modern macbook. When i tried to use it to boot linux (using both the squeeze kernel (linux-image-2.6.32-5-amd64) and the unstable kernel), booting the kernel with no parameters resulted in a hung machine with no output on the video console at all.

If i supplied the noefi kernel parameter, the machine would boot, and the console messages would come up, but there would be no keyboard, and i'd get error messages from ehci_hcd and ohci_hcd suggesting that i try setting pci=biosirq. If i set both noefi and pci=biosirq, i continue getting the same errors. (sorry i don't have the exact transcript of the error messages -- i no longer have the machine to copy them down).

I believe modern macbook kbds are connected via USB, so the module failures would explain why the kbd was unresponsive.

Ultimately, i gave up on booting through EFI and booted with emulated BIOS mode. The machine works OK under emulated bios, but it would be nice to avoid the extra layer of cruft if possible.

I found a gentoo discussion of what it took to get the kernel running cleanly under plain EFI on a comparable machine:

https://forums.gentoo.org/viewtopic-t-860544.html

But the reference link (which was actually full of detailed info on Saturday) is now showing an IIS7 welcome graphic :/

http://www.tomjepp.co.uk/?page=gentoo_mbp62

They pointed in particular to this patch for running EFI in physical mode:

https://patchwork.kernel.org/patch/119823/

Sorry i don't have more details at the moment. I can gather more details about the machine from its owner if that would be useful.

--dkg
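When gathering details for a report like this one, it can help to record how a given boot actually started. The Python sketch below checks for the `/sys/firmware/efi` directory, whose presence is the usual indicator that the running kernel is using EFI services, and splits a kernel command line (as found in `/proc/cmdline`) into its parameters so that flags such as `noefi` and `pci=biosirq` can be reported. The function names are hypothetical, and the sample command line is made up for illustration.

```python
from pathlib import Path


def booted_via_efi(sysfs_root="/sys"):
    """Return True if the running kernel exposes EFI data in sysfs."""
    return (Path(sysfs_root) / "firmware" / "efi").is_dir()


def cmdline_params(cmdline):
    """Split a kernel command line into a dict; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


# Illustrative command line only; on a live system one would read
# Path("/proc/cmdline").read_text() instead.
sample = "root=/dev/sda1 noefi pci=biosirq"
params = cmdline_params(sample)
print("noefi set:", "noefi" in params)
print("pci option:", params.get("pci"))
```

This simple split does not handle quoted values containing spaces, which is enough for the two parameters discussed in this report.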
Bug#614622: linux-image-2.6.37-1-686: atl2 NIC claims NO CARRIER after suspend/resume; rmmod+insmod fixes the problem
Package: linux-2.6
Version: 2.6.37-1
Severity: normal

I recently switched from 2.6.37-trunk-686 to 2.6.37-1-686. After the switch, i find that sometimes my atl2.ko-driven onboard NIC persistently claims NO CARRIER after resuming from suspend-to-RAM, even when plugged into a legitimate ethernet port. This is not entirely reliable, but maybe 50% of the time.

If i remove and re-load atl2.ko, the interface can properly detect the ethernet.

I'm happy to help debug this further if there is any information you want me to gather on this hardware. Please let me know.

--dkg

-- Package-specific info:
** Version:
Linux version 2.6.37-1-686 (Debian 2.6.37-1) (b...@decadent.org.uk) (gcc version 4.4.5 (Debian 4.4.5-10) ) #1 SMP Tue Feb 15 18:21:50 UTC 2011

** Command line:
BOOT_IMAGE=/vmlinuz-2.6.37-1-686 root=/dev/mapper/vg_pip0-root ro verbose

** Not tainted

** Kernel log:
[67880.156933] uhci_hcd :00:1d.3: PCI INT D disabled
[67880.156948] uhci_hcd :00:1d.2: PCI INT C disabled
[67880.156962] uhci_hcd :00:1d.1: PCI INT B disabled
[67880.156976] uhci_hcd :00:1d.0: PCI INT A disabled
[67880.157932] i915 :00:02.0: PCI INT A disabled
[67880.260087] HDA Intel :00:1b.0: PCI INT A disabled
[67880.276036] PM: suspend of devices complete after 147.583 msecs
[67880.292274] PM: late suspend of devices complete after 16.227 msecs
[67880.292426] ACPI: Preparing to enter system sleep state S3
[67880.316648] PM: Saving platform NVS memory
[67880.357927] Disabling non-boot CPUs ...
[67880.357927] Back to C!
[67880.357927] PM: Restoring platform NVS memory
[67880.357927] Force enabled HPET at resume
[67880.357927] ACPI: Waking up from system sleep state S3
[67880.400865] HDA Intel :00:1b.0: restoring config space at offset 0x1 (was 0x16, writing 0x12)
[67880.400904] pci :00:1c.0: restoring config space at offset 0x9 (was 0x1fff1, writing 0x3fc13fb1)
[67880.400913] pci :00:1c.0: restoring config space at offset 0x8 (was 0xfff0, writing 0x3fa03f90)
[67880.400923] pci :00:1c.0: restoring config space at offset 0x7 (was 0xf0, writing 0x1010)
[67880.400937] pci :00:1c.0: restoring config space at offset 0x1 (was 0x100104, writing 0x100107)
[67880.400976] pci :00:1c.1: restoring config space at offset 0x9 (was 0x1fff1, writing 0x3fe13fd1)
[67880.400987] pci :00:1c.1: restoring config space at offset 0x7 (was 0xf0, writing 0x2020)
[67880.401001] pci :00:1c.1: restoring config space at offset 0x1 (was 0x100106, writing 0x100107)
[67880.401042] pci :00:1c.2: restoring config space at offset 0x7 (was 0xf0, writing 0x3030)
[67880.401056] pci :00:1c.2: restoring config space at offset 0x1 (was 0x100106, writing 0x100107)
[67880.401099] uhci_hcd :00:1d.0: restoring config space at offset 0x1 (was 0x285, writing 0x281)
[67880.401132] uhci_hcd :00:1d.1: restoring config space at offset 0x1 (was 0x285, writing 0x281)
[67880.401165] uhci_hcd :00:1d.2: restoring config space at offset 0x1 (was 0x285, writing 0x281)
[67880.401197] uhci_hcd :00:1d.3: restoring config space at offset 0x1 (was 0x285, writing 0x281)
[67880.401239] ehci_hcd :00:1d.7: restoring config space at offset 0x1 (was 0x296, writing 0x292)
[67880.401265] pci :00:1e.0: restoring config space at offset 0xf (was 0x6, writing 0x600ff)
[67880.401538] PM: early resume of devices complete after 0.795 msecs
[67880.406036] HDA Intel :00:1b.0: PCI INT A - GSI 16 (level, low) - IRQ 16
[67880.406048] HDA Intel :00:1b.0: setting latency timer to 64
[67880.406093] HDA Intel :00:1b.0: irq 40 for MSI/MSI-X
[67880.406136] pci :00:1c.0: PCI INT A - GSI 16 (level, low) - IRQ 16
[67880.406143] pci :00:1c.0: setting latency timer to 64
[67880.406156] pci :00:1c.1: PCI INT B - GSI 17 (level, low) - IRQ 17
[67880.406163] pci :00:1c.1: setting latency timer to 64
[67880.406176] pci :00:1c.2: PCI INT C - GSI 18 (level, low) - IRQ 18
[67880.406183] pci :00:1c.2: setting latency timer to 64
[67880.406198] uhci_hcd :00:1d.0: PCI INT A - GSI 23 (level, low) - IRQ 23
[67880.406208] uhci_hcd :00:1d.0: setting latency timer to 64
[67880.406234] usb usb2: root hub lost power or was reset
[67880.406253] uhci_hcd :00:1d.1: PCI INT B - GSI 19 (level, low) - IRQ 19
[67880.406262] uhci_hcd :00:1d.1: setting latency timer to 64
[67880.406287] usb usb3: root hub lost power or was reset
[67880.406303] uhci_hcd :00:1d.2: PCI INT C - GSI 18 (level, low) - IRQ 18
[67880.406313] uhci_hcd :00:1d.2: setting latency timer to 64
[67880.406337] usb usb4: root hub lost power or was reset
[67880.406354] uhci_hcd :00:1d.3: PCI INT D - GSI 16 (level, low) - IRQ 16
[67880.406363] uhci_hcd :00:1d.3: setting latency timer to 64
[67880.406387] usb usb5: root hub lost power or was reset
[67880.406405] ehci_hcd :00:1d.7: PCI INT A - GSI 23 (level, low) - IRQ 23
[67880.406415] ehci_hcd :00:1d.7: