Bug#989571: linux-image-5.10.0-0.bpo.3-amd64: Incorrect large USB disk sizing leading to data corruption
Please close. The 17G came from my attempt to blank the drive: for some reason the drive disconnected partway through, so the data ended up in a regular file named sda written in /dev. The loop mount and everything after it followed from that leftover /dev/sda file.

Thanks for pointing me in the right direction, and apologies. I am going to continue investigating why I got the data corruption in the first place, before I tried to blank the drive, but it looks like it may have been a hardware issue with the original USB-to-ATA bridge.

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
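A quick way to spot this kind of leftover is to check what sort of object the device path actually is. The following is a minimal sketch; the helper name node_kind is hypothetical and not part of any tool mentioned in this report.

```shell
# Hypothetical helper: report what kind of filesystem object a path is.
# A real disk node should come back as "block device"; a leftover image
# written while the device node was absent shows up as "regular file".
node_kind() {
    if [ -b "$1" ]; then
        echo "block device"
    elif [ -c "$1" ]; then
        echo "character device"
    elif [ -f "$1" ]; then
        echo "regular file"
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Example invocation (not run here): node_kind /dev/sda
```

If the answer for a disk path is "regular file", anything that sets up a loop device on it is operating on that file, not on the drive.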
Bug#989571: linux-image-5.10.0-0.bpo.3-amd64: Incorrect large USB disk sizing leading to data corruption
Package: src:linux
Version: 5.10.13-1~bpo10+1
Severity: critical
Justification: causes serious data loss

Dear Maintainer,

Large USB drives (for example, a Seagate 4TB Backup) which work perfectly fine with 4.19 are identified with an incorrect size. The 4TB USB drive is identified as a 17GB device and, for some unfathomable reason, mounted as loop. The result is severe data corruption making all 4TB of data on the drive unrecoverable.

Tested with the original USB bridge that came with the drive and after attaching the SATA drive inside to an alternative USB bridge. Same result in both cases.

-- Package-specific info:
** Version:
Linux version 5.10.0-0.bpo.3-amd64 (debian-kernel@lists.debian.org) (gcc-8 (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11)

** Command line:
BOOT_IMAGE=diskless/amd64/vmlinuz-5.10.0-0.bpo.3-amd64 initrd=diskless/amd64/initrd.img-5.10.0-0.bpo.3-amd64 root=/dev/nfs ip=dhcp nfsroot=192.168.3.3:/exports/boot/madding mitigations=off rw --

** Tainted: S (4)
 * SMP kernel oops on an officially SMP incapable processor

** Kernel log:
[754632.929276] nfs: server 192.168.3.3 OK
[754635.600887] rpc_check_timeout: 443 callbacks suppressed
[754635.600889] nfs: server 192.168.3.3 not responding, still trying
[754635.612996] nfs: server 192.168.3.3 not responding, still trying
[754635.625266] nfs: server 192.168.3.3 not responding, still trying
[754635.625462] nfs: server 192.168.3.3 not responding, still trying
[754635.637374] nfs: server 192.168.3.3 not responding, still trying
[754635.649472] nfs: server 192.168.3.3 not responding, still trying
[754635.661739] nfs: server 192.168.3.3 not responding, still trying
[754635.661922] nfs: server 192.168.3.3 not responding, still trying
[754635.673850] nfs: server 192.168.3.3 not responding, still trying
[754635.686131] nfs: server 192.168.3.3 not responding, still trying
[791938.374623] lxc-bridge0: port 3(tap-opsft2-0) entered blocking state
[791938.374628] lxc-bridge0: port 3(tap-opsft2-0) entered forwarding state [791938.374654] lxc-bridge0: port 4(tap-opsft3-0) entered blocking state [791938.374655] lxc-bridge0: port 4(tap-opsft3-0) entered forwarding state [791938.375075] lxc-bridge0: port 2(tap-opsft1-0) entered blocking state [791938.375078] lxc-bridge0: port 2(tap-opsft1-0) entered forwarding state [791938.388241] k8-bridge0: port 2(tap-opsft1-1) entered blocking state [791938.388243] k8-bridge0: port 2(tap-opsft1-1) entered forwarding state [791938.388402] k8-bridge0: port 4(tap-opsft3-1) entered blocking state [791938.388405] k8-bridge0: port 4(tap-opsft3-1) entered forwarding state [791938.388481] k8-bridge0: port 3(tap-opsft2-1) entered blocking state [791938.388484] k8-bridge0: port 3(tap-opsft2-1) entered forwarding state [801076.265404] usb 4-2.4: new SuperSpeed Gen 1 USB device number 5 using xhci_hcd [801076.289933] usb 4-2.4: New USB device found, idVendor=174c, idProduct=55aa, bcdDevice= 1.00 [801076.289937] usb 4-2.4: New USB device strings: Mfr=2, Product=3, SerialNumber=1 [801076.289939] usb 4-2.4: Product: ASM105x [801076.289940] usb 4-2.4: Manufacturer: ASMT [801076.289942] usb 4-2.4: SerialNumber: [801076.291139] scsi host10: uas [801076.291557] scsi 10:0:0:0: Direct-Access ASMT 2115 0 PQ: 0 ANSI: 6 [801076.292065] sd 10:0:0:0: Attached scsi generic sg0 type 0 [801076.292232] sd 10:0:0:0: [sda] Spinning up disk... 
[801077.321342] ..ready [801082.447597] sd 10:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB) [801082.447600] sd 10:0:0:0: [sda] 4096-byte physical blocks [801082.447673] sd 10:0:0:0: [sda] Write Protect is off [801082.447674] sd 10:0:0:0: [sda] Mode Sense: 43 00 00 00 [801082.447832] sd 10:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [801082.448032] sd 10:0:0:0: [sda] Optimal transfer size 33553920 bytes not a multiple of physical block size (4096 bytes) [801082.494646] sd 10:0:0:0: [sda] Attached SCSI disk [801150.687429] loop: module loaded [801150.815997] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) [803002.579925] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0 [803002.579960] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0 [803017.725341] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) [803081.125594] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0 [803081.125635] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0 [803085.522063] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) [803239.336895] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0 [803239.336950] blk_update_request: I/O
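The sd line in the log above reports 7814037168 512-byte logical blocks, and the kernel also complains that the optimal transfer size of 33553920 bytes is not a multiple of the 4096-byte physical block size. Both figures can be cross-checked with plain shell arithmetic. This is only a sketch; the function names are hypothetical.

```shell
# Hypothetical helpers for cross-checking the numbers in the sd log lines.

# Total size in bytes = logical block count * logical block size.
disk_bytes() {
    echo $(( $1 * $2 ))
}

# Remainder of a transfer size divided by the physical block size;
# a non-zero result means the size is not block-aligned.
misalignment() {
    echo $(( $1 % $2 ))
}

disk_bytes 7814037168 512      # size reported for sda
misalignment 33553920 4096     # optimal transfer size vs. physical block
```

The first result is about 4.00 TB in decimal units, so the kernel did read the full capacity from the drive through this bridge.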
Bug#940821: NFS Caching broken in 4.19.37
On 26/02/2021 15:03, Timo Rothenpieler wrote:

I think I can reproduce this, or something that at least looks very similar to this, on 5.10. Namely on 5.10.17 (on both client and server).

I think this is a different issue - see below.

We are running slurm, and for a while now (this coincides with updating from 5.4 to 5.10, but a whole bunch of other stuff was updated at the same time, so it took me a while to correlate this) the logs it writes have been truncated, but only while they're being observed on the client, using tail -f or something like that.

Looks like this then:

On Server:

store01 /srv/export/home/users/timo/TestRun # ls -l slurm-41101.out
-rw-r--r-- 1 timo timo 1931 Feb 26 15:46 slurm-41101.out
store01 /srv/export/home/users/timo/TestRun # wc -l slurm-41101.out
61 slurm-41101.out

On Client:

timo@login01 ~/TestRun $ ls -l slurm-41101.out
-rw-r--r-- 1 timo timo 1931 Feb 26 15:46 slurm-41101.out
timo@login01 ~/TestRun $ wc -l slurm-41101.out
24 slurm-41101.out

See https://gist.github.com/BtbN/b9eb4fc08ccc53bb20087bce0bf9f826 for the respective file contents.

If I run the same test job, wait until it's done, and then look at its slurm.out file, it matches between NFS client and server. If I tail -f the slurm.out on an NFS client, the file stops getting updated on the client, but keeps getting more logs written to it on the NFS server.

The slurm.out file is being written to by another NFS client, which is running on one of the compute nodes of the system. It's being read from a login node.

If these are two different clients, then what you see is possible on NFS with client side caching. If you have multiple clients reading/writing to the same files you usually need to tune the caching options and/or use locking. I suspect that if you leave it for a while (until the cache expires) it will sort itself out.

In my test case it is just one client: it missed a file deletion and nothing short of an unmount and remount fixes that. I have waited for 30 mins+. It does not seem to refresh or expire.

I also see the opposite behavior - the bug shows up on 4.x up to at least 5.4. I do not see it on 5.10.

Brgds,

Timo

On 21.02.2021 16:53, Anton Ivanov wrote:

Client side. This seems to be an entirely client side issue. A variety of kernels on the clients starting from 4.9 and up to 5.10 using 4.19 servers. I have observed it on a 4.9 client versus 4.9 server earlier.

4.9 fails, 4.19 fails, 5.2 fails, 5.4 fails, 5.10 works.

At present the server is at 4.19.67 in all tests.

Linux jain 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

I can set up a couple of alternative servers during the week, but so far everything is pointing towards a client fs cache issue, not a server one.

Brgds,

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#940821: NFS Caching broken in 4.19.37
On 21/02/2021 14:37, Bruce Fields wrote:

On Sun, Feb 21, 2021 at 11:38:51AM +, Anton Ivanov wrote:

On 21/02/2021 09:13, Salvatore Bonaccorso wrote:

On Sat, Feb 20, 2021 at 08:16:26PM +, Chuck Lever wrote:

Confirming you are varying client-side kernels. Should the Linux NFS client maintainers be Cc'd?

Ok, agreed. Let's add them as well. NFS client maintainers, any ideas on how to tackle this?

This is not observed with the Debian backports 5.10 package.

uname -a
Linux madding 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux

I'm still unclear: when you say you tested a certain kernel, are you varying the client-side kernel version, or the server side, or both at once?

Client side. This seems to be an entirely client side issue. A variety of kernels on the clients starting from 4.9 and up to 5.10 using 4.19 servers. I have observed it on a 4.9 client versus 4.9 server earlier.

4.9 fails, 4.19 fails, 5.2 fails, 5.4 fails, 5.10 works.

At present the server is at 4.19.67 in all tests.

Linux jain 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

I can set up a couple of alternative servers during the week, but so far everything is pointing towards a client fs cache issue, not a server one.

Brgds,

--b.

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#940821: NFS Caching broken in 4.19.37
On 21/02/2021 09:13, Salvatore Bonaccorso wrote: Hi, On Sat, Feb 20, 2021 at 08:16:26PM +, Chuck Lever wrote: On Feb 20, 2021, at 3:13 PM, Anton Ivanov wrote: On 20/02/2021 20:04, Salvatore Bonaccorso wrote: Hi, On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote: Hi list, NFS caching appears broken in 4.19.37. The more cores/threads the easier to reproduce. Tested with identical results on Ryzen 1600 and 1600X. 1. Mount an openwrt build tree over NFS v4 2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a loop 3. Result after 3-4 iterations: State on the client ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ State as seen on the server (mounted via nfs from localhost): ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h Actual state on the filesystem: ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h So the client has quite clearly lost the plot. Telling it to drop caches and re-reading the directory shows the file present. It is possible to reproduce this using a linux kernel tree too, just takes much more iterations - 10+ at least. Both client and server run 4.19.37 from Debian buster. This is filed as debian bug 931500. 
I originally thought it to be autofs related, but IMHO it is actually something fundamentally broken in nfs caching resulting in cache corruption.

According to the reporter downstream in Debian, at https://bugs.debian.org/940821#26 this still seems reproducible with more recent kernels than the one initially reported. Is there anything Anton can provide to try to track down the issue?

Anton, can you reproduce with current stable series?

100% reproducible with any kernel from 4.9 to 5.4, stable or backports. It may exist in earlier versions, but I do not have a machine with anything before 4.9 to test at present.

Confirming you are varying client-side kernels. Should the Linux NFS client maintainers be Cc'd?

Ok, agreed. Let's add them as well. NFS client maintainers, any ideas on how to tackle this?

This is not observed with the Debian backports 5.10 package.

uname -a
Linux madding 5.10.0-0.bpo.3-amd64 #1 SMP Debian 5.10.13-1~bpo10+1 (2021-02-11) x86_64 GNU/Linux

I left the testcase running for ~ 4 hours on a 6core/12thread Ryzen. It should have blown up 10 times by now.

So one of the commits between 5.4 and 5.10.13 fixed it. If nobody can think of a particular commit which fixes it, I can try bisecting it during the week.

A.

From 1-2 make clean && make cycles to one afternoon depending on the number of machine cores. The more cores/threads, the faster it does it.

I tried playing with protocol minor versions, caching options, etc - it is still reproducible for any nfs4 settings as long as there is client side caching of metadata.

A.

Regards,
Salvatore

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

--
Chuck Lever

Regards,
Salvatore

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
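The symptom in the original report is a directory whose listing through the NFS mount lacks an entry (ipcbuf.h) that the exported filesystem still has. A check of that kind can be sketched as below. The function name missing_in_client is hypothetical, and the sketch assumes both views of the directory are reachable from the host where it runs, which is not necessarily how the original tests were done.

```shell
# Hypothetical helper: print entries present in the reference directory
# (e.g. the exported filesystem) that are not visible in the client view
# (e.g. the same directory seen through the NFS mount).
missing_in_client() {
    client_dir=$1
    reference_dir=$2
    client_list=$(mktemp) || return 1
    reference_list=$(mktemp) || return 1
    ls -A "$client_dir" | sort > "$client_list"
    ls -A "$reference_dir" | sort > "$reference_list"
    # comm -13 keeps only the lines unique to the second file.
    comm -13 "$client_list" "$reference_list"
    rm -f "$client_list" "$reference_list"
}

# Example invocation (paths are placeholders, not run here):
# missing_in_client "$CLIENT_VIEW" "$REFERENCE_VIEW"
```

An empty result means the two listings agree; any name printed is an entry the client view has lost.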
Bug#940821: NFS Caching broken in 4.19.37
On 20/02/2021 20:04, Salvatore Bonaccorso wrote: Hi, On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote: Hi list, NFS caching appears broken in 4.19.37. The more cores/threads the easier to reproduce. Tested with identical results on Ryzen 1600 and 1600X. 1. Mount an openwrt build tree over NFS v4 2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a loop 3. Result after 3-4 iterations: State on the client ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 8 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ State as seen on the server (mounted via nfs from localhost): ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h Actual state on the filesystem: ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm total 12 drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./ drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../ -rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h So the client has quite clearly lost the plot. Telling it to drop caches and re-reading the directory shows the file present. It is possible to reproduce this using a linux kernel tree too, just takes much more iterations - 10+ at least. Both client and server run 4.19.37 from Debian buster. This is filed as debian bug 931500. I originally thought it to be autofs related, but IMHO it is actually something fundamentally broken in nfs caching resulting in cache corruption. 
According to the reporter downstream in Debian, at https://bugs.debian.org/940821#26 this still seems reproducible with more recent kernels than the one initially reported. Is there anything Anton can provide to try to track down the issue?

Anton, can you reproduce with current stable series?

100% reproducible with any kernel from 4.9 to 5.4, stable or backports. It may exist in earlier versions, but I do not have a machine with anything before 4.9 to test at present.

From 1-2 make clean && make cycles to one afternoon depending on the number of machine cores. The more cores/threads, the faster it does it.

I tried playing with protocol minor versions, caching options, etc - it is still reproducible for any nfs4 settings as long as there is client side caching of metadata.

A.

Regards,
Salvatore

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
Bug#940821: closed by Bastian Blank (No response by submitter)
On 20/02/2021 10:33, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report which was filed against the src:linux package:

#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4

It has been closed by Bastian Blank. Their explanation is attached below along with your original report. If this explanation is unsatisfactory and you have not received a better one in a separate message then please contact Bastian Blank by replying to this email.

I missed the question. It probably hit the spam bucket for some reason.

I am able to reproduce it with more recent versions as well. The most recent one I have around is 5.4.0-0.bpo.2-amd64. Still reproducible 100% - just tested it.

It is trivial to reproduce if anyone actually bothers to do so. Just grab a big enough tree where make runs truly in parallel - openwrt is best, but even the Linux kernel does the job. Mount it via nfs4 from another server (it will work even locally, but takes longer to reproduce - it may take a whole afternoon). Run

while make -j 12 clean && make -j 12 ; do true ; done

and leave it to run. On 6 cores/12 threads it takes 2-3 builds of openwrt or ~ 5-8 linux kernel builds to blow up. More cores - faster. Fewer cores - slower.

I sent it to the mailing list too, but nobody could be bothered to even ask any questions.

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
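When comparing kernels it helps to know how many clean-and-build cycles completed before the first failure. The loop above can be wrapped to count them. This is a sketch only; the function name count_passes is hypothetical.

```shell
# Hypothetical wrapper: run a command repeatedly until it fails, then
# print how many passes completed successfully.
count_passes() {
    passes=0
    while "$@"; do
        passes=$((passes + 1))
    done
    echo "completed passes: $passes"
}

# Example invocation from inside the build tree (not run here):
# count_passes sh -c 'make -j 12 clean && make -j 12'
```

The count is printed after the build output, so it is the last line of the run.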
Bug#945213: Info received (Bug#945213: linux-image-5.2.0-3-amd64: OOM handling broken if hugepages are enabled)
[0.00] Linux version 5.2.0-3-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-22)) #1 SMP Debian 5.2.17-1 (2019-09-26) [0.00] Command line: BOOT_IMAGE=diskless/amd64/vmlinuz-5.2.0-3-amd64 initrd=diskless/amd64/initrd.img-5.2.0-3-amd64 root=/dev/nfs ip=dhcp nfsroot=192.168.3.3:/exports/boot/buster-bess mitigations=off rw -- [0.00] random: get_random_u32 called from bsp_init_amd+0x20b/0x2b0 with crng_init=0 [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x0009e7ff] usable [0.00] BIOS-e820: [mem 0x0009e800-0x0009] reserved [0.00] BIOS-e820: [mem 0x000e-0x000f] reserved [0.00] BIOS-e820: [mem 0x0010-0x9dc43fff] usable [0.00] BIOS-e820: [mem 0x9dc44000-0x9ddc] reserved [0.00] BIOS-e820: [mem 0x9ddd-0x9ddd] ACPI data [0.00] BIOS-e820: [mem 0x9dde-0x9e13bfff] ACPI NVS [0.00] BIOS-e820: [mem 0x9e13c000-0x9e694fff] reserved [0.00] BIOS-e820: [mem 0x9e695000-0x9e695fff] usable [0.00] BIOS-e820: [mem 0x9e696000-0x9e89bfff] ACPI NVS [0.00] BIOS-e820: [mem 0x9e89c000-0x9ecb1fff] usable [0.00] BIOS-e820: [mem 0x9ecb2000-0x9eff3fff] reserved [0.00] BIOS-e820: [mem 0x9eff4000-0x9eff] usable [0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved [0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved [0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved [0.00] BIOS-e820: [mem 0xfed8-0xfed8] reserved [0.00] BIOS-e820: [mem 0xff00-0x] reserved [0.00] BIOS-e820: [mem 0x00011000-0x00015eff] usable [0.00] NX (Execute Disable) protection: active [0.00] SMBIOS 2.7 present. 
[0.00] DMI: System manufacturer System Product Name/F2A55, BIOS 5301 10/10/2012 [0.00] tsc: Fast TSC calibration using PIT [0.00] tsc: Detected 3501.783 MHz processor [0.003478] e820: update [mem 0x-0x0fff] usable ==> reserved [0.003479] e820: remove [mem 0x000a-0x000f] usable [0.003485] last_pfn = 0x15f000 max_arch_pfn = 0x4 [0.003490] MTRR default type: uncachable [0.003490] MTRR fixed ranges enabled: [0.003491] 0-9 write-back [0.003492] A-B write-through [0.003493] C-D2FFF write-protect [0.003494] D3000-E7FFF uncachable [0.003494] E8000-F write-protect [0.003495] MTRR variable ranges enabled: [0.003496] 0 base mask 8000 write-back [0.003497] 1 base 8000 mask E000 write-back [0.003498] 2 base 9F00 mask FF00 uncachable [0.003498] 3 disabled [0.003499] 4 disabled [0.003499] 5 disabled [0.003500] 6 disabled [0.003500] 7 disabled [0.003501] TOM2: 00015f00 aka 5616M [0.003713] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [0.003882] e820: update [mem 0x9f00-0x] usable ==> reserved [0.003887] last_pfn = 0x9f000 max_arch_pfn = 0x4 [0.007940] found SMP MP-table at [mem 0x000fd870-0x000fd87f] [0.030016] Using GB pages for direct mapping [0.030018] BRK [0x133801000, 0x133801fff] PGTABLE [0.030020] BRK [0x133802000, 0x133802fff] PGTABLE [0.030021] BRK [0x133803000, 0x133803fff] PGTABLE [0.030074] BRK [0x133804000, 0x133804fff] PGTABLE [0.030076] BRK [0x133805000, 0x133805fff] PGTABLE [0.030380] BRK [0x133806000, 0x133806fff] PGTABLE [0.030449] BRK [0x133807000, 0x133807fff] PGTABLE [0.030551] BRK [0x133808000, 0x133808fff] PGTABLE [0.030642] BRK [0x133809000, 0x133809fff] PGTABLE [0.030767] BRK [0x13380a000, 0x13380afff] PGTABLE [0.030857] BRK [0x13380b000, 0x13380bfff] PGTABLE [0.030919] BRK [0x13380c000, 0x13380cfff] PGTABLE [0.031040] RAMDISK: [mem 0x7e75-0x7fff] [0.031046] ACPI: Early table checksum verification disabled [0.039448] ACPI: RSDP 0x000F0490 24 (v02 ALASKA) [0.039451] ACPI: XSDT 0x9DDD8078 64 (v01 ALASKA A M I 01072009 AMI 00010013) [0.039457] 
ACPI: FACP 0x9DDDE868 00010C (v05 ALASKA A M I 01072009 AMI 00010013) [0.039461] ACPI BIOS
Bug#945213: linux-image-5.2.0-3-amd64: OOM handling broken if hugepages are enabled
On 22/11/2019 19:32, Ben Hutchings wrote:

Control: reassign -1 src:linux 5.2.17-1
Control: tag -1 moreinfo

On Thu, 2019-11-21 at 08:58 +, Anton Ivanov wrote:

Package: linux-image-5.2.0-3-amd64
Version: 5.2.17+1
Severity: important

Dear Maintainer,

OOM handling appears to be broken in 5.2.17-1 if hugepages are enabled.

Test system: AMD A4-5300, 40G RAM, no swap, booted disklessly.

Without hugepages enabled I can compile dpdk without any issues. With huge pages enabled it will reproducibly OOM when trying to link one of the libraries. There is 20G+ of free RAM at that point according to free, with the rest being mostly used as buffers.

It is sufficient to just enable huge pages to trigger this (2G out of 40G); they are not allocated or used by anything.

What do you mean by "if hugepages are enabled"? hugetlbfs and THP are enabled by default.

$ tail -2 sysctl.conf
vm.nr_hugepages=1024

If you do not have that, the compile completes fine. If you have that, the compile blows up when linking one of the dpdk libraries. At that point the machine has ~ 20G free RAM.

You need to provide a log of the OOM messages.

Ack. I will re-run the tests tomorrow and update the bug with detailed logs and the OOM.

Ben.

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
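The "2G out of 40G" figure follows from vm.nr_hugepages=1024 and a 2048 kB huge page size, which is the usual default on amd64. The reservation on a given machine can be read back from the HugePages_Total and Hugepagesize fields of /proc/meminfo. A minimal sketch follows; the function name is hypothetical.

```shell
# Hypothetical helper: compute the memory reserved for huge pages, in kB,
# from a meminfo-style file (defaults to /proc/meminfo).
hugepage_reserved_kb() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^Hugepagesize:/    { size_kb = $2 }
         END { printf "%d\n", total * size_kb }' "${1:-/proc/meminfo}"
}

# Example invocation on the affected host (not run here):
# hugepage_reserved_kb
```

With 1024 pages of 2048 kB each this prints 2097152, that is, 2 GiB set aside whether or not anything uses it.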
Bug#945213: linux-image-5.2.0-3-amd64: OOM handling broken if hugepages are enabled
Package: linux-image-5.2.0-3-amd64
Version: 5.2.17+1
Severity: important

Dear Maintainer,

OOM handling appears to be broken in 5.2.17-1 if hugepages are enabled.

Test system: AMD A4-5300, 40G RAM, no swap, booted disklessly.

Without hugepages enabled I can compile dpdk without any issues. With huge pages enabled it will reproducibly OOM when trying to link one of the libraries. There is 20G+ of free RAM at that point according to free, with the rest being mostly used as buffers.

It is sufficient to just enable huge pages to trigger this (2G out of 40G); they are not allocated or used by anything.

-- System Information:
Debian Release: 10.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 5.2.0-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Bug#940820: UML not loading on Debian buster with a 5.2 kernel from testing
This is a regression related to the address-space randomization (va) setting.

UML will boot on a Debian 4.19 kernel host with kernel.randomize_va_space = 2.
UML will not boot on a Debian 5.2 kernel host with kernel.randomize_va_space = 2.
UML will boot on 5.2 once kernel.randomize_va_space is set to 0 on the host.

So something has changed in how randomization is implemented between 4.19 and 5.2.

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
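For reference, the three values of kernel.randomize_va_space documented by the kernel can be decoded as in the sketch below. The function name describe_aslr is hypothetical; the current setting is readable from /proc/sys/kernel/randomize_va_space.

```shell
# Hypothetical helper: describe a kernel.randomize_va_space value.
describe_aslr() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "stack, mmap base and vDSO randomized" ;;
        2) echo "as 1, plus heap (brk) randomized" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Example invocations on the host (not run here):
# describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"
# sysctl -w kernel.randomize_va_space=0   # the workaround; needs root
```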
Bug#941637: linux-image-4.19.0-6-amd64: noht flag on command line has no effect for 6 core/12 Thread Ryzens
On 03/10/2019 16:06, Salvatore Bonaccorso wrote:

Control: tags -1 + moreinfo

Hi

On Thu, Oct 03, 2019 at 09:24:26AM +0100, Anton Ivanov wrote:

Package: src:linux
Version: 4.19.67-2+deb10u1
Severity: important

Dear Maintainer,

noht has no effect.

I have been trying to chase down a weird hang which occurs only on 6 core/12 thread Ryzens (I cannot reproduce it on 4/8 or older CPUs). As a part of that I tried to disable ht. Well, it cannot be disabled - the noht command line arg has no effect whatsoever.

As ht can be a security hole this may have security implications as well.

Do you mean 'nosmt'? (See kernel-parameters.txt). You can find further information as well in Documentation/admin-guide/hw-vuln/l1tf.rst.

I picked up noht from an older document somewhere and I cannot remember the actual source. It was definitely in the older version of the RHEL guides, etc.

I can see that the parameter is nosmt now. You can close the bug.

Regards,
Salvatore

-- 
Anton R. Ivanov
https://www.kot-begemot.co.uk/
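Whether a given parameter is actually on the kernel command line can be checked mechanically. The sketch below tests the command line from the original report for noht and nosmt; the function names are hypothetical, and on a live system the string would come from /proc/cmdline.

```shell
# Hypothetical helper: succeed if parameter $1 appears as a whole word,
# or as name=value, in the command line string $2.
has_param() {
    case " $2 " in
        *" $1 "*|*" $1="*) return 0 ;;
    esac
    return 1
}

# Hypothetical helper: print a one-line presence report for a parameter.
report_param() {
    if has_param "$1" "$2"; then
        echo "$1: present"
    else
        echo "$1: absent"
    fi
}

# Command line taken from the bug report.
cmdline='BOOT_IMAGE=/boot/vmlinuz-4.19.0-6-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet'
report_param noht "$cmdline"
report_param nosmt "$cmdline"
```

On kernels that provide the SMT control interface, /sys/devices/system/cpu/smt/active reports whether sibling threads are in use, which shows the effect directly regardless of which parameter was passed.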
Bug#941637: linux-image-4.19.0-6-amd64: noht flag on command line has no effect for 6 core/12 Thread Ryzens
Package: src:linux
Version: 4.19.67-2+deb10u1
Severity: important

Dear Maintainer,

noht has no effect.

I have been trying to chase down a weird hang which occurs only on 6 core/12 thread Ryzens (I cannot reproduce it on 4/8 or older CPUs). As a part of that I tried to disable ht. Well, it cannot be disabled - the noht command line arg has no effect whatsoever.

As ht can be a security hole this may have security implications as well.

-- Package-specific info:
** Version:
Linux version 4.19.0-6-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-6-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet

** Not tainted

** Kernel log:
[4.833468] EDAC amd64: Node 0: DRAM ECC disabled.
[4.833470] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.)
[4.892875] EDAC amd64: Node 0: DRAM ECC disabled.
[4.892877] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.)
[4.932919] EDAC amd64: Node 0: DRAM ECC disabled.
[4.932920] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.)
[4.968846] audit: type=1400 audit(1570086470.642:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=638 comm="apparmor_parser" [4.969330] audit: type=1400 audit(1570086470.642:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=643 comm="apparmor_parser" [4.971460] audit: type=1400 audit(1570086470.642:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=636 comm="apparmor_parser" [4.972463] pktcdvd: pktcdvd0: writer mapped to sr0 [4.973798] audit: type=1400 audit(1570086470.646:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=639 comm="apparmor_parser" [4.973802] audit: type=1400 audit(1570086470.646:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=639 comm="apparmor_parser" [4.976702] EDAC amd64: Node 0: DRAM ECC disabled. [4.976704] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.977529] audit: type=1400 audit(1570086470.650:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=646 comm="apparmor_parser" [4.977534] audit: type=1400 audit(1570086470.650:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=646 comm="apparmor_parser" [4.977537] audit: type=1400 audit(1570086470.650:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=646 comm="apparmor_parser" [4.977935] audit: type=1400 audit(1570086470.650:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/tcpdump" pid=647 comm="apparmor_parser" [5.036714] EDAC amd64: Node 0: DRAM ECC disabled. 
[5.036716] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [5.057409] new mount options do not match the existing superblock, will be ignored [5.108619] EDAC amd64: Node 0: DRAM ECC disabled. [5.108621] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [5.130890] fuse init (API version 7.27) [5.164629] EDAC amd64: Node 0: DRAM ECC disabled. [5.164630] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [5.212714] EDAC amd64: Node 0: DRAM ECC disabled. [5.212716] EDAC amd64:
Bug#940820: linux-image-5.2.0-2-amd64: breaks UML all versions, both debian stock and compiled from source.
Looks like the culprit is a different default ELF start address on 5.x.

What changes is not the sbrk(0) or _end - these are pretty much identical to those on 4.x. It is the START, which after some "fixups" in arch/um/kernel/uml.lds.S becomes __binary_start.

I do not see an easy way to fix it :(

A.

On 20/09/2019 15:48, Anton Ivanov wrote:

These are the Start (that is what sbrk(0) returns) and &_end values I get for the two kernels:

Linux 4.19 on host - Start 1645867008 end 1631412224 diff 14454784
Linux 5.2 on host - Start 93825006145536 end 1631412224 diff 93823374733312

I think the whole logic in UML here is broken because with memory model = large &_end is less than start to start off with, so reserving an XM gap does not quite make sense.

I am going to see if I can sort out the UML side, but I think we still need to check the host kernel side and what the reason is for the sudden change in behavior.

A.

On 20/09/2019 11:12, Anton Ivanov wrote:

Package: src:linux
Version: 5.2.9-2
Severity: important

Dear Maintainer,

Any attempt to run UML on a machine running 5.2.9-2 results in:

Adding 9382334992 bytes to physical memory to account for exec-shield gap
Too few physical memory! Needed=93823417974784, given=547037904896

Running the same UML images on 4.19 debian stock has no issues.

A.
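The "diff" figures quoted above are simply Start minus end, which can be reproduced with shell arithmetic. This is a sketch only: the function name gap is hypothetical and it assumes a shell with 64-bit arithmetic, as on amd64.

```shell
# Hypothetical helper: gap between the break address (Start) and &_end.
gap() {
    echo $(( $1 - $2 ))
}

gap 1645867008 1631412224        # Linux 4.19 on host
gap 93825006145536 1631412224    # Linux 5.2 on host
```

The first gap is about 14 MB, while the second is roughly 93.8 TB, the same order of magnitude as the "Needed" value in the UML error message.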
Bug#940820: linux-image-5.2.0-2-amd64: breaks UML all versions, both debian stock and compiled from source.
Package: src:linux
Version: 5.2.9-2
Severity: important

Dear Maintainer,

Any attempt to run UML on a machine running 5.2.9-2 results in:

Adding 9382334992 bytes to physical memory to account for exec-shield gap
Too few physical memory! Needed=93823417974784, given=547037904896

Running the same UML images on 4.19 debian stock has no issues.

A.

-- Package-specific info:
** Version:
Linux version 5.2.0-2-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-21)) #1 SMP Debian 5.2.9-2 (2019-08-21)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-5.2.0-2-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet

** Not tainted

** Kernel log:
[3.684402] input: HD-Audio Generic Front Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input8 [3.684490] input: HD-Audio Generic Rear Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input9 [3.684555] input: HD-Audio Generic Line as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input10 [3.685553] input: HD-Audio Generic Line Out as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input11 [3.685627] input: HD-Audio Generic Front Headphone as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input12 [3.806626] kvm: Nested Virtualization enabled [3.806636] kvm: Nested Paging enabled [3.806637] SVM: Virtual VMLOAD VMSAVE supported [3.806637] SVM: Virtual GIF supported [3.820371] MCE: In-kernel MCE decoding enabled. [3.824533] EDAC amd64: Node 0: DRAM ECC disabled. [3.824536] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.872569] pktcdvd: pktcdvd0: writer mapped to sr0 [3.900858] EDAC amd64: Node 0: DRAM ECC disabled. [3.900860] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. 
(Note that use of the override may cause unknown side effects.) [3.948661] EDAC amd64: Node 0: DRAM ECC disabled. [3.948662] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.996651] EDAC amd64: Node 0: DRAM ECC disabled. [3.996652] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.002382] audit: type=1400 audit(1568973482.655:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=706 comm="apparmor_parser" [4.002712] audit: type=1400 audit(1568973482.655:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=701 comm="apparmor_parser" [4.005254] audit: type=1400 audit(1568973482.659:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=699 comm="apparmor_parser" [4.007555] audit: type=1400 audit(1568973482.659:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=702 comm="apparmor_parser" [4.007558] audit: type=1400 audit(1568973482.659:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=702 comm="apparmor_parser" [4.011004] audit: type=1400 audit(1568973482.663:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=709 comm="apparmor_parser" [4.011007] audit: type=1400 audit(1568973482.663:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=709 comm="apparmor_parser" [4.011009] audit: type=1400 audit(1568973482.663:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=709 
comm="apparmor_parser" [4.012542] audit: type=1400 audit(1568973482.667:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/ntpd" pid=705 comm="apparmor_parser" [4.052465] EDAC amd64: Node 0: DRAM ECC disabled. [4.052466] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.132680] EDAC amd64: Node 0: DRAM ECC disabled. [4.132682] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Bug#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4
Package: src:linux
Version: 5.2.9-2
Severity: critical
Justification: breaks unrelated software

Dear Maintainer,

NFSv4 caching is completely broken on SMP.

How to reproduce:

Option 1. Clone openwrt and run

while make clean && make -j `nproc` ; do true ; done

Depending on the number of CPUs, it will break within several runs.

Symptoms of breakage: a directory on the client looks empty. Example (mnt is an NFSv4 mount):

ls -laF /mnt/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 8
drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./
drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../

While it actually has a file in it (same on server):

ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./
drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../
-rw-r--r-- 1 anivanov anivanov 32 Sep 20 10:51 ipcbuf.h

This cache entry on the client does not expire as it should per the NFSv4 caching documentation; the only ways of dealing with it are a reboot, an unmount or dropping caches.

Option 2. Have your $HOME on NFSv4 and use thunderbird. Move mails between folders. Sooner or later (usually sooner) you will lose an email.

So this is both "breaks unrelated software" and "data loss" depending on what you are doing.

Tested on: AMD Ryzen 5 2400G, AMD Ryzen 5 1600X, AMD Ryzen 5 1600, AMD A8-6500

Shows up on all. Fastest on the 6 core 12 thread Ryzens, slowest on the AMD A8 (takes up to 3 iterations of make there).

Brgds,
A.
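One way to spot the symptom described above without comparing two ls listings by eye is to diff the directory as seen through the NFSv4 mount against the same directory on the exported filesystem. This is only a sketch: `missing_on_client` is a hypothetical helper, and it assumes both paths are readable from the machine running it (for example when the export is also mounted across localhost). The demo uses temporary directories as stand-ins.

```python
import os
import tempfile


def missing_on_client(client_dir: str, server_dir: str) -> set:
    """Return names present in server_dir but not visible in client_dir."""
    return set(os.listdir(server_dir)) - set(os.listdir(client_dir))


if __name__ == "__main__":
    # Stand-in directories; on a real setup these would be the NFS mount
    # point and the exported path for the same directory.
    with tempfile.TemporaryDirectory() as client, \
            tempfile.TemporaryDirectory() as server:
        open(os.path.join(server, "ipcbuf.h"), "w").close()
        print(missing_on_client(client, server))
```

A non-empty result after a build iteration would correspond to the "directory looks empty on the client" state shown in the listings.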
-- Package-specific info: ** Version: Linux version 5.2.0-2-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-21)) #1 SMP Debian 5.2.9-2 (2019-08-21) ** Command line: BOOT_IMAGE=/boot/vmlinuz-5.2.0-2-amd64 root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet ** Not tainted ** Kernel log: [3.684402] input: HD-Audio Generic Front Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input8 [3.684490] input: HD-Audio Generic Rear Mic as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input9 [3.684555] input: HD-Audio Generic Line as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input10 [3.685553] input: HD-Audio Generic Line Out as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input11 [3.685627] input: HD-Audio Generic Front Headphone as /devices/pci:00/:00:08.1/:09:00.3/sound/card0/input12 [3.806626] kvm: Nested Virtualization enabled [3.806636] kvm: Nested Paging enabled [3.806637] SVM: Virtual VMLOAD VMSAVE supported [3.806637] SVM: Virtual GIF supported [3.820371] MCE: In-kernel MCE decoding enabled. [3.824533] EDAC amd64: Node 0: DRAM ECC disabled. [3.824536] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.872569] pktcdvd: pktcdvd0: writer mapped to sr0 [3.900858] EDAC amd64: Node 0: DRAM ECC disabled. [3.900860] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [3.948661] EDAC amd64: Node 0: DRAM ECC disabled. [3.948662] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) 
[3.996651] EDAC amd64: Node 0: DRAM ECC disabled. [3.996652] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load. Either enable ECC checking or force module loading by setting 'ecc_enable_override'. (Note that use of the override may cause unknown side effects.) [4.002382] audit: type=1400 audit(1568973482.655:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" pid=706 comm="apparmor_parser" [4.002712] audit: type=1400 audit(1568973482.655:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=701 comm="apparmor_parser" [4.005254] audit: type=1400 audit(1568973482.659:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=699 comm="apparmor_parser" [4.007555] audit: type=1400 audit(1568973482.659:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=702 comm="apparmor_parser" [4.007558] audit: type=1400 audit(1568973482.659:6):
Bug#931500:
Same picture with different NFS minor versions - 4.0, 4.1.
Same picture with and without hyperthreading.
Same picture with mitigations switched on and off via the kernel command line.

100% reproducible within 4-5 repeats of

make -j `cat /proc/cpuinfo | grep processor | wc -l` ; make clean

on an openwrt tree. Reproducing it on a linux tree takes a bit longer, but it is also reproducible - 10-12 times.

So actually the executive summary is - NFS is broken. Completely. That is not a level 6 bug; it is a much higher one. Please adjust the priority accordingly.

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Bug#931500: Acknowledgement (linux-image-4.19.0-5-amd64: kernel deadlock with autofs)
The most interesting part - it is always the same file.

ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm/ipcbuf.h

It becomes invisible from the client, but exists on the server. It usually takes ~4-5 builds in a loop to achieve that.

A.

On 08/07/2019 12:01, Anton Ivanov wrote:

On 08/07/2019 11:59, Anton Ivanov wrote:

There are clearly some issues with nfs across an autofs mount (maybe for hard mounts as well), so this may warrant an upgrade.

Example test. Run make -j 12 ; make clean in a loop on an nfs mounted openwrt tree until it fails (usually 2-3 iterations).

State on the client:

ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 8
drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../

State as seen on the server (mounted via nfs across localhost):

ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
-rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h

State on the filesystem:

ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
-rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h

So actually this looks like the caching on NFS is royally fubar.

Dropping caches restores things to normal, but that is not a solution. It is a diagnosis.

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
Bug#931500: linux-image-4.19.0-5-amd64: kernel deadlock with autofs
Package: src:linux
Version: 4.19.37-5
Severity: normal
File: linux-image-4.19.0-5-amd64

Dear Maintainer,

An attempt to mount an nfs mount via autofs while it is being unmounted sometimes results in a deadlock. This is easier to reproduce with nfsv3. It is more difficult but still possible with nfs4.

I have been unable to reproduce it on any CPU with a lower number of threads/cores than the Ryzen 5 1600 (6/12). It is reliably reproducible on any 6 core 12 thread or higher Ryzen.

It is not easy to trigger - it usually takes up to 1-2 days of regular mounts/unmounts at the normal autofs 5 min unmount interval. It may sometimes happen in less than 30 minutes. In my case the culprit was a set of system stats scripts executed every 5 minutes from cron. Raising the autofs timeout to 600s eliminated the deadlocks.

The deadlock is usually hard and it is impossible to use Alt-SysRq. The only time I managed to obtain a trace it was as follows:

Jun 28 12:56:01 sleer kernel: [101497.077162] rcu: INFO: rcu_sched self-detected stall on CPU Jun 28 12:56:01 sleer kernel: [101497.077172] rcu: #0118-...!: (5250 ticks this GP) idle=6fa/1/0x4002 softirq=514095/514095 fqs=175 Jun 28 12:56:01 sleer kernel: [101497.077174] rcu: #011 (t=5250 jiffies g=2596081 q=15) Jun 28 12:56:01 sleer kernel: [101497.077179] rcu: rcu_sched kthread starved for 4900 jiffies! g2596081 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=7 Jun 28 12:56:01 sleer kernel: [101497.077180] rcu: RCU grace-period kthread stack dump: Jun 28 12:56:01 sleer kernel: [101497.077182] rcu_sched R running task 010 2 0x8000 Jun 28 12:56:01 sleer kernel: [101497.077185] Call Trace: Jun 28 12:56:01 sleer kernel: [101497.077192] ? __schedule+0x2a2/0x870 Jun 28 12:56:01 sleer kernel: [101497.077194] schedule+0x28/0x80 Jun 28 12:56:01 sleer kernel: [101497.077196] schedule_timeout+0x16b/0x390 Jun 28 12:56:01 sleer kernel: [101497.077200] ? 
__next_timer_interrupt+0xc0/0xc0 Jun 28 12:56:01 sleer kernel: [101497.077203] rcu_gp_kthread+0x40d/0x850 Jun 28 12:56:01 sleer kernel: [101497.077205] ? call_rcu_sched+0x20/0x20 Jun 28 12:56:01 sleer kernel: [101497.077207] kthread+0x112/0x130 Jun 28 12:56:01 sleer kernel: [101497.077209] ? kthread_bind+0x30/0x30 Jun 28 12:56:01 sleer kernel: [101497.077211] ret_from_fork+0x1f/0x40 Jun 28 12:56:01 sleer kernel: [101497.077213] NMI backtrace for cpu 8 Jun 28 12:56:01 sleer kernel: [101497.077215] CPU: 8 PID: 21552 Comm: localStorage DB Tainted: GE 4.19.0-5-amd64 #1 Debian 4.19.37-5 Jun 28 12:56:01 sleer kernel: [101497.077216] Hardware name: System manufacturer System Product Name/PRIME B450M-A, BIOS 0604 12/07/2018 Jun 28 12:56:01 sleer kernel: [101497.077217] Call Trace: Jun 28 12:56:01 sleer kernel: [101497.077218] Jun 28 12:56:01 sleer kernel: [101497.077220] dump_stack+0x5c/0x80 Jun 28 12:56:01 sleer kernel: [101497.077223] nmi_cpu_backtrace.cold.4+0x13/0x50 Jun 28 12:56:01 sleer kernel: [101497.077225] ? lapic_can_unplug_cpu.cold.29+0x3b/0x3b Jun 28 12:56:01 sleer kernel: [101497.077227] nmi_trigger_cpumask_backtrace+0xf9/0xfb Jun 28 12:56:01 sleer kernel: [101497.077229] rcu_dump_cpu_stacks+0x9b/0xcb Jun 28 12:56:01 sleer kernel: [101497.077231] rcu_check_callbacks.cold.80+0x1db/0x338 Jun 28 12:56:01 sleer kernel: [101497.077234] ? tick_sched_do_timer+0x60/0x60 Jun 28 12:56:01 sleer kernel: [101497.077236] update_process_times+0x28/0x60 Jun 28 12:56:01 sleer kernel: [101497.077238] tick_sched_handle+0x22/0x60 Jun 28 12:56:01 sleer kernel: [101497.077240] tick_sched_timer+0x37/0x70 Jun 28 12:56:01 sleer kernel: [101497.077241] __hrtimer_run_queues+0x100/0x280 Jun 28 12:56:01 sleer kernel: [101497.077243] hrtimer_interrupt+0x100/0x220 Jun 28 12:56:01 sleer kernel: [101497.077245] ? 
handle_irq_event+0x47/0x5c Jun 28 12:56:01 sleer kernel: [101497.077247] smp_apic_timer_interrupt+0x6a/0x140 Jun 28 12:56:01 sleer kernel: [101497.077248] apic_timer_interrupt+0xf/0x20 Jun 28 12:56:01 sleer kernel: [101497.077249] Jun 28 12:56:01 sleer kernel: [101497.077251] RIP: 0010:smp_call_function_many+0x1f8/0x250 Jun 28 12:56:01 sleer kernel: [101497.077253] Code: c7 e8 0c c4 5e 00 3b 05 1a 86 01 01 0f 83 8c fe ff ff 48 63 d0 48 8b 0b 48 03 0c d5 00 b7 8c a4 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c8 48 c7 c2 60 e3 b2 a4 4c 89 fe 89 df Jun 28 12:56:01 sleer kernel: [101497.077254] RSP: 0018:b93dc9cd3bb8 EFLAGS: 0202 ORIG_RAX: ff13 Jun 28 12:56:01 sleer kernel: [101497.077256] RAX: RBX: 9309fec22c00 RCX: 9309fea27000 Jun 28 12:56:01 sleer kernel: [101497.077256] RDX: 0001 RSI: RDI: 9309fec22c08 Jun 28 12:56:01 sleer kernel: [101497.077257] RBP: 9309fec22c08 R08: 0004 R09: 9309fec22c48 Jun 28 12:56:01 sleer kernel: [101497.077258] R10: 9309fec22c08 R11: 0008 R12: a3a6ca90 Jun 28 12:56:01 sleer kernel:
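The trace above is easier to read once reduced to its function names. Here is a rough Python sketch, assuming the usual `symbol+0xoffset/0xsize` frame format seen in this log; the regular expression and the `trace_frames` helper are illustrative and not an existing kernel tool. Frames that the kernel prints with a leading `?` are kept but flagged, since those are addresses found on the stack that the unwinder could not confirm as part of the call chain.

```python
import re

# Matches frames such as "schedule+0x28/0x80" or "? __schedule+0x2a2/0x870".
FRAME_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def trace_frames(log_text: str):
    """Return (symbol, reliable) pairs in order of appearance.

    reliable is False for frames printed with a leading '?'.
    """
    return [(m.group(2), m.group(1) is None)
            for m in FRAME_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = ("kernel: [101497.077192] ? __schedule+0x2a2/0x870 "
              "kernel: [101497.077194] schedule+0x28/0x80 "
              "kernel: [101497.077251] RIP: 0010:smp_call_function_many+0x1f8/0x250")
    for symbol, reliable in trace_frames(sample):
        print(("  " if reliable else "? ") + symbol)
```

Feeding it the full log text would give the call chain without timestamps and host prefixes, which makes the stalled `smp_call_function_many` frame easier to spot.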
Bug#931048: linux-image-4.19.0-4-amd64: bridge MAC learning is broken
Package: src:linux
Version: 4.19.28-1
Severity: normal
File: linux-image-4.19.0-4-amd64

Dear Maintainer,

Bridge MAC learning is completely broken at present.

How to reproduce:

1. Build one or more MINIMAL VMs, or connect machines with MINIMAL installs, to interfaces which join a Linux bridge.
2. Observe the bridge fdb using the bridge utility or brctl.
3. Run traffic.

Obvious issues:

1. MACs expire even if there are gigabytes of traffic flowing to/from them. The refresh, if used, is completely broken.
2. MACs are not immediately reinstated into the forwarding database if there is traffic upon expiry.

Observations:

This seems to be a result of learning being tightly bound to the idea of neighbours and the neighbour discovery code. MACs are learned instantaneously if one of the hosts issues a multicast join, e.g. performs IPv6 neighbour discovery or runs avahi. If either one of these is not present, the bridge code does not function as it should.

While this is good as an idea, it should not completely replace learning from unicast traffic.

-- Package-specific info:
** Version:
Linux version 4.19.0-4-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-2)) #1 SMP Debian 4.19.28-1 (2019-03-12)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-4-amd64 root=UUID=3db3d925-a3d9-4c1d-b63d-c087261f1fb2 ro quiet

** Tainted: WE (8704)
 * Taint on warning.
 * Unsigned module has been loaded.

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: System manufacturer
product_name: System Product Name
product_version: System Version
chassis_vendor: Default string
chassis_version: Default string
bios_vendor: American Megatrends Inc.
bios_version: 0409
board_vendor: ASUSTeK COMPUTER INC.
board_name: PRIME B450M-A board_version: Rev X.0x ** Loaded modules: cfg80211(E) bnep(E) nfnetlink_queue(E) nfnetlink_log(E) nfnetlink(E) bluetooth(E) drbg(E) ansi_cprng(E) ecdh_generic(E) squashfs(E) loop(E) ufs(E) qnx4(E) hfsplus(E) hfs(E) minix(E) ntfs(E) msdos(E) jfs(E) xfs(E) dm_mod(E) cpuid(E) uas(E) usb_storage(E) xt_nat(E) xt_tcpudp(E) xt_conntrack(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) ip6table_filter(E) ip6_tables(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) iptable_filter(E) veth(E) bridge(E) 8021q(E) garp(E) mrp(E) stp(E) llc(E) fuse(E) tun(E) binfmt_misc(E) nls_ascii(E) eeepc_wmi(E) asus_wmi(E) nls_cp437(E) sparse_keymap(E) rfkill(E) wmi_bmof(E) vfat(E) fat(E) edac_mce_amd(E) uvcvideo(E) videobuf2_vmalloc(E) videobuf2_memops(E) videobuf2_v4l2(E) kvm_amd(E) videobuf2_common(E) ccp(E) amdkfd(E) videodev(E) rng_core(E) media(E) snd_usb_audio(E) joydev(E) snd_usbmidi_lib(E) kvm(E) snd_rawmidi(E) evdev(E) snd_seq_device(E) irqbypass(E) efi_pstore(E) crct10dif_pclmul(E) crc32_pclmul(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) amdgpu(E) ghash_clmulni_intel(E) snd_hda_codec_hdmi(E) efivars(E) snd_hda_intel(E) pcspkr(E) chash(E) snd_hda_codec(E) gpu_sched(E) snd_hda_core(E) ttm(E) snd_hwdep(E) k10temp(E) sp5100_tco(E) snd_pcm_oss(E) snd_mixer_oss(E) drm_kms_helper(E) snd_pcm(E) snd_timer(E) drm(E) snd(E) soundcore(E) sg(E) wmi(E) video(E) button(E) pcc_cpufreq(E) acpi_cpufreq(E) hwmon_vid(E) parport_pc(E) nfsd(E) auth_rpcgss(E) ppdev(E) nfs_acl(E) lockd(E) lp(E) grace(E) parport(E) sunrpc(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) ecb(E) btrfs(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid0(E) multipath(E) linear(E) raid1(E) md_mod(E) sd_mod(E) hid_generic(E) 
usbhid(E) hid(E) crc32c_intel(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) ahci(E) mptsas(E) xhci_pci(E) libahci(E) igb(E) mptscsih(E) r8169(E) i2c_piix4(E) xhci_hcd(E) realtek(E) mptbase(E) i2c_algo_bit(E) libphy(E) libata(E) scsi_transport_sas(E) dca(E) usbcore(E) usb_common(E) scsi_mod(E) gpio_amdpt(E) gpio_generic(E) ** PCI devices: 00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:15d0] Subsystem: ASUSTeK Computer Inc. Device [1043:876b] Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452] Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: Kernel driver in use: pcieport 00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc.
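For comparison, the behaviour this report expects from a learning bridge can be written down as a toy model: every frame refreshes the entry for its source MAC, entries age out only after a period of silence, and an aged-out MAC is relearned from the next frame it sends. The `LearningFdb` class below is purely illustrative and says nothing about how the kernel bridge code is structured; the 300 second default mirrors the usual bridge ageing time, and the MAC and port names are made up.

```python
class LearningFdb:
    """Toy model of a learning bridge forwarding database (not kernel code)."""

    def __init__(self, ageing_time: float = 300.0):
        self.ageing_time = ageing_time
        self._entries = {}  # mac -> (port, time the MAC was last seen)

    def learn(self, src_mac: str, port: str, now: float) -> None:
        """Record or refresh the entry for the source MAC of a frame."""
        self._entries[src_mac] = (port, now)

    def lookup(self, dst_mac: str, now: float):
        """Return the port for dst_mac, or None if unknown or aged out."""
        entry = self._entries.get(dst_mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.ageing_time:
            del self._entries[dst_mac]  # silent for too long: expire
            return None
        return port


if __name__ == "__main__":
    fdb = LearningFdb()
    mac = "52:54:00:00:00:01"  # hypothetical VM MAC
    for t in (0.0, 100.0, 200.0, 300.0, 400.0):
        fdb.learn(mac, "tap0", t)  # unicast traffic keeps refreshing
    print(fdb.lookup(mac, 450.0))  # still known while traffic flows
    print(fdb.lookup(mac, 800.0))  # expired after 400 s of silence
```

In this model issue 1 from the report (expiry despite ongoing traffic) cannot happen, and issue 2 is covered because the next `learn` call after expiry reinstates the entry immediately.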
Bug#924460: linux-image-4.19.0-0.bpo.2-amd64: Weird hangs on AMD Ryzen
Package: src:linux
Version: 4.19.16-1~bpo9+1
Severity: important

Dear Maintainer,

Occasional hangs, under X only.

During the hang no new processes can be spawned from any terminal windows in the X session, and windows which use DRM, like firefox, thunderbird, etc., do not update. Windows can be moved and it is possible to switch to a new desktop.

At the same time the rest of the machine works fine. Switching to a text console works fine and any processes launched from there also work fine. Firefox and other processes relying on DRM are shown in D state during the hang.

The machine recovers by itself in less than a minute. The hang frequency is about once every 3-4 hours.

I am using an up-to-date out-of-tree it87 version to get the right sensors on the MB. The bug shows both with and without this driver. I also had to pull the most recent firmware from kernel.org for the video.

The bug is not observed when using a plug-in video card (Nvidia Quadro 290 NVS), so this looks like something related to DRM or amdgpu power management.

-- Package-specific info:
** Version:
Linux version 4.19.0-0.bpo.2-amd64 (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP Debian 4.19.16-1~bpo9+1 (2019-02-07)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-0.bpo.2-amd64 root=UUID=3db3d925-a3d9-4c1d-b63d-c087261f1fb2 ro quiet

** Tainted: WOE (12800)
 * Taint on warning.
 * Out-of-tree module has been loaded.
 * Unsigned module has been loaded.

** Kernel log:
[665617.595702] CR2: 557931720f18 CR3: 00024536e000 CR4: 003406e0 [665617.595703] Call Trace: [665617.595751] optc1_lock+0x9e/0xb0 [amdgpu] [665617.595796] dcn10_pipe_control_lock.part.25+0x2d/0x70 [amdgpu] [665617.595840] dcn10_apply_ctx_for_surface+0xdf/0x540 [amdgpu] [665617.595883] ? hubbub1_verify_allow_pstate_change_high+0x82/0x1a0 [amdgpu] [665617.595924] dc_commit_state+0x23d/0x550 [amdgpu] [665617.595963] ? set_freesync_on_streams.part.7+0xce/0x2c0 [amdgpu] [665617.596002] ? 
mod_freesync_set_user_enable+0x16d/0x1b0 [amdgpu] [665617.596046] amdgpu_dm_atomic_commit_tail+0x33e/0xe60 [amdgpu] [665617.596079] ? amdgpu_bo_pin_restricted+0x68/0x280 [amdgpu] [665617.596083] ? _cond_resched+0x16/0x40 [665617.596085] ? wait_for_completion_timeout+0x3b/0x1a0 [665617.596087] ? refcount_inc_checked+0x5/0x30 [665617.596119] ? amdgpu_bo_ref+0x17/0x20 [amdgpu] [665617.596127] commit_tail+0x3d/0x70 [drm_kms_helper] [665617.596133] drm_atomic_helper_commit+0xb4/0x120 [drm_kms_helper] [665617.596147] drm_atomic_connector_commit_dpms+0xe5/0xf0 [drm] [665617.596159] drm_mode_obj_set_property_ioctl+0x247/0x290 [drm] [665617.596170] ? drm_connector_set_obj_prop+0x80/0x80 [drm] [665617.596181] drm_connector_property_set_ioctl+0x3e/0x60 [drm] [665617.596191] drm_ioctl_kernel+0xaa/0xf0 [drm] [665617.596194] ? sock_write_iter+0x87/0x100 [665617.596204] drm_ioctl+0x2ff/0x390 [drm] [665617.596215] ? drm_connector_set_obj_prop+0x80/0x80 [drm] [665617.596217] ? do_iter_write+0xd6/0x180 [665617.596248] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [665617.596251] do_vfs_ioctl+0xa2/0x640 [665617.596254] ? 
do_sigaction+0xad/0x1e0 [665617.596256] ksys_ioctl+0x70/0x80 [665617.596258] __x64_sys_ioctl+0x16/0x20 [665617.596260] do_syscall_64+0x55/0x110 [665617.596262] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [665617.596264] RIP: 0033:0x7fb56083a017 [665617.596265] Code: 00 00 00 48 8b 05 81 7e 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 7e 2b 00 f7 d8 64 89 01 48 [665617.596266] RSP: 002b:7ffd64cbfd08 EFLAGS: 3246 ORIG_RAX: 0010 [665617.596267] RAX: ffda RBX: RCX: 7fb56083a017 [665617.596268] RDX: 7ffd64cbfd40 RSI: c01064ab RDI: 000e [665617.596269] RBP: 7ffd64cbfd40 R08: 556b0190 R09: 556aff1154d0 [665617.596270] R10: R11: 3246 R12: c01064ab [665617.596270] R13: 000e R14: 556afdb28fb0 R15: 556afd86d580 [665617.596272] ---[ end trace 070aabde88b649c0 ]--- [665929.195580] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 1us * 10 tries - optc1_lock line:628 [665929.195675] WARNING: CPU: 4 PID: 15694 at /build/linux-qcc0VE/linux-4.19.16/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:254 generic_reg_wait+0xe5/0x150 [amdgpu] [665929.195676] Modules linked in: 8021q garp mrp stp llc nls_utf8 isofs uas usb_storage fuse ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs dm_mod cpuid nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache binfmt_misc eeepc_wmi asus_wmi sparse_keymap rfkill wmi_bmof nls_ascii uvcvideo nls_cp437 amdkfd vfat videobuf2_vmalloc videobuf2_memops fat videobuf2_v4l2 videobuf2_common efi_pstore videodev edac_mce_amd snd_usb_audio media amdgpu snd_hda_codec_realtek kvm_amd snd_hda_codec_generic joydev ccp snd_usbmidi_lib snd_rawmidi rng_core
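The D state mentioned in the report can be checked from a text console while a hang is in progress. Below is a minimal sketch, assuming Linux procfs: it reads `/proc/<pid>/stat` for every process and lists those in uninterruptible sleep. The helper names `parse_stat` and `uninterruptible_processes` are made up for illustration.

```python
import os


def parse_stat(stat_line: str):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so the split is done on the last ')' in the line.
    """
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen])
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root: str = "/proc"):
    """Return (pid, comm) for every process currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process exited or stat line unreadable; skip it
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

Running this in a loop from the text console during a hang would show whether the stuck processes are limited to DRM clients, as the report suggests.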
Bug#884284: nfs-kernel-server: NFSv4 broken
Package: nfs-kernel-server
Version: 1:1.3.4-2.1
Severity: important

Dear Maintainer,

NFSv4 in stretch is broken and unusable. After some time the server exporting the directories starts logging messages like

[1130732.440356] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[1130734.801510] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[1173981.176268] NFS: nfs4_reclaim_open_state: Lock reclaim failed!

Reads and writes then slow down to a crawl, and in the end there is no choice but to reboot the server. Restarting nfs-kernel-server, unmounting from all known clients and remounting does not help.

I have now been forced to downgrade back to NFSv3 across the board. The same setup works fine with NFSv3.

NFSv4 used to work perfectly fine in jessie and before that. I am not sure if this started with the stretch upgrade or after one of the stretch mid-life kernel updates (I think it is the latter).

Setup: Standard mid-size classic Linux/Unix multiuser install. Server(s) exporting $HOME and other directories to a local network. Clients mount via autofs when needed. Most directories are mounted from at least 2 (usually more) clients.
-- Package-specific info:
-- rpcinfo --
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp  58357  mountd
    100005    1   tcp  37131  mountd
    100005    2   udp  54135  mountd
    100005    2   tcp  32951  mountd
    100005    3   udp  47587  mountd
    100005    3   tcp  41773  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100227    3   udp   2049
    100021    1   udp  46283  nlockmgr
    100021    3   udp  46283  nlockmgr
    100021    4   udp  46283  nlockmgr
    100021    1   tcp  40039  nlockmgr
    100021    3   tcp  40039  nlockmgr
    100021    4   tcp  40039  nlockmgr
    100004    2   udp    856  ypserv
    100004    1   udp    856  ypserv
    100004    2   tcp    857  ypserv
    100004    1   tcp    857  ypserv
    100009    1   udp    866  yppasswdd
 600100069    1   udp    874  fypxfrd
 600100069    1   tcp    875  fypxfrd
    100007    2   udp    969  ypbind
    100007    1   udp    969  ypbind
    100007    2   tcp    970  ypbind
    100007    1   tcp    970  ypbind
    100024    1   udp  44513  status
    100024    1   tcp  58657  status
-- /etc/default/nfs-kernel-server --
RPCNFSDCOUNT=8
RPCNFSDPRIORITY=0
RPCMOUNTDOPTS="--manage-gids"
NEED_SVCGSSD=""
RPCSVCGSSDOPTS=""
-- /etc/exports --
/exports 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide,fsid=root) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide,fsid=root)
/exports/md0 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide)
/exports/md1 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide)
/exports/md2 192.168.0.0/16(rw,async,no_root_squash,no_subtree_check,nohide) 127.0.0.0/8(rw,async,no_root_squash,no_subtree_check,nohide)
-- /proc/fs/nfs/exports --
# Version 1.1
# Path Client(Flags) # IPs
/exports/md0 192.168.0.0/16(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,uuid=a114f04d:9e54427e:b051ce17:4dc02e9f,sec=1)
/exports 192.168.0.0/16(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,fsid=0,uuid=a3734f7a:774744b7:b41d4cea:bc2a4f0f,sec=1)
/exports 127.0.0.0/8(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,fsid=0,uuid=a3734f7a:774744b7:b41d4cea:bc2a4f0f,sec=1)
/exports/md0 127.0.0.0/8(rw,no_root_squash,async,wdelay,nohide,no_subtree_check,uuid=a114f04d:9e54427e:b051ce17:4dc02e9f,sec=1)

-- System Information:
Debian Release: 9.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 4.9.0-4-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages nfs-kernel-server depends on:
ii init-system-helpers 1.48
ii keyutils 1.5.9-9
ii libblkid1 2.29.2-1
ii libc6 2.24-11+deb9u1
ii libcap2 1:2.25-1
ii libsqlite3-0 3.16.2-5
ii libtirpc1 0.2.5-1.2
ii libwrap0 7.6.q-26
ii lsb-base 9.20161125
ii netbase 5.4
ii nfs-common 1:1.3.4-2.1
ii ucf 3.0036

nfs-kernel-server recommends no packages.
nfs-kernel-server suggests no packages.

-- no debconf information
OpenWRT Build Process broken on recent Debian NFS
Hi all,

I am observing an interesting issue with the OpenWRT build process when building on an up-to-date stretch host. It no longer works on NFS on Debian (it used to work).

If I run make in a clean, freshly cloned directory tree on a normally mounted filesystem, it completes OK.

If I do a fresh git clone and mount the filesystem via NFS, I get the following:

SHELL= flock /var/autofs/local/src/openwrt/tmp/.patch-2.7.5.tar.xz.flock -c ' /var/autofs/local/src/openwrt/scripts/download.pl "/var/autofs/local/src/openwrt/dl" "patch-2.7.5.tar.xz" "e3da7940431633fb65a01b91d3b7a27a" "" "@GNU/patch"'
flock: /var/autofs/local/src/openwrt/tmp/.patch-2.7.5.tar.xz.flock: Bad file descriptor
Makefile:23: recipe for target '/var/autofs/local/src/openwrt/dl/patch-2.7.5.tar.xz' failed

The results are the same whether I mount the filesystem via autofs or directly via a command-line mount.

If I run the flock statement "by hand" it completes OK as well, so this happens only when it is invoked from the OpenWRT build process (I smell a race here somewhere...).

I wish I could pinpoint the exact moment it broke. However, as the actual problem is with downloads/stamps, it is difficult to determine the point in time at which it stopped working. I tried running the build on a "pristine" stretch with no updates and it was already broken, so this most likely happened somewhere between jessie and stretch.

Any ideas (I do not want to file a Debian bug before narrowing it down)?

A.
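One way to narrow this down outside the build system is a small probe that requests an exclusive flock() on a file in the affected directory, once through a read-only descriptor and once through a read-write one. The flock(2) manual page notes that the Linux NFS client emulates flock() with byte-range locks, so an exclusive lock there needs a descriptor opened for writing. Whether that is what bites the OpenWRT build is an assumption, not something the message above establishes. The function name and the probe file name are made up for illustration.

```python
import errno
import fcntl
import os


def probe_flock(directory):
    """Try an exclusive flock() via read-only and read-write descriptors.

    Returns a dict mapping the open mode to None on success, or to the
    errno name (for example "EBADF") on failure.
    """
    path = os.path.join(directory, ".flock-probe")  # hypothetical file name
    results = {}
    for label, flags in (("read-only", os.O_RDONLY | os.O_CREAT),
                         ("read-write", os.O_RDWR | os.O_CREAT)):
        fd = os.open(path, flags, 0o644)
        try:
            # Non-blocking, so the probe never hangs on a held lock.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.flock(fd, fcntl.LOCK_UN)
            results[label] = None
        except OSError as exc:
            results[label] = errno.errorcode.get(exc.errno, str(exc.errno))
        finally:
            os.close(fd)
    os.unlink(path)
    return results
```

Running probe_flock() against the NFS-mounted tmp directory and against a local directory, then comparing the two results, would show whether the descriptor mode matters on that mount.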
Bug#752403: linux-image-3.12-0.bpo.1-amd64: gre fragmentation broken
Package: src:linux
Version: 3.12.9-1~bpo70+1
Severity: important

Dear Maintainer,

The following should set up a GRE tunnel which has MTU 1500 and fragments GRE correctly as needed.

ip link add gt0 type gretap remote 10.0.48.1 local 192.168.128.1
ip link set gt0 up
ifconfig gt0 mtu 1500

This works fine on 3.2 from wheezy. On 3.12 (and also on 3.10 from OpenWRT Barrier Breaker) it does not. For some reason the kernel transmits _ONLY_ the second fragment, not the first (big) one. As a result anything relying on 1500 MTU GRE breaks outright.

I have noticed that backports is now at 3.14; I will retest with that shortly.

-- Package-specific info:
** Version: Linux version 3.12-0.bpo.1-amd64 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.12.9-1~bpo70+1 (2014-02-07)
** Command line: BOOT_IMAGE=/boot/vmlinuz-3.12-0.bpo.1-amd64 root=UUID=49a2baa4-c4fb-4b25-a847-da38aabf6eb4 ro quiet rootdelay=10
** Not tainted
** Kernel log:
[ 121.018798] ppdev: user-space parallel port driver [ 126.617541] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 126.683548] NFSD: starting 90-second grace period (net 81883dc0) [ 153.542230] usb 4-3: USB disconnect, device number 2 [ 153.576943] lenovo_tpkbd 0003:17EF:6009.0002: usb_submit_urb(ctrl) failed: -19 [ 153.577006] lenovo_tpkbd 0003:17EF:6009.0002: usb_submit_urb(ctrl) failed: -19 [ 180.619815] ip_tables: (C) 2000-2006 Netfilter Core Team [ 180.652075] ip6_tables: (C) 2000-2006 Netfilter Core Team [ 180.692646] nf_conntrack version 0.5.0 (16384 buckets, 65536 max) [ 181.426186] tg3 :04:00.0 eth0: Link is down [ 184.867657] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready [ 184.870369] device eth2 entered promiscuous mode [ 185.310452] IPv6: ADDRCONF(NETDEV_CHANGE): tap1: link becomes ready [ 185.310517] br0: port 2(tap1) entered forwarding state [ 185.310540] br0: port 2(tap1) entered forwarding state [ 185.340931] IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes
ready [ 185.341022] br0: port 1(tap0) entered forwarding state [ 185.341053] br0: port 1(tap0) entered forwarding state [ 186.308096] br0: port 2(tap1) entered disabled state [ 186.969420] tg3 :04:00.0 eth0: Link is up at 1000 Mbps, full duplex [ 186.969449] tg3 :04:00.0 eth0: Flow control is off for TX and off for RX [ 187.945130] e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 187.945457] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready [ 200.392550] br0: port 1(tap0) entered forwarding state [ 251.635353] Key type dns_resolver registered [ 251.665092] NFS: Registering the id_resolver key type [ 251.665138] Key type id_resolver registered [ 251.665143] Key type id_legacy registered [ 499.831589] br0: port 1(tap0) entered disabled state [ 526.554515] device l2tp0 entered promiscuous mode [ 526.554713] br0: port 3(l2tp0) entered forwarding state [ 526.554730] br0: port 3(l2tp0) entered forwarding state [ 541.589078] br0: port 3(l2tp0) entered forwarding state [ 906.452815] device l2tp1 entered promiscuous mode [ 906.452948] br0: port 4(l2tp1) entered forwarding state [ 906.452964] br0: port 4(l2tp1) entered forwarding state [ 921.507056] br0: port 4(l2tp1) entered forwarding state [22276.366266] perf samples too long (2505 2500), lowering kernel.perf_event_max_sample_rate to 5 [25678.723114] ICMPv6 checksum failed [2a01:348:6:4c4::1 2a01:348:6:4c4::2] [33901.361167] nr_pdflush_threads exported in /proc is scheduled for removal [33901.361441] sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case. If you have one, please send an email to linux...@kvack.org. 
[34047.286645] device br0 entered promiscuous mode [34054.745393] device br0 left promiscuous mode [34509.021723] device lo entered promiscuous mode [34523.995709] device lo left promiscuous mode [34565.706977] device lo entered promiscuous mode [34574.849005] device lo left promiscuous mode [94270.499549] ICMPv6 checksum failed [2a01:348:6:4c4::1 2a01:348:6:4c4::2] [100331.491781] ICMPv6 checksum failed [2a01:348:6:4c4::1 2a01:348:6:4c4::2] [106812.568949] ICMPv6 checksum failed [2a01:348:6:4c4::1 2a01:348:6:4c4::2] [126549.143683] CE: hpet increased min_delta_ns to 20115 nsec [126549.143785] CE: hpet increased min_delta_ns to 30172 nsec [126549.143893] CE: hpet increased min_delta_ns to 45258 nsec [156243.297918] br0: port 4(l2tp1) entered disabled state [156243.297967] br0: port 3(l2tp0) entered disabled state [156260.892403] device l2tp1 left promiscuous mode [156260.892419] br0: port 4(l2tp1) entered disabled state [156260.892763] device l2tp0 left promiscuous mode [156260.892769] br0: port 3(l2tp0) entered disabled state [156260.893006] device tap1 left promiscuous mode [156260.893012] br0: port 2(tap1) entered disabled state [156260.893230] device tap0 left promiscuous mode [156260.893235] br0: port 1(tap0) entered
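For reference, the fragmentation that the GRE report expects can be worked out by hand. The sketch below assumes an underlay path MTU of 1500 bytes, a 20-byte outer IPv4 header without options, and a 4-byte GRE header with no key, checksum or sequence number; none of these values is stated in the report. Under those assumptions a full-size frame on gt0 becomes one large and one small outer fragment, which matches the "first (big) one" and the second fragment mentioned in the report.

```python
def ipv4_fragment_sizes(payload_len, mtu, ip_header=20):
    """Return the total size (header + data) of each IPv4 fragment."""
    # Every fragment except the last carries a multiple of 8 data bytes.
    max_data = (mtu - ip_header) // 8 * 8
    sizes = []
    remaining = payload_len
    while remaining > 0:
        data = min(remaining, max_data)
        sizes.append(ip_header + data)
        remaining -= data
    return sizes


# Assumed values, see the paragraph above.
INNER_MTU = 1500       # MTU configured on gt0
ETH_HEADER = 14        # gretap carries the inner Ethernet header as well
GRE_HEADER = 4         # base GRE header, no optional fields
UNDERLAY_MTU = 1500    # assumed MTU between the tunnel endpoints

outer_payload = GRE_HEADER + ETH_HEADER + INNER_MTU
print(outer_payload)                                     # 1518
print(ipv4_fragment_sizes(outer_payload, UNDERLAY_MTU))  # [1500, 58]
```

If only the 58-byte fragment shows up in a capture on the underlay interface, the receiver can never reassemble the frame, which is consistent with the breakage described.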
Bug#751215: linux-image-3.12-0.bpo.1-amd64: bridge broken for tunnel interfaces
Package: src:linux
Version: 3.12.9-1~bpo70+1
Severity: important

Dear Maintainer,

Tunnel interfaces using Ethernet over L2TPv3 are broken for bridge use.

Scenario: l2tp0-br0-l2tp1

ARP request from l2tp0 is emitted OK.
ARP request travels across the bridge to l2tp1 OK (l2tp packets observed on host).
ARP reply on l2tp1 is emitted OK.
ARP reply is emitted by br0 as l2tp0, but the packet never arrives at the l2tp client. It is eaten by the kernel somewhere.

I have used several alternative userspace eol2tp implementations (all working and tested) to test this. Running them under a debugger shows that they never get the encapsulated ARP reply packet (for some reason it ends up being eaten by the kernel). At the same time tcpdump shows it on localhost.

As a result anything connected to l2tp0 can ping the host (on br0), but cannot ping anything on l2tp1, and vice versa.

Additional info not provided by the normal scripts - tunnel setup:

ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 encap udp \
    tunnel_id 1 peer_tunnel_id 1 udp_sport 16384 udp_dport 16385
ip l2tp add session name l2tp0 tunnel_id 1 session_id 0x \
    peer_session_id 0x \
    cookie deadbeefdeadbeef \
    peer_cookie beefdeadbeefdead
/sbin/ifconfig l2tp0 mtu 1500 up
/sbin/brctl addif br0 l2tp0
ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 encap udp \
    tunnel_id 2 peer_tunnel_id 2 udp_sport 16386 udp_dport 16387
ip l2tp add session name l2tp1 tunnel_id 2 session_id 0x \
    peer_session_id 0x \
    cookie deadbeefdeadbeef \
    peer_cookie beefdeadbeefdead
/sbin/ifconfig l2tp1 mtu 1500 up
/sbin/brctl addif br0 l2tp1

So if I configure the bridge like that (a perfectly legitimate config) it does not work.
-- Package-specific info: ** Version: Linux version 3.12-0.bpo.1-amd64 (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.12.9-1~bpo70+1 (2014-02-07) ** Command line: BOOT_IMAGE=/boot/vmlinuz-3.12-0.bpo.1-amd64 root=UUID=49a2baa4-c4fb-4b25-a847-da38aabf6eb4 ro quiet rootdelay=10 ** Not tainted ** Kernel log: [ 18.661366] l2tp_netlink: L2TP netlink interface [ 18.662553] l2tp_eth: L2TP ethernet pseudowire support (L2TPv3) [ 21.573268] EXT4-fs (md1): mounting ext3 file system using the ext4 subsystem [ 21.615349] EXT4-fs (md1): mounted filesystem with ordered data mode. Opts: (null) [ 21.643764] EXT4-fs (md2): mounting ext3 file system using the ext4 subsystem [ 21.704521] EXT4-fs (md2): mounted filesystem with ordered data mode. Opts: (null) [ 21.728474] EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem [ 21.757964] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null) [ 21.781781] EXT4-fs (sdd1): mounting ext3 file system using the ext4 subsystem [ 21.810221] EXT4-fs (sdd1): mounted filesystem with ordered data mode. 
Opts: (null) [ 23.570895] tg3 :04:00.0: irq 50 for MSI/MSI-X [ 24.346403] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 26.936826] tg3 :04:00.0 eth0: Link is up at 1000 Mbps, full duplex [ 26.936838] tg3 :04:00.0 eth0: Flow control is on for TX and on for RX [ 26.936877] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 61.093148] Bridge firewalling registered [ 61.141009] tun: Universal TUN/TAP device driver, 1.6 [ 61.141015] tun: (C) 1999-2004 Max Krasnyansky m...@qualcomm.com [ 61.178845] IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready [ 61.180350] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready [ 61.182004] device tap0 entered promiscuous mode [ 61.183701] device tap1 entered promiscuous mode [ 61.209149] br0: port 2(tap1) entered forwarding state [ 61.209163] br0: port 2(tap1) entered forwarding state [ 62.140931] br0: port 2(tap1) entered disabled state [ 98.287522] RPC: Registered named UNIX socket transport module. [ 98.287528] RPC: Registered udp transport module. [ 98.287531] RPC: Registered tcp transport module. [ 98.287534] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 98.331544] FS-Cache: Loaded [ 98.354281] FS-Cache: Netfs 'nfs' registered for caching [ 98.397833] Installing knfsd (copyright (C) 1996 o...@monad.swb.de). [ 98.683597] fuse init (API version 7.22) [ 104.033349] sit: IPv6 over IPv4 tunneling driver [ 120.652842] Bluetooth: Core ver 2.16 [ 120.652897] NET: Registered protocol family 31 [ 120.652902] Bluetooth: HCI device and connection manager initialized [ 120.652922] Bluetooth: HCI socket layer initialized [ 120.652928] Bluetooth: L2CAP socket layer initialized [ 120.652942] Bluetooth: SCO socket layer initialized [ 120.713330] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 120.713337] Bluetooth: BNEP filters: protocol multicast [ 120.713352] Bluetooth: BNEP socket layer initialized [ 120.742620] Bluetooth: RFCOMM TTY layer initialized [ 120.742648] Bluetooth:
Bug#663906: linux-image-2.6.32-5-amd64: ksm does not work
Package: linux-2.6
Version: 2.6.32-35squeeze2
Severity: normal

Enabling KSM with

echo 1 > /sys/kernel/mm/ksm/run

has no effect: full_scans is always 0, and none of the other variables increment.

-- Package-specific info:
** Version: Linux version 2.6.32-5-amd64 (Debian 2.6.32-35squeeze2) (da...@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Fri Sep 9 20:23:16 UTC 2011
** Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-amd64 root=UUID=49a2baa4-c4fb-4b25-a847-da38aabf6eb4 ro quiet
** Tainted: P (1)
* Proprietary module has been loaded.
** Kernel log:
[1888762.816160] sr 0:0:0:0: [sr0] CDB: Read(10): 28 00 00 7e 61 43 00 00 01 00 [1888762.816185] ata1.00: cmd a0/01:00:00:00:08/00:00:00:00:00/a0 tag 0 dma 2048 in [1888762.816188] res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout) [1888762.816195] ata1.00: status: { DRDY } [1888762.816207] ata1: hard resetting link [1888763.136059] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [1888763.160243] ata1.00: configured for UDMA/100 [1888763.165702] ata1: EH complete [1888793.816223] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [1888793.816236] sr 0:0:0:0: [sr0] CDB: Read(10): 28 00 00 7e 61 43 00 00 01 00 [1888793.816260] ata1.00: cmd a0/01:00:00:00:08/00:00:00:00:00/a0 tag 0 dma 2048 in [1888793.816264] res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout) [1888793.816270] ata1.00: status: { DRDY } [1888793.816284] ata1: hard resetting link [1888794.136058] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [1888794.160243] ata1.00: configured for UDMA/100 [1888794.160859] ata1: EH complete [124.816182] ata1.00: limiting speed to UDMA/66:PIO4 [124.816192] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [124.816202] sr 0:0:0:0: [sr0] CDB: Read(10): 28 00 00 7e 61 43 00 00 01 00 [124.816226] ata1.00: cmd a0/01:00:00:00:08/00:00:00:00:00/a0 tag 0 dma 2048 in [124.816230] res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout) [124.816236] ata1.00:
status: { DRDY } [124.816249] ata1: hard resetting link [125.136058] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [125.160243] ata1.00: configured for UDMA/66 [125.160812] sr 0:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [125.160820] sr 0:0:0:0: [sr0] Sense Key : Aborted Command [current] [descriptor] [125.160827] Descriptor sense data with sense descriptors (in hex): [125.160831] 72 0b 00 00 00 00 00 0e 09 0c 00 00 00 02 00 00 [125.160845] 00 08 00 00 a0 40 [125.160853] sr 0:0:0:0: [sr0] Add. Sense: No additional sense information [125.160860] sr 0:0:0:0: [sr0] CDB: Read(10): 28 00 00 7e 61 43 00 00 01 00 [125.160874] end_request: I/O error, dev sr0, sector 33129740 [125.160907] ata1: EH complete [155.816222] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [155.816235] sr 0:0:0:0: [sr0] CDB: Read(10): 28 00 00 00 05 2c 00 00 01 00 [155.816259] ata1.00: cmd a0/01:00:00:00:08/00:00:00:00:00/a0 tag 0 dma 2048 in [155.816263] res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout) [155.816269] ata1.00: status: { DRDY } [155.816283] ata1: hard resetting link [156.136058] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) [156.160244] ata1.00: configured for UDMA/66 [156.160811] ata1: EH complete [1898559.432027] usb 1-6: new high speed USB device using ehci_hcd and address 4 [1898560.344047] hub 1-0:1.0: unable to enumerate USB device on port 6 [1898560.716021] usb 4-2: new full speed USB device using ohci_hcd and address 2 [1898560.918991] usb 4-2: not running at top speed; connect to a high speed hub [1898560.942991] usb 4-2: New USB device found, idVendor=1949, idProduct=0004 [1898560.942998] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [1898560.943004] usb 4-2: Product: Amazon Kindle [1898560.943008] usb 4-2: Manufacturer: Amazon [1898560.943012] usb 4-2: SerialNumber: B008D0A112830JDV [1898560.944961] usb 4-2: configuration #1 chosen from 1 choice [1898561.378518] Initializing USB Mass 
Storage driver... [1898561.378723] scsi7 : SCSI emulation for USB Mass Storage devices [1898561.378972] usbcore: registered new interface driver usb-storage [1898561.378979] USB Mass Storage support registered. [1898561.382766] usb-storage: device found at 2 [1898561.382772] usb-storage: waiting for device to settle before scanning [1898566.382615] usb-storage: device scan complete [1898566.389592] scsi 7:0:0:0: Direct-Access Kindle Internal Storage 0100 PQ: 0 ANSI: 2 [1898566.392415] sd 7:0:0:0: Attached scsi generic sg5 type 0 [1898566.416569] sd 7:0:0:0: [sdc] 6410688 512-byte logical blocks: (3.28 GB/3.05 GiB) [1898566.534602] sd 7:0:0:0: [sdc] Write Protect is off [1898566.534611] sd 7:0:0:0: [sdc] Mode Sense: 0f 00 00 00 [1898566.534616] sd 7:0:0:0: [sdc] Assuming drive cache: write through
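To check whether ksmd is doing any work in the KSM report above, the counters under /sys/kernel/mm/ksm can be sampled twice and compared. This is a minimal sketch: the helper names are made up, and it reads whatever numeric files exist rather than assuming a fixed set. As background from the kernel's KSM documentation, not from this report: KSM only scans memory that an application has marked with madvise(MADV_MERGEABLE), so counters that stay at zero may also mean that nothing on the system has registered any mergeable memory.

```python
import os


def read_ksm_counters(base="/sys/kernel/mm/ksm"):
    """Return {file name: integer value} for every numeric file under base."""
    counters = {}
    for name in sorted(os.listdir(base)):
        path = os.path.join(base, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as handle:
                counters[name] = int(handle.read().strip())
        except (OSError, ValueError):
            # Skip unreadable or non-numeric entries.
            continue
    return counters


def changed_counters(before, after):
    """Return the counters whose value differs between two samples."""
    return {name: (before[name], after[name])
            for name in before
            if name in after and before[name] != after[name]}
```

Taking one sample, sleeping for a minute, and taking a second sample gives a concrete before/after pair; an empty result from changed_counters() while run reads 1 reproduces the symptom described.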
Bug#661162: This is fixed in 3.2 from backports
Upgrade to backports fixes that. Brgds, A. -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4f5a166d.9050...@kot-begemot.co.uk
Bug#661162: iwl wifi hangs on lots of traffic
Package: linux-2.6 Version: 2.6.32-41 Severity: normal iwl wifi hangs on heavy traffic. Mounting an nfs server via wifi and untarring linux kernel gives a reproducible hang 1 out of 4 times or so. At the same time if the traffic is light it can work for days with no problem. -- Package-specific info: ** Version: Linux version 2.6.32-5-amd64 (Debian 2.6.32-41) (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Mon Jan 16 16:22:28 UTC 2012 ** Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-amd64 root=UUID=8aa6c209-ca56-4f03-a332-97c130d8cbf8 ro quiet ** Tainted: W (512) * Taint on warning. ** Kernel log: [18243.724048] wlan0: associated [18304.830702] wlan0: deauthenticated from 28:94:0f:75:75:f0 (Reason: 1) [18308.103844] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [18308.107980] wlan0: direct probe responded [18308.107987] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [18308.109739] wlan0: authenticated [18308.109776] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [18308.113929] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [18308.113933] wlan0: associated [18318.117506] wlan0: deauthenticating from 28:94:0f:75:75:e0 by local choice (reason=3) [18321.397240] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [18321.402371] wlan0: direct probe responded [18321.402377] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [18321.405751] wlan0: authenticated [18321.405781] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [18321.408651] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [18321.408656] wlan0: associated [18904.803655] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [18908.156216] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 1) [18908.355928] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 2) [18908.356458] wlan0: direct probe responded [18908.356463] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 1) [18908.357216] wlan0: authenticated [18908.357240] 
wlan0: associate with AP 28:94:0f:75:75:f0 (try 1) [18908.555631] wlan0: associate with AP 28:94:0f:75:75:f0 (try 2) [18908.556893] wlan0: RX AssocResp from 28:94:0f:75:75:f0 (capab=0x111 status=0 aid=1) [18908.556897] wlan0: associated [19504.521603] wlan0: deauthenticating from 28:94:0f:75:75:f0 by local choice (reason=3) [19504.561894] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 1) [19504.562363] wlan0: direct probe responded [19504.562366] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 1) [19504.563622] wlan0: authenticated [19504.563640] wlan0: associate with AP 28:94:0f:75:75:f0 (try 1) [19504.759735] wlan0: associate with AP 28:94:0f:75:75:f0 (try 2) [19504.762632] wlan0: deauthenticated from 28:94:0f:75:75:f0 (Reason: 6) [19517.708072] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 1) [19517.709101] wlan0: direct probe responded [19517.709105] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 1) [19517.904967] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 2) [19517.905550] wlan0: authenticated [19517.905581] wlan0: associate with AP 28:94:0f:75:75:f0 (try 1) [19517.906740] wlan0: RX AssocResp from 28:94:0f:75:75:f0 (capab=0x111 status=0 aid=1) [19517.906744] wlan0: associated [19560.702855] wlan0: deauthenticating from 28:94:0f:75:75:f0 by local choice (reason=3) [19560.733096] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [19560.739341] wlan0: direct probe responded [19560.739347] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [19560.741039] wlan0: authenticated [19560.741071] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [19560.744230] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [19560.744236] wlan0: associated [19644.170131] svc: failed to register lockdv1 RPC service (errno 97). 
[20104.699514] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [20107.943709] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [20107.947453] wlan0: direct probe responded [20107.947460] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [20107.949211] wlan0: authenticated [20107.949244] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [20107.961955] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [20107.961960] wlan0: associated [20704.640487] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [20707.980510] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [20707.984186] wlan0: direct probe responded [20707.984192] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [20707.991442] wlan0: authenticated [20707.991477] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [20707.994404] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [20707.994409] wlan0: associated [21304.628216] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [21307.937258] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [21307.940867] wlan0: direct probe responded [21307.940874] wlan0: authenticate with AP
Bug#661163: e1000e ignores mitigation settings
Package: linux-2.6
Version: 2.6.32-41
Severity: normal

Attempts to do QoS (CBQ using tc) show tell-tale signs of interrupt mitigation at work. Bandwidth fluctuates and is not measured properly.

I have set all applicable settings via ethtool to 0, and I have also passed the relevant Intel-style parameters to modprobe:

TxIntDelay=0x0 TxAbsIntDelay=0x0 RxIntDelay=0x0 RxAbsIntDelay=0x0 InterruptThrottleRate=0x0

Neither makes any difference. The card continues to behave as if mitigation is enabled.

Brgds,

-- Package-specific info:
** Version: Linux version 2.6.32-5-amd64 (Debian 2.6.32-41) (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Mon Jan 16 16:22:28 UTC 2012
** Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-amd64 root=UUID=8aa6c209-ca56-4f03-a332-97c130d8cbf8 ro quiet
** Tainted: W (512)
* Taint on warning.
** Kernel log:
[18243.724048] wlan0: associated [18304.830702] wlan0: deauthenticated from 28:94:0f:75:75:f0 (Reason: 1) [18308.103844] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [18308.107980] wlan0: direct probe responded [18308.107987] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [18308.109739] wlan0: authenticated [18308.109776] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [18308.113929] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [18308.113933] wlan0: associated [18318.117506] wlan0: deauthenticating from 28:94:0f:75:75:e0 by local choice (reason=3) [18321.397240] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [18321.402371] wlan0: direct probe responded [18321.402377] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [18321.405751] wlan0: authenticated [18321.405781] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [18321.408651] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [18321.408656] wlan0: associated [18904.803655] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [18908.156216] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 1)
[18908.355928] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 2) [18908.356458] wlan0: direct probe responded [18908.356463] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 1) [18908.357216] wlan0: authenticated [18908.357240] wlan0: associate with AP 28:94:0f:75:75:f0 (try 1) [18908.555631] wlan0: associate with AP 28:94:0f:75:75:f0 (try 2) [18908.556893] wlan0: RX AssocResp from 28:94:0f:75:75:f0 (capab=0x111 status=0 aid=1) [18908.556897] wlan0: associated [19504.521603] wlan0: deauthenticating from 28:94:0f:75:75:f0 by local choice (reason=3) [19504.561894] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 1) [19504.562363] wlan0: direct probe responded [19504.562366] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 1) [19504.563622] wlan0: authenticated [19504.563640] wlan0: associate with AP 28:94:0f:75:75:f0 (try 1) [19504.759735] wlan0: associate with AP 28:94:0f:75:75:f0 (try 2) [19504.762632] wlan0: deauthenticated from 28:94:0f:75:75:f0 (Reason: 6) [19517.708072] wlan0: direct probe to AP 28:94:0f:75:75:f0 (try 1) [19517.709101] wlan0: direct probe responded [19517.709105] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 1) [19517.904967] wlan0: authenticate with AP 28:94:0f:75:75:f0 (try 2) [19517.905550] wlan0: authenticated [19517.905581] wlan0: associate with AP 28:94:0f:75:75:f0 (try 1) [19517.906740] wlan0: RX AssocResp from 28:94:0f:75:75:f0 (capab=0x111 status=0 aid=1) [19517.906744] wlan0: associated [19560.702855] wlan0: deauthenticating from 28:94:0f:75:75:f0 by local choice (reason=3) [19560.733096] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [19560.739341] wlan0: direct probe responded [19560.739347] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [19560.741039] wlan0: authenticated [19560.741071] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [19560.744230] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [19560.744236] wlan0: associated [19644.170131] svc: failed to register lockdv1 RPC 
service (errno 97). [20104.699514] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [20107.943709] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [20107.947453] wlan0: direct probe responded [20107.947460] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [20107.949211] wlan0: authenticated [20107.949244] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [20107.961955] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [20107.961960] wlan0: associated [20704.640487] wlan0: deauthenticated from 28:94:0f:75:75:e0 (Reason: 1) [20707.980510] wlan0: direct probe to AP 28:94:0f:75:75:e0 (try 1) [20707.984186] wlan0: direct probe responded [20707.984192] wlan0: authenticate with AP 28:94:0f:75:75:e0 (try 1) [20707.991442] wlan0: authenticated [20707.991477] wlan0: associate with AP 28:94:0f:75:75:e0 (try 1) [20707.994404] wlan0: RX AssocResp from 28:94:0f:75:75:e0 (capab=0x431 status=0 aid=1) [20707.994409] wlan0: associated
Bug#618744: nfsd gets stuck in D state
On 10/02/12 01:38, Jonathan Nieder wrote:
> Hi,
>
> Anton Ivanov wrote:
>> nfsd gets stuck in D state. Initially some machines, later all which read off the nfs server fail to read. Messages like:
>>
>> Mar 17 22:03:34 localhost kernel: [1899559.532028] statd: server rpc.statd not responding, timed out
>> Mar 17 22:03:34 localhost kernel: [1899559.532055] lockd: cannot monitor greebo
>
> Mph, that's no good. What kernel do you use now? Any changes? Can you use alt+sysrq; w while in that state to get backtraces for blocked tasks, in case it helps find a deadlock? (You may need to use
>
> echo 1 > /proc/sys/kernel/sysrq
>
> before that will work; see Documentation/sysrq.txt for details.)

I have upgraded most machines to current stable and I have stopped seeing this.

> Hope that helps,
> Jonathan

Archive: http://lists.debian.org/4f380add.3090...@kot-begemot.co.uk
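Before reaching for SysRq, it can help to list which tasks are actually in uninterruptible sleep (state D). The sketch below parses /proc/<pid>/stat, where the state is the first field after the closing parenthesis of the command name. The function names are illustrative, and the scan covers only the top-level entries in /proc, not per-thread entries under task/. It lists tasks only; it does not replace the backtraces that SysRq-w prints.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm may contain spaces or parentheses, so split on the last ')'.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def tasks_in_state(state="D", proc="/proc"):
    """Return sorted (pid, comm) pairs for processes in the given state."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as handle:
                pid, comm, current = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line was malformed
        if current == state:
            found.append((pid, comm))
    return sorted(found)
```

Calling tasks_in_state("D") a few times in a row on the NFS server shows whether the nfsd threads stay blocked or only pass through D state briefly.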
Bug#576405: linux-image-2.6.26: Deadlock during combined NFS3/NFS4 use
On 12/09/11 10:07, Jonathan Nieder wrote:
> Hi Anton,
>
> Anton Ivanov wrote:
>> When an export is exported and mounted via autofs using BOTH NFSv3 and NFSv4 the NFSv4 one deadlocks.
>
> We ought to have put this in the hands of upstream about a year ago. Better late than never, so I will echo Ben:
>
> Ben Hutchings wrote:
>> OK. Next, can you test whether the kernel version in unstable (linux-image-2.6.32-4-* version 2.6.32-10) or testing (linux-image-2.6.32-3-* version 2.6.32-9) also has this bug?

I cannot reproduce it on 2.6.32 with autofs5 (squeeze). You can close it.

In general 2.6.26/autofs4 (lenny) was quite fragile with automounted nfs4. nfs4 itself was OK, autofs itself was OK, together the combination was rather explosive. Squeeze fixed all of that. I have yet to observe a single problem with nfs4 + autofs on my squeeze systems.

> Can you reproduce this with a recent (3.x) kernel? (If so, upstream might care, and if not, we can try to find the fix and backport it.)
>
> Thanks for an interesting report, and sorry to have left it hanging.
>
> Jonathan

--
Humans are allergic to change. They love to say, "We've always done it this way." I try to fight that. That's why I have a clock on my wall that runs counter-clockwise.
-- R.A. Grace Hopper

A. R. Ivanov
E-mail: anton.iva...@kot-begemot.co.uk

Archive: http://lists.debian.org/4e7b3f3d.2010...@kot-begemot.co.uk
Bug#576405: linux-image-2.6.26: Deadlock during combined NFS3/NFS4 use
On 12/09/11 10:07, Jonathan Nieder wrote: Hi Anton, Anton Ivanov wrote: When an export is exported and mounted via autofs using BOTH NFSv3 and NFSv4 the NFSv4 one deadlocks. We ought to have put this in the hands of upstream about a year ago. Better late than never, so I will echo Ben: Ben Hutchings wrote: OK. Next, can you test whether the kernel version in unstable (linux-image-2.6.32-4-* version 2.6.32-10) or testing (linux-image-2.6.32-3-* version 2.6.32-9) also has this bug? Can you reproduce this with a recent (3.x) kernel? (If so, upstream might care, and if not, we can try to find the fix and backport it.) I will try to find some time to retest with current and 3.x this week. I do not recall seeing it on 2.6.32 lately. I have, however, changed back to v3 a lot of autofs entries so this is not indicative. Thanks for an interesting report, and sorry to have left it hanging. No worries. Jonathan -- Humans are allergic to change. They love to say, We've always done it this way. I try to fight that. That's why I have a clock on my wall that runs counter-clockwise. -- R.A. Grace Hopper A. R. Ivanov E-mail: anton.iva...@kot-begemot.co.uk -- To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4e6dd52a.9000...@kot-begemot.co.uk
Bug#629428: linux-image-2.6.32-5-686: rtl818x broken for RTL-8185 IEEE 802.11a/b/g
Package: linux-2.6 Version: 2.6.32-34squeeze1 Severity: normal The card is identified correctly, but never manages to associate with an AP. The card is a generic rtl8185 cannibalised out of a Maplin HTC PC bundle. Dmesg self-explanatory. Unfortunately the web site on sourceforge seems a bit blank so I cannot pick up a more recent driver to test build to see if this can be fixed. -- Package-specific info: ** Version: Linux version 2.6.32-5-686 (Debian 2.6.32-34squeeze1) (da...@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Wed May 18 07:08:50 UTC 2011 ** Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-686 root=UUID=ddf5a907-47ff-4167-adc8-f00fd59ed1f8 ro acpi_enforce_resources=lax quiet ** Not tainted ** Kernel log: [2.646116] uhci_hcd :00:10.3: PCI INT B - Link[ALKB] - GSI 21 (level, low) - IRQ 21 [2.646134] uhci_hcd :00:10.3: setting latency timer to 64 [2.646141] uhci_hcd :00:10.3: UHCI Host Controller [2.646160] uhci_hcd :00:10.3: new USB bus registered, assigned bus number 5 [2.646199] uhci_hcd :00:10.3: irq 21, io base 0xf400 [2.646281] usb usb5: New USB device found, idVendor=1d6b, idProduct=0001 [2.646289] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [2.646295] usb usb5: Product: UHCI Host Controller [2.646300] usb usb5: Manufacturer: Linux 2.6.32-5-686 uhci_hcd [2.646306] usb usb5: SerialNumber: :00:10.3 [2.646640] usb usb5: configuration #1 chosen from 1 choice [2.647236] hub 5-0:1.0: USB hub found [2.647276] hub 5-0:1.0: 2 ports detected [2.824026] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [2.988236] ata2.00: ATA-7: Maxtor 6Y080M0, YAR511W0, max UDMA/100 [2.988245] ata2.00: 156301488 sectors, multi 16: LBA48 [3.004253] ata2.00: configured for UDMA/100 [3.004477] scsi 1:0:0:0: Direct-Access ATA Maxtor 6Y080M0 YAR5 PQ: 0 ANSI: 5 [3.303701] sd 1:0:0:0: [sda] 156301488 512-byte logical blocks: (80.0 GB/74.5 GiB) [3.303789] sd 1:0:0:0: [sda] Write Protect is off [3.303796] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00 
[3.303832] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [3.304151] sda: sda1 sda2 [3.326469] sd 1:0:0:0: [sda] Attached SCSI disk [3.599716] PM: Starting manual resume from disk [3.599727] PM: Resume from partition 8:1 [3.599732] PM: Checking hibernation image. [3.600083] PM: Error -22 checking image file [3.600087] PM: Resume from disk failed. [3.674961] kjournald starting. Commit interval 5 seconds [3.674984] EXT3-fs: mounted filesystem with ordered data mode. [5.254134] udev[309]: starting version 164 [5.907667] pci_hotplug: PCI Hot Plug PCI Core version: 0.5 [5.936574] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0 [5.936593] ACPI: Power Button [PWRB] [5.936735] input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input1 [5.936752] ACPI: Sleep Button [SLPB] [5.936915] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2 [5.936925] ACPI: Power Button [PWRF] [6.076088] processor LNXCPU:00: registered as cooling_device0 [6.113043] input: PC Speaker as /devices/platform/pcspkr/input/input3 [6.197120] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [6.261327] parport_pc 00:0a: reported by Plug and Play ACPI [6.261383] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE] [6.458051] cfg80211: Using static regulatory domain info [6.458058] cfg80211: Regulatory domain: US [6.458064] (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp) [6.458072] (2402000 KHz - 2472000 KHz @ 4 KHz), (600 mBi, 2700 mBm) [6.458079] (517 KHz - 519 KHz @ 4 KHz), (600 mBi, 2300 mBm) [6.458087] (519 KHz - 521 KHz @ 4 KHz), (600 mBi, 2300 mBm) [6.458095] (521 KHz - 523 KHz @ 4 KHz), (600 mBi, 2300 mBm) [6.458102] (523 KHz - 533 KHz @ 4 KHz), (600 mBi, 2300 mBm) [6.458110] (5735000 KHz - 5835000 KHz @ 4 KHz), (600 mBi, 3000 mBm) [6.459361] cfg80211: Calling CRDA for country: US [7.418256] rtl8180 :00:08.0: PCI INT A - GSI 16 (level, low) - IRQ 16 [7.659638] 
phy0: Selected rate control algorithm 'minstrel' [7.661526] phy0: hwaddr 00:e0:46:50:00:40, RTL8185vD + rtl8225z2 [8.002042] ACPI: PCI Interrupt Link [ALKC] enabled at IRQ 22 [8.002065] VIA 82xx Audio :00:11.5: PCI INT C - Link[ALKC] - GSI 22 (level, low) - IRQ 22 [8.002248] VIA 82xx Audio :00:11.5: setting latency timer to 64 [9.498139] Adding 1959888k swap on /dev/sda1. Priority:-1 extents:1 across:1959888k [9.763059] EXT3 FS on sda2, internal journal [9.974774] loop: module loaded [ 10.124184] it87: Found IT8716F chip at 0x290, revision 1 [ 10.124196] it87: in3 is VCC (+5V) [ 10.124200]
Bug#621737: linux-image-2.6.32-5-powerpc: ath ignores regulatory domain setting
On 04/10/11 00:55, Ben Hutchings wrote: On Fri, 2011-04-08 at 13:39 +0100, Anton Ivanov wrote: Package: linux-2.6 Version: 2.6.32-30 Severity: minor The ath driver ignores the reg domain setting passed via cfg80211 and uses the one from EEPROM instead. On a lot of cheap cards this setting is CN. As a result the reg domain is set incorrectly (and for some countries illegally). [...] What do you mean by 'passed via cfg80211'? Are you setting the ieee80211_regdom module parameter? Ben. Yes. No effect. ath still reads from eeprom. -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 -- Archive: http://lists.debian.org/4da14eb0.10...@sigsegv.cx
Bug#621737: linux-image-2.6.32-5-powerpc: ath ignores regulatory domain setting
On 04/10/11 16:23, Stefan Lippers-Hollmann wrote: Hi On Sunday 10 April 2011, Anton Ivanov wrote: On 04/10/11 00:55, Ben Hutchings wrote: On Fri, 2011-04-08 at 13:39 +0100, Anton Ivanov wrote: [...] The ath driver ignores the reg domain setting passed via cfg80211 and uses the one from EEPROM instead. On a lot of cheap cards this setting is CN. As a result the reg domain is set incorrectly (and for some countries illegally). [...] What do you mean by 'passed via cfg80211'? Are you setting the ieee80211_regdom module parameter? [...] Yes. No effect. ath still reads from eeprom. The EEPROM settings are authoritative, you can only restrict the regulatory settings further to aid regulatory compliance in different regions, but never relax them. Tools like crda always intersect the EEPROM's (OTP in newer chipset generations) with the chosen regulatory domain as provided by wireless-regdb or the in-kernel regdb; regulatory hints like IEEE 802.11d may also restrict the allowed frequencies even further. http://wireless.kernel.org/en/users/Drivers/ath#Regulatory This is intended behaviour and required for FCC compliance (keep in mind that calibration data is also only validated for the given regdomain), not a bug. So a card that returns only CN from EEPROM is basically intended to be sold _ONLY_ in China. Right? Brgds, Regards Stefan Lippers-Hollmann -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 -- Archive: http://lists.debian.org/4da1f416.9000...@sigsegv.cx
Bug#621737: linux-image-2.6.32-5-powerpc: ath ignores regulatory domain setting
OK, cool, thanks, you can close the bug. I agree - thankfully it is not a 5GHz part so no harm done from it reporting CN. Brgds, On 04/10/11 20:08, Stefan Lippers-Hollmann wrote: Hi On Sunday 10 April 2011, Anton Ivanov wrote: On 04/10/11 16:23, Stefan Lippers-Hollmann wrote: Hi On Sunday 10 April 2011, Anton Ivanov wrote: On 04/10/11 00:55, Ben Hutchings wrote: On Fri, 2011-04-08 at 13:39 +0100, Anton Ivanov wrote: [...] Yes. No effect. ath still reads from eeprom. The EEPROM settings are authoritative, you can only restrict the regulatory settings further to aid regulatory compliance in different regions, but never relax them. Tools like crda always intersect the EEPROM's (OTP in newer chipset generations) with the chosen regulatory domain as provided by wireless-regdb or the in-kernel regdb; regulatory hints like IEEE 802.11d may also restrict the allowed frequencies even further. http://wireless.kernel.org/en/users/Drivers/ath#Regulatory This is intended behaviour and required for FCC compliance (keep in mind that calibration data is also only validated for the given regdomain), not a bug. So a card that returns only CN from EEPROM is basically intended to be sold _ONLY_ in China. Right? [...] Correct, it's arguably even illegal to sell in ETSI regions. Although it's technically a little more complex as Atheros groups regdom regions with identical mappings together[1], which makes reading the EEPROM based regulatory domain code a bit strange (the alphabetically first match corresponding to the regdom group gets printed to dmesg). In your particular case, with a 2.4 GHz-only AR2417 PHY, 0x52 (APL1_WORLD vs ETSI1_WORLD, GB) doesn't actually do any harm, as 'CN' allows channel 1-13 just as well as the most permissive regdomains (ch14 in Japan is only allowed for CSMA/CA == 11 MBit/s, not the more common OFDM rates (= 54 MBit/s)). 
So even though your device is wrongly programmed, it doesn't actually limit your abilities (unless you'd add an additional 5 GHz capable card, which would suffer from an 'unfortunate' intersection) - and neither allows you to access non-public frequency bands. This situation would be seriously worse (both technically and legally) for 5 GHz operations, but your device doesn't support that anyways.

country CN:
	(2402 - 2482 @ 40), (N/A, 20)
	(5735 - 5835 @ 40), (N/A, 30)

country GB:
	(2402 - 2482 @ 40), (N/A, 20)
	(5170 - 5250 @ 40), (N/A, 20)
	(5250 - 5330 @ 40), (N/A, 20), DFS
	(5490 - 5710 @ 40), (N/A, 27), DFS

However I'm aware of the sad truth that most commonly sold cards are wrongly programmed for CN or (worse for 2.4 GHz operations) US... Regards Stefan Lippers-Hollmann [1] http://wireless.kernel.org/en/users/Drivers/ath#line-28 -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 -- Archive: http://lists.debian.org/4da20b48.4070...@sigsegv.cx
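As a rough illustration of the intersection described in this message, the following Python sketch intersects the CN and GB frequency ranges quoted above. Only the (start, end) pairs in MHz come from the regdb entries in the message; the function name `intersect_ranges` and the data layout are illustrative assumptions and are not how crda or cfg80211 implement it.

```python
# Illustrative only: this is not the crda/cfg80211 implementation.
# Frequency ranges in MHz, copied from the regdb entries quoted above.
CN = [(2402, 2482), (5735, 5835)]
GB = [(2402, 2482), (5170, 5250), (5250, 5330), (5490, 5710)]


def intersect_ranges(a, b):
    """Return the overlapping parts of two lists of (start, end) ranges."""
    result = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start = max(a_start, b_start)
            end = min(a_end, b_end)
            if start < end:  # keep only non-empty overlaps
                result.append((start, end))
    return sorted(result)


if __name__ == "__main__":
    print(intersect_ranges(CN, GB))
```

With these inputs only the 2.4 GHz range survives, which is consistent with the remark that an additional 5 GHz capable card would suffer from the intersection while the 2.4 GHz-only AR2417 does not.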
Bug#621737: linux-image-2.6.32-5-powerpc: ath ignores regulatory domain setting
Package: linux-2.6 Version: 2.6.32-30 Severity: minor The ath driver ignores the reg domain setting passed via cfg80211 and uses the one from EEPROM instead. On a lot of cheap cards this setting is CN. As a result the reg domain is set incorrectly (and for some countries illegally). The dmesg output is self-explanatory. -- Package-specific info: ** Version: Linux version 2.6.32-5-powerpc (Debian 2.6.32-30) (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 Wed Jan 12 04:47:03 UTC 2011 ** Command line: root=/dev/hda4 ro ** Tainted: W (512) * Taint on warning. ** Kernel log: [37906.170686] PHY ID: 2060e1, addr: 0 [37908.211322] eth1: Airport waking up [37908.659045] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 [37908.659178] adb: starting probe task... [37908.660207] hda: UDMA/66 mode selected [37908.664146] hdc: host max PIO4 wanted PIO255(auto-tune) selected PIO4 [37908.664517] hdc: MWDMA2 mode selected [37908.907906] adb devices: [2]: 2 c4 [3]: 3 1 [7]: 7 1f [37908.913808] ADB keyboard at 2, handler 1 [37908.928828] ADB mouse at 3, handler set to 4 (trackpad) [37908.987527] adb: finished probe task... [37909.512236] PM: Finishing wakeup. [37909.512244] Restarting tasks ... done. 
[37909.986288] ath5k 0001:11:00.0: enabling device ( - 0002) [37909.986396] ath5k 0001:11:00.0: registered as 'phy1' [37910.772036] ath: EEPROM regdomain: 0x809c [37910.772046] ath: EEPROM indicates we should expect a country code [37910.772054] ath: doing EEPROM country-regdmn map search [37910.772061] ath: country maps to regdmn code: 0x52 [37910.772069] ath: Country alpha2 being used: CN [37910.772075] ath: Regpair used: 0x52 [37910.847157] agpgart-uninorth :00:0b.0: putting AGP V2 device into 4x mode [37910.847179] radeonfb :00:10.0: putting AGP V2 device into 4x mode [37910.874578] phy1: Selected rate control algorithm 'minstrel' [37910.878605] ath5k phy1: Atheros AR2417 chip found (MAC: 0xf0, PHY: 0x70) [37910.878634] cfg80211: Calling CRDA for country: CN [37912.612186] ondemand governor failed, too long transition latency of HW, fallback to performance governor [37988.110403] ondemand governor failed, too long transition latency of HW, fallback to performance governor [37988.359273] hda: host max PIO4 wanted PIO0 selected PIO0 [38096.963649] ADDRCONF(NETDEV_UP): wlan0: link is not ready [38102.988503] ADDRCONF(NETDEV_UP): wlan0: link is not ready [38103.192041] ADDRCONF(NETDEV_UP): eth0: link is not ready [38103.454283] ADDRCONF(NETDEV_UP): wlan0: link is not ready [38106.938822] wlan0: direct probe to AP 00:90:4c:91:00:03 (try 1) [38106.940456] wlan0: direct probe responded [38106.940465] wlan0: authenticate with AP 00:90:4c:91:00:03 (try 1) [38106.942155] wlan0: authenticated [38106.942187] wlan0: associate with AP 00:90:4c:91:00:03 (try 1) [38106.98] wlan0: RX AssocResp from 00:90:4c:91:00:03 (capab=0x411 status=0 aid=5) [38106.944457] wlan0: associated [38106.945747] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready [38153.913695] tun0: Disabled Privacy Extensions [38513.727694] pcmcia_socket pcmcia_socket0: pccard: card ejected from slot 0 [38513.763178] wlan0: deauthenticating from 00:90:4c:91:00:03 by local choice (reason=3) [38513.808570] ath5k 
phy1: failed to wakeup the MAC Chip [38514.658638] ADDRCONF(NETDEV_UP): eth0: link is not ready [38582.438927] cfg80211: Using static regulatory domain info [38582.438937] cfg80211: Regulatory domain: EU [38582.438941] (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp) [38582.438949] (2402000 KHz - 2482000 KHz @ 4 KHz), (600 mBi, 2000 mBm) [38582.438956] (517 KHz - 519 KHz @ 4 KHz), (600 mBi, 2300 mBm) [38582.438964] (519 KHz - 521 KHz @ 4 KHz), (600 mBi, 2300 mBm) [38582.438971] (521 KHz - 523 KHz @ 4 KHz), (600 mBi, 2300 mBm) [38582.438978] (523 KHz - 533 KHz @ 4 KHz), (600 mBi, 2000 mBm) [38582.438985] (549 KHz - 571 KHz @ 4 KHz), (600 mBi, 3000 mBm) [38582.442195] cfg80211: Calling CRDA for country: EU [38582.442360] cfg80211: Calling CRDA for country: EU [38590.296292] pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0 [38590.296354] pci 0001:11:00.0: reg 10 32bit mmio: [0x00-0x00] [38590.366503] ath5k 0001:11:00.0: enabling device ( - 0002) [38590.366594] ath5k 0001:11:00.0: registered as 'phy0' [38590.869302] ath: EEPROM regdomain: 0x809c [38590.869309] ath: EEPROM indicates we should expect a country code [38590.869315] ath: doing EEPROM country-regdmn map search [38590.869321] ath: country maps to regdmn code: 0x52 [38590.869326] ath: Country alpha2 being used: CN [38590.869331] ath: Regpair used: 0x52 [38590.870149] phy0: Selected rate control algorithm 'minstrel' [38590.902742] ath5k phy0: Atheros AR2417 chip found (MAC: 0xf0, PHY: 0x70) [38590.902765] cfg80211: Calling CRDA for country: CN [38591.064341] ADDRCONF(NETDEV_UP): wlan0: link is not ready [38593.360236] wlan0: direct probe to AP 00:90:4c:91:00:03 (try 1)
Bug#618744: linux-image-2.6.32-5-amd64: nfsd gets stuck in D state
Package: linux-2.6 Version: 2.6.32-30 Severity: important nfsd gets stuck in D state. Initially some machines, later all which read off the nfs server fail to read. Messages like: Mar 17 22:03:34 localhost kernel: [1899559.532028] statd: server rpc.statd not responding, timed out Mar 17 22:03:34 localhost kernel: [1899559.532055] lockd: cannot monitor greebo appear in the kernel log. I tried to restart nfs-kernel-daemon and got: Mar 17 22:26:14 localhost kernel: [1900919.668030] rpcbind: server localhost not responding, timed out Mar 17 22:26:15 localhost kernel: [1900920.452074] INFO: task rpc.nfsd:12072 blocked for more than 120 seconds. Mar 17 22:26:15 localhost kernel: [1900920.452150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Mar 17 22:26:15 localhost kernel: [1900920.452226] rpc.nfsd D 0 12072 12061 0x0004 Mar 17 22:26:15 localhost kernel: [1900920.452238] 814611f0 0082 880146d0 88007f803110 Mar 17 22:26:15 localhost kernel: [1900920.452249] 03961016 0001 f9e0 880035d3bfd8 Mar 17 22:26:15 localhost kernel: [1900920.452258] 00015780 00015780 880003863880 880003863b78 Mar 17 22:26:15 localhost kernel: [1900920.452267] Call Trace: Mar 17 22:26:15 localhost kernel: [1900920.452284] [810b9ea8] ? __alloc_pages_nodemask+0x11c/0x5f4 Mar 17 22:26:15 localhost kernel: [1900920.452303] [812fb05a] ? __mutex_lock_common+0x122/0x192 Mar 17 22:26:15 localhost kernel: [1900920.452315] [812fb182] ? mutex_lock+0x1a/0x31 Mar 17 22:26:15 localhost kernel: [1900920.452339] [a0f7ac44] ? write_ports+0x2a/0x28a [nfsd] Mar 17 22:26:15 localhost kernel: [1900920.452348] [810b92e4] ? __get_free_pages+0x9/0x46 Mar 17 22:26:15 localhost kernel: [1900920.452358] [81106ce3] ? simple_transaction_get+0x8c/0xa6 Mar 17 22:26:15 localhost kernel: [1900920.452375] [a0f7ac1a] ? write_ports+0x0/0x28a [nfsd] Mar 17 22:26:15 localhost kernel: [1900920.452392] [a0f7a971] ? nfsctl_transaction_write+0x43/0x64 [nfsd] Mar 17 22:26:15 localhost kernel: [1900920.452409] [a0f7b85a] ? 
nfsctl_transaction_read+0x27/0x4d [nfsd] Mar 17 22:26:15 localhost kernel: [1900920.452420] [810ef252] ? vfs_read+0xa6/0xff Mar 17 22:26:15 localhost kernel: [1900920.452428] [810ef367] ? sys_read+0x45/0x6e Mar 17 22:26:15 localhost kernel: [1900920.452437] [81010b42] ? system_call_fastpath+0x16/0x1b Mar 17 22:26:44 localhost kernel: [1900949.668032] rpcbind: server localhost not responding, timed out The machine is stable, has been used for around a year in the current hardware config with 2.6.26 and 2.6.32.bpo My suspicion is offload in e1000. Something similar to this one: http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg02489.html I have turned off all offload except checksumming for the time being. -- Package-specific info: ** Version: Linux version 2.6.32-5-amd64 (Debian 2.6.32-30) (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Wed Jan 12 03:40:32 UTC 2011 ** Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-amd64 root=UUID=49a2baa4-c4fb-4b25-a847-da38aabf6eb4 ro quiet ** Tainted: P (1) * Proprietary module has been loaded. ** Kernel log: [4.200149] raid0: == UNIQUE [4.200150] raid0: 1 zones [4.200151] raid0: looking at sdb3 [4.200153] raid0: comparing sdb3(1297270784) [4.200155] with sda3(1297270784) [4.200156] raid0: EQUAL [4.200158] raid0: FINAL 1 zones [4.200162] raid0: done. [4.200164] raid0 : md_size is 2594541568 sectors. [4.200167] *** md2 configuration * [4.200168] zone0=[sda3/sdb3/] [4.200172] zone offset=0kb device offset=0kb size=1297270784kb [4.200173] ** [4.200174] [4.200225] md2: detected capacity change from 0 to 1328405282816 [4.202838] md2: unknown partition table [4.461090] kjournald starting. Commit interval 5 seconds [4.461102] EXT3-fs: mounted filesystem with ordered data mode. 
[5.654925] udev[404]: starting version 164 [6.032719] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input2 [6.032728] ACPI: Power Button [PWRB] [6.032795] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3 [6.032798] ACPI: Power Button [PWRF] [6.038008] input: PC Speaker as /devices/platform/pcspkr/input/input4 [6.046956] processor LNXCPU:00: registered as cooling_device0 [6.047038] pci_hotplug: PCI Hot Plug PCI Core version: 0.5 [6.150018] EDAC MC: Ver: 2.1.0 Jan 12 2011 [6.170251] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [6.208884] parport_pc 00:07: reported by Plug and Play ACPI [6.208964] parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE] [6.295118] EDAC amd64_edac: Ver: 3.2.0
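The symptom in this report, a task stuck in uninterruptible sleep (state D), can also be checked for from userspace. The following is a minimal sketch, assuming the standard Linux /proc/<pid>/stat layout (pid, command name in parentheses, then a one-letter state); the helper names `parse_stat` and `blocked_tasks` are hypothetical and not part of any existing tool.

```python
import os


def parse_stat(stat_text):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')'.
    left, right = stat_text.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_text), comm, state


def blocked_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

This only lists which processes are blocked; it does not say where they are blocked, so it complements rather than replaces the kernel's hung-task backtrace shown in the report.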
Bug#613225: Acknowledgement (linux-image-2.6.32: fails to suspend to RAM)
You can close this one or downgrade it. There is a problem somewhere in the suspend/resume, but it is too esoteric and hard to reproduce. The upgrade to KDE4 left a total mess in various KDE cache files, resulting in the machine keeping files open on autofs NFS across a VPN link and not unmounting them (as it should) before trying to suspend. So on the negative side - if there are files open on an automounted NFS mount there may be circumstances where the kernel can go berserk when it tries to suspend. On the positive side - once the KDE cache, history, etc. were cleared and the files were not being accessed any more the machine started suspending properly once again. I have not been able to reproduce this issue from that point onwards. Brgds, On 13/02/11 16:03, Debian Bug Tracking System wrote: Thank you for filing a new Bug report with Debian. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): Debian Kernel Team debian-kernel@lists.debian.org If you wish to submit further information on this problem, please send it to 613...@bugs.debian.org. Please do not send mail to ow...@bugs.debian.org unless you wish to report a problem with the Bug-tracking system. -- Archive: http://lists.debian.org/4d6a4887.1010...@sigsegv.cx
Bug#613225: linux-image-2.6.32: fails to suspend to RAM
Package: linux-2.6 Version: 2.6.32-30 Severity: normal File: linux-image-2.6.32 Current .32 and last BPO for lenny prior to squeeze release fail to suspend to RAM most of the time. Up to 2.6.32-15~bpo50+1 suspend was flawless. The machine is a TiBook G4 1GHz with 1G of RAM and a new harddrive. -- Package-specific info: ** Version: Linux version 2.6.32-5-powerpc (Debian 2.6.32-30) (b...@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 Wed Jan 12 04:47:03 UTC 2011 ** Command line: root=/dev/hda4 ro ** Not tainted ** Kernel log: [ 19.862778] airport 0.0003:radio: WEP supported, 104-bit key [ 19.870813] airport 0.0003:radio: WPA-PSK supported [ 22.229679] Adding 1464836k swap on /dev/hda3. Priority:-1 extents:1 across:1464836k [ 22.540964] EXT3 FS on hda4, internal journal [ 23.024549] loop: module loaded [ 23.146664] SCSI subsystem initialized [ 23.596933] irq: irq 1 on host /pci@f200/mac-io@17/interrupt-controller@4 mapped to virtual irq 30 [ 23.596974] irq: irq 2 on host /pci@f200/mac-io@17/interrupt-controller@4 mapped to virtual irq 31 [ 23.604578] irq: irq 61 on host /pci@f200/mac-io@17/interrupt-controller@4 mapped to virtual irq 61 [ 24.019623] input: PowerMac Beep as /devices/pci0001:10/0001:10:17.0/input/input5 [ 26.930734] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 27.427036] RPC: Registered udp transport module. [ 27.433683] RPC: Registered tcp transport module. [ 27.440302] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 27.518815] Slow work thread pool: Starting up [ 27.525637] Slow work thread pool: Ready [ 27.532918] FS-Cache: Loaded [ 27.652868] FS-Cache: Netfs 'nfs' registered for caching [ 27.743174] Installing knfsd (copyright (C) 1996 o...@monad.swb.de). [ 35.221394] ondemand governor failed, too long transition latency of HW, fallback to performance governor [ 54.138182] svc: failed to register lockdv1 RPC service (errno 97). 
[ 54.148002] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 54.188930] NFSD: starting 90-second grace period [ 56.203753] hda: host max PIO4 wanted PIO0 selected PIO0 [ 58.267898] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 60.748975] Bluetooth: Core ver 2.15 [ 60.762837] NET: Registered protocol family 31 [ 60.770678] Bluetooth: HCI device and connection manager initialized [ 60.778526] Bluetooth: HCI socket layer initialized [ 60.838194] Bluetooth: L2CAP ver 2.14 [ 60.845924] Bluetooth: L2CAP socket layer initialized [ 60.896361] Bluetooth: RFCOMM TTY layer initialized [ 60.904497] Bluetooth: RFCOMM socket layer initialized [ 60.912310] Bluetooth: RFCOMM ver 1.11 [ 61.010434] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 61.018283] Bluetooth: BNEP filters: protocol multicast [ 61.226590] Bridge firewalling registered [ 61.331182] Bluetooth: SCO (Voice Link) ver 0.6 [ 61.339212] Bluetooth: SCO socket layer initialized [ 61.747515] lp: driver loaded but no devices found [ 64.940332] nf_conntrack version 0.5.0 (16120 buckets, 64480 max) [ 64.959457] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use [ 64.967574] nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or [ 64.975699] sysctl net.netfilter.nf_conntrack_acct=1 to enable it. [ 65.205175] ip_tables: (C) 2000-2006 Netfilter Core Team [ 65.717038] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 66.171018] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 66.395982] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 68.261963] radeonfb :00:10.0: Invalid ROM contents [ 68.262528] radeonfb :00:10.0: Invalid ROM contents [ 68.421552] [drm] Initialized drm 1.1.0 20060810 [ 68.654781] [drm] radeon defaulting to userspace modesetting. 
[ 68.660143] [drm] Initialized radeon 1.32.0 20080528 for :00:10.0 on minor 0 [ 68.832689] eth1: Lucent/Agere firmware doesn't support manual roaming [ 69.047762] agpgart-uninorth :00:0b.0: putting AGP V2 device into 4x mode [ 69.047783] radeonfb :00:10.0: putting AGP V2 device into 4x mode [ 69.313279] [drm] Setting GART location based on new memory map [ 69.313426] [drm] Loading R200 Microcode [ 69.315238] platform radeon_cp.0: firmware: requesting radeon/R200_cp.bin [ 69.389693] [drm] writeback test succeeded in 1 usecs [ 73.564031] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 73.909699] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 74.085192] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 75.425353] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 75.753039] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 76.267197] eth1: Lucent/Agere firmware doesn't support manual roaming [ 82.122204] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 82.314088] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 82.410441] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 83.511624] ADDRCONF(NETDEV_UP): eth1:
Bug#611622: linux-image-2.6.32-bpo.5-686: VM problems
Hi Ben, You were correct. It is offload, and it is X and/or pulse which is throwing enough TCP at the system to trigger the memory allocation failures. You can close the bug now. Turning off all offloads except checksumming looks like a valid workaround. I have had the system running for a while. The memory allocation failures should have shown up by now. It may be worth having an init script as part of the ethtool package which sets offloads and defaults to turning off segmentation offloads if there is no swap. I will be happy to write it, if you and the ethtool maintainer think it is a good idea. Brgds, -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 -- Archive: http://lists.debian.org/4d52c09a.70...@sigsegv.cx
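The init-script idea in this message could look roughly like the following sketch. It assumes the /proc/swaps layout (one header line, then one line per active swap area) and the `ethtool -K <dev> <feature> off` syntax that appears elsewhere in this bug log; the function names and the choice of features (tso, gso, gro) are illustrative assumptions, not an agreed design and not part of the ethtool package.

```python
def has_swap(swaps_text):
    """True if /proc/swaps content lists at least one active swap area."""
    lines = [line for line in swaps_text.splitlines() if line.strip()]
    return len(lines) > 1  # the first line is the column header


def offload_commands(interface, swaps_text, features=("tso", "gso", "gro")):
    """Build the ethtool invocation to run on a swapless machine."""
    if has_swap(swaps_text):
        return []  # swap is present: leave the driver defaults alone
    command = ["ethtool", "-K", interface]
    for feature in features:
        command += [feature, "off"]
    return [command]


if __name__ == "__main__":
    with open("/proc/swaps") as handle:
        for command in offload_commands("eth0", handle.read()):
            print(" ".join(command))  # print only; run it by hand as root
```

The sketch deliberately prints the command instead of executing it, so the policy (which features to disable, and on which interface) can be reviewed before anything is changed.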
Bug#611622: linux-image-2.6.32-bpo.5-686: VM problems
Ben Hutchings wrote: On Sat, 2011-02-05 at 13:45 +, Anton Ivanov wrote: Ben Hutchings wrote: On Mon, 2011-01-31 at 11:16 +, Anton Ivanov wrote: Package: linux-2.6 Version: 2.6.32-30~bpo50+1 Severity: normal I keep getting VM failure messages. I suspect the machine is simply a bit too slow for the network card which is in it. It is a VIA Nehemiah at 1.7GHz with an extra Intel GigE server adapter. The backtraces appear to show problems in the network receive/xmit routines. This is an allocation failure for a *huge* allocation (order 5 = 128 KB chunk) in atomic (non-sleeping) context. I think this may be related to (1) use of GRO on the receive path to coalesce packets (2) a netfilter/iptables rule that requires the packet to be duplicated, or requires the contents to be made contiguous. 1. Do you mean gso? I do not see gro as an option on ethtool. I mean what I said. Install ethtool from squeeze. Understood. Will test and submit results. 2. I think I know the culprit. I have recently made the machine double up as an X-term. Some pixmap updates can easily pass around chunks that size. I have a couple of other systems with similar hardware so I will see if I can reproduce it with them. That doesn't require contiguous blocks. But it will still reduce the amount of free memory. 3. While the machine has a few netfilter rules they are all on another interface (towards a wifi AP) and it does not do any NAT so no need to reconstruct packets. That's strange. The only traffic of note the machine has is NFS, Xterm and a bit of mysql from time to time. NFS is mostly read and clients use -orsize=4096 [snip] Really, you think Linux hasn't improved in 7 years? Oh it has. It is now much better at handling failed hardware/hardware gone away. Fair point. I will test exactly how it looks nowadays if you swap to a device and the device suddenly goes away. Ben. Brgds, -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 -- Archive: http://lists.debian.org/4d4e572a.4040...@sigsegv.cx
Bug#611622: linux-image-2.6.32-bpo.5-686: VM problems
Ben Hutchings wrote: On Mon, 2011-01-31 at 11:16 +, Anton Ivanov wrote: Package: linux-2.6 Version: 2.6.32-30~bpo50+1 Severity: normal I keep getting VM failure messages. I suspect the machine is simply a bit too slow for the network card which is in it. It is a via Nehemia at 1.7GHz with an extra Intel GigE server adapter. The backtraces look like showing problems in the network receive/xmit routines. This is an allocation failure for a *huge* allocation (order 5 = 128 KB chunk) in atomic (non-sleeping) context. I think this may be related to (1) use of GRO on the receive path to coalesce packets (2) a netfilter/iptables rule that requires the packet to be duplicated, or requires the contents to be made contiguous. 1. Do you mean gso? I do not see gro as an option on ethtool. 2. I think I know the culprit. I have recently made the machine double up as an X-term. Some pixmap updates can easily pass around chunks that size. I have a couple of other systems with similar hardware so I will see if I can reproduce it with them. 3. While the machine has a few netfilter rules they are all on another interface (towards a wifi AP) and it does not do any NAT so no need to reconstruct packets. The machine is swapless and is used mostly as an NFS server. It was not showing this behaviour under 2.6.26 [...] Probably because e1000 did not use LRO or GRO there. You can test this by turning off GRO with 'ethtool -K eth0 gro off'. However I would also recommend configuring the machine with some swap space. The kernel has trouble defragmenting memory without swapping. It is my always-on server with everything raid-ed. If I configure swap, reliability goes out of the window. I made that mistake once a while back (7 years or so) and it ended up causing some serious damage. The only way to get swap for it is hardware RAID. Ben. -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. 
Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 Archive: http://lists.debian.org/4d4d545e.3010...@sigsegv.cx
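As a quick check of the "order 5 = 128 KB" figure quoted in this thread: the page allocator hands out blocks of 2^order physically contiguous pages, so the block size follows directly from the page size. A minimal sketch in Python, assuming 4 kB pages (consistent with the 1*4kB entries in the kernel log of the original report); the function name is made up for illustration.

```python
PAGE_SIZE_KB = 4  # assumption: 4 kB pages, as on 32-bit x86


def order_to_kb(order, page_size_kb=PAGE_SIZE_KB):
    """Size in kB of one contiguous block of 2**order pages."""
    return (1 << order) * page_size_kb


# Print the block size for each order up to the one in the failure message.
for order in range(6):
    print(f"order {order}: {order_to_kb(order)} kB")
```

Under that assumption an order-5 request needs 32 contiguous pages, which is why it can fail even when plenty of single pages are free.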
Bug#611622: linux-image-2.6.32-bpo.5-686: VM problems
Package: linux-2.6 Version: 2.6.32-30~bpo50+1 Severity: normal I keep getting VM failure messages. I suspect the machine is simply a bit too slow for the network card which is in it. It is a via Nehemia at 1.7GHz with an extra Intel GigE server adapter. The backtraces look like showing problems in the network receive/xmit routines. The machine is swapless and is used mostly as an NFS server. It was not showing this behaviour under 2.6.26 Best Regards, -- Package-specific info: ** Version: Linux version 2.6.32-bpo.5-686 (Debian 2.6.32-30~bpo50+1) (norb...@tretkowski.de) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 SMP Tue Jan 18 23:27:36 UTC 2011 ** Command line: auto BOOT_IMAGE=Lin_2.6.32-bpo ro root=900 acpi_enforce_resources=lax ** Not tainted ** Kernel log: [120567.850253] HighMem: 1*4kB 5*8kB 0*16kB 1*32kB 2*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 204kB [120567.850276] 171605 total pagecache pages [120567.850281] 0 pages in swap cache [120567.850286] Swap cache stats: add 0, delete 0, find 0/0 [120567.850291] Free swap = 0kB [120567.850295] Total swap = 0kB [120567.867575] 245472 pages RAM [120567.867584] 19186 pages HighMem [120567.867588] 3410 pages reserved [120567.867592] 29182 pages shared [120567.867596] 216768 pages non-shared [120567.867696] swapper: page allocation failure. order:5, mode:0x4020 [120567.867705] Pid: 0, comm: swapper Not tainted 2.6.32-bpo.5-686 #1 [120567.867710] Call Trace: [120567.867731] [c108c099] ? __alloc_pages_nodemask+0x484/0x4d9 [120567.867743] [c108c0fa] ? __get_free_pages+0xc/0x17 [120567.867752] [c10ae8fe] ? __kmalloc+0x30/0x128 [120567.867763] [c11d29ec] ? pskb_expand_head+0x4f/0x157 [120567.867772] [c11d2e2f] ? __pskb_pull_tail+0x40/0x1f6 [120567.867786] [c11d9ad4] ? dev_queue_xmit+0xe4/0x38e [120567.867801] [c11fb191] ? ip_finish_output+0x0/0x5c [120567.867810] [c11fb156] ? ip_finish_output2+0x187/0x1c2 [120567.867820] [c11fa657] ? ip_local_out+0x15/0x17 [120567.867829] [c11fae38] ? 
ip_queue_xmit+0x31e/0x379 [120567.867838] [c10add2d] ? __slab_alloc+0x97/0x431 [120567.867849] [c126e058] ? _spin_lock_bh+0x8/0x1e [120567.867870] [f8775a86] ? __nf_ct_refresh_acct+0x66/0xa4 [nf_conntrack] [120567.867884] [c1209e8e] ? tcp_transmit_skb+0x595/0x5cc [120567.867894] [c120bef2] ? tcp_write_xmit+0x7a3/0x874 [120567.867903] [c1207926] ? tcp_ack+0x1611/0x1802 [120567.867912] [c120921b] ? tcp_established_options+0x1d/0x8b [120567.867921] [c12094df] ? tcp_current_mss+0x38/0x53 [120567.867931] [c120c009] ? __tcp_push_pending_frames+0x1e/0x50 [120567.867940] [c1207b32] ? tcp_data_snd_check+0x1b/0xd2 [120567.867949] [c12081d1] ? tcp_rcv_established+0xd2/0x626 [120567.867960] [c120e958] ? tcp_v4_do_rcv+0x15f/0x2cf [120567.867970] [c120ee9a] ? tcp_v4_rcv+0x3d2/0x602 [120567.867980] [c11f71d6] ? ip_local_deliver_finish+0x10c/0x18c [120567.867989] [c11f6dfc] ? ip_rcv_finish+0x2c4/0x2d8 [120567.867999] [c11d8d99] ? netif_receive_skb+0x3bb/0x3d6 [120567.868095] [f7c6ca2c] ? e1000_clean_rx_irq+0x351/0x400 [e1000] [120567.868130] [f7c703c6] ? e1000_clean+0x29f/0x40d [e1000] [120567.868142] [c104684c] ? hrtimer_get_next_event+0x8c/0xa0 [120567.868155] [c103b2df] ? get_next_timer_interrupt+0x190/0x1fb [120567.868165] [c1007569] ? sched_clock+0x5/0x7 [120567.868175] [c1047beb] ? sched_clock_local+0x15/0x11b [120567.868184] [c11d9319] ? net_rx_action+0x96/0x194 [120567.868196] [c10354dc] ? __do_softirq+0xaa/0x151 [120567.868205] [c10355b4] ? do_softirq+0x31/0x3c [120567.868213] [c103568a] ? irq_exit+0x26/0x58 [120567.868225] [c1004699] ? do_IRQ+0x78/0x89 [120567.868234] [c10037f0] ? common_interrupt+0x30/0x38 [120567.868250] [c101a818] ? native_safe_halt+0x2/0x3 [120567.868259] [c1008597] ? default_idle+0x3c/0x5a [120567.868267] [c1002389] ? cpu_idle+0x89/0xa4 [120567.868279] [c13bf7fc] ? 
start_kernel+0x318/0x31d [120567.868284] Mem-Info: [120567.868288] DMA per-cpu: [120567.868293] CPU0: hi:0, btch: 1 usd: 0 [120567.868298] Normal per-cpu: [120567.868304] CPU0: hi: 186, btch: 31 usd: 43 [120567.868308] HighMem per-cpu: [120567.868314] CPU0: hi: 18, btch: 3 usd: 2 [120567.868327] active_anon:21435 inactive_anon:28240 isolated_anon:0 [120567.868331] active_file:7502 inactive_file:163906 isolated_file:0 [120567.868334] unevictable:0 dirty:30 writeback:0 unstable:0 [120567.868338] free:4413 slab_reclaimable:3626 slab_unreclaimable:1704 [120567.868341] mapped:1903 shmem:189 pagetables:547 bounce:0 [120567.868359] DMA free:3556kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:904kB active_file:8kB inactive_file:10832kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15804kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:104kB slab_unreclaimable:248kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [120567.868375] lowmem_reserve[]: 0 861 935 935 [120567.868397] Normal
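The per-zone free list at the top of the kernel log above can be read mechanically. The sketch below, in Python, parses the one complete free-list line visible in this report (HighMem) and shows that the zone had 204 kB free in total but no free block larger than 64 kB. It is only an illustration of fragmentation: the Normal zone line is cut off in this archive, so nothing is claimed about the zone the failed allocation was actually served from. The function name is an assumption made for the example.

```python
import re


def parse_free_list(line):
    """Map block size in kB to the number of free blocks of that size."""
    # Entries look like "5*8kB"; the trailing "= 204kB" total has no '*'
    # and is therefore not matched.
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


line = ("HighMem: 1*4kB 5*8kB 0*16kB 1*32kB 2*64kB 0*128kB 0*256kB "
        "0*512kB 0*1024kB 0*2048kB 0*4096kB = 204kB")

free = parse_free_list(line)
total_kb = sum(size * count for size, count in free.items())
largest_kb = max((size for size, count in free.items() if count), default=0)

print(f"total free: {total_kb} kB, largest free block: {largest_kb} kB")
```

The computed total agrees with the "= 204kB" the kernel printed, which is a useful sanity check when reading such lines by hand.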
Bug#610859: linux-image-2.6.32-bpo.5: Sound unusable on 1GHz Powerbook (TiBook)
Package: linux-2.6 Version: 2.6.32-15~bpo50+1 Severity: normal File: linux-image-2.6.32-bpo.5 Sound module which is unusable. Any playback starts OK, but gets interrupted with hissing/ticking by other activity. Moving windows, disk activity, etc cause sound interruptions up to a couple of seconds in length. With hands off there is the occasional hiss-n-tick I have tried both the AOA and powermac sound modules. AOA does not detect the onboard sound card. Powermac detects it, but hisses/ticks. I am observing the same with 2.6.26 and with 2.6.32 from backports. -- Package-specific info: ** Version: Linux version 2.6.32-bpo.5-powerpc (Debian 2.6.32-15~bpo50+1) (norb...@tretkowski.de) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 Sat Jun 12 11:36:21 UTC 2010 ** Command line: root=/dev/hda4 ro ** Not tainted ** Kernel log: [ 119.408798] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 119.634006] eth1: Lucent/Agere firmware doesn't support manual roaming [ 119.705128] eth1: New link status: Connected (0001) [ 119.705685] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready [ 128.768782] tun: Universal TUN/TAP device driver, 1.6 [ 128.768793] tun: (C) 1999-2004 Max Krasnyansky m...@qualcomm.com [ 128.771214] tun0: Disabled Privacy Extensions [ 130.690459] eth1: no IPv6 routers present [ 192.420338] eth1: New link status: AP Out of Range (0004) [ 192.476681] irq: irq 1 on host /pci@f200/mac-io@17/interrupt-controller@4 mapped to virtual irq 30 [ 192.476723] irq: irq 2 on host /pci@f200/mac-io@17/interrupt-controller@4 mapped to virtual irq 31 [ 192.522927] eth1: New link status: AP In Range (0005) [ 192.577697] irq: irq 61 on host /pci@f200/mac-io@17/interrupt-controller@4 mapped to virtual irq 61 [ 192.991053] input: PowerMac Beep as /devices/pci0001:10/0001:10:17.0/input/input5 [ 309.309899] eth1: New link status: AP Out of Range (0004) [ 310.245024] eth1: New link status: AP In Range (0005) [ 453.575083] eth1: New link status: AP Out of Range (0004) [ 453.673571] eth1: New link 
status: AP In Range (0005) [ 463.199890] eth1: New link status: AP Out of Range (0004) [ 463.298378] eth1: New link status: AP In Range (0005) [ 914.766359] eth1: New link status: AP Out of Range (0004) [ 914.867941] eth1: New link status: AP In Range (0005) [ 1760.135494] input: PowerMac Beep as /devices/pci0001:10/0001:10:17.0/input/input6 [ 2101.517585] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 2101.811146] eth1: New link status: Connected (0001) [ 2101.811702] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready [ 2107.495653] eth1: New link status: Disconnected (0002) [ 2108.696631] eth1: New link status: Connected (0001) [ 2112.075720] eth1: no IPv6 routers present [ 2112.875570] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 2113.011062] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 2113.250253] eth1: Lucent/Agere firmware doesn't support manual roaming [ 2113.941866] eth1: New link status: Connected (0001) [ 2113.942576] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready [ 2119.300944] eth1: New link status: Disconnected (0002) [ 2120.489911] eth1: New link status: Connected (0001) [ 2123.945793] eth1: no IPv6 routers present [ 2139.908804] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 2140.086915] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 2140.347740] eth1: Lucent/Agere firmware doesn't support manual roaming [ 2140.589689] eth1: New link status: Connected (0001) [ 2140.590395] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready [ 2146.189335] eth1: New link status: Disconnected (0002) [ 2147.406630] eth1: New link status: Connected (0001) [ 2151.312134] eth1: no IPv6 routers present [ 2304.962355] ADDRCONF(NETDEV_UP): eth1: link is not ready [ 2305.191994] eth1: New link status: Connected (0001) [ 2305.192556] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready [ 2310.687020] eth1: New link status: Disconnected (0002) [ 2312.007900] eth1: New link status: Connected (0001) [ 2315.490011] eth1: no IPv6 routers present [ 2316.217673] usb 2-1: new full 
speed USB device using ohci_hcd and address 2 [ 2316.426840] usb 2-1: New USB device found, idVendor=19d2, idProduct=0103 [ 2316.426856] usb 2-1: New USB device strings: Mfr=3, Product=2, SerialNumber=4 [ 2316.426866] usb 2-1: Product: ZTE WCDMA Technologies MSM [ 2316.426874] usb 2-1: Manufacturer: ZTE,Incorporated [ 2316.426882] usb 2-1: SerialNumber: P673A3H3GD01 [ 2316.428864] usb 2-1: configuration #1 chosen from 1 choice [ 2317.081553] Initializing USB Mass Storage driver... [ 2317.084319] scsi0 : SCSI emulation for USB Mass Storage devices [ 2317.086268] usbcore: registered new interface driver usb-storage [ 2317.086278] USB Mass Storage support registered. [ 2317.093232] usb-storage: device found at 2 [ 2317.093239] usb-storage: waiting for device to settle before scanning [ 2318.651190] usb 2-1: USB disconnect, address 2 [ 2319.080219] usb 2-1: new full speed USB device using ohci_hcd and address 3 [
Bug#610859: linux-image-2.6.32-bpo.5: Sound unusable on 1GHz Powerbook (TiBook)
Ben Hutchings wrote: On Sun, 2011-01-23 at 12:55 +, Anton Ivanov wrote: Package: linux-2.6 Version: 2.6.32-15~bpo50+1 Why are you using a 7 month old version? Try the current version (2.6.32-30). Ben. The current one does not manifest the problem. You can close this bug. Apologies, I should have updated to the latest first instead of looking at the sources and wondering why the hell it is doing this :) Brgds, Archive: http://lists.debian.org/4d3c672b.9090...@sigsegv.cx
Bug#534444: Acknowledgement (linux-image-2.6.26-2-486: CBQ broken)
I thought we had it closed :) I have not tried with anything past 2.6.26. 2.6.26 as discussed before on this bug has two problems: 1. Counterintuitive use of the word borrow in the stats. 2. Bad estimator precision compared to 2.6.18 and older kernels. If you know about both you can work around them. I do not believe that the use of Borrow will be changed any time soon. It will be interesting to see if the timers have improved, but I have no test rig to test it right now. Brgds, Moritz Muehlenhoff wrote: tags 53 moreinfo thanks On Wed, Jun 24, 2009 at 11:08:36PM +0100, Ben Hutchings wrote: On Wed, 2009-06-24 at 16:42 +0100, Anton Ivanov wrote: Additional information. It does not do it on all classes. I can observe it on a particular class parented to the root CBQ qdisc with multiple burstable children. isolated put on another class parented to the root qdisc is similarly ignored. I will try to dig through the source to see exactly where the bug is, but it is definitely a bug (the results of tc are confirmed by delay measurements and bandwidth measurements in the relevant classes). Please also test whether the configuration you're trying is still broken in kernel version 2.6.30 (from unstable). If so, please report this bug upstream on bugzilla.kernel.org. Anton, does this still occur with current kernels? Did you report it in the kernel.org bugzilla? Cheers, Moritz -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 Archive: http://lists.debian.org/4c39a35b.5030...@sigsegv.cx
Bug#576405: Acknowledgement (linux-image-2.6.26: Deadlock during combined NFS3/NFS4 use)
Just remembered something - I did quite a lot of copying off exports manually mounted via v3 which were at the same time accessed elsewhere via v4. So this is probably autofs related and needs to be mounted under an autofs point to be triggered. The only reference I have been able to find to something similar was in the fedora bugs for v9. By the look of it using autofs to set up a Unix workstation network properly has become something of a lost art :( The setup can be found in detail at: http://foswiki.sigsegv.cx/bin/view/Net/LinuxNFSv4 I have used the v3 setup in production environments with tens of users and on my home network for many years and the only reason to look at v4 at all is that most apps like firefox, etc have now moved to use sqlite which locks/unlocks like crazy so v3 starts hitting performance limitations. Brgds, On Sun, 2010-04-04 at 09:33 +, Debian Bug Tracking System wrote: Thank you for filing a new Bug report with Debian. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): unknown-pack...@qa.debian.org If you wish to submit further information on this problem, please send it to 576...@bugs.debian.org. Please do not send mail to ow...@bugs.debian.org unless you wish to report a problem with the Bug-tracking system. -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Archive: http://lists.debian.org/1270377747.18041.7.ca...@moonbird.sigsegv.cx
Bug#576405: linux-image-2.6.26: Deadlock during combined NFS3/NFS4 use
Package: linux-image-2.6.26 Version: nfsfix.1 Severity: important

When a directory is exported and mounted via autofs using BOTH NFSv3 and NFSv4 the NFSv4 one deadlocks.

Setup - transition from v3 to v4.

System A is still perusing the old map:

cat /etc/auto.local | grep iPodResolution
iPodResolution -rsize=4096,wsize=4096,rw eden:/exports/md4/videoiPod

System B is using the newer version of the same map with a v4 mount:

cat /var/yp/auto.local | grep iPodResolution
iPodResolution -fstype=nfs4 eden:/md4/videoiPod

If system B is writing to the mount and A is reading from it B starts getting I/O errors/BAD FDs. If B is running from disk lots of things fail. If B is running diskless - total lock up.

Same setup with V3 only has been working flawlessly for 5+ years. Same setup with V4 only (when the v3 machines are off) seems to work OK as well. Fairly easy to reproduce.

I am not sure at this point if autofs has any role in this and I will not be in a position to retest until 19th. Apologies.

Tested with: stock debian, older version with just the nfs regression fix, stock recompiled with preempt, 686 and 486 versions.

-- System Information:
Debian Release: 5.0.3
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Kernel: Linux 2.6.26-2-686 (SMP w/1 CPU core)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26 depends on:
ii coreutils 6.10-6 The GNU core utilities
ii debconf [debconf-2.0] 1.5.24 Debian configuration management sy

linux-image-2.6.26 recommends no packages.

Versions of packages linux-image-2.6.26 suggests:
ii fdutils 5.5-20060227-3 Linux floppy utilities
pn ksymoops none (no description available)
pn linux-doc-2.6.26 | linux- none (no description available)

-- debconf-show failed
Archive: http://lists.debian.org/2010040408.10335.29590.report...@eden.sigsegv.cx
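To make the difference between the two map entries in this report explicit, here is a small sketch in Python that splits an autofs map line into key, mount options and location. It assumes the common "key [-options] location" layout and ignores multi-mount and substitution syntax; the function name is hypothetical.

```python
def parse_map_entry(line):
    """Split 'key [-opt[=val],...] location...' into its three parts."""
    key, *rest = line.split()
    options = {}
    if rest and rest[0].startswith("-"):
        # Drop the leading '-' and split the comma-separated option list.
        for opt in rest.pop(0)[1:].split(","):
            name, _, value = opt.partition("=")
            options[name] = value or True  # bare options such as 'rw'
    return key, options, rest


system_a = "iPodResolution -rsize=4096,wsize=4096,rw eden:/exports/md4/videoiPod"
system_b = "iPodResolution -fstype=nfs4 eden:/md4/videoiPod"

for entry in (system_a, system_b):
    print(parse_map_entry(entry))
```

Read this way, the two clients use the same key but differ in both the filesystem type and the path they request from the server eden, which matches the v3-to-v4 transition described in the report.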
Bug#576405: linux-image-2.6.26: Deadlock during combined NFS3/NFS4 use
On Sun, 2010-04-04 at 20:40 +0100, Ben Hutchings wrote: On Sun, 2010-04-04 at 09:44 +0100, Anton Ivanov wrote: Package: linux-image-2.6.26 Version: nfsfix.1 What does that version mean? Have you applied your own patches? Can you reproduce this with an official kernel package? That is a rather old official kernel with only the NFS regression fix applied (the one I dug out a while back - bug 524199). It worked and I have not touched it since. That is just one machine on which I am observing it. I also see it on several other machines with: 1. Latest Official - 486, 686 2. Rebuilt with only preemption enabled 3. Official as released only with the NFS regression fix applied (524199) I have not done a full matrix of all client/server options for these as there are quite a few possible combinations. I can reboot all of the machines in question into official and retest formally after I am back from holidays on the 19th. I would not expect to see anything particularly different though (even on the ones that do not run official the difference from official is marginal - a few option tweaks and/or official patches from debian-security applied by hand to older kernels). -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: anton.iva...@kot-begemot.co.uk WWW: http://www.kot-begemot.co.uk/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 Archive: http://lists.debian.org/1270412451.30401.10.ca...@localhost.localdomain
Bug#534430: Info received (linux-image-2.6.26: CBQ broken)
Sure, that is the same bug. I actually thought that I was updating that one when submitting the recent bug reports. Close please. -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: anton.iva...@kot-begemot.co.uk WWW: http://www.kot-begemot.co.uk/ Archive: http://lists.debian.org/1266659595.31060.0.ca...@vorlon.sigsegv.cx
Bug#534430: Info received (linux-image-2.6.26: CBQ broken)
On Sun, 2010-01-31 at 13:12 +0100, Moritz Muehlenhoff wrote: On Wed, Jan 27, 2010 at 11:38:07AM +, Anton Ivanov wrote: Sorry, ignore my previous email. I think I got to it, for whatever reason it is not getting set in cbq_set_lss(), just can't figure out what is wrong. Anton, as per your posting on linux-netdev I understand this bug can be closed? Yes. It is bad English in the output of tc combined with bad timing since the kernel has gone to high perf timers. 2.6.9 and even 2.6.18 delivered considerably better traffic shaping performance. | I am using CBQ myself with recent kernels and never found it | 'borrowing', could you post a copy of your rules, or better, a subset of | them demonstrating the problem ? | | Actually after going through it several times and looking at the code | Jarek pointed out it doesn't. | | Just the stats are very confusing and precision is not particularly | great. Cheers, Moritz -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: anton.iva...@kot-begemot.co.uk WWW: http://www.kot-begemot.co.uk/
Bug#534430: linux-image-2.6.26: CBQ broken
I have finally gotten around to looking at it properly (it has been annoying me all morning so I had no choice but to get to it). I cannot see any way for the current kernel code to work. It sets borrow to be _ALWAYS_ equal to the parent on line 2077 of sch_cbq.c. For a bounded class it should change the used bandwidth in the parent as in the other bits of code around this part and after that set it to NULL. That bit of code is completely missing. I have just downloaded 33-rc5 and will look to see if it is there. Brgds, -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#534430: linux-image-2.6.26: CBQ broken
I think I found the breakage. Have a look at cbq_set_lss() Instead of checking the flags with and working based on that it actually ANDs the flags every time (which if the flag is not already set results in an eternal false). I am rebuilding it at the moment after replacing the offending with . If it works after that we can hopefully consider this one closed. Brgds, On Wed, 2010-01-27 at 10:49 +, Anton Ivanov wrote: I have finally gotten around to looking at it properly (it has been annoying me all morning so I had no choice but to get to it). I cannot see any way for the current kernel code to work. It sets borrow to be _ALWAYS_ equal to the parent on line 2077 of sch_cbq.c. For a bounded class it should change the used bandwidth in the parent as in the other bits of code around this part and after that set it to NULL. That bit of code is completely missing. I have just downloaded 33-rc5 and will look to see if it is there. Brgds, -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
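The operators in the message above were lost in the archive, so the exact expressions cannot be recovered here. The general class of mistake it describes, testing a flag versus AND-assigning it, can still be illustrated. The Python sketch below uses hypothetical flag names and is not taken from cbq_set_lss(); note also that the follow-up message in this thread withdraws the diagnosis, so this shows the suspected pattern only, not a confirmed kernel bug.

```python
# Hypothetical flag bits, named for illustration only (not kernel source).
FLAG_BOUNDED = 0x1
FLAG_ISOLATED = 0x2


def is_set(flags, bit):
    """Test a bit without modifying the flags word."""
    return bool(flags & bit)


def and_assign(flags, bit):
    """What an accidental '&=' does: every other bit is cleared too."""
    flags &= bit
    return flags


flags = FLAG_ISOLATED
print(is_set(flags, FLAG_BOUNDED), flags)  # the test leaves flags untouched
print(and_assign(flags, FLAG_BOUNDED))     # the word is now empty
```

Once the word has been AND-assigned with a bit that was not set, it is zero and every later test on it fails, which is the "eternal false" the message refers to.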
Bug#534430: Info received (linux-image-2.6.26: CBQ broken)
Sorry, ignore my previous email. I think I got to it, for whatever reason it is not getting set in cbq_set_lss(), just can't figure out what is wrong. Brgds, -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: anton.iva...@kot-begemot.co.uk WWW: http://www.kot-begemot.co.uk/
Bug#552255: linux-image-2.6.26-2-686: /proc permission bypass
[snip] I imagine such applications are already totally insecure. Sure, agree 100%. However, under normal circumstances they can be bolted down by a sysadmin using directory permissions until the developers see the light. Fourth, during the discussion it was claimed that this does not work on Linux proper. In a listing of /proc/self/fd the files appear with read and/or write permissions depending on the file descriptor mode. But when a process tries to open them they are treated as symbolic links, which have no permissions of their own. This is fairly obvious when looking at the code and it's not something we change. I did not have the time to look at it in detail. After one of the people on the cc-list of the actual discussion said that it does not apply to plain linux and this is debian-specific I looked at the current debian patch for .26. I saw that there are some patches that apply to the relevant files for proc, but I have not had the time to decipher what they do. I have some doubts about the claim, but cannot verify it (I am off on holiday in an hour or so). It may be Debian specific or specific to a patch which Debian and more than one other distro is using (ptrace comes to mind). I personally do not think that is the case, however it is worth checking and if it is coming from the ptrace patches double check if they do not introduce something worse than that somewhere. I don't know what patches you're talking about. See above. As I said, I have not had the time to test this vs a vanilla kernel. I am on my way to chop wood for a week instead of chopping code. Sorry. Will fw you the relevant email just in case it does not make the bugtraq moderator queue. Ben.
Bug#552255: [Fwd: Re: /proc filesystem allows bypassing directory permissions on Linux]
Personally, I think the chap needs ceiling replastered. Too many scratches from the nose being ploughed through it at high velocity. As I said, I did not have the resources to test if he is right or wrong yesterday. Brgds, -- Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek A. R. Ivanov E-mail: aiva...@sigsegv.cx WWW: http://www.sigsegv.cx/ pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715 ---BeginMessage--- On 24.10.2009 22:05, Anton Ivanov wrote: It works on Debian 2.6.26 out of the box. It is not an obscure patched kernel case I am afraid. If you redir an FD to a file using thus redir-ed FD in /proc allows you to bypass directory permissions for where the file is located. Thankfully, file permissions still apply so you need an app which has silly file perms in a bolted down directory for this. Symlinking the same file to a link on a normal ext3 or nfs filesystem as a sanity check shows correct permission behaviour. If you try to write to that symlink you get permission denied so the permissions on the fs actually work. No need to be root, nothing. It is not a case of forget to drop EID or something else like that either. It looks like what it says on the tin - permission bypass. Not that I would have expected anything different considering who posted it in the first place. Thus Debian kernel team should be blamed for that misbehaviour. Don't worry, hardlinks behave just the same way, as you describe. Use authentic Linux kernels, if you dislike that. -- Sincerely Your, Dan. ---End Message---
Bug#552255: linux-image-2.6.26-2-686: /proc permission bypass
Package: linux-image-2.6.26-2-686 Version: 2.6.26-17 Severity: important

Currently discussed on bugtraq. Cut-n-pasting the email:

Hi! This is forward from lkml, so no, I did not invent this hole. Unfortunately, I do not think lkml sees this as a security hole, so...

Jamie Lokier said: a) the current permission model under /proc/PID/fd has a security hole (which Jamie is worried about)

I believe its bugtraq time. Being able to reopen file with additional permissions looks like a security problem... Jamie, do you have some test script? And do you want your 15 minutes of bugtraq fame? ;-).

The reopen does check the inode permission, but it does not require you have any reachable path to the file. Someone _might_ use that as a traditional unix security mechanism, but if so it's probably quite rare.

Ok, I got this, with two users. I guess it is real (but obscure) security hole. So, we have this scenario. pavel/root is not doing anything interesting in the background.

pa...@toy:/tmp$ uname -a
Linux toy.ucw.cz 2.6.32-rc3 #21 Mon Oct 19 07:32:02 CEST 2009 armv5tel GNU/Linux
pa...@toy:/tmp$ mkdir my_priv; cd my_priv
pa...@toy:/tmp/my_priv$ echo this file should never be writable > unwritable_file
# lock down directory
pa...@toy:/tmp/my_priv$ chmod 700 .
# relax file permissions, directory is private, so this is safe
# check link count on unwritable_file. We would not want someone
# to have a hard link to work around our permissions, would we?
pa...@toy:/tmp/my_priv$ chmod 666 unwritable_file
pa...@toy:/tmp/my_priv$ cat unwritable_file
this file should never be writable
pa...@toy:/tmp/my_priv$ cat unwritable_file
got you
# Security problem here

[Please pause here for a while before reading how guest did it.]

Unexpected? Well, yes, to me anyway. Linux specific? Yes, I think so. So what did happen? User guest was able to work around directory permissions in the background, using /proc filesystem.

gu...@toy:~$ bash 3< /tmp/my_priv/unwritable_file
# Running inside nested shell
gu...@toy:~$ read A <&3
gu...@toy:~$ echo $A
this file should never be writable
gu...@toy:~$ cd /tmp/my_priv
gu...@toy:/tmp/my_priv$ ls
unwritable_file
# pavel did chmod 000, chmod 666 here
gu...@toy:/tmp/my_priv$ ls
ls: cannot open directory .: Permission denied
# Linux correctly prevents guest from writing to that file
gu...@toy:/tmp/my_priv$ cat unwritable_file
cat: unwritable_file: Permission denied
gu...@toy:/tmp/my_priv$ echo got you >&3
bash: echo: write error: Bad file descriptor
# ...until we take a way around it with /proc filesystem. Oops.
gu...@toy:/tmp/my_priv$ echo got you > /proc/self/fd/3

-- Package-specific info:

-- System Information:
Debian Release: 5.0.2
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Kernel: Linux 2.6.26 (SMP w/1 CPU core; PREEMPT)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-2-686 depends on:
ii debconf [debconf-2.0] 1.5.24 Debian configuration management sy
ii initramfs-tools [linux-initra 0.92o tools for generating an initramfs
ii module-init-tools 3.4-1 tools for managing Linux kernel mo

Versions of packages linux-image-2.6.26-2-686 recommends:
ii libc6-i686 2.7-18 GNU C Library: Shared libraries [i

Versions of packages linux-image-2.6.26-2-686 suggests:
ii grub 0.97-47lenny2 GRand Unified Bootloader (Legacy v
ii lilo 1:22.8-7 LInux LOader - The Classic OS load
pn linux-doc-2.6.26 none (no description available)

-- debconf-show failed
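The core step of the report above, a descriptor opened read-only being reopened for writing through /proc/self/fd, can be reproduced in a single process. The Python sketch below is a minimal illustration under stated assumptions: Linux with /proc mounted, one user, and a temporary file with mode 666. It does not reproduce the two-user directory-permission part of the scenario, and the function name is made up for the example.

```python
import errno
import os
import tempfile


def reopen_demo(path):
    """Open path read-only, then reopen the same descriptor via /proc for writing."""
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            os.write(fd, b"got you\n")
            direct_write_failed = False
        except OSError as exc:
            # Writing to a read-only descriptor fails with EBADF.
            direct_write_failed = exc.errno == errno.EBADF
        # Opening the /proc entry performs a fresh open of the same inode,
        # checked against the file's own permissions.
        with open(f"/proc/self/fd/{fd}", "wb") as reopened:
            reopened.write(b"got you\n")
    finally:
        os.close(fd)
    with open(path, "rb") as check:
        return direct_write_failed, check.read()


if os.path.isdir("/proc/self/fd"):  # skip quietly on systems without Linux /proc
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "unwritable_file")
        with open(target, "w") as f:
            f.write("this file should never be writable\n")
        os.chmod(target, 0o666)
        print(reopen_demo(target))
```

The direct write fails in the same way as the "Bad file descriptor" error in the shell session, while the write through /proc succeeds because only the inode permissions are consulted on reopen.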
Bug#552255: linux-image-2.6.26-2-686: /proc permission bypass
We have been having a back and forth on this with a couple of people. It has not shown up on BUGTRAQ yet because it is sitting in the moderator queue.

First of all, any permission bypass is bad. Principle of least surprise.

Second, the important thing here is that directory permissions are ignored. Whatever the reason, that is not good. The case shown by Pavel is an extreme example (using 666), but you can most likely find a less extreme example where this can be put to good use.

Third, there is a non-zero-size class of applications where idiocy such as a 666 file protected only by the directory one level above is likely to be found: those ported from Windows. Under Windows, locking is non-advisory and apps tend to scribble under themselves. So if you open a file with an exclusive read/write lock, nobody can read or write it regardless of permissions. When a program gets ported to Unix, the developers (or the porting toolkit) replace that code with flock or fcntl locks, which are advisory, and the file becomes nicely accessible. There is no such code in Debian proper, but that does not mean that there is no such code out there in the wild.

Fourth, during the discussion it was claimed that this does not work on Linux proper. I have some doubts about the claim, but cannot verify it (I am off on holiday in an hour or so). It may be Debian specific or specific to a patch which Debian and more than one other distro is using (ptrace comes to mind). I personally do not think that is the case; however, it is worth checking and, if it is coming from the ptrace patches, double-checking that they do not introduce something worse somewhere.

Cheers,

On Sat, 2009-10-24 at 21:18 +0100, Ben Hutchings wrote:
> On Sat, 2009-10-24 at 20:19 +0100, Anton Ivanov wrote:
> > Package: linux-image-2.6.26-2-686
> > Version: 2.6.26-17
> > Severity: important
> >
> > Currently discussed on bugtraq. Cut-n-pasting the email:
> >
> > Hi! This is a forward from lkml, so no, I did not invent this hole. Unfortunately, I do not think lkml sees this as a security hole, so...
> >
> > Jamie Lokier said:
> > a) the current permission model under /proc/PID/fd has a security hole (which Jamie is worried about)
> >
> > I believe it's bugtraq time. Being able to reopen a file with additional permissions looks like a security problem... Jamie, do you have some test script? And do you want your 15 minutes of bugtraq fame? ;-).
> >
> > The reopen does check the inode permission, but it does not require you have any reachable path to the file. Someone _might_ use that as a traditional unix security mechanism, but if so it's probably quite rare.
> >
> > Ok, I got this, with two users. I guess it is a real (but obscure) security hole.
>
> So obscure that it doesn't really count as important.
>
> > So, we have this scenario. pavel/root is not doing anything interesting in the background.
> >
> > pa...@toy:/tmp$ uname -a
> > Linux toy.ucw.cz 2.6.32-rc3 #21 Mon Oct 19 07:32:02 CEST 2009 armv5tel GNU/Linux
> > pa...@toy:/tmp$ mkdir my_priv; cd my_priv
> > pa...@toy:/tmp/my_priv$ echo this file should never be writable > unwritable_file
> > # lock down directory
> > pa...@toy:/tmp/my_priv$ chmod 700 .
> > # relax file permissions, directory is private, so this is safe
> > # check link count on unwritable_file. We would not want someone
> > # to have a hard link to work around our permissions, would we?
> > pa...@toy:/tmp/my_priv$ chmod 666 unwritable_file
> [...]
>
> But who's really going to do that, other than to demonstrate this?
>
> Ben.

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#534430: linux-image-2.6.26: CBQ broken
Bounded classes are allowed to borrow at least under some circumstances.

In my config there is a bounded class parented to root on my DSL uplink and a hierarchy sitting under it where most classes are allowed to borrow. If the root class is bounded it all works like a breeze. I used a replica of this setup under BSD for nearly 10 years and recently moved it to Linux.

Because the root class was borrowing, the classes in the underlying hierarchy were routinely exceeding their allocated bandwidths. As a result - no QoS.

I have worked around it for the moment by bringing down the bandwidth of the parent root CBQ qdisc. It is now still borrowing:

class cbq 2:16 parent 2: rate 38bit (bounded) prio 2
 Sent 1041746585 bytes 7852722 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 borrowed 7984534 overactions 0 avgidle 78 undertime 0

However, it just gets dropped by the qdisc.

Bounded classes in lower levels in the hierarchy actually work. Putting a few more classes between the root and the first class that is bounded does not.

Overall, it is broken and broken pretty badly. I have not had the time to sit down and read the actual code yet to see exactly where it is broken. Apologies,

Best Regards,

On Sat, 2009-07-25 at 22:30 +0200, Moritz Muehlenhoff wrote:
> On Wed, Jun 24, 2009 at 10:21:05AM +0100, Anton Ivanov wrote:
> > Package: linux-image-2.6.26
> > Version: nfsfix.1
> > Severity: normal
> >
> > CBQ is completely broken. The borrowed counters never increase and from there on the bandwidth computation is totally fubar
>
> Please explain the problem more verbosely. What exactly did you do and what result did you expect?
>
> Cheers,
> Moritz

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#521727: Preempt
I can confirm that. While the difference between PREEMPT and normal kernels with 2.6.18 and prior to that was mostly for connoisseurs, with 2.6.26 it is clearly visible with the naked eye (tested on a bog standard 2GHz Athlon XP). This should not really be the case especially under light or no load. There is yet another performance regression somewhere besides the NFS one and the block IO one which got fixed recently.

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: anton.iva...@kot-begemot.co.uk
WWW: http://www.kot-begemot.co.uk/
Bug#534430: linux-image-2.6.26: CBQ broken
Package: linux-image-2.6.26
Version: nfsfix.1
Severity: normal

CBQ is completely broken. The borrowed counters never increase and from there on the bandwidth computation is totally fubar.

-- System Information:
Debian Release: 5.0.1
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26 (SMP w/1 CPU core)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26 depends on:
ii  coreutils                     6.10-6  The GNU core utilities
ii  debconf [debconf-2.0]         1.5.24  Debian configuration management sy

linux-image-2.6.26 recommends no packages.

Versions of packages linux-image-2.6.26 suggests:
ii  fdutils                       5.5-20060227-3  Linux floppy utilities
pn  ksymoops                      none            (no description available)
pn  linux-doc-2.6.26 | linux-     none            (no description available)

-- debconf-show failed
Bug#534444: linux-image-2.6.26-2-486: CBQ broken
Package: linux-image-2.6.26-2-486
Version: 2.6.26-15
Severity: normal

CBQ fails to perform correctly.

class cbq 2:16 parent 2: leaf 76: rate 32bit (bounded) prio 1
 Sent 551644 bytes 1279 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 borrowed 810 overactions 0 avgidle 18424 undertime 0

This should never happen. From there on the entire CBQ subsystem is fubar...

-- Package-specific info:

-- System Information:
Debian Release: 5.0.1
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26 (SMP w/1 CPU core)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-2-486 depends on:
ii  debconf [debconf-2.0]         1.5.24  Debian configuration management sy
ii  initramfs-tools [linux-initra 0.92o   tools for generating an initramfs
ii  module-init-tools             3.4-1   tools for managing Linux kernel mo

linux-image-2.6.26-2-486 recommends no packages.

Versions of packages linux-image-2.6.26-2-486 suggests:
ii  grub                          0.97-47lenny2  GRand Unified Bootloader (Legacy v
ii  lilo                          1:22.8-7       LInux LOader - The Classic OS load
pn  linux-doc-2.6.26              none           (no description available)

-- debconf-show failed
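The condition reported above (a class marked bounded whose borrowed counter is non-zero) can be checked mechanically. This is a rough Python sketch, assuming the statistics text looks like the excerpts quoted in these reports; the function name `bounded_classes_that_borrowed` and the `SAMPLE` constant are hypothetical, and the parser has not been validated against every tc output format.

```python
import re
from typing import List, Tuple

# The two statistics excerpts quoted in the CBQ reports.
SAMPLE = """\
class cbq 2:16 parent 2: leaf 76: rate 32bit (bounded) prio 1
 Sent 551644 bytes 1279 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 borrowed 810 overactions 0 avgidle 18424 undertime 0
class cbq 2:16 parent 2: rate 38bit (bounded) prio 2
 Sent 1041746585 bytes 7852722 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 borrowed 7984534 overactions 0 avgidle 78 undertime 0
"""

CLASS_RE = re.compile(r"class cbq (?P<classid>\S+)")
BORROWED_RE = re.compile(r"borrowed (?P<borrowed>\d+)")


def bounded_classes_that_borrowed(tc_output: str) -> List[Tuple[str, int]]:
    """Return (classid, borrowed) for bounded classes with borrowed > 0."""
    flagged = []
    # Split just before each "class cbq" so every chunk covers one class.
    for chunk in re.split(r"(?=class cbq )", tc_output):
        class_match = CLASS_RE.match(chunk)
        borrowed_match = BORROWED_RE.search(chunk)
        if not class_match or not borrowed_match:
            continue
        borrowed = int(borrowed_match.group("borrowed"))
        if "(bounded)" in chunk and borrowed > 0:
            flagged.append((class_match.group("classid"), borrowed))
    return flagged


if __name__ == "__main__":
    for classid, borrowed in bounded_classes_that_borrowed(SAMPLE):
        print(f"bounded class {classid} borrowed {borrowed}")
```

Both excerpts are flagged because each shows `(bounded)` together with a non-zero `borrowed` value.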
Bug#534444: Acknowledgement (linux-image-2.6.26-2-486: CBQ broken)
Additional information.

It does not do it on all classes. I can observe it on a particular class parented to the root CBQ qdisc with multiple burstable children. The isolated keyword, set on another class parented to the root qdisc, is similarly ignored.

I will try to dig through the source to see exactly where the bug is, but it is definitely a bug (the results of tc are confirmed by delay measurements and bandwidth measurements in the relevant classes).

Brgds,

On Wed, 2009-06-24 at 12:09 +, Debian Bug Tracking System wrote:
> Thank you for filing a new Bug report with Debian.
>
> This is an automatically generated reply to let you know your message has been received.
>
> Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course.
>
> Your message has been sent to the package maintainer(s):
> Debian Kernel Team debian-kernel@lists.debian.org
>
> If you wish to submit further information on this problem, please send it to 534...@bugs.debian.org, as before.
>
> Please do not send mail to ow...@bugs.debian.org unless you wish to report a problem with the Bug-tracking system.

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#524199: test build
It is loaded on one of my machines (the one that sees heavy use). The results should be available in 6-8 hours.

On Thu, 2009-04-16 at 00:23 -0600, dann frazier wrote:
> Can you test this build to see if it fixes the issue?
> http://people.debian.org/~dannf/bugs/524199/

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#524199: test build
Does not fix it, I am afraid.

It took longer to show up, but it is showing up nonetheless. So it is not just that bit of code. There is something buggered elsewhere in the NFS subsystem, I am afraid :-(

It took around 8 hours of medium level usage - reading mail, digging for stuff around the internet, browsing mplayer sources, etc.

aiva...@falkor:~$ uptime
 16:19:30 up 7:47, 2 users, load average: 0.82, 1.03, 0.95

It started doing it right after I ran a couple of find scripts on stuff mounted over NFS. This is on my workstation, which is a diskless P3 running off an etch NFS server.

On Thu, 2009-04-16 at 00:23 -0600, dann frazier wrote:
> Can you test this build to see if it fixes the issue?
> http://people.debian.org/~dannf/bugs/524199/

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#524199: linux-image-2.6.26-1-686: nfs unusable
Package: linux-image-2.6.26-1-686
Version: 2.6.26-13lenny2
Severity: important
Tags: patch

Over time any system with NFS usage grinds to a crawl. While root or /usr on NFS are most affected, the same problem should affect other NFS usage.

The symptoms are extremely high system load, taking 20 minutes to open/close applications, taking half a minute to create the menu on right click in KDE, becoming unusable for 5 minutes while rebuilding menus in KDE, etc.

This has been reported multiple times against other packages elsewhere (mostly KDE, due to the idiotic way it builds its menus).

The reason seems to be:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=23918b03060f6e572168fdde1798a905679d2e06

discussed here:

http://lkml.indiana.edu/hypermail/linux/kernel/0812.2/00418.html

On the positive side this made me find out that a whole bunch of programs are coded with the left foot.

This patch needs to be retrofitted into the kernel to make it usable again for anyone using NFS and especially diskless clients. Otherwise, NFS in the current kernel is unusable. After waiting for 30 minutes for tex to update its map on an otherwise fast machine I downgraded all of my diskless clients to 2.6.18.

-- Package-specific info:

-- System Information:
Debian Release: 5.0.1
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.18-6-686 (SMP w/1 CPU core)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-686 depends on:
ii  debconf [debconf-2.0]         1.5.24  Debian configuration management sy
ii  initramfs-tools [linux-initra 0.92o   tools for generating an initramfs
ii  module-init-tools             3.4-1   tools for managing Linux kernel mo

Versions of packages linux-image-2.6.26-1-686 recommends:
ii  libc6-i686                    2.7-18  GNU C Library: Shared libraries [i

Versions of packages linux-image-2.6.26-1-686 suggests:
ii  grub                          0.97-47lenny2  GRand Unified Bootloader (Legacy v
ii  lilo                          1:22.8-7       LInux LOader - The Classic OS load
pn  linux-doc-2.6.26              none           (no description available)

-- debconf-show failed
Bug#524199: linux-image-2.6.26-1-686: nfs unusable
Frankly, this deserves a higher bug rating than important. A Unix without a working NFS is not a Unix. At least not a usable one.

I can now confirm that downgrading to 2.6.18 seems to fix it. All 3 machines I have with diskless Lenny are still up and usable. By this time they would have been out of commission with 2.6.26.

On Wed, 2009-04-15 at 13:52 -0400, John Morrissey wrote:
> On Wed, Apr 15, 2009 at 01:30:51PM +0100, Anton Ivanov wrote:
> > Over time any system with NFS usage grinds to a crawl. While root or /usr on NFS are most affected, the same problem should affect other NFS usage.
> >
> > The symptoms are extremely high system load,
> [snip]
> > The reason seems to be:
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=23918b03060f6e572168fdde1798a905679d2e06
>
> FWIW, we're also seeing this, on machines mounting several large NFS volumes that serve virtual accounts (i.e., they all have the same UID/GID and no shell access).
>
> After ~two days of uptime, these moderately loaded machines (running Apache, ProFTPD, and Courier IMAP) start spending 90% of CPU time in system state and become so unresponsive that they must be rebooted.
>
> Running the 2.6.28 that was recently in sid (which contains the commit Anton mentions) fixed this.
>
> john

--
Understanding is a three-edged sword: your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail: aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331 89D5 FCDA 572E DDE5 E715
Bug#421443: linux-image-2.6.18-4-686: ide tape broken and ide scsi disabled, ide tapes unuseable
Package: linux-image-2.6.18-4-686
Version: 2.6.18.dfsg.1-12
Severity: important

The kernel detects the IDE tape:

ide-tape: hdb <-> ht0: Seagate STT2A rev 8A51
ide-tape: hdb <-> ht0: 1000KBps, 6*54kB buffer, 9720kB pipeline, 108ms tDSC, DMA

After which all userland utilities fail to access it or issue any commands to it.

The drive mostly works using ide-tape in 2.6.14 and 2.6.16 on write (some read problems). It works fine in 2.6.16 using ide-scsi for both read and write.

Frankly, anyone who has had to use IDE tapes knows that Linus can go get lost with his statement about the IDE tape driver now being a perfect replacement for ide-scsi (made multiple times on lkml since 2003). It isn't. In fact I have yet to see a kernel release where it works fine.

So disabling ide-scsi is not nice. That is the only means to use IDE tape drives at the moment.

-- System Information:
Debian Release: 4.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-4-686
Locale: LANG=en_GB, LC_CTYPE=en_GB (charmap=ISO-8859-1)

Versions of packages linux-image-2.6.18-4-686 depends on:
ii  coreutils                     5.97-5.3    The GNU core utilities
ii  debconf [debconf-2.0]         1.5.11      Debian configuration management sy
ii  initramfs-tools [linux-initra 0.85g       tools for generating an initramfs
ii  module-init-tools             3.3-pre4-2  tools for managing Linux kernel mo

Versions of packages linux-image-2.6.18-4-686 recommends:
ii  libc6-i686                    2.3.6.ds1-13  GNU C Library: Shared libraries [i

-- debconf information excluded