Re: b44: high ping times with wireless-dev
On Sunday 17 June 2007, Michael Buesch wrote:
> On Saturday 16 June 2007 23:27:43 Maximilian Engelhardt wrote:
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt :02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) -> IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device :02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
>
> Ok, I prepared two debugging patches.
>
> Please enable SonicsSiliconBackplane Debugging in the kernel kconfig,
> so I can get more detail information about your card.
> Device Drivers/Sonics Silicon Backplane/SSB debugging
> (Must disable "No SSB kernel messages")
>
> Please apply and test the attached debugging patches in a row.
> So apply patch 1 and test if it works again. If not, apply
> patch 2 and test if it works.
> Always save complete dmesg log on each test run and send it to me.
>
> Thanks for testing.
> (This time it seems we are actually getting somewhere, when
> dealing with sane people. :D )

I did the tests with my kernel where only the card is on interrupt 10.
dmesg is attached.
With the first patch applied networking does work again. I also additionally
tried patch2 and it also does work.

Maxi

Linux version 2.6.22-rc4-wireless-dev-20070616-test1 ([EMAIL PROTECTED]) (gcc version 4.1.3 20070601 (prerelease) (Debian 4.1.2-12)) #6 PREEMPT Sun Jun 17 13:24:13 CEST 2007
BIOS-provided physical RAM map:
 BIOS-e820: - 0009f800 (usable)
 BIOS-e820: 0009f800 - 000a (reserved)
 BIOS-e820: 000ce000 - 000d (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - 4dee (usable)
 BIOS-e820: 4dee - 4deec000 (ACPI data)
 BIOS-e820: 4deec000 - 4df0 (ACPI NVS)
 BIOS-e820: 4df0 - 5000 (reserved)
 BIOS-e820: fec1 - fec2 (reserved)
 BIOS-e820: ff80 - ffc0 (reserved)
 BIOS-e820: fc00 - 0001 (reserved)
350MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 319200) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 -> 4096
  Normal 4096 -> 229376
  HighMem 229376 -> 319200
early_node_map[1] active PFN ranges
  0: 0 -> 319200
On node 0 totalpages: 319200
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 701 pages used for memmap
  HighMem zone: 89123 pages, LIFO batch:15
DMI present.
ACPI: RSDP 000F6050, 0014 (r0 ACER )
ACPI: RSDT 4DEE5A39, 0030 (r1 ACER Wagtail 20020114 LTP0)
ACPI: FACP 4DEEBF2C, 0074 (r1 ACER Wagtail 20020114 PTL50)
ACPI: DSDT 4DEE5A69, 64C3 (r1 ACER Wagtail 20020114 MSFT 10E)
ACPI: FACS 4DEFCFC0, 0040
ACPI: HPET 4DEEBFA0, 0038 (r1 ACER Wagtail 20020114 PTL 0)
ACPI: BOOT 4DEEBFD8, 0028 (r1 ACER Wagtail 20020114 LTP1)
ACPI: PM-Timer IO Port: 0x1008
ACPI: HPET id: 0x8086a201 base: 0x0
Allocating PCI resources starting at 6000 (gap: 5000:aec1)
Built 1 zonelists. Total pages: 316707
Kernel command line: root=/dev/sda1 ro vga=0x31b resume=/dev/sda2
Local APIC disabled by BIOS -- you can enable it with "lapic"
mapped APIC to d000 (019c4000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 1395.565 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1259844k/1276800k available (3572k kernel code, 16168k reserved, 1152k data, 220k init, 359296k highmem)
virtual kernel memory layout:
    fixmap  : 0xfffaa000 - 0xf000 ( 340 kB)
    pkmap   : 0xff80 - 0xffc0 (4096 kB)
    vmalloc : 0xf880 - 0xff7fe000 ( 111 MB)
    lowmem  : 0xc000 - 0xf800 ( 896 MB)
    .init   : 0xc05a - 0xc05d7000 ( 220 kB)
    .data   : 0xc047d05d - 0xc059d0b0 (1152 kB)
    .text   : 0xc010 - 0xc047d05d (3572 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
SLUB: Genslabs=23, HWalign=64, Order=0-1, MinObjects=4, Processors=1, Nodes=1
Calibrating delay using timer specific routine.. 2793.34 BogoMIPS (lpj=4653358)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: a7e9f9bf 0180
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 1024K
CPU: After all inits, caps: a7e9f9bf 2040 0180
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to e000.
CPU: Intel(R) Pentium(R) M processor 1400MHz stepping 05
Checking
Re: b44: high ping times with wireless-dev
On Sunday 17 June 2007, Stephen Hemminger wrote:
> On Sat, 16 Jun 2007 23:27:43 +0200
> Maximilian Engelhardt <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I recently did some tests and found out something interesting about the
> > b44 problem I wrote about earlier.
> >
> > The problem is the following:
> > When I use my BCM4401 with the b44 driver in wireless-dev I get very high
> > ping times looking like this:
> >
> > 64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
> > 64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
> > 64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
> > 64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
> >
> > I also found out that shortly after I boot my laptop and log into kde
> > ping times are not that high but start to increase very quickly:
> >
> > 64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
> > 64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
> > 64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
> > 64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
> > 64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
> > 64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
> > 64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
> > 64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms
> >
> > After some time digging around I found out something really interesting.
> > When I play some music ping times are immediately lower. If I stop
> > playing music they are back to the same times as they were before.
> >
> > I guess that there is a problem with interrupts so I post some
> > information of my system in hope it will be useful.
> >
> > [EMAIL PROTECTED]:~$ cat /proc/interrupts
> >            CPU0
> >   0:     126317    XT-PIC-XT    timer
> >   1:       3600    XT-PIC-XT    i8042
> >   2:          0    XT-PIC-XT    cascade
> >   7:          1    XT-PIC-XT    parport0
> >   8:          1    XT-PIC-XT    rtc
> >   9:      17371    XT-PIC-XT    acpi
> >  10:      13237    XT-PIC-XT    firewire_ohci, yenta, yenta, ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem, eth0
> >  11:      89059    XT-PIC-XT    uhci_hcd:usb2, [EMAIL PROTECTED]::00:02.0
> >  12:        632    XT-PIC-XT    i8042
> >  14:      10354    XT-PIC-XT    libata
> >  15:       7408    XT-PIC-XT    libata
> > NMI:          0
> > ERR:          0
> >
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt :02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) -> IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device :02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
> >
> > This problem did only happen with wireless-dev (checkout this evening)
> > and with -mm kernels I used some time ago for testing. Currently I'm
> > running 2.6.22-rc4 that works perfectly fine and doesn't show that
> > problem.
> >
> > Maxi
>
> Can you build with APIC for uniprocessor.

I did enable CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC and tried with lapic
and apic=force but couldn't get APIC working.

> There is lots of IRQ sharing, so
>  - one of the other devices may be not handling shared IRQ properly.
>    Try unloading firewire, modem and yenta devices.
>
>  - IRQ might be set edge triggered which doesn't work with NAPI
>    or shared IRQ.

I did build a kernel without the three mentioned above but the problem is
still the same.
I also did remove everything but eth0 on interrupt 10 so the only device
using that interrupt is eth0 and then the card completely stopped working.

Maxi
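
The IRQ sharing discussed in this message can also be checked mechanically. The following is only a minimal sketch, not something posted in the thread: it assumes the uniprocessor /proc/interrupts layout shown in the quoted report (IRQ number, one count column, a single-token controller name, then a comma-separated device list). The function name `shared_irqs` and the abbreviated `SAMPLE` text are hypothetical and exist only for illustration.

```python
def shared_irqs(text):
    """Return {irq: [devices]} for IRQ lines that list more than one device.

    Assumes a single CPU column and a one-word controller name, as in the
    report quoted in this thread; other layouts need a different parser.
    """
    shared = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # skips the CPU header and the NMI/ERR lines
        fields = rest.split(None, 2)  # count, controller, device list
        if len(fields) < 3:
            continue
        devices = [d.strip() for d in fields[2].split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(head)] = devices
    return shared


# Abbreviated, hypothetical excerpt in the same layout as the report.
SAMPLE = """\
           CPU0
  0:     126317    XT-PIC-XT    timer
 10:      13237    XT-PIC-XT    firewire_ohci, yenta, yenta, ehci_hcd:usb1, eth0
 14:      10354    XT-PIC-XT    libata
NMI:          0
"""

for irq, devices in shared_irqs(SAMPLE).items():
    print(f"IRQ {irq} is shared by {len(devices)} handlers: {', '.join(devices)}")
```

On a live system the text would come from reading /proc/interrupts instead of `SAMPLE`; a line that lists eth0 together with other devices is the situation Stephen describes.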
b44: high ping times with wireless-dev
Hello,

I recently did some tests and found out something interesting about the
b44 problem I wrote about earlier.

The problem is the following:
When I use my BCM4401 with the b44 driver in wireless-dev I get very high
ping times looking like this:

64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms

I also found out that shortly after I boot my laptop and log into kde
ping times are not that high but start to increase very quickly:

64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms

After some time digging around I found out something really interesting.
When I play some music ping times are immediately lower. If I stop
playing music they are back to the same times as they were before.

I guess that there is a problem with interrupts so I post some
information of my system in hope it will be useful.

[EMAIL PROTECTED]:~$ cat /proc/interrupts
           CPU0
  0:     126317    XT-PIC-XT    timer
  1:       3600    XT-PIC-XT    i8042
  2:          0    XT-PIC-XT    cascade
  7:          1    XT-PIC-XT    parport0
  8:          1    XT-PIC-XT    rtc
  9:      17371    XT-PIC-XT    acpi
 10:      13237    XT-PIC-XT    firewire_ohci, yenta, yenta, ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem, eth0
 11:      89059    XT-PIC-XT    uhci_hcd:usb2, [EMAIL PROTECTED]::00:02.0
 12:        632    XT-PIC-XT    i8042
 14:      10354    XT-PIC-XT    libata
 15:       7408    XT-PIC-XT    libata
NMI:          0
ERR:          0

[...]
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
ACPI: PCI Interrupt :02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) -> IRQ 10
ssb: Sonics Silicon Backplane found on PCI device :02:02.0
b44.c:v2.0
eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
[...]

This problem did only happen with wireless-dev (checkout this evening)
and with -mm kernels I used some time ago for testing. Currently I'm
running 2.6.22-rc4 that works perfectly fine and doesn't show that
problem.

Maxi
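
The alternating ping times in this report (about 1850 ms, then about 850 ms) have a simple arithmetic reading. The sketch below is not from the thread and rests on an assumption the report does not state: that ping was run with its default interval of one request per second. The helper name `reply_arrivals` and the four-line `SAMPLE` (copied from the first ping listing) are only for illustration.

```python
import re


def reply_arrivals(text, interval_s=1.0):
    """Return (icmp_seq, arrival time in seconds) for each reply line.

    The first listed request is taken as sent at t = 0, and one request is
    assumed to be sent every interval_s seconds (an assumption, see above).
    """
    pairs = re.findall(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms", text)
    if not pairs:
        return []
    first = int(pairs[0][0])
    return [
        (int(seq), round((int(seq) - first) * interval_s + float(ms) / 1000.0, 3))
        for seq, ms in pairs
    ]


SAMPLE = """\
64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
"""

for seq, arrival in reply_arrivals(SAMPLE):
    print(f"icmp_seq={seq} arrives at t={arrival:.3f} s")
```

Under that assumption, replies 1 and 2 both arrive around t = 1.86 s and replies 3 and 4 both around t = 3.86 s. In other words the replies appear to be delivered in pairs roughly every two seconds, which would fit packets waiting for some later event rather than a slow network path. This is only one reading of the numbers, not a diagnosis.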
Re: iperf: performance regression (was b44 driver problem?)
On Monday 04 June 2007, Stephen Hemminger wrote:
> On Mon, 4 Jun 2007 21:47:59 +0200
> Maximilian Engelhardt <[EMAIL PROTECTED]> wrote:
> > On Monday 04 June 2007, Ingo Molnar wrote:
> > > * Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > > > Yes, the following patch makes iperf work better than ever. But are
> > > > other broken applications going to have same problem. Sounds like the
> > > > old "who runs first" fork() problems.
> > >
> > > this is the first such app and really, and even for this app: i've been
> > > frequently running iperf on -rt kernels for _years_ and never noticed
> > > how buggy its 'locking' code was, and that it would under some
> > > circumstances use up the whole CPU on high-res timers.
> >
> > I must admit I don't know much about that topic, but there is one thing I
> > don't understand. Why is iperf (even if it's buggy) able to use up the
> > whole cpu? I didn't run it as root but as my normal user so it should
> > have limited rights. Shouldn't the linux scheduler distribute cpu time
> > among all running processes?
>
> In this case, there are two threads. One is receiving data and the other
> is spinning checking on progress. If the spinning thread doesn't yield,
> it will end up using its whole quantum (10ms at 100hz), before the
> scheduler lets the receiver run again. If the receiving thread doesn't
> get to run then on a UP the performance stinks.

Ok, let's see if I got this right:
If there are other processes that want cpu time they will get it after the
quantum for the iperf thread is used up. So cpu time will be distributed
among other processes, but it takes some time until they get it and this
increases latency.

> The problem only showed up on my laptop because most of my other systems are
> SMP (or fake SMP/HT), and usually set HZ to 1000 not 100.

Hm, on my laptop (Pentium M) I have configured CONFIG_HZ_300 and
CONFIG_NO_HZ. On my desktop PC (Athlon 2000+, also UP) I also have
CONFIG_HZ_300 and CONFIG_NO_HZ but didn't notice the problem.

Maxi
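
The "10ms at 100hz" figure in this exchange is simply the length of one timer tick. As a minimal sketch (the helper name `tick_ms` is hypothetical, and treating the quantum as exactly one tick follows Stephen's wording rather than any statement about the scheduler's real timeslice), the three HZ values mentioned in the thread work out as follows:

```python
def tick_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ setting."""
    return 1000.0 / hz


# HZ values mentioned in the thread: 100, 300 (CONFIG_HZ_300) and 1000.
for hz in (100, 300, 1000):
    print(f"HZ={hz}: one tick is {tick_ms(hz):.2f} ms")
```

So with CONFIG_HZ_300 a spinning thread that only gives up the CPU at a tick boundary would hold it for about 3.33 ms at a time, between the two cases Stephen compares. How CONFIG_NO_HZ and high-resolution timers change this in practice is not something this arithmetic covers.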
Re: iperf: performance regression (was b44 driver problem?)
On Monday 04 June 2007, Ingo Molnar wrote:
> * Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > Yes, the following patch makes iperf work better than ever. But are
> > other broken applications going to have same problem. Sounds like the
> > old "who runs first" fork() problems.
>
> this is the first such app and really, and even for this app: i've been
> frequently running iperf on -rt kernels for _years_ and never noticed
> how buggy its 'locking' code was, and that it would under some
> circumstances use up the whole CPU on high-res timers.

I must admit I don't know much about that topic, but there is one thing I
don't understand. Why is iperf (even if it's buggy) able to use up the whole
cpu? I didn't run it as root but as my normal user so it should have limited
rights. Shouldn't the linux scheduler distribute cpu time among all running
processes?

Maxi
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?

I didn't notice anything in the past (as I wrote). But today I did some tests
for an updated version of the p54 mac80211 wlan driver and I noticed exactly
the same problem:

when booting with highres=off everything is fine. But when I boot a highres
enabled kernel and I do the iperf-test with the p54 driver, my system becomes
unresponsive during the test. It seems to be exactly the same problem I have
with the b44 driver. So this might not be a bug in the b44 code but a bug
somewhere in the linux networking code.

I did the test with a 2.6.22-rc3-git4 kernel and the p54 driver built
externally as a module.

Maxi
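
The four boot-parameter results quoted in this message form a small truth table, and it can be read off mechanically which switch tracks the problem. This is only an illustrative sketch: the `RESULTS` table transcribes the four quoted outcomes, while the names `RESULTS` and `deciding_switches` are hypothetical and not from the thread.

```python
# (highres enabled, nohz enabled) -> does the problem appear?
RESULTS = {
    (True, True): True,     # no special boot parameters
    (False, False): False,  # highres=off nohz=off
    (False, True): False,   # highres=off
    (True, False): True,    # nohz=off
}


def deciding_switches(results, names=("highres", "nohz")):
    """Return the switches whose setting alone matches every outcome."""
    found = []
    for index, name in enumerate(names):
        if all(config[index] == outcome for config, outcome in results.items()):
            found.append(name)
    return found


print(deciding_switches(RESULTS))
```

For these four runs only the highres setting lines up with the outcome in every case; nohz does not. That is consistent with the test pointing at high-resolution timers, although four runs on one machine are a narrow basis.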
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > > Timer, but the high ping problem is still there.
> > >
> > > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > > "feature" in a different way than rc2-mm1 does.
> >
> > I think the bug in 2.6.21/22-rc3 is a different one than the one in
> > 2.6.22-rc2-mm1, but that's also only a wild guess :)
> >
> > I'll explain this a bit:
> > In 2.6.21/22-rc3 is the same b44 driver that has been in the stock
> > kernels for some time. With this driver and High Resolution Timer turned
> > on I get problems using iperf. The problems are that the system becomes
> > really slow and unresponsive. Michael Buesch thought this could be an
> > IRQ storm which sounds logical to me. This bug never happened to me
> > before I started the iperf test.
>
> Can you please apply
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch
>
> on top of rc3 and check, whether it has any effect on your problem.

The patch didn't change anything.

> > The other issue happens only with 2.6.22-rc2-mm1 which includes the b44
> > ssb split. Independent of whether High Resolution Timer is turned on or
> > off, I always get very varying and high ping times. The iperf test
> > doesn't show the problems from 2.6.21/22-rc3.
>
> Neither with nor without highres ?

Yes, it doesn't matter if highres is turned on or off. iperf never showed
the problem from 2.6.21/22-rc3.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Tuesday 29 May 2007, Gary Zambrano wrote:
> On Mon, 2007-05-28 at 13:55 -0700, Maximilian Engelhardt wrote:
> > On Monday 28 May 2007, Thomas Gleixner wrote:
> > > On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try
> > > > > the following combinations on the kernel command line:
> > > > >
> > > > > 1) highres=off nohz=off (should be the same as your working config)
> > > > > 2) highres=off
> > > > > 3) nohz=off
> > > >
> > > > I tested this with my 2.6.22-rc3 kernel, here are the results:
> > > >
> > > > without any special boot parameters: problem does appear
> > > > highres=off nohz=off: problem does not appear
> > > > highres=off: problem does not appear
> > > > nohz=off: problem does appear
> > >
> > > Is there any other strange behavior of the high res enabled kernel than
> > > the b44 problem ?
> >
> > I didn't notice anything.
> >
> > > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > > Timer, but the high ping problem is still there.
> > >
> > > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > > "feature" in a different way than rc2-mm1 does.
> >
> > I think the bug in 2.6.21/22-rc3 is a different one than the one in
> > 2.6.22-rc2-mm1, but that's also only a wild guess :)
> >
> > I'll explain this a bit:
> > In 2.6.21/22-rc3 is the same b44 driver that has been in the stock
> > kernels for some time. With this driver and High Resolution Timer turned
> > on I get problems using iperf. The problems are that the system becomes
> > really slow and unresponsive. Michael Buesch thought this could be an
> > IRQ storm which sounds logical to me. This bug never happened to me
> > before I started the iperf test.
>
> Can you please check to see if you notice anything out of the ordinary
> using netperf in place of iperf in your high res timer on/off testbed?

ok, here are the results, I also had a look at the CPU kernel usage.
'good' means that the kernel responsiveness during the test was as I would
expect it and I didn't notice any problems.

highres enabled:
  netperf: 80%sy 15%si (good)
  iperf: not really measurable (bad, problem described above)

highres disabled:
  netperf: 80%sy 15%si (good)
  iperf: 5%sy 30%hi 15%si (good)

For the tests I ran the following commands:

  netperf -l 60 192.168.1.1
  iperf -c 192.168.1.1 -r -t 60

I also tried to run iperf without any additional arguments
(iperf -c 192.168.1.1) on the problematic kernel, but the result is the
same as with the command I wrote above.

Maxi
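Editorial note: the CPU figures in this message look like the fields of top's summary line (sy = system, hi = hardware interrupts, si = software interrupts); that reading is an assumption, since the message does not name the tool. As a rough sketch for comparing such readings, the hypothetical helper `parse_cpu_usage` below (not part of any tool mentioned in the thread) turns a string such as `5%sy 30%hi 15%si` into a dictionary:

```python
import re

def parse_cpu_usage(text):
    """Parse a reading like '5%sy 30%hi 15%si' into {'sy': 5.0, 'hi': 30.0, 'si': 15.0}."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)%\s*([a-z]+)", text)
    return {name: float(value) for value, name in pairs}

# Readings copied from the message; the unmeasurable iperf run is left out.
readings = {
    ("highres enabled", "netperf"): parse_cpu_usage("80%sy 15%si"),
    ("highres disabled", "netperf"): parse_cpu_usage("80%sy 15%si"),
    ("highres disabled", "iperf"): parse_cpu_usage("5%sy 30%hi 15%si"),
}

for (kernel, tool), usage in readings.items():
    # Fields missing from a reading are simply absent from the dict.
    print(kernel, tool, usage, "total:", sum(usage.values()))
```

Only the iperf run reports time in hardware interrupt context ("hi"), which is at least consistent with the interrupt-related suspicion raised elsewhere in the thread.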
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?

I didn't notice anything.

> > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > Timer, but the high ping problem is still there.
>
> Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> "feature" in a different way than rc2-mm1 does.

I think the bug in 2.6.21/22-rc3 is a different one than the one in
2.6.22-rc2-mm1, but that's also only a wild guess :)

I'll explain this a bit:
In 2.6.21/22-rc3 is the same b44 driver that has been in the stock kernels
for some time. With this driver and High Resolution Timer turned on I get
problems using iperf. The problems are that the system becomes really slow
and unresponsive. Michael Buesch thought this could be an IRQ storm which
sounds logical to me. This bug never happened to me before I started the
iperf test.

The other issue happens only with 2.6.22-rc2-mm1 which includes the b44 ssb
split. Independent of whether High Resolution Timer is turned on or off, I
always get very varying and high ping times. The iperf test doesn't show the
problems from 2.6.21/22-rc3.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
> > > The -oldconfig1 is the kernel that had no problems and the other shows
> > > the b44 problem. So if High Resolution Timer Support is disabled
> > > everything works fine and if I enable it the problems do appear again.
> > >
> > > I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling
> > > High Resolution Timer Support will also solve the problem there.
> > >
> > > The older kernels I tried also work perfectly fine and they didn't have
> > > the High Resolution Timer Support yet.
> >
> > So, that's interesting, indeed.
> > Any idea what's going on, someone? Thomas?
>
> Not off the top of my head.
>
> Maximilian, does the kernel work otherwise (I mean aside of the b44
> driver) ?
>
> Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> following combinations on the kernel command line:
>
> 1) highres=off nohz=off (should be the same as your working config)
> 2) highres=off
> 3) nohz=off

I tested this with my 2.6.22-rc3 kernel, here are the results:

without any special boot parameters: problem does appear
highres=off nohz=off: problem does not appear
highres=off: problem does not appear
nohz=off: problem does appear

I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer,
but the high ping problem is still there.

Maxi
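Editorial note: the four boot-parameter results above form a small truth table. The sketch below encodes them and checks which option, taken alone, predicts the outcome. It assumes that both features are active when no boot parameter is given (the kernel was built with both config options, as requested); the function name `deciding_options` is made up for this illustration.

```python
# (highres enabled, nohz enabled) -> problem observed, as reported for 2.6.22-rc3
observations = {
    (True, True): True,     # no special boot parameters (assumed: both active)
    (False, False): False,  # highres=off nohz=off
    (False, True): False,   # highres=off
    (True, False): True,    # nohz=off
}

def deciding_options(obs):
    """Return the options whose on/off state matches the outcome in every run."""
    names = ("highres", "nohz")
    return [
        name
        for index, name in enumerate(names)
        if all(settings[index] == problem for settings, problem in obs.items())
    ]

print(deciding_options(observations))
```

Under that assumption the table singles out highres: the problem shows up exactly in the runs where high resolution timers are active, and the nohz setting makes no difference.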
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Michael Buesch wrote:
> Can you also test the following patch?
> I think there's a bug in b44 that it doesn't properly discard
> shared IRQs, so it might possibly generate a NAPI storm, dunno.
> Worth a try.
>
> Index: linux-2.6.22-rc3/drivers/net/b44.c
> ===
> --- linux-2.6.22-rc3.orig/drivers/net/b44.c 2007-05-27 23:01:44.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c 2007-05-28 12:48:27.0 +0200
> @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
>      spin_lock(&bp->lock);
>
>      istat = br32(bp, B44_ISTAT);
> +    if (istat == 0x)
> +        goto out; /* Shared IRQ not for us */
>      imask = br32(bp, B44_IMASK);
>
>      /* The interrupt mask register controls which interrupt bits
> @@ -942,6 +944,7 @@ irq_ack:
>          bw32(bp, B44_ISTAT, istat);
>          br32(bp, B44_ISTAT);
>      }
> +out:
>      spin_unlock(&bp->lock);
>      return IRQ_RETVAL(handled);
>  }

I did try this patch on an affected kernel, but I didn't notice any big
difference. Perhaps the kernel is a bit less slow during the test, but it's
hard to tell.

Maxi
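Editorial note: the patch above is about a handler on a shared interrupt line deciding early that an interrupt belongs to another device. The following is only a toy model of that decision in Python, not the driver code. The constant the patch compares against is truncated in this archive, so it is passed in as a parameter here, and the "status AND mask" test is an assumption about how such handlers commonly decide, not something shown in the quoted hunk.

```python
def irq_is_ours(istat, imask, not_ours_sentinel):
    """Toy model: should a handler on a shared IRQ line claim this interrupt?

    istat             -- value read from the interrupt status register
    imask             -- value read from the interrupt mask register
    not_ours_sentinel -- status value treated as "not for us"; the real
                         constant is truncated in the archived patch
    """
    if istat == not_ours_sentinel:
        # Early exit added by the patch: another device raised the line.
        return False
    # Assumed follow-up check: only status bits enabled in the mask count.
    return (istat & imask) != 0
```

A handler that returns "not handled" in these cases lets the kernel pass the interrupt on to the other handlers registered for the same line.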
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Michael Buesch wrote:
> Can you give 2.6.16 a try? The diff is not that big and we might
> be able to find out what broke if you find out 2.6.16 works.
> You can also try later kernels like .17, .18, .19 to further
> reduce the patch. (You could also git-bisect, if you have the time).

I did some testing and compiled some kernels and here are the results:

I was able to find out what causes the problems for me. I did build two
2.6.21.3 kernels, and one does work fine and the other doesn't. This is a
diff of the kernel configs I used:

--- /usr/src/linux-2.6.21.3-oldconfig1/.config 2007-05-28 13:41:15.0 +0200
+++ /usr/src/linux-2.6.21.3/.config 2007-05-28 14:46:08.0 +0200
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.21.3
-# Mon May 28 13:41:15 2007
+# Mon May 28 14:46:09 2007
 #
 CONFIG_X86_32=y
 CONFIG_GENERIC_TIME=y
@@ -32,7 +32,7 @@
 #
 # General setup
 #
-CONFIG_LOCALVERSION="-oldconfig1"
+CONFIG_LOCALVERSION=""
 CONFIG_LOCALVERSION_AUTO=y
 CONFIG_SWAP=y
 CONFIG_SYSVIPC=y
@@ -108,9 +108,9 @@
 #
 # Processor type and features
 #
-# CONFIG_TICK_ONESHOT is not set
+CONFIG_TICK_ONESHOT=y
 # CONFIG_NO_HZ is not set
-# CONFIG_HIGH_RES_TIMERS is not set
+CONFIG_HIGH_RES_TIMERS=y
 # CONFIG_SMP is not set
 CONFIG_X86_PC=y
 # CONFIG_X86_ELAN is not set

The -oldconfig1 is the kernel that had no problems and the other shows the
b44 problem. So if High Resolution Timer Support is disabled everything
works fine and if I enable it the problems do appear again.

I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling High
Resolution Timer Support will also solve the problem there.

The older kernels I tried also work perfectly fine and they didn't have the
High Resolution Timer Support yet.

Maxi
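Editorial note: comparing two kernel .config files, as done by hand above, can also be sketched in a few lines of Python. The helpers `parse_config` and `config_changes` are hypothetical names for this illustration; the sample text reuses only the three option lines from the "Processor type and features" hunk of the diff. An option that is absent and one marked "is not set" are both treated as `None` here, which is a simplification.

```python
def parse_config(text):
    """Map option name -> value; '# CONFIG_X is not set' is stored as None."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options

def config_changes(old_text, new_text):
    """Return {option: (old value, new value)} for every option that differs."""
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }

working = """# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
"""
affected = """CONFIG_TICK_ONESHOT=y
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y
"""

for name, (before, after) in config_changes(working, affected).items():
    print(name, before, "->", after)
```

Run on the excerpt, this reports CONFIG_HIGH_RES_TIMERS and CONFIG_TICK_ONESHOT as the only changed options, matching the conclusion drawn in the message.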
Re: Oops with prism54 in 2.6.22-rc3
On Monday 28 May 2007, Björn Steinbrink wrote:
> On 2007.05.26 14:42:30 +0200, Maximilian Engelhardt wrote:
> > Hello,
> >
> > when using the prism54 driver included in the 2.6.22-rc3 kernel I get
> > this Oops when putting the card into monitor mode:
> >
> > BUG: unable to handle kernel NULL pointer dereference at virtual address 01d8
> > printing eip:
> > c0500608
> > *pde =
> > Oops: 0002 [#1]
> > PREEMPT
> > Modules linked in: fuse
> > CPU:0
> > EIP:0060:[<c0500608>]Not tainted VLI
> > EFLAGS: 00010046 (2.6.22-rc3 #2)
> > EIP is at netif_rx+0x48/0xc0
> > eax: ebx: c18fdbc0 ecx: c087991c edx: c0879910
> > esi: 0246 edi: f7c68010 ebp: f7fe0ba0 esp: c07bbef0
> > ds: 007b es: 007b fs: gs: ss: 0068
> > Process swapper (pid: 0, ti=c07ba000 task=c075a280 task.ti=c07ba000)
> > Stack: f7ec c03d2b8f c07bbf24 0082 f7c68024 f7fe0800
> > c18fdbc0 0070 0046 0286 0286 0008 0007 0032dcd5
> > f7fe0ba0 0002 f7fe0800 c03d913d f7f4d2c0
> > Call Trace:
> > [<c03d2b8f>] islpci_eth_receive+0x12f/0x590
> > [<c03d913d>] islpci_interrupt+0x1cd/0x280
> > [<c0144e15>] handle_IRQ_event+0x25/0x50
> > [<c014669c>] handle_fasteoi_irq+0x5c/0xe0
> > [<c010674a>] do_IRQ+0x4a/0x80
> > [<c010498f>] common_interrupt+0x23/0x28
> > [<c0102b3a>] default_idle+0x2a/0x40
> > [<c01023e3>] cpu_idle+0x43/0x80
> > [<c07bcb2a>] start_kernel+0x21a/0x260
> > [<c07bc450>] unknown_bootoption+0x0/0x260
> > ===
> > Code: c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 9c 5e fa ff 05 bc 9c 87
> > c0 a1 0c 99 87 c0 3b 05 c0 4a 7b c0 77 30 85 c0 74 43 8b 43 14 ff 80 d8
> > 01 00 00 a1 08 99 87 c0 ff 05 0c 99 87 c0 c7 03 04 99
> > EIP: [<c0500608>] netif_rx+0x48/0xc0 SS:ESP 0068:c07bbef0
> > Kernel panic - not syncing: Fatal exception in interrupt
> >
> > After this the system is frozen. Using kernel 2.6.21 everything works
> > fine, I can capture packets in monitor mode and do not get any Oops.
>
> That's probably due to commit 4c13eb6657fe9ef7b4dc8f1a405c902e9e5234e0,
> which moved the setting of skb->dev into eth_type_trans, which is never
> called when the card is in monitor mode.
>
> Could you try this patch?
>
>
> Manually set the device of a skb for prism54 cards that are in monitor
> mode as we never call eth_type_trans in that case.
>
> Signed-off-by: Björn Steinbrink <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/net/wireless/prism54/islpci_eth.c b/drivers/net/wireless/prism54/islpci_eth.c
> index dd070cc..f49eb06 100644
> --- a/drivers/net/wireless/prism54/islpci_eth.c
> +++ b/drivers/net/wireless/prism54/islpci_eth.c
> @@ -378,9 +378,10 @@ islpci_eth_receive(islpci_private *priv)
>      display_buffer((char *) skb->data, skb->len);
>  #endif
>      /* take care of monitor mode and spy monitoring. */
> -    if (unlikely(priv->iw_mode == IW_MODE_MONITOR))
> +    if (unlikely(priv->iw_mode == IW_MODE_MONITOR)) {
> +        skb->dev = ndev;
>          discard = islpci_monitor_rx(priv, &skb);
> -    else {
> +    } else {
>          if (unlikely(skb->data[2 * ETH_ALEN] == 0)) {
>              /* The packet has a rx_annex. Read it for spy monitoring, Then
>               * remove it, while keeping the 2 leading MAC addr.

With this patch monitor mode does work fine.

Maxi
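Editorial note: the explanation above (the device pointer is only set inside eth_type_trans, and monitor mode never calls it) can be illustrated with a toy model. This is not kernel code; `Packet`, `receive` and the `patched` flag are invented for the illustration and merely mirror the control flow described in the message.

```python
class Packet:
    """Stand-in for an skb; 'dev' plays the role of skb->dev."""
    def __init__(self):
        self.dev = None

def eth_type_trans(packet, device):
    # Per the message, the referenced commit moved setting skb->dev here.
    packet.dev = device

def receive(packet, device, monitor_mode, patched):
    """Return True if the packet has a device when it is handed onwards."""
    if monitor_mode:
        if patched:
            packet.dev = device  # the line the patch adds for monitor mode
        # eth_type_trans is never called on this path
    else:
        eth_type_trans(packet, device)
    return packet.dev is not None

for monitor_mode in (False, True):
    for patched in (False, True):
        ok = receive(Packet(), "ndev", monitor_mode, patched)
        print("monitor" if monitor_mode else "normal", "patched" if patched else "unpatched", ok)
```

In the model, the only combination that leaves the packet without a device is monitor mode without the patch, which corresponds to the NULL pointer dereference in netif_rx reported above.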
Re: software suspend doesn't work with 2.6.22-rc3
On Monday 28 May 2007, Rafael J. Wysocki wrote:
On Monday, 28 May 2007 09:59, Rafael J. Wysocki wrote:
On Monday, 28 May 2007 02:21, Maximilian Engelhardt wrote:
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote:
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
On Saturday 26 May 2007, Nigel Cunningham wrote:

Hi.

On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
On Saturday 26 May 2007, Nigel Cunningham wrote:

Hi.

On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:

Hello,

When I try software suspend on my laptop it always returns to my running
system after some time. This is what's logged by the kernel:

swsusp: Basic memory bitmaps created
Stopping tasks ...
Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

I have no idea what's the problem, but if you tell me what I should do I can
create debugging information and/or test patches.

Could you try this patch, please? It should help.

Herbert, is this right? If cryptd is going to be used for block devs, the
task should probably be PF_NOFREEZE (or whatever it is today) instead.

Regards,

Nigel

 crypto/cryptd.c |1 +
 include/linux/freezer.h |3 +++
 kernel/power/process.c |2 +-
 3 files changed, 5 insertions(+), 1 deletion(-)
diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
--- 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0 +1000
+++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26 19:45:42.0 +1000
@@ -341,6 +341,7 @@ static int cryptd_thread(void *data)

         mutex_unlock(&state->mutex);

+        try_to_freeze();
         schedule();
     } while (!stop);

I tried your patch, but when I apply it my kernel doesn't compile any more.
I get these warnings/errors:

[...]
CC crypto/cryptd.o
crypto/cryptd.c: In function ‘cryptd_thread’:
crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
[...]
LD init/built-in.o
LD .tmp_vmlinux1
crypto/built-in.o: In function `cryptd_thread':
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
make: *** [.tmp_vmlinux1] Error 1

Ah. You'll need to add #include <linux/freezer.h> near the start of
crypto/cryptd.c. Sorry for forgetting that.

Nigel

I added the include line and now I could compile the kernel, but suspending
still doesn't work.

swsusp: Basic memory bitmaps created
Stopping tasks ...
Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

OK, this means that cryptd doesn't execute the try_to_freeze() for some
reason.

Please apply the appended patch on top of 2.6.22-rc3 and see if that helps.

Greetings,
Rafael

---
 crypto/cryptd.c |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.22-rc3/crypto/cryptd.c
===
--- linux-2.6.22-rc3.orig/crypto/cryptd.c
+++ linux-2.6.22-rc3/crypto/cryptd.c
@@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
     struct cryptd_state *state = data;
     int stop;

+    current->flags |= PF_NOFREEZE;
     do {
         struct crypto_async_request *req, *backlog;

Even with this patch suspending doesn't work, dmesg shows the same error
message.

I also did build a kernel without cryptd and suspending does work there.

Well, that's strange, because in that case the freezer shouldn't even wait
for cryptd.

Can you please try the patch at http://lkml.org/lkml/2007/5/26/169 ?

With this patch applied suspend does work fine.

Hmm.

IMO the patch is too intrusive for 2.6.22, but OTOH it's going into the
direction preferred by some prominent people. ;-)

Let's try to combine the two threads and see what results from that.

Well, it looks like we have
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Michael Buesch wrote:
> Ok, another question: On which CPU architecture are you?

[EMAIL PROTECTED]:~$ uname -m
i686

Maxi
Re: software suspend doesn't work with 2.6.22-rc3
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote:
> > On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> > > On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
> > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > Hi.
> > > > >
> > > > > On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > > > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > > > Hi.
> > > > > > >
> > > > > > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > When I try software suspend on my laptop it always returns to my running system after some time.
> > > > > > > > This is what's logged by the kernel:
> > > > > > > >
> > > > > > > > swsusp: Basic memory bitmaps created
> > > > > > > > Stopping tasks ...
> > > > > > > > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
> > > > > > > > cryptd
> > > > > > > > Restarting tasks ... done.
> > > > > > > > swsusp: Basic memory bitmaps freed
> > > > > > > >
> > > > > > > > I have no idea what's the problem, but if you tell me what I should do I can create debugging information and/or test patches.
> > > > > > >
> > > > > > > Could you try this patch, please? It should help.
> > > > > > >
> > > > > > > Herbert, is this right? If cryptd is going to be used for block devs, the task should probably be PF_NOFREEZE (or whatever it is today) instead.
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Nigel
> > > > > > >
> > > > > > > crypto/cryptd.c | 1 +
> > > > > > > include/linux/freezer.h | 3 +++
> > > > > > > kernel/power/process.c | 2 +-
> > > > > > > 3 files changed, 5 insertions(+), 1 deletion(-)
> > > > > > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> > > > > > > --- 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0 +1000
> > > > > > > +++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26 19:45:42.0 +1000
> > > > > > > @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
> > > > > > >
> > > > > > >          mutex_unlock(&state->mutex);
> > > > > > >
> > > > > > > +        try_to_freeze();
> > > > > > >          schedule();
> > > > > > >      } while (!stop);
> > > > > >
> > > > > > I tried your patch, but when I apply it my kernel doesn't compile any more. I get these warnings/errors:
> > > > > >
> > > > > > [...]
> > > > > > CC crypto/cryptd.o
> > > > > > crypto/cryptd.c: In function ‘cryptd_thread’:
> > > > > > crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
> > > > > > [...]
> > > > > > LD init/built-in.o
> > > > > > LD .tmp_vmlinux1
> > > > > > crypto/built-in.o: In function `cryptd_thread':
> > > > > > cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
> > > > > > make: *** [.tmp_vmlinux1] Error 1
> > > > >
> > > > > Ah. You'll need to add #include <linux/freezer.h> near the start of crypto/cryptd.c. Sorry for forgetting that.
> > > > >
> > > > > Nigel
> > > >
> > > > I added the include line and now I could compile the kernel, but suspending still doesn't work.
> > > >
> > > > swsusp: Basic memory bitmaps created
> > > > Stopping tasks ...
> > > > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
> > > > cryptd
> > > > Restarting tasks ... done.
> > > > swsusp: Basic memory bitmaps freed
> > >
> > > OK, this means that cryptd doesn't execute the try_to_freeze() for some reason.
> > >
> > > Please apply the appended patch on top of 2.6.22-rc3 and see if that helps.
> > >
> > > Greetings,
> > > Rafael
> > >
> > > ---
> > > crypto/cryptd.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > Index: linux-2.6.22-rc3/crypto/cryptd.c
> > > ===
> > > --- linux-2.6.22-rc3.orig/crypto/cryptd.c
> > > +++ linux-2.6.22-rc3/crypto/cryptd.c
> > > @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
> > >      struct cryptd_state *state = data;
> > >      int stop;
> > >
> > > +    current->flags |= PF_NOFREEZE;
> > >      do {
> > >          struct crypto_async_request *req, *backlog;
> >
> > Even with this patch suspending doesn't work; dmesg shows the same error message.
> > I also built a kernel without cryptd, and suspending does work there.
>
> Well, that's strange, because in that case the freezer shouldn't even wait for cryptd.
>
> Can you please try the patch at http://lkml.org/lkml/2007/5/26/169 ?

With this patch applied suspend does work fine.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [ 5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [ 5] 0.0-60.6 sec 1.13 MBytes 157 Kbits/sec
> > [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [ 4] 0.0-63.1 sec 2.82 MBytes 375 Kbits/sec
> >
> > 2.6.22-rc3:
> > [ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec
> > [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [ 4] 0.0-63.1 sec 7.27 MBytes 967 Kbits/sec
>
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
>
> Can you re-check to make sure it's not just some test-jitter?

2.6.21.1:
[ 5] local 192.168.1.2 port 54423 connected with 192.168.1.1 port 5001
[ 5] 0.0-60.3 sec 3.06 MBytes 426 Kbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 41053
[ 4] 0.0-163.0 sec 130 MBytes 6.67 Mbits/sec

2.6.22-rc3:
[ 5] local 192.168.1.2 port 46002 connected with 192.168.1.1 port 5001
[ 5] 0.0-61.5 sec 84.0 MBytes 11.5 Mbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 44379
[ 4] 0.0-93.8 sec 30.6 MBytes 2.74 Mbits/sec

For TX the iperf server reports the same values as the client (all values above are from the client), but for RX they are different:

2.6.21.1 (iperf server log):
[ 5] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 54423
[ 5] 0.0-60.5 sec 3.06 MBytes 425 Kbits/sec
[ 5] local 192.168.1.1 port 41053 connected with 192.168.1.2 port 5001
[ 5] 0.0-63.1 sec 130 MBytes 17.2 Mbits/sec

2.6.22-rc3 (iperf server log):
[ 4] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 46002
[ 4] 0.0-61.6 sec 84.0 MBytes 11.5 Mbits/sec
[ 4] local 192.168.1.1 port 44379 connected with 192.168.1.2 port 5001
[ 4] 0.0-63.3 sec 30.6 MBytes 4.06 Mbits/sec

I have no idea how iperf works internally and what can cause such different results here.

> --- linux-2.6.21.1/drivers/net/b44.c 2007-05-27 22:58:01.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c 2007-05-27 23:01:44.0 +0200
> @@ -825,12 +825,11 @@
>              if (copy_skb == NULL)
>                  goto drop_it_no_recycle;
>
> -            copy_skb->dev = bp->dev;
>              skb_reserve(copy_skb, 2);
>              skb_put(copy_skb, len);
>              /* DMA sync done above, copy just the actual packet */
> -            memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
> -
> +            skb_copy_from_linear_data_offset(skb, bp->rx_offset,
> +                                             copy_skb->data, len);
>              skb = copy_skb;
>          }
>          skb->ip_summed = CHECKSUM_NONE;
> @@ -1007,7 +1006,8 @@
>              goto err_out;
>          }
>
> -        memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
> +        skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
> +                                  skb->len);
>          dev_kfree_skb_any(skb);
>          skb = bounce_skb;
>      }
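One way to read the differing RX figures in the message above: the client and server lines carry the same transfer size but different intervals, and the bandwidth column is simply size divided by interval. The Python sketch below recomputes it. It assumes that iperf counts an MByte as 2**20 bytes and an Mbit as 10**6 bits; that is an assumption, although it is consistent with the figures in the logs. The function name is illustrative.

```python
def mbits_per_sec(mbytes, seconds):
    """Bandwidth in Mbits/sec (10**6 bits) for a transfer of `mbytes` (2**20 bytes each)."""
    return mbytes * 2**20 * 8 / seconds / 1e6


# RX run on 2.6.21.1: both sides report 130 MBytes, but over different intervals.
client = mbits_per_sec(130, 163.0)  # client line: 0.0-163.0 sec
server = mbits_per_sec(130, 63.1)   # server line: 0.0-63.1 sec

print(f"client view: {client:.2f} Mbits/sec")
print(f"server view: {server:.2f} Mbits/sec")
```

Both results land close to the logged 6.67 and 17.2 Mbits/sec (the logged size is rounded). So the gap appears to come from how long each side considered the transfer to last, not from different amounts of data. Why the client's interval ran to 163 seconds is not answered by this arithmetic.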
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> > On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > > 2.6.21.1:
> > > [ 5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > > [ 5] 0.0-60.6 sec 1.13 MBytes 157 Kbits/sec
> > > [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > > [ 4] 0.0-63.1 sec 2.82 MBytes 375 Kbits/sec
> > >
> > > 2.6.22-rc3:
> > > [ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > > [ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec
> > > [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > > [ 4] 0.0-63.1 sec 7.27 MBytes 967 Kbits/sec
> >
> > This is the diff between these two kernels.
> > I'm not sure why you see a much better TX throughput here.
> >
> > Can you re-check to make sure it's not just some test-jitter?
>
> Oh, eh, and what I forgot to ask:
> Do you know an old kernel that works perfectly well for you,
> so I can look at a diff between this one and anything >=2.6.21.1.

I don't know of any. Most older kernels worked fine for me, but I never used iperf there, so I guess that if the bug is there as well I simply didn't trigger it. If you think it's useful I could go back and try different kernels, but that would take some time.

Except for the iperf bug, 2.6.21.1 and 2.6.22-rc3 work fine.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> > When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in
> > normal use I didn't notice any problems. It did work fine as I would
> > expect it. I think the wget and ping tests here are as they should be.
> >
> > With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The
> > ping test does confirm this, because here response times are very high.
> > As far as I can remember the wget download rate was a bit slower than
> > 2.6.21.1 or 2.6.22-rc3 till it stalled.
> > I would expect it to be something like the other two kernels. The two
> > problems I see are the high ping times and the fact that the card stopped
> > working.
> >
> > I don't know why the iperf results are so different from my personal
> > experience. I guess the fact that I get so bad results with 2.6.21.1 and
> > 2.6.22-rc3 is that iperf does something that causes the system to be
> > extremely slow and thus degrading performance. This could be a bug
> > somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has
> > unintentionally been fixed by the ssb switch, but that's only a rough guess.
>
> Ok. I guess (Yes I do :D) that there is an IRQ storm or something like
> that, because you say that your system is becoming very slow and
> unresponsive. It sounds like an IRQ is not ACKed correctly and so keeps
> triggering and stalling the system. I'll take a look at a few diffs...
> Do you see significant differences in the "hi" and/or "si" times in top?
> Do you see a significant difference in the /proc/interrupts count. For
> example that the kernel that works worse generates 10 times the IRQ count
> for the same amount of data.

ok, here are the results:

Using 2.6.22-rc3 I get lots of hi during TX and lots of hi and si during RX. Using 2.6.22-rc3-mm1, hi and si are significantly lower. It's difficult to give absolute numbers, because top refreshes very slowly, but with 2.6.22-rc3 hi is about 30% during TX and RX and si is 0% during TX and 50% during RX. With 2.6.22-rc3-mm1 hi is 0% during TX and 0.3% during RX and si is 10% during TX and 0% during RX.

When I do the same test on both kernels I get about 10 times (yes, it's really about ten times, like in your example) more interrupts with 2.6.22-rc3 than with 2.6.22-rc3-mm1.

An additional thing I noticed is that it's not the BCM4401 card that stops working but my e100 card. If I take the e100 card down and up again the connection works again, so the BCM4401 doesn't have a "stops working" bug for me.

Maxi
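The /proc/interrupts comparison asked for above can be done by saving the file before and after a test run and subtracting the counters. The following Python sketch shows one way to do that on saved text. The function names are illustrative, and the sample snapshots are made-up numbers for a single-CPU machine with the card on IRQ 10; they are not measurements from this thread.

```python
def parse_interrupts(text):
    """Map each IRQ label to its count summed over all CPU columns."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Only the first `ncpus` fields are counters; the rest name the
        # interrupt controller and the device.
        per_cpu = [int(f) for f in fields[:ncpus] if f.isdigit()]
        counts[label.strip()] = sum(per_cpu)
    return counts


def irq_delta(before, after):
    """Interrupts raised per IRQ between two /proc/interrupts snapshots."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    return {irq: a[irq] - b.get(irq, 0) for irq in a}


# Hypothetical snapshots taken before and after one iperf run.
before = """
           CPU0
  0:     100000    XT-PIC  timer
 10:       5000    XT-PIC  eth0
"""
after = """
           CPU0
  0:     160000    XT-PIC  timer
 10:      55000    XT-PIC  eth0
"""

print(irq_delta(before, after))
```

Running the same transfer on both kernels and comparing the delta for the card's IRQ gives the kind of ratio reported in the message (about ten times more interrupts on 2.6.22-rc3).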
Re: software suspend doesn't work with 2.6.22-rc3
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
> > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > Hi.
> > >
> > > On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > Hi.
> > > > >
> > > > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > > > Hello,
> > > > > >
> > > > > > When I try software suspend on my laptop it always returns to my running system after some time.
> > > > > > This is what's logged by the kernel:
> > > > > >
> > > > > > swsusp: Basic memory bitmaps created
> > > > > > Stopping tasks ...
> > > > > > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
> > > > > > cryptd
> > > > > > Restarting tasks ... done.
> > > > > > swsusp: Basic memory bitmaps freed
> > > > > >
> > > > > > I have no idea what's the problem, but if you tell me what I should do I can create debugging information and/or test patches.
> > > > >
> > > > > Could you try this patch, please? It should help.
> > > > >
> > > > > Herbert, is this right? If cryptd is going to be used for block devs, the task should probably be PF_NOFREEZE (or whatever it is today) instead.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Nigel
> > > > >
> > > > > crypto/cryptd.c | 1 +
> > > > > include/linux/freezer.h | 3 +++
> > > > > kernel/power/process.c | 2 +-
> > > > > 3 files changed, 5 insertions(+), 1 deletion(-)
> > > > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> > > > > --- 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0 +1000
> > > > > +++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26 19:45:42.0 +1000
> > > > > @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
> > > > >
> > > > >          mutex_unlock(&state->mutex);
> > > > >
> > > > > +        try_to_freeze();
> > > > >          schedule();
> > > > >      } while (!stop);
> > > >
> > > > I tried your patch, but when I apply it my kernel doesn't compile any more. I get these warnings/errors:
> > > >
> > > > [...]
> > > > CC crypto/cryptd.o
> > > > crypto/cryptd.c: In function ‘cryptd_thread’:
> > > > crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
> > > > [...]
> > > > LD init/built-in.o
> > > > LD .tmp_vmlinux1
> > > > crypto/built-in.o: In function `cryptd_thread':
> > > > cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
> > > > make: *** [.tmp_vmlinux1] Error 1
> > >
> > > Ah. You'll need to add #include <linux/freezer.h> near the start of crypto/cryptd.c. Sorry for forgetting that.
> > >
> > > Nigel
> >
> > I added the include line and now I could compile the kernel, but suspending still doesn't work.
> >
> > swsusp: Basic memory bitmaps created
> > Stopping tasks ...
> > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
> > cryptd
> > Restarting tasks ... done.
> > swsusp: Basic memory bitmaps freed
>
> OK, this means that cryptd doesn't execute the try_to_freeze() for some reason.
>
> Please apply the appended patch on top of 2.6.22-rc3 and see if that helps.
>
> Greetings,
> Rafael
>
> ---
> crypto/cryptd.c | 1 +
> 1 file changed, 1 insertion(+)
>
> Index: linux-2.6.22-rc3/crypto/cryptd.c
> ===
> --- linux-2.6.22-rc3.orig/crypto/cryptd.c
> +++ linux-2.6.22-rc3/crypto/cryptd.c
> @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
>      struct cryptd_state *state = data;
>      int stop;
>
> +    current->flags |= PF_NOFREEZE;
>      do {
>          struct crypto_async_request *req, *backlog;

Even with this patch suspending doesn't work; dmesg shows the same error message.
I also built a kernel without cryptd, and suspending does work there.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.22-rc3:
> >
> > [ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec
> > [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [ 4] 0.0-63.1 sec 7.27 MBytes 967 Kbits/sec
>
> Why do we have two different measurements here? Is one TX and one RX?
> Which one?

Yes, the first is TX (BCM4401 --> e100) and the second is RX. Both are TCP connections. I think iperf displays the IP addresses wrongly for the second connection, but that's another issue.

> > koala:~# ping -c10 192.168.1.1
> > PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> > 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> > 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> > 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> > 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> > 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> > 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> > 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> > 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
> >
> > --- 192.168.1.1 ping statistics ---
> > 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> > rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
> >
> > System responsiveness was the same as with 2.6.21.1.
> >
> > wget got 11.23M/s, again same as 2.6.21.1.
> >
> >
> > 2.6.22-rc2-mm1:
> >
> > [ 5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> > [ 5] 0.0-60.1 sec 402 MBytes 56.1 Mbits/sec
> > [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> > [ 4] 0.0-63.0 sec 177 MBytes 23.6 Mbits/sec
>
> So with -mm (with ssb) you actually get better performance
> than with plain 2.6.22-rc3?
>
> Can you elaborate a bit more about what you get and what you expect
> on which kernel?

When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools, just in normal use, I didn't notice any problems. It worked fine, as I would expect. I think the wget and ping tests here are as they should be.

With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping test confirms this, because here response times are very high. As far as I can remember the wget download rate was a bit slower than with 2.6.21.1 or 2.6.22-rc3 until it stalled.
I would expect it to be something like the other two kernels. The two problems I see are the high ping times and the fact that the card stopped working.

I don't know why the iperf results are so different from my personal experience. I guess the reason I get such bad results with 2.6.21.1 and 2.6.22-rc3 is that iperf does something that makes the system extremely slow and thus degrades performance. This could be a bug somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been fixed by the ssb switch, but that's only a rough guess.

Maxi
Re: b44: regression in 2.6.22 (resend)
I am sending this again because my first mail accidentally had HTML code in it and might have been filtered by some people.

On Saturday 26 May 2007, Michael Buesch wrote:
> On Saturday 26 May 2007 02:24:31 Stephen Hemminger wrote:
> > Something is broken with the b44 driver in 2.6.22-rc1 or later. Now
> > bisecting. The performance (with iperf) for receiving is normally 94Mbits
> > or more. But something happened that dropped performance to less than
> > 1Mbit, probably corrupted packets.
> >
> > There is nothing obvious in the commit log for drivers/net/b44.c, so it
> > probably is something more general.
> >
> >
> > Looking at the code in b44_rx(), I see a couple of unrelated bugs:
> > 1. In the small packet case it recycles the skb before copying data
> > out... Not good if new data arrives overwriting existing data.
> >
> > 2. Macros like RX_PKT_BUF_SZ that depend on local variables are evil!!
>
> Very interesting!
> 2.6.22 doesn't include ssb, does it?
>
> Adding CCs to make reporters of another bugreport aware of this.

I did some more tests with my BCM4401 and different kernels, here are the results:

2.6.21.1:

iperf:
[ 5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
[ 5] 0.0-60.6 sec 1.13 MBytes 157 Kbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
[ 4] 0.0-63.1 sec 2.82 MBytes 375 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.241 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.231 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.237 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8998ms
rtt min/avg/max/mdev = 0.215/0.230/0.241/0.018 ms

The system was unusable while I ran the iperf test: when I moved the mouse it only jumped around, and anything like starting programs or switching the desktop only happened after iperf had finished its test.

I did an HTTP download with wget and got 11.23M/s.

2.6.22-rc3:

[ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
[ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
[ 4] 0.0-63.1 sec 7.27 MBytes 967 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8997ms
rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms

System responsiveness was the same as with 2.6.21.1.

wget got 11.23M/s, again same as 2.6.21.1.

2.6.22-rc2-mm1:

[ 5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
[ 5] 0.0-60.1 sec 402 MBytes 56.1 Mbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
[ 4] 0.0-63.0 sec 177 MBytes 23.6 Mbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=52.7 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=86.7 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=8.22 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=32.1 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=56.0 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=80.0 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.52 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=25.4 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=49.3 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.526/43.207/86.700/26.369 ms

Here system responsiveness was OK while I ran iperf; I didn't notice anything anomalous.

When I tried the wget HTTP download the transfer stalled, and from this point on I couldn't send or receive anything on my BCM4401 anymore. Taken the
Re: software suspend doesn't work with 2.6.22-rc3
On Saturday 26 May 2007, Nigel Cunningham wrote:
> Hi.
>
> On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > Hi.
> > >
> > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > Hello,
> > > >
> > > > When I try software suspend on my laptop it always returns to my running system after some time.
> > > > This is what's logged by the kernel:
> > > >
> > > > swsusp: Basic memory bitmaps created
> > > > Stopping tasks ...
> > > > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
> > > > cryptd
> > > > Restarting tasks ... done.
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > I have no idea what's the problem, but if you tell me what I should do I can create debugging information and/or test patches.
> > >
> > > Could you try this patch, please? It should help.
> > >
> > > Herbert, is this right? If cryptd is going to be used for block devs, the task should probably be PF_NOFREEZE (or whatever it is today) instead.
> > >
> > > Regards,
> > >
> > > Nigel
> > >
> > > crypto/cryptd.c | 1 +
> > > include/linux/freezer.h | 3 +++
> > > kernel/power/process.c | 2 +-
> > > 3 files changed, 5 insertions(+), 1 deletion(-)
> > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> > > --- 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0 +1000
> > > +++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26 19:45:42.0 +1000
> > > @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
> > >
> > >          mutex_unlock(&state->mutex);
> > >
> > > +        try_to_freeze();
> > >          schedule();
> > >      } while (!stop);
> >
> > I tried your patch, but when I apply it my kernel doesn't compile any more. I get these warnings/errors:
> >
> > [...]
> > CC crypto/cryptd.o
> > crypto/cryptd.c: In function ‘cryptd_thread’:
> > crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
> > [...]
> > LD init/built-in.o
> > LD .tmp_vmlinux1
> > crypto/built-in.o: In function `cryptd_thread':
> > cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
> > make: *** [.tmp_vmlinux1] Error 1
>
> Ah. You'll need to add #include <linux/freezer.h> near the start of crypto/cryptd.c. Sorry for forgetting that.
>
> Nigel

I added the include line and now I could compile the kernel, but suspending still doesn't work.

swsusp: Basic memory bitmaps created
Stopping tasks ...
Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

Maxi
When I tried the wget HTTP download, the transfer stalled, and from that point on I couldn't send or receive anything on my BCM4401 anymore. Taking the interface down and up again didn't
Re: software suspend doesn't work with 2.6.22-rc3
On Sunday 27 May 2007, Rafael J. Wysocki wrote: On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote: On Saturday 26 May 2007, Nigel Cunningham wrote: Hi. On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote: On Saturday 26 May 2007, Nigel Cunningham wrote: Hi. On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote: Hello, When I try software suspend on my laptop it always returns to my running system after some time. This is what's logged by the kernel: swsusp: Basic memory bitmaps created Stopping tasks ... Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze): cryptd Restarting tasks ... done. swsusp: Basic memory bitmaps freed I have no idea what's the problem, but if you tell me what I should do I can create debugging information and/or test patches. Could you try this patch, please? It should help. Herbert, is this right? If cryptd is going to be used for block devs, the task should probably be PF_NOFREEZE (or whatever it is today) instead. Regards, Nigel crypto/cryptd.c |1 + include/linux/freezer.h |3 +++ kernel/power/process.c |2 +- 3 files changed, 5 insertions(+), 1 deletion(-) diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c --- 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0 +1000 +++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26 19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int cryptd_thread(void *data) mutex_unlock(state-mutex); + try_to_freeze(); schedule(); } while (!stop); I tried your patch, but when I apply it my kernel doesn't compile any more. I get these warnings/errors: [...] CC crypto/cryptd.o crypto/cryptd.c: In function ‘cryptd_thread’: crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’ [...] LD init/built-in.o LD .tmp_vmlinux1 crypto/built-in.o: In function `cryptd_thread': cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze' make: *** [.tmp_vmlinux1] Error 1 Ah. 
You'll need to add #include <linux/freezer.h> near the start of crypto/cryptd.c. Sorry for forgetting that.

Nigel

I added the include line and now I could compile the kernel, but suspending still doesn't work.

swsusp: Basic memory bitmaps created
Stopping tasks ...
Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze):
 cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

OK, this means that cryptd doesn't execute the try_to_freeze() for some reason. Please apply the appended patch on top of 2.6.22-rc3 and see if that helps.

Greetings,
Rafael

---
 crypto/cryptd.c |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.22-rc3/crypto/cryptd.c
===
--- linux-2.6.22-rc3.orig/crypto/cryptd.c
+++ linux-2.6.22-rc3/crypto/cryptd.c
@@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
 	struct cryptd_state *state = data;
 	int stop;
 
+	current->flags |= PF_NOFREEZE;
 	do {
 		struct crypto_async_request *req, *backlog;

Even with this patch suspending doesn't work; dmesg shows the same error message. I also built a kernel without cryptd, and suspending works there.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote: On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote: 2.6.22-rc3: [ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001 [ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633 [ 4] 0.0-63.1 sec 7.27 MBytes967 Kbits/sec Why do we have two different measurements here? Is one TX and one RX? Which one? Yes, the first is TX (BCM4401 -- e100) and the second is RX. Both are tcp connections. I think iperf does display the ip addresses wrong in the second connection, but that's another issue. koala:~# ping -c10 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms --- 192.168.1.1 ping statistics --- 10 packets transmitted, 10 received, 0% packet loss, time 8997ms rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms System responsiveness was the same as with 2.6.21.1. wget got 11.23M/s, again same as 2.6.21.1. 2.6.22-rc2-mm1: [ 5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001 [ 5] 0.0-60.1 sec402 MBytes 56.1 Mbits/sec [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598 [ 4] 0.0-63.0 sec177 MBytes 23.6 Mbits/sec So with -mm (with ssb) you actually get better performace then with plain 2.6.22-rc3? Can you elaborate a bit more about what you get an what you expect on which kernel? 
When I ran 2.6.21.1 or 2.6.22-rc3 in normal use, without any debugging tools, I didn't notice any problems. It worked fine, as I would expect. I think the wget and ping tests here are as they should be.

With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping test confirms this, because the response times are very high. As far as I can remember, the wget download rate was a bit lower than with 2.6.21.1 or 2.6.22-rc3 until it stalled. I would expect it to be something like the other two kernels. The two problems I see are the high ping times and the fact that the card stopped working.

I don't know why the iperf results are so different from my personal experience. My guess is that I get such bad results with 2.6.21.1 and 2.6.22-rc3 because iperf does something that makes the system extremely slow, which in turn degrades performance. This could be a bug somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been fixed by the ssb switch, but that's only a rough guess.

Maxi
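The high ping times mentioned above can be summarized with a few lines of code. This is only a minimal sketch, assuming the ten round-trip times of the 2.6.22-rc2-mm1 run are copied by hand from the ping output quoted in this thread; the name `rtt_summary` is made up for illustration. The deviation is computed as the square root of (mean of squares minus square of the mean), which agrees with the mdev value in the reported summary line up to the rounding of the displayed samples.

```python
import math


def rtt_summary(samples_ms):
    """Return (min, avg, max, mdev) for a list of round-trip times in ms."""
    n = len(samples_ms)
    avg = sum(samples_ms) / n
    mean_sq = sum(x * x for x in samples_ms) / n
    # Guard against a tiny negative value caused by floating-point rounding.
    mdev = math.sqrt(max(mean_sq - avg * avg, 0.0))
    return min(samples_ms), avg, max(samples_ms), mdev


# Round-trip times copied from the 2.6.22-rc2-mm1 ping run in this thread.
MM1_RTT_MS = [39.8, 52.7, 86.7, 8.22, 32.1, 56.0, 80.0, 1.52, 25.4, 49.3]

if __name__ == "__main__":
    lo, avg, hi, mdev = rtt_summary(MM1_RTT_MS)
    print(f"rtt min/avg/max/mdev = {lo:.3f}/{avg:.3f}/{hi:.3f}/{mdev:.3f} ms")
```

Running the same function on the 2.6.21.1 or 2.6.22-rc3 samples makes the difference obvious: there the average stays well below 1 ms, while on 2.6.22-rc2-mm1 both the average and the spread are in the tens of milliseconds.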
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote: On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote: When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in normal use I didn't notice any problems. It did work fine as I would expect it. I think the wget and ping tests here are as they should be. With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping test does confirm this, because here response times are very high. As far as I can remember the wget download rate was a bit slower than 2.6.21.1 or 2.6.22-rc3 till it stalled. I would expect it to be someting like the other two kernels. The two problems I see are the high ping times and the fact that the card stopped working. I don't know why the iperf results are so different from my personal experience. I guess the fact that I get so bad results with 2.6.21.1 and 2.6.22-rc3 is that iperf does something that causes the system to be extremely slow and thus degrading performance. This could be a bug somewhere in the b44 driver of 2.6.21.1 and 2.6.22-RC3 that has unintended been fixed by the ssb switch, but that's only a roughly guess. Ok. I guess (Yes I do :D) that there is an IRQ storm or something like that, because you say that your system is becoming very slow and unresponsive. It sounds like an IRQ is not ACKed correctly and so keeps triggering and stalling the system. I'll take a look at a few diffs... Do you see significant differences in the hi and/or si times in top? Do you see a significant difference in the /proc/interrupts count. For example that the kernel that works worse generates 10 times the IRQ count for the same amount of data. ok, here are the results: Using 2.6.22-rc3 I get lot's of hi during TX and lots of hi and si during RX. Using 2.6.22-rc3-mm1 hi and si are significantly lower. It's difficult to give absolute numbers, because top refreshes very slow, but with 2.6.22-rc3 hi is about 30% during TX and RX and si is 0% during TX and 50% during RX. 
With 2.6.22-rc3-mm1, hi is 0% during TX and 0.3% during RX, and si is 10% during TX and 0% during RX.

When I do the same test on both kernels I get about 10 times (yes, it's really about ten times, like in your example) more interrupts with 2.6.22-rc3 than with 2.6.22-rc3-mm1.

An additional thing I noticed is that it's not the BCM4401 card that stops working but my e100 card. If I take the e100 card down and up again the connection works again, so the BCM4401 doesn't have a "stops working" bug for me.

Maxi
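The interrupt-count comparison described above can be made repeatable by taking a snapshot of /proc/interrupts before and after a test run and subtracting the two. The following is a minimal sketch, assuming the usual layout of that file (a header row naming the CPU columns, then one row per IRQ). The helper names and all numbers in the sample snapshots are invented for illustration; they are not measurements from this thread.

```python
def parse_interrupts(text):
    """Map each IRQ label to its count summed over all CPU columns."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 [CPU1 ...]
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                       # not an "IRQ: counts ..." row
        per_cpu = [int(f) for f in rest.split()[:ncpus] if f.isdigit()]
        counts[label.strip()] = sum(per_cpu)
    return counts


def interrupt_delta(before, after):
    """Interrupts raised per IRQ between two /proc/interrupts snapshots."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    return {irq: new[irq] - old.get(irq, 0) for irq in new}


# Invented sample snapshots; on a real system read /proc/interrupts instead.
BEFORE = """
           CPU0
  0:     120000    XT-PIC-XT    timer
 10:       5000    XT-PIC-XT    eth0
"""
AFTER = """
           CPU0
  0:     135000    XT-PIC-XT    timer
 10:     905000    XT-PIC-XT    eth0
"""

if __name__ == "__main__":
    for irq, delta in interrupt_delta(BEFORE, AFTER).items():
        print(f"IRQ {irq}: +{delta}")
```

Dividing the delta of the network card's IRQ by the number of bytes transferred in the same run gives a figure that can be compared directly between two kernels.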
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote:
On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:

2.6.21.1:
[ 5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
[ 5] 0.0-60.6 sec 1.13 MBytes 157 Kbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
[ 4] 0.0-63.1 sec 2.82 MBytes 375 Kbits/sec

2.6.22-rc3:
[ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
[ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec
[ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
[ 4] 0.0-63.1 sec 7.27 MBytes 967 Kbits/sec

This is the diff between these two kernels. I'm not sure why you see a much better TX throughput here. Can you re-check to make sure it's not just some test-jitter?

Oh, eh, and what I forgot to ask: Do you know an old kernel that works perfectly well for you, so I can look at a diff between this one and anything =2.6.21.1.

I don't know of any. Most older kernels worked fine for me, but I never used iperf there, so I guess that if the bug is there as well I simply didn't trigger it. If you think it's useful I could go back and try different kernels, but that would take some time. Apart from the iperf bug, 2.6.21.1 and 2.6.22-rc3 work fine.

Maxi
Re: b44: regression in 2.6.22 (resend)
On Sunday 27 May 2007, Michael Buesch wrote: On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote: 2.6.21.1: [ 5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001 [ 5] 0.0-60.6 sec 1.13 MBytes157 Kbits/sec [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837 [ 4] 0.0-63.1 sec 2.82 MBytes375 Kbits/sec 2.6.22-rc3: [ 5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001 [ 5] 0.0-60.4 sec 58.9 MBytes 8.18 Mbits/sec [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633 [ 4] 0.0-63.1 sec 7.27 MBytes967 Kbits/sec This is the diff between these two kernels. I'm not sure why you see a much better TX throughput here. Can you re-check to make sure it's not just some test-jitter? 2.6.21.1: [ 5] local 192.168.1.2 port 54423 connected with 192.168.1.1 port 5001 [ 5] 0.0-60.3 sec 3.06 MBytes426 Kbits/sec [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 41053 [ 4] 0.0-163.0 sec130 MBytes 6.67 Mbits/sec 2.6.22-rc3: [ 5] local 192.168.1.2 port 46002 connected with 192.168.1.1 port 5001 [ 5] 0.0-61.5 sec 84.0 MBytes 11.5 Mbits/sec [ 4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 44379 [ 4] 0.0-93.8 sec 30.6 MBytes 2.74 Mbits/sec For TX the iperf server reports the same values as the client (all values are from the client) but for RX they are differen: 2.6.21.1: (iperf server log): [ 5] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 54423 [ 5] 0.0-60.5 sec 3.06 MBytes425 Kbits/sec [ 5] local 192.168.1.1 port 41053 connected with 192.168.1.2 port 5001 [ 5] 0.0-63.1 sec130 MBytes 17.2 Mbits/sec 2.6.22-rc3 (iperf server log): [ 4] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 46002 [ 4] 0.0-61.6 sec 84.0 MBytes 11.5 Mbits/sec [ 4] local 192.168.1.1 port 44379 connected with 192.168.1.2 port 5001 [ 4] 0.0-63.3 sec 30.6 MBytes 4.06 Mbits/sec I have no idea how iperf internally works and what can cause such different results here. 
--- linux-2.6.21.1/drivers/net/b44.c 2007-05-27 22:58:01.0 +0200
+++ linux-2.6.22-rc3/drivers/net/b44.c 2007-05-27 23:01:44.0 +0200
@@ -825,12 +825,11 @@
 			if (copy_skb == NULL)
 				goto drop_it_no_recycle;
 
-			copy_skb->dev = bp->dev;
 			skb_reserve(copy_skb, 2);
 			skb_put(copy_skb, len);
 			/* DMA sync done above, copy just the actual packet */
-			memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
-
+			skb_copy_from_linear_data_offset(skb, bp->rx_offset,
+							 copy_skb->data, len);
 			skb = copy_skb;
 		}
 		skb->ip_summed = CHECKSUM_NONE;
@@ -1007,7 +1006,8 @@
 			goto err_out;
 		}
 
-		memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
+		skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
+					  skb->len);
 		dev_kfree_skb_any(skb);
 		skb = bounce_skb;
 	}
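One possible reading of the differing RX figures in this message: the iperf client and server agree on the number of bytes transferred but not on the elapsed time (163.0 s versus 63.1 s on 2.6.21.1, 93.8 s versus 63.3 s on 2.6.22-rc3), and the reported rate is simply bytes divided by time. The sketch below assumes that iperf counts MBytes as 2^20 bytes and Mbits as 10^6 bits; that convention is an assumption, chosen because it reproduces the figures quoted in this thread. The function name is made up for illustration.

```python
def iperf_mbits_per_sec(mbytes, seconds):
    """Convert an iperf transfer size and interval into Mbits/sec."""
    bits = mbytes * 2**20 * 8      # MBytes taken as 2**20 bytes (assumption)
    return bits / seconds / 1e6    # Mbits taken as 10**6 bits (assumption)


# (transfer in MBytes, client interval in s, server interval in s)
# for the 2.6.22-rc3 RX run quoted in this thread.
RC3_RX = (30.6, 93.8, 63.3)

if __name__ == "__main__":
    mbytes, client_s, server_s = RC3_RX
    print(f"client view: {iperf_mbits_per_sec(mbytes, client_s):.2f} Mbits/sec")
    print(f"server view: {iperf_mbits_per_sec(mbytes, server_s):.2f} Mbits/sec")
```

For the 2.6.22-rc3 run this gives 2.74 and 4.06 Mbits/sec, matching the client and server logs. The 2.6.21.1 figures come out slightly higher than reported, presumably because the 130 MBytes shown is itself a rounded value. Why the client measured a longer interval than the server is not something this arithmetic can answer.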
Re: software suspend doesn't work with 2.6.22-rc3
On Sunday 27 May 2007, Rafael J. Wysocki wrote: On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote: On Sunday 27 May 2007, Rafael J. Wysocki wrote: On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote: On Saturday 26 May 2007, Nigel Cunningham wrote: Hi. On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote: On Saturday 26 May 2007, Nigel Cunningham wrote: Hi. On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote: Hello, When I try software suspend on my laptop it always returns to my running system after some time. This is what's logged by the kernel: swsusp: Basic memory bitmaps created Stopping tasks ... Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze): cryptd Restarting tasks ... done. swsusp: Basic memory bitmaps freed I have no idea what's the problem, but if you tell me what I should do I can create debugging information and/or test patches. Could you try this patch, please? It should help. Herbert, is this right? If cryptd is going to be used for block devs, the task should probably be PF_NOFREEZE (or whatever it is today) instead. Regards, Nigel crypto/cryptd.c |1 + include/linux/freezer.h |3 +++ kernel/power/process.c |2 +- 3 files changed, 5 insertions(+), 1 deletion(-) diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c --- 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0 +1000 +++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26 19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int cryptd_thread(void *data) mutex_unlock(state-mutex); + try_to_freeze(); schedule(); } while (!stop); I tried your patch, but when I apply it my kernel doesn't compile any more. I get these warnings/errors: [...] CC crypto/cryptd.o crypto/cryptd.c: In function ‘cryptd_thread’: crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’ [...] 
LD init/built-in.o LD .tmp_vmlinux1 crypto/built-in.o: In function `cryptd_thread': cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze' make: *** [.tmp_vmlinux1] Error 1 Ah. You'll need to add #include linux/freezer.h near that start of crypto/cryptd.c. Sorry for forgetting that. Nigel I added the include line and now I could compile the kernel, but suspending still doesn't work. swsusp: Basic memory bitmaps created Stopping tasks ... Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze): cryptd Restarting tasks ... done. swsusp: Basic memory bitmaps freed OK, this means that cryptd doesn't execute the try_to_freeze() for some reason. Please apply the appended patch on top of 2.6.22-rc3 and see if that helps. Greetings, Rafael --- crypto/cryptd.c |1 + 1 file changed, 1 insertion(+) Index: linux-2.6.22-rc3/crypto/cryptd.c === --- linux-2.6.22-rc3.orig/crypto/cryptd.c +++ linux-2.6.22-rc3/crypto/cryptd.c @@ -316,6 +316,7 @@ static int cryptd_thread(void *data) struct cryptd_state *state = data; int stop; + current-flags |= PF_NOFREEZE; do { struct crypto_async_request *req, *backlog; Even with this patch suspending doesn't work, dmesg shows the same error message. I also did build a kernel without cryptd and suspending does work there. Well, that's strange, because in that case the freezer shouldn't even wait for cryptd. Can you please try the patch at http://lkml.org/lkml/2007/5/26/169 ? With this patch applied suspend does work fine. Maxi signature.asc Description: This is a digitally signed message part.
Re: b44: regression in 2.6.22 (resend)
On Monday 28 May 2007, Michael Buesch wrote:
Ok, another question: On which CPU architecture are you?

[EMAIL PROTECTED]:~$ uname -m
i686

Maxi
Re: software suspend doesn't work with 2.6.22-rc3
On Saturday 26 May 2007, Nigel Cunningham wrote:
> Hi.
>
> On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > Hello,
> >
> > When I try software suspend on my laptop it always returns to my running
> > system after some time.
> > This is what's logged by the kernel:
> >
> > swsusp: Basic memory bitmaps created
> > Stopping tasks ...
> > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to
> > freeze):
> > cryptd
> > Restarting tasks ... done.
> > swsusp: Basic memory bitmaps freed
> >
> > I have no idea what's the problem, but if you tell me what I should do I
> > can create debugging information and/or test patches.
>
> Could you try this patch, please? It should help.
>
> Herbert, is this right? If cryptd is going to be used for block devs,
> the task should probably be PF_NOFREEZE (or whatever it is today)
> instead.
>
> Regards,
>
> Nigel
>
> crypto/cryptd.c |1 +
> include/linux/freezer.h |3 +++
> kernel/power/process.c |2 +-
> 3 files changed, 5 insertions(+), 1 deletion(-)
> diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c
> 991-fix-cryptd.patch-new/crypto/cryptd.c ---
> 991-fix-cryptd.patch-old/crypto/cryptd.c 2007-05-19 18:16:47.0
> +1000 +++ 991-fix-cryptd.patch-new/crypto/cryptd.c 2007-05-26
> 19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int cryptd_thread(void
> *data)
>
> mutex_unlock(&state->mutex);
>
> + try_to_freeze();
> schedule();
> } while (!stop);

I tried your patch, but when I apply it my kernel doesn't compile any more. I get these warnings/errors:

[...]
CC crypto/cryptd.o
crypto/cryptd.c: In function ‘cryptd_thread’:
crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
[...]
LD init/built-in.o
LD .tmp_vmlinux1
crypto/built-in.o: In function `cryptd_thread':
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
make: *** [.tmp_vmlinux1] Error 1

Maxi
Oops with prism54 in 2.6.22-rc3
Hello,

when using the prism54 driver included in the 2.6.22-rc3 kernel I get this Oops when putting the card into monitor mode:

BUG: unable to handle kernel NULL pointer dereference at virtual address 01d8
printing eip: c0500608 *pde =
Oops: 0002 [#1] PREEMPT
Modules linked in: fuse
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010046 (2.6.22-rc3 #2)
EIP is at netif_rx+0x48/0xc0
eax: ebx: c18fdbc0 ecx: c087991c edx: c0879910
esi: 0246 edi: f7c68010 ebp: f7fe0ba0 esp: c07bbef0
ds: 007b es: 007b fs: gs: ss: 0068
Process swapper (pid: 0, ti=c07ba000 task=c075a280 task.ti=c07ba000)
Stack: f7ec c03d2b8f c07bbf24 0082 f7c68024 f7fe0800 c18fdbc0 0070 0046 0286 0286 0008 0007 0032dcd5 f7fe0ba0 0002 f7fe0800 c03d913d f7f4d2c0
Call Trace:
[] islpci_eth_receive+0x12f/0x590
[] islpci_interrupt+0x1cd/0x280
[] handle_IRQ_event+0x25/0x50
[] handle_fasteoi_irq+0x5c/0xe0
[] do_IRQ+0x4a/0x80
[] common_interrupt+0x23/0x28
[] default_idle+0x2a/0x40
[] cpu_idle+0x43/0x80
[] start_kernel+0x21a/0x260
[] unknown_bootoption+0x0/0x260
===
Code: c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 9c 5e fa ff 05 bc 9c 87 c0 a1 0c 99 87 c0 3b 05 c0 4a 7b c0 77 30 85 c0 74 43 8b 43 14 80 d8 01 00 00 a1 08 99 87 c0 ff 05 0c 99 87 c0 c7 03 04 99
EIP: [] netif_rx+0x48/0xc0 SS:ESP 0068:c07bbef0
Kernel panic - not syncing: Fatal exception in interrupt

After this the system is frozen. With kernel 2.6.21 everything works fine: I can capture packets in monitor mode and do not get any Oops.

Maxi
Re: BUG in 2.6.22-rc2-mm1: NIC module b44.c broken (Broadcom 4400)
On Saturday 26 May 2007, Michael Buesch wrote: > On Friday 25 May 2007 21:40, Uwe Bugla wrote: > > Am Freitag, 25. Mai 2007 20:48 schrieben Sie: > > > On Fri, 25 May 2007 17:59:29 +0200, Uwe Bugla wrote: > > > > Perhaps someone reading this could try to reproduce that problem on > > > > his machine. > > > > Now who of the readers owes a Broadcom 4401 NIC and can please try to > > > > test kernel 2.6.22-rc2-mm1? > > > > > > > > Those NICs have been used very very often as onboard controllers, > > > > especially on ASUS boards. > > > > > > I've been using 2.6.22-rc2 for some time and now I compiled 2.6.22-rc2- > > > mm1 and both work fine with the BCM4401 in my laptop. > > > > > > Maxi > > > > Hello Maxi, > > > > That may be true for your Laptop, but it unfortunately isn't true for my > > ASUS mainboard onboard controller. > > > > Unfortunately I cannot confirm this: > > > > My broadcom 4401 driver is not part of a notebook, but instead part of an > > ASUS P4PE mainboard. > > > > At my second attempt I went the conventional path (i. e. ignoring the > > fact that > > "Broadcom 4400 ethernet support appears twice in section "Network device > > support": > > > > Whether you leave out "EISA, VLB, PCI and on board controllers" or not it > > simply appears twice in kernel config! This is bug number 1. > > No it is NOT a bug. > It simply shows again that you don't know how b44, ssb or anything related > works. > > Would you _please_ take a look at the code, before calling features bugs. > And yes, this IS a feature. It is a feature to get b44 running on an > OpenWRT embedded device. These devices don't have a PCI bus. So b44 MUST > NOT depend on "EISA, VLB, PCI and on board controllers". > "Broadcom 4400 PCI device support" does depend on "EISA, VLB, PCI and on > board controllers". > > Everything is correct. > Bug number 1 is solved. > qed > > > This time I do get a "good" interrupt: IRQ 21 for the the device. 
> > > > BUT: > > > > Trying to ping another machine fails saying: > > > > "destination host unreachable" > > > > > > That means, Although the interrupt is fine now, the device is still not > > functionable. > > And it's completely impossible that you did a mistake when configuring > the device? Typo in the IP? Typo in the gateway or DNS entries? > Try it again, please. > And please try with current wireless-dev tree. > > And I simply do not get it why you suddenly get a good IRQ number, like > everybody else does, without fixing The Bug (tm). I did run my 2.6.22-rc2-mm1 kernel a bit longer and noticed that I was wrong in my first mail. The driver does work with my 4401 and network traffic seem to get out and in fine, but it has huge performance problems. If I do some pings and traceroutes I sometimes get response times of only a few ms but I also get times of a few seconds. Also trying to play games is totally impossible. This doesn't happen with 2.6.22-rc2 and 2.6.22-rc3. Maxi signature.asc Description: This is a digitally signed message part.
software suspend doesn't work with 2.6.22-rc3
Hello, When I try software suspend on my laptop it always returns to my running system after some time. This is what's logged by the kernel: swsusp: Basic memory bitmaps created Stopping tasks ... Stopping kernel threads timed out after 20 seconds (1 tasks refusing to freeze): cryptd Restarting tasks ... done. swsusp: Basic memory bitmaps freed I have no idea what's the problem, but if you tell me what I should do I can create debugging information and/or test patches. I have my config attached, the kernel is 2.6.22-rc3 Maxi # # Automatically generated make config: don't edit # Linux kernel version: 2.6.22-rc3 # Sat May 26 10:07:12 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # Code maturity level options # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 # # General setup # CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y CONFIG_SYSVIPC=y # CONFIG_IPC_NS is not set CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y # CONFIG_TASK_XACCT is not set # CONFIG_UTS_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 # CONFIG_SYSFS_DEPRECATED is not set # CONFIG_RELAY is not set # CONFIG_BLK_DEV_INITRD is not set # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y # CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y 
CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 # # Loadable module support # CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y # CONFIG_MODVERSIONS is not set # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_KMOD=y # # Block layer # CONFIG_BLOCK=y # CONFIG_LBD is not set # CONFIG_BLK_DEV_IO_TRACE is not set # CONFIG_LSF is not set # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" # # Processor type and features # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # CONFIG_SMP is not set CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_PARAVIRT is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set CONFIG_MPENTIUMM=y # CONFIG_MCORE2 is not set # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # 
CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_X86_XADD=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_CMPXCHG64=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_MODEL=4 CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_PREEMPT_BKL=y # CONFIG_X86_UP_APIC is not set CONFIG_X86_MCE=y CONFIG_X86_MCE_NONFATAL=y CONFIG_VM86=y # CONFIG_TOSHIBA is not set # CONFIG_I8K is not set # CONFIG_X86_REBOOTFIXUPS is not set # CONFIG_MICROCODE is not set # CONFIG_X86_MSR is not set # CONFIG_X86_CPUID is not set # # Firmware Drivers # # CONFIG_EDD is not set # CONFIG_DELL_RBU is not set #
Re: BUG in 2.6.22-rc2-mm1: NIC module b44.c broken (Broadcom 4400)
On Saturday 26 May 2007, Michael Buesch wrote:
> On Friday 25 May 2007 21:40, Uwe Bugla wrote:
> > On Friday, 25 May 2007 20:48 you wrote:
> > > On Fri, 25 May 2007 17:59:29 +0200, Uwe Bugla wrote:
> > > > Perhaps someone reading this could try to reproduce that problem
> > > > on his machine.
> > > > Now who of the readers owes a Broadcom 4401 NIC and can please
> > > > try to test kernel 2.6.22-rc2-mm1?
> > > >
> > > > Those NICs have been used very very often as onboard controllers,
> > > > especially on ASUS boards.
> > >
> > > I've been using 2.6.22-rc2 for some time and now I compiled
> > > 2.6.22-rc2-mm1 and both work fine with the BCM4401 in my laptop.
> > >
> > > Maxi
> >
> > Hello Maxi,
> > That may be true for your Laptop, but it unfortunately isn't true for
> > my ASUS mainboard onboard controller.
> >
> > Unfortunately I cannot confirm this: My broadcom 4401 driver is not
> > part of a notebook, but instead part of an ASUS P4PE mainboard.
> >
> > At my second attempt I went the conventional path (i. e. ignoring the
> > fact that Broadcom 4400 ethernet support appears twice in section
> > Network device support: Whether you leave out EISA, VLB, PCI and on
> > board controllers or not it simply appears twice in kernel config!
> > This is bug number 1.
>
> No it is NOT a bug. It simply shows again that you don't know how
> b44, ssb or anything related works.
> Would you _please_ take a look at the code, before calling features bugs.
> And yes, this IS a feature. It is a feature to get b44 running on
> an OpenWRT embedded device. These devices don't have a PCI bus.
> So b44 MUST NOT depend on EISA, VLB, PCI and on board controllers.
> Broadcom 4400 PCI device support does depend on EISA, VLB, PCI and on
> board controllers.
> Everything is correct. Bug number 1 is solved. qed
>
> > This time I do get a good interrupt: IRQ 21 for the the device.
> > BUT: Trying to ping another machine fails saying: destination host
> > unreachable
> > That means, Although the interrupt is fine now, the device is still
> > not functionable.
>
> And it's completely impossible that you did a mistake when configuring
> the device? Typo in the IP? Typo in the gateway or DNS entries?
> Try it again, please. And please try with current wireless-dev tree.
>
> And I simply do not get it why you suddenly get a good IRQ number, like
> everybody else does, without fixing The Bug (tm).

I ran my 2.6.22-rc2-mm1 kernel a bit longer and noticed that I was wrong in
my first mail. The driver does work with my 4401, and network traffic seems
to get in and out fine, but it has huge performance problems. If I do some
pings and traceroutes, I sometimes get response times of only a few ms, but
I also get times of a few seconds. Trying to play games is totally
impossible. This doesn't happen with 2.6.22-rc2 and 2.6.22-rc3.

Maxi
Oops with prism54 in 2.6.22-rc3
Hello,

when using the prism54 driver included in the 2.6.22-rc3 kernel I get this
Oops when putting the card into monitor mode:

BUG: unable to handle kernel NULL pointer dereference at virtual address 01d8
 printing eip:
c0500608
*pde =
Oops: 0002 [#1]
PREEMPT
Modules linked in: fuse
CPU:0
EIP:0060:[c0500608]Not tainted VLI
EFLAGS: 00010046 (2.6.22-rc3 #2)
EIP is at netif_rx+0x48/0xc0
eax: ebx: c18fdbc0 ecx: c087991c edx: c0879910
esi: 0246 edi: f7c68010 ebp: f7fe0ba0 esp: c07bbef0
ds: 007b es: 007b fs: gs: ss: 0068
Process swapper (pid: 0, ti=c07ba000 task=c075a280 task.ti=c07ba000)
Stack: f7ec c03d2b8f c07bbf24 0082 f7c68024 f7fe0800 c18fdbc0 0070 0046 0286 0286 0008 0007 0032dcd5 f7fe0ba0 0002 f7fe0800 c03d913d f7f4d2c0
Call Trace:
 [c03d2b8f] islpci_eth_receive+0x12f/0x590
 [c03d913d] islpci_interrupt+0x1cd/0x280
 [c0144e15] handle_IRQ_event+0x25/0x50
 [c014669c] handle_fasteoi_irq+0x5c/0xe0
 [c010674a] do_IRQ+0x4a/0x80
 [c010498f] common_interrupt+0x23/0x28
 [c0102b3a] default_idle+0x2a/0x40
 [c01023e3] cpu_idle+0x43/0x80
 [c07bcb2a] start_kernel+0x21a/0x260
 [c07bc450] unknown_bootoption+0x0/0x260
 ===
Code: c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 9c 5e fa ff 05 bc 9c 87 c0 a1 0c 99 87 c0 3b 05 c0 4a 7b c0 77 30 85 c0 74 43 8b 43 14 ff 80 d8 01 00 00 a1 08 99 87 c0 ff 05 0c 99 87 c0 c7 03 04 99
EIP: [c0500608] netif_rx+0x48/0xc0 SS:ESP 0068:c07bbef0
Kernel panic - not syncing: Fatal exception in interrupt

After this the system is frozen.
Using kernel 2.6.21 everything works fine: I can capture packets in monitor
mode and do not get any Oops.

Maxi
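For anyone who wants to compare this trace against one from another kernel build, the call chain can be pulled out mechanically. The following is only a rough sketch in Python: the function name parse_call_trace and the Frame tuple are made up for illustration, and the pattern assumes the bracketed-address form that appears in this archived copy of the Oops.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    address: str   # return address as printed in the trace
    symbol: str    # kernel function name
    offset: int    # offset into the function, in bytes
    size: int      # total size of the function, in bytes


# Call trace copied from the Oops report above.
CALL_TRACE = """\
 [c03d2b8f] islpci_eth_receive+0x12f/0x590
 [c03d913d] islpci_interrupt+0x1cd/0x280
 [c0144e15] handle_IRQ_event+0x25/0x50
 [c014669c] handle_fasteoi_irq+0x5c/0xe0
 [c010674a] do_IRQ+0x4a/0x80
 [c010498f] common_interrupt+0x23/0x28
 [c0102b3a] default_idle+0x2a/0x40
 [c01023e3] cpu_idle+0x43/0x80
 [c07bcb2a] start_kernel+0x21a/0x260
 [c07bc450] unknown_bootoption+0x0/0x260
"""

# "[address] symbol+0xoffset/0xsize"
FRAME_RE = re.compile(r"\[([0-9a-f]+)\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_call_trace(text: str) -> List[Frame]:
    """Return the frames found in text, innermost caller first."""
    return [
        Frame(addr, sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(text)
    ]


if __name__ == "__main__":
    for frame in parse_call_trace(CALL_TRACE):
        print(f"{frame.symbol:24s} +{frame.offset:#x} of {frame.size:#x}")
```

The first two frames are the prism54 receive path (islpci_eth_receive called from islpci_interrupt); everything below that is generic interrupt and idle-loop code.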
Re: software suspend doesn't work with 2.6.22-rc3
On Saturday 26 May 2007, Nigel Cunningham wrote:
> Hi.
>
> On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > Hello,
> >
> > When I try software suspend on my laptop it always returns to my running
> > system after some time. This is what's logged by the kernel:
> >
> > swsusp: Basic memory bitmaps created
> > Stopping tasks ...
> > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to
> > freeze):
> >  cryptd
> > Restarting tasks ... done.
> > swsusp: Basic memory bitmaps freed
> >
> > I have no idea what's the problem, but if you tell me what I should do I
> > can create debugging information and/or test patches.
>
> Could you try this patch, please? It should help.
>
> Herbert, is this right? If cryptd is going to be used for block devs,
> the task should probably be PF_NOFREEZE (or whatever it is today)
> instead.
>
> Regards,
>
> Nigel
>
>  crypto/cryptd.c         |    1 +
>  include/linux/freezer.h |    3 +++
>  kernel/power/process.c  |    2 +-
>  3 files changed, 5 insertions(+), 1 deletion(-)
> diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> --- 991-fix-cryptd.patch-old/crypto/cryptd.c	2007-05-19 18:16:47.0 +1000
> +++ 991-fix-cryptd.patch-new/crypto/cryptd.c	2007-05-26 19:45:42.0 +1000
> @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
>
>  		mutex_unlock(&state->mutex);
>
> +		try_to_freeze();
>  		schedule();
>  	} while (!stop);

I tried your patch, but with it applied my kernel no longer compiles. I get
these warnings/errors:

[...]
  CC      crypto/cryptd.o
crypto/cryptd.c: In function ‘cryptd_thread’:
crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
[...]
  LD      init/built-in.o
  LD      .tmp_vmlinux1
crypto/built-in.o: In function `cryptd_thread':
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
make: *** [.tmp_vmlinux1] Error 1

Maxi
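The two messages in that build log look related: the compiler warns that try_to_freeze has no visible declaration in crypto/cryptd.c, and the linker then fails on the same symbol. One plausible reading, which is an assumption and not something confirmed in this thread, is that cryptd.c simply lacks the header that declares try_to_freeze() (the diffstat suggests include/linux/freezer.h). A small Python sketch for spotting such pairs in a longer build log follows; the helper name unresolved_implicit_symbols is invented for illustration.

```python
import re
from typing import List

# The two relevant lines from the build output quoted above.
BUILD_LOG = """\
crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
"""

# \W on each side absorbs whichever quote character the tool printed.
IMPLICIT_RE = re.compile(r"implicit declaration of function\s+\W(\w+)\W")
UNDEFINED_RE = re.compile(r"undefined reference to\s+\W(\w+)\W")


def unresolved_implicit_symbols(log: str) -> List[str]:
    """Symbols that were implicitly declared AND failed to link.

    A symbol in both sets usually means the compiler never saw the real
    declaration, so it emitted a call to an external function that does
    not exist as a linkable symbol.
    """
    implicit = set(IMPLICIT_RE.findall(log))
    undefined = set(UNDEFINED_RE.findall(log))
    return sorted(implicit & undefined)


if __name__ == "__main__":
    for symbol in unresolved_implicit_symbols(BUILD_LOG):
        print(f"{symbol}: implicit declaration followed by a link failure")
```

If the assumption holds, the warning is the real error here and the link failure is only its consequence.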
Re: BUG in 2.6.22-rc2-mm1: NIC module b44.c broken (Broadcom 4400)
On Fri, 25 May 2007 17:59:29 +0200, Uwe Bugla wrote:
>
> Perhaps someone reading this could try to reproduce that problem on his
> machine.
> Now who of the readers owes a Broadcom 4401 NIC and can please try to
> test kernel 2.6.22-rc2-mm1?
>
> Those NICs have been used very very often as onboard controllers,
> especially on ASUS boards.

I've been using 2.6.22-rc2 for some time and now I compiled 2.6.22-rc2-mm1,
and both work fine with the BCM4401 in my laptop.

Maxi
Re: Call for help: list of machines with working S3
On Fri, 2005-03-18 at 15:50 +0100, Romano Giannetti wrote:
>
> It happens exactly the same on my laptop, sony vaio whose configuration is
>
> http://www.dea.icai.upco.es/romano/linux/vaio-conf/laptop-config.html
>
> Next week is Easter holyday here, I will try to connect my Psion casio as
> serial terminal and see if I can catch something.

I was able to get some logs using CONFIG_LP_CONSOLE (the first time I ever
saw "Back to C!"):

Back to C!
PM: Finishing up.
ACPI: PCI interrupt :00:1f.1[A] -> GSI 10 (level,low) -> IRQ 10
MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.
Bank 1: e201
hda: task_out_intr: status=0x51 { DriveReady SeekComplete Error }
hda: task_out_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown

The last three messages keep repeating until I reboot.

Full log: http://home.daemonizer.de/resume.png
kernel version is 2.6.11
config: http://home.daemonizer.de/config-2.6.11-S3test
dmesg from booting: http://home.daemonizer.de/dmesg-2.6.11-S3test
lspci: http://home.daemonizer.de/lspci
Gentoo Base System version 1.6.10
Hardware: Acer Travelmate 661lci (centrino)
Intel(R) Pentium(R) M processor 1400MHz

Please mail me if you need additional data.

Thanks for help,
Maxi
Re: Call for help: list of machines with working S3
On Fri, 2005-03-18 at 15:50 +0100, Romano Giannetti wrote:
>
> It happens exactly the same on my laptop, sony vaio whose configuration is
>
> http://www.dea.icai.upco.es/romano/linux/vaio-conf/laptop-config.html
>
> Next week is Easter holyday here, I will try to connect my Psion casio as
> serial terminal and see if I can catch something.
>
>    Romano

Sorry that I didn't answer earlier, but I didn't have much time the last
week.

Unfortunately my laptop has a serial port only via a docking station that I
don't have, so I tried logging via netconsole. This generally worked, but
when I try to enter S3 the last thing I get is "PM: Entering state", and the
laptop never enters S3; it just hangs there forever. So sadly I couldn't get
more information.

If anyone has any idea what else I could do to either fix this problem or
get more information about it, please tell me and I'll try :)

Maxi
Re: Call for help: list of machines with working S3
On Mon, 2005-02-14 at 22:20 +0100, Pavel Machek wrote:
> Hi!
>
> Stefan provided me initial list of machines where S3 works (including
> video). If you have machine that is not on the list, please send me a
> diff. If you have eMachines... I'd like you to try playing with
> vbetool (it worked for me), and if it works for you supplying right
> model numbers.
>
> 								Pavel
>
>
> 		Video issues with S3 resume
> 		~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 		  2003-2005, Pavel Machek
>
> During S3 resume, hardware needs to be reinitialized. For most
> devices, this is easy, and kernel driver knows how to do
> it. Unfortunately there's one exception: video card. Those are usually
> initialized by BIOS, and kernel does not have enough information to
> boot video card. (Kernel usually does not even contain video card
> driver -- vesafb and vgacon are widely used).
>
> This is not problem for swsusp, because during swsusp resume, BIOS is
> run normally so video card is normally initialized. S3 has absolutely
> no change to work with SMP/HT. Be sure it to turn it off before
> testing (swsusp should work ok, OTOH).
>
> There are few types of systems where video works after S3 resume:
>
> (1) systems where video state is preserved over S3.
>
> (2) systems where it is possible to call video bios during S3
>   resume. Unfortunately, it is not correct to call video BIOS at that
>   point, but it happens to work on some machines. Use
>   acpi_sleep=s3_bios.
>
> (3) systems that initialize video card into vga text mode and where BIOS
>   works well enough to be able to set video mode. Use
>   acpi_sleep=s3_mode on these.
>
> (4) on some systems s3_bios kicks video into text mode, and
>   acpi_sleep=s3_bios,s3_mode is needed.
>
> (5) radeon systems, where X can soft-boot your video card. You'll need
>   patched X, and plain text console (no vesafb or radeonfb), see
>   http://www.doesi.gmxhome.de/linux/tm800s3/s3.html.
>
> (6) other radeon systems, where vbetool is enough to bring system back
>   to life. Do vbetool vbestate save > /tmp/delme; echo 3 > /proc/acpi/sleep;
>   vbetool post; vbetool vbestate restore < /tmp/delme; setfont
>   <whatever>, and your video should work.

I tried all of this on my laptop, but nothing seems to work for me.
I do "echo 3 > /proc/acpi/sleep" and the system seems to go into S3. When I
press a key to wake it up again, it powers up but I get nothing but a black
screen. It's not only the video card that isn't working, because the only
thing the machine reacts to is SysRq (without a screen, of course). One
additional thing I found is that in this state the HDD LED stays lit all the
time until I reboot the system. After rebooting I couldn't find anything
interesting in my logs.

Is there any way I could get S3 working on my laptop?

Some data:
Acer Travel Mate 661lci
Gentoo Base System version 1.6.10
kernel 2.6.11

I did all this testing with a minimal kernel that only had the absolutely
necessary drivers.

Thanks for help,
Maxi
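The vbetool sequence from item (6) only makes sense when run in exactly that order, so a tiny helper that spells the steps out may be handy when testing it repeatedly. The following Python sketch only builds and prints the command lines; it does not execute anything. The function name s3_video_workaround_steps and its parameters are hypothetical, and the font passed to setfont is left for the caller to supply because the quoted text does not name one.

```python
import shlex
from typing import List, Optional


def s3_video_workaround_steps(state_file: str = "/tmp/delme",
                              font: Optional[str] = None) -> List[str]:
    """Return the shell commands from item (6), in the order they must run.

    state_file: where the VBE state is saved before suspend.
    font: console font for setfont; the step is skipped when not given.
    """
    quoted = shlex.quote(state_file)
    steps = [
        f"vbetool vbestate save > {quoted}",     # save video state first
        "echo 3 > /proc/acpi/sleep",             # enter S3; resumes here
        "vbetool post",                          # re-run the video BIOS POST
        f"vbetool vbestate restore < {quoted}",  # put the saved state back
    ]
    if font is not None:
        steps.append(f"setfont {shlex.quote(font)}")
    return steps


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for step in s3_video_workaround_steps():
        print(step)
```

Running the printed commands needs root and the vbetool package; on a machine like the one described here they may well change nothing, since the hang does not appear to be limited to video.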