Re: b44: high ping times with wireless-dev

2007-06-17 Thread Maximilian Engelhardt
On Sunday 17 June 2007, Michael Buesch wrote:
> On Saturday 16 June 2007 23:27:43 Maximilian Engelhardt wrote:
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt :02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low)
> > -> IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device :02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
>
> Ok, I prepared two debugging patches.
>
> Please enable Sonics Silicon Backplane debugging in the kernel config,
> so I can get more detailed information about your card.
> Device Drivers/Sonics Silicon Backplane/SSB debugging
> (Must disable "No SSB kernel messages")
>
> Please apply and test the attached debugging patches one after the other.
> So apply patch 1 and test if it works again. If not, apply
> patch 2 and test if it works.
> Always save the complete dmesg log on each test run and send it to me.
>
> Thanks for testing.
> (This time it seems we are actually getting somewhere, when
> dealing with sane people. :D )

I ran the tests with my kernel in which the card is the only device on 
interrupt 10. dmesg is attached.
With the first patch applied, networking works again. I also tried patch 2, 
and it works as well.

Maxi
Linux version 2.6.22-rc4-wireless-dev-20070616-test1 ([EMAIL PROTECTED]) (gcc 
version 4.1.3 20070601 (prerelease) (Debian 4.1.2-12)) #6 PREEMPT Sun Jun 17 
13:24:13 CEST 2007
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009f800 (usable)
 BIOS-e820: 0009f800 - 000a (reserved)
 BIOS-e820: 000ce000 - 000d (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - 4dee (usable)
 BIOS-e820: 4dee - 4deec000 (ACPI data)
 BIOS-e820: 4deec000 - 4df0 (ACPI NVS)
 BIOS-e820: 4df0 - 5000 (reserved)
 BIOS-e820: fec1 - fec2 (reserved)
 BIOS-e820: ff80 - ffc0 (reserved)
 BIOS-e820: fc00 - 0001 (reserved)
350MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 319200) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 ->   319200
early_node_map[1] active PFN ranges
    0:        0 ->   319200
On node 0 totalpages: 319200
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 701 pages used for memmap
  HighMem zone: 89123 pages, LIFO batch:15
DMI present.
ACPI: RSDP 000F6050, 0014 (r0 ACER  )
ACPI: RSDT 4DEE5A39, 0030 (r1 ACER   Wagtail  20020114  LTP0)
ACPI: FACP 4DEEBF2C, 0074 (r1 ACER   Wagtail  20020114 PTL50)
ACPI: DSDT 4DEE5A69, 64C3 (r1 ACER   Wagtail  20020114 MSFT  10E)
ACPI: FACS 4DEFCFC0, 0040
ACPI: HPET 4DEEBFA0, 0038 (r1 ACER   Wagtail  20020114 PTL 0)
ACPI: BOOT 4DEEBFD8, 0028 (r1 ACER   Wagtail  20020114  LTP1)
ACPI: PM-Timer IO Port: 0x1008
ACPI: HPET id: 0x8086a201 base: 0x0
Allocating PCI resources starting at 6000 (gap: 5000:aec1)
Built 1 zonelists.  Total pages: 316707
Kernel command line: root=/dev/sda1 ro vga=0x31b resume=/dev/sda2
Local APIC disabled by BIOS -- you can enable it with "lapic"
mapped APIC to d000 (019c4000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 1395.565 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1259844k/1276800k available (3572k kernel code, 16168k reserved, 1152k 
data, 220k init, 359296k highmem)
virtual kernel memory layout:
fixmap  : 0xfffaa000 - 0xf000   ( 340 kB)
pkmap   : 0xff80 - 0xffc0   (4096 kB)
vmalloc : 0xf880 - 0xff7fe000   ( 111 MB)
lowmem  : 0xc000 - 0xf800   ( 896 MB)
  .init : 0xc05a - 0xc05d7000   ( 220 kB)
  .data : 0xc047d05d - 0xc059d0b0   (1152 kB)
  .text : 0xc010 - 0xc047d05d   (3572 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
SLUB: Genslabs=23, HWalign=64, Order=0-1, MinObjects=4, Processors=1, Nodes=1
Calibrating delay using timer specific routine.. 2793.34 BogoMIPS (lpj=4653358)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: a7e9f9bf    0180 
 
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 1024K
CPU: After all inits, caps: a7e9f9bf   2040 0180 
 
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to e000.
CPU: Intel(R) Pentium(R) M processor 1400MHz stepping 05
Checking

Re: b44: high ping times with wireless-dev

2007-06-16 Thread Maximilian Engelhardt
On Sunday 17 June 2007, Stephen Hemminger wrote:
> On Sat, 16 Jun 2007 23:27:43 +0200
>
> Maximilian Engelhardt <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I recently did some tests and found out something interesting about the
> > b44 problem I wrote about earlier.
> >
> > The problem is the following:
> > When I use my BCM4401 with the b44 driver in wireless-dev I get very high
> > ping times looking like this:
> >
> > 64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
> > 64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
> > 64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
> > 64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
> > 64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
> > 64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
> >
> > I also found out that shortly after I boot my laptop and log into kde
> > ping times are not that high but start to increase very quickly:
> >
> > 64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
> > 64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
> > 64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
> > 64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
> > 64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
> > 64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
> > 64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
> > 64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
> > 64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms
> >
> > After some time digging around I found out something really interesting.
> > When I play some music ping times are immediately lower. If I stop
> > playing music they are back to the same times as they were before.
> >
> > I guess that there is a problem with interrupts, so I am posting some
> > information about my system in the hope it will be useful.
> >
> > [EMAIL PROTECTED]:~$ cat /proc/interrupts
> >            CPU0
> >   0:     126317    XT-PIC-XT        timer
> >   1:       3600    XT-PIC-XT        i8042
> >   2:          0    XT-PIC-XT        cascade
> >   7:          1    XT-PIC-XT        parport0
> >   8:          1    XT-PIC-XT        rtc
> >   9:      17371    XT-PIC-XT        acpi
> >  10:      13237    XT-PIC-XT        firewire_ohci, yenta, yenta,
> > ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel
> > 82801DB-ICH4 Modem, eth0
> >  11:      89059    XT-PIC-XT        uhci_hcd:usb2, [EMAIL PROTECTED]::00:02.0
> >  12:        632    XT-PIC-XT        i8042
> >  14:      10354    XT-PIC-XT        libata
> >  15:       7408    XT-PIC-XT        libata
> > NMI:          0
> > ERR:          0
> >
> >
> > [...]
> > ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
> > ACPI: PCI Interrupt :02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low)
> > -> IRQ 10
> > ssb: Sonics Silicon Backplane found on PCI device :02:02.0
> > b44.c:v2.0
> > eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
> > [...]
> >
> > This problem only happens with wireless-dev (checked out this evening)
> > and with the -mm kernels I used some time ago for testing. I'm currently
> > running 2.6.22-rc4, which works perfectly fine and doesn't show the
> > problem.
> >
> > Maxi
>
> Can you build with APIC for uniprocessor?

I enabled CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC and tried booting with 
lapic and with apic=force, but couldn't get the APIC working.

>
> There is a lot of IRQ sharing, so
>  - one of the other devices may not be handling the shared IRQ properly.
>    Try unloading the firewire, modem and yenta devices.
>
>  - the IRQ might be set edge triggered, which doesn't work with NAPI
>    or shared IRQs.

I built a kernel without the three devices mentioned above, but the problem is 
still the same. I also removed everything except eth0 from interrupt 10, so 
that eth0 is the only device using that interrupt, and then the card stopped 
working completely.

Maxi


signature.asc
Description: This is a digitally signed message part.


b44: high ping times with wireless-dev

2007-06-16 Thread Maximilian Engelhardt
Hello,

I recently did some tests and found out something interesting about the b44 
problem I wrote about earlier. 

The problem is the following:
When I use my BCM4401 with the b44 driver in wireless-dev I get very high ping 
times looking like this:

64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
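
The alternation between roughly 1855 ms and 855 ms becomes easier to read when the round-trip times are converted into arrival times. The following is only a sketch: it assumes ping was run with its default send interval of one second (the command line is not shown in this message), and the names `PING_OUTPUT` and `arrival_times` are made up for the illustration.

```python
import re

# First ping sample from this message.
PING_OUTPUT = """\
64 bytes from 172.30.10.1: icmp_seq=1 ttl=64 time=1863 ms
64 bytes from 172.30.10.1: icmp_seq=2 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=3 ttl=64 time=1855 ms
64 bytes from 172.30.10.1: icmp_seq=4 ttl=64 time=855 ms
64 bytes from 172.30.10.1: icmp_seq=5 ttl=64 time=1854 ms
64 bytes from 172.30.10.1: icmp_seq=6 ttl=64 time=854 ms
64 bytes from 172.30.10.1: icmp_seq=7 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=8 ttl=64 time=851 ms
64 bytes from 172.30.10.1: icmp_seq=9 ttl=64 time=1851 ms
64 bytes from 172.30.10.1: icmp_seq=10 ttl=64 time=851 ms
"""

# Assumption: ping sent one request per second (its default interval).
SEND_INTERVAL_S = 1.0


def arrival_times(output, interval_s=SEND_INTERVAL_S):
    """Return (seq, seconds since the first request was sent) for each reply."""
    pattern = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")
    result = []
    for line in output.splitlines():
        match = pattern.search(line)
        if match:
            seq = int(match.group(1))
            rtt_s = float(match.group(2)) / 1000.0
            # Request number seq was sent (seq - 1) intervals after the first.
            result.append((seq, (seq - 1) * interval_s + rtt_s))
    return result


for seq, arrival in arrival_times(PING_OUTPUT):
    print(f"seq={seq:2d} reply seen at t={arrival:.3f} s")
```

Under that assumption, the replies to each odd/even pair of requests are seen at almost the same moment, about two seconds apart, which would fit received packets sitting in the card until something else triggers their processing. That reading is a guess from the numbers, not a confirmed diagnosis.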

I also found out that shortly after I boot my laptop and log into kde ping 
times are not that high but start to increase very quickly:

64 bytes from 172.30.10.1: icmp_seq=53 ttl=64 time=2.19 ms
64 bytes from 172.30.10.1: icmp_seq=54 ttl=64 time=2.22 ms
64 bytes from 172.30.10.1: icmp_seq=55 ttl=64 time=2.20 ms
64 bytes from 172.30.10.1: icmp_seq=56 ttl=64 time=2.20 ms
64 bytes from 172.30.10.1: icmp_seq=57 ttl=64 time=18.6 ms
64 bytes from 172.30.10.1: icmp_seq=58 ttl=64 time=1268 ms
64 bytes from 172.30.10.1: icmp_seq=59 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=60 ttl=64 time=1268 ms
64 bytes from 172.30.10.1: icmp_seq=61 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=62 ttl=64 time=6.08 ms
64 bytes from 172.30.10.1: icmp_seq=63 ttl=64 time=268 ms
64 bytes from 172.30.10.1: icmp_seq=64 ttl=64 time=1264 ms
64 bytes from 172.30.10.1: icmp_seq=65 ttl=64 time=264 ms

After some time digging around I found out something really interesting. When 
I play some music ping times are immediately lower. If I stop playing music 
they are back to the same times as they were before.

I guess that there is a problem with interrupts, so I am posting some 
information about my system in the hope it will be useful.

[EMAIL PROTECTED]:~$ cat /proc/interrupts
           CPU0
  0:     126317    XT-PIC-XT        timer
  1:       3600    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  7:          1    XT-PIC-XT        parport0
  8:          1    XT-PIC-XT        rtc
  9:      17371    XT-PIC-XT        acpi
 10:      13237    XT-PIC-XT        firewire_ohci, yenta, yenta, ehci_hcd:usb1, 
uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem, 
eth0
 11:      89059    XT-PIC-XT        uhci_hcd:usb2, [EMAIL PROTECTED]::00:02.0
 12:        632    XT-PIC-XT        i8042
 14:      10354    XT-PIC-XT        libata
 15:       7408    XT-PIC-XT        libata
NMI:          0
ERR:          0
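
To see at a glance which interrupt lines are shared, the listing above can be parsed with a few lines of Python. This is a rough sketch that assumes the single-CPU layout shown here (IRQ number, one count column, controller name, then a comma-separated handler list); `SAMPLE` and `shared_irqs` are names invented for the example, and the sample is an excerpt with the IRQ 10 line unwrapped.

```python
# Excerpt of the /proc/interrupts output above, IRQ 10 joined onto one line.
SAMPLE = """\
           CPU0
  0:     126317    XT-PIC-XT        timer
  1:       3600    XT-PIC-XT        i8042
  9:      17371    XT-PIC-XT        acpi
 10:      13237    XT-PIC-XT        firewire_ohci, yenta, yenta, ehci_hcd:usb1, uhci_hcd:usb3, uhci_hcd:usb4, Intel 82801DB-ICH4, Intel 82801DB-ICH4 Modem, eth0
 14:      10354    XT-PIC-XT        libata
NMI:          0
ERR:          0
"""


def shared_irqs(text):
    """Map IRQ number -> handler names for IRQs with more than one handler.

    Assumes one CPU column: "<irq>: <count> <controller> <handlers>".
    """
    shared = {}
    for line in text.splitlines():
        irq, sep, rest = line.partition(":")
        if not sep or not irq.strip().isdigit():
            continue  # header line or the NMI/ERR rows
        fields = rest.split(None, 2)
        if len(fields) < 3:
            continue  # no handler registered on this line
        handlers = [name.strip() for name in fields[2].split(",")]
        if len(handlers) > 1:
            shared[int(irq)] = handlers
    return shared


for irq, handlers in shared_irqs(SAMPLE).items():
    print(f"IRQ {irq} is shared by {len(handlers)} handlers: {', '.join(handlers)}")
```

On the excerpt, only IRQ 10 is reported, with eth0 as one of nine handlers on that line.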


[...]
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
ACPI: PCI Interrupt :02:02.0[A] -> Link [LNKD] -> GSI 10 (level, low) -> 
IRQ 10
ssb: Sonics Silicon Backplane found on PCI device :02:02.0
b44.c:v2.0
eth0: Broadcom 44xx/47xx 10/100BaseT Ethernet 00:c0:9f:29:99:a7
[...]

This problem only happens with wireless-dev (checked out this evening) and 
with the -mm kernels I used some time ago for testing. I'm currently running 
2.6.22-rc4, which works perfectly fine and doesn't show the problem.

Maxi




Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Maximilian Engelhardt
On Monday 04 June 2007, Stephen Hemminger wrote:
> On Mon, 4 Jun 2007 21:47:59 +0200
>
> Maximilian Engelhardt <[EMAIL PROTECTED]> wrote:
> > On Monday 04 June 2007, Ingo Molnar wrote:
> > > * Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > > > Yes, the following patch makes iperf work better than ever. But are
> > > > other broken applications going to have the same problem? Sounds like
> > > > the old "who runs first" fork() problems.
> > >
> > > this is the first such app and really, and even for this app: i've been
> > > frequently running iperf on -rt kernels for _years_ and never noticed
> > > how buggy its 'locking' code was, and that it would under some
> > > circumstances use up the whole CPU on high-res timers.
> >
> > I must admit I don't know much about that topic, but there is one thing I
> > don't understand. Why is iperf (even if it's buggy) able to use up the
> > whole cpu? I didn't run it as root but as my normal user so it should
> > have limited rights. Shouldn't the linux scheduler distribute cpu time
> > among all running processes?
>
> In this case, there are two threads. One is receiving data and the other
> is spinning checking on progress. If the spinning thread doesn't yield,
> it will end up using its whole quantum (10ms at 100hz), before the
> scheduler lets the receiver run again. If the receiving thread doesn't
> get to run then on a UP the performance stinks.
>
Ok, let's see if I got this right:
If there are other processes that want cpu time they will get it after the 
quantum for the iperf thread is used up. So cpu time will be distributed 
among other processes, but it takes some time until they get it and this 
increases latency.

> The problem only showed up on my laptop because most of my other systems are
> SMP (or fake SMP/HT), and usually set HZ to 1000 not 100.

Hm, on my laptop (Pentium M) I have configured CONFIG_HZ_300 and CONFIG_NO_HZ.
On my desktop PC (Athlon 2000+, also UP) I also have CONFIG_HZ_300 and 
CONFIG_NO_HZ but didn't notice the problem.
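
For reference, the "10ms at 100hz" figure quoted above is simply one timer tick. The small sketch below repeats that arithmetic for the HZ values mentioned in this exchange. Treating one tick as the time a spinning thread can hold the CPU is a simplification taken from that remark, and with CONFIG_NO_HZ or high-resolution timers the real behaviour may well differ; `tick_ms` is a name chosen for the example.

```python
def tick_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ setting."""
    return 1000.0 / hz


# HZ values mentioned in this thread: 100, 300 (CONFIG_HZ_300) and 1000.
for hz in (100, 300, 1000):
    print(f"HZ={hz:4d}: one tick = {tick_ms(hz):.2f} ms")
```

At HZ=300 a tick is about 3.33 ms, so by this rough measure a non-yielding thread would delay the receiver for about a third as long as at HZ=100.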

Maxi




Re: iperf: performance regression (was b44 driver problem?)

2007-06-04 Thread Maximilian Engelhardt
On Monday 04 June 2007, Ingo Molnar wrote:
> * Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > Yes, the following patch makes iperf work better than ever. But are
> > other broken applications going to have the same problem? Sounds like
> > the old "who runs first" fork() problems.
>
> this is the first such app and really, and even for this app: i've been
> frequently running iperf on -rt kernels for _years_ and never noticed
> how buggy its 'locking' code was, and that it would under some
> circumstances use up the whole CPU on high-res timers.

I must admit I don't know much about that topic, but there is one thing I 
don't understand. Why is iperf (even if it's buggy) able to use up the whole 
cpu? I didn't run it as root but as my normal user so it should have limited 
rights. Shouldn't the linux scheduler distribute cpu time among all running 
processes?

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-06-03 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?
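
The four boot-parameter results quoted above can be checked mechanically to see which option, taken alone, predicts every outcome. This is just an illustrative sketch; the `RESULTS` table restates the reported outcomes, and `explains` is a hypothetical helper name.

```python
# (highres enabled, nohz enabled) -> problem observed, as reported above.
RESULTS = {
    (True, True): True,     # no special boot parameters
    (False, False): False,  # highres=off nohz=off
    (False, True): False,   # highres=off
    (True, False): True,    # nohz=off
}


def explains(option_index):
    """True if the option at option_index alone predicts every outcome."""
    return all(config[option_index] == problem
               for config, problem in RESULTS.items())


print("highres alone explains the results:", explains(0))
print("nohz alone explains the results:", explains(1))
```

Only the highres setting lines up with all four runs, which points at the high-resolution timer code path rather than at NO_HZ.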

I didn't notice anything in the past (as I wrote). But today I did some tests 
for an updated version of the p54 mac80211 wlan driver and I noticed exactly 
the same problem:

When booting with highres=off, everything is fine.
But when I boot a highres-enabled kernel and run the iperf test with the p54 
driver, my system becomes unresponsive during the test. It seems to be 
exactly the same problem I have with the b44 driver.
So this might not be a bug in the b44 code but a bug somewhere in the Linux 
networking code.

I did the test with a 2.6.22-rc3-git4 kernel and the p54 driver built 
externally as a module. 

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > > Timer, but the high ping problem is still there.
> > >
> > > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > > "feature" in a different way than rc2-mm1 does.
> >
> > > I think the bug in 2.6.21/22-rc3 is a different one than the one in
> > > 2.6.22-rc2-mm1, but that's also only a wild guess :)
> > >
> > > I'll explain this a bit:
> > > 2.6.21/22-rc3 contain the same b44 driver that has been in the stock
> > > kernels for some time. With this driver and High Resolution Timer turned
> > > on I get problems using iperf: the system becomes
> > > really slow and unresponsive.  Michael Buesch thought this could be an
> > > IRQ storm, which sounds logical to me. This bug never happened to me
> > > before I started the iperf test.
>
> Can you please apply
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch
>
> on top of rc3 and check, whether it has any effect on your problem.
>
The patch didn't change anything.

> > The other issue happens only with 2.6.22-rc2-mm1, which includes the b44
> > ssb split. Regardless of whether High Resolution Timer is turned on or
> > off, I always get highly variable and high ping times. The iperf test
> > doesn't show the problems from 2.6.21/22-rc3.
>
> Neither with nor without highres ?

Yes, it doesn't matter if highres is turned on or off. iperf never showed the 
problem from 2.6.21/22-rc3.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Maximilian Engelhardt
On Tuesday 29 May 2007, Gary Zambrano wrote:
> On Mon, 2007-05-28 at 13:55 -0700, Maximilian Engelhardt wrote:
> > On Monday 28 May 2007, Thomas Gleixner wrote:
> > > On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try
> > > > > the following combinations on the kernel command line:
> > > > >
> > > > > 1) highres=off nohz=off (should be the same as your working config)
> > > > > 2) highres=off
> > > > > 3) nohz=off
> > > >
> > > > I tested this with my 2.6.22-rc3 kernel, here are the results:
> > > >
> > > > without any special boot parameters: problem does appear
> > > > highres=off nohz=off: problem does not appear
> > > > highres=off: problem does not appear
> > > > nohz=off: problem does appear
> > >
> > > Is there any other strange behavior of the high res enabled kernel than
> > > the b44 problem ?
> >
> > I didn't notice anything.
> >
> > > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > > Timer, but the high ping problem is still there.
> > >
> > > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > > "feature" in a different way than rc2-mm1 does.
> >
> > > I think the bug in 2.6.21/22-rc3 is a different one than the one in
> > > 2.6.22-rc2-mm1, but that's also only a wild guess :)
> > >
> > > I'll explain this a bit:
> > > 2.6.21/22-rc3 contain the same b44 driver that has been in the stock
> > > kernels for some time. With this driver and High Resolution Timer turned
> > > on I get problems using iperf: the system becomes
> > > really slow and unresponsive.  Michael Buesch thought this could be an
> > > IRQ storm, which sounds logical to me. This bug never happened to me
> > > before I started the iperf test.
>
> Can you please check to see if you notice anything out of the ordinary
> using netperf in place of iperf in your high res timer on/off testbed?

OK, here are the results. I also had a look at the kernel CPU usage.
'good' means that the kernel responsiveness during the test was as I would
expect and I didn't notice any problems.

highres enabled:

netperf: 80%sy 15%si (good)
iperf: not really measurable (bad, problem described above)

highres disabled:

netperf: 80%sy 15%si (good)
iperf:  5%sy 30%hi 15%si (good)


For these tests I ran the following commands:
netperf -l 60 192.168.1.1
iperf -c 192.168.1.1 -r -t 60

I also tried running iperf without any additional arguments (iperf -c
192.168.1.1) on the problematic kernel, but the result is the same as with the
command above.
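
The %sy/%hi/%si figures above are the kind that top reports. As a rough sketch only (this was not part of the original tests), the same breakdown can be derived from two samples of the aggregate "cpu" line in /proc/stat, taken before and after a test run. The function name cpu_breakdown is made up for illustration.

```shell
#!/bin/sh
# cpu_breakdown BEFORE AFTER   (hypothetical helper, not from the thread)
# BEFORE and AFTER are the aggregate "cpu" lines of /proc/stat sampled at
# the start and end of a test run. The fields after the "cpu" label are:
#   user nice system idle iowait irq softirq steal ...
# Prints the share of elapsed CPU time spent in system (sy), hard interrupt
# (hi) and soft interrupt (si) context.
cpu_breakdown() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) before[i] = $i }
        NR == 2 {
            # Sum user..steal only (fields 2-9); later fields (guest time)
            # are already included in user time.
            last = (NF < 9) ? NF : 9
            total = 0
            for (i = 2; i <= last; i++) {
                delta[i] = $i - before[i]
                total += delta[i]
            }
            if (total <= 0) { print "no CPU time elapsed"; exit 1 }
            printf "%.0f%%sy %.0f%%hi %.0f%%si\n",
                100 * delta[4] / total, 100 * delta[7] / total, 100 * delta[8] / total
        }'
}

# Possible usage around one of the test commands:
#   before=$(grep '^cpu ' /proc/stat)
#   iperf -c 192.168.1.1 -r -t 60
#   after=$(grep '^cpu ' /proc/stat)
#   cpu_breakdown "$before" "$after"
```

Sampling /proc/stat only gives an average over the whole run, so it may hide short bursts; it is mainly useful when the machine is too unresponsive to watch top interactively.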

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: software suspend doesn't work with 2.6.22-rc3

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Rafael J. Wysocki wrote:
> On Monday, 28 May 2007 09:59, Rafael J. Wysocki wrote:
> > On Monday, 28 May 2007 02:21, Maximilian Engelhardt wrote:
> > > On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> > > > On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote:
> > > > > On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> > > > > > On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
> > > > > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > > > > Hi.
> > > > > > > >
> > > > > > > > On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > > > > > > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > > > > > > Hi.
> > > > > > > > > >
> > > > > > > > > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > > > > > > > > Hello,
> > > > > > > > > > >
> > > > > > > > > > > When I try software suspend on my laptop it always
> > > > > > > > > > > returns to my running system after some time.
> > > > > > > > > > > This is what's logged by the kernel:
> > > > > > > > > > >
> > > > > > > > > > > swsusp: Basic memory bitmaps created
> > > > > > > > > > > Stopping tasks ...
> > > > > > > > > > > Stopping kernel threads timed out after 20 seconds (1
> > > > > > > > > > > tasks refusing to freeze):
> > > > > > > > > > >  cryptd
> > > > > > > > > > > Restarting tasks ... done.
> > > > > > > > > > > swsusp: Basic memory bitmaps freed
> > > > > > > > > > >
> > > > > > > > > > > I have no idea what's the problem, but if you tell me
> > > > > > > > > > > what I should do I can create debugging information
> > > > > > > > > > > and/or test patches.
> > > > > > > > > >
> > > > > > > > > > Could you try this patch, please? It should help.
> > > > > > > > > >
> > > > > > > > > > Herbert, is this right? If cryptd is going to be used for
> > > > > > > > > > block devs, the task should probably be PF_NOFREEZE (or
> > > > > > > > > > whatever it is today) instead.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > >
> > > > > > > > > > Nigel
> > > > > > > > > >
> > > > > > > > > >  crypto/cryptd.c |1 +
> > > > > > > > > >  include/linux/freezer.h |3 +++
> > > > > > > > > >  kernel/power/process.c  |2 +-
> > > > > > > > > >  3 files changed, 5 insertions(+), 1 deletion(-)
> > > > > > > > > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c
> > > > > > > > > > 991-fix-cryptd.patch-new/crypto/cryptd.c ---
> > > > > > > > > > 991-fix-cryptd.patch-old/crypto/cryptd.c2007-05-19
> > > > > > > > > > 18:16:47.0 +1000 +++
> > > > > > > > > > 991-fix-cryptd.patch-new/crypto/cryptd.c2007-05-26
> > > > > > > > > > 19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int
> > > > > > > > > > cryptd_thread(void *data)
> > > > > > > > > >
> > > > > > > > > > mutex_unlock(>mutex);
> > > > > > > > > >
> > > > > > > > > > +   try_to_freeze();
> > > > > > > > > > schedule();
> > > > > > > > > > } while (!stop);
> > > > > > > > >
> > > > > > > > > I tried your patch, but when I apply it my kernel doesn't
> > > > > > > > > compile any more. I get these warnings/errors:
> > > > > > > > >
> > > > > > 

Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?

I didn't notice anything.

>
> > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > Timer, but the high ping problem is still there.
>
> Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> "feature" in a different way than rc2-mm1 does.

I think the bug in 2.6.21/22-rc3 is a different one than the one in
2.6.22-rc2-mm1, but that's also only a wild guess :)

I'll explain this a bit:
2.6.21/22-rc3 contain the same b44 driver that has been in the stock kernels
for some time. With this driver and High Resolution Timer turned on I get
problems using iperf: the system becomes really slow
and unresponsive.  Michael Buesch thought this could be an IRQ storm, which
sounds logical to me. This bug never happened to me before I started the
iperf test.

The other issue happens only with 2.6.22-rc2-mm1, which includes the b44 ssb
split. Regardless of whether High Resolution Timer is turned on or off, I
always get highly variable and high ping times. The iperf test doesn't show
the problems from 2.6.21/22-rc3.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
> > > The -oldconfig1 is the kernel that had no problems and the other shows
> > > the b44 problem. So if High Resolution Timer Support is disabled
> > > everything works fine and if I enable it the problems do appear again.
> > >
> > > I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling
> > > High Resolution Timer Support will also solve the problem there.
> > >
> > > The older kernels I tried also work perfectly fine and they didn't have
> > > the High Resolution Timer Support yet.
> >
> > So, that's interesting, indeed.
> > Any idea what's going on, someone? Thomas?
>
> Not off the top of my head.
>
> Maximilian, does the kernel work otherwise (I mean aside of the b44
> driver) ?
>
> Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> following combinations on the kernel command line:
>
> 1) highres=off nohz=off (should be the same as your working config)
> 2) highres=off
> 3) nohz=off

I tested this with my 2.6.22-rc3 kernel, here are the results:

without any special boot parameters: problem does appear
highres=off nohz=off: problem does not appear
highres=off: problem does not appear
nohz=off: problem does appear

I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer, 
but the high ping problem is still there.
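
When cycling through several boot-parameter combinations like the ones above, it is easy to lose track of which one is actually running. A minimal sketch, assuming the overrides are given exactly as highres=off and nohz=off; the helper name timer_mode is hypothetical:

```shell
#!/bin/sh
# timer_mode CMDLINE   (hypothetical helper, not from the thread)
# Reports which of the two overrides from the test matrix are present in a
# kernel command line such as the content of /proc/cmdline.
# "on" only means "not disabled on the command line"; whether the feature is
# really active still depends on the kernel configuration.
# The body runs in a subshell so that set -f and the variables stay local.
timer_mode() (
    set -f                      # no globbing while splitting the arguments
    highres=on
    nohz=on
    for arg in $1; do
        case "$arg" in
            highres=off) highres=off ;;
            nohz=off)    nohz=off ;;
        esac
    done
    echo "highres=$highres nohz=$nohz"
)

# Possible usage on the running system:
#   timer_mode "$(cat /proc/cmdline)"
```

Going by the results listed above, only the runs that report highres=off would be expected to be free of the iperf problem.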

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Can you also test the following patch?
> I think there's a bug in b44 in that it doesn't properly discard
> shared IRQs, so it might possibly generate a NAPI storm, dunno.
> Worth a try.
>
> Index: linux-2.6.22-rc3/drivers/net/b44.c
> ===
> --- linux-2.6.22-rc3.orig/drivers/net/b44.c   2007-05-27 23:01:44.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-28 12:48:27.0 +0200
> @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
>   spin_lock(&bp->lock);
>
>   istat = br32(bp, B44_ISTAT);
> + if (istat == 0x)
> +         goto out; /* Shared IRQ not for us */
>   imask = br32(bp, B44_IMASK);
>
>   /* The interrupt mask register controls which interrupt bits
> @@ -942,6 +944,7 @@ irq_ack:
>   bw32(bp, B44_ISTAT, istat);
>   br32(bp, B44_ISTAT);
>   }
> +out:
>   spin_unlock(&bp->lock);
>   return IRQ_RETVAL(handled);
>  }

I tried this patch on an affected kernel, but I didn't notice any big
difference. Perhaps the kernel is a bit less slow during the test, but it's
hard to tell.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Can you give 2.6.16 a try? The diff is not that big and we might
> be able to find out what broke if you find out 2.6.16 works.
> You can also try later kernels like .17, .18, .19 to further
> reduce the patch. (You could also git-bisect, if you have the time).
>
I did some testing and compiled some kernels, and here are the results:

I was able to find out what causes the problems for me.  I built two
2.6.21.3 kernels; one works fine and the other doesn't.

This is a diff of the kernel configs I used:

--- /usr/src/linux-2.6.21.3-oldconfig1/.config  2007-05-28 13:41:15.0 +0200
+++ /usr/src/linux-2.6.21.3/.config 2007-05-28 14:46:08.0 +0200
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.21.3
-# Mon May 28 13:41:15 2007
+# Mon May 28 14:46:09 2007
 #
 CONFIG_X86_32=y
 CONFIG_GENERIC_TIME=y
@@ -32,7 +32,7 @@
 #
 # General setup
 #
-CONFIG_LOCALVERSION="-oldconfig1"
+CONFIG_LOCALVERSION=""
 CONFIG_LOCALVERSION_AUTO=y
 CONFIG_SWAP=y
 CONFIG_SYSVIPC=y
@@ -108,9 +108,9 @@
 #
 # Processor type and features
 #
-# CONFIG_TICK_ONESHOT is not set
+CONFIG_TICK_ONESHOT=y
 # CONFIG_NO_HZ is not set
-# CONFIG_HIGH_RES_TIMERS is not set
+CONFIG_HIGH_RES_TIMERS=y
 # CONFIG_SMP is not set
 CONFIG_X86_PC=y
 # CONFIG_X86_ELAN is not set

The -oldconfig1 kernel is the one that had no problems; the other shows the
b44 problem. So if High Resolution Timer Support is disabled, everything works
fine, and if I enable it, the problems appear again.

I haven't tested this on my 2.6.22-rc3 kernel yet, but I guess disabling High
Resolution Timer Support will also solve the problem there.

The older kernels I tried also work perfectly fine, and they didn't have
High Resolution Timer Support yet.
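
For comparing further kernel builds, the three timer options from the diff above can be pulled out of two .config files directly. This is only an illustrative sketch; the function names config_state and timer_config_diff are invented here and the option list is simply the one from the diff.

```shell
#!/bin/sh
# config_state FILE OPTION   (hypothetical helper, not from the thread)
# Prints the value assigned to OPTION in a kernel .config (for example "y"),
# or "unset" when the option only appears as "# OPTION is not set" or is
# missing altogether.
config_state() {
    value=$(sed -n "s/^$2=//p" "$1")
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "unset"
    fi
}

# timer_config_diff OLD NEW
# Lists the timer-related options whose state differs between two .config
# files, one line per option.
timer_config_diff() {
    for opt in CONFIG_TICK_ONESHOT CONFIG_NO_HZ CONFIG_HIGH_RES_TIMERS; do
        old=$(config_state "$1" "$opt")
        new=$(config_state "$2" "$opt")
        [ "$old" = "$new" ] || echo "$opt: $old -> $new"
    done
}

# Possible usage with the two trees from the diff above:
#   timer_config_diff /usr/src/linux-2.6.21.3-oldconfig1/.config \
#                     /usr/src/linux-2.6.21.3/.config
```

This only narrows the comparison to the options already suspected; a full diff of the two files is still needed to rule out other differences.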

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: Oops with prism54 in 2.6.22-rc3

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Björn Steinbrink wrote:
> On 2007.05.26 14:42:30 +0200, Maximilian Engelhardt wrote:
> > Hello,
> >
> > when using the prism54 driver included in the 2.6.22-rc3 kernel I get
> > this Oops when putting the card into monitor mode:
> >
> > BUG: unable to handle kernel NULL pointer dereference at virtual address
> > 01d8
> >  printing eip:
> > c0500608
> > *pde = 
> > Oops: 0002 [#1]
> > PREEMPT
> > Modules linked in: fuse
> > CPU:0
> > EIP:    0060:[<c0500608>]    Not tainted VLI
> > EFLAGS: 00010046   (2.6.22-rc3 #2)
> > EIP is at netif_rx+0x48/0xc0
> > eax:    ebx: c18fdbc0   ecx: c087991c   edx: c0879910
> > esi: 0246   edi: f7c68010   ebp: f7fe0ba0   esp: c07bbef0
> > ds: 007b   es: 007b   fs:   gs:   ss: 0068
> > Process swapper (pid: 0, ti=c07ba000 task=c075a280 task.ti=c07ba000)
> > Stack: f7ec  c03d2b8f c07bbf24 0082 f7c68024 f7fe0800
> > c18fdbc0 0070 0046 0286 0286 0008 0007 0032dcd5
> >  f7fe0ba0 0002 f7fe0800 c03d913d   f7f4d2c0
> >  Call Trace:
> >  [<c03d2b8f>] islpci_eth_receive+0x12f/0x590
> >  [<c03d913d>] islpci_interrupt+0x1cd/0x280
> >  [<c0144e15>] handle_IRQ_event+0x25/0x50
> >  [<c014669c>] handle_fasteoi_irq+0x5c/0xe0
> >  [<c010674a>] do_IRQ+0x4a/0x80
> >  [<c010498f>] common_interrupt+0x23/0x28
> >  [<c0102b3a>] default_idle+0x2a/0x40
> >  [<c01023e3>] cpu_idle+0x43/0x80
> >  [<c07bcb2a>] start_kernel+0x21a/0x260
> >  [<c07bc450>] unknown_bootoption+0x0/0x260
> >  ===
> > Code: c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 9c 5e fa ff 05 bc 9c 87
> > c0 a1 0c 99 87 c0 3b 05 c0 4a 7b c0 77 30 85 c0 74 43 8b 43 14 <ff> 80 d8
> > 01 00 00 a1 08 99 87 c0 ff 05 0c 99 87 c0 c7 03 04 99
> > EIP: [<c0500608>] netif_rx+0x48/0xc0 SS:ESP 0068:c07bbef0
> > Kernel panic - not syncing: Fatal exception in interrupt
> >
> > After this the system is frozen. Using kernel 2.6.21 everything works
> > fine, I can capture packets in monitor mode and do not get any Oops.
>
> That's probably due to commit 4c13eb6657fe9ef7b4dc8f1a405c902e9e5234e0,
> which moved the setting of skb->dev into eth_type_trans, which is never
> called when the card is in monitor mode.
>
> Could you try this patch?
>
>
> Manually set the device of a skb for prism54 cards that are in monitor
> mode as we never call eth_type_trans in that case.
>
> Signed-off-by: Björn Steinbrink <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/net/wireless/prism54/islpci_eth.c b/drivers/net/wireless/prism54/islpci_eth.c
> index dd070cc..f49eb06 100644
> --- a/drivers/net/wireless/prism54/islpci_eth.c
> +++ b/drivers/net/wireless/prism54/islpci_eth.c
> @@ -378,9 +378,10 @@ islpci_eth_receive(islpci_private *priv)
>               display_buffer((char *) skb->data, skb->len);
>  #endif
>               /* take care of monitor mode and spy monitoring. */
> -             if (unlikely(priv->iw_mode == IW_MODE_MONITOR))
> +             if (unlikely(priv->iw_mode == IW_MODE_MONITOR)) {
> +                     skb->dev = ndev;
>                       discard = islpci_monitor_rx(priv, &skb);
> -             else {
> +             } else {
>                       if (unlikely(skb->data[2 * ETH_ALEN] == 0)) {
>                               /* The packet has a rx_annex. Read it for spy monitoring, Then
>                                * remove it, while keeping the 2 leading MAC addr.

With this patch, monitor mode works fine.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
 Can you give 2.6.16 a try? The diff is not that big and we might
 be able to find out what broke if you find out 2.6.16 works.
 You can also try later kernels like .17, .18, .19 to further
 reduce the patch. (You could also git-bisect, if you have the time).

I did some testing and compiled some kernels and here are the results:

I was able to find out what causes the problems for me.  I did build two 
2.6.21.3 kernels, and one does work fine and the other doesn't.

This is a diff of the kernel configs I used:

--- /usr/src/linux-2.6.21.3-oldconfig1/.config  2007-05-28 13:41:15.0 
+0200
+++ /usr/src/linux-2.6.21.3/.config 2007-05-28 14:46:08.0 +0200
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.21.3
-# Mon May 28 13:41:15 2007
+# Mon May 28 14:46:09 2007
 #
 CONFIG_X86_32=y
 CONFIG_GENERIC_TIME=y
@@ -32,7 +32,7 @@
 #
 # General setup
 #
-CONFIG_LOCALVERSION=-oldconfig1
+CONFIG_LOCALVERSION=
 CONFIG_LOCALVERSION_AUTO=y
 CONFIG_SWAP=y
 CONFIG_SYSVIPC=y
@@ -108,9 +108,9 @@
 #
 # Processor type and features
 #
-# CONFIG_TICK_ONESHOT is not set
+CONFIG_TICK_ONESHOT=y
 # CONFIG_NO_HZ is not set
-# CONFIG_HIGH_RES_TIMERS is not set
+CONFIG_HIGH_RES_TIMERS=y
 # CONFIG_SMP is not set
 CONFIG_X86_PC=y
 # CONFIG_X86_ELAN is not set

The -oldconfig1 is the kernel that had no problems and the other shows the b44 
problem. So if High Resolution Timer Support is disabled everything works 
fine and if I enable it the problems do appear again.

I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling High 
Resolution Timer Support will also solve the problem there.

The older kernels I tried also work perfectly fine and they didn't have the 
High Resolution Timer Support yet.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
 Can you also test the following patch?
 I think there's a bug in b44 that is doesn't properly discard
 shared IRQs, so it might possibly generate a NAPI storm, dunno.
 Worth a try.

 Index: linux-2.6.22-rc3/drivers/net/b44.c
 ===
 --- linux-2.6.22-rc3.orig/drivers/net/b44.c   2007-05-27 23:01:44.0
 +0200 +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-28 12:48:27.0
 +0200 @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
   spin_lock(bp-lock);

   istat = br32(bp, B44_ISTAT);
 + if (istat == 0x)
 + goto out; /* Shared IRQ not for us */
   imask = br32(bp, B44_IMASK);

   /* The interrupt mask register controls which interrupt bits
 @@ -942,6 +944,7 @@ irq_ack:
   bw32(bp, B44_ISTAT, istat);
   br32(bp, B44_ISTAT);
   }
 +out:
   spin_unlock(bp-lock);
   return IRQ_RETVAL(handled);
  }

I did try this patch on a affected kernel, but I didn't notice any big 
difference. Perhaps the kernel is a bit less slow during the test, but It's 
hard to tell.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
 On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
   The -oldconfig1 is the kernel that had no problems and the other shows
   the b44 problem. So if High Resolution Timer Support is disabled
   everything works fine and if I enable it the problems do appear again.
  
   I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling
   High Resolution Timer Support will also solve the problem there.
  
   The older kernels I tried also work perfectly fine and they didn't have
   the High Resolution Timer Support yet.
 
  So, that's interesting, indeed.
  Any idea what's going on, someone? Thomas?

 Not off the top of my head.

 Maximilian, does the kernel work otherwise (I mean aside of the b44
 driver) ?

 Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
 following combinations on the kernel command line:

 1) highres=off nohz=off (should be the same as your working config)
 2) highres=off
 3) nohz=off

I tested this with my 2.6.22-rc3 kernel, here are the results:

without any special boot parameters: problem does appear
highres=off nohz=off: problem does not appear
highres=off: problem does not appear
nohz=off: problem does appear

I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer, 
but the high ping problem is still there.

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
 On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
   Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
   following combinations on the kernel command line:
  
   1) highres=off nohz=off (should be the same as your working config)
   2) highres=off
   3) nohz=off
 
  I tested this with my 2.6.22-rc3 kernel, here are the results:
 
  without any special boot parameters: problem does appear
  highres=off nohz=off: problem does not appear
  highres=off: problem does not appear
  nohz=off: problem does appear

 Is there any other strange behavior of the high res enabled kernel than
 the b44 problem ?

I didn't notice anything.


  I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
  Timer, but the high ping problem is still there.

 Hmm, that's mysterious. Wild guess is that highres exposes the hidden
 feature in a different way than rc2-mm1 does.

I think the bug in 2.6.21/22-rc3 is a different one than the one in
2.6.22-rc2-mm1, but that's also only a wild guess :)

I'll explain this a bit:
2.6.21/22-rc3 contain the same b44 driver that has been in the stock kernels
for some time. With this driver and High Resolution Timer turned on I get
problems using iperf: the system becomes really slow and unresponsive.
Michael Buesch thought this could be an IRQ storm, which sounds logical to
me. This bug never happened to me before I started the iperf test.

The other issue happens only with 2.6.22-rc2-mm1, which includes the b44 ssb
split. Regardless of whether High Resolution Timer is turned on or off, I
always get highly varying and high ping times. The iperf test doesn't show
the problems from 2.6.21/22-rc3.

Maxi




Re: software suspend doesn't work with 2.6.22-rc3

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Rafael J. Wysocki wrote:
 On Monday, 28 May 2007 09:59, Rafael J. Wysocki wrote:
  On Monday, 28 May 2007 02:21, Maximilian Engelhardt wrote:
   On Sunday 27 May 2007, Rafael J. Wysocki wrote:
On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote:
 On Sunday 27 May 2007, Rafael J. Wysocki wrote:
  On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
   On Saturday 26 May 2007, Nigel Cunningham wrote:
Hi.
   
On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt 
wrote:
 On Saturday 26 May 2007, Nigel Cunningham wrote:
  Hi.
 
  On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt 
wrote:
   Hello,
  
   When I try software suspend on my laptop it always
   returns to my running system after some time.
   This is what's logged by the kernel:
  
   swsusp: Basic memory bitmaps created
   Stopping tasks ...
   Stopping kernel threads timed out after 20 seconds (1
   tasks refusing to freeze):
cryptd
   Restarting tasks ... done.
   swsusp: Basic memory bitmaps freed
  
   I have no idea what's the problem, but if you tell me
   what I should do I can create debugging information
   and/or test patches.
 
  Could you try this patch, please? It should help.
 
  Herbert, is this right? If cryptd is going to be used for
  block devs, the task should probably be PF_NOFREEZE (or
  whatever it is today) instead.
 
  Regards,
 
  Nigel
 
   crypto/cryptd.c |1 +
   include/linux/freezer.h |3 +++
   kernel/power/process.c  |2 +-
   3 files changed, 5 insertions(+), 1 deletion(-)
  diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c
  991-fix-cryptd.patch-new/crypto/cryptd.c ---
  991-fix-cryptd.patch-old/crypto/cryptd.c2007-05-19
  18:16:47.0 +1000 +++
  991-fix-cryptd.patch-new/crypto/cryptd.c2007-05-26
  19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int
  cryptd_thread(void *data)
 
  mutex_unlock(&state->mutex);
 
  +   try_to_freeze();
  schedule();
  } while (!stop);

 I tried your patch, but when I apply it my kernel doesn't
 compile any more. I get these warnings/errors:

 [...]
   CC  crypto/cryptd.o
 crypto/cryptd.c: In function ‘cryptd_thread’:
 crypto/cryptd.c:344: warning: implicit declaration of
 function ‘try_to_freeze’ [...]
   LD  init/built-in.o
   LD  .tmp_vmlinux1
 crypto/built-in.o: In function `cryptd_thread':
 cryptd.c:(.text+0xd7f5): undefined reference to
 `try_to_freeze' make: *** [.tmp_vmlinux1] Error 1
   
Ah. You'll need to add #include <linux/freezer.h> near the
start of crypto/cryptd.c. Sorry for forgetting that.
   
Nigel
  
   I added the include line and now I could compile the kernel,
   but suspending still doesn't work.
  
   swsusp: Basic memory bitmaps created
   Stopping tasks ...
   Stopping kernel threads timed out after 20 seconds (1 tasks
   refusing to freeze):
   cryptd
   Restarting tasks ... done.
   swsusp: Basic memory bitmaps freed
 
  OK, this means that cryptd doesn't execute the try_to_freeze()
  for some reason.
 
  Please apply the appended patch on top of 2.6.22-rc3 and see if
  that helps.
 
  Greetings,
  Rafael
 
  ---
   crypto/cryptd.c |1 +
   1 file changed, 1 insertion(+)
 
  Index: linux-2.6.22-rc3/crypto/cryptd.c
  =
 == --- linux-2.6.22-rc3.orig/crypto/cryptd.c
  +++ linux-2.6.22-rc3/crypto/cryptd.c
  @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
  struct cryptd_state *state = data;
  int stop;
 
  +   current->flags |= PF_NOFREEZE;
  do {
  struct crypto_async_request *req, *backlog;

 Even with this patch suspending doesn't work, dmesg shows the same
 error message.
 I also did build a kernel without cryptd and suspending does work
 there.
   
Well, that's strange, because in that case the freezer shouldn't even
wait for cryptd.
   
Can you please try the patch at http://lkml.org/lkml/2007/5/26/169 ?
  
   With this patch applied suspend does work fine.
 
  Hmm.  IMO the patch is too intrusive for 2.6.22, but OTOH it's going into
  the direction preferred by some prominent people. ;-)
 
  Let's try to combine the two threads and see what results from that.

 Well, it looks like we have

Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Ok, another question: On which CPU architecture are you?

[EMAIL PROTECTED]:~$ uname -m
i686

Maxi




Re: software suspend doesn't work with 2.6.22-rc3

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote:
> > On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> > > On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
> > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > Hi.
> > > > >
> > > > > On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > > > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > > > Hi.
> > > > > > >
> > > > > > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > When I try software suspend on my laptop it always returns to
> > > > > > > > my running system after some time.
> > > > > > > > This is what's logged by the kernel:
> > > > > > > >
> > > > > > > > swsusp: Basic memory bitmaps created
> > > > > > > > Stopping tasks ...
> > > > > > > > Stopping kernel threads timed out after 20 seconds (1 tasks
> > > > > > > > refusing to freeze):
> > > > > > > >  cryptd
> > > > > > > > Restarting tasks ... done.
> > > > > > > > swsusp: Basic memory bitmaps freed
> > > > > > > >
> > > > > > > > I have no idea what's the problem, but if you tell me what I
> > > > > > > > should do I can create debugging information and/or test
> > > > > > > > patches.
> > > > > > >
> > > > > > > Could you try this patch, please? It should help.
> > > > > > >
> > > > > > > Herbert, is this right? If cryptd is going to be used for block
> > > > > > > devs, the task should probably be PF_NOFREEZE (or whatever it
> > > > > > > is today) instead.
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Nigel
> > > > > > >
> > > > > > >  crypto/cryptd.c |1 +
> > > > > > >  include/linux/freezer.h |3 +++
> > > > > > >  kernel/power/process.c  |2 +-
> > > > > > >  3 files changed, 5 insertions(+), 1 deletion(-)
> > > > > > > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> > > > > > > > --- 991-fix-cryptd.patch-old/crypto/cryptd.c  2007-05-19 18:16:47.0 +1000
> > > > > > > > +++ 991-fix-cryptd.patch-new/crypto/cryptd.c  2007-05-26 19:45:42.0 +1000
> > > > > > > > @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
> > > > > > > >
> > > > > > > >   mutex_unlock(&state->mutex);
> > > > > > > >
> > > > > > > > + try_to_freeze();
> > > > > > > >   schedule();
> > > > > > > >   } while (!stop);
> > > > > >
> > > > > > I tried your patch, but when I apply it my kernel doesn't compile
> > > > > > any more. I get these warnings/errors:
> > > > > >
> > > > > > [...]
> > > > > >   CC  crypto/cryptd.o
> > > > > > crypto/cryptd.c: In function ‘cryptd_thread’:
> > > > > > crypto/cryptd.c:344: warning: implicit declaration of function
> > > > > > ‘try_to_freeze’ [...]
> > > > > >   LD  init/built-in.o
> > > > > >   LD  .tmp_vmlinux1
> > > > > > crypto/built-in.o: In function `cryptd_thread':
> > > > > > cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
> > > > > > make: *** [.tmp_vmlinux1] Error 1
> > > > >
> > > > > > Ah. You'll need to add #include <linux/freezer.h> near the start
> > > > > > of crypto/cryptd.c. Sorry for forgetting that.
> > > > >
> > > > > Nigel
> > > >
> > > > I added the include line and now I could compile the kernel, but
> > > > suspending still doesn't work.
> > > >
> > > > swsusp: Basic memory bitmaps created
> > > > Stopping tasks ...
> > > > Stopping kernel threads timed out after 20 seconds (1 tasks refusing
> > > > to freeze):
> > > > cryptd
> > > > Restarting tasks ... done.
> > > > swsusp: Basic memory bitmaps freed
> > >
> > > OK, this means that cryptd doesn't execute the try_to_freeze() for some
> > > reason.
> > >
> > > Please apply the appended patch on top of 2.6.22-rc3 and see if that
> > > helps.
> > >
> > > Greetings,
> > > Rafael
> > >
> > > ---
> > >  crypto/cryptd.c |1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > Index: linux-2.6.22-rc3/crypto/cryptd.c
> > > ===
> > > --- linux-2.6.22-rc3.orig/crypto/cryptd.c
> > > +++ linux-2.6.22-rc3/crypto/cryptd.c
> > > @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
> > >   struct cryptd_state *state = data;
> > >   int stop;
> > >
> > > + current->flags |= PF_NOFREEZE;
> > >   do {
> > >   struct crypto_async_request *req, *backlog;
> >
> > Even with this patch suspending doesn't work, dmesg shows the same error
> > message.
> > I also did build a kernel without cryptd and suspending does work there.
>
> Well, that's strange, because in that case the freezer shouldn't even wait
> for cryptd.
>
> Can you please try the patch at http://lkml.org/lkml/2007/5/26/169 ?

With this patch applied suspend does work fine.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> >
> > 2.6.22-rc3:
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
>
> Can you re-check to make sure it's not just some test-jitter?
>
2.6.21.1:

[  5] local 192.168.1.2 port 54423 connected with 192.168.1.1 port 5001
[  5]  0.0-60.3 sec  3.06 MBytes   426 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 41053
[  4]  0.0-163.0 sec   130 MBytes  6.67 Mbits/sec


2.6.22-rc3:

[  5] local 192.168.1.2 port 46002 connected with 192.168.1.1 port 5001
[  5]  0.0-61.5 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 44379
[  4]  0.0-93.8 sec  30.6 MBytes  2.74 Mbits/sec

For TX the iperf server reports the same values as the client (all values
above are from the client), but for RX they are different:

2.6.21.1: (iperf server log):

[  5] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 54423
[  5]  0.0-60.5 sec  3.06 MBytes   425 Kbits/sec
[  5] local 192.168.1.1 port 41053 connected with 192.168.1.2 port 5001
[  5]  0.0-63.1 sec   130 MBytes  17.2 Mbits/sec


2.6.22-rc3 (iperf server log):

[  4] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 46002
[  4]  0.0-61.6 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.1 port 44379 connected with 192.168.1.2 port 5001
[  4]  0.0-63.3 sec  30.6 MBytes  4.06 Mbits/sec

I have no idea how iperf internally works and what can cause such different 
results here.
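
One possible reading of the RX numbers, offered only as a sketch: both sides report the same 130 MBytes, but over different intervals (163.0 s on the client, 63.1 s on the server), and the rate is simply bytes divided by time. The unit conventions below (MBytes as 2**20 bytes, Mbits as 10**6 bits) are an assumption about how iperf prints its figures; they do reproduce the TX line (84.0 MBytes in 61.5 s giving about 11.5 Mbits/sec).

```python
def mbits_per_sec(mbytes, seconds):
    """Convert a transfer size and interval into a rate.

    Assumes 1 MByte = 2**20 bytes and 1 Mbit = 10**6 bits.
    """
    bits = mbytes * 2**20 * 8
    return bits / seconds / 1e6


# Same 130 MBytes, two different measured intervals (2.6.21.1, RX direction).
client = mbits_per_sec(130, 163.0)
server = mbits_per_sec(130, 63.1)

print(f"client view: {client:.2f} Mbits/sec")
print(f"server view: {server:.2f} Mbits/sec")
```

Under these assumptions the two rates come out near 6.7 and 17.3 Mbits/sec, close to the reported 6.67 and 17.2, so the difference may be down to the two ends disagreeing about how long the transfer took rather than about how much data was moved.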

>
> --- linux-2.6.21.1/drivers/net/b44.c  2007-05-27 22:58:01.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-27 23:01:44.0 +0200
> @@ -825,12 +825,11 @@
>                         if (copy_skb == NULL)
>                                 goto drop_it_no_recycle;
>
> -                       copy_skb->dev = bp->dev;
>                         skb_reserve(copy_skb, 2);
>                         skb_put(copy_skb, len);
>                         /* DMA sync done above, copy just the actual packet */
> -                       memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
> -
> +                       skb_copy_from_linear_data_offset(skb, bp->rx_offset,
> +                                                        copy_skb->data, len);
>                         skb = copy_skb;
>                 }
>                 skb->ip_summed = CHECKSUM_NONE;
> @@ -1007,7 +1006,8 @@
>                         goto err_out;
>                 }
>
> -               memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
> +               skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
> +                                         skb->len);
>                 dev_kfree_skb_any(skb);
>                 skb = bounce_skb;
>         }






Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> > On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > > 2.6.21.1:
> > > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> > >
> > > 2.6.22-rc3:
> > > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
> >
> > This is the diff between these two kernels.
> > I'm not sure why you see a much better TX throughput here.
> >
> > Can you re-check to make sure it's not just some test-jitter?
>
> Oh, eh, and what I forgot to ask:
> Do you know an old kernel that works perfectly well for you,
> so I can look at a diff between this one and anything >=2.6.21.1.

I don't know of any. Most older kernels worked fine for me, but I never used
iperf there, so I guess if the bug is there too I simply didn't trigger it.
If you think it's useful I could go back and try different kernels, but that
would take some time.
Except for the iperf bug, 2.6.21.1 and 2.6.22-rc3 work fine.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> > When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in
> > normal use I didn't notice any problems. It did work fine as I would
> > expect it. I think the wget and ping tests here are as they should be.
> >
> > With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The
> > ping test does confirm this, because here response times are very high.
> > As far as I can remember the wget download rate was a bit slower than
> > 2.6.21.1 or 2.6.22-rc3 till it stalled.
> > I would expect it to be something like the other two kernels. The two
> > problems I see are the high ping times and the fact that the card stopped
> > working.
> >
> > I don't know why the iperf results are so different from my personal
> > experience. I guess the fact that I get so bad results with 2.6.21.1 and
> > 2.6.22-rc3 is that iperf does something that causes the system to be
> > extremely slow and thus degrading performance. This could be a bug
> > somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has
> > unintentionally been fixed by the ssb switch, but that's only a rough guess.
>
> Ok. I guess (Yes I do :D) that there is an IRQ storm or something like
> that, because you say that your system is becoming very slow and
> unresponsive. It sounds like an IRQ is not ACKed correctly and so keeps
> triggering and stalling the system. I'll take a look at a few diffs...
> Do you see significant differences in the "hi" and/or "si" times in top?
> Do you see a significant difference in the /proc/interrupts count. For
> example that the kernel that works worse generates 10 times the IRQ count
> for the same amount of data.

ok, here are the results:

Using 2.6.22-rc3 I get lots of hi during TX and lots of hi and si during RX.
Using 2.6.22-rc3-mm1 hi and si are significantly lower.
It's difficult to give absolute numbers, because top refreshes very slowly,
but with 2.6.22-rc3 hi is about 30% during TX and RX and si is 0% during TX
and 50% during RX. With 2.6.22-rc3-mm1 hi is 0% during TX and 0.3% during RX
and si is 10% during TX and 0% during RX.

When I do the same test on both kernels I get about 10 times (yes, it's really 
about ten times like in your example) more interrupts with 2.6.22-rc3 than 
with 2.6.22-rc3-mm1.
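
For anyone wanting to repeat the interrupt comparison, here is a minimal sketch of counting the interrupts raised during one test run from two snapshots of /proc/interrupts. The helper names and the sample snapshots are made up for illustration (only the IRQ number 10 comes from the dmesg earlier in the thread); the real file layout can differ between kernels and machines.

```python
def irq_total(proc_interrupts_text, irq):
    """Sum the per-CPU counts on one IRQ line of /proc/interrupts-style text."""
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == f"{irq}:":
            total = 0
            for field in fields[1:]:
                # Counts come first; stop at the controller/device names.
                if not field.isdigit():
                    break
                total += int(field)
            return total
    raise KeyError(irq)


def irqs_during_test(before_text, after_text, irq):
    """Interrupts raised between two snapshots for the given IRQ."""
    return irq_total(after_text, irq) - irq_total(before_text, irq)


# Made-up snapshots for illustration only; not data from this thread.
before = """           CPU0
 10:       1200   XT-PIC-XT  eth0
"""
after = """           CPU0
 10:      51200   XT-PIC-XT  eth0
"""

print(irqs_during_test(before, after, 10))
```

Taking one snapshot before and one after the same iperf run on each kernel, and dividing the two differences, gives the kind of "about ten times" ratio described above.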

An additional thing I noticed is that it's not the BCM4401 card that stops
working but my e100 card. If I take the e100 card down and up again the
connection works again, so the BCM4401 doesn't have a "stops working"
bug for me.

Maxi




Re: software suspend doesn't work with 2.6.22-rc3

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
> On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
> > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > Hi.
> > >
> > > On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > > > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > > > Hi.
> > > > >
> > > > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > > > Hello,
> > > > > >
> > > > > > When I try software suspend on my laptop it always returns to my
> > > > > > running system after some time.
> > > > > > This is what's logged by the kernel:
> > > > > >
> > > > > > swsusp: Basic memory bitmaps created
> > > > > > Stopping tasks ...
> > > > > > Stopping kernel threads timed out after 20 seconds (1 tasks
> > > > > > refusing to freeze):
> > > > > >  cryptd
> > > > > > Restarting tasks ... done.
> > > > > > swsusp: Basic memory bitmaps freed
> > > > > >
> > > > > > I have no idea what's the problem, but if you tell me what I
> > > > > > should do I can create debugging information and/or test patches.
> > > > >
> > > > > Could you try this patch, please? It should help.
> > > > >
> > > > > Herbert, is this right? If cryptd is going to be used for block
> > > > > devs, the task should probably be PF_NOFREEZE (or whatever it is
> > > > > today) instead.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Nigel
> > > > >
> > > > >  crypto/cryptd.c |1 +
> > > > >  include/linux/freezer.h |3 +++
> > > > >  kernel/power/process.c  |2 +-
> > > > >  3 files changed, 5 insertions(+), 1 deletion(-)
> > > > > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> > > > > > --- 991-fix-cryptd.patch-old/crypto/cryptd.c  2007-05-19 18:16:47.0 +1000
> > > > > > +++ 991-fix-cryptd.patch-new/crypto/cryptd.c  2007-05-26 19:45:42.0 +1000
> > > > > > @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
> > > > > >
> > > > > >   mutex_unlock(&state->mutex);
> > > > > >
> > > > > > + try_to_freeze();
> > > > > >   schedule();
> > > > > >   } while (!stop);
> > > >
> > > > I tried your patch, but when I apply it my kernel doesn't compile any
> > > > more. I get these warnings/errors:
> > > >
> > > > [...]
> > > >   CC  crypto/cryptd.o
> > > > crypto/cryptd.c: In function ‘cryptd_thread’:
> > > > crypto/cryptd.c:344: warning: implicit declaration of function
> > > > ‘try_to_freeze’ [...]
> > > >   LD  init/built-in.o
> > > >   LD  .tmp_vmlinux1
> > > > crypto/built-in.o: In function `cryptd_thread':
> > > > cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
> > > > make: *** [.tmp_vmlinux1] Error 1
> > >
> > > > Ah. You'll need to add #include <linux/freezer.h> near the start of
> > > > crypto/cryptd.c. Sorry for forgetting that.
> > >
> > > Nigel
> >
> > I added the include line and now I could compile the kernel, but
> > suspending still doesn't work.
> >
> > swsusp: Basic memory bitmaps created
> > Stopping tasks ...
> > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to
> > freeze):
> > cryptd
> > Restarting tasks ... done.
> > swsusp: Basic memory bitmaps freed
>
> OK, this means that cryptd doesn't execute the try_to_freeze() for some
> reason.
>
> Please apply the appended patch on top of 2.6.22-rc3 and see if that helps.
>
> Greetings,
> Rafael
>
> ---
>  crypto/cryptd.c |1 +
>  1 file changed, 1 insertion(+)
>
> Index: linux-2.6.22-rc3/crypto/cryptd.c
> ===
> --- linux-2.6.22-rc3.orig/crypto/cryptd.c
> +++ linux-2.6.22-rc3/crypto/cryptd.c
> @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
>   struct cryptd_state *state = data;
>   int stop;
>
> + current->flags |= PF_NOFREEZE;
>   do {
>   struct crypto_async_request *req, *backlog;

Even with this patch suspending doesn't work, dmesg shows the same error 
message.
I also did build a kernel without cryptd and suspending does work there.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.22-rc3:
> >
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> Why do we have two different measurements here? Is one TX and one RX?
> Which one?

Yes, the first is TX (BCM4401 --> e100) and the second is RX. Both are tcp 
connections. I think iperf does display the ip addresses wrong in the second 
connection, but that's another issue.

>
> > koala:~# ping -c10 192.168.1.1
> > PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> > 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> > 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> > 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> > 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> > 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> > 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> > 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> > 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
> >
> > --- 192.168.1.1 ping statistics ---
> > 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> > rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
> >
> > System responsiveness was the same as with 2.6.21.1.
> >
> > wget got 11.23M/s, again same as 2.6.21.1.
> >
> >
> > 2.6.22-rc2-mm1:
> >
> > [  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> > [  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec
>
> So with -mm (with ssb) you actually get better performance
> than with plain 2.6.22-rc3?
>
> Can you elaborate a bit more about what you get and what you expect
> on which kernel?

When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in normal 
use I didn't notice any problems. It did work fine as I would expect it.
I think the wget and ping tests here are as they should be.

With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping 
test does confirm this, because here response times are very high. As far as 
I can remember the wget download rate was a bit slower than 2.6.21.1 or 
2.6.22-rc3 till it stalled.
I would expect it to be something like the other two kernels. The two problems 
I see are the high ping times and the fact that the card stopped working.

I don't know why the iperf results are so different from my personal 
experience. I guess the fact that I get so bad results with 2.6.21.1 and 
2.6.22-rc3 is that iperf does something that causes the system to be 
extremely slow and thus degrading performance. This could be a bug somewhere 
in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been
fixed by the ssb switch, but that's only a rough guess.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
I am sending this again because my first mail accidentally had HTML code in
it and might have been filtered by some people.

On Saturday 26 May 2007, Michael Buesch wrote:
> On Saturday 26 May 2007 02:24:31 Stephen Hemminger wrote:
> > Something is broken with the b44 driver in 2.6.22-rc1 or later. Now
> > bisecting. The performance (with iperf) for receiving is normally 94Mbits
> > or more. But something happened that dropped performance to less than
> > 1Mbit, probably corrupted packets.
> >
> > There is nothing obvious in the commit log for drivers/net/b44.c, so it
> > probably is something more general.
> >
> >
> > Looking at the code in b44_rx(), I see a couple of unrelated bugs:
> > 1. In the small packet case it recycles the skb before copying data
> > out... Not good if new data arrives overwriting existing data.
> >
> > 2. Macros like RX_PKT_BUF_SZ that depend on local variables are evil!!
>
> Very interesting!
> 2.6.22 doesn't include ssb, does it?
>
> Adding CCs to make reporters of another bugreport aware of this.

I did some more tests with my BCM4401 and different kernels, here are the 
results:

2.6.21.1:

iperf:
[  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
[  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
[  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.241 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.231 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.237 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8998ms
rtt min/avg/max/mdev = 0.215/0.230/0.241/0.018 ms

The system was unusable while I ran the iperf test: when I moved the mouse it
only jumped around, and anything like starting programs or switching the
desktop only happened after iperf had finished its test.

I did an HTTP download with wget and got 11.23M/s.


2.6.22-rc3:

[  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
[  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
[  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8997ms
rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms

System responsiveness was the same as with 2.6.21.1.

wget got 11.23M/s, again same as 2.6.21.1.


2.6.22-rc2-mm1:

[  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
[  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
[  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=52.7 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=86.7 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=8.22 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=32.1 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=56.0 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=80.0 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.52 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=25.4 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=49.3 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.526/43.207/86.700/26.369 ms
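
As a rough cross-check of the summary line above, the statistics can be recomputed from the ten printed round-trip times. This sketch assumes mdev is the population standard deviation, sqrt(mean(x^2) - mean(x)^2); because the per-packet times are printed rounded, the result only approximately matches what ping reports.

```python
import math

# Round-trip times in ms, copied from the 2.6.22-rc2-mm1 ping output above.
rtts_ms = [39.8, 52.7, 86.7, 8.22, 32.1, 56.0, 80.0, 1.52, 25.4, 49.3]


def ping_summary(rtts):
    """Return (min, avg, max, mdev) for a list of round-trip times."""
    n = len(rtts)
    avg = sum(rtts) / n
    mean_sq = sum(r * r for r in rtts) / n
    # Assumed definition of mdev: sqrt(E[x^2] - E[x]^2).
    mdev = math.sqrt(mean_sq - avg * avg)
    return min(rtts), avg, max(rtts), mdev


mn, avg, mx, mdev = ping_summary(rtts_ms)
print(f"rtt min/avg/max/mdev = {mn:.3f}/{avg:.3f}/{mx:.3f}/{mdev:.3f} ms")
```

The point of the comparison is the spread: on 2.6.21.1 and 2.6.22-rc3 the deviation is a few hundredths of a millisecond, here it is tens of milliseconds.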

Here system responsiveness was OK while I ran iperf; I didn't notice anything
anomalous.

When I tried the wget HTTP download the transfer stalled, and from this point
on I couldn't send or receive anything on my BCM4401 anymore. Taken the 

Re: software suspend doesn't work with 2.6.22-rc3

2007-05-27 Thread Maximilian Engelhardt
On Saturday 26 May 2007, Nigel Cunningham wrote:
> Hi.
>
> On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
> > On Saturday 26 May 2007, Nigel Cunningham wrote:
> > > Hi.
> > >
> > > On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > > > Hello,
> > > >
> > > > When I try software suspend on my laptop it always returns to my
> > > > running system after some time.
> > > > This is what's logged by the kernel:
> > > >
> > > > swsusp: Basic memory bitmaps created
> > > > Stopping tasks ...
> > > > Stopping kernel threads timed out after 20 seconds (1 tasks refusing
> > > > to freeze):
> > > >  cryptd
> > > > Restarting tasks ... done.
> > > > swsusp: Basic memory bitmaps freed
> > > >
> > > > I have no idea what's the problem, but if you tell me what I should
> > > > do I can create debugging information and/or test patches.
> > >
> > > Could you try this patch, please? It should help.
> > >
> > > Herbert, is this right? If cryptd is going to be used for block devs,
> > > the task should probably be PF_NOFREEZE (or whatever it is today)
> > > instead.
> > >
> > > Regards,
> > >
> > > Nigel
> > >
> > >  crypto/cryptd.c |1 +
> > >  include/linux/freezer.h |3 +++
> > >  kernel/power/process.c  |2 +-
> > >  3 files changed, 5 insertions(+), 1 deletion(-)
> > > > diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> > > > --- 991-fix-cryptd.patch-old/crypto/cryptd.c  2007-05-19 18:16:47.0 +1000
> > > > +++ 991-fix-cryptd.patch-new/crypto/cryptd.c  2007-05-26 19:45:42.0 +1000
> > > > @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
> > > >
> > > >   mutex_unlock(&state->mutex);
> > > >
> > > > + try_to_freeze();
> > > >   schedule();
> > > >   } while (!stop);
> >
> > I tried your patch, but when I apply it my kernel doesn't compile any
> > more. I get these warnings/errors:
> >
> > [...]
> >   CC  crypto/cryptd.o
> > crypto/cryptd.c: In function ‘cryptd_thread’:
> > crypto/cryptd.c:344: warning: implicit declaration of function
> > ‘try_to_freeze’ [...]
> >   LD  init/built-in.o
> >   LD  .tmp_vmlinux1
> > crypto/built-in.o: In function `cryptd_thread':
> > cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
> > make: *** [.tmp_vmlinux1] Error 1
>
> Ah. You'll need to add #include <linux/freezer.h> near the start of
> crypto/cryptd.c. Sorry for forgetting that.
>
> Nigel

I added the include line and now I could compile the kernel, but suspending 
still doesn't work.

swsusp: Basic memory bitmaps created
Stopping tasks ... 
Stopping kernel threads timed out after 20 seconds (1 tasks refusing to 
freeze):
cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

Maxi


signature.asc
Description: This is a digitally signed message part.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
I am sending this again because my first mail accidentally contained HTML and 
may have been filtered by some people.

On Saturday 26 May 2007, Michael Buesch wrote:
> On Saturday 26 May 2007 02:24:31 Stephen Hemminger wrote:
> > Something is broken with the b44 driver in 2.6.22-rc1 or later. Now
> > bisecting. The performance (with iperf) for receiving is normally 94Mbits
> > or more. But something happened that dropped performance to less than
> > 1Mbit, probably corrupted packets.
> >
> > There is nothing obvious in the commit log for drivers/net/b44.c, so it
> > probably is something more general.
> >
> >
> > Looking at the code in b44_rx(), I see a couple of unrelated bugs:
> > 1. In the small packet case it recycles the skb before copying data
> > out... Not good if new data arrives overwriting existing data.
> >
> > 2. Macros like RX_PKT_BUF_SZ that depend on local variables are evil!!
>
> Very interesting!
> 2.6.22 doesn't include ssb, does it?
>
> Adding CCs to make reporters of another bugreport aware of this.

I did some more tests with my BCM4401 on different kernels. Here are the 
results:

2.6.21.1:

iperf:
[  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
[  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
[  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.241 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.231 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.237 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8998ms
rtt min/avg/max/mdev = 0.215/0.230/0.241/0.018 ms

The system was unusable while I ran the iperf test: when I moved the mouse the 
pointer only jumped around, and actions such as starting programs or switching 
desktops only took effect after iperf had finished its test.

I did an HTTP download with wget and got 11.23M/s.


2.6.22-rc3:

[  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
[  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
[  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8997ms
rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms

System responsiveness was the same as with 2.6.21.1.

wget got 11.23M/s, again same as 2.6.21.1.


2.6.22-rc2-mm1:

[  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
[  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
[  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=52.7 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=86.7 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=8.22 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=32.1 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=56.0 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=80.0 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.52 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=25.4 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=49.3 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.526/43.207/86.700/26.369 ms

Here system responsiveness was OK while I ran iperf; I didn't notice anything 
anomalous.

When I tried the wget HTTP download the transfer stalled, and from this point 
on I couldn't send or receive anything on my BCM4401 anymore. Taking the 
interface down and up again didn't 

Re: software suspend doesn't work with 2.6.22-rc3

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
 On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
  On Saturday 26 May 2007, Nigel Cunningham wrote:
   Hi.
  
   On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
On Saturday 26 May 2007, Nigel Cunningham wrote:
 Hi.

 On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
  Hello,
 
  When I try software suspend on my laptop it always returns to my
  running system after some time.
  This is what's logged by the kernel:
 
  swsusp: Basic memory bitmaps created
  Stopping tasks ...
  Stopping kernel threads timed out after 20 seconds (1 tasks
  refusing to freeze):
   cryptd
  Restarting tasks ... done.
  swsusp: Basic memory bitmaps freed
 
  I have no idea what's the problem, but if you tell me what I
  should do I can create debugging information and/or test patches.

 Could you try this patch, please? It should help.

 Herbert, is this right? If cryptd is going to be used for block
 devs, the task should probably be PF_NOFREEZE (or whatever it is
 today) instead.

 Regards,

 Nigel

  crypto/cryptd.c |1 +
  include/linux/freezer.h |3 +++
  kernel/power/process.c  |2 +-
  3 files changed, 5 insertions(+), 1 deletion(-)
 diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c
 991-fix-cryptd.patch-new/crypto/cryptd.c ---
 991-fix-cryptd.patch-old/crypto/cryptd.c  2007-05-19
 18:16:47.0 +1000 +++
 991-fix-cryptd.patch-new/crypto/cryptd.c  2007-05-26
 19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int
 cryptd_thread(void *data)

   mutex_unlock(&state->mutex);

 + try_to_freeze();
   schedule();
   } while (!stop);
   
I tried your patch, but when I apply it my kernel doesn't compile any
more. I get these warnings/errors:
   
[...]
  CC  crypto/cryptd.o
crypto/cryptd.c: In function ‘cryptd_thread’:
crypto/cryptd.c:344: warning: implicit declaration of function
‘try_to_freeze’ [...]
  LD  init/built-in.o
  LD  .tmp_vmlinux1
crypto/built-in.o: In function `cryptd_thread':
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
make: *** [.tmp_vmlinux1] Error 1
  
   Ah. You'll need to add #include <linux/freezer.h> near the start of
   crypto/cryptd.c. Sorry for forgetting that.
  
   Nigel
 
  I added the include line and now I could compile the kernel, but
  suspending still doesn't work.
 
  swsusp: Basic memory bitmaps created
  Stopping tasks ...
  Stopping kernel threads timed out after 20 seconds (1 tasks refusing to
  freeze):
  cryptd
  Restarting tasks ... done.
  swsusp: Basic memory bitmaps freed

 OK, this means that cryptd doesn't execute the try_to_freeze() for some
 reason.

 Please apply the appended patch on top of 2.6.22-rc3 and see if that helps.

 Greetings,
 Rafael

 ---
  crypto/cryptd.c |1 +
  1 file changed, 1 insertion(+)

 Index: linux-2.6.22-rc3/crypto/cryptd.c
 ===
 --- linux-2.6.22-rc3.orig/crypto/cryptd.c
 +++ linux-2.6.22-rc3/crypto/cryptd.c
 @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
   struct cryptd_state *state = data;
   int stop;

 + current->flags |= PF_NOFREEZE;
   do {
   struct crypto_async_request *req, *backlog;

Even with this patch suspending doesn't work; dmesg shows the same error 
message.
I also built a kernel without cryptd, and suspending does work there.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.22-rc3:
> >
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> Why do we have two different measurements here? Is one TX and one RX?
> Which one?

Yes, the first is TX (BCM4401 -> e100) and the second is RX. Both are TCP 
connections. I think iperf displays the IP addresses wrongly for the second 
connection, but that's another issue.


> > koala:~# ping -c10 192.168.1.1
> > PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> > 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> > 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> > 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> > 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> > 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> > 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> > 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> > 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
> >
> > --- 192.168.1.1 ping statistics ---
> > 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> > rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
> >
> > System responsiveness was the same as with 2.6.21.1.
> >
> > wget got 11.23M/s, again same as 2.6.21.1.
> >
> >
> > 2.6.22-rc2-mm1:
> >
> > [  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> > [  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec
>
> So with -mm (with ssb) you actually get better performance
> than with plain 2.6.22-rc3?
>
> Can you elaborate a bit more about what you get and what you expect
> on which kernel?

When I ran 2.6.21.1 or 2.6.22-rc3 in normal use, without any debugging tools, 
I didn't notice any problems. It worked as I would expect.
I think the wget and ping results there are as they should be.

With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping 
test confirms this, because the response times are very high. As far as I can 
remember, the wget download rate was a bit lower than with 2.6.21.1 or 
2.6.22-rc3 until it stalled.
I would expect it to behave like the other two kernels. The two problems I 
see are the high ping times and the fact that the card stopped working.
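
As a rough cross-check of "very high", the ping summary can be recomputed from 
the per-packet times quoted in this thread. This is only a sketch: it assumes 
that ping's mdev is sqrt(mean(x^2) - mean(x)^2), and the name rtt_summary is 
made up for this note. Because ping prints rounded per-packet times, the 
recomputed values only approximately match the printed summary line.

```python
import math


def rtt_summary(rtts_ms):
    """Return (min, avg, max, mdev) for a list of round-trip times in ms."""
    n = len(rtts_ms)
    avg = sum(rtts_ms) / n
    mean_sq = sum(x * x for x in rtts_ms) / n
    # Guard against a tiny negative value caused by floating-point rounding.
    mdev = math.sqrt(max(mean_sq - avg * avg, 0.0))
    return min(rtts_ms), avg, max(rtts_ms), mdev


# Per-packet times copied from the two ping runs quoted in this thread.
RC3 = [0.243, 0.234, 0.238, 0.235, 0.230, 0.317, 0.232, 0.232, 0.228, 0.238]
RC2_MM1 = [39.8, 52.7, 86.7, 8.22, 32.1, 56.0, 80.0, 1.52, 25.4, 49.3]

rc3_avg = rtt_summary(RC3)[1]
lo, avg, hi, mdev = rtt_summary(RC2_MM1)
print(f"2.6.22-rc2-mm1: min/avg/max/mdev = {lo:.3f}/{avg:.3f}/{hi:.3f}/{mdev:.3f} ms")
print(f"average RTT is about {avg / rc3_avg:.0f} times that of 2.6.22-rc3")
```

The point of the comparison is the ratio of the averages, not the exact 
digits: the -mm kernel answers on the same idle link roughly two orders of 
magnitude more slowly.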

I don't know why the iperf results are so different from my personal 
experience. My guess is that I get such bad results with 2.6.21.1 and 
2.6.22-rc3 because iperf does something that makes the system extremely slow, 
which degrades performance. This could be a bug somewhere in the b44 driver 
of 2.6.21.1 and 2.6.22-rc3 that was unintentionally fixed by the ssb switch, 
but that's only a rough guess.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> > When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in
> > normal use I didn't notice any problems. It did work fine as I would
> > expect it. I think the wget and ping tests here are as they should be.
> >
> > With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The
> > ping test does confirm this, because here response times are very high.
> > As far as I can remember the wget download rate was a bit slower than
> > 2.6.21.1 or 2.6.22-rc3 till it stalled.
> > I would expect it to be something like the other two kernels. The two
> > problems I see are the high ping times and the fact that the card stopped
> > working.
> >
> > I don't know why the iperf results are so different from my personal
> > experience. I guess the fact that I get so bad results with 2.6.21.1 and
> > 2.6.22-rc3 is that iperf does something that causes the system to be
> > extremely slow and thus degrading performance. This could be a bug
> > somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has
> > unintentionally been fixed by the ssb switch, but that's only a rough
> > guess.
>
> Ok. I guess (Yes I do :D) that there is an IRQ storm or something like
> that, because you say that your system is becoming very slow and
> unresponsive. It sounds like an IRQ is not ACKed correctly and so keeps
> triggering and stalling the system. I'll take a look at a few diffs...
> Do you see significant differences in the hi and/or si times in top?
> Do you see a significant difference in the /proc/interrupts count? For
> example that the kernel that works worse generates 10 times the IRQ count
> for the same amount of data.

ok, here are the results:

With 2.6.22-rc3 I get lots of hi during TX and lots of hi and si during RX.
With 2.6.22-rc3-mm1 hi and si are significantly lower.
It's difficult to give absolute numbers because top refreshes very slowly, but 
with 2.6.22-rc3 hi is about 30% during TX and RX, and si is 0% during TX and 
50% during RX. With 2.6.22-rc3-mm1 hi is 0% during TX and 0.3% during RX, and 
si is 10% during TX and 0% during RX.

When I run the same test on both kernels I get about 10 times (yes, it's 
really about ten times, as in your example) more interrupts with 2.6.22-rc3 
than with 2.6.22-rc3-mm1.
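
One way to put a number on "about ten times more interrupts" is to save 
/proc/interrupts before and after an identical transfer on each kernel and 
subtract the counters. The sketch below assumes the usual /proc/interrupts 
layout (a CPU header line, then "IRQ: count... description"). The function 
names are made up for this note, and the two snapshots are invented sample 
data, not measurements from this thread.

```python
def parse_interrupts(text):
    """Map IRQ label -> (total count over all CPUs, description)."""
    table = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "           CPU0"
        fields = rest.split()
        counts = []
        # The leading numeric fields are the per-CPU counters.
        while fields and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        table[label.strip()] = (sum(counts), " ".join(fields))
    return table


def irq_delta(before, after, device):
    """Interrupts raised between two snapshots on lines that list `device`."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    total = 0
    for irq, (count, desc) in new.items():
        # Shared lines list several handlers separated by commas.
        if device in desc.replace(",", " ").split():
            total += count - old.get(irq, (0, ""))[0]
    return total


# Hypothetical snapshots, for illustration only.
BEFORE = """\
           CPU0
  0:     100000   IO-APIC-edge      timer
 10:       5000   IO-APIC-fasteoi   eth0
"""
AFTER = """\
           CPU0
  0:     160000   IO-APIC-edge      timer
 10:      55000   IO-APIC-fasteoi   eth0
"""

print(irq_delta(BEFORE, AFTER, "eth0"))
```

Dividing the delta by the number of bytes transferred gives an 
interrupts-per-byte figure that can be compared between kernels even when the 
runs moved different amounts of data.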

An additional thing I noticed is that it's not the BCM4401 card that stops 
working but my e100 card. If I take the e100 card down and up again, the 
connection works again, so the BCM4401 doesn't have a "stops working" 
bug for me.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> > On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > > 2.6.21.1:
> > > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> > >
> > > 2.6.22-rc3:
> > > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
> >
> > This is the diff between these two kernels.
> > I'm not sure why you see a much better TX throughput here.
> >
> > Can you re-check to make sure it's not just some test-jitter?
>
> Oh, eh, and what I forgot to ask:
> Do you know an old kernel that works perfectly well for you,
> so I can look at a diff between this one and anything =2.6.21.1.

I don't know of any. Most older kernels worked fine for me, but I never used 
iperf there, so I guess that if the bug is there too I simply didn't trigger it.
If you think it's useful I could go back and try different kernels, but that 
would take some time.
Apart from the iperf bug, 2.6.21.1 and 2.6.22-rc3 work fine.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> >
> > 2.6.22-rc3:
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
>
> Can you re-check to make sure it's not just some test-jitter?

2.6.21.1:

[  5] local 192.168.1.2 port 54423 connected with 192.168.1.1 port 5001
[  5]  0.0-60.3 sec  3.06 MBytes   426 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 41053
[  4]  0.0-163.0 sec   130 MBytes  6.67 Mbits/sec


2.6.22-rc3:

[  5] local 192.168.1.2 port 46002 connected with 192.168.1.1 port 5001
[  5]  0.0-61.5 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 44379
[  4]  0.0-93.8 sec  30.6 MBytes  2.74 Mbits/sec

For TX the iperf server reports the same values as the client (all values 
above are from the client), but for RX they are different:

2.6.21.1 (iperf server log):

[  5] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 54423
[  5]  0.0-60.5 sec  3.06 MBytes   425 Kbits/sec
[  5] local 192.168.1.1 port 41053 connected with 192.168.1.2 port 5001
[  5]  0.0-63.1 sec   130 MBytes  17.2 Mbits/sec


2.6.22-rc3 (iperf server log):

[  4] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 46002
[  4]  0.0-61.6 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.1 port 44379 connected with 192.168.1.2 port 5001
[  4]  0.0-63.3 sec  30.6 MBytes  4.06 Mbits/sec

I have no idea how iperf works internally or what could cause such different 
results here.
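
The arithmetic behind the two RX reports can at least be checked. Client and 
server agree on the amount of data but not on the interval, and the rate is 
simply data divided by interval. The sketch below assumes iperf counts 
1 MByte as 2^20 bytes and 1 Mbit as 10^6 bits, which reproduces the figures 
quoted in this thread to within rounding; the function name is made up for 
this note.

```python
def mbits_per_sec(mbytes, seconds):
    """Convert an iperf 'MBytes over interval' pair to Mbits/sec."""
    bits = mbytes * 1024 * 1024 * 8   # assumes 1 MByte = 2**20 bytes
    return bits / seconds / 1e6       # assumes 1 Mbit = 10**6 bits


# RX figures quoted above: (kernel, MBytes, client interval, server interval)
RX_RUNS = [
    ("2.6.21.1", 130.0, 163.0, 63.1),
    ("2.6.22-rc3", 30.6, 93.8, 63.3),
]

for kernel, mbytes, client_s, server_s in RX_RUNS:
    print(f"{kernel}: client {mbits_per_sec(mbytes, client_s):.2f} Mbits/sec, "
          f"server {mbits_per_sec(mbytes, server_s):.2f} Mbits/sec")
```

So the differing rates come entirely from the measured interval (163.0 s 
against 63.1 s, and 93.8 s against 63.3 s). The numbers alone do not say why 
the client's interval is so much longer than the server's.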


> --- linux-2.6.21.1/drivers/net/b44.c    2007-05-27 22:58:01.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-27 23:01:44.0 +0200
> @@ -825,12 +825,11 @@
>                         if (copy_skb == NULL)
>                                 goto drop_it_no_recycle;
>
> -                       copy_skb->dev = bp->dev;
>                         skb_reserve(copy_skb, 2);
>                         skb_put(copy_skb, len);
>                         /* DMA sync done above, copy just the actual packet */
> -                       memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
> -
> +                       skb_copy_from_linear_data_offset(skb, bp->rx_offset,
> +                                                        copy_skb->data, len);
>                         skb = copy_skb;
>                 }
>                 skb->ip_summed = CHECKSUM_NONE;
> @@ -1007,7 +1006,8 @@
>                         goto err_out;
>                 }
>
> -               memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
> +               skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
> +                                         skb->len);
>                 dev_kfree_skb_any(skb);
>                 skb = bounce_skb;
>         }






Re: software suspend doesn't work with 2.6.22-rc3

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Rafael J. Wysocki wrote:
 On Sunday, 27 May 2007 22:41, Maximilian Engelhardt wrote:
  On Sunday 27 May 2007, Rafael J. Wysocki wrote:
   On Sunday, 27 May 2007 18:01, Maximilian Engelhardt wrote:
On Saturday 26 May 2007, Nigel Cunningham wrote:
 Hi.

 On Sat, 2007-05-26 at 14:49 +0200, Maximilian Engelhardt wrote:
  On Saturday 26 May 2007, Nigel Cunningham wrote:
   Hi.
  
   On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
Hello,
   
When I try software suspend on my laptop it always returns to
my running system after some time.
This is what's logged by the kernel:
   
swsusp: Basic memory bitmaps created
Stopping tasks ...
Stopping kernel threads timed out after 20 seconds (1 tasks
refusing to freeze):
 cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed
   
I have no idea what's the problem, but if you tell me what I
should do I can create debugging information and/or test
patches.
  
   Could you try this patch, please? It should help.
  
   Herbert, is this right? If cryptd is going to be used for block
   devs, the task should probably be PF_NOFREEZE (or whatever it
   is today) instead.
  
   Regards,
  
   Nigel
  
crypto/cryptd.c |1 +
include/linux/freezer.h |3 +++
kernel/power/process.c  |2 +-
3 files changed, 5 insertions(+), 1 deletion(-)
   diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c
   991-fix-cryptd.patch-new/crypto/cryptd.c ---
   991-fix-cryptd.patch-old/crypto/cryptd.c  2007-05-19
   18:16:47.0 +1000 +++
   991-fix-cryptd.patch-new/crypto/cryptd.c  2007-05-26
   19:45:42.0 +1000 @@ -341,6 +341,7 @@ static int
   cryptd_thread(void *data)
  
 mutex_unlock(&state->mutex);
  
   + try_to_freeze();
 schedule();
 } while (!stop);
 
  I tried your patch, but when I apply it my kernel doesn't compile
  any more. I get these warnings/errors:
 
  [...]
CC  crypto/cryptd.o
  crypto/cryptd.c: In function ‘cryptd_thread’:
  crypto/cryptd.c:344: warning: implicit declaration of function
  ‘try_to_freeze’ [...]
LD  init/built-in.o
LD  .tmp_vmlinux1
  crypto/built-in.o: In function `cryptd_thread':
  cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
  make: *** [.tmp_vmlinux1] Error 1

 Ah. You'll need to add #include <linux/freezer.h> near the start
 of crypto/cryptd.c. Sorry for forgetting that.

 Nigel
   
I added the include line and now I could compile the kernel, but
suspending still doesn't work.
   
swsusp: Basic memory bitmaps created
Stopping tasks ...
Stopping kernel threads timed out after 20 seconds (1 tasks refusing
to freeze):
cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed
  
   OK, this means that cryptd doesn't execute the try_to_freeze() for some
   reason.
  
   Please apply the appended patch on top of 2.6.22-rc3 and see if that
   helps.
  
   Greetings,
   Rafael
  
   ---
crypto/cryptd.c |1 +
1 file changed, 1 insertion(+)
  
   Index: linux-2.6.22-rc3/crypto/cryptd.c
   ===
   --- linux-2.6.22-rc3.orig/crypto/cryptd.c
   +++ linux-2.6.22-rc3/crypto/cryptd.c
   @@ -316,6 +316,7 @@ static int cryptd_thread(void *data)
 struct cryptd_state *state = data;
 int stop;
  
   + current->flags |= PF_NOFREEZE;
 do {
 struct crypto_async_request *req, *backlog;
 
  Even with this patch suspending doesn't work, dmesg shows the same error
  message.
  I also did build a kernel without cryptd and suspending does work there.

 Well, that's strange, because in that case the freezer shouldn't even wait
 for cryptd.

 Can you please try the patch at http://lkml.org/lkml/2007/5/26/169 ?

With this patch applied suspend does work fine.

Maxi
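
For readers following the thread, the "1 tasks refusing to freeze" message 
and the two attempted fixes can be pictured with a toy model. It is not 
kernel code: it only encodes what is said above, namely that a kernel thread 
freezes when it reaches try_to_freeze(), and that the freezer should not wait 
for a task marked PF_NOFREEZE. All names are hypothetical and time is 
modelled as loop iterations.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    calls_try_to_freeze: bool   # does its main loop ever reach try_to_freeze()?
    nofreeze: bool = False      # stands in for PF_NOFREEZE
    frozen: bool = False


def freeze_tasks(tasks, timeout_ticks=20):
    """Return the names of tasks still not frozen when the timeout expires."""
    # Tasks marked nofreeze are skipped: the freezer does not wait for them.
    waiting = [t for t in tasks if not t.nofreeze]
    for _ in range(timeout_ticks):
        for t in waiting:
            # Freezing is cooperative: only a task that checks gets frozen.
            if t.calls_try_to_freeze:
                t.frozen = True
        if all(t.frozen for t in waiting):
            return []
    return [t.name for t in waiting if not t.frozen]


# A thread that never checks is reported after the timeout.
print(freeze_tasks([Task("worker", True), Task("cryptd", False)]))
# Marking it nofreeze means it is no longer waited for.
print(freeze_tasks([Task("worker", True), Task("cryptd", False, nofreeze=True)]))
```

In this model either fix makes the freeze succeed, which is why it was 
surprising that neither of the first two patches helped on the real system.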




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Ok, another question: On which CPU architecture are you?

[EMAIL PROTECTED]:~$ uname -m
i686

Maxi




Re: software suspend doesn't work with 2.6.22-rc3

2007-05-26 Thread Maximilian Engelhardt
On Saturday 26 May 2007, Nigel Cunningham wrote:
> Hi.
>
> On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > Hello,
> >
> > When I try software suspend on my laptop it always returns to my running
> > system after some time.
> > This is what's logged by the kernel:
> >
> > swsusp: Basic memory bitmaps created
> > Stopping tasks ...
> > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to
> > freeze):
> >  cryptd
> > Restarting tasks ... done.
> > swsusp: Basic memory bitmaps freed
> >
> > I have no idea what's the problem, but if you tell me what I should do I
> > can create debugging information and/or test patches.
>
> Could you try this patch, please? It should help.
>
> Herbert, is this right? If cryptd is going to be used for block devs,
> the task should probably be PF_NOFREEZE (or whatever it is today)
> instead.
>
> Regards,
>
> Nigel
>
>  crypto/cryptd.c |1 +
>  include/linux/freezer.h |3 +++
>  kernel/power/process.c  |2 +-
>  3 files changed, 5 insertions(+), 1 deletion(-)
> diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> --- 991-fix-cryptd.patch-old/crypto/cryptd.c  2007-05-19 18:16:47.0 +1000
> +++ 991-fix-cryptd.patch-new/crypto/cryptd.c  2007-05-26 19:45:42.0 +1000
> @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
>
>   mutex_unlock(&state->mutex);
>
> + try_to_freeze();
>   schedule();
>   } while (!stop);

I tried your patch, but when I apply it my kernel doesn't compile any more. I 
get these warnings/errors:

[...]
  CC  crypto/cryptd.o
crypto/cryptd.c: In function ‘cryptd_thread’:
crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
[...]
  LD  init/built-in.o
  LD  .tmp_vmlinux1
crypto/built-in.o: In function `cryptd_thread':
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
make: *** [.tmp_vmlinux1] Error 1

Maxi




Oops with prism54 in 2.6.22-rc3

2007-05-26 Thread Maximilian Engelhardt
Hello,

When using the prism54 driver included in the 2.6.22-rc3 kernel, I get this 
Oops when putting the card into monitor mode:

BUG: unable to handle kernel NULL pointer dereference at virtual address 
01d8
 printing eip:
c0500608
*pde = 
Oops: 0002 [#1]
PREEMPT 
Modules linked in: fuse
CPU:0
EIP:0060:[]Not tainted VLI
EFLAGS: 00010046   (2.6.22-rc3 #2)
EIP is at netif_rx+0x48/0xc0
eax:    ebx: c18fdbc0   ecx: c087991c   edx: c0879910
esi: 0246   edi: f7c68010   ebp: f7fe0ba0   esp: c07bbef0
ds: 007b   es: 007b   fs:   gs:   ss: 0068
Process swapper (pid: 0, ti=c07ba000 task=c075a280 task.ti=c07ba000)
Stack: f7ec  c03d2b8f c07bbf24 0082 f7c68024 f7fe0800 c18fdbc0 
   0070 0046 0286 0286 0008 0007 0032dcd5  
   f7fe0ba0 0002 f7fe0800 c03d913d   f7f4d2c0  
Call Trace:
 [] islpci_eth_receive+0x12f/0x590
 [] islpci_interrupt+0x1cd/0x280
 [] handle_IRQ_event+0x25/0x50
 [] handle_fasteoi_irq+0x5c/0xe0
 [] do_IRQ+0x4a/0x80
 [] common_interrupt+0x23/0x28
 [] default_idle+0x2a/0x40
 [] cpu_idle+0x43/0x80
 [] start_kernel+0x21a/0x260
 [] unknown_bootoption+0x0/0x260
 ===
Code: c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 9c 5e fa ff 05 bc 9c 87 c0 a1 
0c 99 87 c0 3b 05 c0 4a 7b c0 77 30 85 c0 74 43 8b 43 14  80 d8 01 00 00 
a1 08 99 87 c0 ff 05 0c 99 87 c0 c7 03 04 99 
EIP: [] netif_rx+0x48/0xc0 SS:ESP 0068:c07bbef0
Kernel panic - not syncing: Fatal exception in interrupt

After this the system is frozen. With kernel 2.6.21 everything works fine: I 
can capture packets in monitor mode and do not get any Oops.

Maxi




Re: BUG in 2.6.22-rc2-mm1: NIC module b44.c broken (Broadcom 4400)

2007-05-26 Thread Maximilian Engelhardt
On Saturday 26 May 2007, Michael Buesch wrote:
> On Friday 25 May 2007 21:40, Uwe Bugla wrote:
> > Am Freitag, 25. Mai 2007 20:48 schrieben Sie:
> > > On Fri, 25 May 2007 17:59:29 +0200, Uwe Bugla wrote:
> > > > Perhaps someone reading this could try to reproduce that problem on
> > > > his machine.
> > > > Now who of the readers owns a Broadcom 4401 NIC and can please try to
> > > > test kernel 2.6.22-rc2-mm1?
> > > >
> > > > Those NICs have been used very very often as onboard controllers,
> > > > especially on ASUS boards.
> > >
> > > I've been using 2.6.22-rc2 for some time and now I compiled
> > > 2.6.22-rc2-mm1 and both work fine with the BCM4401 in my laptop.
> > >
> > > Maxi
> >
> > Hello Maxi,
> >
> > That may be true for your Laptop, but it unfortunately isn't true for my
> > ASUS mainboard onboard controller.
> >
> > Unfortunately I cannot confirm this:
> >
> > My broadcom 4401 driver is not part of a notebook, but instead part of an
> > ASUS P4PE mainboard.
> >
> > At my second attempt I went the conventional path (i. e. ignoring the
> > fact that
> > "Broadcom 4400 ethernet support appears twice in section "Network device
> > support":
> >
> > Whether you leave out "EISA, VLB, PCI and on board controllers" or not it
> > simply appears twice in kernel config! This is bug number 1.
>
> No it is NOT a bug.
> It simply shows again that you don't know how b44, ssb or anything related
> works.
>
> Would you _please_ take a look at the code, before calling features bugs.
> And yes, this IS a feature. It is a feature to get b44 running on an
> OpenWRT embedded device. These devices don't have a PCI bus. So b44 MUST
> NOT depend on "EISA, VLB, PCI and on board controllers".
> "Broadcom 4400 PCI device support" does depend on "EISA, VLB, PCI and on
> board controllers".
>
> Everything is correct.
> Bug number 1 is solved.
> qed
>
> > This time I do get a "good" interrupt: IRQ 21 for the device.
> >
> > BUT:
> >
> > Trying to ping another machine fails saying:
> >
> > "destination host unreachable"
> >
> >
> > That means, although the interrupt is fine now, the device is still not
> > functional.
>
> And it's completely impossible that you made a mistake when configuring
> the device? Typo in the IP? Typo in the gateway or DNS entries?
> Try it again, please.
> And please try with current wireless-dev tree.
>
> And I simply do not get it why you suddenly get a good IRQ number, like
> everybody else does, without fixing The Bug (tm).

I ran my 2.6.22-rc2-mm1 kernel a bit longer and noticed that I was wrong
in my first mail. The driver does work with my 4401 and network traffic seems
to get in and out fine, but it has huge performance problems. If I do some
pings and traceroutes I sometimes get response times of only a few ms, but I
also get times of a few seconds. Playing games is also totally
impossible. This doesn't happen with 2.6.22-rc2 and 2.6.22-rc3.
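
A rough way to put numbers on this kind of jitter is to save the ping output
and summarize the round-trip times. This is a minimal Python sketch, assuming
the usual iputils reply format ("time=... ms"). The sample lines, the address
192.0.2.1 and the 1000 ms threshold are made up for illustration and are not
measurements from this machine.

```python
import re
import statistics

# Matches the "time=12.3 ms" field that iputils ping prints for each reply.
RTT_RE = re.compile(r"time[=<]([0-9.]+) ?ms")


def parse_rtts(ping_output):
    """Return the round-trip times (in ms) found in saved ping output."""
    return [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]


def summarize(rtts, slow_ms=1000.0):
    """Summarize latency; slow_ms is an arbitrary cut-off for 'seconds'."""
    return {
        "count": len(rtts),
        "min_ms": min(rtts),
        "median_ms": statistics.median(rtts),
        "max_ms": max(rtts),
        "slow": sum(1 for r in rtts if r >= slow_ms),
    }


# Hypothetical sample, only to show the input format.
SAMPLE = """\
64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=1.52 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=2310 ms
64 bytes from 192.0.2.1: icmp_seq=3 ttl=64 time=0.98 ms
"""

if __name__ == "__main__":
    print(summarize(parse_rtts(SAMPLE)))
```

Running the same summary on output captured under 2.6.22-rc3 and under
2.6.22-rc2-mm1 would make the difference between the two kernels easy to
compare.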

Maxi


signature.asc
Description: This is a digitally signed message part.


software suspend doesn't work with 2.6.22-rc3

2007-05-26 Thread Maximilian Engelhardt
Hello,

When I try software suspend on my laptop it always returns to my running 
system after some time.
This is what's logged by the kernel:

swsusp: Basic memory bitmaps created
Stopping tasks ... 
Stopping kernel threads timed out after 20 seconds (1 tasks refusing to 
freeze):
 cryptd
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed

I have no idea what the problem is, but if you tell me what I should do I can 
create debugging information and/or test patches.

My config is attached; the kernel is 2.6.22-rc3.

Maxi
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.22-rc3
# Sat May 26 10:07:12 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
# CONFIG_TASK_XACCT is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
# CONFIG_SMP is not set
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_MODEL=4
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
# CONFIG_X86_UP_APIC is not set
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_DELL_RBU is not set
# 


Oops with prism54 in 2.6.22-rc3

2007-05-26 Thread Maximilian Engelhardt
Hello,

When using the prism54 driver included in the 2.6.22-rc3 kernel, I get this 
Oops when putting the card into monitor mode:

BUG: unable to handle kernel NULL pointer dereference at virtual address 
01d8
 printing eip:
c0500608
*pde = 
Oops: 0002 [#1]
PREEMPT 
Modules linked in: fuse
CPU:0
EIP:0060:[c0500608]Not tainted VLI
EFLAGS: 00010046   (2.6.22-rc3 #2)
EIP is at netif_rx+0x48/0xc0
eax:    ebx: c18fdbc0   ecx: c087991c   edx: c0879910
esi: 0246   edi: f7c68010   ebp: f7fe0ba0   esp: c07bbef0
ds: 007b   es: 007b   fs:   gs:   ss: 0068
Process swapper (pid: 0, ti=c07ba000 task=c075a280 task.ti=c07ba000)
Stack: f7ec  c03d2b8f c07bbf24 0082 f7c68024 f7fe0800 c18fdbc0 
   0070 0046 0286 0286 0008 0007 0032dcd5  
   f7fe0ba0 0002 f7fe0800 c03d913d   f7f4d2c0  
Call Trace:
 [c03d2b8f] islpci_eth_receive+0x12f/0x590
 [c03d913d] islpci_interrupt+0x1cd/0x280
 [c0144e15] handle_IRQ_event+0x25/0x50
 [c014669c] handle_fasteoi_irq+0x5c/0xe0
 [c010674a] do_IRQ+0x4a/0x80
 [c010498f] common_interrupt+0x23/0x28
 [c0102b3a] default_idle+0x2a/0x40
 [c01023e3] cpu_idle+0x43/0x80
 [c07bcb2a] start_kernel+0x21a/0x260
 [c07bc450] unknown_bootoption+0x0/0x260
 ===
Code: c7 43 0c 00 00 00 00 c7 43 10 00 00 00 00 9c 5e fa ff 05 bc 9c 87 c0 a1 
0c 99 87 c0 3b 05 c0 4a 7b c0 77 30 85 c0 74 43 8b 43 14 ff 80 d8 01 00 00 
a1 08 99 87 c0 ff 05 0c 99 87 c0 c7 03 04 99 
EIP: [c0500608] netif_rx+0x48/0xc0 SS:ESP 0068:c07bbef0
Kernel panic - not syncing: Fatal exception in interrupt

After this the system is frozen. With kernel 2.6.21 everything works fine: I 
can capture packets in monitor mode and do not get any Oops.
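
When comparing traces like this one between kernels, it can help to pull the
"symbol+0xoffset/0xsize" entries out of the Oops text. The following is a
small Python sketch of that; the regular expression is an assumption that
covers the entries shown above, not a complete Oops parser.

```python
import re

# "symbol+0xOFFSET/0xSIZE", the form used by the EIP and Call Trace lines.
SYM_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_symbols(oops_text):
    """Return (symbol, offset, size) tuples in order of appearance."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16))
        for m in SYM_RE.finditer(oops_text)
    ]


# A few lines copied from the Oops above.
OOPS = """\
EIP is at netif_rx+0x48/0xc0
 [c03d2b8f] islpci_eth_receive+0x12f/0x590
 [c03d913d] islpci_interrupt+0x1cd/0x280
"""

if __name__ == "__main__":
    for name, offset, size in parse_symbols(OOPS):
        print(f"{name}: offset {offset:#x} in a function of {size:#x} bytes")
```

Given a vmlinux built with debug information, an offset such as
netif_rx+0x48 can then be looked up in gdb, for example with
"list *(netif_rx+0x48)".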

Maxi


Re: software suspend doesn't work with 2.6.22-rc3

2007-05-26 Thread Maximilian Engelhardt
On Saturday 26 May 2007, Nigel Cunningham wrote:
> Hi.
>
> On Sat, 2007-05-26 at 11:28 +0200, Maximilian Engelhardt wrote:
> > Hello,
> >
> > When I try software suspend on my laptop it always returns to my running
> > system after some time.
> > This is what's logged by the kernel:
> >
> > swsusp: Basic memory bitmaps created
> > Stopping tasks ...
> > Stopping kernel threads timed out after 20 seconds (1 tasks refusing to
> > freeze):
> >  cryptd
> > Restarting tasks ... done.
> > swsusp: Basic memory bitmaps freed
> >
> > I have no idea what's the problem, but if you tell me what I should do I
> > can create debugging information and/or test patches.
>
> Could you try this patch, please? It should help.
>
> Herbert, is this right? If cryptd is going to be used for block devs,
> the task should probably be PF_NOFREEZE (or whatever it is today)
> instead.
>
> Regards,
>
> Nigel
>
>  crypto/cryptd.c         |    1 +
>  include/linux/freezer.h |    3 +++
>  kernel/power/process.c  |    2 +-
>  3 files changed, 5 insertions(+), 1 deletion(-)
> diff -ruNp 991-fix-cryptd.patch-old/crypto/cryptd.c 991-fix-cryptd.patch-new/crypto/cryptd.c
> --- 991-fix-cryptd.patch-old/crypto/cryptd.c	2007-05-19 18:16:47.0 +1000
> +++ 991-fix-cryptd.patch-new/crypto/cryptd.c	2007-05-26 19:45:42.0 +1000
> @@ -341,6 +341,7 @@ static int cryptd_thread(void *data)
>
>  		mutex_unlock(&state->mutex);
>
> +		try_to_freeze();
>  		schedule();
>  	} while (!stop);

I tried your patch, but when I apply it my kernel doesn't compile any more. I 
get these warnings/errors:

[...]
  CC  crypto/cryptd.o
crypto/cryptd.c: In function ‘cryptd_thread’:
crypto/cryptd.c:344: warning: implicit declaration of function ‘try_to_freeze’
[...]
  LD  init/built-in.o
  LD  .tmp_vmlinux1
crypto/built-in.o: In function `cryptd_thread':
cryptd.c:(.text+0xd7f5): undefined reference to `try_to_freeze'
make: *** [.tmp_vmlinux1] Error 1

Maxi


Re: BUG in 2.6.22-rc2-mm1: NIC module b44.c broken (Broadcom 4400)

2007-05-25 Thread Maximilian Engelhardt
On Fri, 25 May 2007 17:59:29 +0200, Uwe Bugla wrote:

> 
> Perhaps someone reading this could try to reproduce that problem on his
> machine.
> Now which of the readers owns a Broadcom 4401 NIC and could please
> test kernel 2.6.22-rc2-mm1?
> 
> Those NICs have been used very very often as onboard controllers,
> especially on ASUS boards.

I've been using 2.6.22-rc2 for some time and now I compiled
2.6.22-rc2-mm1, and both work fine with the BCM4401 in my laptop.

Maxi


Re: Call for help: list of machines with working S3

2005-03-31 Thread Maximilian Engelhardt
On Fri, 2005-03-18 at 15:50 +0100, Romano Giannetti wrote:
> 
> It happens exactly the same on my laptop, sony vaio whose configuration is 
> 
> http://www.dea.icai.upco.es/romano/linux/vaio-conf/laptop-config.html
> 
> Next week is Easter holiday here, I will try to connect my Psion casio as
> serial terminal and see if I can catch something. 

I was able to get some logs using CONFIG_LP_CONSOLE (the first time I
ever saw "Back to C!"):

Back to C!
PM: Finishing up.
ACPI: PCI interrupt :00:1f.1[A] -> GSI 10 (level,low) -> IRQ 10
MCE: The hardware reports a non fatal, correctable incident occurred on
CPU 0.
Bank 1: e201
hda: task_out_intr: status=0x51 { DriveReady SeekComplete Error }
hda: task_out_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown

The last three messages keep repeating until I reboot.
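
For reference, the names in braces are just the set bits of the register
value. Here is a minimal Python sketch of that decoding. The bit positions
follow the ATA status register layout; the names for bits that do not occur
in the log above are written from memory of the old IDE driver's messages and
should be treated as an assumption. Only the one error-register bit seen in
the log is included.

```python
# Status register bits, highest bit first. "DriveReady", "SeekComplete" and
# "Error" are confirmed by the log above; the other names are assumptions.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Error register: only the bit seen in the log is listed here.
ERROR_BITS = [
    (0x04, "DriveStatusError"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    print("status=0x51:", " ".join(decode(0x51, STATUS_BITS)))
    print("error=0x04:", " ".join(decode(0x04, ERROR_BITS)))
```

So status=0x51 is 0x40 | 0x10 | 0x01, which matches the three names printed
by the kernel.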

Full log:
http://home.daemonizer.de/resume.png

kernel version is 2.6.11
config: http://home.daemonizer.de/config-2.6.11-S3test
dmesg from booting: http://home.daemonizer.de/dmesg-2.6.11-S3test
lspci: http://home.daemonizer.de/lspci
Gentoo Base System version 1.6.10

Hardware:
Acer Travelmate 661lci (centrino)
Intel(R) Pentium(R) M processor 1400MHz

please mail me if you need additional data.

Thanks for help,
Maxi


Re: Call for help: list of machines with working S3

2005-03-27 Thread Maximilian Engelhardt
On Fri, 2005-03-18 at 15:50 +0100, Romano Giannetti wrote:
> 
> It happens exactly the same on my laptop, sony vaio whose configuration is 
> 
> http://www.dea.icai.upco.es/romano/linux/vaio-conf/laptop-config.html
> 
> Next week is Easter holiday here, I will try to connect my Psion casio as
> serial terminal and see if I can catch something. 
> 
>Romano 

Sorry that I didn't answer earlier, but I didn't have much time
last week. Unfortunately my laptop has a serial port only via a docking
station, which I don't have. So I tried logging via netconsole. This
generally worked, but when I try to enter S3 the last thing I get is
"PM: Entering state", and the laptop never enters S3; it just hangs there
forever. So sadly I couldn't get more information.

If anyone has any idea what else I could do to either fix this problem
or get more information about it, please tell me and I'll try :)

Maxi


Re: Call for help: list of machines with working S3

2005-03-17 Thread Maximilian Engelhardt
On Mon, 2005-02-14 at 22:20 +0100, Pavel Machek wrote:
> Hi!
> 
> Stefan provided me initial list of machines where S3 works (including
> video). If you have machine that is not on the list, please send me a
> diff. If you have eMachines... I'd like you to try playing with
> vbetool (it worked for me), and if it works for you supplying right
> model numbers.
> 
>   Pavel
> 
> 
>   Video issues with S3 resume
>   ~~~
> 2003-2005, Pavel Machek
> 
> During S3 resume, hardware needs to be reinitialized. For most
> devices, this is easy, and kernel driver knows how to do
> it. Unfortunately there's one exception: video card. Those are usually
> initialized by BIOS, and kernel does not have enough information to
> boot video card. (Kernel usually does not even contain video card
> driver -- vesafb and vgacon are widely used).
> 
> This is not a problem for swsusp, because during swsusp resume, BIOS is
> run normally so video card is normally initialized. S3 has absolutely
> no chance to work with SMP/HT. Be sure to turn it off before
> testing (swsusp should work ok, OTOH).
> 
> There are a few types of systems where video works after S3 resume:
> 
> (1) systems where video state is preserved over S3.
> 
> (2) systems where it is possible to call video bios during S3
>   resume. Unfortunately, it is not correct to call video BIOS at that
>   point, but it happens to work on some machines. Use
>   acpi_sleep=s3_bios.
> 
> (3) systems that initialize video card into vga text mode and where BIOS
>   works well enough to be able to set video mode. Use
>   acpi_sleep=s3_mode on these.
> 
> (4) on some systems s3_bios kicks video into text mode, and
>   acpi_sleep=s3_bios,s3_mode is needed.
> 
> (5) radeon systems, where X can soft-boot your video card. You'll need
>   patched X, and plain text console (no vesafb or radeonfb), see
>   http://www.doesi.gmxhome.de/linux/tm800s3/s3.html.
> 
> (6) other radeon systems, where vbetool is enough to bring system back
>   to life. Do vbetool vbestate save > /tmp/delme; echo 3 > /proc/acpi/sleep;
>   vbetool post; vbetool vbestate restore < /tmp/delme; setfont
>   <whatever>, and your video should work.

I tried all this on my laptop but nothing seems to work for me.
I do "echo 3 > /proc/acpi/sleep" and the system seems to go into S3.
When I press a key to wake it up again it powers up, but I get nothing
but a black screen. It's not only the video card that's not working,
because the only thing it reacts to is SysRq (without screen of course).
One additional thing I found is that in this state the HDD LED stays
lit all the time until I reboot my system. After rebooting I
couldn't find anything interesting in my logs.
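
To make the order of the steps in item (6) explicit, here is a small Python
sketch that only builds and prints the command sequence (a dry run; nothing is
executed). The function name and the optional font argument are illustrative
assumptions. The commands themselves are the ones quoted above and would need
to be run as root.

```python
import shlex


def s3_vbetool_commands(state_file="/tmp/delme", font=None):
    """Return the shell commands of the item (6) sequence, in order."""
    path = shlex.quote(state_file)
    commands = [
        f"vbetool vbestate save > {path}",     # save video state before sleep
        "echo 3 > /proc/acpi/sleep",           # enter S3
        "vbetool post",                        # re-POST the video card
        f"vbetool vbestate restore < {path}",  # restore the saved state
    ]
    if font is not None:
        # The quoted text leaves the font open ("setfont <whatever>").
        commands.append(f"setfont {shlex.quote(font)}")
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in s3_vbetool_commands():
        print(command)
```

Printing the sequence first makes it easy to check the state file path before
trying it for real on a machine that may not come back from S3.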

Is there any way I could get S3 working on my laptop?

some data:
Acer Travel Mate 661lci
Gentoo Base System version 1.6.10
kernel 2.6.11

I did all this testing with a minimal kernel that only had the
absolutely necessary drivers.

Thanks for help,
Maxi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

