Re: [2.6.25-rc2] e100: Trying to free already-free IRQ 11 during suspend ...
Kok, Auke wrote:
> Kok, Auke wrote:
>> Andrew Morton wrote:
>>> On Sun, 17 Feb 2008 15:36:50 +0300 Andrey Borzenkov <[EMAIL PROTECTED]> wrote:
>>>
>>>> ... and possibly reboot/poweroff (it flows by too fast to be legible).
>>>>
>>>> [ 8803.850634] ACPI: Preparing to enter system sleep state S3
>>>> [ 8803.853141] Suspending console(s)
>>>> [ 8805.287505] serial 00:09: disabled
>>>> [ 8805.291564] Trying to free already-free IRQ 11
>>>> [ 8805.291579] Pid: 6920, comm: pm-suspend Not tainted 2.6.25-rc2-1avb #2
>>>> [ 8805.291628] [<c0152127>] free_irq+0xb7/0x130
>>>> [ 8805.291675] [<c024bd80>] e100_suspend+0xc0/0x100
>>>> [ 8805.291724] [<c01eaa36>] pci_device_suspend+0x26/0x70
>>>> [ 8805.291747] [<c0243674>] suspend_device+0x94/0xd0
>>>> [ 8805.291763] [<c02439a3>] device_suspend+0x153/0x240
>>>> [ 8805.291784] [<c014314f>] suspend_devices_and_enter+0x4f/0xf0
>>>> [ 8805.291808] [<c0143a5f>] ? freeze_processes+0x3f/0x80
>>>> [ 8805.291825] [<c01432fa>] enter_state+0xaa/0x140
>>>> [ 8805.291840] [<c014341f>] state_store+0x8f/0xd0
>>>> [ 8805.291852] [<c0143390>] ? state_store+0x0/0xd0
>>>> [ 8805.291866] [<c01d3404>] kobj_attr_store+0x24/0x30
>>>> [ 8805.291901] [<c01b547b>] sysfs_write_file+0xbb/0x110
>>>> [ 8805.291936] [<c0177d79>] vfs_write+0x99/0x130
>>>> [ 8805.291963] [<c01b53c0>] ? sysfs_write_file+0x0/0x110
>>>> [ 8805.291979] [<c01782fd>] sys_write+0x3d/0x70
>>>> [ 8805.291998] [<c010409a>] sysenter_past_esp+0x5f/0xa5
>>>> [ 8805.292038] ===
>>>> [ 8805.347640] ACPI: PCI interrupt for device :00:06.0 disabled
>>>> [ 8805.361128] ACPI: PCI interrupt for device :00:02.0 disabled
>>>> [ 8805.376670] hwsleep-0322 [00] enter_sleep_state : Entering sleep state [S3]
>>>> [ 8805.376670] Back to C!
>>>>
>>>> Interface is unused normally (only for netconsole sometimes). dmesg and
>>>> config attached.
>>>
>>> Does reverting this:
>>>
>>> commit 8543da6672b0994921f014f2250e27ae81645580
>>> Author: Auke Kok <[EMAIL PROTECTED]>
>>> Date: Wed Dec 12 16:30:42 2007 -0800
>>>
>>>     e100: free IRQ to remove warning when rebooting
>>>
>>> with this patch:
>>>
>>> --- a/drivers/net/e100.c~revert-1
>>> +++ a/drivers/net/e100.c
>>> @@ -2804,9 +2804,8 @@ static int e100_suspend(struct pci_dev *
>>>  		pci_enable_wake(pdev, PCI_D3cold, 0);
>>>  	}
>>>  
>>> -	free_irq(pdev->irq, netdev);
>>> -
>>>  	pci_disable_device(pdev);
>>> +	free_irq(pdev->irq, netdev);
>>>  	pci_set_power_state(pdev, PCI_D3hot);
>>>  
>>>  	return 0;
>>> @@ -2848,8 +2847,6 @@ static void e100_shutdown(struct pci_dev
>>>  		pci_enable_wake(pdev, PCI_D3cold, 0);
>>>  	}
>>>  
>>> -	free_irq(pdev->irq, netdev);
>>> -
>>>  	pci_disable_device(pdev);
>>>  	pci_set_power_state(pdev, PCI_D3hot);
>>>  }
>>> _
>>>
>>> fix it?
>>>
>>>> Hmm ... after resume device has disappeared at all ...
>>>>
>>>> {pts/1}% cat /proc/interrupts
>>>>            CPU0
>>>>   0:    1290492    XT-PIC-XT    timer
>>>>   1:       6675    XT-PIC-XT    i8042
>>>>   2:          0    XT-PIC-XT    cascade
>>>>   3:          2    XT-PIC-XT
>>>>   4:          2    XT-PIC-XT
>>>>   5:          3    XT-PIC-XT
>>>>   7:          4    XT-PIC-XT    irda0
>>>>   8:          0    XT-PIC-XT    rtc0
>>>>   9:        583    XT-PIC-XT    acpi
>>>>  10:          2    XT-PIC-XT
>>>>  11:      31483    XT-PIC-XT    yenta, yenta, yenta, ohci_hcd:usb1, ALI 5451, pcmcia0.0
>>>>  12:      28070    XT-PIC-XT    i8042
>>>>  14:      21705    XT-PIC-XT    ide0
>>>>  15:      82123    XT-PIC-XT    ide1
>>>> NMI:          0   Non-maskable interrupts
>>>> TRM:          0   Thermal event interrupts
>>>> SPU:          0   Spurious interrupts
>>>> ERR:          0
>>>
>>> I hope that's not a separate bug...
>>
>> I'll take a look at this as well. thanks for reporting.
>
> ok, I just had a repro - on a regular shutdown even.
>
> this always worked before - I'm not blaming anything yet but something in the
> pci shutdown code must now be freeing our irq for us (I'm not using anything
> fancy to autoconfigure my network here). I definitely do not see this with
> 2.6.24 either.

can you try this patch please? It rewrites e100 to do suspend/shutdown just
like e1000e does, which is much more tested than anything else. I don't get
the IRQ message anymore - but I haven't gotten to testing suspend/resume just
yet.

Auke

e100: Do suspend/shutdown like e1000

This fixes a "Trying to free already-free IRQ" message and simplifies
the shutdown/suspend code by re-using already existing code when
going to suspend. The code is now symmetric with e100_resume.

Signed-off-by: Auke Kok <[EMAIL PROTECTED]>

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 36ba6dc..cdf3090 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2782,16 +2782,13 @@ static void __devexit e100_remove(struct pci_dev *pdev)
 	}
 }
 
-#ifdef CONFIG_PM
 static int e100_suspend(struct pci_dev *pdev, pm_message_t state)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct nic *nic = netdev_priv(netdev);
 
 	if (netif_running(netdev))
-		napi_disable(&nic->napi);
-	del_timer_sync(&nic->watchdog);
-	netif_carrier_off(nic->netdev);
+		e100_down(nic);
 	netif_device_detach(netdev);
 
 	pci_save_state(pdev);
@@ -2804,14 +2801,13 @@ static int e100_suspend(struct pci_dev *pdev
Re: [2.6.25-rc2] e100: Trying to free already-free IRQ 11 during suspend ...
Kok, Auke wrote:
> Andrew Morton wrote:
>> On Sun, 17 Feb 2008 15:36:50 +0300 Andrey Borzenkov <[EMAIL PROTECTED]> wrote:
>>
>>> ... and possibly reboot/poweroff (it flows by too fast to be legible).
>>>
>>> [ 8803.850634] ACPI: Preparing to enter system sleep state S3
>>> [ 8803.853141] Suspending console(s)
>>> [ 8805.287505] serial 00:09: disabled
>>> [ 8805.291564] Trying to free already-free IRQ 11
>>> [ 8805.291579] Pid: 6920, comm: pm-suspend Not tainted 2.6.25-rc2-1avb #2
>>> [ 8805.291628] [<c0152127>] free_irq+0xb7/0x130
>>> [ 8805.291675] [<c024bd80>] e100_suspend+0xc0/0x100
>>> [ 8805.291724] [<c01eaa36>] pci_device_suspend+0x26/0x70
>>> [ 8805.291747] [<c0243674>] suspend_device+0x94/0xd0
>>> [ 8805.291763] [<c02439a3>] device_suspend+0x153/0x240
>>> [ 8805.291784] [<c014314f>] suspend_devices_and_enter+0x4f/0xf0
>>> [ 8805.291808] [<c0143a5f>] ? freeze_processes+0x3f/0x80
>>> [ 8805.291825] [<c01432fa>] enter_state+0xaa/0x140
>>> [ 8805.291840] [<c014341f>] state_store+0x8f/0xd0
>>> [ 8805.291852] [<c0143390>] ? state_store+0x0/0xd0
>>> [ 8805.291866] [<c01d3404>] kobj_attr_store+0x24/0x30
>>> [ 8805.291901] [<c01b547b>] sysfs_write_file+0xbb/0x110
>>> [ 8805.291936] [<c0177d79>] vfs_write+0x99/0x130
>>> [ 8805.291963] [<c01b53c0>] ? sysfs_write_file+0x0/0x110
>>> [ 8805.291979] [<c01782fd>] sys_write+0x3d/0x70
>>> [ 8805.291998] [<c010409a>] sysenter_past_esp+0x5f/0xa5
>>> [ 8805.292038] ===
>>> [ 8805.347640] ACPI: PCI interrupt for device :00:06.0 disabled
>>> [ 8805.361128] ACPI: PCI interrupt for device :00:02.0 disabled
>>> [ 8805.376670] hwsleep-0322 [00] enter_sleep_state : Entering sleep state [S3]
>>> [ 8805.376670] Back to C!
>>>
>>> Interface is unused normally (only for netconsole sometimes). dmesg and
>>> config attached.
>>
>> Does reverting this:
>>
>> commit 8543da6672b0994921f014f2250e27ae81645580
>> Author: Auke Kok <[EMAIL PROTECTED]>
>> Date: Wed Dec 12 16:30:42 2007 -0800
>>
>>     e100: free IRQ to remove warning when rebooting
>>
>> with this patch:
>>
>> --- a/drivers/net/e100.c~revert-1
>> +++ a/drivers/net/e100.c
>> @@ -2804,9 +2804,8 @@ static int e100_suspend(struct pci_dev *
>>  		pci_enable_wake(pdev, PCI_D3cold, 0);
>>  	}
>>  
>> -	free_irq(pdev->irq, netdev);
>> -
>>  	pci_disable_device(pdev);
>> +	free_irq(pdev->irq, netdev);
>>  	pci_set_power_state(pdev, PCI_D3hot);
>>  
>>  	return 0;
>> @@ -2848,8 +2847,6 @@ static void e100_shutdown(struct pci_dev
>>  		pci_enable_wake(pdev, PCI_D3cold, 0);
>>  	}
>>  
>> -	free_irq(pdev->irq, netdev);
>> -
>>  	pci_disable_device(pdev);
>>  	pci_set_power_state(pdev, PCI_D3hot);
>>  }
>> _
>>
>> fix it?
>>
>>> Hmm ... after resume device has disappeared at all ...
>>>
>>> {pts/1}% cat /proc/interrupts
>>>            CPU0
>>>   0:    1290492    XT-PIC-XT    timer
>>>   1:       6675    XT-PIC-XT    i8042
>>>   2:          0    XT-PIC-XT    cascade
>>>   3:          2    XT-PIC-XT
>>>   4:          2    XT-PIC-XT
>>>   5:          3    XT-PIC-XT
>>>   7:          4    XT-PIC-XT    irda0
>>>   8:          0    XT-PIC-XT    rtc0
>>>   9:        583    XT-PIC-XT    acpi
>>>  10:          2    XT-PIC-XT
>>>  11:      31483    XT-PIC-XT    yenta, yenta, yenta, ohci_hcd:usb1, ALI 5451, pcmcia0.0
>>>  12:      28070    XT-PIC-XT    i8042
>>>  14:      21705    XT-PIC-XT    ide0
>>>  15:      82123    XT-PIC-XT    ide1
>>> NMI:          0   Non-maskable interrupts
>>> TRM:          0   Thermal event interrupts
>>> SPU:          0   Spurious interrupts
>>> ERR:          0
>>
>> I hope that's not a separate bug...
>
> I'll take a look at this as well. thanks for reporting.

ok, I just had a repro - on a regular shutdown even.

this always worked before - I'm not blaming anything yet but something in the
pci shutdown code must now be freeing our irq for us (I'm not using anything
fancy to autoconfigure my network here). I definitely do not see this with
2.6.24 either.

Auke
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
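Since the warning scrolls by too fast to read on reboot or poweroff, one way to confirm it from a saved kernel log is to filter for the message text. The following is a minimal sketch, not something posted in the thread; the function name `already_free_irqs` is hypothetical, and it only assumes the log line format shown in the report above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read kernel log text on stdin
# and print each IRQ number that appears in a
# "Trying to free already-free IRQ <n>" warning, sorted and de-duplicated.
already_free_irqs() {
    sed -n 's/.*Trying to free already-free IRQ \([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# Possible use on a live system (assumes dmesg is readable by the caller):
#   dmesg | already_free_irqs
```

An empty result only means the warning is absent from the text that was fed in; it says nothing about messages that were lost before the log was captured.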
Re: [2.6.25-rc2, 2.6.24-rc8] page allocation failure...
Andrew Morton wrote:
> On Sun, 17 Feb 2008 13:20:59 + "Daniel J Blueman" <[EMAIL PROTECTED]> wrote:
>
>> I'm still hitting this with e1000e on 2.6.25-rc2, 10 times again.

are you sure? I don't think that's the case and you're seeing e1000 dumps here...

>> It's clearly non-fatal, but then do we expect it to occur?
>>
>> Daniel
>>
>> --- [dmesg]
>>
>> [ 1250.822786] swapper: page allocation failure. order:3, mode:0x4020
>> [ 1250.822786] Pid: 0, comm: swapper Not tainted 2.6.25-rc2-119 #2
>> [ 1250.822786]
>> [ 1250.822786] Call Trace:
>> [ 1250.822786]  <IRQ>  [<8025fe9e>] __alloc_pages+0x34e/0x3a0
>> [ 1250.822786] [<8048c6df>] ? __netdev_alloc_skb+0x1f/0x40
>> [ 1250.822786] [<8027acc2>] __slab_alloc+0x102/0x3d0
>> [ 1250.822786] [<8048c6df>] ? __netdev_alloc_skb+0x1f/0x40
>> [ 1250.822786] [<8027b8cb>] __kmalloc_track_caller+0x7b/0xc0
>> [ 1250.822786] [<8048b74f>] __alloc_skb+0x6f/0x160
>> [ 1250.822786] [<8048c6df>] __netdev_alloc_skb+0x1f/0x40
>> [ 1250.822786] [<8042652d>] e1000_alloc_rx_buffers+0x1ed/0x260
>> [ 1250.822786] [<80426b5a>] e1000_clean_rx_irq+0x22a/0x330
>> [ 1250.822786] [<80422981>] e1000_clean+0x1e1/0x540
>> [ 1250.822786] [<8024b7a5>] ? tick_program_event+0x45/0x70
>> [ 1250.822786] [<804930ba>] net_rx_action+0x9a/0x150
>> [ 1250.822786] [<802336b4>] __do_softirq+0x74/0xf0
>> [ 1250.822786] [<8020c5fc>] call_softirq+0x1c/0x30
>> [ 1250.822786] [<8020eaad>] do_softirq+0x3d/0x80
>> [ 1250.822786] [<80233635>] irq_exit+0x85/0x90
>> [ 1250.822786] [<8020eba5>] do_IRQ+0x85/0x100
>> [ 1250.822786] [<8020a5b0>] ? mwait_idle+0x0/0x50
>> [ 1250.822786] [<8020b981>] ret_from_intr+0x0/0xa
>> [ 1250.822786]  <EOI>  [<8020a5f5>] ? mwait_idle+0x45/0x50
>> [ 1250.822786] [<80209a92>] ? enter_idle+0x22/0x30
>> [ 1250.822786] [<8020a534>] ? cpu_idle+0x74/0xa0
>> [ 1250.822786] [<80527825>] ? rest_init+0x55/0x60
>
> They're regularly reported with e1000 too - I don't think anything really
> changed.
>
> e1000 has this crazy problem where because of a cascade of follies (mainly
> borked hardware) it has to do a 32kb allocation for a 9kb(?) packet. It
> would be sad if that was carried over into e1000e?

can't be, I personally removed that code. for MTU > 1500 e1000e uses a plain
normal sized SKB. for anything bigger e1000e uses pages. so I don't see how
this bug could still be showing up for e1000e at all. The large skb receive
code is all gone (literally, removed).

*please* rmmod e1000; modprobe e1000e and show the dumps again so we know for
sure that we're not looking at e1000 dumps.

short fix: increase ring size for e1000 with `modprobe e1000 RxDescriptors=4096`
(or use ethtool) and `echo -n 8192 > /proc/sys/vm/min_free_kbytes` or something
like that.

what nic hardware is this on? lspci?

Auke
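The numbers in this exchange can be checked with a little arithmetic: an order:3 page allocation asks for 2^3 contiguous pages, which with 4 KiB pages is the 32kb allocation mentioned above. The sketch below is illustrative only. The function names `order_to_bytes` and `nic_driver`, the optional sysfs root argument, and the interface name eth0 are assumptions made for the example, not something from the thread.

```shell
#!/bin/sh
# order_to_bytes ORDER [PAGE_SIZE]
# A page allocation of order N requests 2^N physically contiguous pages.
# PAGE_SIZE defaults to 4096 bytes, which is an assumption that holds on x86.
order_to_bytes() {
    order=$1
    page_size=${2:-4096}
    echo $(( (1 << order) * page_size ))
}

# nic_driver IFACE [SYSFS_ROOT]
# Print the name of the driver bound to IFACE by resolving the sysfs
# "driver" symlink, to tell e1000 from e1000e.
# SYSFS_ROOT is a hypothetical override that exists only to make testing easy.
nic_driver() {
    iface=$1
    root=${2:-/sys}
    link=$(readlink "$root/class/net/$iface/device/driver") || return 1
    basename "$link"
}

# order_to_bytes 3      -> 32768 (the "order:3" failure in the dump)
# nic_driver eth0       -> name of the bound driver (eth0 is an assumed name)
#
# The suggested workaround, as root, using ethtool instead of the module
# parameter (again assuming the interface is eth0):
#   ethtool -G eth0 rx 4096
#   sysctl -w vm.min_free_kbytes=8192
```

Checking the bound driver before collecting dumps addresses the question raised above of whether the traces come from e1000 or e1000e; the `e1000_` symbol prefix alone does not settle it.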
Re: [2.6.25-rc2] e100: Trying to free already-free IRQ 11 during suspend ...
Andrew Morton wrote:
> On Sun, 17 Feb 2008 15:36:50 +0300 Andrey Borzenkov <[EMAIL PROTECTED]> wrote:
>
>> ... and possibly reboot/poweroff (it flows by too fast to be legible).
>>
>> [ 8803.850634] ACPI: Preparing to enter system sleep state S3
>> [ 8803.853141] Suspending console(s)
>> [ 8805.287505] serial 00:09: disabled
>> [ 8805.291564] Trying to free already-free IRQ 11
>> [ 8805.291579] Pid: 6920, comm: pm-suspend Not tainted 2.6.25-rc2-1avb #2
>> [ 8805.291628] [<c0152127>] free_irq+0xb7/0x130
>> [ 8805.291675] [<c024bd80>] e100_suspend+0xc0/0x100
>> [ 8805.291724] [<c01eaa36>] pci_device_suspend+0x26/0x70
>> [ 8805.291747] [<c0243674>] suspend_device+0x94/0xd0
>> [ 8805.291763] [<c02439a3>] device_suspend+0x153/0x240
>> [ 8805.291784] [<c014314f>] suspend_devices_and_enter+0x4f/0xf0
>> [ 8805.291808] [<c0143a5f>] ? freeze_processes+0x3f/0x80
>> [ 8805.291825] [<c01432fa>] enter_state+0xaa/0x140
>> [ 8805.291840] [<c014341f>] state_store+0x8f/0xd0
>> [ 8805.291852] [<c0143390>] ? state_store+0x0/0xd0
>> [ 8805.291866] [<c01d3404>] kobj_attr_store+0x24/0x30
>> [ 8805.291901] [<c01b547b>] sysfs_write_file+0xbb/0x110
>> [ 8805.291936] [<c0177d79>] vfs_write+0x99/0x130
>> [ 8805.291963] [<c01b53c0>] ? sysfs_write_file+0x0/0x110
>> [ 8805.291979] [<c01782fd>] sys_write+0x3d/0x70
>> [ 8805.291998] [<c010409a>] sysenter_past_esp+0x5f/0xa5
>> [ 8805.292038] ===
>> [ 8805.347640] ACPI: PCI interrupt for device :00:06.0 disabled
>> [ 8805.361128] ACPI: PCI interrupt for device :00:02.0 disabled
>> [ 8805.376670] hwsleep-0322 [00] enter_sleep_state : Entering sleep state [S3]
>> [ 8805.376670] Back to C!
>>
>> Interface is unused normally (only for netconsole sometimes). dmesg and
>> config attached.
>
> Does reverting this:
>
> commit 8543da6672b0994921f014f2250e27ae81645580
> Author: Auke Kok <[EMAIL PROTECTED]>
> Date: Wed Dec 12 16:30:42 2007 -0800
>
>     e100: free IRQ to remove warning when rebooting
>
> with this patch:
>
> --- a/drivers/net/e100.c~revert-1
> +++ a/drivers/net/e100.c
> @@ -2804,9 +2804,8 @@ static int e100_suspend(struct pci_dev *
>  		pci_enable_wake(pdev, PCI_D3cold, 0);
>  	}
>  
> -	free_irq(pdev->irq, netdev);
> -
>  	pci_disable_device(pdev);
> +	free_irq(pdev->irq, netdev);
>  	pci_set_power_state(pdev, PCI_D3hot);
>  
>  	return 0;
> @@ -2848,8 +2847,6 @@ static void e100_shutdown(struct pci_dev
>  		pci_enable_wake(pdev, PCI_D3cold, 0);
>  	}
>  
> -	free_irq(pdev->irq, netdev);
> -
>  	pci_disable_device(pdev);
>  	pci_set_power_state(pdev, PCI_D3hot);
>  }
> _
>
> fix it?
>
>> Hmm ... after resume device has disappeared at all ...
>>
>> {pts/1}% cat /proc/interrupts
>>            CPU0
>>   0:    1290492    XT-PIC-XT    timer
>>   1:       6675    XT-PIC-XT    i8042
>>   2:          0    XT-PIC-XT    cascade
>>   3:          2    XT-PIC-XT
>>   4:          2    XT-PIC-XT
>>   5:          3    XT-PIC-XT
>>   7:          4    XT-PIC-XT    irda0
>>   8:          0    XT-PIC-XT    rtc0
>>   9:        583    XT-PIC-XT    acpi
>>  10:          2    XT-PIC-XT
>>  11:      31483    XT-PIC-XT    yenta, yenta, yenta, ohci_hcd:usb1, ALI 5451, pcmcia0.0
>>  12:      28070    XT-PIC-XT    i8042
>>  14:      21705    XT-PIC-XT    ide0
>>  15:      82123    XT-PIC-XT    ide1
>> NMI:          0   Non-maskable interrupts
>> TRM:          0   Thermal event interrupts
>> SPU:          0   Spurious interrupts
>> ERR:          0
>
> I hope that's not a separate bug...

I'll take a look at this as well. thanks for reporting.

Auke
Re: 2.6.24-mm1 bugs
Miklos Szeredi wrote:
>> the register dump looks OK as far as I can see. Since initialization
>> works OK and the adapter seems to be setup OK reading from the
>> register dump, I'm not sure at all what is going on.
>>
>> can you try manually ifup-ing the device and running tcpdump? do you
>> see packets coming in?
>
> Well it does seem to receive the DHCP reply.
>
>> just for kicks, have you tried a different cable? or is the adapter
>> consistently working properly using a different kernel/driver?
>
> It has been consistently working properly for a long time with various
> kernels (and the e1000 driver). I've now switched to 2.6.24-mm1 and
> e1000e, and it's almost consistently broken after boot and resume, and
> sometimes needs several tries (with the KDE NetworkManager thingy) to
> make it work.

maybe it's worth trying linux-2.6.git instead for now? The e1000e driver is
actually involved in the dhcp handshake so we can't just rule a bug out yet.
However, with my own t60 things appear to work just fine.

how does manually getting a dhcp address go? can you eliminate networkmanager
perhaps? just kill & restart dhclient over and over again...

Auke
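The "kill & restart dhclient over and over again" test can be scripted so that the success rate is counted rather than eyeballed. This is a rough sketch under stated assumptions: the helper names `count_successes` and `dhcp_once` are hypothetical, the client is assumed to be ISC dhclient (whose `-r` releases the current lease and `-1` gives up after one failed attempt), and eth0 is an assumed interface name.

```shell
#!/bin/sh
# count_successes N CMD [ARGS...]
# Run CMD N times and print how many of the runs exited with status 0.
count_successes() {
    n=$1
    shift
    ok=0
    i=0
    while [ "$i" -lt "$n" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok"
}

# dhcp_once IFACE
# One DHCP attempt: release any existing lease, then try exactly once to
# obtain a new one. The exit status is that of the second dhclient call.
# Assumes ISC dhclient; other DHCP clients use different options.
dhcp_once() {
    dhclient -r "$1"
    dhclient -1 "$1"
}

# Possible use, as root and with NetworkManager stopped:
#   count_successes 10 dhcp_once eth0
```

Running the same loop under both e1000 and e1000e would give comparable counts, which helps separate a driver problem from a dhclient or NetworkManager one.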
Re: 2.6.24-mm1 bugs
Miklos Szeredi wrote:
>> OK. can you download, install and run `ethregs -i eth0` (from
>> e1000.sf.net) and send me the output? I'll compare with a known
>> working t60 I have here and see if anything shows up.
>
> OK, attached.
>
>> Also, post me the dmesg from after the adapter fails to load
>> properly.
>
> Hmm, nothing in dmesg. Looking at syslog, I'm not even sure it's a
> driver issue, it could also be dhclient or NetworkManager bug. The
> strange thing is that this only happens on -mm.
>
> These are the relevant syslog messages for an unsuccessful attempt:
>
> ==
> Feb 15 12:27:32 tucsk kernel: [ 35.169183] :02:00.0: eth0: Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
> Feb 15 12:27:32 tucsk kernel: [ 35.169190] :02:00.0: eth0: 10/100 speed: disabling TSO
> Feb 15 12:27:40 tucsk dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 4
> Feb 15 12:27:40 tucsk dhclient: DHCPOFFER from 192.168.0.1
> Feb 15 12:27:45 tucsk dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> Feb 15 12:27:45 tucsk dhclient: DHCPACK from 192.168.0.1
> Feb 15 12:27:46 tucsk dhclient: bound to 192.168.0.9 -- renewal in 1539 seconds.
> Feb 15 12:28:20 tucsk dhclient: caught deadly SIGTERM
> Feb 15 12:28:20 tucsk dhclient: could not restore resolv.conf: No such file or directory
> Feb 15 12:28:20 tucsk dhclient: DHCPRELEASE on eth0 to 192.168.0.1 port 67
> Feb 15 12:28:20 tucsk dhclient: send_packet: Network is unreachable
> Feb 15 12:28:20 tucsk dhclient: send_packet: please consult README file regarding broadcast address.
> ==
>
> And then for the successful one:
>
> ==
> Feb 15 12:29:17 tucsk dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 2
> Feb 15 12:29:17 tucsk dhclient: DHCPOFFER from 192.168.0.1
> Feb 15 12:29:22 tucsk dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> Feb 15 12:29:22 tucsk dhclient: DHCPACK from 192.168.0.1
> Feb 15 12:29:22 tucsk dhclient: bound to 192.168.0.9 -- renewal in 1543 seconds.
> ==
>
> Thanks,
> Miklos
>
>
>
> 02:00.0 (8086:109a)
> Unknown device 8086:109a
> Name Value
> ~
> CTRL 18140248

the register dump looks OK as far as I can see. Since initialization works OK
and the adapter seems to be setup OK reading from the register dump, I'm not
sure at all what is going on.

can you try manually ifup-ing the device and running tcpdump? do you see
packets coming in?

just for kicks, have you tried a different cable? or is the adapter
consistently working properly using a different kernel/driver?

Auke
Re: 2.6.24-mm1 bugs
Miklos Szeredi wrote:
>>> - network doesn't always come up at first try (e1000e). On 2.6.24
>>>   e1000e doesn't seem to work at all, so I use e1000, but that has
>>>   other problems.
>
> It does seem to be using MSI interrupts:
>
>            CPU0       CPU1
>   0:    2994380          1   IO-APIC-edge      timer
>   1:         24          0   IO-APIC-edge      i8042
>   8:          1          0   IO-APIC-edge      rtc
>   9:       2107          0   IO-APIC-fasteoi   acpi
>  12:       3570          0   IO-APIC-edge      i8042
>  14:         92          0   IO-APIC-edge      ide0
>  16:     124168          0   IO-APIC-fasteoi   uhci_hcd:usb2
>  17:     287458          0   IO-APIC-fasteoi   uhci_hcd:usb3, HDA Intel
>  18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
>  19:         70          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5
> 313:      75321          5   PCI-MSI-edge      eth0

OK. can you download, install and run `ethregs -i eth0` (from e1000.sf.net)
and send me the output? I'll compare with a known working t60 I have here and
see if anything shows up.

Also, post me the dmesg from after the adapter fails to load properly.

thanks,

Auke
Re: 2.6.24-mm1 bugs
Kok, Auke wrote:
> Miklos Szeredi wrote:
>> - network doesn't always come up at first try (e1000e). On 2.6.24
>>   e1000e doesn't seem to work at all, so I use e1000, but that has
>>   other problems.
>
> Andy Gospodarek pointed out a possible problem with e1000e if you are not
> using MSI interrupts (e.g. booting with pci=nomsi or CONFIG_PCI_MSI=n). Can
> you confirm that you are not using MSI irqs? If so, then we have a patch
> that you can try.

oops, I'm confusing things here, ignore that

> IOW post your `cat /proc/interrupts` please :)

please send this, as well as full dmesg and anything else you think might be
important.

Auke
Re: 2.6.24-mm1 bugs
Miklos Szeredi wrote:
> - network doesn't always come up at first try (e1000e). On 2.6.24
>   e1000e doesn't seem to work at all, so I use e1000, but that has
>   other problems.

Andy Gospodarek pointed out a possible problem with e1000e if you are not
using MSI interrupts (e.g. booting with pci=nomsi or CONFIG_PCI_MSI=n). Can
you confirm that you are not using MSI irqs? If so, then we have a patch that
you can try.

IOW post your `cat /proc/interrupts` please :)

Auke
Re: 2.6.24-mm1 bugs
Miklos Szeredi wrote:
>> the register dump looks OK as far as I can see. Since initialization works
>> OK and the adapter seems to be setup OK reading from the register dump, I'm
>> not sure at all what is going on.
>>
>> can you try manually ifup-ing the device and running tcpdump? do you see
>> packets coming in?
>
> Well it does seem to receive the DHCP reply.
>
>> just for kicks, have you tried a different cable? or is the adapter
>> consistently working properly using a different kernel/driver?
>
> It has been consistently working properly for a long time with various
> kernels (and the e1000 driver). I've now switched to 2.6.24-mm1 and
> e1000e, and it's almost consistently broken after boot and resume, and
> sometimes needs several tries (with the KDE NetworkManager thingy) to
> make it work.

maybe it's worth trying linux-2.6.git instead for now? The e1000e driver is
actually involved in the dhcp handshake so we can't just rule a bug out yet.
However, with my own t60 things appear to work just fine.

how does manually getting a dhcp address go? can you eliminate networkmanager
perhaps? just kill and restart dhclient over and over again...

Auke
Re: + drivers-net-e1000-use-field_sizeof.patch added to -mm tree
[EMAIL PROTECTED] wrote:
> The patch titled
>      drivers/net/e1000: Use FIELD_SIZEOF
> has been added to the -mm tree. Its filename is
>      drivers-net-e1000-use-field_sizeof.patch
>
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> out what to do about this
>
> The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/

FYI Jeff already merged this (all 3 for the intel nics) in #upstream fixes as
a single patch ...

Auke
Re: E1000 (PCI-E) doesn't work on nforce430, MSI issue.
Prakash Punnoor wrote:
> On the day of Tuesday 12 February 2008 Krzysztof Halasa hast written:
>> Hi,
>>
>> Is it a known problem?
>> Linux 2.6.24.2, ASUS M2NPV-MX mobo, nforce 430 based, two PCI-E x1
>> E1000 cards, 32-bit kernel, default e1000 driver (PCI IDs disabled in
>> e1000e).

your card will work with e1000e from 2.6.25 onwards.

>> Don't work by default. "pci=nomsi" fixes the problem.

actually does not fix anything - it just works around it by falling back to
legacy interrupts.

> Probably the patch to enable msi bit on the host bridge(?) is still missing
> in mainline. I needed that patch to make msi work on my MCP51 system.

and there have been many more reports. unfortunately e1000 cards are one of
the few things that actually use MSI. It works great except for a few
problematic motherboards, and the ones mentioned in this thread fall amongst
them.

Auke
Re: [E1000-devel] [PATCH 6/8] drivers/net/igb: Use FIELD_SIZEOF
Julia Lawall wrote:
> From: Julia Lawall <[EMAIL PROTECTED]>
>
> Robert P.J. Day proposed to use the macro FIELD_SIZEOF in place of code
> that matches its definition.

thanks for the (3) patches, I'll make sure they get merged.

Cheers,

Auke

>
> The modification was made using the following semantic patch
> (http://www.emn.fr/x-info/coccinelle/)
>
> // <smpl>
> @haskernel@
> @@
>
> #include <linux/kernel.h>
>
> @depends on haskernel@
> type t;
> identifier f;
> @@
>
> - (sizeof(((t*)0)->f))
> + FIELD_SIZEOF(t, f)
>
> @depends on haskernel@
> type t;
> identifier f;
> @@
>
> - sizeof(((t*)0)->f)
> + FIELD_SIZEOF(t, f)
> // </smpl>
>
> Signed-off-by: Julia Lawall <[EMAIL PROTECTED]>
>
> ---
>
> diff -u -p a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
> --- a/drivers/net/igb/igb_ethtool.c 2008-02-02 15:28:20.0 +0100
> +++ b/drivers/net/igb/igb_ethtool.c 2008-02-10 18:06:35.0 +0100
> @@ -43,7 +43,7 @@ struct igb_stats {
>  	int stat_offset;
>  };
>
> -#define IGB_STAT(m) sizeof(((struct igb_adapter *)0)->m), \
> +#define IGB_STAT(m) FIELD_SIZEOF(struct igb_adapter, m), \
>  		offsetof(struct igb_adapter, m)
>  static const struct igb_stats igb_gstrings_stats[] = {
>  	{ "rx_packets", IGB_STAT(stats.gprc) },
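For readers without a kernel tree at hand, the substitution in the patch above can be tried in userspace. The sketch below copies the macro definition that the semantic patch matches; everything prefixed `demo_` is a hypothetical stand-in for `struct igb_adapter`, `struct igb_stats` and `IGB_STAT`, not the driver's real layout.

```c
#include <stddef.h>

/* Same expansion the semantic patch matches: the size of member f of type t,
 * taken through a null pointer that is never dereferenced (sizeof does not
 * evaluate its operand). */
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))

/* Hypothetical stand-ins for the driver structures. */
struct demo_hw_stats {
    unsigned long long gprc;    /* good packets received */
    unsigned int crcerrs;       /* CRC errors */
};

struct demo_adapter {
    int flags;
    struct demo_hw_stats stats;
};

struct demo_stat {
    const char *name;
    size_t sizeof_stat;
    size_t stat_offset;
};

/* Mirrors the shape of IGB_STAT(m): member size, then member offset. */
#define DEMO_STAT(m) FIELD_SIZEOF(struct demo_adapter, m), \
                     offsetof(struct demo_adapter, m)

const struct demo_stat demo_gstrings_stats[] = {
    { "rx_packets",    DEMO_STAT(stats.gprc) },
    { "rx_crc_errors", DEMO_STAT(stats.crcerrs) },
};
```

Because the macro expands to exactly the expression it replaces, the table contents are unchanged by the patch; the gain is purely readability.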
Re: Fwd: Re: e1000 1sec latency problem
Martin Rogge wrote:
> On Saturday 09 February 2008 11:07:26 Martin Rogge wrote:
>> Hi,
>>
>> I am not so familiar with the various mailing lists and missed out on
>> [EMAIL PROTECTED] the first time. Please cc me on any
>> replies.
>>
>> I am looking for help with either making the e1000e driver work on my
>> Thinkpad T60 or fixing the 1s latency issue with e1000.
>>
>> To be honest, I do not understand why the e1000e driver failed to recognize
>> the NIC when I tried. At least, I noticed the correct device ID is defined
>> in drivers/net/e1000e/hw.h:
>>
>> #define E1000_DEV_ID_82573L 0x109A
>>
>> Any help is appreciated.
>>
>> Thanks,
>>
>> Martin
>>
>> -- Forwarded Message --
>>
>> Subject: Re: e1000 1sec latency problem
>> Date: Thursday 07 February 2008
>> From: Martin Rogge <[EMAIL PROTECTED]>
>> To: linux-kernel@vger.kernel.org
>>
>> Pavel Machek wrote:
>>> Hi!
>>>
>>> I have the famous e1000 latency problems:
>> Hi, I have the same problem with my Thinkpad T60.
>>
>> [EMAIL PROTECTED]:~# ping arnold
>> PING arnold (192.168.158.6) 56(84) bytes of data.
>> 64 bytes from arnold (192.168.158.6): icmp_seq=1 ttl=64 time=49.7 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=2 ttl=64 time=0.438 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=3 ttl=64 time=1000 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=4 ttl=64 time=0.970 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=5 ttl=64 time=885 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=6 ttl=64 time=0.484 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=7 ttl=64 time=529 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=8 ttl=64 time=1.02 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=9 ttl=64 time=149 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=10 ttl=64 time=0.549 ms
>> 64 bytes from arnold (192.168.158.6): icmp_seq=11 ttl=64 time=0.829 ms
>>
>> --- arnold ping statistics ---
>> 11 packets transmitted, 11 received, 0% packet loss, time ms
>> rtt min/avg/max/mdev = 0.438/238.113/1000.967/365.279 ms, pipe 2
>> [EMAIL PROTECTED]:~# uname -a
>> Linux zorro 2.6.24 #6 SMP PREEMPT Sun Feb 3 18:27:48 CET 2008 i686 Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz GenuineIntel GNU/Linux
>> [EMAIL PROTECTED]:~# lspci -vvv
>
> [stuff deleted]
>
>> Unfortunately the e1000e driver is not an option as it will not detect the
>> NIC:
>>
>> from dmesg with e1000 compiled in:
>> Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAPI
>> Copyright (c) 1999-2006 Intel Corporation.
>> ACPI: PCI Interrupt :02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
>> PCI: Setting latency timer of device :02:00.0 to 64
>> e1000: :02:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:15:58:c3:3a:71
>> e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
>>
>> from dmesg with e1000e compiled in:
>> e1000e: Intel(R) PRO/1000 Network Driver - 0.2.0
>> e1000e: Copyright (c) 1999-2007 Intel Corporation.
>>
>> Any pointers?
>>
>> Thanks,
>>
>> Martin
>>
>> ---
>
> Just for the record, I googled the following solution for the Lenovo T60:
>
> (a) use the e1000 driver
> (b) if compiling as a module, add the following parameter to modprobe.conf:
>     options e1000 RxIntDelay=5
> (c) if compiling a static driver, use the following patch (based on 2.6.24):
>
> --- e1000_param.c.orig 2008-01-24 23:58:37.0 +0100
> +++ e1000_param.c 2008-02-09 20:42:23.0 +0100
> @@ -158,7 +158,7 @@
>   * Valid Range: 0-65535
>   */
>  E1000_PARAM(RxIntDelay, "Receive Interrupt Delay");
> -#define DEFAULT_RDTR 0
> +#define DEFAULT_RDTR 5
>  #define MAX_RXDELAY 0x
>  #define MIN_RXDELAY 0
>
> After reboot, the average ping time is still a factor of 10 worse than it
> should be, but it stays below 2 ms (which is a remarkable improvement
> compared to 1000 ms).

correct, this was a workaround which improved things for most people, but did
not *fix* it.

the real fix is to disable L1 ASPM altogether at the cost of more power
consumption, which is what is in e1000e in 2.6.25-git.

Cheers,

Auke
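To compare runs before and after the RxIntDelay workaround, the round-trip times from a ping log can be reduced to a few numbers. The following is a minimal sketch with made-up names (`struct rtt_summary`, `summarize_rtt`) and an arbitrary caller-chosen spike threshold; it is not part of ping or of either driver.

```c
#include <stddef.h>

/* Hypothetical summary of a series of ping round-trip times. */
struct rtt_summary {
    double min_ms;
    double avg_ms;
    double max_ms;
    size_t spikes;      /* samples strictly above the spike threshold */
};

/* Summarize n round-trip times (in milliseconds). A sample counts as a
 * spike when it exceeds spike_ms. With n == 0 every field is zero. */
struct rtt_summary summarize_rtt(const double *ms, size_t n, double spike_ms)
{
    struct rtt_summary s = { 0.0, 0.0, 0.0, 0 };
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return s;

    s.min_ms = s.max_ms = ms[0];
    for (i = 0; i < n; i++) {
        if (ms[i] < s.min_ms)
            s.min_ms = ms[i];
        if (ms[i] > s.max_ms)
            s.max_ms = ms[i];
        if (ms[i] > spike_ms)
            s.spikes++;
        sum += ms[i];
    }
    s.avg_ms = sum / (double)n;
    return s;
}
```

Fed the eleven samples from the ping run quoted above with a 100 ms threshold, four samples (1000, 885, 529 and 149 ms) count as spikes, which matches the roughly every-other-packet stall visible in the log.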
Re: Fwd: Re: e1000 1sec latency problem
Ray Lee wrote:
> On Feb 9, 2008 1:51 PM, Kok, Auke <[EMAIL PROTECTED]> wrote:
>> Martin Rogge wrote:
>>> On Saturday 09 February 2008 11:07:26 Martin Rogge wrote:
>>>> Hi,
>>>>
>>>> I am not so familiar with the various mailing lists and missed out on
>>>> [EMAIL PROTECTED] the first time. Please cc me on any
>>>> replies.
>>>>
>>>> I am looking for help with either making the e1000e driver work on my
>>>> Thinkpad T60 or fixing the 1s latency issue with e1000.
>>>>
>>>> To be honest, I do not understand why the e1000e driver failed to recognize
>>>> the NIC when I tried. At least, I noticed the correct device ID is defined
>>>> in drivers/net/e1000e/hw.h:
>>>>
>>>> #define E1000_DEV_ID_82573L 0x109A
>>>>
>>>> Any help is appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> Martin
>>>>
>>>> -- Forwarded Message --
>>>>
>>>> Subject: Re: e1000 1sec latency problem
>>>> Date: Thursday 07 February 2008
>>>> From: Martin Rogge <[EMAIL PROTECTED]>
>>>> To: linux-kernel@vger.kernel.org
>>>>
>>>> Pavel Machek wrote:
>>>>> Hi!
>>>>>
>>>>> I have the famous e1000 latency problems:
>>>> Hi, I have the same problem with my Thinkpad T60.
>>>>
>>>> [EMAIL PROTECTED]:~# ping arnold
>>>> PING arnold (192.168.158.6) 56(84) bytes of data.
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=1 ttl=64 time=49.7 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=2 ttl=64 time=0.438 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=3 ttl=64 time=1000 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=4 ttl=64 time=0.970 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=5 ttl=64 time=885 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=6 ttl=64 time=0.484 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=7 ttl=64 time=529 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=8 ttl=64 time=1.02 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=9 ttl=64 time=149 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=10 ttl=64 time=0.549 ms
>>>> 64 bytes from arnold (192.168.158.6): icmp_seq=11 ttl=64 time=0.829 ms
>>>>
>>>> --- arnold ping statistics ---
>>>> 11 packets transmitted, 11 received, 0% packet loss, time ms
>>>> rtt min/avg/max/mdev = 0.438/238.113/1000.967/365.279 ms, pipe 2
>>>> [EMAIL PROTECTED]:~# uname -a
>>>> Linux zorro 2.6.24 #6 SMP PREEMPT Sun Feb 3 18:27:48 CET 2008 i686 Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz GenuineIntel GNU/Linux
>>>> [EMAIL PROTECTED]:~# lspci -vvv
>>>
>>> [stuff deleted]
>>>
>>>> Unfortunately the e1000e driver is not an option as it will not detect the
>>>> NIC:
>>>>
>>>> from dmesg with e1000 compiled in:
>>>> Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAPI
>>>> Copyright (c) 1999-2006 Intel Corporation.
>>>> ACPI: PCI Interrupt :02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
>>>> PCI: Setting latency timer of device :02:00.0 to 64
>>>> e1000: :02:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:15:58:c3:3a:71
>>>> e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
>>>>
>>>> from dmesg with e1000e compiled in:
>>>> e1000e: Intel(R) PRO/1000 Network Driver - 0.2.0
>>>> e1000e: Copyright (c) 1999-2007 Intel Corporation.
>>>>
>>>> Any pointers?
>>>>
>>>> Thanks,
>>>>
>>>> Martin
>>>>
>>>> ---
>>>
>>> Just for the record, I googled the following solution for the Lenovo T60:
>>>
>>> (a) use the e1000 driver
>>> (b) if compiling as a module, add the following parameter to modprobe.conf:
>>>     options e1000 RxIntDelay=5
>>> (c) if compiling a static driver, use the following patch (based on 2.6.24):
>>>
>>> --- e1000_param.c.orig 2008-01-24 23:58:37.0 +0100
>>> +++ e1000_param.c 2008-02-09 20:42:23.0 +0100
>>> @@ -158,7 +158,7 @@
>>>   * Valid Range: 0-65535
>>>   */
>>>  E1000_PARAM(RxIntDelay, "Receive Interrupt Delay");
>>> -#define DEFAULT_RDTR 0
>>> +#define DEFAULT_RDTR 5
>>>  #define MAX_RXDELAY 0x
>>>  #define MIN_RXDELAY 0
>>>
>>> After reboot, the average ping time is still a factor of 10 worse than it
>>> should be, but it stays below 2 ms (which is a remarkable improvement
>>> compared to 1000 ms).
>>
>> correct, this was a workaround which improved things for most people, but
>> did not *fix* it.
>>
>> the real fix is to disable L1 ASPM altogether at the cost of more power
>> consumption, which is what is in e1000e in 2.6.25-git.
>
> e1000e doesn't recognize his NIC. Will you be adding this to the e1000
> driver as well?

no, from 2.6.25 onwards e1000e will support 82573 nics, so you'll have to
migrate drivers, and you will get the fix automatically that way.

after 2.6.25 releases, support for all pci-e nics will be removed from the
e1000 driver.

Cheers

Auke
Re: [E1000-devel] e1000 1sec latency problem
Pavel Machek wrote:
> On Thu 2008-02-07 14:32:16, Kok, Auke wrote:
>> Pavel Machek wrote:
>>> Hi!
>>>
>>>>> I have the famous e1000 latency problems:
>>>>>
>>>>> 64 bytes from 195.113.31.123: icmp_seq=68 ttl=56 time=351.9 ms
>>>>> 64 bytes from 195.113.31.123: icmp_seq=69 ttl=56 time=209.2 ms
>>>>> 64 bytes from 195.113.31.123: icmp_seq=70 ttl=56 time=1004.1 ms
>>>>> 64 bytes from 195.113.31.123: icmp_seq=71 ttl=56 time=308.9 ms
>>>>> 64 bytes from 195.113.31.123: icmp_seq=72 ttl=56 time=305.4 ms
>>>>> 64 bytes from 195.113.31.123: icmp_seq=73 ttl=56 time=9.8 ms
>>>>> 64 bytes from 195.113.31.123: icmp_seq=74 ttl=56 time=3.7 ms
>>>>>
>>>>> ...and they are still there in 2.6.25-git0. I had ethernet EEPROM
>>>>> checksum problems, which I fixed by the update, but problems are not
>>>>> gone.
>>>> pavel, start using "e1000e" instead - this driver replaces e1000 for all the
>>>> pci-express devices and has the infamous L1 ASPM disable patch to
>>>> fix this issue.
>>> Ok, e1000e seems to work for me.
>>>
>>> In another email, you asked for lspci - of failing e1000
>>> case. Should I still provide it?
>> well, if you do it you should see that L1 ASPM is now disabled (with e1000e)
>> whereas with e1000 it is still enabled. That's the fix that you need...
>
> Is there easy way to push that fix to e1000, too? Or print "use e1000e
> instead" and refuse to load?

well we're going to delete all pci-e related code from this driver soon anyway, but I am indeed writing a patch right now that prints out this warning...

Auke
Re: [E1000-devel] e1000 1sec latency problem
Pavel Machek wrote:
> Hi!
>
>>> I have the famous e1000 latency problems:
>>>
>>> 64 bytes from 195.113.31.123: icmp_seq=68 ttl=56 time=351.9 ms
>>> 64 bytes from 195.113.31.123: icmp_seq=69 ttl=56 time=209.2 ms
>>> 64 bytes from 195.113.31.123: icmp_seq=70 ttl=56 time=1004.1 ms
>>> 64 bytes from 195.113.31.123: icmp_seq=71 ttl=56 time=308.9 ms
>>> 64 bytes from 195.113.31.123: icmp_seq=72 ttl=56 time=305.4 ms
>>> 64 bytes from 195.113.31.123: icmp_seq=73 ttl=56 time=9.8 ms
>>> 64 bytes from 195.113.31.123: icmp_seq=74 ttl=56 time=3.7 ms
>>>
>>> ...and they are still there in 2.6.25-git0. I had ethernet EEPROM
>>> checksum problems, which I fixed by the update, but problems are not
>>> gone.
>> pavel, start using "e1000e" instead - this driver replaces e1000 for all the
>> pci-express devices and has the infamous L1 ASPM disable patch to
>> fix this issue.
>
> Ok, e1000e seems to work for me.
>
> In another email, you asked for lspci - of failing e1000
> case. Should I still provide it?

well, if you do it you should see that L1 ASPM is now disabled (with e1000e) whereas with e1000 it is still enabled. That's the fix that you need...

Auke
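To compare the ASPM state under the two drivers, something like the following sketch can filter the lspci output. The function name aspm_state is hypothetical, and the exact wording of the link-control line differs between pciutils versions, so the filter simply keeps any line that mentions ASPM together with Enabled or Disabled.

```shell
# Reads `lspci -vv` text on stdin. For each line that reports ASPM as
# Enabled or Disabled, prints the owning device header and then that line.
# Assumption: device header lines start in column 0 with a hex digit and
# all detail lines are indented, as in usual lspci output.
aspm_state() {
    awk '
        /^[0-9a-fA-F]/ { dev = $0; next }   # remember the current device header
        /ASPM/ && /(Enabled|Disabled)/ {
            sub(/^[ \t]+/, "")              # strip leading indentation
            print dev
            print "    " $0
        }
    '
}

# Usage (root is usually needed to read the link registers):
#   lspci -vv | aspm_state
```

Run once with e1000 loaded and once with e1000e; per the thread, L1 should show as enabled in the first case and disabled in the second.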
Re: [E1000-devel] e1000 1sec latency problem
Max Krasnyansky wrote:
>
> Kok, Auke wrote:
>> Max Krasnyansky wrote:
>>> Kok, Auke wrote:
>>>> Max Krasnyansky wrote:
>>>>> So you don't think it's related to the interrupt coalescing by any chance ?
>>>>> I'd suggest to try and disable the coalescing and see if it makes any difference.
>>>>> We've had lots of issues with coalescing misbehavior. Not this bad (ie 1 second) though.
>>>>>
>>>>> Add this to modprobe.conf and reload e1000 module
>>>>>
>>>>> options e1000 RxIntDelay=0,0 RxAbsIntDelay=0,0 InterruptThrottleRate=0,0 TxIntDelay=0,0 TxAbsIntDelay=0,0
>>>> that can't be the problem. irq moderation would only account for 2-3ms variance
>>>> maximum.
>>> Oh, I've definitely seen worse than that. Not as bad as a 1second though. Plus you're talking
>>> about the case when coalescing logic is working as designed ;-). What if there is some kind of
>>> bug where timer did not expire or something.
>> we don't use a software timer in e1000 irq coalescing/moderation, it's all in
>> hardware, so we don't have that problem at all. And I certainly have never seen
>> anything you are referring to with e1000 hardware, and I do not know of any bug
>> related to this.
>>
>> are you maybe confused with other hardware ?
>>
>> feel free to demonstrate an example...
>
> Just to give you a background. I wrote and maintain http://libe1000.sf.net
> So I know E1000 HW and SW in and out.

wow, even I do not dare to say that!

> And no I'm not confused with other HW and I know that we're
> not using SW timers for the coalescing. HW can be buggy as well. Note that I'm not saying that I
> know for sure that the problem is coalescing, I'm just suggesting to take it out of the equation
> while Pavel is investigating.
>
> Unfortunately I cannot demonstrate an example but I've seen unexplained packet delays in the range
> of 1-20 milliseconds on E1000 HW (and boy ... I do have a lot of it in my labs). Once coalescing
> was disabled those problems have gone away.

this sounds like you have some sort of PCI POST-ing problem and those can indeed be worse if you use any form of interrupt coalescing.

In any case that is largely irrelevant to the in-kernel drivers, and as I said we definitely have no open issues on that right now, and I really do not recollect any either (other than the issue of interference when both ends are irq coalescing)

Cheers,

Auke
Re: [E1000-devel] e1000 1sec latency problem
Pavel Machek wrote:
> Hi!
>
> I have the famous e1000 latency problems:
>
> 64 bytes from 195.113.31.123: icmp_seq=68 ttl=56 time=351.9 ms
> 64 bytes from 195.113.31.123: icmp_seq=69 ttl=56 time=209.2 ms
> 64 bytes from 195.113.31.123: icmp_seq=70 ttl=56 time=1004.1 ms
> 64 bytes from 195.113.31.123: icmp_seq=71 ttl=56 time=308.9 ms
> 64 bytes from 195.113.31.123: icmp_seq=72 ttl=56 time=305.4 ms
> 64 bytes from 195.113.31.123: icmp_seq=73 ttl=56 time=9.8 ms
> 64 bytes from 195.113.31.123: icmp_seq=74 ttl=56 time=3.7 ms
>
> ...and they are still there in 2.6.25-git0. I had ethernet EEPROM
> checksum problems, which I fixed by the update, but problems are not
> gone.

pavel, start using "e1000e" instead - this driver replaces e1000 for all the pci-express devices and has the infamous L1 ASPM disable patch to fix this issue.

make sure you have CONFIG_E1000E=m/y in your .config, otherwise the old e1000 code will drive your card, and that driver does not have the fix.

BAH, this is a good example how Linus' patch can wreak havoc - a lot of people will now not see fixes since they only go into e1000e, but people can unnoticed now go and use e1000 for too long...

Auke
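A quick way to check the .config point above, as a sketch: kconfig_value is a hypothetical helper name, and where the .config file lives depends on your build tree.

```shell
# Prints the value (typically y or m) of a kernel config symbol from a
# .config-style file, or "unset" if the symbol is absent or commented out.
# kconfig_value is an illustrative name, not an existing tool.
kconfig_value() {
    config_file=$1
    symbol=$2
    # Anchoring on "SYMBOL=" keeps CONFIG_E1000 from matching CONFIG_E1000E.
    value=$(sed -n "s/^${symbol}=\(.*\)\$/\1/p" "$config_file" | head -n 1)
    printf '%s\n' "${value:-unset}"
}

# Usage:
#   kconfig_value .config CONFIG_E1000E
#   kconfig_value .config CONFIG_E1000
```

If CONFIG_E1000E comes back as unset, the old e1000 code will end up driving the card, which is the situation described above.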
Re: [E1000-devel] e1000 1sec latency problem
Max Krasnyansky wrote:
> Kok, Auke wrote:
>> Max Krasnyansky wrote:
>>> So you don't think it's related to the interrupt coalescing by any chance ?
>>> I'd suggest to try and disable the coalescing and see if it makes any difference.
>>> We've had lots of issues with coalescing misbehavior. Not this bad (ie 1 second) though.
>>>
>>> Add this to modprobe.conf and reload e1000 module
>>>
>>> options e1000 RxIntDelay=0,0 RxAbsIntDelay=0,0 InterruptThrottleRate=0,0 TxIntDelay=0,0 TxAbsIntDelay=0,0
>> that can't be the problem. irq moderation would only account for 2-3ms variance
>> maximum.
> Oh, I've definitely seen worse than that. Not as bad as a 1second though. Plus you're talking
> about the case when coalescing logic is working as designed ;-). What if there is some kind of
> bug where timer did not expire or something.

we don't use a software timer in e1000 irq coalescing/moderation, it's all in hardware, so we don't have that problem at all. And I certainly have never seen anything you are referring to with e1000 hardware, and I do not know of any bug related to this.

are you maybe confused with other hardware?

feel free to demonstrate an example...

Cheers,

Auke
Re: [E1000-devel] e1000 1sec latency problem
Max Krasnyansky wrote:
> Pavel Machek wrote:
>> Hi!
>>
>> I have the famous e1000 latency problems:
>>
>> 64 bytes from 195.113.31.123: icmp_seq=68 ttl=56 time=351.9 ms
>> 64 bytes from 195.113.31.123: icmp_seq=69 ttl=56 time=209.2 ms
>> 64 bytes from 195.113.31.123: icmp_seq=70 ttl=56 time=1004.1 ms
>> 64 bytes from 195.113.31.123: icmp_seq=71 ttl=56 time=308.9 ms
>> 64 bytes from 195.113.31.123: icmp_seq=72 ttl=56 time=305.4 ms
>> 64 bytes from 195.113.31.123: icmp_seq=73 ttl=56 time=9.8 ms
>> 64 bytes from 195.113.31.123: icmp_seq=74 ttl=56 time=3.7 ms
>>
>> ...and they are still there in 2.6.25-git0. I had ethernet EEPROM
>> checksum problems, which I fixed by the update, but problems are not
>> gone.
>>
>> irqpoll helps.
>>
>> nosmp (which implies XT-PIC is being used) does not help.
>>
>> 16: 1925 0 IO-APIC-fasteoi ahci, yenta, uhci_hcd:usb2, eth0
>>
>> Booting kernel with nosmp/ no yenta, no usb does not help.
>>
>> Hmm, as expected, interrupt load on ahci (find /) makes latencies go
>> away.
>>
>> It should be easily reproducible on x60 with latest bios, it is 100%
>> reproducible for me...
>
> So you don't think it's related to the interrupt coalescing by any chance ?
> I'd suggest to try and disable the coalescing and see if it makes any difference.
> We've had lots of issues with coalescing misbehavior. Not this bad (ie 1 second) though.
>
> Add this to modprobe.conf and reload e1000 module
>
> options e1000 RxIntDelay=0,0 RxAbsIntDelay=0,0 InterruptThrottleRate=0,0 TxIntDelay=0,0 TxAbsIntDelay=0,0

that can't be the problem. irq moderation would only account for 2-3ms variance maximum.

Pavel, can you send me the `lspci -vvv` of your machine with the very latest git tree and after it's showing the poor ping performance?

Auke
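When collecting evidence of "poor ping performance" for a report like this, a small filter can pull out just the slow replies. This is only a sketch: ping_spikes is a made-up name and the 100 ms default threshold is an arbitrary assumption, not a value from the thread.

```shell
# Reads ping output on stdin and prints every reply line whose round-trip
# time exceeds the threshold in milliseconds (first argument, default 100).
ping_spikes() {
    threshold=${1:-100}
    awk -v limit="$threshold" '
        match($0, /time=[0-9.]+/) {
            # Skip the 5 characters of "time=" and convert the rest to a number.
            rtt = substr($0, RSTART + 5, RLENGTH - 5) + 0
            if (rtt > limit) print
        }
    '
}

# Usage:
#   ping -c 20 <host> | ping_spikes 100
```

On a healthy local link the filter should print nothing. On an affected machine it should show replies in the hundreds of milliseconds, like those quoted above.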
Re: PCIE ASPM support hangs my laptop pretty often
Rafael J. Wysocki wrote:
> On Wednesday, 6 of February 2008, Pavel Machek wrote:
>> On Tue 2008-02-05 16:22:55, Kok, Auke wrote:
>>> ?? ??? wrote:
>>>>>>>> I've patched my kernel with the PCIe ASPM and after setting
>>>>>>>> echo powersave > /sys/module/pcie_aspm/parameters/policy
>>>>>>>>
>>>>>>>> I started to experience random hangs of my laptop.
>>>>>>>> Hardware info:
>>>>>>>> Thinkpad x60s 1704-5UG
>>>>>>> the x60's chipset doesn't support ASPM properly afaik... bad idea.
>>>>>> Well, the code shouldn't then cause a crash of the machine :)
>>>>> The user enabled it specifically (where it is disabled by default)
>>>>>
>>>>> ASPM has been crashing e1000(e), which is why I've recently merged a patch
>>>>> to disable L1 ASPM for the onboard 82573 nic on those platforms.
>>>>>
>>>>> this new infrastructure should work in the default configuration - enabling
>>>>> ASPM where this system leaves it disabled is expected to give problems
>>>>> unless you know what you are doing.
>>>> In my defense, the patch documentation didn't say it doesn't work with my
>>>> hardware, nor that it hangs the chipset :) and the promised 1.3w surelly
>>>> looked nice.
>>>>
>>>> So, are there any benefits of ASPM if I have it in the kernel but it's set to
>>>> default? I got the impression that "default" means not much power savings?
>>> did the Kconfig not come with a big fat (EXPERIMENTAL) ?
>> (EXPERIMENTAL) is something different from (KNOWN BROKEN).
>>
>> If we know about broken setups, we should probably be blacklisting
>> them.
>
> Well, the ASPM thing seems to break every single setup I've tested. So,
> perhaps we should whitelist the working ones?

greg KH is reverting this patch altogether in mainline, maybe the original writer can accommodate some of the comments in the rewrite.

Auke
Re: PCIE ASPM support hangs my laptop pretty often
?? ??? wrote:
>>>>> I've patched my kernel with the PCIe ASPM and after setting
>>>>> echo powersave > /sys/module/pcie_aspm/parameters/policy
>>>>>
>>>>> I started to experience random hangs of my laptop.
>>>>> Hardware info:
>>>>> Thinkpad x60s 1704-5UG
>>>> the x60's chipset doesn't support ASPM properly afaik... bad idea.
>>> Well, the code shouldn't then cause a crash of the machine :)
>> The user enabled it specifically (where it is disabled by default)
>>
>> ASPM has been crashing e1000(e), which is why I've recently merged a patch
>> to disable L1 ASPM for the onboard 82573 nic on those platforms.
>>
>> this new infrastructure should work in the default configuration - enabling
>> ASPM where this system leaves it disabled is expected to give problems
>> unless you know what you are doing.
>
> In my defense, the patch documentation didn't say it doesn't work with my
> hardware, nor that it hangs the chipset :) and the promised 1.3w surelly
> looked nice.
>
> So, are there any benefits of ASPM if I have it in the kernel but it's set to
> default? I got the impression that "default" means not much power savings?

did the Kconfig not come with a big fat (EXPERIMENTAL) ?

it actually depends on each device on the PCI-Express bus. Most PCI-E ports support it but the device has the option of advertising enablement of that capability or not. both platform and each device on the pci-e bus are involved. some sata chipsets work great with it, some might not even advertise the capability... but it's really hit and miss.

Your report is great of course, no doubt about it. I hope that people understand that this feature can seriously break things at the bus level. It makes me feel a lot better about the issues we had with some of our network cards and ASPM :)

once we get some feeling about how well ASPM works in the field for people we might have to blacklist certain platforms or devices.

you could (for instance) try to see which devices on your buses support ASPM and work on per-device ASPM parameters (which is one of the things I suggested before) so that we get an idea of which device is badly behaving with ASPM on your system.

Cheers,

Auke
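Before experimenting with the policy knob mentioned in this thread, it helps to record which policy is currently active. The sketch below assumes the sysfs file lists all policies on one line with the active one in square brackets. That matches the pcie_aspm parameter as I understand it, but verify it on your kernel. The function name active_aspm_policy is hypothetical.

```shell
# Reads the contents of the pcie_aspm policy file on stdin and prints the
# policy name enclosed in square brackets (assumed to mark the active one).
# Prints nothing if no bracketed entry is found.
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
# To return to the platform defaults after a test (as root):
#   echo default > /sys/module/pcie_aspm/parameters/policy
```

Noting the policy next to each hang report makes it easier to tell which setting a given device misbehaves under.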
Re: PCIE ASPM support hangs my laptop pretty often
Greg KH wrote:
> On Tue, Feb 05, 2008 at 10:46:23AM -0800, Arjan van de Ven wrote:
>> On Tue, 5 Feb 2008 18:40:04 +0100
>> ?? <[EMAIL PROTECTED]> wrote:
>>
>>> I've patched my kernel with the PCIe ASPM and after setting
>>> echo powersave > /sys/module/pcie_aspm/parameters/policy
>>>
>>> I started to experience random hangs of my laptop.
>>> Hardware info:
>>> Thinkpad x60s 1704-5UG
>> the x60's chipset doesn't support ASPM properly afaik... bad idea.
>
> Well, the code shouldn't then cause a crash of the machine :)

The user enabled it specifically (where it is disabled by default)

ASPM has been crashing e1000(e), which is why I've recently merged a patch to disable L1 ASPM for the onboard 82573 nic on those platforms.

this new infrastructure should work in the default configuration - enabling ASPM where this system leaves it disabled is expected to give problems unless you know what you are doing.

Auke
Re: [2.6 patch] make e1000_dump_eeprom() static
Adrian Bunk wrote:
> This patch makes the needlessly global e1000_dump_eeprom() static.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

yes, thanks, I'll push it to Jeff.

Auke

>
> ---
> b5fd924a1388d4aaa94cf05e42e317c2b1fb5748
> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
> index 7f5b2ae..8a6645b 100644
> --- a/drivers/net/e1000/e1000_main.c
> +++ b/drivers/net/e1000/e1000_main.c
> @@ -820,7 +820,7 @@ e1000_reset(struct e1000_adapter *adapter)
>  /**
>   * Dump the eeprom for users having checksum issues
>   **/
> -void e1000_dump_eeprom(struct e1000_adapter *adapter)
> +static void e1000_dump_eeprom(struct e1000_adapter *adapter)
>  {
>      struct net_device *netdev = adapter->netdev;
>      struct ethtool_eeprom eeprom;
>
Re: [2.6 patch] e1000e/ethtool.c: make a function static
Adrian Bunk wrote:
> This patch makes the needlessly global reg_pattern_test_array() static.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

stephen hemminger already pointed this out to me... I'll certainly push this upstream, thanks Adrian!

Auke

>
> ---
> ed72e457f06311390d9a9e51a00c904939466aff
> diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
> index 6d9c27f..a2034cf 100644
> --- a/drivers/net/e1000e/ethtool.c
> +++ b/drivers/net/e1000e/ethtool.c
> @@ -690,8 +690,8 @@ err_setup:
>      return err;
>  }
>
> -bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
> -                            int reg, int offset, u32 mask, u32 write)
> +static bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
> +                                   int reg, int offset, u32 mask, u32 write)
>  {
>      int i;
>      u32 read;
>
Re: [RFC/PATCH] e100 driver didn't support any MII-less PHYs...
Andreas Mohr wrote:
> Hi,
>
> On Tue, Jan 29, 2008 at 03:09:25PM -0800, Kok, Auke wrote:
>> Andreas Mohr wrote:
>>> Perhaps it's useful to file a bug/patch
>>> on http://sourceforge.net/projects/e1000/ ? Perhaps -mm testing?
>> I wanted to push this through our testing labs first, which has not happened
>> due to time constraints - that should quickly at least confirm that the most
>> common nics work OK after the change with your patch. I'll try and see if we
>> can get this testing done soon.
>
> Oh, full-scale regression testing even? Nice idea...
> Would optionally be even better if during hardware tests one could also
> dig out some i82503-based card (or additional MII-less cards?)
> since I didn't really make any effort yet to try to make them all
> recognized/supported by my patch already (would have been out of scope anyway
> since I have this single card only).

The problem is that I think most of those MII-less cards are custom designs by OEMs that buy the silicon and glue on a different interface, and we usually do not carry those designs in our labs, apart from a few exceptions.

So, I can at least touch the common hardware in testing, but not the exotic stuff.

Auke
Re: Mostly revert "e1000/e1000e: Move PCI-Express device IDs over to e1000e"
Jeff Garzik wrote:
> Linus Torvalds wrote:
>>
>> On Tue, 29 Jan 2008, Randy Dunlap wrote:
>>> Andrew was concerned about this when the driver was in -mm.
>>> He asked for a patch that would set E1000E to same value as E1000
>>> and I supplied that. Auke acked it IIRC. Other people vetoed it. :(
>>
>> Yeah, I've been discussing with Jeff and the gang.
>>
>> I think we have agreed on a solution where the ID's show up in the old
>> driver if the new driver is not enabled at all.
>>
>> (And as a side note: it turns out that the problem I experienced
>> didn't come from the new e1000e driver after all, so I'll be removing
>> the EXPERIMENTAL flag again).
>>
>> So I'd suggest the final patch be something like this, but I'm sending
>> it out just as an example of how we could solve this, not necessarily
>> as a final patch.
>>
>> Jeff, Auke, would something like this be acceptable? It makes it very
>> obvious in the driver table which entries are for the PCIE versions
>> that would be handled by the E1000E driver if it is enabled..
>>
>> Untested, but as mentioned, this is more of a "this looks maintainable
>> and like it should solve the issues" rather than anything I was
>> planning on committing now.
>>
>> Linus
>> ---
>>  drivers/net/Kconfig            |    5 ++-
>>  drivers/net/e1000/e1000_main.c |   60 ++--
>>  2 files changed, 37 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
>> index 5a2d1dd..6c57540 100644
>> --- a/drivers/net/Kconfig
>> +++ b/drivers/net/Kconfig
>> @@ -1992,7 +1992,7 @@ config E1000_DISABLE_PACKET_SPLIT
>>
>>  config E1000E
>>          tristate "Intel(R) PRO/1000 PCI-Express Gigabit Ethernet support"
>> -        depends on PCI && EXPERIMENTAL
>> +        depends on PCI
>>          ---help---
>>            This driver supports the PCI-Express Intel(R) PRO/1000 gigabit
>>            ethernet family of adapters. For PCI or PCI-X e1000 adapters,
>> @@ -2009,6 +2009,9 @@ config E1000E
>>            To compile this driver as a module, choose M here. The module
>>            will be called e1000e.
>>
>> +config E1000E_ENABLED
>> +        def_bool E1000E != n
>> +
>>  config IP1000
>>          tristate "IP1000 Gigabit Ethernet support"
>>          depends on PCI && EXPERIMENTAL
>> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
>> index 3111af6..8c87940 100644
>> --- a/drivers/net/e1000/e1000_main.c
>> +++ b/drivers/net/e1000/e1000_main.c
>> @@ -47,6 +47,12 @@ static const char e1000_copyright[] = "Copyright (c) 1999-2006 Intel Corporation
>>   * Macro expands to...
>>   *   {PCI_DEVICE(PCI_VENDOR_ID_INTEL, device_id)}
>>   */
>> +#ifdef CONFIG_E1000E_ENABLED
>> +  #define PCIE(x)
>> +#else
>> +  #define PCIE(x) x,
>> +#endif
>
> Patch gets my ACK, if you like, though an improvement would be to have
> your Kconfig logic activate CONFIG_E1000_PCIEX. Then future janitors
> could come along and disable unused code in addition to PCI IDs.

Ack from my side as well, although I hope that this code will not live long, as I would love to start taking the pci-e code out of e1000. If we merge this patch then I suggest that we don't do that for at least a whole cycle, since it does not make much sense otherwise.

Auke
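A minimal sketch of the table trick in the patch above, written as ordinary userspace C rather than kernel code. Only the `PCIE(x)` definition is taken from the patch; the `demo_*` names, the `DEMO_ID` helper and the device ID values are hypothetical stand-ins for the driver's real `struct pci_device_id` table. Building with `-DCONFIG_E1000E_ENABLED` makes every `PCIE(...)` entry expand to nothing, so the old driver would no longer claim those devices.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pci_device_id. */
struct demo_pci_id {
    unsigned short vendor;
    unsigned short device;
};

#define DEMO_VENDOR_INTEL 0x8086
#define DEMO_ID(dev) { DEMO_VENDOR_INTEL, (dev) }

/* Same shape as the macro in the patch: when the new driver is enabled,
 * PCIE(x) swallows its argument; otherwise it emits the entry plus the
 * trailing comma the table needs. */
#ifdef CONFIG_E1000E_ENABLED
  #define PCIE(x)
#else
  #define PCIE(x) x,
#endif

/* Illustrative device IDs only; the real table is much longer. */
static const struct demo_pci_id demo_table[] = {
    DEMO_ID(0x1000),            /* plain PCI part: always listed */
    DEMO_ID(0x1001),            /* plain PCI part: always listed */
    PCIE(DEMO_ID(0x105E))       /* PCIe part: listed only without e1000e */
    PCIE(DEMO_ID(0x105F))       /* PCIe part: listed only without e1000e */
    { 0, 0 }                    /* terminating entry */
};

/* Count the entries before the terminator. */
static size_t demo_count_ids(void)
{
    size_t n = 0;

    while (demo_table[n].vendor != 0)
        n++;
    return n;
}
```

Note that the `PCIE(...)` lines carry no trailing comma of their own; the macro supplies it, which is what keeps the table well-formed in both configurations.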
Re: [PATCH 1/1] Net: e100, fix iomap mem accesses
Jiri Slaby wrote:
> Patch against netdev-2.6 follows.
> --
> writeX functions are not permitted on iomap-ped space change to iowriteX,
> also pci_unmap pci_map-ped space on exit (instead of iounmap).
>
> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
> ---
>  drivers/net/e100.c |    8
>  1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
> index 51cf577..47548ef 100644
> --- a/drivers/net/e100.c
> +++ b/drivers/net/e100.c
> @@ -1904,7 +1904,7 @@ static void e100_rx_clean(struct nic *nic, unsigned int *work_done,
>
>          if(restart_required) {
>                  // ack the rnr?
> -                writeb(stat_ack_rnr, &nic->csr->scb.stat_ack);
> +                iowrite8(stat_ack_rnr, &nic->csr->scb.stat_ack);
>                  e100_start_receiver(nic, rx_to_start);
>                  if(work_done)
>                          (*work_done)++;
> @@ -2706,7 +2706,7 @@ static void __devexit e100_remove(struct pci_dev *pdev)
>                  struct nic *nic = netdev_priv(netdev);
>                  unregister_netdev(netdev);
>                  e100_free(nic);
> -                iounmap(nic->csr);
> +                pci_iounmap(pdev, nic->csr);
>                  free_netdev(netdev);
>                  pci_release_regions(pdev);
>                  pci_disable_device(pdev);

Acked-By: Auke Kok <[EMAIL PROTECTED]>

Jeff, feel free to merge in upstream-fixes.

Cheers,

Auke
Re: [RFC/PATCH] e100 driver didn't support any MII-less PHYs...
Andreas Mohr wrote:
> Hi,
>
> On Tue, Jan 01, 2008 at 09:09:08PM +0100, Andreas Mohr wrote:
>> Thanks for your quick reply!
>>
>> OK, here's part 1, the MII-less support stuff.
>> (preliminary posting, for review only)
>>
>> Note that these diffs apply to 2.6.24-rc6-mm1 without much trouble,
>> thus might want to do -mm testing soon.
>
> Any verdict on this one?
>
> I happen to be asking now since silly me just "upgraded" a mere mortal's
> sorta-production machine to 2.6.24 proper without remembering
> that the previous -rc6 had contained a minor but effective change
> to make those wires do their thing. Or, to tell it as it was,
> "Mom wasn't impressed ;)".
>
> Perhaps it's useful to file a bug/patch
> on http://sourceforge.net/projects/e1000/ ? Perhaps -mm testing?

I wanted to push this through our testing labs first, which has not happened due to time constraints - that should quickly at least confirm that the most common nics work OK after the change with your patch. I'll try and see if we can get this testing done soon.

Auke
Re: [PATCH 1/1] Net: e100, fix iomap mem accesses
Jiri Slaby wrote:
> On 01/28/2008 11:31 PM, Kok, Auke wrote:
>> Andrew Morton wrote:
>>> Please resend when convenient. Maybe more loudly or something, I dunno.
>>
>> just repost to me and Jeff and I'll pick it up this week if Jeff does
>> not.
>
> Sent a few hours ago; you should have received a copy, haven't you?

Nothing yet, can you resend it to me again?

Auke
Re: [PATCH]PCIE ASPM support - takes 3
Pavel Machek wrote:
> Hi!
>
>> v3->v2, fixed the issues Matthew Wilcox raised.
>>
>> PCI Express ASPM defines a protocol for PCI Express components in the D0
>> state to reduce Link power by placing their Links into a low power state
>> and instructing the other end of the Link to do likewise. This
>> capability allows hardware-autonomous, dynamic Link power reduction
>> beyond what is achievable by software-only controlled power management.
>> However, the device should be configured by software appropriately.
>> Enabling ASPM will save power, but will introduce device latency.
>
> How big is the latency? 1msec? 10msec? 100usec?

The latency is different for each device, but the timing is negotiated and the pci-e spec lists the possible timings that can be used. The maximum is (I think...) 64usec but it can be as low as 1 or 2 usec.

Auke
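For reference, a small sketch of how the advertised exit latencies can be decoded from a device's Link Capabilities register value. The bit positions and latency ranges below follow the PCI Express Base Specification to the best of my knowledge and should be checked against the spec before being relied on; the `aspm_*` helper names and the table layout are made up for illustration and are not part of the ASPM patch.

```c
#include <stdint.h>

/* Link Capabilities register: bits 14:12 hold the L0s exit latency
 * encoding and bits 17:15 the L1 exit latency encoding. */
#define ASPM_L0S_EXIT_SHIFT 12
#define ASPM_L1_EXIT_SHIFT  15
#define ASPM_EXIT_MASK      0x7u

/* Upper bound of each encoded range in nanoseconds. Encoding 7 means
 * "more than the largest listed value"; it is represented as 0 here. */
static const uint32_t aspm_l0s_max_ns[8] = {
    64, 128, 256, 512, 1000, 2000, 4000, 0
};
static const uint32_t aspm_l1_max_ns[8] = {
    1000, 2000, 4000, 8000, 16000, 32000, 64000, 0
};

/* Upper bound of the L0s exit latency advertised in lnkcap, or 0 if the
 * device reports "more than 4 us". */
static uint32_t aspm_l0s_exit_max_ns(uint32_t lnkcap)
{
    return aspm_l0s_max_ns[(lnkcap >> ASPM_L0S_EXIT_SHIFT) & ASPM_EXIT_MASK];
}

/* Upper bound of the L1 exit latency advertised in lnkcap, or 0 if the
 * device reports "more than 64 us". */
static uint32_t aspm_l1_exit_max_ns(uint32_t lnkcap)
{
    return aspm_l1_max_ns[(lnkcap >> ASPM_L1_EXIT_SHIFT) & ASPM_EXIT_MASK];
}
```

Under these encodings the largest bounded L1 range ends at 64 usec and the smallest L1 ranges end at 1 and 2 usec, which lines up with the figures quoted above; L0s exit latencies are shorter still.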
Re: [PATCH 1/1] Net: e100, fix iomap mem accesses
Andrew Morton wrote:
> On Fri, 18 Jan 2008 14:38:51 -0500 Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
>> Jiri Slaby wrote:
>>> readX functions are not permitted on iomap-ped space change to ioreadX,
>>> also pci_unmap pci_map-ped space on exit (instead of iounmap).
>>>
>>> Signed-off-by: Jiri Slaby <[EMAIL PROTECTED]>
>>> ---
>>>  drivers/net/e100.c |    8
>>>  1 files changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/e100.c b/drivers/net/e100.c
>>> index 51cf577..47548ef 100644
>>> --- a/drivers/net/e100.c
>>> +++ b/drivers/net/e100.c
>>> @@ -1836,7 +1836,7 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
>>>                  if ((le16_to_cpu(rfd->command) & cb_el) &&
>>>                      (RU_RUNNING == nic->ru_running))
>>>
>>> -                        if (readb(&nic->csr->scb.status) & rus_no_res)
>>> +                        if (ioread8(&nic->csr->scb.status) & rus_no_res)
>>>                                  nic->ru_running = RU_SUSPENDED;
>>>                  return -ENODATA;
>>>          }
>>> @@ -1859,7 +1859,7 @@ static int e100_rx_indicate(struct nic *nic, struct rx *rx,
>>>                  if ((le16_to_cpu(rfd->command) & cb_el) &&
>>>                      (RU_RUNNING == nic->ru_running)) {
>>>
>>> -                        if (readb(&nic->csr->scb.status) & rus_no_res)
>>> +                        if (ioread8(&nic->csr->scb.status) & rus_no_res)
>>>                                  nic->ru_running = RU_SUSPENDED;
>>>                  }
>>>
>>> @@ -1958,7 +1958,7 @@ static void e100_rx_clean(struct nic *nic, unsigned int *work_done,
>>>
>>>          if(restart_required) {
>>>                  // ack the rnr?
>>> -                writeb(stat_ack_rnr, &nic->csr->scb.stat_ack);
>>> +                iowrite8(stat_ack_rnr, &nic->csr->scb.stat_ack);
>>>                  e100_start_receiver(nic, nic->rx_to_clean);
>>>                  if(work_done)
>>>                          (*work_done)++;
>>> @@ -2774,7 +2774,7 @@ static void __devexit e100_remove(struct pci_dev *pdev)
>>>                  struct nic *nic = netdev_priv(netdev);
>>>                  unregister_netdev(netdev);
>>>                  e100_free(nic);
>>> -                iounmap(nic->csr);
>>> +                pci_iounmap(pdev, nic->csr);
>>>                  free_netdev(netdev);
>>>                  pci_release_regions(pdev);
>> ACK, but patch doesn't seem to apply...
>
> It's been a week, nothing seems to have happened and the e100 maintainers
> are asleep.

Not asleep, just pleasantly stuck on an atoll in the South Pacific, far far away from LCA :)

> Please resend when convenient. Maybe more loudly or something, I dunno.

Just repost to me and Jeff and I'll pick it up this week if Jeff does not. I think the recent non-cache coherent fixes might have messed up the merge.

Auke
Re: questions on NAPI processing latency and dropped network packets
Rick Jones wrote:
>> 1) Interrupts are being processed on both cpus:
>>
>> [EMAIL PROTECTED]:/root> cat /proc/interrupts
>>            CPU0       CPU1
>>  30:17037564530785 U3-MPIC Level eth0
>
> IIRC none of the e1000 driven cards are multi-queue

The pci-express variants are, but the functionality is almost always disabled (and relatively new anyway). Even with multiqueue, you can still have only a single irq line (which of course mostly defeats the purpose).

> , so while the above
> shows that interrupts from eth0 have been processed on both CPUs at
> various points in the past, it doesn't necessarily mean that they are
> being processed on both CPUs at the same time right?

They never will be: an irq can only be processed on one cpu at a time anyway. Obviously the irq here has been migrated at least once from one of the cpus to the other. Unfortunately you can't see from /proc/interrupts whether this happens frequently or not, or how many times it happened before.

Auke
Re: questions on NAPI processing latency and dropped network packets
Chris Friesen wrote:
> Kok, Auke wrote:
>
>> You're using 2.6.10... you can always replace the e1000 module with the
>> out-of-tree version from e1000.sf.net, this might help a bit - the
>> version in the 2.6.10 kernel is very very old.
>
> Do you have any reason to believe this would improve things? It seems
> like the problem lies in the NAPI/softirq code rather than in the e1000
> driver itself, no?

Your real issue is that your userspace app is hogging the CPU. While networking is not really cpu intensive, it does require that the network code gets CPU time at frequent intervals to run cleanups and prevent FIFO issues.

Alternatively, you can increase your rx/tx ring descriptor count (with ethtool), which basically lets the hardware go unserviced for a longer period: there are more buffers available, so the card can keep going longer while userspace is hogging the CPU.

>> it also appears that your app is eating up CPU time. perhaps setting
>> the app to a nicer nice level might mitigate things a bit.
>
> If we're not handling the softirq work from ksoftirqd how would changing
> scheduler settings affect anything?

Correct, it might not.

>> Also turn off the in-kernel irq
>> mitigation, it just causes cache misses and you really need the
>> network irq to sit on a single cpu at most (if not all) the time to
>> get the best performance. Use the userspace irqbalance daemon instead
>> to achieve this.
>
> Using userspace irqbalance would be some effort to test and deploy
> properly. However, as a quick test I tried setting the irq affinity for
> this device and it didn't help.

irqbalance is a simple userspace app that drops into any system seamlessly and does the best job all around - often it even beats manual tuning of smp_affinity ;)

Auke
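To put rough numbers on the ring-size suggestion, here is a back-of-the-envelope sketch using the figures from the original report in this thread (about 13K packets/sec, an rx ring of 256 descriptors, later raised to around 3000). It assumes one descriptor per packet and a steady arrival rate, and it ignores the adapter's on-chip FIFO; the function name is made up for illustration.

```c
/* Milliseconds an rx ring of `descriptors` free buffers can absorb
 * packets arriving at `pkts_per_sec` before it runs out, assuming one
 * descriptor per packet and no cleanup happening in the meantime. */
static double ring_absorb_ms(unsigned int descriptors, double pkts_per_sec)
{
    return 1000.0 * (double)descriptors / pkts_per_sec;
}
```

Under those assumptions, `ring_absorb_ms(256, 13000.0)` comes out just under 20 ms and `ring_absorb_ms(3000, 13000.0)` at roughly 230 ms, so the larger ring tolerates the receive path going unserviced for about an order of magnitude longer. That fits the observation that the drops stop, and also the concern that it hides the latency rather than removing it.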
Re: questions on NAPI processing latency and dropped network packets
Chris Friesen wrote:
> Hi all,
>
> I've got an issue that's popped up with a deployed system running
> 2.6.10. I'm looking for some help figuring out why incoming network
> packets aren't being processed fast enough.
>
> After a recent userspace app change, we've started seeing packets being
> dropped by the ethernet hardware (e1000, NAPI is enabled). The
> error/dropped/fifo counts are going up in ethtool:
>
>      rx_packets: 32180834
>      rx_bytes: 5480756958
>      rx_errors: 862506
>      rx_dropped: 771345
>      rx_length_errors: 0
>      rx_over_errors: 0
>      rx_crc_errors: 0
>      rx_frame_errors: 0
>      rx_fifo_errors: 91161
>      rx_missed_errors: 91161
>
> This link is receiving roughly 13K packets/sec, and we're dropping
> roughly 51 packets/sec due to fifo errors.
>
> Increasing the rx descriptor ring size from 256 up to around 3000 or so
> seems to make the problem stop, but it seems to me that this is just a
> workaround for the latency in processing the incoming packets.
>
> So, I'm looking for some suggestions on how to fix this or to figure out
> where the latency is coming from.
>
> Some additional information:
>
>
> 1) Interrupts are being processed on both cpus:
>
> [EMAIL PROTECTED]:/root> cat /proc/interrupts
>            CPU0       CPU1
>  30:17037564530785 U3-MPIC Level eth0
>
>
> 2) "top" shows a fair amount of time processing softirqs, but very
> little time in ksoftirqd (or is that a sampling artifact?).
>
>
> Tasks: 79 total, 1 running, 78 sleeping, 0 stopped, 0 zombie
> Cpu0: 23.6% us, 30.9% sy, 0.0% ni, 36.9% id, 0.0% wa, 0.3% hi, 8.3% si
> Cpu1: 30.4% us, 24.1% sy, 0.0% ni, 5.9% id, 0.0% wa, 0.7% hi, 38.9% si
> Mem: 4007812k total, 2199148k used, 1808664k free, 0k buffers
> Swap: 0k total, 0k used, 0k free, 219844k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5375 root      15   0 2682m 1.8g 6640 S 99.9 46.7  31:17.68 SigtranServices
>  7696 root      17   0  6952 3212 1192 S  7.3  0.1   0:15.75 schedmon.ppc210
>  7859 root      16   0  2688 1228  964 R  0.7  0.0   0:00.04 top
>  2956 root       8  -8 18940 7436 5776 S  0.3  0.2   0:01.35 blademtc
>     1 root      16   0  1660  620  532 S  0.0  0.0   0:30.62 init
>     2 root      RT   0     0    0    0 S  0.0  0.0   0:00.01 migration/0
>     3 root      15   0     0    0    0 S  0.0  0.0   0:00.55 ksoftirqd/0
>     4 root      RT   0     0    0    0 S  0.0  0.0   0:00.01 migration/1
>     5 root      15   0     0    0    0 S  0.0  0.0   0:00.43 ksoftirqd/1
>
>
> 3) /proc/sys/net/core/netdev_max_backlog is set to the default of 300
>
>
> So...anyone have any ideas/suggestions?

You're using 2.6.10... you can always replace the e1000 module with the out-of-tree version from e1000.sf.net, this might help a bit - the version in the 2.6.10 kernel is very very old.

It also appears that your app is eating up CPU time. Perhaps setting the app to a nicer nice level might mitigate things a bit. Also turn off the in-kernel irq mitigation, it just causes cache misses and you really need the network irq to sit on a single cpu at most (if not all) the time to get the best performance. Use the userspace irqbalance daemon instead to achieve this.

Auke
Re: questions on NAPI processing latency and dropped network packets
Chris Friesen wrote:
> Kok, Auke wrote:
>
>> You're using 2.6.10... you can always replace the e1000 module with the
>> out-of-tree version from e1000.sf.net, this might help a bit - the
>> version in the 2.6.10 kernel is very very old.
>
> Do you have any reason to believe this would improve things? It seems
> like the problem lies in the NAPI/softirq code rather than in the e1000
> driver itself, no?

your real issue is that your userspace app is hogging the CPU. While networking
is not really cpu intensive, it does require that the CPU gets ample time at
frequent intervals to run cleanups and prevent FIFO issues.

alternatively, you can increase your rx/tx ring descriptor count (with
ethtool), which basically makes it easier for the hardware not to be serviced
for a longer period, since there are more buffers available and the card can go
on longer when userspace is hogging the CPU.

>> it also appears that your app is eating up CPU time. perhaps setting
>> the app to a nicer nice level might mitigate things a bit.
>
> If we're not handling the softirq work from ksoftirqd how would changing
> scheduler settings affect anything?

correct, it might not.

>> Also turn off the in-kernel irq mitigation, it just causes cache misses
>> and you really need the network irq to sit on a single cpu at most (if
>> not all) the time to get the best performance. Use the userspace
>> irqbalance daemon instead to achieve this.
>
> Using userspace irqbalance would be some effort to test and deploy
> properly. However, as a quick test I tried setting the irq affinity for
> this device and it didn't help.

irqbalance is a simple userspace app that drops into any system seamlessly and
does the best job all around - often it beats manual tuning of smp_affinity
even ;)

Auke
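For reference, the manual affinity test mentioned above boils down to writing a hexadecimal CPU bitmask to /proc/irq/<irq>/smp_affinity. A minimal sketch of building that mask in C follows; the helper name is made up for illustration, and it only handles the first 32 CPUs.

```c
#include <assert.h>
#include <stdio.h>

/* Format the hex bitmask that pins an IRQ to one CPU, in the form accepted
 * by /proc/irq/<irq>/smp_affinity. Returns 0 on success, -1 on bad input.
 * Sketch only: limited to CPUs 0..31. */
int affinity_mask_for_cpu(unsigned cpu, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    if (cpu >= 32)
        return -1;
    n = snprintf(buf, len, "%x", 1u << cpu);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

For CPU1 this yields "2", so pinning the eth0 interrupt from the report (IRQ 30) to CPU1 would mean writing 2 to /proc/irq/30/smp_affinity as root. Whether pinning helps is a separate question; as noted in the thread, it did not in this case.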
Re: questions on NAPI processing latency and dropped network packets
Rick Jones wrote:
>> 1) Interrupts are being processed on both cpus:
>>
>> [EMAIL PROTECTED]:/root> cat /proc/interrupts
>>            CPU0       CPU1
>>  30:17037564530785   U3-MPIC Level     eth0
>
> IIRC none of the e1000 driven cards are multi-queue

the pci-express variants are, but the functionality is almost always disabled
(and relatively new anyway). even with multiqueue, you can still have only a
single irq line (which of course mostly defeats the purpose).

> , so while the above shows that interrupts from eth0 have been processed
> on both CPUs at various points in the past, it doesn't necessarily mean
> that they are being processed on both CPUs at the same time right?

never will, an irq can only be processed on one cpu at a time anyway. obviously
the irq here has been migrated ONCE from one of the cpus to the other.
unfortunately you can't see from /proc/interrupts whether this happens
frequently or not, or how many times it happened before.

Auke
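A single look at /proc/interrupts cannot show how often an IRQ moves, but two samples taken some time apart at least show which CPUs handled it in between. A hedged sketch of that comparison in C: the function names and the sample numbers in the comment are invented for illustration (the counters in the report above are fused together and cannot be split reliably).

```c
#include <assert.h>
#include <stdlib.h>

/* Parse up to ncpus per-CPU counters from one /proc/interrupts line such as
 * " 30:  1000  2000  U3-MPIC Level eth0". Returns how many were read. */
int parse_irq_line(const char *line, int ncpus, unsigned long *counts)
{
    const char *p = line;
    char *end;
    int i;

    assert(line != NULL && counts != NULL);
    while (*p && *p != ':')          /* skip the "NN" label */
        p++;
    if (*p != ':')
        return 0;
    p++;
    for (i = 0; i < ncpus; i++) {
        unsigned long v = strtoul(p, &end, 10);
        if (end == p)                /* no further number on this line */
            break;
        counts[i] = v;
        p = end;
    }
    return i;
}

/* Bit i of the result is set if CPU i's counter grew between two samples. */
unsigned cpus_active_between(const unsigned long *before,
                             const unsigned long *after, int ncpus)
{
    unsigned mask = 0;
    int i;

    for (i = 0; i < ncpus && i < 32; i++)
        if (after[i] > before[i])
            mask |= 1u << i;
    return mask;
}
```

If the resulting mask alternates between samples, the interrupt is being migrated repeatedly; if it stays on one bit, it has settled on one CPU. Sampling more often narrows down when the moves happen, but still cannot count them exactly.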
Re: [RFC] PCIE ASPM support
Shaohua Li wrote:
> On Thu, 2008-01-03 at 11:33 -0800, Kok, Auke wrote:
>> Shaohua Li wrote:
>>> PCI Express ASPM defines a protocol for PCI Express components in the D0
>>> state to reduce Link power by placing their Links into a low power state
>>> and instructing the other end of the Link to do likewise. This
>>> capability allows hardware-autonomous, dynamic Link power reduction
>>> beyond what is achievable by software-only controlled power management.
>>> However, The device should be configured by software appropriately.
>>> Enabling ASPM will save power, but will introduce device latency.
>>>
>>> This patch adds ASPM support in Linux. It introduces a global policy for
>>> ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
>>> it. The interface can be used as a boot option too. Currently we have
>>> below setting:
>>>         -default, BIOS default setting
>>>         -powersave, highest power saving mode, enable all available ASPM state
>>> and clock power management
>>>         -performance, highest performance, disable ASPM and clock power
>>> management
>>> By default, the 'default' policy is used currently.
>>>
>>> In my test, power difference between powersave mode and performance mode
>>> is about 1.3w in a system with 3 PCIE links.
>>>
>>> please review, any comments will be appreciated.
>>
>> quickly glanced this over since I recently disabled l1 ASPM for the
>> e1000/e1000e driven 82573 device which has issues with l1 ASPM. that
>> immediately gives me the question: how can I continue to disable l1 aspm
>> by default for this device using this infrastructure?
> I used to have a per-device interface, but thought the interface might
> be hard to use for users. If we really need the per-device interface, I
> can re-add it.
>
>> I do like the fact that there is a generic way to re-enable it for the
>> users who want to use it. Can this change be done when the device is
>> already active?
> Yes, at least in my test.
>
>> Can you change this parameter per device/module?
> Another way is to provide a helper for driver, and driver disables
> specific ASPM states. It sounds better to let driver do the disabling,
> as users haven't the knowledge?

agreed, however this could still be useful in debugging equipment for the
experienced user. In any case an easy handle for the driver to dis/enable ASPM
would certainly help our case, and possibly others.

Auke
Re: WARNING: at kernel/softirq.c:139 local_bh_enable()
[EMAIL PROTECTED] wrote:
> I am running 2.6.23 kernel on a DUAL core and QUAD core i386 boxes and
> after every boot, when the ethernet traffic starts i get this warning.
>
> All the ports in the system are e1000 and i am using the kernel e1000
> driver.

[added netdev to the Cc:]

can you repro this with 2.6.24-rc7? What distro are you using? Is your distro
running a link-monitoring tool of some sort?

Auke

> Jan 7 22:31:00 localhost [warning] WARNING: at kernel/softirq.c:139 local_bh_enable()
> Jan 7 22:31:00 localhost [warning] [c012bd0f] local_bh_enable+0x49/0xa9
> Jan 7 22:31:00 localhost [warning] [c039ba1a] dev_queue_xmit+0x26c/0x275
> Jan 7 22:31:00 localhost [warning] [c03cdf6c] arp_xmit+0x4d/0x51
> Jan 7 22:31:00 localhost [warning] [c03cd9f6] arp_solicit+0x156/0x174
> Jan 7 22:31:00 localhost [warning] [c03a047f] neigh_timer_handler+0x1e0/0x224
> Jan 7 22:31:00 localhost [warning] [c012f820] run_timer_softirq+0x113/0x172
> Jan 7 22:31:00 localhost [warning] [c013b042] WARNING: at kernel/softirq.c:139 local_bh_enable()
> Jan 7 22:31:00 localhost [warning] hrtimer_interrupt+0x19c/0x1c4
> Jan 7 22:31:00 localhost [warning] [c014002a] [c012bd0f] local_bh_enable+0x49/0xa9
> Jan 7 22:31:00 localhost [warning] [c039ba1a] dev_queue_xmit+0x26c/0x275
> Jan 7 22:31:00 localhost [warning] [c03a0c05] neigh_resolve_output+0x12c/0x15e
> Jan 7 22:31:00 localhost [warning] [c03a0881] neigh_update+0x246/0x2cb
> Jan 7 22:31:00 localhost [warning] [c039fb21] neigh_lookup+0xa9/0xb3
> Jan 7 22:31:00 localhost [warning] [c03ce410] arp_process+0x43c/0x477
> Jan 7 22:31:00 localhost [warning] [c0120b73] enqueue_task_fair+0x2d/0x30
> Jan 7 22:31:00 localhost [warning] tick_sched_timer+0x0/0xba
> Jan 7 22:31:00 localhost [warning] [c03ce554] arp_rcv+0x104/0x119
> Jan 7 22:31:00 localhost [warning] [c03a029f] [c039bda6] netif_receive_skb+0x1c5/0x1de
> Jan 7 22:31:00 localhost [warning] [f897a61d] e1000_clean_rx_irq+0x40e/0x4ca [e1000]
> Jan 7 22:31:00 localhost [warning] [c013bdc6] getnstimeofday+0x36/0x10c
> Jan 7 22:31:00 localhost [warning] neigh_timer_handler+0x0/0x224
> Jan 7 22:31:00 localhost [warning] [c012be12] __do_softirq+0x60/0xc1
> Jan 7 22:31:00 localhost [warning] [f8979e34] e1000_clean+0x74/0x119 [e1000]
> Jan 7 22:31:00 localhost [warning] [c039bf03] [c012bea4] net_rx_action+0x5a/0xd3
> Jan 7 22:31:00 localhost [warning] [c012be12] __do_softirq+0x60/0xc1
> Jan 7 22:31:00 localhost [warning] do_softirq+0x31/0x35
> Jan 7 22:31:00 localhost [warning] [c012bea4] do_softirq+0x31/0x35
> Jan 7 22:31:00 localhost [warning] [c012bf03] irq_exit+0x38/0x6b
> Jan 7 22:31:00 localhost [warning] [c0106a1e] [c012bf03] do_IRQ+0x80/0x93
> Jan 7 22:31:00 localhost [warning] irq_exit+0x38/0x6b
> Jan 7 22:31:00 localhost [warning] [c01057b7] common_interrupt+0x23/0x28
> Jan 7 22:31:00 localhost [warning] [c01600d8] [c011a34d] get_swap_page+0xe7/0x215
> Jan 7 22:31:00 localhost [warning] [c0103232] mwait_idle_with_hints+0x34/0x38
> Jan 7 22:31:00 localhost [warning] [c0103236] mwait_idle+0x0/0xa
> Jan 7 22:31:00 localhost [warning] [c01030f2] cpu_idle+0x98/0xb9
> Jan 7 22:31:00 localhost [warning] smp_apic_timer_interrupt+0x2c/0x35
> Jan 7 22:31:00 localhost [warning] ===
> Jan 7 22:31:00 localhost [warning] [c0105874] apic_timer_interrupt+0x28/0x30
> Jan 7 22:31:00 localhost [warning] [c01600d8] get_swap_page+0xe7/0x215
> Jan 7 22:31:00 localhost [warning] [c0103232] mwait_idle_with_hints+0x34/0x38
> Jan 7 22:31:00 localhost [warning] [c0103236] mwait_idle+0x0/0xa
> Jan 7 22:31:00 localhost [warning] [c01030f2] cpu_idle+0x98/0xb9
> Jan 7 22:31:00 localhost [warning] ===
>
> Thanks
> Jayakrishnan Chathu
Re: [RFC] PCIE ASPM support
Shaohua Li wrote:
> PCI Express ASPM defines a protocol for PCI Express components in the D0
> state to reduce Link power by placing their Links into a low power state
> and instructing the other end of the Link to do likewise. This
> capability allows hardware-autonomous, dynamic Link power reduction
> beyond what is achievable by software-only controlled power management.
> However, The device should be configured by software appropriately.
> Enabling ASPM will save power, but will introduce device latency.
>
> This patch adds ASPM support in Linux. It introduces a global policy for
> ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
> it. The interface can be used as a boot option too. Currently we have
> below setting:
>         -default, BIOS default setting
>         -powersave, highest power saving mode, enable all available ASPM state
> and clock power management
>         -performance, highest performance, disable ASPM and clock power
> management
> By default, the 'default' policy is used currently.
>
> In my test, power difference between powersave mode and performance mode
> is about 1.3w in a system with 3 PCIE links.
>
> please review, any comments will be appreciated.

quickly glanced this over since I recently disabled l1 ASPM for the
e1000/e1000e driven 82573 device which has issues with l1 ASPM. that
immediately gives me the question: how can I continue to disable l1 aspm by
default for this device using this infrastructure?

I do like the fact that there is a generic way to re-enable it for the users
who want to use it. Can this change be done when the device is already active?
Can you change this parameter per device/module?

> +	/* Clock PM state*/
> +	unsigned int clk_pm_capable:1;
> +	unsigned int clk_pm_enabled:1;
> +	unsigned int bios_clk_state:1;

might want to get rid of these bitfields?

Cheers,

Auke
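The global policy described in the RFC is selected by writing one of the three policy names to the sysfs file. A small userspace sketch of that in C, under stated assumptions: the two helper names are made up, the list of names is taken from the RFC text only, and the path is a parameter so the helper can be tried against a scratch file instead of the real /sys/module/pcie_aspm/parameters/policy.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Policy names as listed in the RFC text; anything else is rejected. */
int aspm_policy_is_valid(const char *name)
{
    static const char *const known[] = { "default", "powersave", "performance" };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
        if (strcmp(name, known[i]) == 0)
            return 1;
    return 0;
}

/* Write a policy name to the given file. Returns 0 on success, -1 if the
 * name is unknown or the file cannot be written. */
int aspm_set_policy(const char *path, const char *name)
{
    FILE *f;
    int ok;

    if (!aspm_policy_is_valid(name))
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fputs(name, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Note that this only covers the global knob. The per-device question raised above (keeping L1 disabled for one specific device) is not something a single global policy file can express, which is why a driver-side helper or per-device interface is being asked for.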
Re: [RFC/PATCH] e100 driver didn't support any MII-less PHYs...
Andreas Mohr wrote:
> Hi all,
>
> I was mildly annoyed when rebooting my _headless_ internet gateway after a
> hotplug -> udev migration and witnessing it not coming up again,
> which turned out to be due to an eepro100 / e100 loading conflict
> since eepro100 supported both of my Intel-based network cards,
> whereas e100 only supported the "newer" one and entirely failed on ifup...
> (udev had somehow managed to tweak loading sequence as compared to
> a hotplug setup, which caused the drivers to probe differently)
>
> After investigating this e100 failure for half an hour it was obvious
> that it was failing in e100_hw_init() -> e100_phy_init() since the driver was
> prepared to handle MII-capable PHYs only, not certain older(?) MII-less
> PHYs such as 80c24 or i82503.
> Investigating some FreeBSD etc. drivers it became terribly clear that there
> are also some MII-less PHYs and that one would have to handle them properly.
>
> Thus I decided to add support for those:
> - after PHY init failure, try to detect whether the EEPROM lists one of
>   the MII-less PHYs
> - if so, don't fatally fail PHY init function
> - avoid touching MII in various utility functions in case of MII-less
>   PHY (FIXME: this may need review, it was a quick hack in some places)
> - add some proper logging on init failure
>
> Note that this is an initial, semi-rough patch only, would love to have
> it corrected/improved by the e1000 team.
> (I also added some spelling updates for good measure, these would have
> to be committed separately obviously)
>
> Frankly I'm quite uncertain as to why one would try to actively deprecate
> a driver which works for many cards with a newer one which fails to work
> for several card types and doesn't seem clearly superior in hindsight
> after going through it...
> Oh, right, that's in order to brute-force people to report any
> nagging problems with the new driver, which is... errm... very
> understandable after all ;)
> (I hope that me "reporting" this problem via a patch is ok ;)
>
> For reference, I'm using a BNC/AUI/TP PCI combo card
> Intel 82557 645477-004 FCC ID EJMNPDEPR10PCTPCI
>
> This mail written using a reassuringly stable connection over the newly
> adapted driver...

ok, barely glanced over the patch but it might just be fine. Can you split up
this patch and send a separate patch for the spelling mistakes? I'll then have
some quick testing done on the result and do a bit deeper review after New
Year's.

Cheers,

Auke
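The fallback described in the patch summary above can be pictured with a small standalone model in C. This is not the patch itself: the enum values, names and function below are hypothetical and only mirror the listed steps (probe MII first; on failure consult the EEPROM's PHY type; only fail hard if that is not a known MII-less PHY).

```c
#include <assert.h>

/* Hypothetical PHY types as they might be decoded from the EEPROM. */
enum eeprom_phy { EEPROM_PHY_UNKNOWN, EEPROM_PHY_MII, EEPROM_PHY_80C24,
                  EEPROM_PHY_I82503 };

/* Outcome of the (modelled) PHY init step. */
enum phy_mode { PHY_MODE_FAILED, PHY_MODE_MII, PHY_MODE_NO_MII };

/* The two MII-less PHYs named in the mail; purely illustrative. */
int phy_is_mii_less(enum eeprom_phy id)
{
    return id == EEPROM_PHY_80C24 || id == EEPROM_PHY_I82503;
}

/* mii_probe_ok: non-zero if a PHY answered on the MII bus. */
enum phy_mode phy_init_model(int mii_probe_ok, enum eeprom_phy id)
{
    if (mii_probe_ok)
        return PHY_MODE_MII;        /* normal path, unchanged */
    if (phy_is_mii_less(id))
        return PHY_MODE_NO_MII;     /* not fatal: run without MII access */
    return PHY_MODE_FAILED;         /* genuine failure: log and bail out */
}
```

Callers that would normally touch MII registers (link state, autonegotiation and so on) would then need to check for the no-MII mode first, which corresponds to the "avoid touching MII in various utility functions" item flagged for review.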
Re: VGA Drivers
pradeep pradeep wrote:
> Hi,
>    I want to support a new PCI based VGA card in
> linux. I want to know what is the VGA driver stack in
> the Linux. Can any one help me where to start.

Assuming you're not talking about a VGA grabber card here...

Graphics/ X drivers are mostly in userspace except some DRI/DRM infrastructure
and AGP code. You should start looking at the XOrg project. The kernel has some
framebuffer support, but I'm not sure that is what you are looking for.
Re: [PATCH] sky2: Use deferrable timer for watchdog
Stephen Hemminger wrote:
> On Thu, 20 Dec 2007 15:36:13 -0500
> "Parag Warudkar" <[EMAIL PROTECTED]> wrote:
>
>> On Dec 20, 2007 3:04 PM, Arjan van de Ven <[EMAIL PROTECTED]> wrote:
>>>> I think it is reasonable for Network driver watchdogs to use a
>>>> deferrable timer - if the machine is 100% IDLE there is no one needing
>>>> the network to be up. If there is something running even on the other
>>>> CPU - that is going to cause an IPI, reschedule, TLB invalidation etc.
>>>> which will make it very likely in practice that each CPU will be
>>>> interrupted in reasonable amount of time.
>>> this is not correct; many machines are idle waiting for network data.
>>> Think of webservers...
>> Yes, I forgot the receive case. So if a server was 100% IDLE and a web
>> server was listening for network data and we reach 0 wakeups per
>> second on the CPU where the network watchdog timer is scheduled to run
>> deferred _and_ the network link went down, it would cause the watchdog
>> to not run and redo the link until someone else wakes up that CPU
>> later.
>> So as long as we make sure we don't convert every timer to deferrable
>> we should be ok - maybe this can be resolved easily by having a
>> non-deferrable "dont-allow-deferring-for-too-long" timer on each CPU
>> that just causes at least one wake up in some reasonable time delta
>> from the previous wakeup (whoever caused that one.) It is still
>> beneficial in that all deferrable timers would run at once without
>> needing to have separate wakeup for each.
>>>> Of course there are theoretical cases where we could land into a
>>>> situation where a CPU in a multiprocessor machine is IDLE infinitely
>>>> and that causes the watchdog that happens to be bound to run on the
>>>> same CPU to not run. To take care of these unlikely cases I think the
>>>> timer mechanism should have a reasonable limit on how long a CPU can
>>>> go IDLE if there are deferrable timers.
>>> how about something else instead: a timer mechanism that takes a range
>>> instead..
>>> that at least has defined semantics; the deferrable semantics really are
>>> "indefinite".
>>> Let's keep at least the semantics clear and clean.
>>>
>> Would not the simpler solution of installing a non-deferrable timer
>> per cpu which will not allow the CPU to go IDLE for more than x units
>> of time at once (or something to that effect) work? Range would
>> complicate the thing and I am not sure how many cases will know
>> reasonably correct range for their normal operation. In this instance
>> of the e1000 watchdog what range could it give and be successful at
>> what it wants to do - bring up the link in reasonable amount of time,
>> while also realizing the power savings?
>>
>> Perhaps depending on Server/Laptop/Desktop machine (maybe based on
>> Preemption) we could have normal or deferrable timers but that'll
>> exclude Servers from power savings and I am not sure Data center folks
>> will like that :) .
>>
>> Parag
>
> The problem is that on a server the receiver will go deaf if the chip
> bug that the watchdog is looking for triggers. Yes, no packets in
> and it happily will just sit there.
>
> So for now, I am not going to apply your simple patch and work on a
> two stage timer per arjan's suggestion for a later release.

I also think that's the right way to go for now. I'll ask jeff to hold off on the two patches for now.

Auke
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
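Editor's sketch of the "timer that takes a range" idea raised in the message above. This is a hypothetical userspace model, not kernel code and not anything posted in the thread: the function name toy_pick_expiry and the notion of a list of already planned wakeups are assumptions made purely for illustration. The point it tries to show is that a range gives the deferral a hard upper bound, which plain deferrable timers do not have.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical "range timer" (illustration only): the timer may fire
 * anywhere in [earliest, latest]. If some wakeup is already planned
 * inside that window, piggyback on the earliest such wakeup; otherwise
 * fall back to the hard deadline 'latest'.
 */
static unsigned long toy_pick_expiry(unsigned long earliest,
				     unsigned long latest,
				     const unsigned long *planned, size_t n)
{
	unsigned long best = latest;	/* worst case: wake up at the deadline */
	size_t i;

	assert(earliest <= latest);

	for (i = 0; i < n; i++) {
		/* a planned wakeup inside the window that beats the current pick */
		if (planned[i] >= earliest && planned[i] <= best)
			best = planned[i];
	}
	return best;
}
```

With planned wakeups at 500, 2600 and 4000 ms, a timer with the window [2000, 3000] would share the 2600 ms wakeup, while a timer with the window [1000, 1500] finds nothing to share and fires at 1500 ms, never later.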
Re: [PATCH] sky2: Use deferrable timer for watchdog
Arjan van de Ven wrote:
>>>> My interpretation of the api is:
>>>>  * round_jiffies() - timer wants to wakeup but isn't precise about
>>>>    when so schedule on next second when system will wake up anyway;
>>>>    e.g. why meetings are usually scheduled on the hour
>>>>  * deferrable - timer doesn't have to really wakeup but wants to
>>>>    happen near a particular time.
>>>>    e.g. "I'll meet you at the pub around 8pm"
>
> this is not correct.
>
> deferrable means "if you're busy wake me up at this time. But if not,
> don't bother waking up for me, get to it later".
>
> The "later" can be a LONG time later, several seconds easily, if not more.
> (timers are on a per cpu basis, and you may end up with a several-core
> system where the common timers are all on another cpu than this one)
>
>>> If this is the case then the whole usage of round_jiffies() is bogus.
>>> All users of round_jiffies() should just be converted to deferrable??
>>> I am a bit concerned that if deferrable gets used everywhere then a
>>> strange situation would occur where all timers were waiting for some
>>> other timer to finally happen, kind of a weird timelock situation.
>>> Like the old chip/dale cartoon:
>>> "you first, no you first, after you mister chip, no after you mister
>>> dale,..."
>>
>> that's a dangerous situation indeed and I'd really like to know what
>> the limits are for deferring deferrable timers Arjan, do you know?
>> Anyone?
>
> there is NO limit to deferring a timer. Do NOT use a deferrable timer if
> you can't afford the timer to not happen within.. 10 to 100 seconds!
> (or more)
> They are really meant for things where you CAN afford for it to not
> happen when you're idle

ok, that's just bad and if there's no user-definable limit to the deferral I definitely don't like this change.

Can I safely assume that any irq will cause all deferred timers to run? If this is the case then for e1000 this patch is still OK since the watchdog needs to run (1) after a link up/down interrupt or (2) to update statistics. Those statistics won't increase if there is no traffic of course...

Auke
Re: [PATCH] e1000e: Use deferrable timer for watchdog
Parag Warudkar wrote:
> On Dec 20, 2007 12:05 PM, Kok, Auke <[EMAIL PROTECTED]> wrote:
>> I can't even apply this patch and the e1000 one... not only is it
>> whitespace damaged it is also not properly formatted as patch at all.
>> If you want me to take these patches seriously, then please fix the
>> formatting issues.
>
> Sigh - I use Pine, follow Documentation/email-clients.txt for the
> recommended settings and obviously the patches are not generated with
> whitespace damage at my end as I test those before sending out.
>
> So although I hate to see this happen there is nothing at this moment
> that I can do - except for attaching the patch instead of inlining it.
> Since they have already been reviewed inline, please see if the
> attached patches work for you.

here's what the files in my Maildir spool look like in vim (my vim displays a '»' char for tabs and a "¶" for EOL):

76 --- linux-2.6/drivers/net/e1000e/netdev.c» 2007-12-07 10:04:39.
77 +++ linux-2.6-work/drivers/net/e1000e/netdev.c» 2007-12-18 20:45:59.
78 @@ -3899,7 +3899,7 @@¶
79 » » goto err_eeprom;¶
80 » }¶
81 ¶
82 -» init_timer(&adapter->watchdog_timer);¶
83 +» init_timer_deferrable(&adapter->watchdog_timer);¶
84 » adapter->watchdog_timer.function = &e1000_watchdog;¶
85 » adapter->watchdog_timer.data = (unsigned long) adapter;¶
86 ¶
87 --¶

notice that there are two spaces instead of 1. Also there's no line heading the diff with 'diff a/foo b/foo' which is what throws off stg. And the -p option is missing.

as for content, the patch looks OK to me. I ran the numbers and although there was a slight average delay in the link up detection time it is negligible (less than 0.2sec difference over a bunch of measurements), and I confirmed your powertop numbers are correct.

As for the timer interval, the watchdog may already be delayed up to 3 seconds safely, this doesn't change that.

I'll forward the patch. Care to make one for e100? plenty of laptops with those still around! The embedded guys would love it I think.

Thanks,

Auke
Re: [PATCH] sky2: Use deferrable timer for watchdog
Stephen Hemminger wrote:
> On Thu, 20 Dec 2007 17:29:23 +
>
>> -Original Message-
>> From: Stephen Hemminger <[EMAIL PROTECTED]>
>>
>> Date: Thu, 20 Dec 2007 09:16:03
>> To:[EMAIL PROTECTED]
>> Cc:[EMAIL PROTECTED], [EMAIL PROTECTED], linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH] sky2: Use deferrable timer for watchdog
>>
>>
>> On Tue, 18 Dec 2007 20:13:28 -0500 (EST)
>> Parag Warudkar <[EMAIL PROTECTED]> wrote:
>>
>>> sky2 can use deferrable timer for watchdog - reduces wakeups from idle
>>> per second.
>>>
>>> Signed-off-by: Parag Warudkar <[EMAIL PROTECTED]>
>>>
>>> --- linux-2.6/drivers/net/sky2.c 2007-12-07 10:04:39.0 -0500
>>> +++ linux-2.6-work/drivers/net/sky2.c 2007-12-18 20:07:58.0 -0500
>>> @@ -4230,7 +4230,10 @@
>>> sky2_show_addr(dev1);
>>> }
>>>
>>> - setup_timer(&hw->watchdog_timer, sky2_watchdog, (unsigned long) hw);
>>> + hw->watchdog_timer.function = sky2_watchdog;
>>> + hw->watchdog_timer.data = (unsigned long) hw;
>>> + init_timer_deferrable(&hw->watchdog_timer);
>>> +
>>> INIT_WORK(&hw->restart_work, sky2_restart);
>>>
>>> pci_set_drvdata(pdev, hw);
>> Does it really reduce the wakeups or only change who gets charged by
>> powertop?
>> The system is going to wakeup once a second anyway. Looks to me that if
>> the timer is using round_jiffies(), that setting deferrable just changes
>> the accounting.
>>
>> My interpretation of the api is:
>>  * round_jiffies() - timer wants to wakeup but isn't precise about when
>>    so schedule on next second when system will wake up anyway;
>>    e.g. why meetings are usually scheduled on the hour
>>
>>  * deferrable - timer doesn't have to really wakeup but wants to
>>    happen near a particular time.
>>    e.g. "I'll meet you at the pub around 8pm"
>>
>> Therefore doing deferrable is unnecessary for timers using round_jiffies
>> unless system is so good at doing timers that it is going to skip doing
>> timer once per second.
>>
>
> [EMAIL PROTECTED] wrote:
>
>> NO_HZ kernels don't do timers every second - if you do round_jiffies()
>> the kernel will wakeup and run the timer at that time no matter what.
>>
>> The reason deferrable was introduced is to avoid waking up the kernel
>> just for this one timer that can be called when the CPU is not idle for
>> some reason other than this timer.
>>
>> In other words let's say there were two timers - one non-deferrable
>> expiring in 3 seconds and other deferrable, expiring in 1.5 seconds.
>> The kernel will not wake up twice - once for 1.5 second and other for
>> 3 second - it will wake up once at expiry of 3 second timer and execute
>> both the 1.5 second and 3 second timers.
>>
>> And this is not just powertop accounting thing - like I said the total
>> num of wakeups per second go down with this patch.
>>
>> Parag
>>
>> Sent via BlackBerry from T-Mobile
>
>
> Quit top-posting!
>
> If this is the case then the whole usage of round_jiffies() is bogus.
> All users of round_jiffies() should just be converted to deferrable??
> I am a bit concerned that if deferrable gets used everywhere then a
> strange situation would occur where all timers were waiting for some
> other timer to finally happen, kind of a weird timelock situation.
> Like the old chip/dale cartoon:
> "you first, no you first, after you mister chip, no after you mister
> dale,..."

that's a dangerous situation indeed and I'd really like to know what the limits are for deferring deferrable timers. Arjan, do you know? Anyone?

I don't see a danger just yet on normal systems - I get something like 10 wakeups per second from just the kernel (acpi, ahci, usb) on most of my systems, which guarantees that the watchdog runs often enough, but for embedded systems and critical timers in other drivers this may quickly become an issue.

Auke
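Editor's sketch of the two-timer example quoted above (a deferrable timer due at 1.5 seconds, a non-deferrable one due at 3 seconds). This is a toy userspace model, not kernel code: struct toy_timer and toy_count_wakeups are made-up names, timers are one-shot, and the CPU is assumed to be completely idle otherwise. It only tries to show the rule as described in the thread: an idle CPU wakes for non-deferrable timers, and whatever has expired by then runs at that wakeup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one-shot timers on a single, otherwise idle CPU. */
struct toy_timer {
	unsigned long expires_ms;	/* requested expiry */
	bool deferrable;		/* may wait for some other wakeup */
	long ran_at_ms;			/* filled in: when it ran, -1 if never */
};

/*
 * Count how often the idle CPU has to wake up. Only non-deferrable
 * timers force a wakeup; at each wakeup every timer that has already
 * expired runs, deferrable or not.
 */
static unsigned int toy_count_wakeups(struct toy_timer *t, size_t n)
{
	unsigned int wakeups = 0;
	size_t i;

	assert(t != NULL || n == 0);

	for (i = 0; i < n; i++)
		t[i].ran_at_ms = -1;

	for (;;) {
		bool found = false;
		unsigned long now = 0;

		/* the next wakeup is the earliest pending non-deferrable expiry */
		for (i = 0; i < n; i++) {
			if (t[i].ran_at_ms >= 0 || t[i].deferrable)
				continue;
			if (!found || t[i].expires_ms < now) {
				now = t[i].expires_ms;
				found = true;
			}
		}
		if (!found)
			break;	/* nothing left that can wake the CPU */

		wakeups++;
		for (i = 0; i < n; i++)
			if (t[i].ran_at_ms < 0 && t[i].expires_ms <= now)
				t[i].ran_at_ms = (long)now;
	}
	return wakeups;
}
```

In this model two ordinary timers at 1500 ms and 3000 ms cost two wakeups, while marking the 1500 ms timer deferrable costs one wakeup and delays that timer to 3000 ms. A deferrable timer with no other timer on the CPU never runs in the model, which is the unbounded-deferral concern raised elsewhere in the thread.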
Re: [PATCH] e1000e: Use deferrable timer for watchdog
Parag Warudkar wrote:
>
> Reduce wakeups from idle per second.
>
> Signed-off-by: Parag Warudkar <[EMAIL PROTECTED]>
>
> --- linux-2.6/drivers/net/e1000e/netdev.c 2007-12-07 10:04:39.0 -0500
> +++ linux-2.6-work/drivers/net/e1000e/netdev.c 2007-12-18 20:45:59.0 -0500
> @@ -3899,7 +3899,7 @@
> goto err_eeprom;
> }
>
> -init_timer(&adapter->watchdog_timer);
> +init_timer_deferrable(&adapter->watchdog_timer);
> adapter->watchdog_timer.function = &e1000_watchdog;
> adapter->watchdog_timer.data = (unsigned long) adapter;

I can't even apply this patch and the e1000 one... not only is it whitespace damaged it is also not properly formatted as patch at all. If you want me to take these patches seriously, then please fix the formatting issues.

Auke
Re: [PATCH] e1000: Use deferrable timer for watchdog
Parag Warudkar wrote:
> On Dec 19, 2007 4:38 PM, Kok, Auke <[EMAIL PROTECTED]> wrote:
>> Parag Warudkar wrote:
>>> On 12/19/07, Kok, Auke <[EMAIL PROTECTED]> wrote:
>> why would this patch reduce wakeups even more than round_jiffies()?
>> Does it make our ~2 second update interval not reliable? can you
>> quantify "shows it reduces" ?
>> Our timer only runs once every two seconds...
>
> Without the patch - here is what powertop reports steady on my desktop -
>
> Wakeups-from-idle per second : 8.5 interval: 1.9s
> no ACPI power usage estimate available
>
> Top causes for wakeups:
>   28.6% ( 4.0)           : clocksource_register (clocksource_watchdog)
>   14.3% ( 2.0) automount : futex_wait (hrtimer_wakeup)
>   14.3% ( 2.0) ntpd      : do_setitimer (it_real_fn)
>   14.3% ( 2.0) ntpdate   : do_adjtimex (sync_cmos_clock)
>    7.1% ( 1.0)           : PS/2 keyboard/mouse/touchpad
>    7.1% ( 1.0)           : eth0
>    7.1% ( 1.0) ip        : e1000_intr_msi (e1000_watchdog)
>
> $> stop network; rmmod e1000e
> $> patch e1000e/netdev.c ; rebuild ; insmod
> $> Wait for things to settle
>
> With the patch here is what it shows steadily -
>
> Wakeups-from-idle per second : 7.5 interval: 5.8s
> no ACPI power usage estimate available
>
> Top causes for wakeups:
>   32.4% ( 2.2)           : clocksource_register (clocksource_watchdog)
>   17.6% ( 1.2) ntpd      : do_setitimer (it_real_fn)
>   14.7% ( 1.0) ntpdate   : do_adjtimex (sync_cmos_clock)
>    8.8% ( 0.6)           : eth0
>    5.9% ( 0.4) events/1  : __netdev_watchdog_up (dev_watchdog)
>    5.9% ( 0.4)           : neigh_table_init_no_netlink (neigh_periodic_
>    5.9% ( 0.4)           : neigh_table_init_no_netlink (neigh_periodic_timer)
>
> So no longer e1000_watchdog is waking up the CPU for its own sake - it
> still runs but when the CPU is already out of IDLE to run something
> else that needs to be run undeferred.
> Wakeups from IDLE are down by 1 - from 8.5 to 7.5 .
>
>> maybe I just don't understand the effect of timer_set_deferrable() -
>> we're already deferring it ourselves when we want to. If that is not
>> working then I suggest that we fix that first instead of postponing the
>> critical first run of the e1000 watchdog task.
>
> There is of course a difference between round_jiffies() and
> timer_set_deferrable() if that's what you were referring to.
> round_jiffies() will make the timer run at whatever rounded value no
> matter if the CPU is already IDLE or not. Making the timer deferrable
> makes it run only when the CPU is NOT IDLE - that is to say it is busy
> running something else - another non-deferrable timer for instance.
>
>> People in the datacenter really don't want to see more delays when
>> bringing up link, and we get frequent calls about it already being long
>> on gigabit (not even minding spanning tree). Adding 25% to that time
>> isn't going to go down very nicely with them.
>>
> Well but when the machine is coming up the CPU is not going to be IDLE
> and your initial timer will likely run when it wants to - i.e.
> deferrable timers won't be deferred if the CPU is not IDLE.
> On the other hand Data center people do care about power consumption
> and they would much rather make sure they don't lose network links on
> Production boxes - so a properly configured machine/network should not
> need to bring up the link more than a small number of times if at all.
> Lastly e1000 is also sold with many desktop machines (like mine) and
> those people will surely appreciate fewer wakeups.
>
> I don't have GigE connection where my desktop is located and with
> 100Mbps I don't notice any measurable delay in bringing up the link -
> maybe you could try with this patch and see exactly how much longer if
> at all it takes to bring up the link on a GigE connected machine.

OK, I think that would be an interesting venture and I'm willing to see if I can get those numbers.

I'm just wondering if round_jiffies() is largely obsolete because of this. It might just make things worse

Auke
Re: [PATCH] e1000: Use deferrable timer for watchdog
Parag Warudkar wrote:
> On 12/19/07, Kok, Auke <[EMAIL PROTECTED]> wrote:
> [snip]
>
>> I can't possibly see any benefit from this other than that you just add
>> up to a whole second to the initialization cycle, which is bad.
>>
> Well, Ok but it can't be bad - I've been using this patch sometime and
> haven't seen any problem at all and powertop shows it reduces the
> wakeups-from-idle.
>
> But whatever - no big deal since it already uses round_jiffies().

why would this patch reduce wakeups even more than round_jiffies()? Does it make our ~2 second update interval not reliable? can you quantify "shows it reduces" ? Our timer only runs once every two seconds...

maybe I just don't understand the effect of timer_set_deferrable() - we're already deferring it ourselves when we want to. If that is not working then I suggest that we fix that first instead of postponing the critical first run of the e1000 watchdog task.

People in the datacenter really don't want to see more delays when bringing up link, and we get frequent calls about it already being long on gigabit (not even minding spanning tree). Adding 25% to that time isn't going to go down very nicely with them.
Re: [PATCH] e1000: Use deferrable timer for watchdog
Parag Warudkar wrote:
>
> Use deferrable timer for watchdog. Reduces wakeups from idle per second.

no, we don't want this. We already allow the re-scheduling of the watchdog to be round_jiffies() modified so that it coincides with other interrupts.

but at load time we don't want the timer to be postponed at all for up to a whole second. Since we're doing all sorts of initialization work anyway this is counterproductive, and you really want the link to come up as soon as possible, which is exactly what the watchdog handles when it runs first.

I can't possibly see any benefit from this other than that you just add up to a whole second to the initialization cycle, which is bad.

Auke

>
> Signed-off-by: Parag Warudkar <[EMAIL PROTECTED]>
>
> --- linux-2.6/drivers/net/e1000/e1000_main.c 2007-12-07 10:04:39.0 -0500
> +++ linux-2.6-work/drivers/net/e1000/e1000_main.c 2007-12-18 20:38:38.0 -0500
> @@ -1030,7 +1030,7 @@
> adapter->tx_fifo_stall_timer.function = &e1000_82547_tx_fifo_stall;
> adapter->tx_fifo_stall_timer.data = (unsigned long) adapter;
>
> -init_timer(&adapter->watchdog_timer);
> +init_timer_deferrable(&adapter->watchdog_timer);
> adapter->watchdog_timer.function = &e1000_watchdog;
> adapter->watchdog_timer.data = (unsigned long) adapter;
>
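Editor's sketch of why rounding expiries makes timers "coincide", as described in the reply above. This is a deliberately simplified userspace model and not the kernel's round_jiffies() implementation: it always rounds up to the next whole second and ignores any other details the in-kernel helper may handle. TOY_HZ and both function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HZ 1000UL	/* assumed tick rate for this sketch: 1000 ticks/s */

/* Round an expiry (in ticks) up to the next whole-second boundary. */
static unsigned long toy_round_up_to_second(unsigned long expires)
{
	unsigned long rem = expires % TOY_HZ;

	return rem ? expires - rem + TOY_HZ : expires;
}

/*
 * Count the distinct instants at which the CPU must wake up for the
 * given expiries, with or without rounding. Quadratic, which is fine
 * for a handful of timers in a sketch.
 */
static size_t toy_distinct_wakeups(const unsigned long *expires, size_t n,
				   int rounded)
{
	size_t i, j, distinct = 0;

	assert(expires != NULL || n == 0);

	for (i = 0; i < n; i++) {
		unsigned long a = rounded ?
			toy_round_up_to_second(expires[i]) : expires[i];
		int seen = 0;

		/* has an earlier timer already claimed this instant? */
		for (j = 0; j < i; j++) {
			unsigned long b = rounded ?
				toy_round_up_to_second(expires[j]) : expires[j];
			if (a == b) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			distinct++;
	}
	return distinct;
}
```

Three timers due at ticks 1200, 1650 and 1990 need three separate wakeups as they stand, but only one once each is rounded to tick 2000. Unlike a deferrable timer, a rounded timer still forces its own wakeup if nothing else is due at that boundary.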
Re: [PATCH] e1000e: Use deferrable timer for watchdog
Parag Warudkar wrote:
>
> Reduce wakeups from idle per second.
>
> Signed-off-by: Parag Warudkar <[EMAIL PROTECTED]>
>
> --- linux-2.6/drivers/net/e1000e/netdev.c 2007-12-07 10:04:39.0 -0500
> +++ linux-2.6-work/drivers/net/e1000e/netdev.c 2007-12-18 20:45:59.0 -0500
> @@ -3899,7 +3899,7 @@
> goto err_eeprom;
> }
>
> -init_timer(&adapter->watchdog_timer);
> +init_timer_deferrable(&adapter->watchdog_timer);
> adapter->watchdog_timer.function = &e1000_watchdog;
> adapter->watchdog_timer.data = (unsigned long) adapter;
>

see my reply to "Re: [PATCH] e1000: Use deferrable timer for watchdog" - IOW no, we don't want this

Auke
Re: [PATCH] e1000: Use deferrable timer for watchdog
Parag Warudkar wrote:
> On 12/19/07, Kok, Auke <[EMAIL PROTECTED]> wrote:
> [snip]
>> I can't possibly see any benefit from this other than that you just add
>> up to a whole second to the initialization cycle, which is bad.
>
> Well, Ok but it can't be bad - I've been using this patch sometime and
> haven't seen any problem at all and powertop shows it reduces the
> wakeups-from-idle.
>
> But whatever - no big deal since it already uses round_jiffies().

why would this patch reduce wakeups even more than round_jiffies()? Does it
make our ~2 second update interval not reliable? can you quantify "shows it
reduces"? Our timer only runs once every two seconds...

maybe I just don't understand the effect of timer_set_deferrable() - we're
already deferring it ourselves when we want to. If that is not working then
I suggest that we fix that first instead of postponing the critical first
run of the e1000 watchdog task.

People in the datacenter really don't want to see more delays when bringing
up link, and we get frequent calls about it already being long on gigabit
(not even minding spanning tree). Adding 25% to that time isn't going to go
down very nicely with them.

Auke
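For readers following the round_jiffies() argument: the helper moves a timer
expiry onto a whole-second boundary so that unrelated timers tend to expire
on the same tick. The sketch below is a simplified userspace model, not the
kernel function. The name model_round_jiffies and the hz parameter are
invented for illustration (the kernel uses the HZ constant), and the real
helper's per-CPU skew and its refusal to return a time in the past are left
out.

```c
#include <assert.h>

/*
 * Simplified model of round_jiffies() rounding (illustrative only).
 * j  : requested expiry, in ticks
 * hz : ticks per second (hypothetical parameter standing in for HZ)
 */
unsigned long model_round_jiffies(unsigned long j, unsigned long hz)
{
	unsigned long rem;

	assert(hz > 0);
	rem = j % hz;

	/*
	 * An expiry that lands just after a second boundary (less than a
	 * quarter of a second past it) is rounded down; anything later is
	 * rounded up to the next whole second.
	 */
	if (rem < hz / 4)
		return j - rem;
	return j - rem + hz;
}
```

With hz = 1000, an expiry requested at tick 2300 moves to 3000 and one
requested at tick 2100 moves back to 2000. The timer still wakes the CPU at
that tick even if nothing else is due then, which is the difference from a
deferrable timer that the thread goes on to discuss.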
Re: [PATCH] e1000: Use deferrable timer for watchdog
Parag Warudkar wrote:
> On Dec 19, 2007 4:38 PM, Kok, Auke <[EMAIL PROTECTED]> wrote:
>> Parag Warudkar wrote:
>>> On 12/19/07, Kok, Auke <[EMAIL PROTECTED]> wrote:
>> why would this patch reduce wakeups even more than round_jiffies()? Does
>> it make our ~2 second update interval not reliable? can you quantify
>> "shows it reduces"? Our timer only runs once every two seconds...
>
> Without the patch - here is what powertop reports steady on my desktop -
>
> Wakeups-from-idle per second : 8.5     interval: 1.9s
> no ACPI power usage estimate available
>
> Top causes for wakeups:
>   28.6% ( 4.0)  kernel core : clocksource_register (clocksource_watchdog)
>   14.3% ( 2.0)  automount : futex_wait (hrtimer_wakeup)
>   14.3% ( 2.0)  ntpd : do_setitimer (it_real_fn)
>   14.3% ( 2.0)  ntpdate : do_adjtimex (sync_cmos_clock)
>    7.1% ( 1.0)  interrupt : PS/2 keyboard/mouse/touchpad
>    7.1% ( 1.0)  interrupt : eth0
>    7.1% ( 1.0)  ip : e1000_intr_msi (e1000_watchdog)
>
> $ stop network; rmmod e1000e
> $ patch e1000e/netdev.c ; rebuild ; insmod
> $ Wait for things to settle
>
> With the patch here is what it shows steadily -
>
> Wakeups-from-idle per second : 7.5     interval: 5.8s
> no ACPI power usage estimate available
>
> Top causes for wakeups:
>   32.4% ( 2.2)  kernel core : clocksource_register (clocksource_watchdog)
>   17.6% ( 1.2)  ntpd : do_setitimer (it_real_fn)
>   14.7% ( 1.0)  ntpdate : do_adjtimex (sync_cmos_clock)
>    8.8% ( 0.6)  interrupt : eth0
>    5.9% ( 0.4)  events/1 : __netdev_watchdog_up (dev_watchdog)
>    5.9% ( 0.4)  kernel core : neigh_table_init_no_netlink (neigh_periodic_
>    5.9% ( 0.4)  kernel module : neigh_table_init_no_netlink (neigh_periodic_timer)
>
> So no longer e1000_watchdog is waking up the CPU for its own sake - it
> still runs but when the CPU is already out of IDLE to run something else
> that needs to be run undeferred. Wakeups from IDLE are down by 1 - from
> 8.5 to 7.5.
>
>> maybe I just don't understand the effect of timer_set_deferrable() -
>> we're already deferring it ourselves when we want to. If that is not
>> working then I suggest that we fix that first instead of postponing the
>> critical first run of the e1000 watchdog task.
>
> There is of course a difference between round_jiffies() and
> timer_set_deferrable() if that's what you were referring to.
> round_jiffies() will make the timer run at whatever rounded value no
> matter if the CPU is already IDLE or not. Making the timer deferrable
> makes it run only when the CPU is NOT IDLE - that is to say it is busy
> running something else - another non-deferrable timer for instance.
>
>> People in the datacenter really don't want to see more delays when
>> bringing up link, and we get frequent calls about it already being long
>> on gigabit (not even minding spanning tree). Adding 25% to that time
>> isn't going to go down very nicely with them.
>
> Well but when the machine is coming up the CPU is not going to be IDLE
> and your initial timer will likely run when it wants to - i.e. deferrable
> timers won't be deferred if the CPU is not IDLE.
>
> On the other hand Data center people do care about power consumption and
> they would much rather make sure they don't lose network links on
> Production boxes - so a properly configured machine/network should not
> need to bring up the link more than a small number of times if at all.
>
> Lastly e1000 is also sold with many desktop machines (like mine) and
> those people will surely appreciate fewer wakeups.
>
> I don't have GigE connection where my desktop is located and with 100Mbps
> I don't notice any measurable delay in bringing up the link - maybe you
> could try with this patch and see exactly how much longer if at all it
> takes to bring up the link on a GigE connected machine.

OK, I think that would be an interesting venture and I'm willing to see if
I can get those numbers.

I'm just wondering if round_jiffies() is largely obsolete because of this.
It might just make things worse

Auke
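The distinction drawn in this message can be made concrete with a toy
model. The sketch below is userspace C written for illustration only; it is
not kernel code, and every name in it (wake_stats, model_simulate, the tick
values) is an assumption of this note. It treats the watchdog as a periodic
timer and counts how often an otherwise idle CPU has to wake: an ordinary
timer wakes the CPU whenever it is due, while a deferrable one waits until
some other event has already woken the CPU. The real kernel also services
deferrable timers on wakeups caused by interrupts, which the model folds
into the list of "other" events.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of one simulated run (all names here are illustrative). */
struct wake_stats {
	unsigned wakeups;       /* ticks at which the idle CPU had to wake */
	unsigned watchdog_runs; /* times the watchdog callback ran */
	unsigned max_delay;     /* worst lateness of the watchdog, in ticks */
};

/*
 * other[] : sorted, distinct ticks at which non-deferrable events fire
 * period  : watchdog interval in ticks; first expiry is at tick `period`
 * horizon : last tick simulated (inclusive)
 */
struct wake_stats model_simulate(const unsigned *other, size_t n,
				 unsigned period, unsigned horizon,
				 int deferrable)
{
	struct wake_stats s = { 0, 0, 0 };
	unsigned due = period;
	unsigned t;
	size_t i = 0;

	assert(period > 0);
	for (t = 0; t <= horizon; t++) {
		int wake = (i < n && other[i] == t);

		if (wake)
			i++;
		/* An ordinary timer wakes the CPU by itself when it is due. */
		if (!deferrable && t == due)
			wake = 1;
		if (!wake)
			continue;

		s.wakeups++;
		/* The CPU is awake: a watchdog that is due runs now. */
		if (t >= due) {
			s.watchdog_runs++;
			if (t - due > s.max_delay)
				s.max_delay = t - due;
			due = t + period; /* re-arm relative to "now" */
		}
	}
	return s;
}
```

With other events at ticks 500, 1500, 2500 and 3500, a period of 2000 and a
horizon of 4000, the ordinary watchdog yields 6 wakeups and runs twice on
time; the deferrable one yields 4 wakeups and runs once, 500 ticks late.
That lateness is the cost Auke is worried about for the first watchdog run,
and the saved wakeups are the benefit Parag measured with powertop.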
Re: [PATCH] PCI: fix for quirk_e100_interrupt()
Ivan Kokshaysky wrote:
> Check that the e100 is in the D0 power state. If it's not, it won't
> respond to MMIO accesses and we end up with master-abort machine
> checks on some platforms.
>
> Signed-off-by: Ivan Kokshaysky <[EMAIL PROTECTED]>

what kind of platform actually is doing this? It almost seems like
something is wrong with that platform's BIOS and I wonder if this
workaround should not be more general (IOW is it not just e100 that is
affected but other components as well?)

Auke

> ---
>  drivers/pci/quirks.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 26cc4dc..c8b2b9d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -1406,9 +1406,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NETMOS, PCI_ANY_ID, quirk_netmos);
>
>  static void __devinit quirk_e100_interrupt(struct pci_dev *dev)
>  {
> -	u16 command;
> +	u16 command, pmcsr;
>  	u8 __iomem *csr;
>  	u8 cmd_hi;
> +	int pm;
>
>  	switch (dev->device) {
>  	/* PCI IDs taken from drivers/net/e100.c */
> @@ -1442,6 +1443,17 @@ static void __devinit quirk_e100_interrupt(struct pci_dev *dev)
>  	if (!(command & PCI_COMMAND_MEMORY) || !pci_resource_start(dev, 0))
>  		return;
>
> +	/*
> +	 * Check that the device is in the D0 power state. If it's not,
> +	 * there is no point to look any further.
> +	 */
> +	pm = pci_find_capability(dev, PCI_CAP_ID_PM);
> +	if (pm) {
> +		pci_read_config_word(dev, pm + PCI_PM_CTRL, &pmcsr);
> +		if ((pmcsr & PCI_PM_CTRL_STATE_MASK) != PCI_D0)
> +			return;
> +	}
> +
>  	/* Convert from PCI bus to resource space.  */
>  	csr = ioremap(pci_resource_start(dev, 0), 8);
>  	if (!csr) {
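The check added by this patch comes down to masking the low two bits of the
PMCSR register. The following userspace sketch shows that decoding in
isolation, assuming the PMCSR value has already been read from config
space. The mask value mirrors PCI_PM_CTRL_STATE_MASK from the kernel's
pci_regs.h; the MODEL_* names and both functions are invented for this
note and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Low two bits of PMCSR encode the power state (as in pci_regs.h). */
#define MODEL_PM_CTRL_STATE_MASK 0x0003

/* Illustrative names for the four encodable states. */
enum model_pci_power {
	MODEL_D0 = 0,
	MODEL_D1 = 1,
	MODEL_D2 = 2,
	MODEL_D3HOT = 3
};

/* Extract the power state field from a raw PMCSR value. */
enum model_pci_power model_pmcsr_state(uint16_t pmcsr)
{
	return (enum model_pci_power)(pmcsr & MODEL_PM_CTRL_STATE_MASK);
}

/*
 * Mirrors the early return in the quirk: MMIO is attempted only when the
 * device reports D0. A device without a PM capability (has_pm_cap == 0)
 * is treated as accessible, as in the patch.
 */
int model_mmio_allowed(int has_pm_cap, uint16_t pmcsr)
{
	if (!has_pm_cap)
		return 1;
	return model_pmcsr_state(pmcsr) == MODEL_D0;
}

/* Small usage example; the PMCSR values are made up. */
void model_examples(void)
{
	assert(model_mmio_allowed(1, 0x0000));  /* D0: go ahead */
	assert(!model_mmio_allowed(1, 0x0003)); /* D3hot: skip the quirk */
	assert(model_mmio_allowed(1, 0x8100));  /* other bits set, still D0 */
}
```

Only the state field matters for the decision, so other PMCSR bits being
set does not change the outcome.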
Re: [RFC] net: napi fix
David Miller wrote:
> From: Andrew Gallatin <[EMAIL PROTECTED]>
> Date: Thu, 13 Dec 2007 09:13:54 -0500
>
>> If the netif_running() check is indeed required to make a device break
>> out of napi polling and respond to an ifconfig down, then I think the
>> netif_running() check should be moved up into net_rx_action() to avoid
>> potential for driver complexity and bugs like the ones you found.
>
> That, or something like it, definitely sounds reasonable and much
> better than putting the check into every driver :-)

hear hear!

Auke