IPsec IPv4 over IPv6 Problem: No route to host
I have a problem with IPsec IPv4 over IPv6, which is included in kernel 2.6.21.

An IPv6 global address is assigned to eth1 by DHCPv6 IA-PD. In this case the IPsec SA is established successfully, but packets cannot be sent to the WAN. When I ping, this error occurs:

$ ping: sendmsg: No route to host

Does anyone know about this issue?

The network layout is:

LAN --- (eth1) GW-1 (eth0) --- WAN --- (eth0) GW-2 (eth1) --- LAN

・GW-1
eth0  Link encap:Ethernet  HWaddr 00:11:43:AC:60:AF
      inet6 addr: fe80::211:43ff:feac:60af/64 Scope:Link
eth1  Link encap:Ethernet  HWaddr 00:90:CC:DE:8B:EE
      inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
      inet6 addr: 2001:1::290:ccff:fede:8bee/64 Scope:Global
      inet6 addr: fe80::290:ccff:fede:8bee/64 Scope:Link

・GW-2
eth0  Link encap:Ethernet  HWaddr 00:11:43:AB:00:8A
      inet6 addr: fe80::211:43ff:feab:8a/64 Scope:Link
eth1  Link encap:Ethernet  HWaddr 00:90:CC:DE:89:F7
      inet addr:192.168.2.1  Bcast:192.168.2.255  Mask:255.255.255.0
      inet6 addr: 2001:2::290:ccff:fede:89f7/64 Scope:Global
      inet6 addr: fe80::290:ccff:fede:89f7/64 Scope:Link

After the IKE daemon (racoon2) is started on GW-1 and GW-2, I ping from GW-1:

$ ping -I 192.168.1.1 192.168.2.1

In this case the IKE sequence completes successfully and the IPsec SA is registered, but the ping packet is not sent and the error above is reported.

When an IPv6 global address is assigned to eth0 manually, the IPv4 over IPv6 packet is sent successfully. ping6 between GW-1 and GW-2 is okay.

I suspect that the IPv6 routing table is not loaded or is invalid. I also tried assigning IPv6 global addresses to eth0 manually with a different prefix on GW-1/eth0 than on GW-2/eth0 (with appropriate routing set); in that case the IPv4 over IPv6 packet is not sent successfully.

Toshiyuki Okamoto (okamoso at gmail.com)
-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.20-2.6.21 - networking dies after random time
On Mon, Aug 06, 2007 at 05:19:03PM -0400, Chuck Ebbert wrote:
> On 08/06/2007 04:42 PM, Jean-Baptiste Vignaud wrote:
> > Mmm, bad news, after 4 hours of intensive network stressing, one of the 2 3com cards failed with the latest fedora kernel.
> >
> > Aug 6 22:31:09 loki kernel: NETDEV WATCHDOG: eth2: transmit timed out
> > Aug 6 22:31:09 loki kernel: eth2: transmit timed out, tx_status 00 status e601.
> > Aug 6 22:31:09 loki kernel: diagnostics: net 0ccc media 8880 dma 003a fifo 8000
> > Aug 6 22:31:09 loki kernel: eth2: Interrupt posted but not delivered -- IRQ blocked by another device?
> > Aug 6 22:31:09 loki kernel: Flags; bus-master 1, dirty 26085000(8) current 26085000(8)
> > Aug 6 22:31:09 loki kernel: Transmit list vs. 81007c807700.
> >
> > Stressing eth2 by copying large files on a samba share and eth0 by downloading big files on the internet.
>
> So even the full revert doesn't fix the 3Com driver, it just makes it less likely to do that. The other patch probably won't be any better -- I'd guess there's some kind of IRQ handling bug in that driver.

I don't know how fast these 3com chips are compared with the 8390 described by Alan, or how irqs are shared on Jean-Baptiste's box, but I'm surprised they could have worked with shared interrupts and without such timeouts before this change in 2.6.21. It seems some of those older chips, because of slowness, could have transmit problems even without irq sharing. So, IMHO, if possible, irq sharing should never be enabled between two (or more) drivers that both use disable_irq.

These timeout problems were reported a long time ago, but I think it would be nice if this thread could at least remove the new problems reported only after 2.6.21, which seems possible now, after Marcin's diagnosis: by reverting the whole 2.6.21 patch or by the current temporary patch in 2.6.23-rc2's resend.c. It would be nice if you could try this patch too.

BTW: Jean-Babtiste, could you send or point to your current configs? I mean at least proc/interrupts, but with dmesg and .config it would be even better. (I assume this last report was about the revert patch mentioned by Chuck, not the one below your message?)

Regards,
Jarek P.
Re: 2.6.20-2.6.21 - networking dies after random time
2007/8/6, Ingo Molnar [EMAIL PROTECTED]:
> (..)
> please try Jarek's second patch too - there was a missing unmask.
>
> Ingo
>
> --
> Subject: genirq: fix simple and fasteoi irq handlers
> From: Jarek Poplawski [EMAIL PROTECTED]
>
> After the "genirq: do not mask interrupts by default" patch, interrupts should be disabled not immediately upon request, but after they happen. But handle_simple_irq() and handle_fasteoi_irq() can skip this once or more if an irq is just being serviced (IRQ_INPROGRESS), possibly disrupting a driver's work.
>
> Finding the main reason for the problems here, pointing out the broken patch and making the first patch which can fix this was done by Marcin Slusarz. Additional test patches from Thomas Gleixner and Ingo Molnar, tested by Marcin Slusarz, helped to narrow the possible reasons even more. Thanks.
>
> PS: this patch fixes only one evident error here, but there could be more places affected by the above-mentioned change in irq handling.
>
> PS 2: After rethinking, IMHO, there are two most probable scenarios here:
>
> 1. After hw resend there could be a conflict between a retriggered edge type irq and the next level type one: e.g. if this level type irq (io_apic is enabled then) is triggered while the retriggered irq is serviced (IRQ_INPROGRESS) there is goto out with eoi, and probably the next such levels are triggered and looping, so probably a kind of flood in io_apic until this retriggered edge service has ended.
>
> 2. There is something wrong with ioapic_retrigger_irq (less probable because this should probably be seen with 'normal' edge retriggers, but on the other hand, they could be less common).
>
> So, if it is #1, this fixed patch should work. But, since level types don't need these retriggers too much, I think this "don't mask interrupts by default" idea should be rethought: is there enough gain to risk such hard to diagnose errors? So, IMHO, there should be at least a possibility to turn this off for level types in config (it should be a visible option, so people could find and try this before writing for help or changing a network card).
>
> Signed-off-by: Jarek Poplawski [EMAIL PROTECTED]
>
> ---
>
> diff -Nurp 2.6.23-rc1-/kernel/irq/chip.c 2.6.23-rc1/kernel/irq/chip.c
> --- 2.6.23-rc1-/kernel/irq/chip.c	2007-07-09 01:32:17.0 +0200
> +++ 2.6.23-rc1/kernel/irq/chip.c	2007-08-05 21:49:46.0 +0200
> @@ -295,12 +295,11 @@ handle_simple_irq(unsigned int irq, stru
>
>  	spin_lock(&desc->lock);
>
> -	if (unlikely(desc->status & IRQ_INPROGRESS))
> -		goto out_unlock;
>  	kstat_cpu(cpu).irqs[irq]++;
>
>  	action = desc->action;
> -	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> +	if (unlikely(!action || (desc->status & (IRQ_INPROGRESS |
> +						 IRQ_DISABLED)))) {
>  		if (desc->chip->mask)
>  			desc->chip->mask(irq);
>  		desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
> @@ -318,6 +317,8 @@ handle_simple_irq(unsigned int irq, stru
>
>  	spin_lock(&desc->lock);
>  	desc->status &= ~IRQ_INPROGRESS;
> +	if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
> +		desc->chip->unmask(irq);
>  out_unlock:
>  	spin_unlock(&desc->lock);
>  }
> @@ -392,18 +393,16 @@ handle_fasteoi_irq(unsigned int irq, str
>
>  	spin_lock(&desc->lock);
>
> -	if (unlikely(desc->status & IRQ_INPROGRESS))
> -		goto out;
> -
>  	desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
>  	kstat_cpu(cpu).irqs[irq]++;
>
>  	/*
> -	 * If its disabled or no action available
> +	 * If it's running, disabled or no action available
>  	 * then mask it and get out of here:
>  	 */
>  	action = desc->action;
> -	if (unlikely(!action || (desc->status & IRQ_DISABLED))) {
> +	if (unlikely(!action || (desc->status & (IRQ_INPROGRESS |
> +						 IRQ_DISABLED)))) {
>  		desc->status |= IRQ_PENDING;
>  		if (desc->chip->mask)
>  			desc->chip->mask(irq);
> @@ -420,6 +419,8 @@ handle_fasteoi_irq(unsigned int irq, str
>
>  	spin_lock(&desc->lock);
>  	desc->status &= ~IRQ_INPROGRESS;
> +	if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
> +		desc->chip->unmask(irq);
>  out:
>  	desc->chip->eoi(irq);

Network card still locks up (tested on 2.6.22.1). I had to upload more data than usual (~350 MB vs ~1-100 MB) to trigger that bug but it might be a coincidence...

Marcin
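For readers following the patch description above, here is a minimal userspace sketch of the condition it changes in handle_simple_irq(). It is not kernel code: the flag values are illustrative placeholders, and old_handler_masks(), new_handler_masks() and self_check() are hypothetical helpers that only model whether the handler takes the "mask the line and bail out" branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the real ones live in the kernel headers. */
#define IRQ_INPROGRESS 0x1u
#define IRQ_DISABLED   0x2u

/*
 * Before the patch: an irq that is already being serviced leaves the
 * handler early (goto out_unlock) and never reaches the masking branch.
 */
bool old_handler_masks(unsigned int status, bool has_action)
{
	if (status & IRQ_INPROGRESS)
		return false;
	return !has_action || (status & IRQ_DISABLED);
}

/*
 * After the patch: IRQ_INPROGRESS is tested together with IRQ_DISABLED,
 * so an in-progress irq is masked like a disabled one.
 */
bool new_handler_masks(unsigned int status, bool has_action)
{
	return !has_action || (status & (IRQ_INPROGRESS | IRQ_DISABLED));
}

/* Small self-check that can be called from any main(). */
void self_check(void)
{
	assert(!old_handler_masks(IRQ_INPROGRESS, true));
	assert(new_handler_masks(IRQ_INPROGRESS, true));
}
```

The only case where the two versions differ is an irq that arrives while IRQ_INPROGRESS is set, which is the situation the patch description singles out. As Marcin's report shows, changing this branch alone did not cure the lockups.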
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> 2007/8/6, Ingo Molnar [EMAIL PROTECTED]:
> > (..)
> > please try Jarek's second patch too - there was a missing unmask.
> >
> > Ingo
> >
> > --
> > Subject: genirq: fix simple and fasteoi irq handlers
> > From: Jarek Poplawski [EMAIL PROTECTED]
> ...
> Network card still locks up (tested on 2.6.22.1). I had to upload more data than usual (~350 MB vs ~1-100 MB) to trigger that bug but it might be a coincidence...

Thanks! It's good news after all - otherwise it would be really strange that this place doesn't hit more people (it seems there is some safety elsewhere for this).

BTW: I hope Thomas' previous patch to resend.c (with Ingo's warning) had no problems with a similar load?

So, once more, I would suspect the hw retrigger code. Ingo, IMHO, this patch for testing HARDIRQS_SW_RESEND could be reworked, so that desc->chip->retrigger() is done only for edges and the tasklet only for levels. BTW, I think the current warning in the temporary patch is too early - we don't know if after this the ->retrigger() will take place.

Regards,
Jarek P.

PS: Marcin, if you need a break in this testing let us know! I think the main idea of this bug is known well enough.
Re: 2.6.20-2.6.21 - networking dies after random time
> BTW: Jean-Babtiste, could you send or point to you current configs? I mean at least proc/interrupts, but with dmesg and .config it would be even better. (I assume this last report was about the revert patch mentioned by Chuck, not the one below your message?)

Sure. Last reports are with the 2.6.22.1-41.fc7 kernel, which has in changelog :

* Sat Jul 28 2007 Chuck Ebbert [EMAIL PROTECTED]
- revert upstream "genirq: do not mask interrupts by default"

* interrupts (i use irqbalance, but problem was the same without)

[EMAIL PROTECTED] ~]# cat /proc/interrupts
      CPU0 CPU1
   0: 44874910668 IO-APIC-edge timer
   1: 241 58 IO-APIC-edge i8042
   8: 0 0 IO-APIC-edge rtc0
   9: 0 0 IO-APIC-fasteoi acpi
  12: 2139 IO-APIC-edge i8042
  14: 0 0 IO-APIC-edge libata
  15: 0 0 IO-APIC-edge libata
  16: 72625 96 IO-APIC-fasteoi eth1
  17: 4667128 IO-APIC-fasteoi eth2
  20: 4156 39870 IO-APIC-fasteoi sata_nv
  21: 34794 9177 IO-APIC-fasteoi sata_nv
  22: 0 0 IO-APIC-fasteoi ehci_hcd:usb2
  23: 6005 1565 IO-APIC-fasteoi ohci_hcd:usb1, sata_nv
2297: 3 492180 PCI-MSI-edge eth0
 NMI: 0 0
 LOC:49153454915282
 ERR: 0

problems are with eth1 and eth2 here. never had any problems with the onboard (eth0).

* pci

00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a1)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a2)
00:01.2 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:06.0 PCI bridge: nVidia Corporation MCP55 PCI bridge (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a2)
00:0a.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0c.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0d.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0e.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0f.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:06.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
01:07.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
07:00.0 VGA compatible controller: nVidia Corporation NV44 [GeForce 6200 LE] (rev a1)

* dmesg (from a reboot this morning)

Linux version 2.6.22.1-41.fc7 ([EMAIL PROTECTED]) (gcc version 4.1.2 20070502 (Red Hat 4.1.2-12)) #1 SMP Fri Jul 27 18:21:43 EDT 2007
Command line: ro root=/dev/all/root
BIOS-provided physical RAM map:
 BIOS-e820: - 0009f000 (usable)
 BIOS-e820: 0009f000 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 7fee (usable)
 BIOS-e820: 7fee - 7fee3000 (ACPI NVS)
 BIOS-e820: 7fee3000 - 7fef (ACPI data)
 BIOS-e820: 7fef - 7ff0 (reserved)
 BIOS-e820: f000 - f400 (reserved)
 BIOS-e820: fec0 - 0001 (reserved)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 524000) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.4 present.
ACPI: RSDP 000F7620, 0024 (r2 Nvidia)
ACPI: XSDT 7FEE30C0, 0044 (r1 Nvidia ASUSACPI 42302E31 AWRD0)
ACPI: FACP 7FEEC400, 00F4 (r3 Nvidia ASUSACPI 42302E31 AWRD0)
ACPI: DSDT 7FEE3240, 9164 (r1 NVIDIA AWRDACPI 1000 MSFT 300)
ACPI: FACS 7FEE, 0040
ACPI: HPET 7FEEC600, 0038 (r1 Nvidia ASUSACPI 42302E31 AWRD 98)
ACPI: MCFG 7FEEC680, 003C (r1 Nvidia ASUSACPI 42302E31 AWRD0)
ACPI: APIC 7FEEC540, 007C (r1 Nvidia ASUSACPI 42302E31 AWRD0)
Scanning NUMA topology in Northbridge 24
No NUMA configuration found
Faking a node at -7fee
Entering add_active_range(0, 0, 159)
Re: 2.6.20-2.6.21 - networking dies after random time
> > * interrupts (i use irqbalance, but problem was the same without)
>
> I wonder if you tried without SMP too?

No, I did not. Do you think that this can be a problem? To test with no SMP, do I need to recompile the kernel or is there a kernel parameter?

> BTW, Jean-Baptiste and Chuck - it seems, unless you have too much time, there is no use for testing my "genirq: fix simple and fasteoi irq handlers" patch.

Well, I just tested 2.6.23-rc1 with your patch and copied (using smbclient) big files:

Aug 7 11:11:53 loki kernel: NETDEV WATCHDOG: eth2: transmit timed out
Aug 7 11:11:53 loki kernel: eth2: transmit timed out, tx_status 00 status e601.
Aug 7 11:11:53 loki kernel: diagnostics: net 0ccc media 8880 dma 003a fifo
Aug 7 11:11:53 loki kernel: eth2: Interrupt posted but not delivered -- IRQ blocked by another device?
Aug 7 11:11:53 loki kernel: Flags; bus-master 1, dirty 93481(9) current 93481(9)
Aug 7 11:11:53 loki kernel: Transmit list vs. 81007be977a0.
Aug 7 11:11:53 loki kernel: 0: @81007be97200 length 805f status 0001005f
Aug 7 11:11:53 loki kernel: 1: @81007be972a0 length 805f status 0001005f
Aug 7 11:11:53 loki kernel: 2: @81007be97340 length 805f status 0001005f
Aug 7 11:11:53 loki kernel: 3: @81007be973e0 length 805f status 0001005f
Aug 7 11:11:53 loki kernel: 4: @81007be97480 length 803c status 0001003c
Aug 7 11:11:53 loki kernel: 5: @81007be97520 length 803c status 0001003c
Aug 7 11:11:53 loki kernel: 6: @81007be975c0 length 803c status 0001003c
Aug 7 11:11:53 loki kernel: 7: @81007be97660 length 803c status 8001003c
Aug 7 11:11:53 loki kernel: 8: @81007be97700 length 803c status 8001003c
Aug 7 11:11:53 loki kernel: 9: @81007be977a0 length 802a status 0001002a
Aug 7 11:11:53 loki kernel: 10: @81007be97840 length 803a status 0001003a
Aug 7 11:11:53 loki kernel: 11: @81007be978e0 length 805f status 0001005f
Aug 7 11:11:53 loki kernel: 12: @81007be97980 length 80be status 0c0100be
Aug 7 11:11:53 loki kernel: 13: @81007be97a20 length 80be status 0c0100be
Aug 7 11:11:53 loki kernel: 14: @81007be97ac0 length 805f status 0001005f
Aug 7 11:11:53 loki kernel: 15: @81007be97b60 length 805f status 0001005f

Thanks;
Jb
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Aug 07, 2007 at 10:10:34AM +0200, Jean-Baptiste Vignaud wrote:
> > BTW: Jean-Babtiste, could you send or point to you current configs?

Oops! I'm very sorry for the misspelling!

> > I mean at least proc/interrupts, but with dmesg and .config it would be even better. (I assume this last report was about the revert patch mentioned by Chuck, not the one below your message?)
>
> Sure. Last reports are with the 2.6.22.1-41.fc7 kernel, which has in changelog :
> * Sat Jul 28 2007 Chuck Ebbert [EMAIL PROTECTED]
> - revert upstream "genirq: do not mask interrupts by default"
>
> * interrupts (i use irqbalance, but problem was the same without)

I wonder if you tried without SMP too?

> [EMAIL PROTECTED] ~]# cat /proc/interrupts
>       CPU0 CPU1
> ...
>   16: 72625 96 IO-APIC-fasteoi eth1
>   17: 4667128 IO-APIC-fasteoi eth2
>   20: 4156 39870 IO-APIC-fasteoi sata_nv
>   21: 34794 9177 IO-APIC-fasteoi sata_nv
>   22: 0 0 IO-APIC-fasteoi ehci_hcd:usb2
>   23: 6005 1565 IO-APIC-fasteoi ohci_hcd:usb1, sata_nv
> 2297: 3 492180 PCI-MSI-edge eth0
>  NMI: 0 0
>  LOC:49153454915282
>  ERR: 0

So, here it's not about irq sharing...

> problems are with eth1 and eth2 here. never had any problems with the onboard (eth0).
...
> * .config
> i dont have it, it was the standard fedora one.
>
> i'm not sure that the problem is related to 3com, because i replaced those cards by old card i had in spare :
> 01:06.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 42)
> 01:07.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
> and i had the exact same problem.
>
> Those 3com cards were working 24/24 before i went to fedora 7 (and kernel 2.6.21 then).

It seems that from 2.6.21 the problems are mainly with 'older' network chips on x86_64. The reverted patch should matter only for drivers using disable_irq, but I see forcedeth could use this too, so it's not clear yet. And, btw., there were other changes around irqs and pci, so everybody could have something a bit different with similar timeout logs...

BTW, Jean-Baptiste and Chuck - it seems, unless you have too much time, there is no use for testing my "genirq: fix simple and fasteoi irq handlers" patch.

Thanks,
Jarek P.
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Aug 07, 2007 at 11:21:07AM +0200, Jean-Baptiste Vignaud wrote:
> > > * interrupts (i use irqbalance, but problem was the same without)
> >
> > I wonder if you tried without SMP too?
>
> No i did not. Do you think that this can be a problem ? To test with no SMP, do i need to recompile kernel or is there a kernel parameter ?

It's always better to exclude any complications if possible. Yes, there is a kernel parameter for this: nosmp. So, if you have some time to spare, I think 2.6.23-rc2 with nosmp could be an interesting option.

> > BTW, Jean-Baptiste and Chuck - it seems, unless you have too much time, there is no use for testing my "genirq: fix simple and fasteoi irq handlers" patch.
>
> Well i just tested 2.6.23-rc1 with your patch and copied (using smbclient) big files :
>
> Aug 7 11:11:53 loki kernel: NETDEV WATCHDOG: eth2: transmit timed out
> Aug 7 11:11:53 loki kernel: eth2: transmit timed out, tx_status 00 status e601.
> Aug 7 11:11:53 loki kernel: diagnostics: net 0ccc media 8880 dma 003a fifo
> Aug 7 11:11:53 loki kernel: eth2: Interrupt posted but not delivered -- IRQ blocked by another device?
> Aug 7 11:11:53 loki kernel: Flags; bus-master 1, dirty 93481(9) current 93481(9)
> Aug 7 11:11:53 loki kernel: Transmit list vs. 81007be977a0.
> Aug 7 11:11:53 loki kernel: 0: @81007be97200 length 805f status 0001005f
...

Thanks,
Jarek P.
[PATCH] net/core/utils: fix sparse warning
net_msg_warn is not defined because it is in net/sock.h which isn't included.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]

---
 net/core/utils.c |    1 +
 1 file changed, 1 insertion(+)

--- wireless-dev.orig/net/core/utils.c	2007-08-06 18:35:29.197838489 +0200
+++ wireless-dev/net/core/utils.c	2007-08-06 18:35:52.227838489 +0200
@@ -25,6 +25,7 @@
 #include <linux/random.h>
 #include <linux/percpu.h>
 #include <linux/init.h>
+#include <net/sock.h>

 #include <asm/byteorder.h>
 #include <asm/system.h>
[RFC] allow device to stop packet mirror behaviour
In the wireless code, we have special 802.11+radiotap framed virtual interfaces, mostly used to monitor traffic on the air. They also show outgoing packets from other virtual interfaces associated with the same PHY because you can't receive packets while sending. Due to the design of the virtual interfaces, the packets don't pass through those 802.11+radiotap framed interfaces.

This whole setup has the additional advantage that we are able to indicate the transmission status parameters via radiotap as well, meaning that we can tell (in userspace) by looking at the radiotap header whether a packet was acknowledged by the receiver, whether RTS/CTS was used etc. This is required for implementing more things in userspace which we plan to do in order to not have the high-complexity MLME in the kernel.

Now, however, we run into the situation that somebody is actually sending frames down the 802.11+radiotap framed interface. This could be the userspace MLME implementation, for example, sending association requests or whatever. These will now show up twice on the monitoring interface that the userspace MLME is using, once via dev_queue_xmit_nit() because they were sent on that interface and once via our own mirror mechanism that also shows the transmission status indication.

Andy has written a patch that suppresses our own mirror mechanism from becoming effective for packets that were already mirrored out by dev_queue_xmit_nit(), however this is not very desirable because it makes such packets special, their transmission status information will not be available; however, in some circumstances this information is required (for example when implementing an MLME using 802.11+radiotap framed interfaces.) [Also, the current implementation means that on yet another monitor interface you don't see those frames at all.]

The only way to solve this problem therefore seems to be to suppress the mirroring out of the packet by dev_queue_xmit_nit(). The patch below does that by way of adding a new netdev flag.

Comments welcome.

johannes

---
Tested with three monitor interfaces and a trivial injection program on bcm43xx-mac80211. I'll send the mac80211 patch I used as a follow-up.

 include/linux/if.h |    2 ++
 net/core/dev.c     |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

--- wireless-dev.orig/include/linux/if.h	2007-08-06 21:02:55.868164177 +0200
+++ wireless-dev/include/linux/if.h	2007-08-06 21:03:05.458164177 +0200
@@ -50,6 +50,8 @@

 #define IFF_LOWER_UP	0x1		/* driver signals L1 up		*/
 #define IFF_DORMANT	0x2		/* driver signals dormant	*/

+#define IFF_NO_MIRROR	0x4		/* driver will mirror packets	*/
+
 #define IFF_VOLATILE	(IFF_LOOPBACK|IFF_POINTOPOINT|IFF_BROADCAST|\
 		IFF_MASTER|IFF_SLAVE|IFF_RUNNING|IFF_LOWER_UP|IFF_DORMANT)
--- wireless-dev.orig/net/core/dev.c	2007-08-06 21:02:55.898164177 +0200
+++ wireless-dev/net/core/dev.c	2007-08-06 21:04:58.218164177 +0200
@@ -1417,7 +1417,7 @@ static int dev_gso_segment(struct sk_buf
 int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	if (likely(!skb->next)) {
-		if (!list_empty(&ptype_all))
+		if (!list_empty(&ptype_all) && !(dev->flags & IFF_NO_MIRROR))
 			dev_queue_xmit_nit(skb, dev);

 		if (netif_needs_gso(dev, skb)) {
@@ -2829,7 +2829,7 @@ int dev_change_flags(struct net_device *
 			       IFF_DYNAMIC | IFF_MULTICAST | IFF_PORTSEL |
 			       IFF_AUTOMEDIA)) |
 		     (dev->flags & (IFF_UP | IFF_VOLATILE | IFF_PROMISC |
-				    IFF_ALLMULTI));
+				    IFF_ALLMULTI | IFF_NO_MIRROR));

 	/*
 	 * Load in the correct multicast list now the flags have changed.
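As a rough illustration of the check this RFC adds to dev_hard_start_xmit(), the following userspace sketch models the decision whether a transmitted packet is handed to the registered taps. It is an assumption-laden model, not the patch itself: IFF_NO_MIRROR_DEMO is a placeholder value, and should_mirror_to_taps() and self_check() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit for illustration; not taken from include/linux/if.h. */
#define IFF_NO_MIRROR_DEMO 0x40000u

/*
 * Mirror a transmitted packet to packet taps only if at least one tap is
 * registered (the !list_empty(&ptype_all) test in the patch) and the
 * device has not opted out through the no-mirror flag.
 */
bool should_mirror_to_taps(bool have_taps, unsigned int dev_flags)
{
	return have_taps && !(dev_flags & IFF_NO_MIRROR_DEMO);
}

/* Small self-check that can be called from any main(). */
void self_check(void)
{
	assert(should_mirror_to_taps(true, 0));
	assert(!should_mirror_to_taps(true, IFF_NO_MIRROR_DEMO));
}
```

With the flag set, the driver's own mirror path becomes the only source of the frame on monitor interfaces, which is what avoids the duplicate described above.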
Re: [RFC] allow device to stop packet mirror behaviour
This is the corresponding patch to mac80211. It marks all monitor type interfaces with IFF_NO_MIRROR, the reasons for which I explained in the previous mail; it also marks the master device with IFF_NO_MIRROR so you don't see *any* packets on the master device (right now you see outgoing frames with 802.11 header.) We want to get rid of it anyway so making it a bit more useless yet seems like a good idea.

johannes

---
 net/mac80211/ieee80211.c       |    1 +
 net/mac80211/ieee80211_iface.c |    2 ++
 2 files changed, 3 insertions(+)

--- wireless-dev.orig/net/mac80211/ieee80211_iface.c	2007-08-06 21:14:37.398164177 +0200
+++ wireless-dev/net/mac80211/ieee80211_iface.c	2007-08-06 21:15:02.078164177 +0200
@@ -158,6 +158,7 @@ void ieee80211_if_set_type(struct net_de
 	int oldtype = sdata->type;

 	dev->hard_start_xmit = ieee80211_subif_start_xmit;
+	dev->flags &= ~IFF_NO_MIRROR;

 	sdata->type = type;
 	switch (type) {
@@ -216,6 +217,7 @@ void ieee80211_if_set_type(struct net_de
 	case IEEE80211_IF_TYPE_MNTR:
 		dev->type = ARPHRD_IEEE80211_RADIOTAP;
 		dev->hard_start_xmit = ieee80211_monitor_start_xmit;
+		dev->flags |= IFF_NO_MIRROR;
 		break;
 	default:
 		printk(KERN_WARNING "%s: %s: Unknown interface type 0x%x",
--- wireless-dev.orig/net/mac80211/ieee80211.c	2007-08-06 21:15:01.898164177 +0200
+++ wireless-dev/net/mac80211/ieee80211.c	2007-08-06 21:15:02.088164177 +0200
@@ -5138,6 +5138,7 @@ struct ieee80211_hw *ieee80211_alloc_hw(
 	mdev->stop = ieee80211_master_stop;
 	mdev->type = ARPHRD_IEEE80211;
 	mdev->hard_header_parse = header_parse_80211;
+	mdev->flags |= IFF_NO_MIRROR;

 	sdata->type = IEEE80211_IF_TYPE_AP;
 	sdata->dev = mdev;
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
> 2007/8/7, Jarek Poplawski [EMAIL PROTECTED]:
> > On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> > > Network card still locks up (tested on 2.6.22.1). I had to upload more data than usual (~350 MB vs ~1-100 MB) to trigger that bug but it might be a coincidence...
> >
> > Thanks! It's good news after all - otherwise it would be really strange that this place doesn't hit more people (it seems there is some safety elsewhere for this).
> >
> > BTW: I hope Thomas' previous patch to resend.c (with Ingo's warning) had no problems with a similar load?
>
> I always tested on 500-600 MB dataset
>
> > PS: Marcin, if you need a break in this testing let us know!
>
> No, i don't need a break. I'll have more time in next weeks.

Great! So, I'll try to send a patch with _SW_RESEND in a few hours, if Ingo doesn't prepare something for you.

Thanks,
Jarek P.
Re: 2.6.20-2.6.21 - networking dies after random time
On Mon, Aug 06, 2007 at 01:43:48PM -0400, Chuck Ebbert wrote:
> On 08/06/2007 03:03 AM, Ingo Molnar wrote:
> > > But, since level types don't need these retriggers too much, I think this "don't mask interrupts by default" idea should be rethought: is there enough gain to risk such hard to diagnose errors?
>
> I reverted those masking changes in Fedora and the baffling problem with 3Com 3C905 network adapters went away. Before, they would print:
>
> eth0: transmit timed out, tx_status 00 status e601.
> diagnostics: net 0ccc media 8880 dma 003a fifo
> eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
> Flags; bus-master 1, dirty 295757(13) current 295757(13)
> Transmit list vs. f7150a20.
> 0: @f7150200 length 8070 status 0c010070
> 1: @f71502a0 length 8070 status 0c010070
> 2: @f7150340 length 805c status 0c01005c
>
> Now they just work, apparently...
>
> So why not just revert the change?

Ingo has written about such a possibility. But it would be good to know precisely which place is to blame, as well. Since this diagnosing takes time, I think Chuck is right, and maybe at least the temporary patch for resend.c, without the warning, should be recommended for the stable kernels (2.6.21 and 2.6.22)?

Jarek P.
[PATCH] phy layer: fix phy_mii_ioctl for autonegotiation
Fix a thinko (?) in setting phydev->autoneg.

Signed-off-by: Domen Puncer [EMAIL PROTECTED]

---
This fixes my mii.h -> ethtool.h advertising #defines. I'm not sure why and how they're translated, but it does work now. Maybe they're just ignored, since mii-tool directly reads and writes MII registers.

 drivers/net/phy/phy.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: work-powerpc.git/drivers/net/phy/phy.c
===
--- work-powerpc.git.orig/drivers/net/phy/phy.c
+++ work-powerpc.git/drivers/net/phy/phy.c
@@ -261,7 +261,7 @@ void phy_sanitize_settings(struct phy_de

 	/* Sanitize settings based on PHY capabilities */
 	if ((features & SUPPORTED_Autoneg) == 0)
-		phydev->autoneg = 0;
+		phydev->autoneg = AUTONEG_DISABLE;

 	idx = phy_find_valid(phy_find_setting(phydev->speed, phydev->duplex),
 			features);
@@ -374,7 +374,7 @@ int phy_mii_ioctl(struct phy_device *phy
 		if (mii_data->phy_id == phydev->addr) {
 			switch(mii_data->reg_num) {
 			case MII_BMCR:
-				if (val & (BMCR_RESET|BMCR_ANENABLE))
+				if ((val & (BMCR_RESET|BMCR_ANENABLE)) == 0)
 					phydev->autoneg = AUTONEG_DISABLE;
 				else
 					phydev->autoneg = AUTONEG_ENABLE;
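To make the inverted test in the MII_BMCR case easier to see, here is a small userspace sketch that evaluates both versions of the condition. The constants are redefined locally with the values used by linux/mii.h and linux/ethtool.h so the block stands alone; autoneg_before_fix(), autoneg_after_fix() and self_check() are hypothetical helper names, not part of the PHY layer.

```c
#include <assert.h>

/* Local copies of the kernel constants, so the sketch is self-contained. */
#define BMCR_RESET      0x8000
#define BMCR_ANENABLE   0x1000
#define AUTONEG_DISABLE 0x00
#define AUTONEG_ENABLE  0x01

/* Original test: any of the two bits set wrongly selects "disabled". */
int autoneg_before_fix(unsigned int val)
{
	if (val & (BMCR_RESET | BMCR_ANENABLE))
		return AUTONEG_DISABLE;
	return AUTONEG_ENABLE;
}

/* Patched test: autonegotiation is disabled only when neither bit is set. */
int autoneg_after_fix(unsigned int val)
{
	if ((val & (BMCR_RESET | BMCR_ANENABLE)) == 0)
		return AUTONEG_DISABLE;
	return AUTONEG_ENABLE;
}

/* Small self-check that can be called from any main(). */
void self_check(void)
{
	assert(autoneg_before_fix(BMCR_ANENABLE) == AUTONEG_DISABLE);
	assert(autoneg_after_fix(BMCR_ANENABLE) == AUTONEG_ENABLE);
}
```

Writing a BMCR value with BMCR_ANENABLE set is the clearest case: the old condition recorded autonegotiation as disabled, while the patched one records it as enabled.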
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
> > 2007/8/7, Jarek Poplawski [EMAIL PROTECTED]:
> > > On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> > > > Network card still locks up (tested on 2.6.22.1). I had to upload more data than usual (~350 MB vs ~1-100 MB) to trigger that bug but it might be a coincidence...
> > >
> > > Thanks! It's good news after all - otherwise it would be really strange that this place doesn't hit more people (it seems there is some safety elsewhere for this).
> > >
> > > BTW: I hope Thomas' previous patch to resend.c (with Ingo's warning) had no problems with a similar load?
> >
> > I always tested on 500-600 MB dataset
> >
> > > PS: Marcin, if you need a break in this testing let us know!
> >
> > No, i don't need a break. I'll have more time in next weeks.
>
> Great! So, I'll try to send a patch with _SW_RESEND in a few hours, if Ingo doesn't prepare something for you.

So, let's try this idea: a modified version of Ingo's "x86: activate HARDIRQS_SW_RESEND" patch. (Don't forget to run make oldconfig before make.) For testing only.

Cheers,
Jarek P.

PS: alas, there was not even time for compile checking...

---

diff -Nurp 2.6.22.1-/arch/i386/Kconfig 2.6.22.1/arch/i386/Kconfig
--- 2.6.22.1-/arch/i386/Kconfig	2007-07-09 01:32:17.0 +0200
+++ 2.6.22.1/arch/i386/Kconfig	2007-08-07 13:13:03.0 +0200
@@ -1252,6 +1252,10 @@ config GENERIC_PENDING_IRQ
 	depends on GENERIC_HARDIRQS && SMP
 	default y

+config HARDIRQS_SW_RESEND
+	bool
+	default y
+
 config X86_SMP
 	bool
 	depends on SMP && !X86_VOYAGER
diff -Nurp 2.6.22.1-/arch/x86_64/Kconfig 2.6.22.1/arch/x86_64/Kconfig
--- 2.6.22.1-/arch/x86_64/Kconfig	2007-07-09 01:32:17.0 +0200
+++ 2.6.22.1/arch/x86_64/Kconfig	2007-08-07 13:13:03.0 +0200
@@ -690,6 +690,10 @@ config GENERIC_PENDING_IRQ
 	depends on GENERIC_HARDIRQS && SMP
 	default y

+config HARDIRQS_SW_RESEND
+	bool
+	default y
+
 menu "Power management options"

 source kernel/power/Kconfig
diff -Nurp 2.6.22.1-/kernel/irq/manage.c 2.6.22.1/kernel/irq/manage.c
--- 2.6.22.1-/kernel/irq/manage.c	2007-07-09 01:32:17.0 +0200
+++ 2.6.22.1/kernel/irq/manage.c	2007-08-07 13:13:03.0 +0200
@@ -169,6 +169,14 @@ void enable_irq(unsigned int irq)
 		desc->depth--;
 	}
 	spin_unlock_irqrestore(&desc->lock, flags);
+#ifdef CONFIG_HARDIRQS_SW_RESEND
+	/*
+	 * Do a bh disable/enable pair to trigger any pending
+	 * irq resend logic:
+	 */
+	local_bh_disable();
+	local_bh_enable();
+#endif
 }
 EXPORT_SYMBOL(enable_irq);
diff -Nurp 2.6.22.1-/kernel/irq/resend.c 2.6.22.1/kernel/irq/resend.c
--- 2.6.22.1-/kernel/irq/resend.c	2007-07-09 01:32:17.0 +0200
+++ 2.6.22.1/kernel/irq/resend.c	2007-08-07 13:57:54.0 +0200
@@ -62,16 +62,24 @@ void check_irq_resend(struct irq_desc *d
 	 */
 	desc->chip->enable(irq);

+	/*
+	 * Temporary hack to figure out more about the problem, which
+	 * is causing the ancient network cards to die.
+	 */
+
 	if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
 		desc->status = (status & ~IRQ_PENDING) | IRQ_REPLAY;

-		if (!desc->chip || !desc->chip->retrigger ||
-					!desc->chip->retrigger(irq)) {
+		if (desc->handle_irq == handle_edge_irq) {
+			if (desc->chip->retrigger)
+				desc->chip->retrigger(irq);
+			return;
+		}
 #ifdef CONFIG_HARDIRQS_SW_RESEND
-			/* Set it pending and activate the softirq: */
-			set_bit(irq, irqs_resend);
-			tasklet_schedule(&resend_tasklet);
+		WARN_ON_ONCE(1);
+		/* Set it pending and activate the softirq: */
+		set_bit(irq, irqs_resend);
+		tasklet_schedule(&resend_tasklet);
 #endif
-		}
 	}
 }
Re: Distributed storage.
On Sun, Aug 05 2007, Daniel Phillips wrote:
> A simple way to solve the stable accounting field issue is to add a new
> pointer to struct bio that is owned by the top level submitter (normally
> generic_make_request but not always) and is not affected by any recursive
> resubmission. Then getting rid of that field later becomes somebody's
> summer project, which is not all that urgent because struct bio is
> already bloated up with a bunch of dubious fields and is a transient
> structure anyway.

Thanks for your insights. Care to detail what bloat and dubious fields
struct bio has?

And we don't add temporary fields out of laziness, hoping that someone
will later kill it again and rewrite it in a nicer fashion. Hint: that
never happens, bloat sticks.

-- 
Jens Axboe
Re: Possible bug in realtek 8169 ethernet driver
Francois Romieu wrote:
> Bram [EMAIL PROTECTED] :
> [...]
> > The device now works! But, it still comes up as eth2 instead of eth0,
> > even though it's first detected as eth0. There are no other network
>
> Check the udev rules and/or your init scripts ?

You're right, it's a udev script assigning new names to unknown cards, I
wasn't aware of that.

Thanks,
Bram
Re: [PATCH RFC]: napi_struct V5
On Sun, 2007-05-08 at 23:24 -0700, David Miller wrote:
> 3) Attempt to bring NAPI howto as uptodate as is possible for such
>    a rotting document. :)

That doc is out of date on the split of work - it focusses mostly
describing the original tulip which did not mix rx and tx in the
napi_poll(). AFAIK, no driver does that today (although i really liked
that scheme, there is a lot of fscked hardware out there that wont work
well with that scheme). Where are the firemen when you need them?

Scanning your changes on the drivers for hardware i possess, I dont see
any issues.

cheers,
jamal
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Aug 07, 2007 at 02:13:39PM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote:
> > On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
> > > ...
> > > No, i don't need a break. I'll have more time in next weeks.
> >
> > Great! So, I'll try to send a patch with _SW_RESEND in a few hours,
> > if Ingo doesn't prepare something for you.
>
> So, the let's try this idea yet: modified Ingo's x86: activate
> HARDIRQS_SW_RESEND patch. (Don't forget about make oldconfig before
> make.) For testing only.
>
> Cheers,
> Jarek P.
>
> PS: alas there was not even time for compile checking...

And here is one more patch to test the same idea (chip->retrigger()).
Let's try i386 way! (I hope I will not be arrested for this...)
(Should be tested without any previous patches.)

Jarek P.

PS: as above

---

diff -Nurp 2.6.22.1-/arch/x86_64/kernel/io_apic.c 2.6.22.1/arch/x86_64/kernel/io_apic.c
--- 2.6.22.1-/arch/x86_64/kernel/io_apic.c	2007-07-09 01:32:17.0 +0200
+++ 2.6.22.1/arch/x86_64/kernel/io_apic.c	2007-08-07 14:37:45.0 +0200
@@ -1311,15 +1311,8 @@ static unsigned int startup_ioapic_irq(u
 static int ioapic_retrigger_irq(unsigned int irq)
 {
 	struct irq_cfg *cfg = &irq_cfg[irq];
-	cpumask_t mask;
-	unsigned long flags;
-
-	spin_lock_irqsave(&vector_lock, flags);
-	cpus_clear(mask);
-	cpu_set(first_cpu(cfg->domain), mask);
-	send_IPI_mask(mask, cfg->vector);
-	spin_unlock_irqrestore(&vector_lock, flags);
 
+	send_IPI_self(cfg->vector);
 	return 1;
 }
 
Re: e100 (was: eepro100) - Nobody Cares (hardware?)
On Mon, 06 Aug 2007 17:45:09 -0700, Kok, Auke wrote [moving to netdev mailinglist] Eric, please don't forget that an entire team here at Intel is dedicated to supporting e100 and pro/1000 devices from Intel. Most of the pro/100 features are documented in the SDM which contains some references to the eeprom parts. Mostly the device doesn't need much configuration from the eeprom to work (unlike gigE parts). The SDM can be downloaded from our sf.net project page: http://sourceforge.net/project/showfiles.php?group_id=42302package_id=68544 The issue that you are reporting: My system boots fine but when I try to bring up the onboard ethernet (an EEPro 100 VE) I get a Nobody Cares message and the interrupt is disabled. However has been recently patched. This should have worked regardless of whether you used e100 or eepro100 (noting that nobody supports eepro100 anymore, you should really use e100 for all tests). if you look in drivers/pci/quirks.c you'll find that there is specific code for e100 devices. If this quirk doesn't work for you then we'll need to dig into that. For this I'd like you to gather: - `ethtool -e eth0` output - `lspci -n` output this will allow me to check the quirck code and see if it has the right device ID. I'm suspecting that the device ID is missing somehow, or the workaround fails. Auke Thanks for the help. Here are the lspci -n and ethtool -e outputs. I am attaching both the results for the 'bad' unit and for another one which is supposedly identical except for some battery charge circuitry. The eeprom data on the bad one may be a little odd due to my trying to make it match that of the good one, including that I forgot what the real MAC address was supposed to be. I can get one that I haven't screwed up if you need it, but it will probably take all day. 
-- A hunch is creativity trying to tell you something -- Frank Capra Eric Johnson lspci_good.txt Description: Binary data lspci_bad.txt Description: Binary data ethtool_bad.txt Description: Binary data ethtool_good.txt Description: Binary data
Re: fscked clock sources revisited
On Mon, 2007-30-07 at 22:14 -0400, jamal wrote:
> I am going to test with hpet when i get the chance

Couldnt figure how to turn on/off hpet, so didnt test.

> and perhaps turn off all the other sources if nothing good comes out;
> i need my numbers ;->

Here are some numbers that make the mystery even more interesting. This
is with kernel 2.6.22-rc4. Repeating with kernel 2.6.23-rc1 didnt show
anything different. I went back to test on 2.6.22-rc4 because it is the
base for my batching patches - and since those drove me to this test, i
wanted something that reduces variables when comparing with batching.

I picked udp for this test because i can select different packet sizes.
i used iperf. The sender is a dual opteron with tg3. The receiver is a
dual xeon. The default HZ is 250. Each packet size was run 3 times with
different clock sources. The experiment made sure that the receiver
wasnt a bottleneck (increased socket buffer sizes etc)

Packet | jiffies (1/250) | tsc           | acpi_pm
-------|-----------------|---------------|--------------
64     | 141, 145, 142   | 131, 136, 130 | 103, 104, 110
128    | 256, 256, 256   | 274, 260, 269 | 216, 206, 220
512    | 513, 513, 513   | 886, 886, 886 | 828, 814, 806
1280   | 684, 684, 684   | 951, 951, 951 | 951, 951, 951

So i was wrong to declare jiffies as being good. The last batch of
experiments were based on only 64 byte UDP. Clearly as packet size goes
up, the results are worse with jiffies.

At this point, i decided to recompile the kernel with HZ=1000 and the
observations show that the jiffies results are improved.

Packet | jiffies (1/1000) | tsc           | acpi_pm
-------|------------------|---------------|--------------
64     | 145, 135, 135    | 131, 137, 139 | 110, 110, 108
128    | 257, 257, 257    | 270, 264, 250 | 218, 216, 217
512    | 819, 776, 819    | 886, 886, 886 | 841, 824, 846
1280   | 855, 855, 855    | 951, 950, 951 | 951, 951, 951

Still not as good as the other two at large packet sizes. For this
machine: The ideal clock source would be jiffies with HZ=1000 up to
about 100 bytes then change to tsc. Of course i could pick tsc but
people have dissed it so far - i probably didnt hit the condition where
it goes into deep slumber.

Any insights? This makes it hard to quantify batching experimental
improvements as i feel it could be architecture or worse machine
dependent.

cheers,
jamal
[RFC] stuff from tcp-2.6 partially merged to upcoming net-2.6.24?
Hi Dave,

...Noticed you were planning to open net-2.6.24 tree...

IMHO, part of the stuff in tcp-2.6 could be merged to 2.6.24. I suggest
that most of the stuff which is not directly related to the rbtree, new
lost marker, or sacktag reorganization is taken. Some of those things
are very trivial to take as they do not introduce any conflicts. Besides
that, there is some stuff that would need some work if taken, as it is
built on top of stuff that will remain only in tcp-2.6 (includes
left_out removal and IsReno/Fack conversion)... But if it's ok, I could
try to come up with a solution even to them... Perhaps do this in two
(or more) stages by first taking the trivial ones...

I tried rebasing tcp-2.6 (there's some not yet submitted work on top of
it too) to top of be1b685fe6c9928848b26b568eaa86ba8ce0046c, result is
here:

http://www.cs.helsinki.fi/u/ijjarvin/tcp-rebase/{before,after}

...There was at least one gotcha (sacktag's flag reset position change
when sacktag_state is created). But all in all, conflicts weren't that
hard to resolve... One may resolve some things differently than I did,
so YMMV if you want to try that yourself... :-) ...I also diffed
all.patch'es to see if there was some undesired side-effect from diff
but didn't find any. Currently only compile tested.

Do you have any suggestion how I should proceed? Or do you perhaps
object to such a partial merge completely? ...I could try to come up
with a cleaned up patch series which has the original changes and their
bug fix parts combined to a single patch per change (would provide
cleaner history and shouldn't be very hard to do either)...

-- 
 i.
Fw: [Bug 8845] New: Kernel 2.6.23-RC2: TCP + ICH9 + Amule + Hours = Freeze of my Debian = Reboot
Any takers?

Subject: [Bug 8845] New: Kernel 2.6.23-RC2: TCP + ICH9 + Amule + Hours = Freeze of my Debian = Reboot

http://bugzilla.kernel.org/show_bug.cgi?id=8845

           Summary: Kernel 2.6.23-RC2: TCP + ICH9 + Amule + Hours = Freeze
                    of my Debian = Reboot
           Product: Networking
           Version: 2.5
     KernelVersion: Kernel 2.6.23-RC2
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

Hello!

I had the same problem with kernel 2.6.22. I have a very recent Asus
P5K-VM motherboard, and with the same harddrive and software on an Asus
P5B-VM I had no problem.

I have no problem with Ktorrent. When I use amule with only the udp mode
(Kademlia), my Debian Etch/Lenny works alright for 24 hours straight,
with dozens of active downloads. When I use the regular tcp mode
(edonkey protocol), after 3 hours my Debian completely freezes (no hard
drive activity, no console accessible, impossible to trigger a reboot
from the keyboard with the right sequence of keys), and I have to
reboot. I reproduced this more than ten times.

I use an old D-Link DFE-530TX ethernet card with which I never had any
problem over the years. I have a cable internet connection (1 MB/s up/
30MB/s down).

There is nothing in the logs before the freeze.
[PATCH] [iputils] Print received packets as icmp_seq
Now, ping and ping6 print the packets which are actually received, too,
not only the amount of sent packets. It has the format:

  icmp_seq=received/seq

Signed-off-by: Alexander Graf [EMAIL PROTECTED]
---
 ping_common.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ping_common.c b/ping_common.c
index be36cbd..83be553 100644
--- a/ping_common.c
+++ b/ping_common.c
@@ -711,7 +711,7 @@ restamp:
 	} else {
 		int i;
 		__u8 *cp, *dp;
-		printf("%d bytes from %s: icmp_seq=%u", cc, from, seq);
+		printf("%d bytes from %s: icmp_seq=%li/%u", cc, from, nreceived, seq);
 
 		if (hops >= 0)
 			printf(" ttl=%d", hops);
-- 
1.5.2.4
[PATCH] [iputils] Print packet loss with more precision
Signed-off-by: Alexander Graf [EMAIL PROTECTED]
---
 ping_common.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/ping_common.c b/ping_common.c
index 83be553..49acab2 100644
--- a/ping_common.c
+++ b/ping_common.c
@@ -795,9 +795,9 @@ void finish(void)
 	if (nerrors)
 		printf(", +%ld errors", nerrors);
 	if (ntransmitted) {
-		printf(", %d%% packet loss",
-		       (int) ((((long long)(ntransmitted - nreceived)) * 100) /
-			      ntransmitted));
+		printf(", %f%% packet loss",
+		       (((long long)(ntransmitted - nreceived)) * 100.0) /
+			      ntransmitted);
 		printf(", time %ldms", 1000*tv.tv_sec+tv.tv_usec/1000);
 	}
 	putchar('\n');
-- 
1.5.2.4
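
[Archive note] The precision change in this patch can be illustrated outside of ping. The sketch below is not iputils code: the two function names are made up for illustration, and only the arithmetic is taken from the removed and added lines of the diff. It shows why the integer form loses the fractional part while multiplying by 100.0 keeps it.

```c
/*
 * Loss as an integer percentage, following the removed lines of the patch:
 * the division is done on integers, so the fractional part is truncated.
 * (Function name is hypothetical.)
 */
int loss_percent_int(long ntransmitted, long nreceived)
{
	if (ntransmitted <= 0)
		return 0;
	return (int)((((long long)(ntransmitted - nreceived)) * 100) /
		     ntransmitted);
}

/*
 * Loss as a floating-point percentage, following the added lines:
 * multiplying by 100.0 promotes the whole expression to double.
 * (Function name is hypothetical.)
 */
double loss_percent_float(long ntransmitted, long nreceived)
{
	if (ntransmitted <= 0)
		return 0.0;
	return (((long long)(ntransmitted - nreceived)) * 100.0) /
	       ntransmitted;
}
```

With 1000 packets sent and 999 received, the integer form reports 0 while the floating-point form reports roughly 0.1, which is the case the patch is aimed at.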
[PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
Networking experts,

I'd like input on the patch below, and help in solving this bug
properly. iWARP devices that support both native stack TCP and iWARP
(aka RDMA over TCP/IP/Ethernet) connections on the same interface need
the fix below or some similar fix to the RDMA connection manager.

This is a BUG in the Linux RDMA-CMA code as it stands today.

Here is the issue:

Consider an mpi cluster running mvapich2. And the cluster runs
MPI/Sockets jobs concurrently with MPI/RDMA jobs. It is possible,
without the patch below, for MPI/Sockets processes to mistakenly get
incoming RDMA connections and vice versa. The way mvapich2 works is that
the ranks all bind and listen to a random port (retrying new random
ports if the bind fails with "in use"). Once they get a free port and
bind/listen, they advertise that port number to the peers to do
connection setup. Currently, without the patch below, the mpi/rdma
processes can end up binding/listening to the _same_ port number as the
mpi/sockets processes running over the native tcp stack. This is due to
duplicate port spaces for native stack TCP and the rdma cm's RDMA_PS_TCP
port space. If this happens, then the connections can get screwed up.

The correct solution in my mind is to use the host stack's TCP port
space for _all_ RDMA_PS_TCP port allocations. The patch below is a
minimal delta to unify the port spaces by using the kernel stack to bind
ports. This is done by allocating a kernel socket and binding to the
appropriate local addr/port. It also allows the kernel stack to pick
ephemeral ports by virtue of just passing in port 0 on the kernel bind
operation.

There has been a discussion already on the RDMA list if anyone is
interested:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg05162.html

Thanks,

Steve.

---

RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

This is needed for iwarp providers that support native and rdma
connections over the same interface.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/core/cma.c |   27 ++-
 1 files changed, 26 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9e0ab04..e4d2d7f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -111,6 +111,7 @@ struct rdma_id_private {
 	struct rdma_cm_id	id;
 
 	struct rdma_bind_list	*bind_list;
+	struct socket		*sock;
 	struct hlist_node	node;
 	struct list_head	list;
 	struct list_head	listen_list;
@@ -695,6 +696,8 @@ static void cma_release_port(struct rdma
 		kfree(bind_list);
 	}
 	mutex_unlock(&lock);
+	if (id_priv->sock)
+		sock_release(id_priv->sock);
 }
 
 void rdma_destroy_id(struct rdma_cm_id *id)
@@ -1790,6 +1793,25 @@ static int cma_use_port(struct idr *ps,
 	return 0;
 }
 
+static int cma_get_tcp_port(struct rdma_id_private *id_priv)
+{
+	int ret;
+	struct socket *sock;
+
+	ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
+	if (ret)
+		return ret;
+	ret = sock->ops->bind(sock,
+		  (struct socketaddr *)&id_priv->id.route.addr.src_addr,
+		  ip_addr_size(&id_priv->id.route.addr.src_addr));
+	if (ret) {
+		sock_release(sock);
+		return ret;
+	}
+	id_priv->sock = sock;
+	return 0;
+}
+
 static int cma_get_port(struct rdma_id_private *id_priv)
 {
 	struct idr *ps;
@@ -1801,6 +1823,9 @@ static int cma_get_port(struct rdma_id_p
 		break;
 	case RDMA_PS_TCP:
 		ps = &tcp_ps;
+		ret = cma_get_tcp_port(id_priv); /* Synch with native stack */
+		if (ret)
+			goto out;
 		break;
 	case RDMA_PS_UDP:
 		ps = &udp_ps;
@@ -1815,7 +1840,7 @@ static int cma_get_port(struct rdma_id_p
 	else
 		ret = cma_use_port(ps, id_priv);
 	mutex_unlock(&lock);
-
+out:
 	return ret;
 }
 
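
[Archive note] The reservation idea in this RFC (bind a socket only to hold the address/port, and pass port 0 to let the stack choose an ephemeral one) has a close user-space analogue. The sketch below is not the kernel patch: it uses the ordinary POSIX socket calls on the loopback address, and the function name reserve_tcp_port is made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Reserve a TCP port on 127.0.0.1 by binding a socket to it and never
 * calling listen(). Passing port 0 lets the stack pick an ephemeral port.
 * On success the bound socket is returned and *port_out holds the port in
 * host byte order; the port stays reserved until the socket is closed.
 * Returns -1 on failure. (Hypothetical helper, user-space illustration.)
 */
int reserve_tcp_port(unsigned short port, unsigned short *port_out)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);

	/* bind() claims the port; getsockname() reports which one we got. */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*port_out = ntohs(addr.sin_port);
	return fd;
}
```

A second call asking for the port returned by the first is expected to fail with an "address in use" error for as long as the first socket stays open, which is the property the RFC relies on to keep the two port spaces from colliding.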
Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
Hi Steve.

On Tue, Aug 07, 2007 at 09:37:41AM -0500, Steve Wise ([EMAIL PROTECTED]) wrote:
> +static int cma_get_tcp_port(struct rdma_id_private *id_priv)
> +{
> +	int ret;
> +	struct socket *sock;
> +
> +	ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
> +	if (ret)
> +		return ret;
> +	ret = sock->ops->bind(sock,
> +		  (struct socketaddr *)&id_priv->id.route.addr.src_addr,
> +		  ip_addr_size(&id_priv->id.route.addr.src_addr));

If we get away from talk about broken offloading, this one will result
in the case where usual network dataflow can enter private rdma land,
i.e. after bind succeeds this socket is accessible via any other network
device. Is it intended? And this is quite noticeable overhead per rdma
connection, btw.

-- 
	Evgeniy Polyakov
Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
Evgeniy Polyakov wrote:
> Hi Steve.
>
> On Tue, Aug 07, 2007 at 09:37:41AM -0500, Steve Wise ([EMAIL PROTECTED]) wrote:
> > +static int cma_get_tcp_port(struct rdma_id_private *id_priv)
> > +{
> > +	int ret;
> > +	struct socket *sock;
> > +
> > +	ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
> > +	if (ret)
> > +		return ret;
> > +	ret = sock->ops->bind(sock,
> > +		  (struct socketaddr *)&id_priv->id.route.addr.src_addr,
> > +		  ip_addr_size(&id_priv->id.route.addr.src_addr));
>
> If we get away from talk about broken offloading, this one will result
> in the case where usual network dataflow can enter private rdma land,
> i.e. after bind succeeds this socket is accessible via any other network
> device. Is it intended? And this is quite noticeable overhead per rdma
> connection, btw.

I'm not sure I understand your question? What do you mean by accessible?
The intention is to _just_ reserve the addr/port. The socket struct
alloc and bind was a simple way to do this. I assume we'll have to come
up with a better way though. Namely provide a low level interface to the
port space allocator allowing both rdma and the host tcp stack to share
the space without requiring a socket struct for rdma connections.

Or maybe we'll come up with a different and better solution to this
issue...

Steve.
Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
On Tue, Aug 07, 2007 at 10:06:29AM -0500, Steve Wise ([EMAIL PROTECTED]) wrote:
> > On Tue, Aug 07, 2007 at 09:37:41AM -0500, Steve Wise ([EMAIL PROTECTED]) wrote:
> > > +static int cma_get_tcp_port(struct rdma_id_private *id_priv)
> > > +{
> > > +	int ret;
> > > +	struct socket *sock;
> > > +
> > > +	ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
> > > +	if (ret)
> > > +		return ret;
> > > +	ret = sock->ops->bind(sock,
> > > +		  (struct socketaddr *)&id_priv->id.route.addr.src_addr,
> > > +		  ip_addr_size(&id_priv->id.route.addr.src_addr));
> >
> > If we get away from talk about broken offloading, this one will result
> > in the case where usual network dataflow can enter private rdma land,
> > i.e. after bind succeeds this socket is accessible via any other
> > network device. Is it intended? And this is quite noticeable overhead
> > per rdma connection, btw.
>
> I'm not sure I understand your question? What do you mean by accessible?
> The intention is to _just_ reserve the addr/port.

The RDMA ->bind() above ends up in tcp_v4_get_port(), which only adds
the socket to bhash. That hash is consulted only for new sockets created
for listening connections or explicit binds; network traffic checks only
the listening and established hashes, which are not affected by the
above change, so it was a false alarm on my side.

It does allow one to 'grab' a port and forbid its possible reuse.

-- 
	Evgeniy Polyakov
Re: [PATCH] drivers/net/ibmveth.c: memset fix
Mariusz Kozlowski wrote:
> Looks like memset() is zeroing wrong nr of bytes.
>
> Good catch, however, I think we can just remove this memset altogether
> since the memory gets allocated via kzalloc.
>
> Correct, that memset() is superfluous.
>
> Ok. Then this should do it.

Acked-by: Brian King [EMAIL PROTECTED]

> Signed-off-by: Mariusz Kozlowski [EMAIL PROTECTED]
>
>  drivers/net/ibmveth.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> --- linux-2.6.23-rc1-mm2-a/drivers/net/ibmveth.c	2007-08-01 08:43:46.0 +0200
> +++ linux-2.6.23-rc1-mm2-b/drivers/net/ibmveth.c	2007-08-06 23:32:13.0 +0200
> @@ -963,7 +963,7 @@ static int __devinit ibmveth_probe(struc
>  {
>  	int rc, i;
>  	struct net_device *netdev;
> -	struct ibmveth_adapter *adapter = NULL;
> +	struct ibmveth_adapter *adapter;
>  
>  	unsigned char *mac_addr_p;
>  	unsigned int *mcastFilterSize_p;
> @@ -997,7 +997,6 @@ static int __devinit ibmveth_probe(struc
>  	SET_MODULE_OWNER(netdev);
>  
>  	adapter = netdev->priv;
> -	memset(adapter, 0, sizeof(adapter));
>  	dev->dev.driver_data = netdev;
>  
>  	adapter->vdev = dev;

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center
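
[Archive note] The "wrong nr of bytes" in this thread comes from taking sizeof of a pointer instead of the object it points to. The sketch below is not driver code: struct adapter_state is a made-up stand-in for the driver's private structure, and the two helper names are hypothetical. It only shows which length each sizeof form yields.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver's private adapter structure. */
struct adapter_state {
	unsigned char mac[6];
	unsigned int  filter_size;
	unsigned long counters[8];
};

/*
 * Length the removed memset() would have used: sizeof applied to the
 * pointer itself, i.e. 4 or 8 bytes on common platforms.
 */
size_t buggy_len(const struct adapter_state *a)
{
	return sizeof(a);
}

/* Length that covers the whole object: sizeof applied to the pointee. */
size_t fixed_len(const struct adapter_state *a)
{
	return sizeof(*a);
}
```

Clearing with the first length would leave everything past the first pointer-sized chunk untouched; in the driver this went unnoticed because, as the thread says, the memory was already zeroed at allocation, which is why dropping the call is the simpler fix.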
Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only)
Kok, Auke wrote:
> From: Auke Kok [EMAIL PROTECTED]
> Date: Mon, 6 Aug 2007 14:14:44 -0700
> Subject: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only)
>
> This driver implements support for the ICH9 on-board LAN ethernet
> device. The device is similar to ICH8.
>
> The driver encompasses code to support 82571/2/3, es2lan and ICH8
> devices as well, but those device IDs are disabled and will be lifted
> from the e1000 driver over one at a time once this driver receives some
> more live time.
>
> Changes to the last snapshot posted are exclusively in the internal
> hardware API organization. Many thanks to Jeff Garzik for jumping in
> and getting this organized with a keen eye on the future layout.
>
> Signed-off-by: Auke Kok [EMAIL PROTECTED]

Thanks for posting the patch in a git-am friendly format :)

I merged this into netdev-2.6.git#e1000e just now, and pulled it into
netdev-2.6.git#ALL so that Andrew's -mm tree will automatically pick up
this driver.

Please submit e1000e in the form of follow-up patches to #e1000e, rather
than reposting the entire driver.

We'll leave it on this side branch for a little while, to give others a
chance to review and test, and give you (auke) a chance to update for
Andi's comments etc.

	Jeff
Re: e100 (was: eepro100) - Nobody Cares (hardware?)
I want to thank everyone who helped with this. It was proven to be a hardware issue. The board designer had left a GPIO pin in an indeterminate state because he was planning to use it later to do something with the battery charge circuitry. I apologize for wasting everyone's time. On Mon, 06 Aug 2007 17:45:09 -0700, Kok, Auke wrote [moving to netdev mailinglist] ericj wrote: On Mon, 6 Aug 2007 11:20:58 -0500, ericj wrote On Mon, 06 Aug 2007 12:13:28 -0400, Jeff Garzik wrote eepro100 is going to be removed. Please try e100 on 2.6.22 or 2.6.23-rc2. I will give the 2.6.23 a try. I tried 2.6.23-rc2 and there was no change. There is now some question from the hardware guys about whether the eeproms were properly configured before shipping the boards. Is there any documentation of the eeprom on an EE Pro 100 VE (ICH4) so that I can figure out if any of the settings in there might be causing the problem? The only fields I know of for sure are the MAC address at the beginning and the checksum at the end. I also see from the driver code that there is at least one byte controlling wake-on-lan, which I don't care about - unless it's the problem. Thanks for ethtool, by the way. It's been helpful in looking at this and comparing the eeprom to an earlier version of the board that works. Eric, please don't forget that an entire team here at Intel is dedicated to supporting e100 and pro/1000 devices from Intel. Most of the pro/100 features are documented in the SDM which contains some references to the eeprom parts. Mostly the device doesn't need much configuration from the eeprom to work (unlike gigE parts). The SDM can be downloaded from our sf.net project page: http://sourceforge.net/project/showfiles.php?group_id=42302package_id=68544 The issue that you are reporting: My system boots fine but when I try to bring up the onboard ethernet (an EEPro 100 VE) I get a Nobody Cares message and the interrupt is disabled. However has been recently patched. 
This should have worked regardless of whether you used e100 or eepro100 (noting that nobody supports eepro100 anymore, you should really use e100 for all tests). if you look in drivers/pci/quirks.c you'll find that there is specific code for e100 devices. If this quirk doesn't work for you then we'll need to dig into that. For this I'd like you to gather: - `ethtool -e eth0` output - `lspci -n` output this will allow me to check the quirck code and see if it has the right device ID. I'm suspecting that the device ID is missing somehow, or the workaround fails. Auke -- A hunch is creativity trying to tell you something -- Frank Capra Eric Johnson - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: e100
ericj wrote: I want to thank everyone who helped with this. It was proven to be a hardware issue. The board designer had left a GPIO pin in an indeterminate state because he was planning to use it later to do something with the battery charge circuitry. I apologize for wasting everyone's time. happens to everyone :) Thanks for letting us know. Auke On Mon, 06 Aug 2007 17:45:09 -0700, Kok, Auke wrote [moving to netdev mailinglist] ericj wrote: On Mon, 6 Aug 2007 11:20:58 -0500, ericj wrote On Mon, 06 Aug 2007 12:13:28 -0400, Jeff Garzik wrote eepro100 is going to be removed. Please try e100 on 2.6.22 or 2.6.23-rc2. I will give the 2.6.23 a try. I tried 2.6.23-rc2 and there was no change. There is now some question from the hardware guys about whether the eeproms were properly configured before shipping the boards. Is there any documentation of the eeprom on an EE Pro 100 VE (ICH4) so that I can figure out if any of the settings in there might be causing the problem? The only fields I know of for sure are the MAC address at the beginning and the checksum at the end. I also see from the driver code that there is at least one byte controlling wake-on-lan, which I don't care about - unless it's the problem. Thanks for ethtool, by the way. It's been helpful in looking at this and comparing the eeprom to an earlier version of the board that works. Eric, please don't forget that an entire team here at Intel is dedicated to supporting e100 and pro/1000 devices from Intel. Most of the pro/100 features are documented in the SDM which contains some references to the eeprom parts. Mostly the device doesn't need much configuration from the eeprom to work (unlike gigE parts). 
The SDM can be downloaded from our sf.net project page: http://sourceforge.net/project/showfiles.php?group_id=42302package_id=68544 The issue that you are reporting: My system boots fine but when I try to bring up the onboard ethernet (an EEPro 100 VE) I get a Nobody Cares message and the interrupt is disabled. However has been recently patched. This should have worked regardless of whether you used e100 or eepro100 (noting that nobody supports eepro100 anymore, you should really use e100 for all tests). if you look in drivers/pci/quirks.c you'll find that there is specific code for e100 devices. If this quirk doesn't work for you then we'll need to dig into that. For this I'd like you to gather: - `ethtool -e eth0` output - `lspci -n` output this will allow me to check the quirck code and see if it has the right device ID. I'm suspecting that the device ID is missing somehow, or the workaround fails. Auke -- A hunch is creativity trying to tell you something -- Frank Capra Eric Johnson - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Distributed storage.
On Tuesday 07 August 2007 05:05, Jens Axboe wrote:
> On Sun, Aug 05 2007, Daniel Phillips wrote:
> > A simple way to solve the stable accounting field issue is to add a
> > new pointer to struct bio that is owned by the top level submitter
> > (normally generic_make_request but not always) and is not affected
> > by any recursive resubmission. Then getting rid of that field later
> > becomes somebody's summer project, which is not all that urgent
> > because struct bio is already bloated up with a bunch of dubious
> > fields and is a transient structure anyway.
>
> Thanks for your insights. Care to detail what bloat and dubious fields
> struct bio has?

First obvious one I see is bi_rw separate from bi_flags. Front_size and
back_size smell dubious. Is max_vecs really necessary? You could
reasonably assume bi_vcnt rounded up to a power of two and bury the
details of making that work behind wrapper functions to change the
number of bvecs, if anybody actually needs that. Bi_endio and
bi_destructor could be combined. I don't see a lot of users of bi_idx,
that looks like a soft target. See what happened to struct page when a
couple of folks got serious about attacking it, some really deep hacks
were done to pare off a few bytes here and there. But struct bio as a
space waster is not nearly in the same ballpark.

It would be interesting to see if bi_bdev could be made read only.
Generally, each stage in the block device stack knows what the next
stage is going to be, so why do we have to write that in the bio? For
error reporting from interrupt context? Anyway, if Evgeniy wants to do
the patch, I will happily unload the task of convincing you that random
fields are/are not needed in struct bio :-)

Regards,

Daniel
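
[Archive note] The suggestion of deriving the vector capacity from bi_vcnt "rounded up to a power of two" can be sketched in isolation. This is not block-layer code: the function name implied_max_vecs is made up, and treating a count of 0 as capacity 1 is an assumption of this sketch, not something the message specifies.

```c
/*
 * Capacity implied by a vector count when capacity is defined as the count
 * rounded up to the next power of two. The count is a 16-bit value here,
 * so the result is at most 65536 and the loop cannot overflow.
 * (Hypothetical helper; a count of 0 maps to 1 by assumption.)
 */
unsigned int implied_max_vecs(unsigned short vcnt)
{
	unsigned int cap = 1;

	while (cap < vcnt)
		cap <<= 1;
	return cap;
}
```

Under that convention a wrapper that grows the vector array would only need to reallocate when the count crosses a power-of-two boundary, which is the kind of detail the message proposes hiding behind wrapper functions.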
[RFC] cubic: backoff after slow start
CUBIC takes several unnecessary iterations to converge out of slow start. This is most noticeable over a link where the bottleneck queue size is much larger than the BDP, and the sender has to fill the pipe in slow start before the first loss. Typical consumer broadband links seem to have large queues (up to 2 seconds) that need to get filled before the first loss.

A possible fix is to use a beta of .5 (same as original TCP) when leaving slow start. Originally, the Linux version didn't do slow start, so this was probably never observed.

--- a/net/ipv4/tcp_cubic.c	2007-08-02 12:16:22.0 +0100
+++ b/net/ipv4/tcp_cubic.c	2007-08-03 15:57:12.0 +0100
@@ -289,7 +289,11 @@ static u32 bictcp_recalc_ssthresh(struct
 
 	ca->loss_cwnd = tp->snd_cwnd;
 
-	return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
+	/* Initial backoff when leaving slow start */
+	if (tp->snd_ssthresh == 0x7fffffff)
+		return max(tp->snd_cwnd >> 1U, 2U);
+	else
+		return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
 }
 
 static u32 bictcp_undo_cwnd(struct sock *sk)
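For anyone who wants to play with the numbers outside the kernel, here is a minimal userspace sketch of the backoff rule proposed above. It is not the kernel code: `recalc_ssthresh`, `BETA_SCALE` and `INFINITE_SSTHRESH` are stand-in names, and the scale of 1024 and the sentinel value are assumptions made for the illustration.

```c
#include <stdint.h>

/* Assumed stand-ins for the kernel constants; chosen for illustration only. */
#define BETA_SCALE        1024u        /* plays the role of BICTCP_BETA_SCALE */
#define INFINITE_SSTHRESH 0x7fffffffu  /* "no loss seen yet" marker */

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Return the new slow-start threshold after a loss.
 * cwnd and ssthresh are in packets; beta is scaled by BETA_SCALE.
 */
static uint32_t recalc_ssthresh(uint32_t cwnd, uint32_t ssthresh, uint32_t beta)
{
	/* First loss while still in slow start: halve, as classic TCP does. */
	if (ssthresh == INFINITE_SSTHRESH)
		return max_u32(cwnd >> 1, 2u);

	/* Later losses: the gentler beta-scaled multiplicative decrease. */
	return max_u32((uint32_t)(((uint64_t)cwnd * beta) / BETA_SCALE), 2u);
}
```

With a hypothetical beta of 717 and a window of 1000 packets, the first loss would drop the threshold to 500, while a later loss would only drop it to 700.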
[RFT] sky2: turn on pci power
This setup step got dropped from the Yukon-EX configuration in 2.6.23; maybe this fixes your problem?

--- a/drivers/net/sky2.c	2007-08-06 04:39:36.0 -0400
+++ b/drivers/net/sky2.c	2007-08-07 14:50:25.0 -0400
@@ -222,6 +222,8 @@ static void sky2_power_on(struct sky2_hw
 	if (hw->chip_id == CHIP_ID_YUKON_EC_U || hw->chip_id == CHIP_ID_YUKON_EX) {
 		u32 reg;
 
+		sky2_pci_write32(hw, PCI_DEV_REG3, 0);
+
 		reg = sky2_pci_read32(hw, PCI_DEV_REG4);
 		/* set all bits to 0 except bits 15..12 and 8 */
 		reg &= P_ASPM_CONTROL_MSK;
Re: sky2: workaround for lost IRQ and 2.6.22-stable
On Tue, 7 Aug 2007 20:46:31 +0200 (CEST) Krzysztof Oledzki [EMAIL PROTECTED] wrote:
> Hello,
>
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.21.y.git;a=commitdiff;h=fe1fe7c982f86624c692644e8ed05e132f4753cc
>
> Is this fix going to be included in the next 2.6.22-stable release or is it not needed any more?
>
> Best regards,
> Krzysztof Olędzki

It stops the major hang from IRQ loss. 2.6.23 has more minor fixes that probably aren't needed for stability.
sky2: workaround for lost IRQ and 2.6.22-stable
Hello, http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.21.y.git;a=commitdiff;h=fe1fe7c982f86624c692644e8ed05e132f4753cc Is this fix going to be included in the next 2.6.22-stable release or is it not needed any more? Best regards, Krzysztof Olędzki
Linx
After seeing this article on Linx http://www.linuxdevices.com/news/NS8613439087.html I decided to give it a quick code review (long airline flight). Overall, it isn't awful, it just looks like every other piece of code that hasn't been managed for mainline kernel inclusion. Nice way of saying, this turd needs a man year or more of polishing.

Gratuitous Code Review of Linx

0. Bugs
   A. Device names can change in the kernel at any time; use pointers or ifindex. In fact any name change will crash the kernel in BUG_ON in the notifier.
   B. A device changing its MTU will crash the kernel in BUG_ON.
   C. Calling del_timer_sync under RTNL.

1. Coding Style
   A. Typedefs: don't use typedefs like LINX_SPID, ...
   B. Non-standard naming conventions
      I. Don't use uint32_t; for kernel code use u32 or __u32
      II. No MixedCaseNames
   C. Use standard macros
      I. BUG_ON vs. LINX_ASSERT, etc.
   D. Code in macros that should really be inlines (e.g. linx_check_linx_huntname)
   E. Indentation
   F. Excessive scope; much of the code could be local to one file
   G. Too many spelling errors
   H. OS abstraction layer is unacceptable
   I. Use initializers when possible (e.g. device_notifier)
   J. Quit with all the asserts for in_irq() in timers etc.

2. Bogus wrappers
   A. Kmalloc
   B. Spinlocks

3. Unacceptable ABI
   A. ioctls for special functions
   B. Heavy reliance on config parameters in /proc
   C. Looks dependent on Ethernet address format
   D. Code for non-standard adaptive coalesce, and this code has the protocol playing with the drivers' timers directly.
   E. Non-assigned number for Ethernet protocol

4. FYI
   A. No __init or __exit
   B. Kernel API documentation: only document API calls that matter, not every pissant little function. Avoid stating the obvious. Why not use docbook format?
   C. Locking way too fine grained (lots of small locks). Should use RCU and avoid rwlocks. Use existing Linux network device API locks (i.e. dev_base_lock, RTNL) if possible.

Those who don't understand TCP/IP are doomed to reimplement it, badly.
Re: [stable] sky2: workaround for lost IRQ and 2.6.22-stable
On Tue, Aug 07, 2007 at 08:46:31PM +0200, Krzysztof Oledzki wrote:
> Hello,
>
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.21.y.git;a=commitdiff;h=fe1fe7c982f86624c692644e8ed05e132f4753cc
>
> Is this fix going to be included in the next 2.6.22-stable release or is it not needed any more?

It's not queued up for the next 2.6.22-stable release as no one has sent it to the stable maintainers :)

thanks,

greg k-h
Re: Distributed storage.
On Tue, Aug 07 2007, Daniel Phillips wrote:
> On Tuesday 07 August 2007 05:05, Jens Axboe wrote:
> > On Sun, Aug 05 2007, Daniel Phillips wrote:
> > > A simple way to solve the stable accounting field issue is to add a new pointer to struct bio that is owned by the top level submitter (normally generic_make_request but not always) and is not affected by any recursive resubmission. Then getting rid of that field later becomes somebody's summer project, which is not all that urgent because struct bio is already bloated up with a bunch of dubious fields and is a transient structure anyway.
> >
> > Thanks for your insights. Care to detail what bloat and dubious fields struct bio has?
>
> First obvious one I see is bi_rw separate from bi_flags. Front_size and back_size smell dubious. Is max_vecs really necessary? You could

I don't like structure bloat, but I do like nice design. Overloading is a necessary evil sometimes, though. Even today, there isn't enough room to hold bi_rw and bi_flags in the same variable on 32-bit archs, so that concern can be scratched. If you read bio.h, that much is obvious.

If you check up on the iommu virtual merging, you'll understand the front and back size members. They may smell dubious to you, but please take the time to understand why it looks the way it does.

> reasonably assume bi_vcnt rounded up to a power of two and bury the details of making that work behind wrapper functions to change the number of bvecs, if anybody actually needs that. Bi_endio and

Changing the number of bvecs is integral to how bio buildup currently works.

> bi_destructor could be combined. I don't see a lot of users of bi_idx,

bi_idx is integral to partial io completions.

> that looks like a soft target. See what happened to struct page when a couple of folks got serious about attacking it, some really deep hacks were done to pare off a few bytes here and there. But struct bio as a space waster is not nearly in the same ballpark.

So show some concrete patches and examples; hand waving and assumptions are just a waste of everyone's time.

> It would be interesting to see if bi_bdev could be made read only. Generally, each stage in the block device stack knows what the next stage is going to be, so why do we have to write that in the bio? For error reporting from interrupt context? Anyway, if Evgeniy wants to do the patch, I will happily unload the task of convincing you that random fields are/are not needed in struct bio :-)

It's a trade-off, otherwise you'd have to pass the block device around a lot. And it's, again, a design issue. A bio contains destination information, that means device/offset/size information.

I'm all for shaving structure bytes where it matters, but not for the sake of sacrificing code stability or design. I consider struct bio quite lean and have worked hard to keep it that way. In fact, iirc, the only addition to struct bio since 2001 is the iommu front/back size members. And I resisted those for quite a while.

--
Jens Axboe
[patch 1/1] NetLabel: add missing rcu_dereference() calls in the LSM domain mapping hash table
The LSM domain mapping head table pointer was not being referenced via the RCU safe dereferencing function, rcu_dereference(). This patch adds those missing calls to the NetLabel code.

This has been tested using recent linux-2.6 git kernels with no visible regressions.

Signed-off-by: Paul Moore [EMAIL PROTECTED]
---
 net/netlabel/netlabel_domainhash.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6_nlbl-rcu-fixup/net/netlabel/netlabel_domainhash.c
===
--- linux-2.6_nlbl-rcu-fixup.orig/net/netlabel/netlabel_domainhash.c
+++ linux-2.6_nlbl-rcu-fixup/net/netlabel/netlabel_domainhash.c
@@ -126,7 +126,9 @@ static struct netlbl_dom_map *netlbl_dom
 
 	if (domain != NULL) {
 		bkt = netlbl_domhsh_hash(domain);
-		list_for_each_entry_rcu(iter, &netlbl_domhsh->tbl[bkt], list)
+		list_for_each_entry_rcu(iter,
+				     &rcu_dereference(netlbl_domhsh)->tbl[bkt],
+				     list)
 			if (iter->valid && strcmp(iter->domain, domain) == 0)
 				return iter;
 	}
@@ -227,7 +229,7 @@ int netlbl_domhsh_add(struct netlbl_dom_
 		spin_lock(&netlbl_domhsh_lock);
 		if (netlbl_domhsh_search(entry->domain, 0) == NULL)
 			list_add_tail_rcu(&entry->list,
-					  &netlbl_domhsh->tbl[bkt]);
+				    &rcu_dereference(netlbl_domhsh)->tbl[bkt]);
 		else
 			ret_val = -EEXIST;
 		spin_unlock(&netlbl_domhsh_lock);
@@ -423,8 +425,8 @@ int netlbl_domhsh_walk(u32 *skip_bkt,
 	     iter_bkt < rcu_dereference(netlbl_domhsh)->size;
 	     iter_bkt++, chain_cnt = 0) {
 		list_for_each_entry_rcu(iter_entry,
-					&netlbl_domhsh->tbl[iter_bkt],
-					list)
+				&rcu_dereference(netlbl_domhsh)->tbl[iter_bkt],
+				list)
 			if (iter_entry->valid) {
 				if (chain_cnt++ < *skip_chain)
 					continue;

--
paul moore
linux security @ hp
[RFT] sky2: backport patch
Any volunteers to test this, it has a backport for the three main stability patches: 1. carrier management 2. lost irq timer 3. rechecking for packets in poll 4. overlength packet hang. I am away from any sky2 hardware for another week, but others maybe able to validate this. --- a/drivers/net/sky2.c2007-07-20 04:30:14.0 -0400 +++ b/drivers/net/sky2.c2007-08-07 17:08:21.0 -0400 @@ -96,7 +96,7 @@ static int disable_msi = 0; module_param(disable_msi, int, 0); MODULE_PARM_DESC(disable_msi, Disable Message Signaled Interrupt (MSI)); -static int idle_timeout = 0; +static int idle_timeout = 100; module_param(idle_timeout, int, 0); MODULE_PARM_DESC(idle_timeout, Watchdog timer for lost interrupts (ms)); @@ -1234,6 +1234,8 @@ static int sky2_up(struct net_device *de if (netif_msg_ifup(sky2)) printk(KERN_INFO PFX %s: enabling interface\n, dev-name); + netif_carrier_off(dev); + /* must be power of 2 */ sky2-tx_le = pci_alloc_consistent(hw-pdev, TX_RING_SIZE * @@ -1573,7 +1575,6 @@ static int sky2_down(struct net_device * /* Stop more packets from being queued */ netif_stop_queue(dev); - netif_carrier_off(dev); /* Disable port IRQ */ imask = sky2_read32(hw, B0_IMSK); @@ -1625,6 +1626,8 @@ static int sky2_down(struct net_device * sky2_phy_power(hw, port, 0); + netif_carrier_off(dev); + /* turn off LED's */ sky2_write16(hw, B0_Y2LED, LED_STAT_OFF); @@ -1689,7 +1692,6 @@ static void sky2_link_up(struct sky2_por gm_phy_write(hw, port, PHY_MARV_INT_MASK, PHY_M_DEF_MSK); netif_carrier_on(sky2-netdev); - netif_wake_queue(sky2-netdev); /* Turn on link LED */ sky2_write8(hw, SK_REG(port, LNK_LED_REG), @@ -1741,7 +1743,6 @@ static void sky2_link_down(struct sky2_p gma_write16(hw, port, GM_GP_CTRL, reg); netif_carrier_off(sky2-netdev); - netif_stop_queue(sky2-netdev); /* Turn on link LED */ sky2_write8(hw, SK_REG(port, LNK_LED_REG), LINKLED_OFF); @@ -2064,6 +2065,9 @@ static struct sk_buff *sky2_receive(stru if (!(status GMR_FS_RX_OK)) goto resubmit; + if (status 16 != length) + goto 
len_mismatch; + if (length copybreak) skb = receive_copy(sky2, re, length); else @@ -2073,6 +2077,11 @@ resubmit: return skb; +len_mismatch: + /* Truncation of overlength packets + causes PHY length to not match MAC length */ + ++sky2-net_stats.rx_length_errors; + error: ++sky2-net_stats.rx_errors; if (status GMR_FS_RX_FF_OV) { @@ -2441,17 +2450,24 @@ static int sky2_poll(struct net_device * sky2_phy_intr(hw, 1); work_done = sky2_status_intr(hw, work_limit); - if (work_done work_limit) { - netif_rx_complete(dev0); + *budget -= work_done; + dev0-quota -= work_done; - /* end of interrupt, re-enables also acts as I/O synchronization */ - sky2_read32(hw, B0_Y2_SP_LISR); - return 0; - } else { - *budget -= work_done; - dev0-quota -= work_done; + /* More work? */ + if (hw-st_idx != sky2_read16(hw, STAT_PUT_IDX)) return 1; + + /* Bug/Errata workaround? +* Need to kick the TX irq moderation timer. +*/ + if (sky2_read8(hw, STAT_TX_TIMER_CTRL) == TIM_START) { + sky2_write8(hw, STAT_TX_TIMER_CTRL, TIM_STOP); + sky2_write8(hw, STAT_TX_TIMER_CTRL, TIM_START); } + netif_rx_complete(dev0); + + sky2_read32(hw, B0_Y2_SP_LISR); + return 0; } static irqreturn_t sky2_intr(int irq, void *dev_id) @@ -3486,10 +3502,6 @@ static __devinit struct net_device *sky2 memcpy_fromio(dev-dev_addr, hw-regs + B2_MAC_1 + port * 8, ETH_ALEN); memcpy(dev-perm_addr, dev-dev_addr, dev-addr_len); - /* device is off until link detection */ - netif_carrier_off(dev); - netif_stop_queue(dev); - return dev; } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] atl1: use spin_trylock_irqsave()
Jay Cliburn wrote:
> From: Ingo Molnar [EMAIL PROTECTED]
>
> use the simpler spin_trylock_irqsave() API to get the adapter lock.
>
> [ this is also a fix for -rt where adapter->lock is a sleeping lock. ]
>
> Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
> Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
> ---
>  drivers/net/atl1/atl1_main.c |    4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)

applied to #upstream-fixes
Re: [patch] sis190 check for ISA bridge on SiS966
maximilian attems wrote:
> From: Neil Muller [EMAIL PROTECTED]
>
> sis190 driver assumes to find ISA only on SiS965.
> similar fix is in sis900 driver, see bug report http://bugs.debian.org/435547
>
> Signed-off-by: maximilian attems [EMAIL PROTECTED]

applied to #upstream-fixes
Re: [GIT PATCH] ucc_geth fixes for 2.6.22-rc1
Li Yang wrote:
> Please pull from 'ucc_geth' branch of
>
>   master.kernel.org:/pub/scm/linux/kernel/git/leo/fsl-soc.git ucc_geth
>
> to receive the following fixes:
>
>  drivers/net/ucc_geth_ethtool.c |    1 -
>  drivers/net/ucc_geth_mii.c     |    3 ++-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> Domen Puncer (1):
>       ucc_geth: fix section mismatch
>
> Jan Altenberg (1):
>       ucc_geth: remove get_perm_addr from ucc_geth_ethtool.c

pulled
Re: Please pull 'fixes-jgarzik' branch of wireless-2.6
John W. Linville wrote:
> The following changes since commit d4ac2477fad0f2680e84ec12e387ce67682c5c13:
>   Linus Torvalds (1):
>         Linux 2.6.23-rc2
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git fixes-jgarzik
>
> John W. Linville (1):
>       Revert [PATCH] bcm43xx: Fix deviation from specifications in set_baseband_attenuation
>
> Masakazu Mokuno (1):
>       remove duplicated ioctl entries in compat_ioctl.c
>
> Michael Buesch (1):
>       softmac: Fix deadlock of wx_set_essid with assoc work
>
> Michael Wu (1):
>       rtl8187: ensure priv->hwaddr is always valid
>
> Ulrich Kunitz (1):
>       zd1211rw: fix filter for PSPOLL frames
>
>  drivers/net/wireless/bcm43xx/bcm43xx_phy.c  |    2 +-
>  drivers/net/wireless/rtl8187_dev.c          |    2 +-
>  drivers/net/wireless/zd1211rw/zd_mac.c      |    2 +-
>  fs/compat_ioctl.c                           |    3 ---
>  net/ieee80211/softmac/ieee80211softmac_wx.c |   11 ---
>  5 files changed, 11 insertions(+), 9 deletions(-)

pulled
Re: [PATCH 0/2] r8169: pull request for 'r8169-for-jeff-20070806' branch
Francois Romieu wrote:
> Please pull from branch 'r8169-for-jeff-20070806' in repository
>
>   git://electric-eye.fr.zoreil.com/home/romieu/linux-2.6.git r8169-for-jeff-20070806
>
> to get the changes below.
>
> Distance from 'upstream-fixes' (c196d80f994ef4ffefd5a7c62e3f42bd75d538bc)
>
>   313b0305b5a1e7e0fb39383befbf79558ce68a9c
>   2584fbc3a61897de5eddd56b39a4fa9cd074eca2
>
> Diffstat
>
>  drivers/net/r8169.c |   24
>  1 files changed, 16 insertions(+), 8 deletions(-)
>
> Shortlog
>
> Francois Romieu (1):
>       r8169: avoid needless NAPI poll scheduling
>
> Roger So (1):
>       r8169: PHY power-on fix

pulled
Re: [RFT] sky2: backport patch
Stephen Hemminger wrote:
> Any volunteers to test this, it has a backport for the three main stability patches:
> 1. carrier management
> 2. lost irq timer
> 3. rechecking for packets in poll
> 4. overlength packet hang.
>
> I am away from any sky2 hardware for another week, but others maybe able to validate this.

Backport to what? from what?

You supplied no kernel version info.

Jeff
Re: [RESEND][PATCH 1/3] ehea: Fix workqueue handling
Thomas Klein wrote:
> Fix: Workqueue ehea_driver_wq was not destroyed
>
> Signed-off-by: Thomas Klein [EMAIL PROTECTED]
> ---
>  drivers/net/ehea/ehea.h      |    2 +-
>  drivers/net/ehea/ehea_main.c |    1 +
>  2 files changed, 2 insertions(+), 1 deletions(-)

applied 1-3 to #upstream-fixes
Re: [PATCH] phy layer: fix phy_mii_ioctl for autonegotiation
Domen Puncer wrote:
> Fix a thinko (?) in setting phydev->autoneg.
>
> Signed-off-by: Domen Puncer [EMAIL PROTECTED]
> ---
> This fixes my mii.h - ethtool.h advertising #defines. I'm not sure why and how they're translated, but it does work now. Maybe they're just ignored, since mii-tool directly reads and writes MII registers.
>
>  drivers/net/phy/phy.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

applied
Re: [PATCH] drivers/net/ibmveth.c: memset fix
Mariusz Kozlowski wrote:
> Looks like memset() is zeroing wrong nr of bytes.
>
> Good catch, however, I think we can just remove this memset altogether since the memory gets allocated via kzalloc.
>
> Correct, that memset() is superfluous.
>
> Ok. Then this should do it.
>
> Signed-off-by: Mariusz Kozlowski [EMAIL PROTECTED]
>
>  drivers/net/ibmveth.c |    3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

applied
Re: [PATCH 1/6] ibmveth: Enable TCP checksum offload
Brian King wrote:
> This patchset enables TCP checksum offload support for IPV4 on ibmveth. This completely eliminates the generation and checking of the checksum for packets that are completely virtual and never touch a physical network. A simple TCP_STREAM netperf run on a virtual network with maximum mtu set yielded a ~30% increase in throughput. This feature is enabled by default on systems that support it, but can be disabled with a module option.
>
> Signed-off-by: Brian King [EMAIL PROTECTED]

ACK, but does not apply to current netdev-2.6.git#upstream-fixes.

Request resend after the ibmveth fixes hit mainline (24 hours or so after push, I suppose)
Re: [RFT] sky2: backport patch
On Tue, 07 Aug 2007 17:33:51 -0400 Jeff Garzik [EMAIL PROTECTED] wrote:
> Stephen Hemminger wrote:
> > Any volunteers to test this, it has a backport for the three main stability patches:
> > 1. carrier management
> > 2. lost irq timer
> > 3. rechecking for packets in poll
> > 4. overlength packet hang.
> >
> > I am away from any sky2 hardware for another week, but others maybe able to validate this.
>
> Backport to what? from what?

From 2.6.23-rc2 to 2.6.22.y base

> You supplied no kernel version info.
>
> Jeff
Re: [PATCH 2/6] ibmveth: Implement ethtool hooks to enable/disable checksum offload
Brian King wrote:
> This patch adds the appropriate ethtool hooks to allow for enabling/disabling of hypervisor assisted checksum offload for TCP.
>
> Signed-off-by: Brian King [EMAIL PROTECTED]
> ---
>  linux-2.6-bjking1/drivers/net/ibmveth.c |  118 +++-
>  linux-2.6-bjking1/drivers/net/ibmveth.h |    1
>  2 files changed, 117 insertions(+), 2 deletions(-)
>
> diff -puN drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool drivers/net/ibmveth.c
> --- linux-2.6/drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool 2007-08-01 14:55:14.0 -0500
> +++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-08-01 14:55:14.0 -0500
> @@ -641,12 +641,125 @@ static u32 netdev_get_link(struct net_de
>  	return 1;
>  }
>
> +static void ibmveth_set_rx_csum_flags(struct net_device *dev, u32 data)
> +{
> +	struct ibmveth_adapter *adapter = dev->priv;
> +
> +	if (data)
> +		adapter->rx_csum = 1;
> +	else {
> +		adapter->rx_csum = 0;
> +		dev->features &= ~NETIF_F_IP_CSUM;

why does this RX-related code clear a TX-related flag?

otherwise looks OK
Re: [PATCH 3/6] ibmveth: Add ethtool TSO handlers
Brian King wrote:
> Add handlers for get_tso and get_ufo to prevent errors being printed by ethtool.
>
> Signed-off-by: Brian King [EMAIL PROTECTED]
> ---
>  drivers/net/ibmveth.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff -puN drivers/net/ibmveth.c~ibmveth_ethtool_get_tso drivers/net/ibmveth.c
> --- linux-2.6/drivers/net/ibmveth.c~ibmveth_ethtool_get_tso 2007-07-19 11:18:38.0 -0500
> +++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-07-19 11:18:38.0 -0500
> @@ -759,7 +759,9 @@ static const struct ethtool_ops netdev_e
>  	.get_tx_csum	= ethtool_op_get_tx_csum,
>  	.set_tx_csum	= ibmveth_set_tx_csum,
>  	.get_rx_csum	= ibmveth_get_rx_csum,
> -	.set_rx_csum	= ibmveth_set_rx_csum
> +	.set_rx_csum	= ibmveth_set_rx_csum,
> +	.get_tso	= ethtool_op_get_tso,
> +	.get_ufo	= ethtool_op_get_ufo

ACK, once you add a comma to the end of the final initializer

As you see from this patch, the practice of -not- having commas at the end of a list of struct initializers is not patch-friendly, since you must touch an unrelated line each time you patch the end of the struct. For named initializers particularly, the lack of a comma is even more useless.

So, it might tweak some C perfectionists, but adding that seemingly-useless comma at the end of the last entry reduces maintenance headache and makes patch reviews slightly more clear.

Jeff
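To make the trailing-comma point concrete, here is a small standalone sketch. The `demo_ops` structure and its handlers are hypothetical and only loosely modelled on an ethtool ops table; nothing here is taken from ibmveth.

```c
#include <stddef.h>

/* Hypothetical ops table; the member names are illustrative only. */
struct demo_ops {
	int (*get_tso)(void);
	int (*get_ufo)(void);
	int (*get_stats_count)(void);
};

static int demo_get_tso(void) { return 1; }
static int demo_get_ufo(void) { return 0; }

static const struct demo_ops demo_ethtool_ops = {
	.get_tso = demo_get_tso,
	.get_ufo = demo_get_ufo,	/* trailing comma: the next patch only adds lines */
};

/* Members left out of a designated initializer are zero-initialized (NULL). */
static int demo_has_stats(const struct demo_ops *ops)
{
	return ops->get_stats_count != NULL;
}
```

Adding `.get_stats_count` later would then show up in a diff as one added line, without touching the `.get_ufo` line above it.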
Re: [PATCH 4/6] ibmveth: Add ethtool driver stats hooks
Brian King wrote: Add ethtool hooks to ibmveth to retrieve driver statistics. Signed-off-by: Brian King [EMAIL PROTECTED] --- drivers/net/ibmveth.c | 53 +- 1 file changed, 52 insertions(+), 1 deletion(-) diff -puN drivers/net/ibmveth.c~ibmveth_ethtool_driver_stats drivers/net/ibmveth.c --- linux-2.6/drivers/net/ibmveth.c~ibmveth_ethtool_driver_stats 2007-07-19 11:18:41.0 -0500 +++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-07-19 11:18:41.0 -0500 @@ -112,6 +112,28 @@ MODULE_DESCRIPTION(IBM i/pSeries Virtua MODULE_LICENSE(GPL); MODULE_VERSION(ibmveth_driver_version); +struct ibmveth_stat { + char name[ETH_GSTRING_LEN]; + int offset; +}; + +#define IBMVETH_STAT_OFF(stat) offsetof(struct ibmveth_adapter, stat) +#define IBMVETH_GET_STAT(a, off) *((u64 *)(((unsigned long)(a)) + off)) + +struct ibmveth_stat ibmveth_stats[] = { + { replenish_task_cycles, IBMVETH_STAT_OFF(replenish_task_cycles) }, + { replenish_no_mem, IBMVETH_STAT_OFF(replenish_no_mem) }, + { replenish_add_buff_failure, IBMVETH_STAT_OFF(replenish_add_buff_failure) }, + { replenish_add_buff_success, IBMVETH_STAT_OFF(replenish_add_buff_success) }, + { rx_invalid_buffer, IBMVETH_STAT_OFF(rx_invalid_buffer) }, + { rx_no_buffer, IBMVETH_STAT_OFF(rx_no_buffer) }, + { tx_multidesc_send, IBMVETH_STAT_OFF(tx_multidesc_send) }, + { tx_linearized, IBMVETH_STAT_OFF(tx_linearized) }, + { tx_linearize_failed, IBMVETH_STAT_OFF(tx_linearize_failed) }, + { tx_map_failed, IBMVETH_STAT_OFF(tx_map_failed) }, + { tx_send_failed, IBMVETH_STAT_OFF(tx_send_failed) } +}; + /* simple methods of getting data from the current rxq entry */ static inline int ibmveth_rxq_pending_buffer(struct ibmveth_adapter *adapter) { @@ -751,6 +773,32 @@ static u32 ibmveth_get_rx_csum(struct ne return adapter-rx_csum; } +static void ibmveth_get_strings(struct net_device *dev, u32 stringset, u8 *data) +{ + int i; + + if (stringset != ETH_SS_STATS) + return; + + for (i = 0; i ARRAY_SIZE(ibmveth_stats); i++, data += ETH_GSTRING_LEN) + memcpy(data, 
ibmveth_stats[i].name, ETH_GSTRING_LEN); +} + +static int ibmveth_get_stats_count(struct net_device *dev) +{ + return ARRAY_SIZE(ibmveth_stats); +} + +static void ibmveth_get_ethtool_stats(struct net_device *dev, + struct ethtool_stats *stats, u64 *data) +{ + int i; + struct ibmveth_adapter *adapter = dev-priv; + + for (i = 0; i ARRAY_SIZE(ibmveth_stats); i++) + data[i] = IBMVETH_GET_STAT(adapter, ibmveth_stats[i].offset); +} + static const struct ethtool_ops netdev_ethtool_ops = { .get_drvinfo= netdev_get_drvinfo, .get_settings = netdev_get_settings, @@ -761,7 +809,10 @@ static const struct ethtool_ops netdev_e .get_rx_csum= ibmveth_get_rx_csum, .set_rx_csum= ibmveth_set_rx_csum, .get_tso= ethtool_op_get_tso, - .get_ufo= ethtool_op_get_ufo + .get_ufo= ethtool_op_get_ufo, + .get_strings= ibmveth_get_strings, + .get_stats_count= ibmveth_get_stats_count, + .get_ethtool_stats = ibmveth_get_ethtool_stats ACK, modulo comma-at-end-of-initializer per previous email - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
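The name/offset table in the patch above is a common C idiom, so here is a userspace sketch of the same idea for readers who have not seen it. The `demo_adapter` structure, its fields and the `DEMO_*` helpers are all hypothetical; the sketch uses `memcpy` rather than a pointer cast to read each counter, which is a choice made for this illustration and not what the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical adapter with two counters; field names are illustrative. */
struct demo_adapter {
	int      id;
	uint64_t rx_no_buffer;
	uint64_t tx_send_failed;
};

struct demo_stat {
	const char *name;
	size_t      offset;	/* byte offset of the counter inside demo_adapter */
};

/* Build one table entry from a field name: its string and its offset. */
#define DEMO_STAT(field) { #field, offsetof(struct demo_adapter, field) }

static const struct demo_stat demo_stats[] = {
	DEMO_STAT(rx_no_buffer),
	DEMO_STAT(tx_send_failed),
};

#define DEMO_NSTATS (sizeof(demo_stats) / sizeof(demo_stats[0]))

/* Copy every counter listed in demo_stats[] into data[], in table order. */
static void demo_get_stats(const struct demo_adapter *a, uint64_t *data)
{
	size_t i;

	for (i = 0; i < DEMO_NSTATS; i++)
		memcpy(&data[i], (const char *)a + demo_stats[i].offset,
		       sizeof(data[i]));
}
```

The appeal of the table is that adding a statistic means adding one line, and the string list and the value list cannot drift out of order.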
Re: [PATCH 6/6] ibmveth: Remove use of bitfields
Brian King wrote:
> Removes the use of bitfields from the ibmveth driver. This results in slightly smaller object code.
>
> Signed-off-by: Brian King [EMAIL PROTECTED]
> ---
>  linux-2.6-bjking1/drivers/net/ibmveth.c |   90
>  linux-2.6-bjking1/drivers/net/ibmveth.h |   56 ---
>  2 files changed, 68 insertions(+), 78 deletions(-)

strong ACK :)

Though I also encourage you to avoid #defines for named constants, in favor of

	enum {
		IBMVETH_BUF_VALID	= (1U << 31),
		IBMVETH_BUF_TOGGLE	= (1U << 30),
		IBMVETH_BUF_NO_CSUM	= (1U << 25),
		IBMVETH_BUF_CSUM_GOOD	= (1U << 24),
		IBMVETH_BUF_LEN_MASK	= 0x00FFFFFF,
	};

This illustrates:

1) The "1 << n" notation is FAR easier to read and compare with data sheets. You're just adding to the trouble by requiring the reviewer's brain to convert hex numbers to bits, even if most engineers can do this in their sleep.

2) The named constants are available to the C compiler, which is more friendly to debuggers. It also supplies type information to the C compiler.

3) Similar to #2, wading through C pre-processor output is much easier when the symbols don't disappear.

These are recommendations, not requirements, but I've found these techniques superior to cpp in many other drivers.

Jeff
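As a rough illustration of the enum style recommended above, here is a standalone sketch that decodes a made-up 32-bit buffer descriptor. The `DEMO_*` names and helpers are hypothetical, and the 24-bit length mask is an assumption. Bit 31 is deliberately left out: as far as I know, ISO C before C23 only guarantees enumerators that fit in an int, and larger values rely on a compiler extension.

```c
#include <stdint.h>

/* Hypothetical descriptor layout; the 24-bit length field is an assumption. */
enum {
	DEMO_BUF_TOGGLE    = (1U << 30),
	DEMO_BUF_NO_CSUM   = (1U << 25),
	DEMO_BUF_CSUM_GOOD = (1U << 24),
	DEMO_BUF_LEN_MASK  = 0x00FFFFFF,
};

/* Extract the length field from a descriptor word. */
static inline uint32_t demo_buf_len(uint32_t desc)
{
	return desc & DEMO_BUF_LEN_MASK;
}

/* Return non-zero if the given flag bit is set in the descriptor. */
static inline int demo_buf_has(uint32_t desc, uint32_t flag)
{
	return (desc & flag) != 0;
}
```

For a descriptor built as `DEMO_BUF_TOGGLE | DEMO_BUF_CSUM_GOOD | 1500`, `demo_buf_len()` gives back 1500 and the flag tests read directly against the bit numbers in the enum.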
Re: [PATCH 68] drivers/net/s2io.c: kmalloc + memset conversion to k[cz]alloc
Mariusz Kozlowski wrote:
> Signed-off-by: Mariusz Kozlowski [EMAIL PROTECTED]
>
>  drivers/net/s2io.c | 235587 -> 235340 (-247 bytes)
>  drivers/net/s2io.o | 460768 -> 460120 (-648 bytes)
>
>  drivers/net/s2io.c |   14 +-
>  1 file changed, 5 insertions(+), 9 deletions(-)

ACK but didn't apply, please wait 24-48 hours (so that s2io fixes go upstream), then rediff and resend
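For readers outside the kernel, the userspace analogue of this kmalloc + memset to kcalloc/kzalloc conversion is malloc + memset to calloc. The sketch below uses hypothetical function names and plain libc calls; it is not taken from the s2io patch.

```c
#include <stdlib.h>
#include <string.h>

/* Before: allocate, then zero by hand; the size expression is written twice. */
static int *alloc_counters_old(size_t n)
{
	int *p = malloc(n * sizeof(*p));	/* n * size can overflow unnoticed */

	if (p)
		memset(p, 0, n * sizeof(*p));
	return p;
}

/* After: one call that returns zeroed memory for n elements. */
static int *alloc_counters_new(size_t n)
{
	return calloc(n, sizeof(int));
}
```

Both versions hand back zeroed memory; the second is shorter, and mainstream C libraries also check the count-times-size multiplication in calloc for overflow.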
Re: [PATCH 69] drivers/net/sb1250-mac.c: kmalloc + memset conversion to kcalloc
Mariusz Kozlowski wrote:
> Signed-off-by: Mariusz Kozlowski [EMAIL PROTECTED]
>
>  drivers/net/sb1250-mac.c | 76286 -> 76199 (-87 bytes)
>
>  drivers/net/sb1250-mac.c |    6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)

applied
Re: [PATCH 76] drivers/net/via-velocity.c: mostly kmalloc + memset conversion to kcalloc
Mariusz Kozlowski wrote:
> Signed-off-by: Mariusz Kozlowski [EMAIL PROTECTED]
>
>  drivers/net/via-velocity.c | 88263 -> 88120 (-143 bytes)
>  drivers/net/via-velocity.o | 254264 -> 253828 (-436 bytes)
>
>  drivers/net/via-velocity.c |   24 ++--
>  1 file changed, 10 insertions(+), 14 deletions(-)

applied
[PATCH] xen-netfront: remove dead code
This patch removes some residual dead code left over from removing the
flip receive mode. This patch doesn't change the generated output at
all, since gcc already realized it was dead.

This resolves the regression reported by Adrian.

Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED]
Cc: Adrian Bunk [EMAIL PROTECTED]
Cc: Michal Piotrowski [EMAIL PROTECTED]

---
 drivers/net/xen-netfront.c |   37 ++---
 1 file changed, 2 insertions(+), 35 deletions(-)

===
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -209,11 +209,9 @@ static void xennet_alloc_rx_buffers(stru
 	struct page *page;
 	int i, batch_target, notify;
 	RING_IDX req_prod = np->rx.req_prod_pvt;
-	struct xen_memory_reservation reservation;
 	grant_ref_t ref;
 	unsigned long pfn;
 	void *vaddr;
-	int nr_flips;
 	struct xen_netif_rx_request *req;
 
 	if (unlikely(!netif_carrier_ok(dev)))
@@ -263,7 +261,7 @@ no_skb:
 		np->rx_target = np->rx_max_target;
 
  refill:
-	for (nr_flips = i = 0; ; i++) {
+	for (i = 0; ; i++) {
 		skb = __skb_dequeue(&np->rx_batch);
 		if (skb == NULL)
 			break;
@@ -292,38 +290,7 @@ no_skb:
 		req->gref = ref;
 	}
 
-	if (nr_flips != 0) {
-		reservation.extent_start = np->rx_pfn_array;
-		reservation.nr_extents   = nr_flips;
-		reservation.extent_order = 0;
-		reservation.address_bits = 0;
-		reservation.domid        = DOMID_SELF;
-
-		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
-			/* After all PTEs have been zapped, flush the TLB. */
-			np->rx_mcl[i-1].args[MULTI_UVMFLAGS_INDEX] =
-				UVMF_TLB_FLUSH|UVMF_ALL;
-
-			/* Give away a batch of pages. */
-			np->rx_mcl[i].op = __HYPERVISOR_memory_op;
-			np->rx_mcl[i].args[0] = XENMEM_decrease_reservation;
-			np->rx_mcl[i].args[1] = (unsigned long)&reservation;
-
-			/* Zap PTEs and give away pages in one big
-			 * multicall. */
-			(void)HYPERVISOR_multicall(np->rx_mcl, i+1);
-
-			/* Check return status of HYPERVISOR_memory_op(). */
-			if (unlikely(np->rx_mcl[i].result != i))
-				panic("Unable to reduce memory reservation\n");
-		} else {
-			if (HYPERVISOR_memory_op(XENMEM_decrease_reservation,
-						 &reservation) != i)
-				panic("Unable to reduce memory reservation\n");
-		}
-	} else {
-		wmb();		/* barrier so backend seens requests */
-	}
+	wmb();		/* barrier so backend seens requests */
 
 	/* Above is a suitable barrier to ensure backend will see requests. */
 	np->rx.req_prod_pvt = req_prod + i;
Re: [PATCH 2/6] ibmveth: Implement ethtool hooks to enable/disable checksum offload
Jeff Garzik wrote:
> Brian King wrote:
> > This patch adds the appropriate ethtool hooks to allow for enabling/disabling
> > of hypervisor assisted checksum offload for TCP.
> >
> > Signed-off-by: Brian King [EMAIL PROTECTED]
> > ---
> >
> >  linux-2.6-bjking1/drivers/net/ibmveth.c |  118 +++-
> >  linux-2.6-bjking1/drivers/net/ibmveth.h |    1
> >  2 files changed, 117 insertions(+), 2 deletions(-)
> >
> > diff -puN drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool drivers/net/ibmveth.c
> > --- linux-2.6/drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool	2007-08-01 14:55:14.0 -0500
> > +++ linux-2.6-bjking1/drivers/net/ibmveth.c	2007-08-01 14:55:14.0 -0500
> > @@ -641,12 +641,125 @@ static u32 netdev_get_link(struct net_de
> >  	return 1;
> >  }
> >
> > +static void ibmveth_set_rx_csum_flags(struct net_device *dev, u32 data)
> > +{
> > +	struct ibmveth_adapter *adapter = dev->priv;
> > +
> > +	if (data)
> > +		adapter->rx_csum = 1;
> > +	else {
> > +		adapter->rx_csum = 0;
> > +		dev->features &= ~NETIF_F_IP_CSUM;
>
> why does this RX-related code clear a TX-related flag?

It's related to how the pSeries firmware works. The firmware provides an
interface to enable checksum offload, which means both tx and rx checksum
offload from the firmware's point of view. The firmware does not support
enabling checksum offload for only rx. If I disable it for rx I have to
disable it for tx as well, otherwise the firmware will reject all future
tx buffers I throw at it that are not checksummed.

-Brian

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center
Re: [RFC][BNX2X]: New driver for Broadcom 10Gb Ethernet.
Michael Buesch wrote:
> On Wednesday 01 August 2007 10:31:17 Michael Chan wrote:
> > +static irqreturn_t bnx2x_msix_sp_int(int irq, void *dev_instance)
> > +{
> > +	struct net_device *dev = dev_instance;
>
> You need to check if dev==NULL and bail out.
> Another driver sharing the IRQ with this might choose to pass
> the dev pointer as NULL.

NAK that advice: It is pointless having such a check in the hottest of
driver hot paths, since a large majority of drivers do not have such a
check. It is better to fix the extremely rare oddball that passes NULL
to request_irq(), than to update all drivers to be slower due to the
oddballs.

> > +	struct bnx2x *bp = netdev_priv(dev);
>
> No check if the device actually _did_ generate the IRQ? Sharing...

Not for MSI

> > +static irqreturn_t bnx2x_msix_fp_int(int irq, void *fp_cookie)
> > +{
> > +
> > +	struct bnx2x_fastpath *fp = fp_cookie;
>
> Check if fp==NULL

NAK

> > +	struct bnx2x *bp = fp->bp;
> > +	struct net_device *dev = bp->dev;
>
> No share protection either?

MSI

> > +static irqreturn_t bnx2x_interrupt(int irq, void *dev_instance)
> > +{
> > +	struct net_device *dev = dev_instance;
>
> Check if dev==NULL

NAK

> > +	struct bnx2x *bp = netdev_priv(dev);
> > +	u16 status = bnx2x_ack_int(bp);
> > +
> > +	if (unlikely(status == 0)) {
>
> That's not unlikely.

in this case, agreed

the other comments seem fairly sane, and indeed should be considered for
the existing drivers as well.

	Jeff
Re: [PATCH 2/6] ibmveth: Implement ethtool hooks to enable/disable checksum offload
Brian King wrote:
> Jeff Garzik wrote:
> > Brian King wrote:
> > > This patch adds the appropriate ethtool hooks to allow for enabling/disabling
> > > of hypervisor assisted checksum offload for TCP.
> > >
> > > Signed-off-by: Brian King [EMAIL PROTECTED]
> > > ---
> > >
> > >  linux-2.6-bjking1/drivers/net/ibmveth.c |  118 +++-
> > >  linux-2.6-bjking1/drivers/net/ibmveth.h |    1
> > >  2 files changed, 117 insertions(+), 2 deletions(-)
> > >
> > > diff -puN drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool drivers/net/ibmveth.c
> > > --- linux-2.6/drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool	2007-08-01 14:55:14.0 -0500
> > > +++ linux-2.6-bjking1/drivers/net/ibmveth.c	2007-08-01 14:55:14.0 -0500
> > > @@ -641,12 +641,125 @@ static u32 netdev_get_link(struct net_de
> > >  	return 1;
> > >  }
> > >
> > > +static void ibmveth_set_rx_csum_flags(struct net_device *dev, u32 data)
> > > +{
> > > +	struct ibmveth_adapter *adapter = dev->priv;
> > > +
> > > +	if (data)
> > > +		adapter->rx_csum = 1;
> > > +	else {
> > > +		adapter->rx_csum = 0;
> > > +		dev->features &= ~NETIF_F_IP_CSUM;
> >
> > why does this RX-related code clear a TX-related flag?
>
> It's related to how the pSeries firmware works. The firmware provides an
> interface to enable checksum offload, which means both tx and rx checksum
> offload from the firmware's point of view. The firmware does not support
> enabling checksum offload for only rx. If I disable it for rx I have to
> disable it for tx as well, otherwise the firmware will reject all future
> tx buffers I throw at it that are not checksummed.

ACK once you add a comment describing this :)
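For readers following the thread, the coupling Brian describes can be sketched in a few lines of userspace C, including the kind of comment Jeff asks for. This is only an illustration under assumptions: `struct sketch_dev`, `SKETCH_F_IP_CSUM` and `sketch_set_rx_csum()` are hypothetical stand-ins, not the ibmveth driver's actual types or the kernel's flag definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's TX checksum feature bit. */
#define SKETCH_F_IP_CSUM 0x2u

/* Hypothetical, simplified model of the device state touched by the hook. */
struct sketch_dev {
	uint32_t features;	/* TX-side feature bits advertised to the stack */
	int rx_csum;		/* nonzero while RX checksum offload is enabled */
};

/*
 * The firmware described in the thread enables checksum offload for both
 * directions at once. Disabling RX offload therefore also has to clear the
 * TX feature bit, otherwise unchecksummed TX buffers would be rejected.
 */
static void sketch_set_rx_csum(struct sketch_dev *dev, uint32_t enable)
{
	if (enable) {
		dev->rx_csum = 1;
	} else {
		dev->rx_csum = 0;
		dev->features &= ~SKETCH_F_IP_CSUM;	/* clear only this bit */
	}
}
```

Note the `&=`: a plain assignment of the inverted mask would set every other feature bit instead of clearing one.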
Re: [PATCH 2.6.23 1/3]S2IO: Making MSIX as default intr_type
Sivakumar Subramani wrote:
> - Making MSIX as default intr_type
> - Driver will test MSI-X by issuing test MSI-X vector and if fails it will
>   fallback to INTA
>
> Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
> Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]

This patch series looks a bit big to apply this far past the merge
window. It changes behavior in the middle of a -rc stream, which is
something to avoid since we are deep into bug fixes only mode at this
point.

I'm open to suggestions, otherwise I can apply these to netdev#upstream
(queued for 2.6.24).

	Jeff
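The "test MSI-X, fall back to INTA" behavior in the patch description can be pictured with a small userspace sketch. It is not the s2io code: the enum, the probe callback type and the helper names below are all hypothetical, and the probe is assumed to return 0 when the test vector was delivered.

```c
#include <stddef.h>

enum sketch_intr_type { SKETCH_INTA, SKETCH_MSI_X };

/* Hypothetical probe callback: returns 0 if the test vector was delivered. */
typedef int (*sketch_msix_probe_fn)(void *ctx);

/* Prefer MSI-X, but keep it only if the test interrupt actually arrives. */
static enum sketch_intr_type sketch_select_intr(sketch_msix_probe_fn probe,
						void *ctx)
{
	if (probe != NULL && probe(ctx) == 0)
		return SKETCH_MSI_X;
	return SKETCH_INTA;	/* fall back to legacy line-based interrupts */
}

/* Two stand-in probes, for illustration only. */
static int sketch_probe_delivered(void *ctx) { (void)ctx; return 0; }
static int sketch_probe_lost(void *ctx) { (void)ctx; return -1; }
```

With these stand-ins, `sketch_select_intr(sketch_probe_lost, NULL)` yields `SKETCH_INTA`, which is the fallback path the description mentions.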
Re: Please pull 'upstream-jgarzik' branch of wireless-2.6
John W. Linville wrote:
> These are intended for 2.6.24...
>
> ---
>
> The following changes since commit fdc8f43b5e49b64b251bb48da95193a13ac0132f:
>   Michael Buesch (1):
>         softmac: Fix deadlock of wx_set_essid with assoc work
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream-jgarzik
>
> Bill Nottingham (1):
>       remove gratuitous space in airo module description
>
> Faidon Liambotis (2):
>       Kconfig: order options
>       Kconfig: remove references of pcmcia-cs
>
> Mariusz Kozlowski (1):
>       drivers/net/wireless/prism54/oid_mgt.c: kmalloc + memset conversion to kzalloc
>
> Matthias Kaehlcke (1):
>       Use mutex instead of semaphore in the Host AP driver
>
> Ulrich Kunitz (1):
>       zd1211rw: monitor all packets
>
> Yoann Padioleau (1):
>       dev->priv to netdev_priv(dev), for drivers/net/wireless
>
>  drivers/net/wireless/Kconfig               |   85 +++
>  drivers/net/wireless/airo.c                |    4 +-
>  drivers/net/wireless/arlan-proc.c          |   14 ++--
>  drivers/net/wireless/hostap/hostap_cs.c    |    2 +-
>  drivers/net/wireless/hostap/hostap_hw.c    |   16 +++---
>  drivers/net/wireless/hostap/hostap_ioctl.c |   14 ++--
>  drivers/net/wireless/hostap/hostap_wlan.h  |    3 +-
>  drivers/net/wireless/orinoco_tmd.c         |    2 +-
>  drivers/net/wireless/prism54/isl_ioctl.c   |    6 +-
>  drivers/net/wireless/prism54/oid_mgt.c     |    4 +-
>  drivers/net/wireless/ray_cs.c              |   66 +++---
>  drivers/net/wireless/strip.c               |    2 +-
>  drivers/net/wireless/wl3501_cs.c           |   66 +++---
>  drivers/net/wireless/zd1211rw/zd_chip.h    |    5 --
>  drivers/net/wireless/zd1211rw/zd_mac.c     |   44 --
>  15 files changed, 172 insertions(+), 161 deletions(-)

pulled into #upstream
Re: [RFC][BNX2X]: New driver for Broadcom 10Gb Ethernet.
On Wednesday 08 August 2007 00:15:47 Jeff Garzik wrote:
> Michael Buesch wrote:
> > On Wednesday 01 August 2007 10:31:17 Michael Chan wrote:
> > > +static irqreturn_t bnx2x_msix_sp_int(int irq, void *dev_instance)
> > > +{
> > > +	struct net_device *dev = dev_instance;
> >
> > You need to check if dev==NULL and bail out.
> > Another driver sharing the IRQ with this might choose to pass
> > the dev pointer as NULL.
>
> NAK that advice: It is pointless having such a check in the hottest of
> driver hot paths, since a large majority of drivers do not have such a
> check. It is better to fix the extremely rare oddball that passes NULL
> to request_irq(), than to update all drivers to be slower due to the
> oddballs.

Ah, well. IMO one should better go safe than Oops. ;)
It's not that an if branch takes more than 2 or 3 CPU cycles
at worst. But well, if you don't like it, I can live without it, too.

-- 
Greetings Michael.
Re: Please pull 'libertas-upstream' branch of wireless-2.6
John W. Linville wrote:
> Got a big patch bomb from the libertas guys. I tried to cherry-pick
> some of the fixes for 2.6.23, but they either were fixes to problems
> in new code or all the code cleanups made them difficult for me to
> intelligently backport. So, this is intended for 2.6.24...
>
> ---
>
> The following changes since commit d4ac2477fad0f2680e84ec12e387ce67682c5c13:
>   Linus Torvalds (1):
>         Linux 2.6.23-rc2
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git libertas-upstream

This is missing the patch. I looked at it locally, but please do send a
patch for review with each push, no matter how big.

pulled into #upstream
Re: [PATCH] xen-netfront: remove dead code
Jeremy Fitzhardinge wrote:
> This patch removes some residual dead code left over from removing the
> flip receive mode. This patch doesn't change the generated output at
> all, since gcc already realized it was dead.
>
> This resolves the regression reported by Adrian.
>
> Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED]
> Cc: Adrian Bunk [EMAIL PROTECTED]
> Cc: Michal Piotrowski [EMAIL PROTECTED]
>
> ---
>  drivers/net/xen-netfront.c |   37 ++---
>  1 file changed, 2 insertions(+), 35 deletions(-)

Please send drivers/net/* through me and netdev...

	Jeff
Re: [PATCH RFC]: napi_struct V5
Thanks for looking at ipoib... overall looks fine, just a few comments.

> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -281,63 +281,58 @@ static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
>  			   wc->status, wr_id, wc->vendor_err);
>  }
>
> -int ipoib_poll(struct net_device *dev, int *budget)
> +int ipoib_poll(struct napi_struct *napi, int budget)

> +poll_more:
> +	while (done < budget) {
> +		int max = (budget - done);
> +
>  		t = min(IPOIB_NUM_WC, max);

I think this is the only place where max is used now. Might as well
kill it and put budget-done in directly. That would get rid of the
strange-looking parens in the "max =" line too.

>  		n = ib_poll_cq(priv->cq, t, priv->ibwc);
>
> -		for (i = 0; i < n; ++i) {
> +		for (i = 0; i < n; i++) {

it might be nicer to avoid noise like this in the patch.

> +	if (done < budget) {
> +		netif_rx_complete(dev, napi);
>  		if (unlikely(ib_req_notify_cq(priv->cq,
>  					      IB_CQ_NEXT_COMP |
>  					      IB_CQ_REPORT_MISSED_EVENTS)) &&
> -		    netif_rx_reschedule(dev, 0))
> -			return 1;
> -
> -		return 0;
> +		    netif_rx_reschedule(napi))
> +			goto poll_more;

this goto back to the polling loop is a change in behavior. When we
were tuning NAPI, we found that returning in the missed event case and
letting the NAPI core call the poll routine later actually performed
better, because it allowed more work to pile up. So could the code
just look like:

	netif_rx_complete(dev, napi);
	if (unlikely(ib_req_notify_cq(priv->cq,
				      IB_CQ_NEXT_COMP |
				      IB_CQ_REPORT_MISSED_EVENTS)))
		netif_rx_reschedule(napi);

and then just return done in all cases?

It doesn't seem like the return value of netif_rx_reschedule() matters
in what we would want to do. The only thing it's used for in the old
code is to decide what the poll routine should return.

 - R.
Re: [PATCH] xen-netfront: remove dead code
Jeff Garzik wrote:
> Please send drivers/net/* through me and netdev...

Sure. Did you pick this patch up?

Thanks,
    J
RE: [PATCH 2.6.23 1/3]S2IO: Making MSIX as default intr_type
Jeff,
Go ahead and apply these patches to netdev#upstream (queued for 2.6.24).

Thanks,
Ram

-----Original Message-----
From: Jeff Garzik [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 07, 2007 3:20 PM
To: Sivakumar Subramani
Cc: netdev@vger.kernel.org; support
Subject: Re: [PATCH 2.6.23 1/3]S2IO: Making MSIX as default intr_type

Sivakumar Subramani wrote:
> - Making MSIX as default intr_type
> - Driver will test MSI-X by issuing test MSI-X vector and if fails it will
>   fallback to INTA
>
> Signed-off-by: Sivakumar Subramani [EMAIL PROTECTED]
> Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED]

This patch series looks a bit big to apply this far past the merge
window. It changes behavior in the middle of a -rc stream, which is
something to avoid since we are deep into bug fixes only mode at this
point.

I'm open to suggestions, otherwise I can apply these to netdev#upstream
(queued for 2.6.24).

	Jeff
Re: [RFC][BNX2X]: New driver for Broadcom 10Gb Ethernet.
> > +static irqreturn_t bnx2x_msix_sp_int(int irq, void *dev_instance)
> > +{
> > +	struct net_device *dev = dev_instance;
>
> You need to check if dev==NULL and bail out.
> Another driver sharing the IRQ with this might choose to pass
> the dev pointer as NULL.

I don't really understand this. If another driver is sharing the IRQ
with a different device pointer (or even NULL), then that driver's
handler is the one that would be called. It's certainly the case that
an interrupt handler can be called for a shared interrupt generated by
another device, but a driver will never get a cookie back into its
interrupt handler different than the one it passed to request_irq().

A NULL check couldn't really help anything -- because if one driver's
dev_id can get passed into another driver's interrupt handler, the
non-NULL case would be even worse, because you would have one driver
poking into another driver's data structure. But fortunately the
kernel is smart enough not to create this mess.

(And also, MSI-X interrupts are never shared so this is doubly
irrelevant in this particular case)

 - R.
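The point about cookies can be made concrete with a toy userspace model of a shared interrupt line. This is a sketch under assumptions, not the kernel's implementation: `sketch_request_irq()`, `sketch_dispatch()` and the `sketch_nic` device are hypothetical names that only mimic the shape of request_irq() and the dispatch loop, to show that each handler is paired with the dev_id it registered.

```c
#include <stddef.h>

#define SKETCH_MAX_ACTIONS 4

typedef int (*sketch_handler_t)(int irq, void *dev_id);

/* One registered (handler, cookie) pair on a shared interrupt line. */
struct sketch_action {
	sketch_handler_t handler;
	void *dev_id;
};

struct sketch_irq_line {
	struct sketch_action actions[SKETCH_MAX_ACTIONS];
	size_t count;
};

/* Model of registration: store the handler together with its own cookie. */
static int sketch_request_irq(struct sketch_irq_line *line,
			      sketch_handler_t handler, void *dev_id)
{
	if (line->count >= SKETCH_MAX_ACTIONS)
		return -1;
	line->actions[line->count].handler = handler;
	line->actions[line->count].dev_id = dev_id;
	line->count++;
	return 0;
}

/*
 * Model of dispatch: every handler on the line runs, but each one is
 * passed only the cookie it registered with, never a neighbour's.
 */
static int sketch_dispatch(struct sketch_irq_line *line, int irq)
{
	int handled = 0;
	size_t i;

	for (i = 0; i < line->count; i++)
		handled += line->actions[i].handler(irq, line->actions[i].dev_id);
	return handled;
}

/* Example device: its handler inspects private state to see if it fired. */
struct sketch_nic {
	int irq_pending;
	int serviced;
};

static int sketch_nic_interrupt(int irq, void *dev_id)
{
	struct sketch_nic *nic = dev_id;

	(void)irq;
	if (!nic->irq_pending)
		return 0;	/* not ours: another device on the line fired */
	nic->irq_pending = 0;
	nic->serviced++;
	return 1;
}
```

In this model a handler can be called for an interrupt it did not raise, which is why it checks its own device state, but the pointer it receives is always the one it supplied at registration.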
Re: [RFC][BNX2X]: New driver for Broadcom 10Gb Ethernet.
On Wed, Aug 08, 2007 at 12:20:35AM +0200, Michael Buesch wrote:
> On Wednesday 08 August 2007 00:15:47 Jeff Garzik wrote:
> > Michael Buesch wrote:
> > > On Wednesday 01 August 2007 10:31:17 Michael Chan wrote:
> > > > +static irqreturn_t bnx2x_msix_sp_int(int irq, void *dev_instance)
> > > > +{
> > > > +	struct net_device *dev = dev_instance;
> > >
> > > You need to check if dev==NULL and bail out.
> > > Another driver sharing the IRQ with this might choose to pass
> > > the dev pointer as NULL.
> >
> > NAK that advice: It is pointless having such a check in the hottest of
> > driver hot paths, since a large majority of drivers do not have such a
> > check. It is better to fix the extremely rare oddball that passes NULL
> > to request_irq(), than to update all drivers to be slower due to the
> > oddballs.
>
> Ah, well. IMO one should better go safe than Oops. ;)
> It's not that an if branch takes more than 2 or 3 CPU cycles
> at worst. But well, if you don't like it, I can live without it, too.

Please take a look at kernel/irq/handle.c. The irq handler is always
called with the right dev_id argument. Everything would be a complete
nightmare to handle because you usually need to access the device
private data to check whether the shared irq is for this device.
Re: [PATCH RFC]: napi_struct V5
From: Roland Dreier [EMAIL PROTECTED]
Date: Tue, 07 Aug 2007 15:37:30 -0700

> >  		n = ib_poll_cq(priv->cq, t, priv->ibwc);
> >
> > -		for (i = 0; i < n; ++i) {
> > +		for (i = 0; i < n; i++) {
>
> it might be nicer to avoid noise like this in the patch.

That one was just too much of an eyesore to ignore and it affected my
ability to audit the change I was making. I mean, this is one of the
first precise examples of the kinds of programming that lead to subtle
bugs mentioned in The Practice of Programming. So this is staying in
the patch, sorry.

> this goto back to the polling loop is a change in behavior. When we
> were tuning NAPI, we found that returning in the missed event case and
> letting the NAPI core call the poll routine later actually performed
> better, because it allowed more work to pile up.

You weren't using your quantum, which is what you're supposed to do.

Sometimes using your quantum correctly won't perform optimally, but in
the interest of fairness and what NAPI wants, that is what you're
supposed to do: process work until you hit budget or there is no more
work.

Look, I'm not going to back down to every single tweak in every driver.
All the drivers should handle this case consistently, and if I have to
edit every single driver to make this patch that is exactly what I am
going to do and enforce.

If you patch the ipoib driver behavior back afterwards, I will NAK that
patch every single time unless you make EVERY SINGLE OTHER DRIVER do
the same and thus retain the consistency.
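The "process work until you hit budget or there is no more work" rule, including the re-poll after a missed event, can be sketched as a userspace model. This is an illustration only and not the ipoib or NAPI code: `struct sketch_cq`, `sketch_poll_cq()`, `sketch_rearm()` and `sketch_poll()` are hypothetical, and the `late_arrivals` field merely simulates completions that race with re-enabling notifications.

```c
/* Hypothetical model of a completion queue. */
struct sketch_cq {
	int pending;		/* completions waiting to be processed */
	int late_arrivals;	/* completions that land while re-arming */
	int armed;		/* nonzero once notifications are re-enabled */
};

/* Take up to max completions off the queue; returns how many were taken. */
static int sketch_poll_cq(struct sketch_cq *cq, int max)
{
	int n = cq->pending < max ? cq->pending : max;

	cq->pending -= n;
	return n;
}

/* Re-arm notifications; returns nonzero if an event was missed meanwhile. */
static int sketch_rearm(struct sketch_cq *cq)
{
	cq->armed = 1;
	if (cq->late_arrivals > 0) {
		cq->pending += cq->late_arrivals;
		cq->late_arrivals = 0;
		return 1;
	}
	return 0;
}

/* Poll routine: keep working until the budget is spent or the queue is dry. */
static int sketch_poll(struct sketch_cq *cq, int budget)
{
	int done = 0;

	for (;;) {
		while (done < budget) {
			int n = sketch_poll_cq(cq, budget - done);

			if (n == 0)
				break;
			done += n;
		}
		if (done >= budget)
			break;		/* quantum used up; caller polls again later */
		if (!sketch_rearm(cq))
			break;		/* queue empty and notifications are on */
		cq->armed = 0;		/* missed event: poll again within budget */
	}
	return done;
}
```

The alternative Roland proposes would correspond to returning right after the re-arm step instead of looping; the sketch follows the consistent behavior David asks for.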
Re: [RFC][BNX2X]: New driver for Broadcom 10Gb Ethernet.
From: Christoph Hellwig [EMAIL PROTECTED]
Date: Wed, 8 Aug 2007 00:04:59 +0100

> Please take a look at kernel/irq/handle.c. The irq handler is always
> called with the right dev_id argument. Everything would be a complete
> nightmare to handle because you usually need to access the device
> private data to check whether the shared irq is for this device.

Absolutely. I can't believe we're even discussing something so obvious
and wasting everyone's time.
Re: [PATCH] SMSC LAN911x and LAN921x vendor driver
>>>>> "Peter" == Peter Korsgaard [EMAIL PROTECTED] writes:

Hi,

 Peter> I'll give your driver a try and report back.

Ok, the driver seems to be working (after fixing up the accessor
routines for my hw setup) and performance is comparable to Dustin's
driver.

It would be nice if the driver would enable the byte swapping support
in the hw if it detects the wrong endian - Like this:

Index: linux/drivers/net/smsc911x.c
===
--- linux.orig/drivers/net/smsc911x.c
+++ linux/drivers/net/smsc911x.c
@@ -1787,6 +1787,13 @@
 		return -ENODEV;
 	}
 
+	/* check endian */
+	if (smsc911x_reg_read(pdata, BYTE_TEST) == 0x43218765) {
+		SMSC_TRACE("Byte test looks swapped, inverting");
+		smsc911x_reg_write(~smsc911x_reg_read(pdata, ENDIAN),
+				   pdata, ENDIAN);
+	}
+
 	/* Default generation to zero (all workarounds apply) */
 	pdata->generation = 0;
 

-- 
Bye, Peter Korsgaard
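The detection step in the proposed hunk can be illustrated in userspace. The sketch below assumes the BYTE_TEST register reads back 0x87654321 when the bus ordering is correct, so that the 0x43218765 value tested in the patch is the same pattern with its 16-bit halves swapped; that expected value, and all `sketch_` names, are assumptions for illustration rather than anything taken from the driver.

```c
#include <stdint.h>

/* Assumed readback of the byte-test register when ordering is correct. */
#define SKETCH_BYTE_TEST_EXPECTED 0x87654321u

enum sketch_order {
	SKETCH_ORDER_OK,
	SKETCH_ORDER_SWAPPED,
	SKETCH_ORDER_UNKNOWN
};

/* Swap the two 16-bit halves of a 32-bit word. */
static uint32_t sketch_swap_halves(uint32_t v)
{
	return (v << 16) | (v >> 16);
}

/* Classify a readback value as correct, half-swapped, or unrecognised. */
static enum sketch_order sketch_check_byte_test(uint32_t readback)
{
	if (readback == SKETCH_BYTE_TEST_EXPECTED)
		return SKETCH_ORDER_OK;
	if (readback == sketch_swap_halves(SKETCH_BYTE_TEST_EXPECTED))
		return SKETCH_ORDER_SWAPPED;
	return SKETCH_ORDER_UNKNOWN;
}
```

A driver would act only on the swapped case, as the hunk does by inverting the ENDIAN register; an unrecognised value more likely points at a missing or misconfigured device.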
[git patches] net driver fixes
Please pull from 'upstream-linus' branch of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream-linus to receive the following updates: drivers/net/atl1/atl1_main.c|4 +-- drivers/net/ehea/ehea.h |2 +- drivers/net/ehea/ehea_main.c| 44 ++ drivers/net/ibmveth.c | 27 +--- drivers/net/ibmveth.h |3 -- drivers/net/phy/phy.c |4 +- drivers/net/r8169.c | 24 ++- drivers/net/sis190.c|3 ++ drivers/net/smc91x.h|4 +-- drivers/net/ucc_geth_ethtool.c |1 - drivers/net/ucc_geth_mii.c |3 +- drivers/net/wireless/bcm43xx/bcm43xx_phy.c |2 +- drivers/net/wireless/rtl8187_dev.c |2 +- drivers/net/wireless/zd1211rw/zd_mac.c |2 +- fs/compat_ioctl.c |3 -- net/ieee80211/softmac/ieee80211softmac_wx.c | 11 +-- 16 files changed, 69 insertions(+), 70 deletions(-) Brian King (1): ibmveth: Fix rx pool deactivate oops Domen Puncer (2): ucc_geth: fix section mismatch phy layer: fix phy_mii_ioctl for autonegotiation Francois Romieu (1): r8169: avoid needless NAPI poll scheduling Ingo Molnar (1): atl1: use spin_trylock_irqsave() Jan Altenberg (1): ucc_geth: remove get_perm_addr from ucc_geth_ethtool.c John W. Linville (1): Revert [PATCH] bcm43xx: Fix deviation from specifications in set_baseband_attenuation Mariusz Kozlowski (1): drivers/net/ibmveth.c: memset fix Masakazu Mokuno (1): remove duplicated ioctl entries in compat_ioctl.c Michael Buesch (1): softmac: Fix deadlock of wx_set_essid with assoc work Michael Wu (1): rtl8187: ensure priv-hwaddr is always valid Neil Muller (1): sis190 check for ISA bridge on SiS966 Paul Mundt (1): net: smc91x: Build fixes for general sh boards. 
Roger So (1): r8169: PHY power-on fix Thomas Klein (3): ehea: Fix workqueue handling ehea: Simplify resource usage check ehea: Eliminated some compiler warnings Ulrich Kunitz (1): zd1211rw: fix filter for PSPOLL frames diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c index 56f6389..3c1984e 100644 --- a/drivers/net/atl1/atl1_main.c +++ b/drivers/net/atl1/atl1_main.c @@ -1704,10 +1704,8 @@ static int atl1_xmit_frame(struct sk_buff *skb, struct net_device *netdev) } } - local_irq_save(flags); - if (!spin_trylock(adapter-lock)) { + if (!spin_trylock_irqsave(adapter-lock, flags)) { /* Can't get lock - tell upper layer to requeue */ - local_irq_restore(flags); dev_printk(KERN_DEBUG, adapter-pdev-dev, tx locked\n); return NETDEV_TX_LOCKED; } diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h index 8ee2c2c..d67f97b 100644 --- a/drivers/net/ehea/ehea.h +++ b/drivers/net/ehea/ehea.h @@ -39,7 +39,7 @@ #include asm/io.h #define DRV_NAME ehea -#define DRV_VERSIONEHEA_0072 +#define DRV_VERSIONEHEA_0073 /* eHEA capability flags */ #define DLPAR_PORT_ADD_REM 1 diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c index 58702f5..9756211 100644 --- a/drivers/net/ehea/ehea_main.c +++ b/drivers/net/ehea/ehea_main.c @@ -1326,7 +1326,6 @@ static void write_swqe2_TSO(struct sk_buff *skb, u8 *imm_data = swqe-u.immdata_desc.immediate_data[0]; int skb_data_size = skb-len - skb-data_len; int headersize; - u64 tmp_addr; /* Packet is TCP with TSO enabled */ swqe-tx_control |= EHEA_SWQE_TSO; @@ -1347,9 +1346,8 @@ static void write_swqe2_TSO(struct sk_buff *skb, /* set sg1entry data */ sg1entry-l_key = lkey; sg1entry-len = skb_data_size - headersize; - - tmp_addr = (u64)(skb-data + headersize); - sg1entry-vaddr = ehea_map_vaddr(tmp_addr); + sg1entry-vaddr = + ehea_map_vaddr(skb-data + headersize); swqe-descriptors++; } } else @@ -1362,7 +1360,6 @@ static void write_swqe2_nonTSO(struct sk_buff *skb, int skb_data_size = skb-len - skb-data_len; 
u8 *imm_data = swqe-u.immdata_desc.immediate_data[0]; struct ehea_vsgentry *sg1entry = swqe-u.immdata_desc.sg_entry; - u64 tmp_addr; /* Packet is any nonTSO type * @@ -1379,8 +1376,8 @@ static void write_swqe2_nonTSO(struct sk_buff *skb, /* copy sg1entry data */ sg1entry-l_key = lkey; sg1entry-len = skb_data_size - SWQE2_MAX_IMM; - tmp_addr =
Re: [PATCH] xen-netfront: remove dead code
Jeremy Fitzhardinge wrote:
> Jeff Garzik wrote:
> > Please send drivers/net/* through me and netdev...
>
> Sure. Did you pick this patch up?

Yes. It's in my pending-for-2.6.24 folder, since it's not a bug fix.

	Jeff
Re: [PATCH] xen-netfront: remove dead code
Jeff Garzik wrote:
> Yes. It's in my pending-for-2.6.24 folder, since it's not a bug fix.

Great, thanks.

    J
[PATCH 0/14] nes: NetEffect 10Gb RNIC Driver
NetEffect is proud to announce the following series of patches which
contain the source code for the NE020 10Gb RNIC adapter. The driver is
split into two components - a kernel driver module and a userspace
library.

The code can also be found in the following git trees.

  git.openfabrics.org/~glenn/libnes.git
  git.openfabrics.org/~glenn/ofed_1_2.git
  git.openfabrics.org/~glenn/ofascripts.git
  git.openfabrics.org/~glenn/ofed_1_2_scripts.git

Requirements
============
 * NE020 hardware
 * RHEL4u4 or FC5
 * OFED 1.2 GA
 * OFED 1.2 version of MVAPICH2

Known issues
============
 * DAPL only works with 1 process per node due to lack of loopback
 * MPI over DAPL must use MPI shared memory loopback

Plans for next release
======================
 * Plan on adding verbs loopback to enable DAPL loopback
 * Increase robustness and stability

What we tested
==============
The performance results are meant to be broadly representative. Results
can vary depending on switches used, system configuration, OS used, etc.

Configuration notes:
  All two node tests were performed back-to-back, i.e. no switch.
  Multinode testing was performed using a high performance, low latency
  cut-through switch.
  Platform: CentOS x86_64

1. cbench rotate latency and bandwidth using mvapich2 over OFA verbs:
     Rotate Latency:   6.67 us
     Rotate Bandwidth: 9.3 Gbps

2. OSU bandwidth and latency tests using mvapich2 over OFA verbs:
     OSU Latency:      6.74 us
     OSU Bi-Bandwidth: 14.4 Gbps

3. Perftest (rdma_bw Uni-dir/Bi-dir and rdma_lat)
     RDMA Bandwidth Uni-directional: 8.9 Gbps
     RDMA Bandwidth Bi-directional : 15.09 Gbps
     RDMA Latency                  : 5.95 us

4. NIC Testing
     Iperf 4 stream Bi-directional test
       Iperf -c <Srvr IP Addr> -d -M -N -i 4 -P 4 (Jumbo packets enabled) -- 10.76 Gbps
     NetPerf
       netperf -H <Srvr IP Addr> -T1,1 -t TCP_STREAM -l 60 -C -c (Jumbo packets enabled) -- 6.2 Gbps

Thanks,
Glenn.
[PATCH 2/14] nes: device structures and defines
Main include file for device structures and defines Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED] --- diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes.h --- NULL1969-12-31 18:00:00.0 -0600 +++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes.h 2007-08-06 20:09:04.0 -0500 @@ -0,0 +1,525 @@ +/* + * Copyright (c) 2006 - 2007 NetEffect, Inc. All rights reserved. + * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#ifndef __NES_H +#define __NES_H + +#include linux/netdevice.h +#include linux/inetdevice.h +#include linux/spinlock.h +#include linux/kernel.h +#include linux/delay.h +#include linux/pci.h +#include linux/dma-mapping.h +#include linux/workqueue.h +#include linux/slab.h +#include asm/semaphore.h +#include linux/version.h + +#include rdma/ib_smi.h +#include rdma/ib_verbs.h +#include rdma/ib_pack.h +#include rdma/rdma_cm.h +#include rdma/iw_cm.h + +#define TBIRD +#define NES_TWO_PORT +#define NES_LEGACY_INT_DETECT +#define NES_ENABLE_CQE_READ +#define NES_SEND_FIRST_WRITE + +#define QUEUE_DISCONNECTS + +#define DRV_BUILD rc9.13.1 + +#define DRV_NAMEiw_nes +#define DRV_VERSION0.4 Build DRV_BUILD +#define PFX DRV_NAME : + +/* + * NetEffect PCI vendor id and NE010 PCI device id. + */ +#ifndef PCI_VENDOR_ID_NETEFFECT/* not in pci.ids yet */ +#define PCI_VENDOR_ID_NETEFFECT 0x1678 +#define PCI_DEVICE_ID_NETEFFECT_NE020 0x0100 +#endif + +#define NE020_REV 4 + +#define BAR_0 0 +#define BAR_1 2 + +#define RX_BUF_SIZE(1536 + 8) + +#define NES_REG0_SIZE (4 * 1024) +#define NES_TX_TIMEOUT (6*HZ) +#define NES_FIRST_QPN 64 +#define NES_SW_CONTEXT_ALIGN 1024 + +#define NES_NIC_MAX_NICS 16 +#define NES_MAX_ARP_TABLE_SIZE 4096 + +#define MAX_DPC_ITERATIONS 128 + +/* debug levels */ +#define NES_DBG_RX 0x0001 +#define NES_DBG_RX_PKT_DUMP0x0002 +#define NES_DBG_TX 0x0004 +#define NES_DBG_TX_PKT_DUMP0x0008 +#define NES_DBG_ALL0x + +#define NES_DRV_OPT_ENABLE_MPA_VER_0 0x0001 +#define NES_DRV_OPT_DISABLE_MPA_CRC0x0002 +#define NES_DRV_OPT_DISABLE_FIRST_WRITE0x0004 +#define NES_DRV_OPT_DISABLE_INTF 0x0008 +#define NES_DRV_OPT_ENABLE_MSI 0x0010 +#define NES_DRV_OPT_DUAL_LOGICAL_PORT 0x0020 +#define NES_DRV_OPT_SUPRESS_OPTION_BC 0x0040 + +#define NES_AEQ_EVENT_TIMEOUT 2500 +#define NES_DISCONNECT_EVENT_TIMEOUT 2000 + +#ifdef NES_DEBUG +#define assert(expr) \ +if(!(expr)) { \ + printk(KERN_ERR PFX Assertion failed! 
%s, %s, %s, line %d\n, \ + #expr, __FILE__, __FUNCTION__, __LINE__); \ +} +#ifndef dprintk +#define dprintk(fmt, args...) do { printk(KERN_ERR PFX fmt, ##args); } while (0) +#endif +#define NES_EVENT_TIMEOUT 120 +/* #define NES_EVENT_TIMEOUT 1200 */ +#else +#define assert(expr) do {} while (0) +#define dprintk(fmt, args...) do {} while (0) + +#define NES_EVENT_TIMEOUT 10 +#endif + +#include nes_hw.h +#include nes_verbs.h +#include nes_context.h +#include nes_user.h +#include nes_cm.h + + +extern unsigned int nes_drv_opt; +extern unsigned int nes_debug_level; + +extern struct list_head
Re: [patch 1/1] NetLabel: add missing rcu_dereference() calls in the LSM domain mapping hash table
From: Paul Moore [EMAIL PROTECTED]
Date: Tue, 07 Aug 2007 16:54:50 -0400

> The LSM domain mapping head table pointer was not being referenced via the RCU
> safe dereferencing function, rcu_dereference(). This patch adds those missing
> calls to the NetLabel code.
>
> This has been tested using recent linux-2.6 git kernels with no visible
> regressions.
>
> Signed-off-by: Paul Moore [EMAIL PROTECTED]

Patch applied, thanks Paul.
[PATCH 4/14] nes: connection manager structures and defines
NetEffect connection manager includes, structures and defines. Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED] --- diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_cm.h --- NULL1969-12-31 18:00:00.0 -0600 +++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_cm.h 2007-08-07 13:35:50.0 -0500 @@ -0,0 +1,419 @@ +/* + * Copyright (c) 2006 - 2007 NetEffect, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + */ + +#ifndef NES_CM_H +#define NES_CM_H + +#define QUEUE_EVENTS + +#define NES_MANAGE_APBVT_DEL 0 +#define NES_MANAGE_APBVT_ADD 1 + +/* IETF MPA -- defines, enums, structs */ +#define IEFT_MPA_KEY_REQ MPA ID Req Frame +#define IEFT_MPA_KEY_REP MPA ID Rep Frame +#define IETF_MPA_KEY_SIZE 16 +#define IETF_MPA_VERSION 1 + +enum ietf_mpa_flags { + IETF_MPA_FLAGS_MARKERS = 0x80, /* receive Markers */ + IETF_MPA_FLAGS_CRC = 0x40, /* receive Markers */ + IETF_MPA_FLAGS_REJECT = 0x20, /* Reject */ +}; + +struct ietf_mpa_frame { + u8 key[IETF_MPA_KEY_SIZE]; + u8 flags; + u8 rev; + u16 priv_data_len; + u8 priv_data[0]; +}; + +#define ietf_mpa_req_resp_frame ietf_mpa_frame + +struct nes_v4_quad { + u32 rsvd0; + u32 DstIpAdrIndex; /* Only most significant 5 bits are valid */ + u32 SrcIpadr; + u32 TcpPorts; /* src is low, dest is high */ +}; + +struct nes_cm_node; +enum nes_timer_type +{ + NES_TIMER_TYPE_SEND, + NES_TIMER_TYPE_RECV, + NES_TIMER_NODE_CLEANUP, + NES_TIMER_TYPE_CLOSE, +}; + +#define MAX_NES_IFS 4 + +#define SET_ACK 1 +#define SET_SYN 2 +#define SET_FIN 4 +#define SET_RST 8 + +struct option_base { + u8 optionnum; + u8 length; +}; + +enum option_numbers { + OPTION_NUMBER_END, + OPTION_NUMBER_NONE, + OPTION_NUMBER_MSS, + OPTION_NUMBER_WINDOW_SCALE, + OPTION_NUMBER_SACK_PERM, + OPTION_NUMBER_SACK, + OPTION_NUMBER_WRITE0 = 0xbc +}; + +struct option_mss { + u8 optionnum; + u8 length; + u16 mss; +}; + +struct option_windowscale { + u8 optionnum; + u8 length; + u8 shiftcount; +}; + +union all_known_options{ + char as_end; + struct option_base as_base; + struct option_mss as_mss; + struct option_windowscale as_windowscale; +}; + +struct nes_timer_entry +{ + struct list_head list; + unsigned long timetosend; /* jiffies */ + struct sk_buff *skb; + u32 type; + u32 retrycount; + u32 retranscount; + u32 context; + u32 seq_num; + u32 send_retrans; + struct net_device *netdev; +}; + +#define NES_DEFAULT_RETRYS 64 +#define NES_DEFAULT_RETRANS 4 +#define NES_RETRY_TIMEOUT 
(1000*HZ/1000) +#define NES_SHORT_TIME (10) +#define NES_LONG_TIME (2000*HZ/1000) + +#define NES_CM_HASHTABLE_SIZE 1024 +#define NES_CM_TCP_TIMER_INTERVAL 3000 +#define NES_CM_DEFAULT_MTU 1540 +#define NES_CM_DEFAULT_FRAME_CNT 10 +#define NES_CM_THREAD_STACK_SIZE 256 +#define NES_CM_DEFAULT_RCV_WND 64240 // before we know that window scaling is allowed +#define NES_CM_DEFAULT_RCV_WND_SCALED 256960 // after we know that window scaling is allowed +#define NES_CM_DEFAULT_RCV_WND_SCALE 2 +#define NES_CM_DEFAULT_FREE_PKTS 0x000A +#define NES_CM_FREE_PKT_LO_WATERMARK 2 + +#define NES_CM_DEF_SEQ0x159bf75f +#define NES_CM_DEF_LOCAL_ID 0x3b47 + +#define NES_CM_DEF_SEQ2 0x18ed5740 +#define NES_CM_DEF_LOCAL_ID2 0xb807 + +typedef u32 nes_addr_t; + +#define
Re: [PATCH RFC]: napi_struct V5
From: jamal [EMAIL PROTECTED]
Date: Tue, 07 Aug 2007 08:52:35 -0400

That doc is out of date on the split of work - it focuses mostly on describing the original tulip which did not mix rx and tx in the napi_poll(). AFAIK, no driver does that today (although I really liked that scheme, there is a lot of fscked hardware out there that won't work well with that scheme). Where are the firemen when you need them?

I am tempted to suggest we toss the document completely, for two reasons:

1) It's geared towards conversions, whereas %99. of the conversions that will ever happen, have happened. Every single potential reader of this document is therefore writing new drivers with NAPI from the beginning.

2) Inaccurate documentation is often worse than no documentation.

It's not a bad thing to delete the document, and also we have a lot of time until 2.6.24 finalizes with these changes, and in that time someone with the right incentive could write a fresh new NAPI manual that represents reality. Such a document could be added after the merge window closes.

This also reminds me that we confuse people by having two driver models for interrupt handling. I've been reluctant to remove the optional component of NAPI, especially when it didn't handle multi-queue properly (which basically made drivers for virtualized devices impossible without using dummy devices for each queue). But that is no longer true and there isn't any reason for a new driver not to be NAPI from the beginning.
Re: [PATCH] net/core/utils: fix sparse warning
From: Johannes Berg [EMAIL PROTECTED]
Date: Mon, 06 Aug 2007 18:37:26 +0200

net_msg_warn is not defined because it is in net/sock.h which isn't included.

Signed-off-by: Johannes Berg [EMAIL PROTECTED]

Applied, thanks Johannes.
Re: [RFC] allow device to stop packet mirror behaviour
From: Johannes Berg [EMAIL PROTECTED]
Date: Tue, 07 Aug 2007 10:25:55 +0200

The only way to solve this problem therefore seems to be to suppress the mirroring out of the packet by dev_queue_xmit_nit(). The patch below does that by way of adding a new netdev flag.

Multicast packets also get looped back in a similar manner in the ipv4 code. These will also be seen twice due to this issue. There are probably many other examples as well; dev_queue_xmit_nit() is just the tip of the iceberg.
Re: [PATCH 2/14] nes: device structures and defines
[EMAIL PROTECTED] wrote:

+#ifndef PCI_VENDOR_ID_NETEFFECT /* not in pci.ids yet */
+#define PCI_VENDOR_ID_NETEFFECT 0x1678

this should be part of your patch

+#define PCI_DEVICE_ID_NETEFFECT_NE020 0x0100

no need for a #define at all, just use the hex number if the ONLY place it's used is in the pci_device_id table. Doing so avoids the patch hell that is pci_ids.h, and avoids adding a constant for a single-use hex number that's arbitrary anyway

+#define BAR_0 0
+#define BAR_1 2

delete

+#define RX_BUF_SIZE (1536 + 8)

this number was blindly copied from another driver, right?

+#ifdef NES_DEBUG
+#define assert(expr) \
+if(!(expr)) { \
+	printk(KERN_ERR PFX "Assertion failed! %s, %s, %s, line %d\n", \
+		#expr, __FILE__, __FUNCTION__, __LINE__); \
+}
+#ifndef dprintk
+#define dprintk(fmt, args...) do { printk(KERN_ERR PFX fmt, ##args); } while (0)
+#endif

look around, we already have debug macros. you're probably copying from an older net driver that doesn't yet use the new stuff

+#define NES_EVENT_TIMEOUT 120
+/* #define NES_EVENT_TIMEOUT 1200 */
+#else
+#define assert(expr) do {} while (0)
+#define dprintk(fmt, args...) do {} while (0)
+
+#define NES_EVENT_TIMEOUT 10
+#endif
+
+#include "nes_hw.h"
+#include "nes_verbs.h"
+#include "nes_context.h"
+#include "nes_user.h"
+#include "nes_cm.h"
+
+
+extern unsigned int nes_drv_opt;
+extern unsigned int nes_debug_level;
+
+extern struct list_head nes_adapter_list;
+extern struct list_head nes_dev_list;
+
+extern int max_mtu;
+#define max_frame_len (max_mtu+ETH_HLEN)
+extern int interrupt_mod_interval;
+
+struct nes_device {
+	struct nes_adapter *nesadapter;
+	void __iomem *regs;
+	void __iomem *index_reg;
+	struct pci_dev *pcidev;
+	struct net_device *netdev[NES_NIC_MAX_NICS];

this is questionable. why do you need multiple netdevs? multiple ports? ok. multiple queues? not ok. see recent netdev discussions.
+	u64 link_status_interrupts;
+	struct tasklet_struct dpc_tasklet;
+	spinlock_t indexed_regs_lock;
+	unsigned long doorbell_start;
+	unsigned long csr_start;
+	unsigned long mac_tx_errors;
+	unsigned long mac_pause_frames_sent;
+	unsigned long mac_pause_frames_received;
+	unsigned long mac_rx_errors;
+	unsigned long mac_rx_crc_errors;
+	unsigned long mac_rx_symbol_err_frames;
+	unsigned long mac_rx_jabber_frames;
+	unsigned long mac_rx_oversized_frames;
+	unsigned long mac_rx_short_frames;
+	unsigned int mac_index;
+	unsigned int nes_stack_start;
+
+	/* Control Structures */
+	void *cqp_vbase;
+	dma_addr_t cqp_pbase;
+	u32 cqp_mem_size;
+	u8 ceq_index;
+	u8 nic_ceq_index;
+	struct nes_hw_cqp cqp;
+	struct nes_hw_cq ccq;
+	struct list_head cqp_avail_reqs;
+	struct list_head cqp_pending_reqs;
+	struct nes_cqp_request *nes_cqp_requests;
+
+	u32 int_req;
+	u32 int_stat;
+	u32 timer_int_req;
+	u32 timer_only_int_count;
+	u32 intf_int_req;
+	u32 et_rx_coalesce_usecs_irq;
+	struct list_head list;
+
+	u16 base_doorbell_index;
+	u8 msi_enabled;
+	u8 netdev_count;
+	u8 napi_isr_ran;
+	u8 disable_rx_flow_control;
+	u8 disable_tx_flow_control;

#1: please consider using tabs to separate type and name, which makes the struct definition far easier to read. See drivers/net/tg3.h for an example

#2: consider putting all RX-related items together, and all TX-related items together. this makes cacheline sharing more likely.
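The grouping suggested in #2 can be illustrated with a small userspace sketch. It is a hypothetical rearrangement of a few of the counters quoted above, not a proposed layout for the driver: the struct name, the helper, and the 64-byte line size are all assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size for this sketch; real hardware may differ. */
#define CACHELINE_SKETCH 64

/* Hypothetical regrouping: RX counters together, TX counters together. */
struct mac_stats_sketch {
	/* RX group, starting on its own cache line */
	_Alignas(CACHELINE_SKETCH) unsigned long rx_errors;
	unsigned long	rx_crc_errors;
	unsigned long	rx_symbol_err_frames;
	unsigned long	rx_jabber_frames;
	unsigned long	rx_oversized_frames;
	unsigned long	rx_short_frames;
	unsigned long	rx_pause_frames_received;

	/* TX group, starting on the next cache line */
	_Alignas(CACHELINE_SKETCH) unsigned long tx_errors;
	unsigned long	tx_pause_frames_sent;
};

/* Index of the cache line that a given struct offset falls into. */
size_t cacheline_of_sketch(size_t offset)
{
	return offset / CACHELINE_SKETCH;
}
```

With this layout, code on the receive path touches one line for all of its counters and does not pull in the transmit counters, and the reverse holds for the transmit path.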
+/* Linux kernel version interface changes */
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18))
+static inline unsigned short nes_skb_lso_size(const struct sk_buff *skb)
+{
+	return(skb_shinfo(skb)->tso_size);
+}
+#define nes_skb_linearize(_skb, _type) skb_linearize(_skb, _type)
+#define NES_INIT_WORK(_work, _func, _data) INIT_WORK(_work, _func)
+#else
+static inline unsigned short nes_skb_lso_size(const struct sk_buff *skb)
+{
+	return(skb_shinfo(skb)->gso_size);
+}
+#define nes_skb_linearize(_skb, _type) skb_linearize(_skb)
+#define NES_INIT_WORK(_work, _func, _data) INIT_WORK(_work, _func)
+#endif

delete all old-kernel compat code. not appropriate for upstream submission

+static inline u32 nes_read32(const void __iomem * addr)
+{
+	return(le32_to_cpu(readl(addr)));
+}
+
+static inline u16 nes_read16(const void __iomem * addr)
+{
+	return(le16_to_cpu(readw(addr)));
+}
+
+static inline u8 nes_read8(const void __iomem * addr)
+{
+	return(readb(addr));
+}

#1: delete these completely useless
[PATCH 7/14] nes: hardware specific includes
Hardware structures and defines Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED] --- diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_hw.h --- NULL1969-12-31 18:00:00.0 -0600 +++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_hw.h 2007-08-06 20:09:05.0 -0500 @@ -0,0 +1,1102 @@ +/* +* Copyright (c) 2006 - 2007 NetEffect, Inc. All rights reserved. +* +* This software is available to you under a choice of one of two +* licenses. You may choose to be licensed under the terms of the GNU +* General Public License (GPL) Version 2, available from the file +* COPYING in the main directory of this source tree, or the +* OpenIB.org BSD license below: +* +* Redistribution and use in source and binary forms, with or +* without modification, are permitted provided that the following +* conditions are met: +* +* - Redistributions of source code must retain the above +*copyright notice, this list of conditions and the following +*disclaimer. +* +* - Redistributions in binary form must reproduce the above +*copyright notice, this list of conditions and the following +*disclaimer in the documentation and/or other materials +*provided with the distribution. +* +* THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, +* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS +* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +* SOFTWARE. 
+*/ + +#ifndef __NES_HW_H +#define __NES_HW_H + +enum pci_regs { + NES_INT_STAT = 0x, + NES_INT_MASK = 0x0004, + NES_INT_PENDING = 0x0008, + NES_INTF_INT_STAT = 0x000C, + NES_INTF_INT_MASK = 0x0010, + NES_TIMER_STAT = 0x0014, + NES_PERIODIC_CONTROL = 0x0018, + NES_ONE_SHOT_CONTROL = 0x001C, + NES_EEPROM_COMMAND = 0x0020, + NES_EEPROM_DATA = 0x0024, + NES_SOFTWARE_RESET = 0x0030, + NES_CQ_ACK = 0x0034, + NES_WQE_ALLOC = 0x0040, + NES_CQE_ALLOC = 0x0044, +}; + +enum indexed_regs { + NES_IDX_CREATE_CQP_LOW = 0x, + NES_IDX_CREATE_CQP_HIGH = 0x0004, + NES_IDX_QP_CONTROL = 0x0040, + NES_IDX_FLM_CONTROL = 0x0080, + NES_IDX_INT_CPU_STATUS = 0x00a0, + NES_IDX_GPIO_CONTROL = 0x00f0, + NES_IDX_GPIO_DATA = 0x00f4, + NES_IDX_TCP_CONFIG0 = 0x01e4, + NES_IDX_TCP_TIMER_CONFIG = 0x01ec, + NES_IDX_TCP_NOW = 0x01f0, + NES_IDX_QP_MAX_CFG_SIZES = 0x0200, + NES_IDX_QP_CTX_SIZE = 0x0218, + NES_IDX_TCP_TIMER_SIZE0 = 0x0238, + NES_IDX_TCP_TIMER_SIZE1 = 0x0240, + NES_IDX_ARP_CACHE_SIZE = 0x0258, + NES_IDX_CQ_CTX_SIZE = 0x0260, + NES_IDX_MRT_SIZE = 0x0278, + NES_IDX_PBL_REGION_SIZE = 0x0280, + NES_IDX_IRRQ_COUNT = 0x02b0, + NES_IDX_RX_WINDOW_BUFFER_PAGE_TABLE_SIZE = 0x02f0, + NES_IDX_RX_WINDOW_BUFFER_SIZE = 0x0300, + NES_IDX_DST_IP_ADDR = 0x0400, + NES_IDX_PCIX_DIAG = 0x08e8, + NES_IDX_MPP_DEBUG = 0x0a00, + NES_IDX_MPP_LB_DEBUG = 0x0b00, + NES_IDX_DENALI_CTL_22 = 0x1058, + NES_IDX_MAC_TX_CONTROL = 0x2000, + NES_IDX_MAC_TX_CONFIG = 0x2004, + NES_IDX_MAC_TX_PAUSE_QUANTA = 0x2008, + NES_IDX_MAC_RX_CONTROL = 0x200c, + NES_IDX_MAC_RX_CONFIG = 0x2010, + NES_IDX_MAC_EXACT_MATCH_BOTTOM = 0x201c, + NES_IDX_MAC_MDIO_CONTROL = 0x2084, + NES_IDX_MAC_TX_OCTETS_LOW = 0x2100, + NES_IDX_MAC_TX_OCTETS_HIGH = 0x2104, + NES_IDX_MAC_TX_FRAMES_LOW = 0x2108, + NES_IDX_MAC_TX_FRAMES_HIGH = 0x210c, + NES_IDX_MAC_TX_PAUSE_FRAMES = 0x2118, + NES_IDX_MAC_TX_ERRORS = 0x2138, + NES_IDX_MAC_RX_OCTETS_LOW = 0x213c, + NES_IDX_MAC_RX_OCTETS_HIGH = 0x2140, + NES_IDX_MAC_RX_FRAMES_LOW = 0x2144, + NES_IDX_MAC_RX_FRAMES_HIGH = 
0x2148, + NES_IDX_MAC_RX_BC_FRAMES_LOW = 0x214c, + NES_IDX_MAC_RX_MC_FRAMES_HIGH = 0x2150, + NES_IDX_MAC_RX_PAUSE_FRAMES = 0x2154, + NES_IDX_MAC_RX_SHORT_FRAMES = 0x2174, + NES_IDX_MAC_RX_OVERSIZED_FRAMES = 0x2178, + NES_IDX_MAC_RX_JABBER_FRAMES = 0x217c, + NES_IDX_MAC_RX_CRC_ERR_FRAMES = 0x2180, + NES_IDX_MAC_RX_LENGTH_ERR_FRAMES = 0x2184, + NES_IDX_MAC_RX_SYMBOL_ERR_FRAMES = 0x2188, + NES_IDX_MAC_INT_STATUS = 0x21f0, + NES_IDX_MAC_INT_MASK = 0x21f4, + NES_IDX_PHY_PCS_CONTROL_STATUS0 = 0x2800, + NES_IDX_PHY_PCS_CONTROL_STATUS1 = 0x2a00, + NES_IDX_ETH_SERDES_COMMON_CONTROL0 = 0x2808, + NES_IDX_ETH_SERDES_COMMON_CONTROL1 = 0x2a08, + NES_IDX_ETH_SERDES_COMMON_STATUS0 = 0x280c, + NES_IDX_ETH_SERDES_COMMON_STATUS1 = 0x2a0c, + NES_IDX_ETH_SERDES_TX_EMP0 = 0x2810, +
[PATCH 8/14] nes: nic device routines
Nic device routines. Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED] --- diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_nic.c --- NULL1969-12-31 18:00:00.0 -0600 +++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_nic.c 2007-08-06 20:09:05.0 -0500 @@ -0,0 +1,1467 @@ +/* + * Copyright (c) 2006 - 2007 NetEffect, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + */ + +#include linux/module.h +#include linux/moduleparam.h +#include linux/netdevice.h +#include linux/etherdevice.h +#include linux/ip.h +#include linux/tcp.h +#include linux/if_arp.h +#include linux/if_vlan.h +#include linux/ethtool.h +#include net/tcp.h + +#include net/inet_common.h +#include linux/inet.h + +#include nes.h + +struct nic_qp_map nic_qp_mapping_0[] = { + {16,0,0,1},{24,4,0,0},{28,8,0,0},{32,12,0,0}, + {20,2,2,1},{26,6,2,0},{30,10,2,0},{34,14,2,0}, + {18,1,1,1},{25,5,1,0},{29,9,1,0},{33,13,1,0}, + {22,3,3,1},{27,7,3,0},{31,11,3,0},{35,15,3,0} +}; + +struct nic_qp_map nic_qp_mapping_1[] = { + {18,1,1,1},{25,5,1,0},{29,9,1,0},{33,13,1,0}, + {22,3,3,1},{27,7,3,0},{31,11,3,0},{35,15,3,0} +}; + +struct nic_qp_map nic_qp_mapping_2[] = { + {20,2,2,1},{26,6,2,0},{30,10,2,0},{34,14,2,0} +}; + +struct nic_qp_map nic_qp_mapping_3[] = { + {22,3,3,1},{27,7,3,0},{31,11,3,0},{35,15,3,0} +}; + +struct nic_qp_map nic_qp_mapping_4[] = { + {28,8,0,0},{32,12,0,0} +}; + +struct nic_qp_map nic_qp_mapping_5[] = { + {29,9,1,0},{33,13,1,0} +}; + +struct nic_qp_map nic_qp_mapping_6[] = { + {30,10,2,0},{34,14,2,0} +}; + +struct nic_qp_map nic_qp_mapping_7[] = { + {31,11,3,0},{35,15,3,0} +}; + +struct nic_qp_map *nic_qp_mapping_per_function[] = { + nic_qp_mapping_0, nic_qp_mapping_1, nic_qp_mapping_2, nic_qp_mapping_3, + nic_qp_mapping_4, nic_qp_mapping_5, nic_qp_mapping_6, nic_qp_mapping_7 +}; + +extern int nics_per_function; + +static const u32 default_msg = NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK + | NETIF_MSG_IFUP | NETIF_MSG_IFDOWN; +static int debug = -1; + +static int rdma_enabled = 0; +extern atomic_t cm_connects; +extern atomic_t cm_accepts; +extern atomic_t cm_disconnects; +extern atomic_t cm_closes; +extern atomic_t cm_connecteds; +extern atomic_t cm_connect_reqs; +extern atomic_t cm_rejects; +extern atomic_t mod_qp_timouts; +extern atomic_t qps_created; +extern atomic_t qps_destroyed; +extern atomic_t sw_qps_destroyed; +extern u32 mh_detected; + 
+static int nes_netdev_open(struct net_device *); +static int nes_netdev_stop(struct net_device *); +static int nes_netdev_start_xmit(struct sk_buff *, struct net_device *); +static struct net_device_stats *nes_netdev_get_stats(struct net_device *); +static void nes_netdev_tx_timeout(struct net_device *); +static int nes_netdev_set_mac_address(struct net_device *, void *); +static int nes_netdev_change_mtu(struct net_device *, int); + +#ifdef NES_NAPI +/** + * nes_netdev_poll + */ +static int nes_netdev_poll(struct net_device* netdev, int* budget) +{ + struct nes_vnic *nesvnic = netdev_priv(netdev); + struct nes_device *nesdev = nesvnic-nesdev; + struct nes_hw_nic_cq *nescq = nesvnic-nic_cq; + + nesvnic-budget = *budget; + nesvnic-cqes_pending = 0; + nesvnic-rx_cqes_completed = 0; + nesvnic-cqe_allocs_pending = 0; + + nes_nic_ce_handler(nesdev, nescq); + + netdev-quota -= nesvnic-rx_cqes_completed; + *budget -= nesvnic-rx_cqes_completed; + + if (0 == nesvnic-cqes_pending) { + netif_rx_complete(netdev); + /* clear out completed
Re: [RFC] cubic: backoff after slow start
Hi Stephen,

We have been working on slow start and we have a nice solution for this. We will send you a patch and test results.

Thanks
Injong

----- Original Message -----
From: Stephen Hemminger [EMAIL PROTECTED]
To: Injong Rhee [EMAIL PROTECTED]; Sangtae Ha [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org
Sent: Tuesday, August 07, 2007 2:37 PM
Subject: [RFC] cubic: backoff after slow start

CUBIC takes several unnecessary iterations to converge out of slow start. This is most noticeable over a link where the bottleneck queue size is much larger than the BDP, and the sender has to fill the pipe in slow start before the first loss. Typical consumer broadband links seem to have a large queue (up to 2 secs) that needs to get filled before the first loss.

A possible fix is to use a beta of .5 (same as original TCP) when leaving slow start. Originally, the Linux version didn't do slow start so it probably never was observed.

--- a/net/ipv4/tcp_cubic.c 2007-08-02 12:16:22.0 +0100
+++ b/net/ipv4/tcp_cubic.c 2007-08-03 15:57:12.0 +0100
@@ -289,7 +289,11 @@ static u32 bictcp_recalc_ssthresh(struct
 	ca->loss_cwnd = tp->snd_cwnd;
-	return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
+	/* Initial backoff when leaving slow start */
+	if (tp->snd_ssthresh == 0x7fff)
+		return max(tp->snd_cwnd >> 1U, 2U);
+	else
+		return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
 }
 static u32 bictcp_undo_cwnd(struct sock *sk)
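The decision the RFC patch makes can be tried out in userspace with a minimal sketch. This is not the kernel function: the names ending in _sketch are hypothetical, the scale of 1024 and the "still at initial ssthresh" sentinel value are assumptions of this example, and beta is passed in as a parameter rather than read from a module setting.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point scale for beta; beta/scale is the backoff factor. */
#define BETA_SCALE_SKETCH 1024u

/*
 * Assumed sentinel meaning "ssthresh was never lowered", i.e. the
 * connection is still in its first slow start.
 */
#define INITIAL_SSTHRESH_SKETCH 0x7fffffffu

static uint32_t max_u32_sketch(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Returns the new slow start threshold after a loss.
 * On the first loss (leaving slow start) the window is halved, as in
 * classic TCP; on later losses the gentler beta/scale factor applies.
 * The result is never below 2 segments.
 */
uint32_t recalc_ssthresh_sketch(uint32_t cwnd, uint32_t ssthresh, uint32_t beta)
{
	assert(beta <= BETA_SCALE_SKETCH);

	if (ssthresh == INITIAL_SSTHRESH_SKETCH)
		return max_u32_sketch(cwnd >> 1, 2u);

	/* 64-bit intermediate avoids overflow for large windows */
	return max_u32_sketch((uint32_t)(((uint64_t)cwnd * beta) / BETA_SCALE_SKETCH), 2u);
}
```

As an illustration, with a window of 100 segments and an example beta of 819, the first loss gives a threshold of 50, while a later loss gives 79 (100 * 819 / 1024, rounded down). The larger cut on the first loss is what drains the overfilled bottleneck queue faster.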
[PATCH 9/14] nes: kernel to userspace structures
Kernel to userspace includes, structures and defines. Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED] --- diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_user.h --- NULL1969-12-31 18:00:00.0 -0600 +++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_user.h 2007-08-06 20:09:05.0 -0500 @@ -0,0 +1,95 @@ +/* + * Copyright (c) 2006 - 2007 NetEffect. All rights reserved. + * Copyright (c) 2005 Topspin Communications. All rights reserved. + * Copyright (c) 2005 Cisco Systems. All rights reserved. + * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + */ + +#ifndef NES_USER_H +#define NES_USER_H + +#include linux/types.h + +/* + * Make sure that all structs defined in this file remain laid out so + * that they pack the same way on 32-bit and 64-bit architectures (to + * avoid incompatibility between 32-bit userspace and 64-bit kernels). + * In particular do not use pointer types -- pass pointers in __u64 + * instead. + */ + +struct nes_alloc_ucontext_resp { + __u32 max_pds; /* maximum pds allowed for this user process */ + __u32 max_qps; /* maximum qps allowed for this user process */ + __u32 wq_size; /* size of the WQs (sq+rq) allocated to the mmaped area */ + __u32 reserved; +}; + +struct nes_alloc_pd_resp { + __u32 pd_id; + __u32 mmap_db_index; +}; + +struct nes_create_cq_req { + __u64 user_cq_buffer; +}; + +enum iwnes_memreg_type { + IWNES_MEMREG_TYPE_MEM = 0x, + IWNES_MEMREG_TYPE_QP = 0x0001, + IWNES_MEMREG_TYPE_CQ = 0x0002, + IWNES_MEMREG_TYPE_MW = 0x0003, + IWNES_MEMREG_TYPE_FMR = 0x0004, +}; + +struct nes_mem_reg_req { + __u32 reg_type; /* indicates if id is memory, QP or CQ */ + __u32 reserved; +}; + +struct nes_create_cq_resp { + __u32 cq_id; + __u32 cq_size; + __u32 mmap_db_index; + __u32 reserved; +}; + +struct nes_create_qp_resp { + __u32 qp_id; + __u32 actual_sq_size; + __u32 actual_rq_size; + __u32 mmap_sq_db_index; + __u32 mmap_rq_db_index; + __u32 reserved; +}; + +#endif /* NES_USER_H */ - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
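The layout rule stated in the header comment above (same packing on 32-bit and 64-bit, pointers carried in __u64) can be checked mechanically. The following is a userspace sketch under stated assumptions: it uses uint32_t and uint64_t in place of __u32 and __u64, the struct and helper names ending in _sketch are hypothetical copies made for illustration, and the compile-time checks are an addition of this example rather than part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of a request that carries a user pointer as 64 bits. */
struct create_cq_req_sketch {
	uint64_t user_cq_buffer;
};

/* Illustrative copy of a response made only of 32-bit fields. */
struct create_qp_resp_sketch {
	uint32_t qp_id;
	uint32_t actual_sq_size;
	uint32_t actual_rq_size;
	uint32_t mmap_sq_db_index;
	uint32_t mmap_rq_db_index;
	uint32_t reserved;	/* keeps the size a multiple of 8 bytes */
};

/* Fail the build if the layout ever depends on the word size. */
_Static_assert(sizeof(struct create_cq_req_sketch) == 8,
	       "request must be 8 bytes on every ABI");
_Static_assert(sizeof(struct create_qp_resp_sketch) == 24,
	       "response must be 24 bytes on every ABI");

/* Pack a pointer into the fixed-width field used on the wire. */
uint64_t ptr_to_u64_sketch(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer on the same side that packed it. */
void *u64_to_ptr_sketch(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

Because no field is a pointer or a long, a 32-bit process and a 64-bit kernel agree on every offset, and the explicit reserved field avoids relying on compiler-inserted tail padding.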
[PATCH 2.6.24] S2io: Enhance device error/alarm handling
- Added support to poll for entire set of device errors and alarms. - Optimized interrupt routine fast path. - Removed the unused variable, intr_type, in device private structure. Signed-off-by: Santosh Rastapur [EMAIL PROTECTED] Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED] --- diff -Nurp 2.0.26.1/drivers/net/s2io.c 2.0.26.2/drivers/net/s2io.c --- 2.0.26.1/drivers/net/s2io.c 2007-08-06 15:24:43.0 -0700 +++ 2.0.26.2/drivers/net/s2io.c 2007-08-06 15:22:29.0 -0700 @@ -84,7 +84,7 @@ #include s2io.h #include s2io-regs.h -#define DRV_VERSION 2.0.26.1 +#define DRV_VERSION 2.0.26.2 /* S2io Driver name version. */ static char s2io_driver_name[] = Neterion; @@ -263,7 +263,14 @@ static char ethtool_driver_stats_keys[][ {serious_err_cnt}, {soft_reset_cnt}, {fifo_full_cnt}, - {ring_full_cnt}, + {ring_0_full_cnt}, + {ring_1_full_cnt}, + {ring_2_full_cnt}, + {ring_3_full_cnt}, + {ring_4_full_cnt}, + {ring_5_full_cnt}, + {ring_6_full_cnt}, + {ring_7_full_cnt}, (alarm_transceiver_temp_high), (alarm_transceiver_temp_low), (alarm_laser_bias_current_high), @@ -303,7 +310,24 @@ static char ethtool_driver_stats_keys[][ (rx_tcode_fcs_err_cnt), (rx_tcode_buf_size_err_cnt), (rx_tcode_rxd_corrupt_cnt), - (rx_tcode_unkn_err_cnt) + (rx_tcode_unkn_err_cnt), + {tda_err_cnt}, + {pfc_err_cnt}, + {pcc_err_cnt}, + {tti_err_cnt}, + {tpa_err_cnt}, + {sm_err_cnt}, + {lso_err_cnt}, + {mac_tmac_err_cnt}, + {mac_rmac_err_cnt}, + {xgxs_txgxs_err_cnt}, + {xgxs_rxgxs_err_cnt}, + {rc_err_cnt}, + {prc_pcix_err_cnt}, + {rpa_err_cnt}, + {rda_err_cnt}, + {rti_err_cnt}, + {mc_err_cnt} }; #define S2IO_XENA_STAT_LEN sizeof(ethtool_xena_stats_keys)/ ETH_GSTRING_LEN @@ -802,7 +826,7 @@ static void free_shared_mem(struct s2io_ if (!nic) return; - + dev = nic-dev; mac_control = nic-mac_control; @@ -892,7 +916,7 @@ static void free_shared_mem(struct s2io_ k++; } kfree(mac_control-rings[i].ba[j]); - nic-mac_control.stats_info-sw_stat.mem_freed += (sizeof(struct buffAdd) * + 
nic-mac_control.stats_info-sw_stat.mem_freed += (sizeof(struct buffAdd) * (rxd_count[nic-rxd_mode] + 1)); } kfree(mac_control-rings[i].ba); @@ -1456,7 +1480,7 @@ static int init_nic(struct s2io_nic *nic bar0-rts_frm_len_n[i]); } } - + /* Disable differentiated services steering logic */ for (i = 0; i 64; i++) { if (rts_ds_steer(nic, i, 0) == FAILURE) { @@ -1586,7 +1610,7 @@ static int init_nic(struct s2io_nic *nic val64 = RTI_DATA2_MEM_RX_UFC_A(0x1) | RTI_DATA2_MEM_RX_UFC_B(0x2) ; - if (nic-intr_type == MSI_X) + if (nic-config.intr_type == MSI_X) val64 |= (RTI_DATA2_MEM_RX_UFC_C(0x20) | \ RTI_DATA2_MEM_RX_UFC_D(0x40)); else @@ -1724,7 +1748,7 @@ static int init_nic(struct s2io_nic *nic static int s2io_link_fault_indication(struct s2io_nic *nic) { - if (nic-intr_type != INTA) + if (nic-config.intr_type != INTA) return MAC_RMAC_ERR_TIMER; if (nic-device_type == XFRAME_II_DEVICE) return LINK_UP_DOWN_INTERRUPT; @@ -1732,6 +1756,362 @@ static int s2io_link_fault_indication(st return MAC_RMAC_ERR_TIMER; } + +void en_dis_err_alarms(struct s2io_nic *nic, u16 mask, int flag) +{ + struct XENA_dev_config __iomem *bar0 = nic-bar0; + register u64 val64 = 0, temp64 = 0, gen_int_mask = 0; + + if (mask TX_DMA_INTR) { + gen_int_mask |= TXDMA_INT_M; + + if (flag == ENABLE_INTRS) { + + val64 = TXDMA_TDA_INT|TXDMA_PFC_INT|TXDMA_PCC_INT + |TXDMA_TTI_INT|TXDMA_LSO_INT|TXDMA_TPA_INT + |TXDMA_SM_INT; + temp64 = readq(bar0-txdma_int_mask); + temp64 = ~((u64) val64); + writeq(temp64, bar0-txdma_int_mask); + + val64 = PFC_ECC_DB_ERR|PFC_SM_ERR_ALARM|PFC_MISC_0_ERR + |PFC_MISC_1_ERR|PFC_PCIX_ERR|PFC_ECC_SG_ERR; + temp64 = readq(bar0-pfc_err_mask); + temp64 = ~((u64) val64); + writeq(temp64, bar0-pfc_err_mask); + + val64 = TDA_Fn_ECC_DB_ERR|TDA_SM0_ERR_ALARM +
[PATCH 10/14] nes: eeprom, phy, routines
Misc eeprom, phy, debug, etc routines.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]
---
diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_utils.c
--- NULL	1969-12-31 18:00:00.0 -0600
+++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_utils.c	2007-08-06 20:09:05.0 -0500
@@ -0,0 +1,835 @@
+/*
+ * Copyright (c) 2006 - 2007 NetEffect, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/mii.h>
+#include <linux/if_vlan.h>
+#include <linux/crc32.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/init.h>
+
+#include <asm/io.h>
+#include <asm/irq.h>
+#include <asm/byteorder.h>
+
+#include "nes.h"
+
+#define BITMASK(X) (1L << (X))
+#define NES_CRC_WID 32
+
+static u16 nes_read16_eeprom(void __iomem *addr, u16 offset);
+
+static u32 nesCRCTable[256];
+static u32 nesCRCInitialized = 0;
+
+static u32 nesCRCWidMask(u32);
+static u32 nes_crc_table_gen(u32 *, u32, u32, u32);
+static u32 reflect(u32, u32);
+static u32 byte_swap(u32, u32);
+
+u32 mh_detected;
+
+/**
+ * nes_read_eeprom_values -
+ */
+int nes_read_eeprom_values(struct nes_device *nesdev, struct nes_adapter *nesadapter)
+{
+	u32 mac_addr_low;
+	u16 mac_addr_high;
+	u16 eeprom_data;
+	u16 eeprom_offset;
+	u32 index;
+
+	/* TODO: deal with EEPROM endian issues */
+	if (nesadapter->firmware_eeprom_offset == 0) {
+		/* Read the EEPROM Parameters */
+		eeprom_data = nes_read16_eeprom(nesdev->regs, 0);
+		dprintk("EEPROM Offset 0 = 0x%04X\n", eeprom_data);
+		eeprom_offset = 2 + (((eeprom_data & 0x007f) << 3) <<
+				((eeprom_data & 0x0080) >> 7));
+		dprintk("Firmware Offset = 0x%04X\n", eeprom_offset);
+		nesadapter->firmware_eeprom_offset = eeprom_offset;
+		eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset + 4);
+		if (eeprom_data != 0x5746) {
+			dprintk("Not a valid Firmware Image = 0x%04X\n", eeprom_data);
+			return(-1);
+		}
+
+		eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset + 2);
+		dprintk("EEPROM Offset %u = 0x%04X\n", eeprom_offset + 2, eeprom_data);
+		eeprom_offset += ((eeprom_data & 0x00ff) << 3) << ((eeprom_data & 0x0100) >> 8);
+		dprintk("Software Offset = 0x%04X\n", eeprom_offset);
+		nesadapter->software_eeprom_offset = eeprom_offset;
+		eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset);
+		dprintk("EEPROM Offset %u = 0x%04X\n", eeprom_offset, eeprom_data);
+		eeprom_data = nes_read16_eeprom(nesdev->regs, eeprom_offset + 4);
+		if (eeprom_data != 0x5753) {
+			dprintk("Not a valid Software Image = 0x%04X\n", eeprom_data);
+			return(-1);
+		}
+
+		/* eeprom is valid */
+		eeprom_offset = nesadapter->software_eeprom_offset;
+		eeprom_offset += 8;
+		nesadapter->netdev_max = (u8)nes_read16_eeprom(nesdev->regs, eeprom_offset);
+		eeprom_offset += 2;
+		mac_addr_high = nes_read16_eeprom(nesdev->regs, eeprom_offset);
+		eeprom_offset += 2;
+		mac_addr_low = (u32)nes_read16_eeprom(nesdev->regs, eeprom_offset);
+		eeprom_offset += 2;
+		mac_addr_low <<= 16;
[PATCH 12/14] nes: OpenFabrics kernel verb includes
OpenFabrics kernel verbs provider structures and defines.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]
---
diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_verbs.h
--- NULL	1969-12-31 18:00:00.0 -0600
+++ ofa_kernel-1.2/drivers/infiniband/hw/nes/nes_verbs.h	2007-08-06 20:09:05.0 -0500
@@ -0,0 +1,152 @@
+/*
+ * Copyright (c) 2006 NetEffect, Inc. All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef NES_VERBS_H
+#define NES_VERBS_H
+
+struct nes_device;
+
+#define NES_MAX_USER_DB_REGIONS 4096
+#define NES_MAX_USER_WQ_REGIONS 4096
+
+struct nes_ucontext {
+	struct ib_ucontext ibucontext;
+	struct nes_device *nesdev;
+	unsigned long mmap_wq_offset;
+	unsigned long mmap_cq_offset; /* to be removed */
+	int index; /* rnic index (minor) */
+	unsigned long allocated_doorbells[BITS_TO_LONGS(NES_MAX_USER_DB_REGIONS)];
+	u16 mmap_db_index[NES_MAX_USER_DB_REGIONS];
+	u16 first_free_db;
+	unsigned long allocated_wqs[BITS_TO_LONGS(NES_MAX_USER_WQ_REGIONS)];
+	struct nes_qp * mmap_nesqp[NES_MAX_USER_WQ_REGIONS];
+	u16 first_free_wq;
+	struct list_head cq_reg_mem_list;
+};
+
+struct nes_pd {
+	struct ib_pd ibpd;
+	u16 pd_id;
+	atomic_t sqp_count;
+	u16 mmap_db_index;
+};
+
+struct nes_mr {
+	union {
+		struct ib_mr ibmr;
+		struct ib_mw ibmw;
+		struct ib_fmr ibfmr;
+	};
+	u16 pbls_used;
+	u8 mode;
+	u8 pbl_4k;
+};
+
+struct nes_hw_pb {
+	u32 pa_low;
+	u32 pa_high;
+};
+
+struct nes_vpbl {
+	dma_addr_t pbl_pbase;
+	struct nes_hw_pb *pbl_vbase;
+};
+
+struct nes_root_vpbl {
+	dma_addr_t pbl_pbase;
+	struct nes_hw_pb *pbl_vbase;
+	struct nes_vpbl *leaf_vpbl;
+};
+
+struct nes_av;
+
+struct nes_cq {
+	struct ib_cq ibcq;
+	struct nes_hw_cq hw_cq;
+	u32 polled_completions;
+	u32 cq_mem_size;
+	spinlock_t lock;
+	u8 virtual_cq;
+	u8 pad[3];
+};
+
+struct nes_wq {
+	spinlock_t lock;
+};
+
+struct iw_cm_id;
+struct ietf_mpa_frame;
+
+struct nes_qp {
+	struct ib_qp ibqp;
+	void * allocated_buffer;
+	struct iw_cm_id *cm_id;
+	struct workqueue_struct *wq;
+	struct work_struct disconn_work;
+	struct socket *ksock;
+	struct nes_cq *nesscq;
+	struct nes_cq *nesrcq;
+	struct nes_pd *nespd;
+	struct ietf_mpa_frame *ietf_frame;
+	dma_addr_t ietf_frame_pbase;
+	wait_queue_head_t state_waitq;
+	unsigned long socket;
+	struct nes_hw_qp hwqp;
+	struct work_struct work;
+	struct work_struct ae_work;
+	enum ib_qp_state ibqp_state;
+	u32 iwarp_state;
+	u32 hte_index;
+	u32 last_aeq;
+	u32 qp_mem_size;
+	atomic_t refcount;
+	u32 mmap_sq_db_index;
+	u32 mmap_rq_db_index;
+	spinlock_t lock;
+	struct nes_qp_context *nesqp_context;
+	dma_addr_t nesqp_context_pbase;
+	wait_queue_head_t kick_waitq;
+	u16 in_disconnect;
+	u16 private_data_len;
+	u8 active_conn;
+	u8 skip_lsmm;
+	u8 user_mode;
+	u8 hte_added;
+	u8 hw_iwarp_state;
+	u8 flush_issued;
+	u8 hw_tcp_state;
+	u8 disconn_pending;
+	void *cm_node_p; /* handle of the node this QP is associated with */
+};
+#endif /* NES_VERBS_H */
Re: TCP's initial cwnd setting correct?...
From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Mon, 6 Aug 2007 15:37:15 +0300 (EEST)

@@ -805,13 +805,13 @@ void tcp_update_metrics(struct sock *sk)
 	}
 }

-/* Numbers are taken from RFC2414. */
+/* Numbers are taken from RFC3390. */
 __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst)
 {
 	__u32 cwnd = (dst ? dst_metric(dst, RTAX_INITCWND) : 0);

 	if (!cwnd) {
-		if (tp->mss_cache > 1460)
+		if (tp->mss_cache >= 2190)
 			cwnd = 2;
 		else
 			cwnd = (tp->mss_cache > 1095) ? 3 : 4;

I remember suggesting something similar about 5 or 6 years ago, and Alexey Kuznetsov at the time explained the numbers which are there and why they should not be changed. I forget the reasons though, and I'll try to do the research.

These numbers have been like this forever, FWIW.
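For reference while the research is pending, here is a small user-space sketch of the RFC 3390 arithmetic. It is not kernel code. The function names are made up for illustration, and the comparison operators in old_init_cwnd() and patched_init_cwnd() are an assumption about what the two sides of the quoted hunk say. RFC 3390 bounds the initial window at min(4*MSS, max(2*MSS, 4380 bytes)); the sketch converts that byte limit into whole segments.

```c
/* RFC 3390 upper bound on the initial window, in bytes:
 * min(4 * MSS, max(2 * MSS, 4380)). */
unsigned int rfc3390_iw_bytes(unsigned int mss)
{
	unsigned int lower = (2 * mss > 4380) ? 2 * mss : 4380;
	unsigned int upper = 4 * mss;

	return (upper < lower) ? upper : lower;
}

/* Number of whole MSS-sized segments that fit in that byte limit. */
unsigned int rfc3390_iw_segments(unsigned int mss)
{
	return rfc3390_iw_bytes(mss) / mss;
}

/* Thresholds on the "-" side of the hunk (assumed: mss > 1460). */
unsigned int old_init_cwnd(unsigned int mss)
{
	if (mss > 1460)
		return 2;
	return (mss > 1095) ? 3 : 4;
}

/* Thresholds on the "+" side of the hunk (assumed: mss >= 2190). */
unsigned int patched_init_cwnd(unsigned int mss)
{
	if (mss >= 2190)
		return 2;
	return (mss > 1095) ? 3 : 4;
}
```

Under these assumptions, old_init_cwnd() matches rfc3390_iw_segments() for every MSS, because 3 * 1460 = 4380 is the largest three-segment burst that stays within the byte limit. patched_init_cwnd() returns 3 for an MSS between 1461 and 2189, which exceeds 4380 bytes if the limit is counted in whole segments. This is only one possible reading of why 1460 is there, not a statement of the original reasoning.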
Re: [PATCH] TCP: H-TCP maxRTT estimation at startup
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 3 Aug 2007 10:57:56 +0100

Small patch to H-TCP from Douglas Leith. Fix estimation of maxRTT.

The original code ignores rtt measurements during slow start (via the check tp->snd_ssthresh < 0xFFFF), yet this is probably a good time to try to estimate max rtt, as delayed acking is disabled and slow start will only exit on a loss, which presumably corresponds to a maxrtt measurement.

Second, the original code (via the check htcp_ccount(ca) > 3) ignores rtt data during what it estimates to be the first 3 round-trip times. This seems like an unnecessary check now that the RCV timestamps are no longer used for rtt estimation.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied, thanks Stephen.
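The effect of dropping the two checks can be sketched in user space as below. This is a simplified illustration, not the H-TCP code: struct rtt_est and both function names are hypothetical, the constant 0xFFFF is assumed to be the slow-start marker the quoted check compares against, and other conditions the real estimator may apply are left out.

```c
/* Hypothetical, simplified estimator state (not the kernel's H-TCP struct). */
struct rtt_est {
	unsigned int min_rtt;	/* 0 means no sample seen yet */
	unsigned int max_rtt;
};

/* Assumed marker: ssthresh at or above this means "still in initial slow start". */
#define SSTHRESH_UNSET 0xFFFF

/* The minimum is tracked on every sample in both variants. */
static void track_min(struct rtt_est *e, unsigned int rtt)
{
	if (e->min_rtt == 0 || rtt < e->min_rtt)
		e->min_rtt = rtt;
}

/* Before: maxRTT only updated after slow start and after 3 round trips. */
void measure_rtt_gated(struct rtt_est *e, unsigned int rtt,
		       unsigned int ssthresh, unsigned int ccount)
{
	track_min(e, rtt);
	if (ssthresh < SSTHRESH_UNSET && ccount > 3 && rtt > e->max_rtt)
		e->max_rtt = rtt;
}

/* After: both gating checks removed, so slow-start samples count too. */
void measure_rtt_ungated(struct rtt_est *e, unsigned int rtt)
{
	track_min(e, rtt);
	if (rtt > e->max_rtt)
		e->max_rtt = rtt;
}
```

Feeding the same slow-start sample to both variants leaves max_rtt at 0 in the gated one and records it in the ungated one, which is the behaviour change the patch description argues for.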
[PATCH 13/14] nes: kernel build infrastructure
Kconfig kernel build file.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]
---
diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/Kconfig
--- NULL	1969-12-31 18:00:00.0 -0600
+++ ofa_kernel-1.2/drivers/infiniband/hw/nes/Kconfig	2007-08-06 20:09:04.0 -0500
@@ -0,0 +1,15 @@
+config INFINIBAND_NES
+	tristate "NetEffect RNIC Driver"
+	depends on PCI && INET && INFINIBAND
+	---help---
+	  This is a low-level driver for NetEffect RDMA enabled
+	  Network Interface Cards (RNIC).
+
+config INFINIBAND_NES_DEBUG
+	bool "Verbose debugging output"
+	depends on INFINIBAND_NES
+	default n
+	---help---
+	  This option causes the NetEffect RNIC driver to produce debug
+	  messages.  Select this if you are developing the driver
+	  or trying to diagnose a problem.
[PATCH 14/14] nes: kernel build infrastructure
Makefile kernel build file.

Signed-off-by: Glenn Grundstrom [EMAIL PROTECTED]
---
diff -Nurp NULL ofa_kernel-1.2/drivers/infiniband/hw/nes/Makefile
--- NULL	1969-12-31 18:00:00.0 -0600
+++ ofa_kernel-1.2/drivers/infiniband/hw/nes/Makefile	2007-08-06 20:09:04.0 -0500
@@ -0,0 +1,10 @@
+ifdef CONFIG_INFINIBAND_NES_DEBUG
+EXTRA_CFLAGS += -DNES_DEBUG
+endif
+
+EXTRA_CFLAGS += -DNES_MINICM
+
+obj-$(CONFIG_INFINIBAND_NES) += iw_nes.o
+
+iw_nes-objs := nes.o nes_hw.o nes_nic.o nes_utils.o nes_verbs.o nes_cm.o
+