RE: [PATCH] Disable TSO for non standard qdiscs
> ...But, on the other hand, in this case the realization seems to be wrong: probably still all locally created packets will be treated the same - or I miss something?
>
> Jarek P.

The TCP layer will generate TSO packets based on the kernel socket features associated with the flow. So if you have two devices, one supporting TSO, the other not, then the flows associated with the non-TSO device will not have their packets built for TSO. This has no bearing on the device supporting TSO, whose feature flags will propagate into the kernel socket for that flow and cause any TCP flows to that device to be built as TSO packets.

So in a nutshell, disabling TSO is on a per-device level, not a global switch.

-PJ Waskiewicz
--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Disable TSO for non standard qdiscs
Waskiewicz Jr, Peter P wrote:
>> Indeed. As an example of an unknowing user, this discussion made me check whether my cablemodem device (on which I'm using HFSC) uses TSO :)
>
> The TSO defer logic is based on your congestion window and current window size. So the actual frame sizes hitting your NIC attached to your DSL probably aren't anywhere near 64KB, but probably more in line with whatever your window size is for DSL.
>
> The bottom line is TSO saves CPU cycles. If we want to make it go away because of a traffic shaping qdisc interfering, then that's fine. I just don't think a TSO option should be added to the scheduler layer, since it already exists in the ethtool layer. Asking a user to type 'ethtool -K devicename tso off' is probably going to be much easier than setting an option on your qdisc through tc to turn TSO back on.
>
> I think we're having more of a disagreement of what is considered the normal case user. If you are on a slow link, such as a DSL/cable line, your TCP window/congestion window aren't going to be big enough to generate large TSO's, so what is the issue? But disabling TSO, say on a 10 GbE link, can cut throughput by half (I have data on 8-core machines with 10 GbE with/without TSO if you're interested). Even on a single-core machine with a 1GbE link can have bad performance hits.
>
> So this is why I'm so concerned about a proposal to turn off TSO outside of the current established methods of using ethtool. Rather than educating the user about how to turn TSO back on using tc if they want it, educate them why they may want to consider turning TSO off in certain configurations. And I don't consider any user effectively using a TBF qdisc someone incapable of understanding how to use ethtool.

We don't want to disable TSO for cases where it makes sense, but who is using TBF on 10GbE?

The point is that most users of qdiscs which are incapable of dealing with TSO without hacks or special configuration probably don't care, and 10GbE users know about ethtool *and* don't use TBF or HTB (which are probably the only qdiscs which actually have problems, maybe also CBQ).
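Why a token-bucket shaper and TSO can clash is easier to see in a toy model. The sketch below is not sch_tbf code; the struct and function names are hypothetical and the accounting is deliberately simplified. It only shows the arithmetic: a bucket configured for MTU-sized packets can never accumulate enough tokens for a multi-segment TSO frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy token bucket; the fields are illustrative, not sch_tbf's. */
struct toy_tbf {
    size_t burst;   /* bucket depth in bytes */
    size_t tokens;  /* bytes that may be sent right now */
};

/* Add tokens as time passes, never exceeding the bucket depth. */
void toy_tbf_refill(struct toy_tbf *q, size_t bytes)
{
    assert(q != NULL && q->tokens <= q->burst);
    size_t room = q->burst - q->tokens;
    q->tokens += bytes < room ? bytes : room;
}

/* Send one packet if enough tokens are available; consume them if so. */
bool toy_tbf_try_send(struct toy_tbf *q, size_t pkt_len)
{
    assert(q != NULL);
    if (pkt_len > q->tokens)
        return false;
    q->tokens -= pkt_len;
    return true;
}

/* A packet longer than the bucket depth cannot conform however long we wait. */
bool toy_tbf_can_ever_send(const struct toy_tbf *q, size_t pkt_len)
{
    assert(q != NULL);
    return pkt_len <= q->burst;
}
```

With a 10000-byte burst, a 1500-byte frame passes, while a 64240-byte TSO frame (44 segments of 1460 bytes, an assumed example size) is rejected no matter how often the bucket is refilled.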
RE: [PATCH] Disable TSO for non standard qdiscs
> Indeed. As an example of an unknowing user, this discussion made me check whether my cablemodem device (on which I'm using HFSC) uses TSO :)

The TSO defer logic is based on your congestion window and current window size. So the actual frame sizes hitting your NIC attached to your DSL probably aren't anywhere near 64KB, but probably more in line with whatever your window size is for DSL.

The bottom line is TSO saves CPU cycles. If we want to make it go away because of a traffic shaping qdisc interfering, then that's fine. I just don't think a TSO option should be added to the scheduler layer, since it already exists in the ethtool layer. Asking a user to type 'ethtool -K devicename tso off' is probably going to be much easier than setting an option on your qdisc through tc to turn TSO back on.

I think we're having more of a disagreement of what is considered the normal case user. If you are on a slow link, such as a DSL/cable line, your TCP window/congestion window aren't going to be big enough to generate large TSO's, so what is the issue? But disabling TSO, say on a 10 GbE link, can cut throughput by half (I have data on 8-core machines with 10 GbE with/without TSO if you're interested). Even a single-core machine with a 1GbE link can take bad performance hits.

So this is why I'm so concerned about a proposal to turn off TSO outside of the current established methods of using ethtool. Rather than educating the user about how to turn TSO back on using tc if they want it, educate them why they may want to consider turning TSO off in certain configurations. And I don't consider any user effectively using a TBF qdisc someone incapable of understanding how to use ethtool.

Cheers,
-PJ Waskiewicz
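The two claims in this message (TSO is a per-device property, and the size of a TSO frame is bounded by the window) can be illustrated with a small model. This is only a sketch under simplifying assumptions: the structs, the 64KB cap constant and toy_next_chunk() are hypothetical names, and the real TSO defer logic in the TCP stack is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct toy_dev {
    bool tso_enabled;           /* the flag 'ethtool -K <dev> tso on|off' toggles */
};

struct toy_flow {
    const struct toy_dev *dev;  /* device this flow is routed over */
    size_t mss;                 /* maximum segment size in bytes */
    size_t cwnd_bytes;          /* congestion window in bytes */
    size_t rwnd_bytes;          /* peer's receive window in bytes */
};

#define TOY_TSO_MAX 65536u      /* the roughly 64KB ceiling mentioned above */

/* Size of the next chunk handed to the device for this flow. */
size_t toy_next_chunk(const struct toy_flow *f)
{
    assert(f != NULL && f->dev != NULL && f->mss > 0);

    /* The sender may not exceed either window. */
    size_t window = f->cwnd_bytes < f->rwnd_bytes ? f->cwnd_bytes
                                                  : f->rwnd_bytes;

    /* Without TSO on this flow's device, at most one segment is built. */
    if (!f->dev->tso_enabled || window < f->mss)
        return window < f->mss ? window : f->mss;

    /* With TSO, build as many whole segments as window and cap allow. */
    size_t limit = window < TOY_TSO_MAX ? window : TOY_TSO_MAX;
    return (limit / f->mss) * f->mss;
}
```

In this model a flow with a 4380-byte congestion window never produces more than three 1460-byte segments per chunk even with TSO on, while a flow with a very large window on a TSO-capable device is cut off at the 64KB ceiling. Two flows over different devices get different results because each consults its own device's flag.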
oops with ipcomp
One more issue with 2.6.24, some hours after I reactivated ipcomp with Herb's 2 patches. The httpd log shows a http request per esp tunnel at oops time. Don't know whether it is for network or compression guys, so I started posting here.

Daniel

Unable to handle kernel paging request at c20fb000 RIP:
 [8031b8f0] deflate_slow+0x40/0x400
PGD 7f845067 PUD 7f846067 PMD 7f847067 PTE 0
Oops: [1] SMP
CPU 0
Modules linked in:
Pid: 9136, comm: httpd Not tainted 2.6.24 #2
RIP: 0010:[8031b8f0] [8031b8f0] deflate_slow+0x40/0x400
RSP: 0018:81002ad35938 EFLAGS: 00010206
RAX: RBX: c20b9000 RCX: 000408d8
RDX: c20ba728 RSI: RDI: 5f65
RBP: 08d4 R08: 3dae R09: 1800
R10: 0010 R11: c20b94bc R12: 01ad
R13: 0005 R14: R15: c2097000
FS: 2b00bb68b190() GS:805a8000() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: c20fb000 CR3: 2ac82000 CR4: 06e0
DR0: DR1: DR2:
DR3: DR6: 0ff0 DR7: 0400
Process httpd (pid: 9136, threadinfo 81002ad34000, task 81007d2d4080)
Stack: 810042f3f710 81007dfb0700 0005 c20b9000 81007de89000 8031c25d 81007dfb0700 81007dfb06c0 81007de890a8 010a 802ff351
Call Trace:
 [8031c25d] zlib_deflate+0x10d/0x330
 [802ff351] deflate_compress+0x91/0xb0
 [804771b8] ipcomp_output+0x98/0x1e0
 [80489ef6] xfrm_output+0x116/0x1e0
 [80482dc4] xfrm4_output_finish2+0x44/0x1e0
 [80483075] xfrm4_output+0x55/0x60
 [80445989] ip_queue_xmit+0x209/0x450
 [8049b0d0] thread_return+0x3d/0x54d
 [8023b094] lock_timer_base+0x34/0x70
 [80456dcf] tcp_transmit_skb+0x40f/0x7c0
 [80458aae] __tcp_push_pending_frames+0x11e/0x940
 [8044cb8e] tcp_sendmsg+0x81e/0xc40
 [80291e3f] dput+0x1f/0x130
 [80410b01] sock_aio_write+0x111/0x120
 [804109f0] sock_aio_write+0x0/0x120
 [8027f95b] do_sync_readv_writev+0xcb/0x110
 [80246850] autoremove_wake_function+0x0/0x30
 [8027fb99] do_sync_read+0xd9/0x120
 [80287941] permission+0x61/0x100
 [8027f7bd] rw_copy_check_uvector+0x9d/0x130
 [802800a2] do_readv_writev+0xe2/0x210
 [8027e1ba] do_filp_open+0x3a/0x50
 [802806e3] sys_writev+0x53/0x90
 [8020bb3e] system_call+0x7e/0x83

Code: 0f b6 14 0a 31 d0 23 43 74 48 8b 53 60 89 43 68 89 c0 0f b7
RIP [8031b8f0] deflate_slow+0x40/0x400
 RSP 81002ad35938
CR2: c20fb000
---[ end trace cfeb10aa23b54939 ]---
[PATCH] ieee80211: fix section mismatch warning
Fix the following warnings:
WARNING: net/built-in.o(.init.text+0xd6c0): Section mismatch in reference from the function ieee80211_init() to the function .exit.text:rc80211_simple_exit()
WARNING: net/built-in.o(.init.text+0xd6c5): Section mismatch in reference from the function ieee80211_init() to the function .exit.text:rc80211_pid_exit()

The fix was simple - I just did as modpost told me and removed the wrong __exit annotation of rc80211_simple_exit and rc80211_pid_exit.

Signed-off-by: Sam Ravnborg [EMAIL PROTECTED]
Cc: Johannes Berg [EMAIL PROTECTED]
Cc: John W. Linville [EMAIL PROTECTED]
Cc: David S. Miller [EMAIL PROTECTED]
---

With this patch my allyesconfig build on x86 (64 bit) is section mismatch clean in net/

	Sam

diff --git a/net/mac80211/rc80211_pid_algo.c b/net/mac80211/rc80211_pid_algo.c
index 554c4ba..c339571 100644
--- a/net/mac80211/rc80211_pid_algo.c
+++ b/net/mac80211/rc80211_pid_algo.c
@@ -538,7 +538,7 @@ int __init rc80211_pid_init(void)
 	return ieee80211_rate_control_register(&mac80211_rcpid);
 }
 
-void __exit rc80211_pid_exit(void)
+void rc80211_pid_exit(void)
 {
 	ieee80211_rate_control_unregister(&mac80211_rcpid);
 }
diff --git a/net/mac80211/rc80211_simple.c b/net/mac80211/rc80211_simple.c
index 934676d..9a78b11 100644
--- a/net/mac80211/rc80211_simple.c
+++ b/net/mac80211/rc80211_simple.c
@@ -389,7 +389,7 @@ int __init rc80211_simple_init(void)
 	return ieee80211_rate_control_register(&mac80211_rcsimple);
 }
 
-void __exit rc80211_simple_exit(void)
+void rc80211_simple_exit(void)
 {
 	ieee80211_rate_control_unregister(&mac80211_rcsimple);
 }
Re: [PATCH] Disable TSO for non standard qdiscs
On Fri, 2008-01-02 at 10:56 +0100, Patrick McHardy wrote:
> We don't want to disable TSO for cases where it makes sense, but who is using TBF on 10GbE?
>
> The point is that most users of qdiscs which are incapable of dealing with TSO without hacks or special configuration probably don't care, and 10GbE users know about ethtool *and* don't use TBF or HTB (which are probably the only qdiscs which actually have problems, maybe also CBQ).

Right - Essentially it is a usability issue: People who know how to use TSO (Peter for example) will be clueful enough to turn it on. Which means the default should be to protect the clueless and turn it off.

On Andi's approach: Turning TSO off at netdev registration time with a warning would be cleaner IMO. Or alternatively introducing a kernel-config "I know what TSO is" option which is then used at netdev registration. From a usability perspective it would make more sense to just keep ethtool as the only way to configure TSO.

[I recently spent a few days helping someone debug a problem with IFB because he was redirecting packets from a TSO netdevice and occasionally some multi-packet would be missed in the calculation; my answer was to turn off TSO; so there are more use cases for this TSO issue].

cheers,
jamal
[PATCH for 2.6.25 2/2] [NET] ucc_geth: add support for netpoll
This patch adds netpoll support for the QE UCC Gigabit Ethernet driver. Tested using netconsole and KGDBoE.

Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
---

Just resending this.

 drivers/net/ucc_geth.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index e41da46..fba0811 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -3666,6 +3666,23 @@ static irqreturn_t ucc_geth_irq_handler(int irq, void *info)
 	return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/*
+ * Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+static void ucc_netpoll(struct net_device *dev)
+{
+	struct ucc_geth_private *ugeth = netdev_priv(dev);
+	int irq = ugeth->ug_info->uf_info.irq;
+
+	disable_irq(irq);
+	ucc_geth_irq_handler(irq, dev);
+	enable_irq(irq);
+}
+#endif /* CONFIG_NET_POLL_CONTROLLER */
+
 /* Called when something needs to use the ethernet device */
 /* Returns 0 for success. */
 static int ucc_geth_open(struct net_device *dev)
@@ -4008,6 +4025,9 @@ static int ucc_geth_probe(struct of_device* ofdev, const struct of_device_id *ma
 #ifdef CONFIG_UGETH_NAPI
 	netif_napi_add(dev, &ugeth->napi, ucc_geth_poll, UCC_GETH_DEV_WEIGHT);
 #endif /* CONFIG_UGETH_NAPI */
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	dev->poll_controller = ucc_netpoll;
+#endif
 	dev->stop = ucc_geth_close;
 //	dev->change_mtu = ucc_geth_change_mtu;
 	dev->mtu = 1500;
-- 
1.5.2.2
Re: Slow OOM in netif_RX function
Ivan Dichev wrote:
> Arnaldo Carvalho de Melo wrote:
>> On Fri, Jan 25, 2008 at 02:21:08PM +0100, Andi Kleen wrote:
>>> Ivan H. Dichev [EMAIL PROTECTED] writes:
>>>> What could happen if I put different Lan card in every slot? For example:
>>>>   to-private - 3com
>>>>   to-inet - VIA
>>>>   to-dmz - rtl8139
>>>> And then to look which RX function is consuming the memory. (boomerang_rx, rtl8139_rx, ... etc)
>>>
>>> The problem is unlikely to be in the driver (these are both well tested ones) but more likely your complicated iptables setup somehow triggers a skb leak.
>>>
>>> There are unfortunately no shrink wrapped debug mechanisms in the kernel for leaks like this (ok you could enable CONFIG_NETFILTER_DEBUG and see if it prints something interesting, but that's a long shot).
>>>
>>> If you wanted to write a custom debugging patch I would do something like this:
>>>
>>> - Add two new integer fields to struct sk_buff: a time stamp and an integer field
>>> - Fill the time stamp with jiffies in alloc_skb and clear the integer field
>>> - In __kfree_skb clear the time stamp
>>> - For all the ipt target modules in net/ipv4/netfilter/*.c you use change their ->target functions to put an unique value into the integer field you added.
>>> - Do the same for the pkt_to_tuple functions for all conntrack modules
>>>
>>> Then when you observe the leak take a crash dump using kdump on the router and then use crash to dump all the slab objects for the sk_head_cache. Then look for any that have an old time stamp and check what value they have in the integer field. Then the netfilter function that set that unique value likely triggered the leak somehow.
>>
>> I wrote some systemtap scripts that do parts of what you suggest, and at least for the timestamp there was no need to add a new field to struct sk_buff, I just reuse skb->timestamp, as it is only used when we use a packet sniffer. Here it is for reference, but it needs some tapsets I wrote, so I'll publish this git repo in git.kernel.org, perhaps it can be useful in this case as a starting point. Find another unused field (hint: I know that at least 4 bytes on 64 bits is present as a hole) and you're done, no need to rebuild the kernel :)
>>
>> http://git.kernel.org/?p=linux/kernel/git/acme/nettaps.git
>>
>> - Arnaldo
>
> Thanks to everyone for the given ideas. I am not a kernel guru, so writing a patch is difficult. This is a production server and it is quite difficult to debug (only at night). I removed some iptables exotics - recent, ulog, string - but no effect. Since we can reach OOM, most of the memory is going to be filled with the leak, and we are thinking to try to dump and analyze it. We have looked at the crash tool, and we will see what we can do with it. Meanwhile do you have any hint/ideas? Thanks a lot.

I understand you don't want to tell us the exact firewall rules you have.

Maybe you could post at least the following infos:

# cat /proc/slabinfo
# lsmod
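The tagging scheme Andi describes (stamp each buffer at allocation, let every hook record its own id, clear the stamp on free, then look for old stamps in a dump) can be tried out in user space. The sketch below is only a model of that bookkeeping: the fixed pool, the toy_* names and the owner ids are hypothetical and have nothing to do with struct sk_buff's real layout.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL 8

/* Hypothetical stand-in for the two debug fields added to each buffer. */
struct toy_buf {
    unsigned long alloc_time;  /* 0 means "free"; otherwise time of allocation */
    int last_owner;            /* id written by the last hook that saw the buffer */
};

static struct toy_buf toy_pool[TOY_POOL];

/* Allocate a buffer: stamp the time and clear the owner field. */
struct toy_buf *toy_alloc(unsigned long now)
{
    assert(now != 0);
    for (size_t i = 0; i < TOY_POOL; i++) {
        if (toy_pool[i].alloc_time == 0) {
            toy_pool[i].alloc_time = now;
            toy_pool[i].last_owner = 0;
            return &toy_pool[i];
        }
    }
    return NULL;  /* pool exhausted */
}

/* Called by each hook (target, conntrack helper, ...) with its unique id. */
void toy_mark(struct toy_buf *b, int owner)
{
    assert(b != NULL);
    b->last_owner = owner;
}

/* Free a buffer: clearing the stamp makes it invisible to the leak scan. */
void toy_free(struct toy_buf *b)
{
    assert(b != NULL);
    b->alloc_time = 0;
}

/* Scan the "dump": owner id of the first buffer older than max_age, or -1. */
int toy_find_leak_owner(unsigned long now, unsigned long max_age)
{
    for (size_t i = 0; i < TOY_POOL; i++) {
        unsigned long t = toy_pool[i].alloc_time;
        if (t != 0 && now >= t && now - t > max_age)
            return toy_pool[i].last_owner;
    }
    return -1;
}
```

If a buffer tagged by hook 7 is never freed, a later scan reports 7, which points at the code path that last handled the leaked buffer. In the real procedure the scan would be done offline on the crash dump rather than by a function call.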
[patch 0/7] s390: ctc patches for 2.6.25
Jeff,

the following patches are intended for 2.6.25. Besides clean-ups they replace the old ctc driver by a reworked ctcm driver. This ctcm driver supports the channel-to-channel connections of the old ctc driver plus an additional MPC protocol to provide SNA connectivity.

Patch 1/7: clean-ups in ctc and netiucv
Patch 2/7: clean-ups in Kconfig
Patch 3/7: ctcm changes in Kconfig and Makefile
Patch 4/7: ctcm-patch part 1
Patch 5/7: ctcm-patch part 2
Patch 6/7: ctcm-patch part 3
Patch 7/7: removal of old ctc driver

Regards,
Ursula Braun
[PATCH] ehea: fix sysfs link compile problem
Due to changes in the struct device_driver there is no direct access to its kobj any longer. The kobj was used to create sysfs links between eHEA ethernet devices and the driver. This patch removes the affected sysfs links to resolve the build problems.

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
---
 drivers/net/ehea/ehea_main.c |   37 -------------------------------------
 1 files changed, 0 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 869e160..9a3fd81 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -2804,34 +2804,6 @@ static void __devinit logical_port_release(struct device *dev)
 	of_node_put(port->ofdev.node);
 }
 
-static int ehea_driver_sysfs_add(struct device *dev,
-				 struct device_driver *driver)
-{
-	int ret;
-
-	ret = sysfs_create_link(&driver->kobj, &dev->kobj,
-				kobject_name(&dev->kobj));
-	if (ret == 0) {
-		ret = sysfs_create_link(&dev->kobj, &driver->kobj,
-					"driver");
-		if (ret)
-			sysfs_remove_link(&driver->kobj,
-					  kobject_name(&dev->kobj));
-	}
-	return ret;
-}
-
-static void ehea_driver_sysfs_remove(struct device *dev,
-				     struct device_driver *driver)
-{
-	struct device_driver *drv = driver;
-
-	if (drv) {
-		sysfs_remove_link(&drv->kobj, kobject_name(&dev->kobj));
-		sysfs_remove_link(&dev->kobj, "driver");
-	}
-}
-
 static struct device *ehea_register_port(struct ehea_port *port,
 					 struct device_node *dn)
 {
@@ -2856,16 +2828,8 @@ static struct device *ehea_register_port(struct ehea_port *port,
 		goto out_unreg_of_dev;
 	}
 
-	ret = ehea_driver_sysfs_add(&port->ofdev.dev, &ehea_driver.driver);
-	if (ret) {
-		ehea_error("failed to register sysfs driver link");
-		goto out_rem_dev_file;
-	}
-
 	return &port->ofdev.dev;
 
-out_rem_dev_file:
-	device_remove_file(&port->ofdev.dev, &dev_attr_log_port_id);
 out_unreg_of_dev:
 	of_device_unregister(&port->ofdev);
 out:
@@ -2874,7 +2838,6 @@ out:
 
 static void ehea_unregister_port(struct ehea_port *port)
 {
-	ehea_driver_sysfs_remove(&port->ofdev.dev, &ehea_driver.driver);
 	device_remove_file(&port->ofdev.dev, &dev_attr_log_port_id);
 	of_device_unregister(&port->ofdev);
 }
-- 
1.5.2
netdev share of section mismatch warnings...
Can we please get the following warnings fixed in mainline soonish. I get the below list with a x86 64bit allyesconfig build and I expect anyone to see roughly the same list. It would be great to say that both net/ and drivers/net/ were warning free in this respect.

If you have questions let me know and I will try to help out.

	Sam

WARNING: drivers/net/built-in.o(.text+0x42e93): Section mismatch in reference from the function t3_io_slot_reset() to the function .devinit.text:t3_prep_adapter()
The function t3_io_slot_reset() references the function __devinit t3_prep_adapter(). This is often because t3_io_slot_reset lacks a __devinit annotation or the annotation of t3_prep_adapter is wrong.

WARNING: drivers/net/built-in.o(.text+0x89927): Section mismatch in reference from the function sis190_get_mac_addr() to the function .devinit.text:sis190_get_mac_addr_from_apc()
The function sis190_get_mac_addr() references the function __devinit sis190_get_mac_addr_from_apc(). This is often because sis190_get_mac_addr lacks a __devinit annotation or the annotation of sis190_get_mac_addr_from_apc is wrong.

WARNING: drivers/net/built-in.o(.text+0x89934): Section mismatch in reference from the function sis190_get_mac_addr() to the function .devinit.text:sis190_get_mac_addr_from_eeprom()
The function sis190_get_mac_addr() references the function __devinit sis190_get_mac_addr_from_eeprom(). This is often because sis190_get_mac_addr lacks a __devinit annotation or the annotation of sis190_get_mac_addr_from_eeprom is wrong.

WARNING: drivers/net/built-in.o(.text+0x14866b): Section mismatch in reference from the function mlx4_init_icm() to the function .devinit.text:mlx4_init_cmpt_table()
The function mlx4_init_icm() references the function __devinit mlx4_init_cmpt_table(). This is often because mlx4_init_icm lacks a __devinit annotation or the annotation of mlx4_init_cmpt_table is wrong.

WARNING: drivers/net/built-in.o(.text+0x148c19): Section mismatch in reference from the function mlx4_init_hca() to the function .devinit.text:mlx4_load_fw()
The function mlx4_init_hca() references the function __devinit mlx4_load_fw(). This is often because mlx4_init_hca lacks a __devinit annotation or the annotation of mlx4_load_fw is wrong.

WARNING: drivers/net/built-in.o(.text+0x1494bd): Section mismatch in reference from the function __mlx4_init_one() to the function .devinit.text:mlx4_enable_msi_x()
The function __mlx4_init_one() references the function __devinit mlx4_enable_msi_x(). This is often because __mlx4_init_one lacks a __devinit annotation or the annotation of mlx4_enable_msi_x is wrong.

WARNING: drivers/net/built-in.o(.text+0x14a77e): Section mismatch in reference from the function mlx4_init_mr_table() to the function .devinit.text:mlx4_buddy_init()
The function mlx4_init_mr_table() references the function __devinit mlx4_buddy_init(). This is often because mlx4_init_mr_table lacks a __devinit annotation or the annotation of mlx4_buddy_init is wrong.

WARNING: drivers/net/built-in.o(.text+0x1500a1): Section mismatch in reference from the function olympic_open() to the function .devinit.text:olympic_init()
The function olympic_open() references the function __devinit olympic_init(). This is often because olympic_open lacks a __devinit annotation or the annotation of olympic_init is wrong.

WARNING: drivers/net/built-in.o(.data+0x54478): Section mismatch in reference from the variable ath5k_pci_drv_id to the variable .devinit.data:ath5k_pci_id_table
The variable ath5k_pci_drv_id references the variable __devinitdata ath5k_pci_id_table
If the reference is valid then annotate the variable with __init* (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: drivers/net/built-in.o(.data+0x54480): Section mismatch in reference from the variable ath5k_pci_drv_id to the function .devinit.text:ath5k_pci_probe()
The variable ath5k_pci_drv_id references the function __devinit ath5k_pci_probe()
If the reference is valid then annotate the variable with __init* (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: drivers/net/built-in.o(.data+0x54488): Section mismatch in reference from the variable ath5k_pci_drv_id to the function .devexit.text:ath5k_pci_remove()
The variable ath5k_pci_drv_id references the function __devexit ath5k_pci_remove()
If the reference is valid then annotate the variable with __exit* (see linux/init.h) or name the variable: *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
Re: [PATCH] ieee80211: fix section mismatch warning
On Fri, 2008-02-01 at 12:52 +0100, Sam Ravnborg wrote:
> Fix the following warnings:
> WARNING: net/built-in.o(.init.text+0xd6c0): Section mismatch in reference from the function ieee80211_init() to the function .exit.text:rc80211_simple_exit()
> WARNING: net/built-in.o(.init.text+0xd6c5): Section mismatch in reference from the function ieee80211_init() to the function .exit.text:rc80211_pid_exit()
>
> The fix was simple - I just did as modpost told me and removed the wrong __exit annotation of rc80211_simple_exit and rc80211_pid_exit.

Heh, I just sent the same patch.

> Signed-off-by: Sam Ravnborg [EMAIL PROTECTED]

Acked-by: Johannes Berg [EMAIL PROTECTED]

> Cc: Johannes Berg [EMAIL PROTECTED]
> Cc: John W. Linville [EMAIL PROTECTED]
> Cc: David S. Miller [EMAIL PROTECTED]
> ---
>
> With this patch my allyesconfig build on x86 (64 bit) is section mismatch clean in net/
>
> 	Sam
>
> diff --git a/net/mac80211/rc80211_pid_algo.c b/net/mac80211/rc80211_pid_algo.c
> index 554c4ba..c339571 100644
> --- a/net/mac80211/rc80211_pid_algo.c
> +++ b/net/mac80211/rc80211_pid_algo.c
> @@ -538,7 +538,7 @@ int __init rc80211_pid_init(void)
>  	return ieee80211_rate_control_register(&mac80211_rcpid);
>  }
>  
> -void __exit rc80211_pid_exit(void)
> +void rc80211_pid_exit(void)
>  {
>  	ieee80211_rate_control_unregister(&mac80211_rcpid);
>  }
> diff --git a/net/mac80211/rc80211_simple.c b/net/mac80211/rc80211_simple.c
> index 934676d..9a78b11 100644
> --- a/net/mac80211/rc80211_simple.c
> +++ b/net/mac80211/rc80211_simple.c
> @@ -389,7 +389,7 @@ int __init rc80211_simple_init(void)
>  	return ieee80211_rate_control_register(&mac80211_rcsimple);
>  }
>  
> -void __exit rc80211_simple_exit(void)
> +void rc80211_simple_exit(void)
>  {
>  	ieee80211_rate_control_unregister(&mac80211_rcsimple);
>  }
[PATCH RESEND] [XFRM]: Speed up xfrm_policy and xfrm_state walking
Change xfrm_policy and xfrm_state walking algorithm from O(n^2) to O(n). This is achieved by adding the entries to one more list which is used solely for walking the entries.

This also fixes some races where the dump can have duplicate or missing entries when the SPD/SADB is modified during an ongoing dump.

Dumping SADB with 2 entries using "time ip xfrm state" the sys time dropped from 1.012s to 0.080s.

Signed-off-by: Timo Teras [EMAIL PROTECTED]
---
 include/linux/xfrm.h   |    3 +-
 include/net/xfrm.h     |   52 ++-
 net/key/af_key.c       |   22 +++--
 net/xfrm/xfrm_policy.c |   79
 net/xfrm/xfrm_state.c  |   53 ++--
 net/xfrm/xfrm_user.c   |   71 ++-
 6 files changed, 195 insertions(+), 85 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index e31b8c8..0c82c80 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -113,7 +113,8 @@ enum
 {
 	XFRM_POLICY_TYPE_MAIN	= 0,
 	XFRM_POLICY_TYPE_SUB	= 1,
-	XFRM_POLICY_TYPE_MAX	= 2
+	XFRM_POLICY_TYPE_MAX	= 2,
+	XFRM_POLICY_TYPE_ANY	= 255
 };
 
 enum
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ac72116..7bd021b 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -121,6 +121,7 @@ extern struct mutex xfrm_cfg_mutex;
 struct xfrm_state
 {
 	/* Note: bydst is re-used during gc */
+	struct list_head	all;
 	struct hlist_node	bydst;
 	struct hlist_node	bysrc;
 	struct hlist_node	byspi;
@@ -424,6 +425,7 @@ struct xfrm_tmpl
 struct xfrm_policy
 {
 	struct xfrm_policy	*next;
+	struct list_head	bytype;
 	struct hlist_node	bydst;
 	struct hlist_node	byidx;
 
@@ -1157,6 +1159,18 @@ struct xfrm6_tunnel {
 	int priority;
 };
 
+struct xfrm_state_walk {
+	u8 proto;
+	struct xfrm_state *state;
+	int count;
+};
+
+struct xfrm_policy_walk {
+	u8 type, cur_type;
+	struct xfrm_policy *policy;
+	int count;
+};
+
 extern void xfrm_init(void);
 extern void xfrm4_init(void);
 extern void xfrm_state_init(void);
@@ -1181,7 +1195,23 @@ static inline void xfrm6_fini(void)
 extern int xfrm_proc_init(void);
 #endif
 
-extern int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), void *);
+static inline void xfrm_state_walk_init(struct xfrm_state_walk *walk, u8 proto)
+{
+	walk->proto = proto;
+	walk->state = NULL;
+	walk->count = 0;
+}
+
+static inline void xfrm_state_walk_done(struct xfrm_state_walk *walk)
+{
+	if (walk->state != NULL) {
+		xfrm_state_put(walk->state);
+		walk->state = NULL;
+	}
+}
+
+extern int xfrm_state_walk(struct xfrm_state_walk *walk,
+			   int (*func)(struct xfrm_state *, int, void*), void *);
 extern struct xfrm_state *xfrm_state_alloc(void);
 extern struct xfrm_state *xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr,
 					  struct flowi *fl, struct xfrm_tmpl *tmpl,
@@ -1303,7 +1333,25 @@ static inline int xfrm4_udp_encap_rcv(struct sock *sk, struct sk_buff *skb)
 #endif
 
 struct xfrm_policy *xfrm_policy_alloc(gfp_t gfp);
-extern int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, int, void*), void *);
+
+static inline void xfrm_policy_walk_init(struct xfrm_policy_walk *walk, u8 type)
+{
+	walk->cur_type = XFRM_POLICY_TYPE_MAIN;
+	walk->type = type;
+	walk->policy = NULL;
+	walk->count = 0;
+}
+
+static inline void xfrm_policy_walk_done(struct xfrm_policy_walk *walk)
+{
+	if (walk->policy != NULL) {
+		xfrm_pol_put(walk->policy);
+		walk->policy = NULL;
+	}
+}
+
+extern int xfrm_policy_walk(struct xfrm_policy_walk *walk,
+	int (*func)(struct xfrm_policy *, int, int, void*), void *);
 int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl);
 struct xfrm_policy *xfrm_policy_bysel_ctx(u8 type, int dir,
 					  struct xfrm_selector *sel,
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 16b72b5..d24bb9c 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1742,12 +1742,18 @@ static int pfkey_dump(struct sock *sk, struct sk_buff *skb, struct sadb_msg *hdr
 {
 	u8 proto;
 	struct pfkey_dump_data data = { .skb = skb, .hdr = hdr, .sk = sk };
+	struct xfrm_state_walk walk;
+	int rc;
 
 	proto = pfkey_satype2proto(hdr->sadb_msg_satype);
 	if (proto == 0)
 		return -EINVAL;
 
-	return xfrm_state_walk(proto, dump_sa, &data);
+	xfrm_state_walk_init(&walk, proto);
+	rc = xfrm_state_walk(&walk, dump_sa, &data);
+	xfrm_state_walk_done(&walk);
+
+	return rc;
 }
 
 static int pfkey_promisc(struct sock *sk, struct sk_buff *skb, struct sadb_msg
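The O(n^2) to O(n) claim can be checked with a small user-space model. The sketch assumes that a dump is delivered in batches and that the old code restarted from the list head for every batch, skipping what had already been sent; that reading is an assumption, and all toy_* names are hypothetical. The real patch additionally holds a reference on the entry the cursor points at, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>

struct toy_entry {
    int id;
    struct toy_entry *next;
};

/* Cursor kept between dump calls, loosely modelled on struct xfrm_state_walk. */
struct toy_walk {
    struct toy_entry *pos;  /* next entry to deliver once started */
    int started;            /* 0 until the first batch has been requested */
    int count;              /* entries delivered so far */
};

void toy_walk_init(struct toy_walk *w)
{
    assert(w != NULL);
    w->pos = NULL;
    w->started = 0;
    w->count = 0;
}

/* Restart style: begin at head, skip 'skip' entries, deliver up to 'batch'.
 * Returns the number of list nodes touched by this call. */
size_t toy_dump_restart(struct toy_entry *head, int skip, int batch,
                        int *delivered)
{
    size_t touched = 0;
    int idx = 0;

    assert(delivered != NULL);
    *delivered = 0;
    for (struct toy_entry *e = head; e != NULL && *delivered < batch;
         e = e->next, idx++) {
        touched++;
        if (idx >= skip)
            (*delivered)++;
    }
    return touched;
}

/* Cursor style: continue where the previous call stopped.
 * Returns the number of list nodes touched by this call. */
size_t toy_dump_resume(struct toy_entry *head, struct toy_walk *w, int batch)
{
    assert(w != NULL);
    struct toy_entry *e = w->started ? w->pos : head;
    size_t touched = 0;
    int n = 0;

    w->started = 1;
    while (e != NULL && n < batch) {
        touched++;
        n++;
        e = e->next;
    }
    w->pos = e;
    w->count += n;
    return touched;
}
```

For 100 entries dumped in batches of 10, the restart style touches 10 + 20 + ... + 100 = 550 nodes, while the cursor style touches each node once, 100 in total. The gap grows quadratically with the number of entries.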
Re: Lots of BUG eth1 code -5 qlen 0 messages in 2.6.24
On Tue, Jan 29, 2008 at 06:08:28PM -0500, jamal wrote:
> On Tue, 2008-29-01 at 22:45 +0100, Erik Mouw wrote:
>>> The driver seems buggy. Make it return NETDEV_TX_BUSY instead of -EIO in xircom_start_xmit() and the messages will go away.
>>
>> Like this?
>>
>> Indeed. I've changed the -EIO into NETDEV_TX_BUSY and so far I can't trigger the bug anymore. It was quite easy to trigger within minutes with rsync, but I can't trigger it anymore.
>>
>> Should I send a patch, and if so: to who? The tulip/xircom_cb driver seems to be orphaned.
>
> Peter, I suspect that driver is just buggy in some other way as opposed to being re-entered; couldnt tell by inspection. It is possible it may be too eager to open up before it really has space. It will be easy to check your theory by having the driver just check if it is netif_stopped just before it returns NETDEV_TX_BUSY.

No need to test that, it *is* netif_stopped before the return:

	/* Uh oh... no free descriptor... drop the packet */
	netif_stop_queue(dev);
	spin_unlock_irqrestore(&card->lock,flags);
	trigger_transmit(card);

	return NETDEV_TX_BUSY;

trigger_transmit() is a simple function that just writes a single register on the card to trigger a transmit.

Erik

-- 
They're all fools. Don't worry. Darwin may be slow, but he'll eventually get them.
  -- Matthew Lammers in alt.sysadmin.recovery
[patch 1/7] ctc / netiucv: consolidate fsm_action_nop
From: Peter Tiedemann [EMAIL PROTECTED] move fsm_action_nop to fsm.h to avoid duplicate definitions in both drivers ctc and netiucv. Signed-off-by: Peter Tiedemann [EMAIL PROTECTED] Signed-off-by: Ursula Braun [EMAIL PROTECTED] --- drivers/s390/net/ctcmain.c |8 drivers/s390/net/fsm.h |8 drivers/s390/net/netiucv.c |8 +--- 3 files changed, 9 insertions(+), 15 deletions(-) Index: linux-2.6-uschi/drivers/s390/net/fsm.h === --- linux-2.6-uschi.orig/drivers/s390/net/fsm.h +++ linux-2.6-uschi/drivers/s390/net/fsm.h @@ -260,4 +260,12 @@ extern int fsm_addtimer(fsm_timer *timer */ extern void fsm_modtimer(fsm_timer *timer, int millisec, int event, void *arg); +/** + * NOP action for statemachines + */ +static inline void +fsm_action_nop(fsm_instance *fi, int event, void *arg) +{ +} + #endif /* _FSM_H_ */ Index: linux-2.6-uschi/drivers/s390/net/netiucv.c === --- linux-2.6-uschi.orig/drivers/s390/net/netiucv.c +++ linux-2.6-uschi/drivers/s390/net/netiucv.c @@ -137,6 +137,7 @@ PRINT_##importance(header %02x %02x %02 #define PRINTK_HEADER iucv:/* for debugging */ static struct device_driver netiucv_driver = { + .owner = THIS_MODULE, .name = netiucv, .bus = iucv_bus, }; @@ -571,13 +572,6 @@ static void netiucv_callback_connres(str fsm_event(conn-fsm, CONN_EVENT_CONN_RES, conn); } -/** - * Dummy NOP action for all statemachines - */ -static void fsm_action_nop(fsm_instance *fi, int event, void *arg) -{ -} - /* * Actions of the connection statemachine */ Index: linux-2.6-uschi/drivers/s390/net/ctcmain.c === --- linux-2.6-uschi.orig/drivers/s390/net/ctcmain.c +++ linux-2.6-uschi/drivers/s390/net/ctcmain.c @@ -644,14 +644,6 @@ ctc_checkalloc_buffer(struct channel *ch } /** - * Dummy NOP action for statemachines - */ -static void -fsm_action_nop(fsm_instance * fi, int event, void *arg) -{ -} - -/** * Actions for channel - statemachines. 
*/
kernel panic on 2.6.24 with esfq patch applied
Hi Probably bug related to ESFQ, now i will unload module and will test more. But probably not related, so if not difficult, please take a look. Feb 1 09:08:50 SERVER [12380.067104] BUG: unable to handle kernel NULL pointer dereference Feb 1 09:08:50 SERVER at virtual address 0008 Feb 1 09:08:50 SERVER [12380.067140] printing eip: c01f10ed Feb 1 09:08:50 SERVER *pde = Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067162] Oops: [#1] Feb 1 09:08:50 SERVER SMP Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067181] Modules linked in: Feb 1 09:08:50 SERVER netconsole Feb 1 09:08:50 SERVER configfs Feb 1 09:08:50 SERVER iTCO_wdt Feb 1 09:08:50 SERVER nf_nat_pptp Feb 1 09:08:50 SERVER nf_conntrack_pptp Feb 1 09:08:50 SERVER nf_conntrack_proto_gre Feb 1 09:08:50 SERVER nf_nat_proto_gre Feb 1 09:08:50 SERVER sch_esfq Feb 1 09:08:50 SERVER xt_tcpudp Feb 1 09:08:50 SERVER ipt_TTL Feb 1 09:08:50 SERVER ipt_ttl Feb 1 09:08:50 SERVER xt_NOTRACK Feb 1 09:08:50 SERVER iptable_raw Feb 1 09:08:50 SERVER iptable_mangle Feb 1 09:08:50 SERVER ifb Feb 1 09:08:50 SERVER e1000e Feb 1 09:08:50 SERVER em_nbyte Feb 1 09:08:50 SERVER cls_tcindex Feb 1 09:08:50 SERVER act_gact Feb 1 09:08:50 SERVER cls_rsvp Feb 1 09:08:50 SERVER sch_htb Feb 1 09:08:50 SERVER cls_fw Feb 1 09:08:50 SERVER act_mirred Feb 1 09:08:50 SERVER em_u32 Feb 1 09:08:50 SERVER sch_red Feb 1 09:08:50 SERVER sch_sfq Feb 1 09:08:50 SERVER sch_tbf Feb 1 09:08:50 SERVER sch_teql Feb 1 09:08:50 SERVER cls_basic Feb 1 09:08:50 SERVER act_police Feb 1 09:08:50 SERVER sch_gred Feb 1 09:08:50 SERVER act_pedit Feb 1 09:08:50 SERVER sch_hfsc Feb 1 09:08:50 SERVER cls_rsvp6 Feb 1 09:08:50 SERVER sch_ingress Feb 1 09:08:50 SERVER em_meta Feb 1 09:08:50 SERVER em_text Feb 1 09:08:50 SERVER act_ipt Feb 1 09:08:50 SERVER sch_dsmark Feb 1 09:08:50 SERVER sch_prio Feb 1 09:08:50 SERVER sch_netem Feb 1 09:08:50 SERVER act_simple Feb 1 09:08:50 SERVER cls_u32 Feb 1 09:08:50 SERVER em_cmp Feb 1 09:08:50 SERVER sch_cbq Feb 1 09:08:50 
SERVER cls_route Feb 1 09:08:50 SERVER xt_TCPMSS Feb 1 09:08:50 SERVER iptable_nat Feb 1 09:08:50 SERVER nf_conntrack_ipv4 Feb 1 09:08:50 SERVER ipt_LOG Feb 1 09:08:50 SERVER ipt_MASQUERADE Feb 1 09:08:50 SERVER ipt_REDIRECT Feb 1 09:08:50 SERVER nf_nat Feb 1 09:08:50 SERVER nf_conntrack Feb 1 09:08:50 SERVER nfnetlink Feb 1 09:08:50 SERVER iptable_filter Feb 1 09:08:50 SERVER ip_tables Feb 1 09:08:50 SERVER x_tables Feb 1 09:08:50 SERVER 8021q Feb 1 09:08:50 SERVER tun Feb 1 09:08:50 SERVER tulip Feb 1 09:08:50 SERVER r8169 Feb 1 09:08:50 SERVER sky2 Feb 1 09:08:50 SERVER via_velocity Feb 1 09:08:50 SERVER via_rhine Feb 1 09:08:50 SERVER sis900 Feb 1 09:08:50 SERVER ne2k_pci Feb 1 09:08:50 SERVER 8390 Feb 1 09:08:50 SERVER skge Feb 1 09:08:50 SERVER tg3 Feb 1 09:08:50 SERVER 8139too Feb 1 09:08:50 SERVER e1000 Feb 1 09:08:50 SERVER e100 Feb 1 09:08:50 SERVER usb_storage Feb 1 09:08:50 SERVER mtdblock Feb 1 09:08:50 SERVER mtd_blkdevs Feb 1 09:08:50 SERVER usbhid Feb 1 09:08:50 SERVER uhci_hcd Feb 1 09:08:50 SERVER ehci_hcd Feb 1 09:08:50 SERVER ohci_hcd Feb 1 09:08:50 SERVER usbcore Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067515] Feb 1 09:08:50 SERVER [12380.067530] Pid: 0, comm: swapper Not tainted (2.6.24-build-0021 #26) Feb 1 09:08:50 SERVER [12380.067550] EIP: 0060:[c01f10ed] EFLAGS: 00010086 CPU: 0 Feb 1 09:08:50 SERVER [12380.067571] EIP is at rb_erase+0x110/0x22f Feb 1 09:08:50 SERVER [12380.067589] EAX: f52bbea0 EBX: ECX: EDX: f52bbea0 Feb 1 09:08:50 SERVER [12380.067608] ESI: f717df50 EDI: c1fed000 EBP: c1fecf80 ESP: c037fda8 Feb 1 09:08:50 SERVER [12380.067628] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 Feb 1 09:08:50 SERVER [12380.067647] Process swapper (pid: 0, ti=c037e000 task=c03533a0 task.ti=c037e000) Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067668] Stack: Feb 1 09:08:50 SERVER 0001 Feb 1 09:08:50 SERVER c1fed000 Feb 1 09:08:50 SERVER c1fecf78 Feb 1 09:08:50 SERVER 0002 Feb 1 09:08:50 SERVER 0001 Feb 1 09:08:50 SERVER c0134663 Feb 
1 09:08:50 SERVER c1fed000 Feb 1 09:08:50 SERVER c1fecf78 Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067714] Feb 1 09:08:50 SERVER c1fecf40 Feb 1 09:08:50 SERVER c013515b Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER 4f3f473e Feb 1 09:08:50 SERVER 02d0 Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER 7fff Feb 1 09:08:50 SERVER 4f3f473e Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067760] Feb 1 09:08:50 SERVER 02d0 Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER c1fec120 Feb 1 09:08:50 SERVER c037ff84 Feb 1 09:08:50 SERVER c037fe70 Feb 1 09:08:50 SERVER f76ae880 Feb 1 09:08:50 SERVER c0113963 Feb 1 09:08:50 SERVER c1ff5f78 Feb 1 09:08:50 SERVER Feb 1 09:08:50 SERVER [12380.067806] Call Trace: Feb 1 09:08:50 SERVER [12380.067839] [c0134663] Feb 1
Re: Slow OOM in netif_RX function
Arnaldo Carvalho de Melo wrote: Em Fri, Jan 25, 2008 at 02:21:08PM +0100, Andi Kleen escreveu: Ivan H. Dichev [EMAIL PROTECTED] writes: What could happen if I put different Lan card in every slot? In ex. to-private - 3com to-inet- VIA to-dmz - rtl8139 And then to look which RX function is consuming the memory. (boomerang_rx, rtl8139_rx, ... etc) The problem is unlikely to be in the driver (these are both well tested ones) but more likely your complicated iptables setup somehow triggers a skb leak. There are unfortunately no shrink wrapped debug mechanisms in the kernel for leaks like this (ok you could enable CONFIG_NETFILTER_DEBUG and see if it prints something interesting, but that's a long shot). If you wanted to write a custom debugging patch I would do something like this: - Add two new integer fields to struct sk_buff: a time stamp and a integer field - Fill the time stamp with jiffies in alloc_skb and clear the integer field - In __kfree_skb clear the time stamp - For all the ipt target modules in net/ipv4/netfilter/*.c you use change their -target functions to put an unique value into the integer field you added. - Do the same for the pkt_to_tuple functions for all conntrack modules Then when you observe the leak take a crash dump using kdump on the router and then use crash to dump all the slab objects for the sk_head_cache. Then look for any that have an old time stamp and check what value they have in the integer field. Then the netfilter function who set that unique value likely triggered the leak somehow. I wrote some systemtap scripts that do parts of what you suggest, and at least for the timestamp there was no need to add a new field to struct sk_buff, I just reuse skb-timestamp, as it is only used when we use a packet sniffer. Here it is for reference, but it needs some tapsets I wrote, so I'll publish this git repo in git.kernel.org, perhaps it can be useful in this case as a starting point. 
Find another unused field (hint: I know that at least 4 bytes on 64 bits is present as a hole) and you're done, no need to rebuild the kernel :)

http://git.kernel.org/?p=linux/kernel/git/acme/nettaps.git

- Arnaldo

Thanks to everyone for the ideas. I am not a kernel guru, so writing a patch is difficult. This is a production server and it is quite difficult to debug (only at night). I removed some iptables exotics (recent, ulog, string), but with no effect. Since we can reach OOM, most of the memory is going to be filled with the leak, so we are thinking of dumping and analyzing it. We have looked at the crash tool, and we will see what we can do with it. Meanwhile, do you have any hints or ideas?

Thanks a lot.
Ivan Dichev
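
The tagging scheme suggested in the quoted text (stamp each buffer at allocation, let every hook write its own id, then look for old buffers in a dump) can be illustrated with a small user-space model. This is only a sketch under assumptions: struct toy_buf and all toy_* functions are hypothetical and stand in for the proposed debug fields; no real sk_buff field or kernel function is used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two debug fields proposed for struct sk_buff. */
struct toy_buf {
	uint64_t alloc_time; /* set at allocation, cleared to 0 on free */
	int last_tag;        /* unique id of the last hook that touched the buffer */
};

/* Models alloc_skb(): stamp the time and clear the tag.
 * Assumes `now` is never 0, because 0 marks a freed buffer. */
static void toy_alloc(struct toy_buf *b, uint64_t now)
{
	b->alloc_time = now;
	b->last_tag = 0;
}

/* Models a target or conntrack function writing its unique value. */
static void toy_touch(struct toy_buf *b, int tag)
{
	b->last_tag = tag;
}

/* Models __kfree_skb(): clear the time stamp. */
static void toy_free(struct toy_buf *b)
{
	b->alloc_time = 0;
}

/* Scans a "dump" of buffers and counts, per tag, the buffers that are still
 * allocated and older than max_age. counts must hold ntags entries.
 * Returns the tag with the most stale buffers, or -1 if none are stale. */
static int toy_find_suspect(const struct toy_buf *bufs, size_t n,
			    uint64_t now, uint64_t max_age,
			    size_t *counts, int ntags)
{
	int best = -1;
	size_t i;
	int t;

	for (t = 0; t < ntags; t++)
		counts[t] = 0;

	for (i = 0; i < n; i++) {
		const struct toy_buf *b = &bufs[i];

		if (b->alloc_time == 0)
			continue; /* already freed */
		if (now - b->alloc_time <= max_age)
			continue; /* still young, probably in flight */
		if (b->last_tag < 0 || b->last_tag >= ntags)
			continue; /* tag outside the table */
		counts[b->last_tag]++;
	}

	for (t = 0; t < ntags; t++)
		if (counts[t] > 0 && (best < 0 || counts[t] > counts[best]))
			best = t;
	return best;
}
```

The tag returned by toy_find_suspect() corresponds to "the netfilter function who set that unique value" in the quoted suggestion; in a real investigation the scan would run over slab objects taken from a crash dump rather than over an array.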
[PATCH for 2.6.25 1/2] [NET] ucc_geth: fix module removal
- uccf should be set to NULL to not double-free memory on subsequent calls; - ind_hash_q and group_hash_q lists should be initialized in the probe() function, instead of struct_init() (called by open()), otherwise there will be an oops if ucc_geth_driver removed prior 'ifconfig ethX up'; - add unregister_netdev(); - reorder geth_remove() steps. Signed-off-by: Anton Vorontsov [EMAIL PROTECTED] --- Hi Li, You kinda promised that these two patches would hit 2.6.25... ;-) I've rebased the patches so they apply cleanly on the current tree. Thanks, drivers/net/ucc_geth.c | 17 ++--- 1 files changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c index 4ffd873..e41da46 100644 --- a/drivers/net/ucc_geth.c +++ b/drivers/net/ucc_geth.c @@ -2084,8 +2084,10 @@ static void ucc_geth_memclean(struct ucc_geth_private *ugeth) if (!ugeth) return; - if (ugeth-uccf) + if (ugeth-uccf) { ucc_fast_free(ugeth-uccf); + ugeth-uccf = NULL; + } if (ugeth-p_thread_data_tx) { qe_muram_free(ugeth-thread_dat_tx_offset); @@ -2305,10 +2307,6 @@ static int ucc_struct_init(struct ucc_geth_private *ugeth) ug_info = ugeth-ug_info; uf_info = ug_info-uf_info; - /* Create CQs for hash tables */ - INIT_LIST_HEAD(ugeth-group_hash_q); - INIT_LIST_HEAD(ugeth-ind_hash_q); - if (!((uf_info-bd_mem_part == MEM_PART_SYSTEM) || (uf_info-bd_mem_part == MEM_PART_MURAM))) { if (netif_msg_probe(ugeth)) @@ -3990,6 +3988,10 @@ static int ucc_geth_probe(struct of_device* ofdev, const struct of_device_id *ma ugeth = netdev_priv(dev); spin_lock_init(ugeth-lock); + /* Create CQs for hash tables */ + INIT_LIST_HEAD(ugeth-group_hash_q); + INIT_LIST_HEAD(ugeth-ind_hash_q); + dev_set_drvdata(device, dev); /* Set the dev-base_addr to the gfar reg region */ @@ -4040,9 +4042,10 @@ static int ucc_geth_remove(struct of_device* ofdev) struct net_device *dev = dev_get_drvdata(device); struct ucc_geth_private *ugeth = netdev_priv(dev); - dev_set_drvdata(device, NULL); - 
ucc_geth_memclean(ugeth); + unregister_netdev(dev); free_netdev(dev); + ucc_geth_memclean(ugeth); + dev_set_drvdata(device, NULL); return 0; } -- 1.5.2.2
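
The first point of the changelog, clearing a pointer after freeing it so that a later cleanup call does not free it again, can be shown in isolation. This is a minimal user-space sketch: struct toy_priv and toy_memclean() are hypothetical names, and free() merely stands in for the driver's own release function.

```c
#include <stdlib.h>

/* Hypothetical private structure owning one resource. */
struct toy_priv {
	int *uccf; /* stands in for a driver-owned allocation */
};

/* Idempotent cleanup: releases the resource at most once.
 * Returns 1 if something was freed, 0 if there was nothing to free. */
static int toy_memclean(struct toy_priv *p)
{
	if (!p || !p->uccf)
		return 0;

	free(p->uccf);
	p->uccf = NULL; /* without this, a second call would free twice */
	return 1;
}
```

With the pointer reset, the cleanup function can safely be reached from more than one path (for example both a close path and a remove path), which is the situation the changelog describes as "subsequent calls".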
Protocol handler for Marvell DSA EtherType packets
Hi Netdev,

I'm writing a new protocol handler using dev_add_pack(), for a Marvell switch chip handling DSA (Distributed Switch Architecture) EtherType packets. My protocol handler works and I get the skb. But I want to remove the DSA headers and send the packet back for normal processing on a device. (I actually just want to be able to tcpdump these packets on the device.)

I'm removing the headers by:

	skb_pull(skb, sizeof(struct dsa_header));

I'm trying to retransmit it by:

	netif_rx(skb);

But it seems that I just retransmit the same packet without the DSA headers having been removed. Any hints about which functions I should use to remove the DSA header?

-- 
Med venlig hilsen / Best regards
Jesper Brouer
ComX Networks A/S
Linux Network developer
Cand. Scient Datalog / MSc.
Author of http://adsl-optimizer.dk
LinkedIn: http://www.linkedin.com/in/brouer
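
One thing worth illustrating here is that advancing the start of a buffer only removes bytes that sit at the very front. The following user-space sketch assumes, hypothetically, that the tag sits directly after the two MAC addresses and has a fixed length; the post does not state the actual layout of struct dsa_header or where skb->data points inside the handler, so strip_tag() and MAC_ADDRS_LEN are illustrative only and this is not kernel code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDRS_LEN 12 /* destination + source MAC address */

/* Strips a tag of tag_len bytes located directly after the MAC addresses.
 * The addresses are slid forward over the tag, so the frame afterwards
 * starts tag_len bytes later in the same buffer.
 * Returns the new start of the frame, or NULL if the frame is too short.
 * On success *len is reduced by tag_len. */
static uint8_t *strip_tag(uint8_t *frame, size_t *len, size_t tag_len)
{
	if (*len < MAC_ADDRS_LEN + tag_len)
		return NULL;

	/* memmove() because source and destination may overlap */
	memmove(frame + tag_len, frame, MAC_ADDRS_LEN);
	*len -= tag_len;
	return frame + tag_len;
}
```

Under that assumed layout, the bytes following the moved addresses are the original EtherType and payload, so a capture tool looking at the returned frame would see an ordinary untagged frame. Whether the equivalent steps are the right fix inside a packet handler is not established by this post.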
Re: Slow OOM in netif_RX function
On Fri, Feb 01, 2008 at 02:51:40PM +0200, Ivan Dichev wrote: Arnaldo Carvalho de Melo wrote: Em Fri, Jan 25, 2008 at 02:21:08PM +0100, Andi Kleen escreveu: Ivan H. Dichev [EMAIL PROTECTED] writes: What could happen if I put different Lan card in every slot? In ex. to-private - 3com to-inet- VIA to-dmz - rtl8139 And then to look which RX function is consuming the memory. (boomerang_rx, rtl8139_rx, ... etc) The problem is unlikely to be in the driver (these are both well tested ones) but more likely your complicated iptables setup somehow triggers a skb leak. There are unfortunately no shrink wrapped debug mechanisms in the kernel for leaks like this (ok you could enable CONFIG_NETFILTER_DEBUG and see if it prints something interesting, but that's a long shot). If you wanted to write a custom debugging patch I would do something like this: - Add two new integer fields to struct sk_buff: a time stamp and a integer field - Fill the time stamp with jiffies in alloc_skb and clear the integer field - In __kfree_skb clear the time stamp - For all the ipt target modules in net/ipv4/netfilter/*.c you use change their -target functions to put an unique value into the integer field you added. - Do the same for the pkt_to_tuple functions for all conntrack modules Then when you observe the leak take a crash dump using kdump on the router and then use crash to dump all the slab objects for the sk_head_cache. Then look for any that have an old time stamp and check what value they have in the integer field. Then the netfilter function who set that unique value likely triggered the leak somehow. I wrote some systemtap scripts that do parts of what you suggest, and at least for the timestamp there was no need to add a new field to struct sk_buff, I just reuse skb-timestamp, as it is only used when we use a packet sniffer. 
Here it is for reference, but it needs some tapsets I wrote, so I'll publish this git repo in git.kernel.org, perhaps it can be useful in this case as a starting point. Find another unused field (hint: I know that at least 4 bytes on 64 bits is present as a hole) and you're done, no need to rebuild the kernel :)

http://git.kernel.org/?p=linux/kernel/git/acme/nettaps.git

- Arnaldo

Thanks to everyone for the ideas. I am not a kernel guru, so writing a patch is difficult. This is a production server and it is quite difficult to debug (only at night). I removed some iptables exotics (recent, ulog, string), but with no effect. Since we can reach OOM, most of the memory is going to be filled with the leak, so we are thinking of dumping and analyzing it.

You could perhaps use crash to look for leaked packets and then see if you can see a pattern, as in what types of packets they are. Still, I expect that without modifying the kernel to add some more netfilter tracing it will be difficult to diagnose this. I suppose it would be possible to write a suitable systemtap script to trace this without modifying the kernel, although it will probably not be easy and will be more complicated than just changing the C code.

-Andi
[patch 3/7] ctcm: infrastructure for replaced ctc driver
From: Peter Tiedemann [EMAIL PROTECTED] establish base stuff for the replaced ctc driver, i.e. Kconfig and Makefile adaptions arch/s390/defconfig drivers/s390/net/Kconfig drivers/s390/net/Makefile Signed-off-by: Peter Tiedemann [EMAIL PROTECTED] Signed-off-by: Ursula Braun [EMAIL PROTECTED] --- drivers/s390/net/Kconfig | 12 +++- drivers/s390/net/Makefile |5 ++--- 2 files changed, 9 insertions(+), 8 deletions(-) Index: linux-2.6-uschi/drivers/s390/net/Makefile === --- linux-2.6-uschi.orig/drivers/s390/net/Makefile +++ linux-2.6-uschi/drivers/s390/net/Makefile @@ -2,11 +2,10 @@ # S/390 network devices # -ctc-objs := ctcmain.o ctcdbug.o - +ctcm-objs := ctcm_main.o ctcm_fsms.o ctcm_mpc.o ctcm_sysfs.o ctcm_dbug.o +obj-$(CONFIG_CTCM) += ctcm.o fsm.o cu3088.o obj-$(CONFIG_NETIUCV) += netiucv.o fsm.o obj-$(CONFIG_SMSGIUCV) += smsgiucv.o -obj-$(CONFIG_CTC) += ctc.o fsm.o cu3088.o obj-$(CONFIG_LCS) += lcs.o cu3088.o obj-$(CONFIG_CLAW) += claw.o cu3088.o qeth-y := qeth_main.o qeth_mpc.o qeth_sys.o qeth_eddp.o Index: linux-2.6-uschi/drivers/s390/net/Kconfig === --- linux-2.6-uschi.orig/drivers/s390/net/Kconfig +++ linux-2.6-uschi/drivers/s390/net/Kconfig @@ -11,15 +11,17 @@ config LCS To compile as a module, choose M. The module name is lcs.ko. If you do not know what it is, it's safe to choose Y. -config CTC - tristate CTC device support +config CTCM + tristate CTC and MPC SNA device support depends on CCW NETDEVICES help Select this option if you want to use channel-to-channel point-to-point networking on IBM System z. This device driver supports real CTC coupling using ESCON. It also supports virtual CTCs when running under VM. - To compile as a module, choose M. The module name is ctc.ko. + This driver also supports channel-to-channel MPC SNA devices. + MPC is an SNA protocol device used by Communication Server for Linux. + To compile as a module, choose M. The module name is ctcm.ko. To compile into the kernel, choose Y. 
If you do not need any channel-to-channel connection, choose N. @@ -84,7 +86,7 @@ config QETH_VLAN 802.1q VLAN support in the qeth device driver. config CCWGROUP - tristate - default (LCS || CTC || QETH) + tristate + default (LCS || CTCM || QETH) endmenu
Re: [PATCH] Disable TSO for non standard qdiscs
> The TSO defer logic is based on your congestion window and current window size. So the actual frame sizes hitting your NIC attached to your DSL probably aren't anywhere near 64KB, but probably more in line with whatever your window size is for DSL.

DSL windows can be quite large because a lot of DSL lines have quite a long latency due to error correction. And with ADSL2 we have up to 16Mbit now.

> I think we're having more of a disagreement of what is considered the normal case user. If you are on a slow link, such as a DSL/cable line, your TCP window/congestion window aren't going to be big enough to generate large TSO's, so what is the issue? But disabling TSO, say on a

64k TSOs are likely even with DSL. Anyway, even with smaller TSOs the change still makes sense because each increase makes packet scheduling less smooth.

-Andi
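
The point about DSL windows can be checked with a rough bandwidth-delay calculation. The sketch below is only an illustration: bdp_bytes() and window_fits_64k_tso() are hypothetical helpers, and the 40 ms round-trip time used in the note afterwards is an assumed figure, not one taken from this thread.

```c
#include <stdint.h>

/* Bandwidth-delay product in bytes for a link rate in bit/s and a
 * round-trip time in milliseconds: roughly the amount of data that
 * has to be in flight to keep the link busy. */
static uint64_t bdp_bytes(uint64_t rate_bps, uint64_t rtt_ms)
{
	return rate_bps * rtt_ms / 8 / 1000;
}

/* Returns 1 if one window's worth of data is at least 64 KB, i.e. large
 * enough that a full-sized TSO frame would fit inside the window. */
static int window_fits_64k_tso(uint64_t rate_bps, uint64_t rtt_ms)
{
	return bdp_bytes(rate_bps, rtt_ms) >= 64 * 1024;
}
```

For the 16Mbit figure mentioned above and an assumed 40 ms round trip, bdp_bytes(16000000, 40) gives 80000 bytes, which exceeds 64 KB; at 1 Mbit/s with the same delay the result is only 5000 bytes.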
[PATCH 1/6] e1000e: make a function static
From: Adrian Bunk [EMAIL PROTECTED]

This patch makes the needlessly global reg_pattern_test_array() static.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/e1000e/ethtool.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index 6d9c27f..a2034cf 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -690,8 +690,8 @@ err_setup:
 	return err;
 }
 
-bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
-			    int reg, int offset, u32 mask, u32 write)
+static bool reg_pattern_test_array(struct e1000_adapter *adapter, u64 *data,
+				   int reg, int offset, u32 mask, u32 write)
 {
 	int i;
 	u32 read;
[PATCH 6/6] igb: remove unneeded declaration shadowing earlier one
This removes a sparse warning.

Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/igb/igb_main.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index f3c144d..d4eb8e2 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -438,7 +438,6 @@ static int igb_request_irq(struct igb_adapter *adapter)
 	if (adapter->msix_entries) {
 		err = igb_request_msix(adapter);
 		if (!err) {
-			struct e1000_hw *hw = &adapter->hw;
 			/* enable IAM, auto-mask,
 			 * DO NOT USE EIAME or IAME in legacy mode */
 			wr32(E1000_IAM, IMS_ENABLE_MASK);
[PATCH 5/6] e1000e: tweak irq allocation messages
From: Andy Gospodarek [EMAIL PROTECTED]

There's too much noise on systems that don't support MSI. Let's get rid of a few and make the real error message more specific.

Signed-off-by: Andy Gospodarek [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/e1000e/netdev.c |   12 +++++-------
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 0a2cb79..f58f017 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -945,11 +945,7 @@ static int e1000_request_irq(struct e1000_adapter *adapter)
 	int irq_flags = IRQF_SHARED;
 	int err;
 
-	err = pci_enable_msi(adapter->pdev);
-	if (err) {
-		ndev_warn(netdev,
-		 "Unable to allocate MSI interrupt Error: %d\n", err);
-	} else {
+	if (!pci_enable_msi(adapter->pdev)) {
 		adapter->flags |= FLAG_MSI_ENABLED;
 		handler = e1000_intr_msi;
 		irq_flags = 0;
@@ -958,10 +954,12 @@ static int e1000_request_irq(struct e1000_adapter *adapter)
 	err = request_irq(adapter->pdev->irq, handler, irq_flags, netdev->name,
 			  netdev);
 	if (err) {
+		ndev_err(netdev,
+			 "Unable to allocate %s interrupt (return: %d)\n",
+			 adapter->flags & FLAG_MSI_ENABLED ? "MSI":"INTx",
+			 err);
 		if (adapter->flags & FLAG_MSI_ENABLED)
 			pci_disable_msi(adapter->pdev);
-		ndev_err(netdev,
-		 "Unable to allocate interrupt Error: %d\n", err);
 	}
 
 	return err;
[PATCH 3/6] e100: Fix iomap mem accesses
From: Jiri Slaby [EMAIL PROTECTED]

writeX functions are not permitted on iomap-ped space; change to iowriteX. Also use pci_iounmap() on the pci_iomap-ped space on exit (instead of iounmap).

Signed-off-by: Jiri Slaby [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/e100.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 51cf577..9d42dd8 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -1958,7 +1958,7 @@ static void e100_rx_clean(struct nic *nic, unsigned int *work_done,
 
 	if(restart_required) {
 		// ack the rnr?
-		writeb(stat_ack_rnr, &nic->csr->scb.stat_ack);
+		iowrite8(stat_ack_rnr, &nic->csr->scb.stat_ack);
 		e100_start_receiver(nic, nic->rx_to_clean);
 		if(work_done)
 			(*work_done)++;
@@ -2774,7 +2774,7 @@ static void __devexit e100_remove(struct pci_dev *pdev)
 		struct nic *nic = netdev_priv(netdev);
 		unregister_netdev(netdev);
 		e100_free(nic);
-		iounmap(nic->csr);
+		pci_iounmap(pdev, nic->csr);
 		free_netdev(netdev);
 		pci_release_regions(pdev);
 		pci_disable_device(pdev);
[2.6 patch] rtnetlink.c: remove no longer used functions
On Wed, Jan 30, 2008 at 09:04:33PM +0100, Patrick McHardy wrote: Adrian Bunk wrote: This patch #if 0's the following no longer used functions: - rtattr_parse() - rtattr_strlcpy() - __rtattr_parse_nested_compat() Please remove them instead. Updated patch below. cu Adrian -- snip -- This patch removes the following no longer used functions: - rtattr_parse() - rtattr_strlcpy() - __rtattr_parse_nested_compat() Signed-off-by: Adrian Bunk [EMAIL PROTECTED] --- include/linux/rtnetlink.h | 12 -- net/core/rtnetlink.c | 44 -- 2 files changed, 56 deletions(-) dcfe6b63a05c4944afcfc22fd4f2d2b495dc04c6 diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index b014f6b..b9e1740 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -602,24 +602,12 @@ struct tcamsg #include linux/mutex.h -extern size_t rtattr_strlcpy(char *dest, const struct rtattr *rta, size_t size); static __inline__ int rtattr_strcmp(const struct rtattr *rta, const char *str) { int len = strlen(str) + 1; return len rta-rta_len || memcmp(RTA_DATA(rta), str, len); } -extern int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len); -extern int __rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr, - struct rtattr *rta, int len); - -#define rtattr_parse_nested(tb, max, rta) \ - rtattr_parse((tb), (max), RTA_DATA((rta)), RTA_PAYLOAD((rta))) - -#define rtattr_parse_nested_compat(tb, max, rta, data, len) \ -({ data = RTA_PAYLOAD(rta) = len ? 
RTA_DATA(rta) : NULL; \ - __rtattr_parse_nested_compat(tb, max, rta, len); }) - extern int rtnetlink_send(struct sk_buff *skb, struct net *net, u32 pid, u32 group, int echo); extern int rtnl_unicast(struct sk_buff *skb, struct net *net, u32 pid); extern int rtnl_notify(struct sk_buff *skb, struct net *net, u32 pid, u32 group, diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index ddbdde8..61ac8d0 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -82,32 +82,6 @@ int rtnl_trylock(void) return mutex_trylock(rtnl_mutex); } -int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len) -{ - memset(tb, 0, sizeof(struct rtattr*)*maxattr); - - while (RTA_OK(rta, len)) { - unsigned flavor = rta-rta_type; - if (flavor flavor = maxattr) - tb[flavor-1] = rta; - rta = RTA_NEXT(rta, len); - } - return 0; -} - -int __rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr, -struct rtattr *rta, int len) -{ - if (RTA_PAYLOAD(rta) len) - return -1; - if (RTA_PAYLOAD(rta) = RTA_ALIGN(len) + sizeof(struct rtattr)) { - rta = RTA_DATA(rta) + RTA_ALIGN(len); - return rtattr_parse_nested(tb, maxattr, rta); - } - memset(tb, 0, sizeof(struct rtattr *) * maxattr); - return 0; -} - static struct rtnl_link *rtnl_msg_handlers[NPROTO]; static inline int rtm_msgindex(int msgtype) @@ -442,21 +416,6 @@ void __rta_fill(struct sk_buff *skb, int attrtype, int attrlen, const void *data memset(RTA_DATA(rta) + attrlen, 0, RTA_ALIGN(size) - size); } -size_t rtattr_strlcpy(char *dest, const struct rtattr *rta, size_t size) -{ - size_t ret = RTA_PAYLOAD(rta); - char *src = RTA_DATA(rta); - - if (ret 0 src[ret - 1] == '\0') - ret--; - if (size 0) { - size_t len = (ret = size) ? 
size - 1 : ret; - memset(dest, 0, size); - memcpy(dest, src, len); - } - return ret; -} - int rtnetlink_send(struct sk_buff *skb, struct net *net, u32 pid, unsigned group, int echo) { struct sock *rtnl = net-rtnl; @@ -1411,9 +1370,6 @@ void __init rtnetlink_init(void) } EXPORT_SYMBOL(__rta_fill); -EXPORT_SYMBOL(rtattr_strlcpy); -EXPORT_SYMBOL(rtattr_parse); -EXPORT_SYMBOL(__rtattr_parse_nested_compat); EXPORT_SYMBOL(rtnetlink_put_metrics); EXPORT_SYMBOL(rtnl_lock); EXPORT_SYMBOL(rtnl_trylock);
[RFC PATCH]: TCP + SACK + skb processing latency
Hi all, Here's an attempt to reduce amount of skb cleanup work TCP with TSO has to do after SACKs have arrived. I'm not on very familiar grounds with TSOed skbs so there likely is much I just couldn't take into account. I probably should at least check some flag somewhere? (=NETIF_F_SG?). Also the current the length handling and potential presence of headers worries me a lot. I'd appreciate if somebody with more knowledge about skb internals could take a look at the relevant parts (mainly skb_shift) and confirm that doing all this I'm trying to do is allowed :-). -- i. -- [RFC PATCH] [TCP]: Try to restore large SKBs while SACK processing During SACK processing, most of the benefits of TSO are eaten by the SACK blocks that one-by-one fragment SKBs to MSS sized chunks. Then we're in problems when cleanup work for them has to be done when a large cumulative ACK comes. Try to return to pre-split state while more and more SACK info gets discovered. Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED] --- include/net/tcp.h|5 + net/ipv4/tcp_input.c | 206 +- 2 files changed, 209 insertions(+), 2 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 7de4ea3..cdf4468 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1170,6 +1170,11 @@ static inline struct sk_buff *tcp_write_queue_next(struct sock *sk, struct sk_bu return skb-next; } +static inline struct sk_buff *tcp_write_queue_prev(struct sock *sk, struct sk_buff *skb) +{ + return skb-prev; +} + #define tcp_for_write_queue(skb, sk) \ for (skb = (sk)-sk_write_queue.next; \ (skb != (struct sk_buff *)(sk)-sk_write_queue); \ diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 19c449f..2cb7e86 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -1206,6 +1206,8 @@ static int tcp_check_dsack(struct tcp_sock *tp, struct sk_buff *ack_skb, * aligned portion of it that matches. 
Therefore we might need to fragment * which may fail and creates some hassle (caller must handle error case * returns). + * + * FIXME: this could be merged to shift decision code */ static int tcp_match_skb_to_sack(struct sock *sk, struct sk_buff *skb, u32 start_seq, u32 end_seq) @@ -1322,6 +1324,206 @@ static int tcp_sacktag_one(struct sk_buff *skb, struct sock *sk, return flag; } +/* Attempts to shift up to shiftlen worth of bytes from prev to skb. + * Returns number bytes shifted. + * + * TODO: in case the prev runs out of frag space, operation could be + * made to return with a partial result (would allow tighter packing). + */ +static int skb_shift(struct sk_buff *prev, struct sk_buff *skb, +unsigned int shiftlen) +{ + int i, to, merge; + unsigned int todo; + struct skb_frag_struct *from, *fragto; + + if (skb_cloned(skb) || skb_cloned(prev)) + return 0; + + todo = shiftlen; + i = 0; + from = skb_shinfo(skb)-frags[i]; + to = skb_shinfo(prev)-nr_frags; + + merge = to - 1; + if (!skb_can_coalesce(prev, merge + 1, from-page, from-page_offset)) + merge = -1; + if (merge = 0) { + i++; + if (from-size = shiftlen) + goto onlymerge; + todo -= from-size; + } + + /* Skip full, not-fitting skb to avoid expensive operations */ + if ((shiftlen == skb-len) + (skb_shinfo(skb)-nr_frags - merge) (MAX_SKB_FRAGS - to)) + return 0; + + while (todo (i skb_shinfo(skb)-nr_frags)) { + if (to == MAX_SKB_FRAGS) + return 0; + + from = skb_shinfo(skb)-frags[i]; + fragto = skb_shinfo(prev)-frags[to]; + + if (todo = from-size) { + *fragto = *from; + todo -= from-size; + i++; + to++; + + } else { + fragto-page = from-page; + get_page(fragto-page); + fragto-page_offset = from-page_offset; + fragto-size = todo; + from-page_offset += todo; + from-size -= todo; + + to++; + break; + } + } + skb_shinfo(prev)-nr_frags = to; + + /* Delayed, so that we don't have to undo it */ + if (merge = 0) { +onlymerge: + from = skb_shinfo(skb)-frags[0]; + skb_shinfo(prev)-frags[merge].size += min(from-size, 
shiftlen); + put_page(from-page); + } + + /* Reposition in the original skb */ + to = 0; + while (i skb_shinfo(skb)-nr_frags) + skb_shinfo(skb)-frags[to++] =
[PATCH] drivers/base: export (un)register_memory_notifier
Drivers like eHEA need memory notifiers in order to update their internal DMA memory map when memory is added to or removed from the system.

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
---
Comment: eHEA patches that exploit these functions will follow

 drivers/base/memory.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 7ae413f..1e1bd4c 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -52,11 +52,13 @@ int register_memory_notifier(struct notifier_block *nb)
 {
 	return blocking_notifier_chain_register(&memory_chain, nb);
 }
+EXPORT_SYMBOL(register_memory_notifier);
 
 void unregister_memory_notifier(struct notifier_block *nb)
 {
 	blocking_notifier_chain_unregister(&memory_chain, nb);
 }
+EXPORT_SYMBOL(unregister_memory_notifier);
 
 /*
  * register_memory - Setup a sysfs device for a memory block
-- 
1.5.2
[PATCH 0/2] rtnetlink locking and notification fixes
Hi Dave,

These two patches remove unnecessary locks from rtnetlink, where locking was managed inconsistently, and change the notification. A notification is now always sent if anything changed, but the old notification is also kept for compatibility.

Regards,
Attila
[PATCH 1/2] Remove unnecessary locks from rtnetlink
The do_setlink() function is protected by rtnl, so additional locks are unnecessary, and the set_operstate() function is only called from protected code. The locks are removed from both functions.

set_operstate() is also called from rtnl_create_link() and from no other places. In rtnl_create_link() none of the changes is protected by write_lock_bh() except inside set_operstate(), so a different locking scheme is not necessary for operstate.

Signed-off-by: Laszlo Attila Toth [EMAIL PROTECTED]
---
 net/core/rtnetlink.c |    4
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index ddbdde8..724e8f5 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -565,9 +565,7 @@ static void set_operstate(struct net_device *dev, unsigned char transition)
 	}
 
 	if (dev->operstate != operstate) {
-		write_lock_bh(&dev_base_lock);
 		dev->operstate = operstate;
-		write_unlock_bh(&dev_base_lock);
 		netdev_state_change(dev);
 	}
 }
@@ -882,9 +880,7 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
 		set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));
 
 	if (tb[IFLA_LINKMODE]) {
-		write_lock_bh(&dev_base_lock);
 		dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]);
-		write_unlock_bh(&dev_base_lock);
 	}
 
 	err = 0;
--
1.5.2.5
[PATCH 2/2] rtnetlink: send a single notification on device state changes
In do_setlink() a single notification is sent at the end of the function if any modification occurred. If the address has been changed, another notification is sent. Both of them are required because originally only the NETDEV_CHANGEADDR notification was sent, and although a device state change implies an address change, some programs may expect the original notification. It remains for compatibility.

Signed-off-by: Laszlo Attila Toth [EMAIL PROTECTED]
---
 net/core/rtnetlink.c |   27 ---
 1 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 724e8f5..d67b950 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -545,7 +545,7 @@ int rtnl_put_cacheinfo(struct sk_buff *skb, struct dst_entry *dst, u32 id,
 
 EXPORT_SYMBOL_GPL(rtnl_put_cacheinfo);
 
-static void set_operstate(struct net_device *dev, unsigned char transition)
+static int set_operstate(struct net_device *dev, unsigned char transition)
 {
 	unsigned char operstate = dev->operstate;
 
@@ -566,8 +566,9 @@ static void set_operstate(struct net_device *dev, unsigned char transition)
 
 	if (dev->operstate != operstate) {
 		dev->operstate = operstate;
-		netdev_state_change(dev);
-	}
+		return 1;
+	} else
+		return 0;
 }
 
 static void copy_rtnl_link_stats(struct rtnl_link_stats *a,
@@ -861,6 +862,7 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
 	if (tb[IFLA_BROADCAST]) {
 		nla_memcpy(dev->broadcast, tb[IFLA_BROADCAST], dev->addr_len);
 		send_addr_notify = 1;
+		modified = 1;
 	}
 
 	if (ifm->ifi_flags || ifm->ifi_change) {
@@ -873,14 +875,21 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
 		dev_change_flags(dev, flags);
 	}
 
-	if (tb[IFLA_TXQLEN])
-		dev->tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]);
+	if (tb[IFLA_TXQLEN]) {
+		if (dev->tx_queue_len != nla_get_u32(tb[IFLA_TXQLEN])) {
+			dev->tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]);
+			modified = 1;
+		}
+	}
 
 	if (tb[IFLA_OPERSTATE])
-		set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));
+		modified |= set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));
 
 	if (tb[IFLA_LINKMODE]) {
-		dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]);
+		if (dev->link_mode != nla_get_u8(tb[IFLA_LINKMODE])) {
+			dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]);
+			modified = 1;
+		}
 	}
 
 	err = 0;
@@ -894,6 +903,10 @@ errout:
 
 	if (send_addr_notify)
 		call_netdevice_notifiers(NETDEV_CHANGEADDR, dev);
+
+	if (modified)
+		netdev_state_change(dev);
+
 	return err;
 }
 
--
1.5.2.5
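The "collect changes, notify once" idea described in this patch can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct fake_dev, fake_state_change() and fake_setlink() are made-up stand-ins for struct net_device, netdev_state_change() and do_setlink().

```c
/* Hypothetical stand-in for struct net_device; only the fields used here. */
struct fake_dev {
	unsigned long tx_queue_len;
	unsigned char link_mode;
	int notifications;	/* how many times the notifier below ran */
};

/* Stand-in for netdev_state_change(): it only counts invocations. */
static void fake_state_change(struct fake_dev *dev)
{
	dev->notifications++;
}

/*
 * Apply the requested values. Each field is written only when it really
 * differs, and a single notification is sent at the end if anything changed.
 * Returns 1 if the device was modified, 0 otherwise.
 */
static int fake_setlink(struct fake_dev *dev, unsigned long txqlen,
			unsigned char link_mode)
{
	int modified = 0;

	if (dev->tx_queue_len != txqlen) {
		dev->tx_queue_len = txqlen;
		modified = 1;
	}

	if (dev->link_mode != link_mode) {
		dev->link_mode = link_mode;
		modified = 1;
	}

	if (modified)
		fake_state_change(dev);

	return modified;
}
```

Calling fake_setlink() with the values the device already has leaves the counter untouched, while changing both fields in one call bumps it exactly once.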
Re: Slow OOM in netif_RX function
> I understand you don't want to tell us the exact firewall rules you have.
>
> Maybe you could post at least the following info:
>
> # cat /proc/slabinfo
> # lsmod

I have replaced slab with slub.

firewall # cat /sys/slab/kmalloc-2048/alloc_calls
      1 add_sect_attrs+0x57/0x120 age=20565254 pid=1115
      1 __vmalloc_area_node+0x5d/0xf3 age=586962 pid=31548
      6 journal_init_revoke+0xe0/0x241 age=20562046/20563010/20566655 pid=1-1510
      6 journal_init_revoke+0x1c7/0x241 age=20562046/20563010/20566655 pid=1-1510
      6 journal_init_inode+0x7d/0x123 age=20562046/20563010/20566655 pid=1-1510
      1 tty_write+0xe8/0x1bc age=685813 pid=21217
      1 input_allocate_device+0x10/0x6c age=20566814 pid=38
      2 reqsk_queue_alloc+0x58/0xa8 age=5679409/8932742/12186076 pid=1135-2500
      5 alloc_netdev_mq+0x3c/0x9a age=20555675/20561345/20565141 pid=1233-3041
      1 neigh_hash_alloc+0x14/0x2c age=20527064 pid=0
     11 neigh_sysctl_register+0x24/0x1fd age=20555673/20560511/20567588 pid=1-3041
      6 qdisc_alloc+0x1b/0x70 age=585308/585373/585539 pid=31629-31818
     26 qdisc_get_rtab+0x5e/0xa2 age=585498/585519/585535 pid=31630-31664
     11 devinet_sysctl_register+0x21/0xd7 age=20555673/20560511/20567588 pid=1-3041
      1 netlink_proto_init+0x2a/0x123 age=20567604 pid=1
   3106 boomerang_rx+0x30d/0x40d [3c59x] age=1/9966140/20553543 pid=0-22895
      3 bm_init+0x28/0xa3 [ts_bm] age=586918/586918/586918 pid=31548

firewall # cat /sys/slab/kmalloc-2048/free_calls
    109 not-available age=20515711 pid=0
      1 rcu_do_batch+0x1a/0x71 age=608627 pid=31627
     19 kobject_uevent_env+0x3c5/0x3da age=20578755/20585359/20590686 pid=1-3041
   3055 kfree_skbmem+0x8/0x68 age=3/9973654/20579063 pid=0-31818
      1 pskb_expand_head+0xe3/0x13d age=15946314 pid=695
      1 huft_build+0x498/0x4a2 age=20590653 pid=1
      3 htb_destroy_class+0x5e/0x12e [sch_htb] age=608630/608630/608630 pid=31625
      6 htb_destroy_class+0x69/0x12e [sch_htb] age=608630/608630/608630 pid=31625

fire-sp # lsmod
Module                    Size  Used by
xt_string                 2272  3
ipt_ULOG                  8004  5
ipt_recent                9360  40
softdog                   5792  0
act_mirred                5060  4
cls_u32                   7972  4
sch_sfq                   5760  56
cls_fw                    5408  54
sch_htb                  16192  6
ifb                       5156  0
aes                      28512  0
des                      15456  0
md5                       3936  0
sha256                    9248  0
ipsec                   312176  2
nf_nat_tftp               1792  0
nf_conntrack_tftp         5144  1 nf_nat_tftp
nf_nat_pptp               3712  0
nf_conntrack_pptp         6688  1 nf_nat_pptp
nf_conntrack_proto_gre    4992  1 nf_conntrack_pptp
nf_nat_proto_gre          2724  1 nf_nat_pptp
nf_nat_ftp                3236  0
nf_conntrack_ftp          8680  1 nf_nat_ftp
ipt_tos                   1536  492
xt_mark                   1760  12
xt_DSCP                   2336  13
ipt_NETMAP                1888  6
xt_TCPMSS                 4064  4
xt_length                 1856  3
ts_bm                     2304  3
xt_mac                    1792  28
ipt_REJECT                4416  74
xt_limit                  2496  153
xt_state                  2368  2948
iptable_nat               7172  1
nf_nat                   18412  6 nf_nat_tftp,nf_nat_pptp,nf_nat_proto_gre,nf_nat_ftp,ipt_NETMAP,iptable_nat
nf_conntrack_ipv4        16744  2950 iptable_nat
nf_conntrack             57412  11 nf_nat_tftp,nf_conntrack_tftp,nf_nat_pptp,nf_conntrack_pptp,nf_conntrack_proto_gre,nf_nat_ftp,nf_conntrack_ftp,xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
nfnetlink                 5784  3 nf_nat,nf_conntrack_ipv4,nf_conntrack
xt_MARK                   2176  28
iptable_mangle            2720  1
xt_multiport              3232  2325
iptable_filter            2852  1
ip_tables                12824  3 iptable_nat,iptable_mangle,iptable_filter
binfmt_misc              10792  1
dm_mirror                20608  0
dm_mod                   53280  1 dm_mirror
i2c_viapro                8252  0
i2c_core                 23376  1 i2c_viapro
3c59x                    41132  0
mii                       5280  1 3c59x
floppy                   53892  0
pata_via                 11684  0
libata                  110188  1 pata_via
scsi_mod                137996  1 libata
raid1                    20448  7
firewall#
[2.6 patch] remove obsolete tokenring maintainer information
- Peter's email address is bouncing
- the project webpage no longer exists
- neither Peter nor Mike had a single patch included in the kernel
  since 2.6.12-rc2 (when the git history begins)

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 MAINTAINERS |   23 ---
 1 file changed, 23 deletions(-)

c6ad2e060090ec05fb0cb58f885079360ad48234
diff --git a/MAINTAINERS b/MAINTAINERS
index ba05e80..f09e844 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -84,13 +84,6 @@ S: Status, one of the following:
 			it has been replaced by a better system and you
 			should be using that.
 
-3C359 NETWORK DRIVER
-P:	Mike Phillips
-M:	[EMAIL PROTECTED]
-L:	netdev@vger.kernel.org
-W:	http://www.linuxtr.net
-S:	Maintained
-
 3C505 NETWORK DRIVER
 P:	Philip Blundell
 M:	[EMAIL PROTECTED]
@@ -2864,15 +2857,6 @@ L:	[EMAIL PROTECTED]
 W:	http://oss.oracle.com/projects/ocfs2/
 S:	Supported
 
-OLYMPIC NETWORK DRIVER
-P:	Peter De Shrijver
-M:	[EMAIL PROTECTED]
-P:	Mike Phillips
-M:	[EMAIL PROTECTED]
-L:	netdev@vger.kernel.org
-W:	http://www.linuxtr.net
-S:	Maintained
-
 OMNIKEY CARDMAN 4000 DRIVER
 P:	Harald Welte
 M:	[EMAIL PROTECTED]
@@ -3786,13 +3770,6 @@ L:	[EMAIL PROTECTED] (subscribers-only)
 W:	http://sourceforge.net/projects/tlan/
 S:	Maintained
 
-TOKEN-RING NETWORK DRIVER
-P:	Mike Phillips
-M:	[EMAIL PROTECTED]
-L:	netdev@vger.kernel.org
-W:	http://www.linuxtr.net
-S:	Maintained
-
 TOSHIBA ACPI EXTRAS DRIVER
 P:	John Belmonte
 M:	[EMAIL PROTECTED]
[PATCH 2/6] e1000: make e1000_dump_eeprom() static
From: Adrian Bunk [EMAIL PROTECTED]

This patch makes the needlessly global e1000_dump_eeprom() static.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 drivers/net/e1000/e1000_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 8c87940..7c5b05a 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -853,7 +853,7 @@ e1000_reset(struct e1000_adapter *adapter)
 /**
  * Dump the eeprom for users having checksum issues
  **/
-void e1000_dump_eeprom(struct e1000_adapter *adapter)
+static void e1000_dump_eeprom(struct e1000_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
 	struct ethtool_eeprom eeprom;
[PATCH 4/6] e100: fix spelling errors
From: Andreas Mohr [EMAIL PROTECTED]

Signed-off-by: Andreas Mohr [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 drivers/net/e100.c |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index 9d42dd8..36ba6dc 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -94,7 +94,7 @@
  * enabled. 82557 pads with 7Eh, while the later controllers pad
  * with 00h.
  *
- * IV. Recieve
+ * IV. Receive
  *
  * The Receive Frame Area (RFA) comprises a ring of Receive Frame
  * Descriptors (RFD) + data buffer, thus forming the simplified mode
@@ -120,7 +120,7 @@
  * and Rx indication and re-allocation happen in the same context,
  * therefore no locking is required. A software-generated interrupt
  * is generated from the watchdog to recover from a failed allocation
- * senario where all Rx resources have been indicated and none re-
+ * scenario where all Rx resources have been indicated and none re-
  * placed.
  *
  * V. Miscellaneous
@@ -954,7 +954,7 @@ static void e100_get_defaults(struct nic *nic)
 	/* Quadwords to DMA into FIFO before starting frame transmit */
 	nic->tx_threshold = 0xE0;
 
-	/* no interrupt for every tx completion, delay = 256us if not 557*/
+	/* no interrupt for every tx completion, delay = 256us if not 557 */
 	nic->tx_command = cpu_to_le16(cb_tx | cb_tx_sf |
 		((nic->mac >= mac_82558_D101_A4) ? cb_cid : cb_i));
 
@@ -1497,7 +1497,7 @@ static void e100_update_stats(struct nic *nic)
 		&s->complete;
 
 	/* Device's stats reporting may take several microseconds to
-	 * complete, so where always waiting for results of the
+	 * complete, so we're always waiting for results of the
 	 * previous command. */
 
 	if(*complete == cpu_to_le32(cuc_dump_reset_complete)) {
@@ -2858,17 +2858,17 @@ static void e100_shutdown(struct pci_dev *pdev)
 /**
  * e100_io_error_detected - called when PCI error is detected.
  * @pdev: Pointer to PCI device
- * @state: The current pci conneection state
+ * @state: The current pci connection state
  */
 static pci_ers_result_t e100_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct nic *nic = netdev_priv(netdev);
 
-	/* Similar to calling e100_down(), but avoids adpater I/O. */
+	/* Similar to calling e100_down(), but avoids adapter I/O. */
 	netdev->stop(netdev);
 
-	/* Detach; put netif into state similar to hotplug unplug. */
+	/* Detach; put netif into a state similar to hotplug unplug. */
 	napi_enable(&nic->napi);
 	netif_device_detach(netdev);
 	pci_disable_device(pdev);
[patch 2/7] drivers/s390/net: Kconfig brush up
From: Peter Tiedemann [EMAIL PROTECTED]
From: Ursula Braun [EMAIL PROTECTED]

adapt drivers/s390/net/Kconfig to current IBM wording and further cosmetics

Signed-off-by: Peter Tiedemann [EMAIL PROTECTED]
Signed-off-by: Ursula Braun [EMAIL PROTECTED]
---

 drivers/s390/net/Kconfig |   43 ++-
 1 file changed, 22 insertions(+), 21 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/Kconfig
===
--- linux-2.6-uschi.orig/drivers/s390/net/Kconfig
+++ linux-2.6-uschi/drivers/s390/net/Kconfig
@@ -5,22 +5,23 @@ config LCS
 	tristate "Lan Channel Station Interface"
 	depends on CCW && NETDEVICES && (NET_ETHERNET || TR || FDDI)
 	help
-	   Select this option if you want to use LCS networking on IBM S/390
-	   or zSeries. This device driver supports Token Ring (IEEE 802.5),
-	   FDDI (IEEE 802.7) and Ethernet.
-	   This option is also available as a module which will be
-	   called lcs.ko. If you do not know what it is, it's safe to say Y.
+	   Select this option if you want to use LCS networking on IBM System z.
+	   This device driver supports Token Ring (IEEE 802.5),
+	   FDDI (IEEE 802.7) and Ethernet.
+	   To compile as a module, choose M. The module name is lcs.ko.
+	   If you do not know what it is, it's safe to choose Y.
 
 config CTC
 	tristate "CTC device support"
 	depends on CCW && NETDEVICES
 	help
-	  Select this option if you want to use channel-to-channel networking
-	  on IBM S/390 or zSeries. This device driver supports real CTC
-	  coupling using ESCON. It also supports virtual CTCs when running
-	  under VM. It will use the channel device configuration if this is
-	  available. This option is also available as a module which will be
-	  called ctc.ko. If you do not know what it is, it's safe to say Y.
+	  Select this option if you want to use channel-to-channel
+	  point-to-point networking on IBM System z.
+	  This device driver supports real CTC coupling using ESCON.
+	  It also supports virtual CTCs when running under VM.
+	  To compile as a module, choose M. The module name is ctc.ko.
+	  To compile into the kernel, choose Y.
+	  If you do not need any channel-to-channel connection, choose N.
 
 config NETIUCV
 	tristate "IUCV network device support (VM only)"
@@ -29,9 +30,9 @@ config NETIUCV
 	  Select this option if you want to use inter-user communication
 	  vehicle networking under VM or VIF. It enables a fast communication
 	  link between VM guests. Using ifconfig a point-to-point connection
-	  can be established to the Linux for zSeries and S7390 system
-	  running on the other VM guest. This option is also available
-	  as a module which will be called netiucv.ko. If unsure, say Y.
+	  can be established to the Linux on IBM System z
+	  running on the other VM guest. To compile as a module, choose M.
+	  The module name is netiucv.ko. If unsure, choose Y.
 
 config SMSGIUCV
 	tristate "IUCV special message support (VM only)"
@@ -47,22 +48,22 @@ config CLAW
 	  This driver supports channel attached CLAW devices.
 	  CLAW is Common Link Access for Workstation. Common devices
 	  that use CLAW are RS/6000s, Cisco Routers (CIP) and 3172 devices.
-	  To compile as a module choose M here: The module will be called
-	  claw.ko to compile into the kernel choose Y
+	  To compile as a module, choose M. The module name is claw.ko.
+	  To compile into the kernel, choose Y.
 
 config QETH
 	tristate "Gigabit Ethernet device support"
 	depends on CCW && NETDEVICES && IP_MULTICAST && QDIO
 	help
-	  This driver supports the IBM S/390 and zSeries OSA Express adapters
+	  This driver supports the IBM System z OSA Express adapters
 	  in QDIO mode (all media types), HiperSockets interfaces and VM GuestLAN
 	  interfaces in QDIO and HIPER mode.
 
-	  For details please refer to the documentation provided by IBM at
-	  http://www10.software.ibm.com/developerworks/opensource/linux390
+	  For details please refer to the documentation provided by IBM at
+	  http://www.ibm.com/developerworks/linux/linux390
 
-	  To compile this driver as a module, choose M here: the
-	  module will be called qeth.ko.
+	  To compile this driver as a module, choose M.
+	  The module name is qeth.ko.
 
 
 comment "Gigabit Ethernet default settings"
Re: [PATCH] drivers/base: export (un)register_memory_notifier
On Fri, 2008-02-01 at 17:16 +0100, Jan-Bernd Themann wrote:
> Drivers like eHEA need memory notifiers in order to update their
> internal DMA memory map when memory is added to or removed from
> the system.
>
> Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
> ---
> Comment: eHEA patches that exploit these functions will follow
>
>  drivers/base/memory.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 7ae413f..1e1bd4c 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -52,11 +52,13 @@ int register_memory_notifier(struct notifier_block *nb)
>  {
>  	return blocking_notifier_chain_register(&memory_chain, nb);
>  }
> +EXPORT_SYMBOL(register_memory_notifier);
>
>  void unregister_memory_notifier(struct notifier_block *nb)
>  {
>  	blocking_notifier_chain_unregister(&memory_chain, nb);
>  }
> +EXPORT_SYMBOL(unregister_memory_notifier);
>
>  /*
>   * register_memory - Setup a sysfs device for a memory block

Is there a reason for not making them EXPORT_SYMBOL_GPL() ?

Otherwise, looks good to me. I have been planning to send this as part of my next update with ppc64 arch-specific remove support and generic __remove_pages() support. If this is blocking your work, let's get this in.

Acked-by: Badari Pulavarty [EMAIL PROTECTED]

Thanks,
Badari
Re: [PATCH] Disable TSO for non standard qdiscs
On Fri, 1 Feb 2008 15:34:21 +0100 Andi Kleen [EMAIL PROTECTED] wrote:
> > The TSO defer logic is based on your congestion window and current
> > window size. So the actual frame sizes hitting your NIC attached to
> > your DSL probably aren't anywhere near 64KB, but probably more in line
> > with whatever your window size is for DSL.
>
> DSL windows can be quite large because a lot of DSL lines have a quite
> long latency due to error correction. And with ADSL2 we have up to
> 16Mbit now.
>
> > I think we're having more of a disagreement of what is considered the
> > normal case user. If you are on a slow link, such as a DSL/cable
> > line, your TCP window/congestion window aren't going to be big enough
> > to generate large TSO's, so what is the issue? But disabling TSO, say
> > on a
>
> 64k TSOs are likely even with DSL. Anyways even with smaller TSOs the
> change still makes sense because each increase makes packet scheduling
> less smooth.
>
> -Andi

I wish there was a per-device setting for the max-size of TSO burst.

--
Stephen Hemminger [EMAIL PROTECTED]
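To make the "less smooth" point concrete, here is a small back-of-the-envelope sketch (not from the thread) of how long one burst occupies a link. The helper name burst_time_us() is made up, and the 16 Mbit/s rate is just the ADSL2 figure mentioned above taken as the shaped link speed, which is an assumption.

```c
/*
 * Time, in microseconds, that `bytes` bytes occupy a link running at
 * `bits_per_sec`. This is pure serialization delay: no framing overhead,
 * no queueing. Hypothetical helper for illustration only.
 */
static double burst_time_us(unsigned long bytes, double bits_per_sec)
{
	return (double)bytes * 8.0 * 1e6 / bits_per_sec;
}
```

Under these assumptions, burst_time_us(65536, 16e6) comes to roughly 32.8 ms for a single 64KB TSO super-packet, against burst_time_us(1500, 16e6) at 0.75 ms for one MTU-sized frame. Anything queued behind the large burst waits for the whole of it, which is the granularity a shaping qdisc has to live with.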
[PATCH] [POWERPC][NET][SERIAL] UCCs: replace device-id with cell-index (was: Re: [PATCH] [POWERPC] get rid of `model = UCC' in the ucc nodes)
On Fri, Feb 01, 2008 at 09:32:38AM -0600, Kumar Gala wrote:
> On Feb 1, 2008, at 9:01 AM, Anton Vorontsov wrote:
> > It isn't used anywhere, so remove it. If we'll ever need something
> > like this, we'll use compatible property instead.
> >
> > Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
> > ---
> >
> > Rebased on top of recent tree.
> >
> >  Documentation/powerpc/booting-without-of.txt |    1 -
> >  arch/powerpc/boot/dts/mpc832x_mds.dts        |    3 ---
> >  arch/powerpc/boot/dts/mpc832x_rdb.dts        |    2 --
> >  arch/powerpc/boot/dts/mpc836x_mds.dts        |    2 --
> >  arch/powerpc/boot/dts/mpc8568mds.dts         |    2 --
> >  5 files changed, 0 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
> > index 410c847..dcf9758 100644
> > --- a/Documentation/powerpc/booting-without-of.txt
> > +++ b/Documentation/powerpc/booting-without-of.txt
> > @@ -1675,7 +1675,6 @@ platforms are moved over to use the flattened-device-tree model.
> >  	[EMAIL PROTECTED] {
> >  		device_type = "network";
> >  		compatible = "ucc_geth";
> > -		model = "UCC";
> >  		device-id = <1>;
>
> can we change device-id to cell-index?

Sure. But let's do this in the separate patch? Because this change actually touches the code in the two subsystems: net and serial.

I hope everybody will agree to pass it through powerpc tree..?

- - - -
From: Anton Vorontsov [EMAIL PROTECTED]
Subject: [POWERPC][NET][SERIAL] UCCs: replace device-id with cell-index

device-id is worse than cell-index. Probably cell-index isn't good either, but device-id is worse anyway.

Drivers are modified for backward compatibility's sake.

Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
---
 Documentation/powerpc/booting-without-of.txt |    4 ++--
 arch/powerpc/boot/dts/mpc832x_mds.dts        |    4 +---
 arch/powerpc/boot/dts/mpc832x_rdb.dts        |    2 --
 arch/powerpc/boot/dts/mpc836x_mds.dts        |    2 --
 arch/powerpc/boot/dts/mpc836x_rdk.dts        |   12 ++--
 arch/powerpc/boot/dts/mpc8568mds.dts         |    2 --
 drivers/net/ucc_geth.c                       |    8 +++-
 drivers/net/ucc_geth_mii.c                   |   11 ---
 drivers/serial/ucc_uart.c                    |   16
 9 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
index dcf9758..7b30798 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -1618,7 +1618,7 @@ platforms are moved over to use the flattened-device-tree model.
    bisync, atm, or serial.
    - compatible : could be ucc_geth or fsl_atm and so on.
    - model : should be UCC.
-   - device-id : the ucc number(1-8), corresponding to UCCx in UM.
+   - cell-index : the ucc number(1-8), corresponding to UCCx in UM.
    - reg : Offset and length of the register set for the device
    - interrupts : a b where a is the interrupt number and b is a
      field that represents an encoding of the sense and level
@@ -1675,7 +1675,7 @@ platforms are moved over to use the flattened-device-tree model.
 	[EMAIL PROTECTED] {
 		device_type = "network";
 		compatible = "ucc_geth";
-		device-id = <1>;
+		cell-index = <1>;
 		reg = <2000 200>;
 		interrupts = <a0 0>;
 		interrupt-parent = <700>;
diff --git a/arch/powerpc/boot/dts/mpc832x_mds.dts b/arch/powerpc/boot/dts/mpc832x_mds.dts
index f8b4a37..539e02f 100644
--- a/arch/powerpc/boot/dts/mpc832x_mds.dts
+++ b/arch/powerpc/boot/dts/mpc832x_mds.dts
@@ -256,7 +256,6 @@
 			device_type = "network";
 			compatible = "ucc_geth";
 			cell-index = <3>;
-			device-id = <3>;
 			reg = <0x2200 0x200>;
 			interrupts = <34>;
 			interrupt-parent = <&qeic>;
@@ -271,7 +270,6 @@
 			device_type = "network";
 			compatible = "ucc_geth";
 			cell-index = <4>;
-			device-id = <4>;
 			reg = <0x3200 0x200>;
 			interrupts = <35>;
 			interrupt-parent = <&qeic>;
@@ -285,7 +283,7 @@
 		[EMAIL PROTECTED] {
 			device_type = "serial";
 			compatible = "ucc_uart";
-			device-id = <5>;	/* The UCC number, 1-7*/
+			cell-index = <5>;	/* The UCC number, 1-7*/
 			port-number = <0>;	/* Which ttyQEx device */
 			soft-uart;		/* We need Soft-UART */
 			reg = <0x2400 0x200>;
diff --git a/arch/powerpc/boot/dts/mpc832x_rdb.dts b/arch/powerpc/boot/dts/mpc832x_rdb.dts
index ea7fcbf..179c81c 100644
--- a/arch/powerpc/boot/dts/mpc832x_rdb.dts
+++
RE: [Bugme-new] [Bug 9808] New: system hung with htb QoS
Andrew Morton wrote:
> On Thu, 24 Jan 2008 03:03:11 -0800 (PST) [EMAIL PROTECTED] wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=9808
> >
> >            Summary: system hung with htb QoS
> >            Product: Networking
> >            Version: 2.5
> >      KernelVersion: 2.6.23.9
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Fedora
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Netfilter/Iptables
> >         AssignedTo: [EMAIL PROTECTED]
> >         ReportedBy: [EMAIL PROTECTED]

This user was able to eliminate the tx hang problem with htb QoS by disabling TSO. We also have some reports of xen hangs when TSO is enabled. Not sure if they are related.

At this point this user has a workaround. We will continue to try to reproduce here to see if we can root cause the hardware issue, but we haven't had any luck so far.
Re: [PATCH] [POWERPC][NET][SERIAL] UCCs: replace device-id with cell-index
Anton Vorontsov wrote:
> On Fri, Feb 01, 2008 at 09:32:38AM -0600, Kumar Gala wrote:
> > On Feb 1, 2008, at 9:01 AM, Anton Vorontsov wrote:
> > > It isn't used anywhere, so remove it. If we'll ever need something
> > > like this, we'll use compatible property instead.
> > >
> > > Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
> > > ---
> > >
> > > Rebased on top of recent tree.
> > >
> > >  Documentation/powerpc/booting-without-of.txt |    1 -
> > >  arch/powerpc/boot/dts/mpc832x_mds.dts        |    3 ---
> > >  arch/powerpc/boot/dts/mpc832x_rdb.dts        |    2 --
> > >  arch/powerpc/boot/dts/mpc836x_mds.dts        |    2 --
> > >  arch/powerpc/boot/dts/mpc8568mds.dts         |    2 --
> > >  5 files changed, 0 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
> > > index 410c847..dcf9758 100644
> > > --- a/Documentation/powerpc/booting-without-of.txt
> > > +++ b/Documentation/powerpc/booting-without-of.txt
> > > @@ -1675,7 +1675,6 @@ platforms are moved over to use the flattened-device-tree model.
> > >  	[EMAIL PROTECTED] {
> > >  		device_type = "network";
> > >  		compatible = "ucc_geth";
> > > -		model = "UCC";
> > >  		device-id = <1>;
> >
> > can we change device-id to cell-index?
>
> Sure. But let's do this in the separate patch? Because this change
> actually touches the code in the two subsystems: net and serial.
>
> I hope everybody will agree to pass it through powerpc tree..?
>
> - - - -
> From: Anton Vorontsov [EMAIL PROTECTED]
> Subject: [POWERPC][NET][SERIAL] UCCs: replace device-id with cell-index
>
> device-id is worse than cell-index. Probably cell-index isn't good
> either, but device-id is worse anyway.
>
> Drivers are modified for backward compatibility's sake.
>
> Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
> ---
>  Documentation/powerpc/booting-without-of.txt |    4 ++--
>  arch/powerpc/boot/dts/mpc832x_mds.dts        |    4 +---
>  arch/powerpc/boot/dts/mpc832x_rdb.dts        |    2 --
>  arch/powerpc/boot/dts/mpc836x_mds.dts        |    2 --
>  arch/powerpc/boot/dts/mpc836x_rdk.dts        |   12 ++--
>  arch/powerpc/boot/dts/mpc8568mds.dts         |    2 --
>  drivers/net/ucc_geth.c                       |    8 +++-
>  drivers/net/ucc_geth_mii.c                   |   11 ---
>  drivers/serial/ucc_uart.c                    |   16
>  9 files changed, 36 insertions(+), 25 deletions(-)

ACK drivers/net
Re: [IPV4 0/9] TRIE performance patches
Hello, finally got some time to test...

Table with 214k routes with full rDoS on two interfaces on 2 x AMD64 processors, speed 2814.43 MHz. Profiled with CPU_CLK_UNHALTED and rtstat.

Without the latest fib_trie patches. Throughput ~233 kpps.

samples  %        symbol name
109925   14.4513  fn_trie_lookup
109821   14.4376  ip_route_input
87245    11.4696  rt_intern_hash
31270     4.1109  kmem_cache_alloc
24159     3.1761  dev_queue_xmit
23200     3.0500  neigh_lookup
22464     2.9532  free_block
18412     2.4205  kmem_cache_free
17830     2.3440  dst_destroy
15740     2.0693  fib_get_table

With the latest fib_trie patches (Stephen's and others). Same throughput, ~233 kpps, but we see a different profile. Why we don't get any better throughput is yet to be understood (the drops in the qdisc could be the cause); more analysis is needed.

79242    14.3520  ip_route_input
65188    11.8066  fn_trie_lookup
64559    11.6927  rt_intern_hash
22901     4.1477  kmem_cache_alloc
21038     3.8103  check_leaf
16197     2.9335  neigh_lookup
14802     2.6809  free_block
14596     2.6436  ip_rcv_finish
12365     2.2395  fib_validate_source
12048     2.1821  dst_destroy

fib_hash throughput ~177 kpps. Hard work for fib_hash here as we have many zones. It can be fast with fewer zones.

200568   37.8013  fn_hash_lookup
58352    10.9977  ip_route_input
44495     8.3860  rt_intern_hash
12873     2.4262  kmem_cache_alloc
12115     2.2833  rt_may_expire
11691     2.2034  rt_garbage_collect
10821     2.0394  dev_queue_xmit
          1.8845  fib_validate_source
8762      1.6514  fib_get_table
8558      1.6129  fib_semantic_match

Cheers
--ro
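To put the throughput figures in per-packet terms, here is a small sketch (not part of the original mail) that converts a packet rate into clock cycles per packet and compares two rates. The helper names are made up for illustration. It looks at one CPU's clock in isolation, which is a simplification for a 2-processor box.

```c
/* Clock cycles of one CPU that elapse per forwarded packet. */
static double cycles_per_packet(double cpu_hz, double packets_per_sec)
{
	return cpu_hz / packets_per_sec;
}

/* Relative throughput gain of new_pps over old_pps, in percent. */
static double speedup_percent(double new_pps, double old_pps)
{
	return (new_pps / old_pps - 1.0) * 100.0;
}
```

With the numbers quoted above, cycles_per_packet(2814.43e6, 233e3) is roughly 12,100 cycles for fib_trie and cycles_per_packet(2814.43e6, 177e3) roughly 15,900 for fib_hash, while speedup_percent(233e3, 177e3) puts fib_trie about 32% ahead on this table.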
Re: [PATCH 1/5] ehea: fix ehea.h checkpatch complaints
Doug Maxey wrote:
> Cc: Jan-Bernd Themann [EMAIL PROTECTED]
> Signed-off-by: Doug Maxey [EMAIL PROTECTED]
> ---
>  drivers/net/ehea/ehea.h    |    3 +++
>  drivers/net/ehea/ehea_hw.h |    8
>  2 files changed, 7 insertions(+), 4 deletions(-)

applied 1-5
Re: [PATCH for 2.6.25 1/2] [NET] ucc_geth: fix module removal
Anton Vorontsov wrote:
> - uccf should be set to NULL to not double-free memory on
>   subsequent calls;
> - ind_hash_q and group_hash_q lists should be initialized in the
>   probe() function, instead of struct_init() (called by open()),
>   otherwise there will be an oops if ucc_geth_driver removed
>   prior 'ifconfig ethX up';
> - add unregister_netdev();
> - reorder geth_remove() steps.
>
> Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
> ---
>
> Hi Li,
>
> You kinda promised that these two patches would hit 2.6.25... ;-)
> I've rebased the patches so they apply cleanly on the current tree.
>
> Thanks,
>
>  drivers/net/ucc_geth.c |   17 ++---
>  1 files changed, 10 insertions(+), 7 deletions(-)

applied 1-2
Re: [PATCH] ehea: fix sysfs link compile problem
Jan-Bernd Themann wrote:
> Due to changes in the struct device_driver there is no direct
> access to its kobj any longer. The kobj was used to create
> sysfs links between eHEA ethernet devices and the driver.
> This patch removes the affected sysfs links to resolve
> the build problems.
>
> Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
> ---
>  drivers/net/ehea/ehea_main.c |   37 -
>  1 files changed, 0 insertions(+), 37 deletions(-)

applied
Re: NET: AX88796 use dev_dbg() instead of printk()
applied
Re: [2.6 patch] via-rhine.c:rhine_hw_init() must be __devinit
applied the net driver portion of these patches...
Re: rtl8150: use default MTU of 1500
Lennert Buytenhek wrote:
> The RTL8150 driver uses an MTU of 1540 by default, which causes a
> bunch of problems -- it prevents booting from NFS root, for one.
>
> Signed-off-by: Lennert Buytenhek [EMAIL PROTECTED]
> Cc: Petko Manolov [EMAIL PROTECTED]
>
> --- linux-2.6.24-git7.orig/drivers/net/usb/rtl8150.c	2008-01-24 23:58:37.0 +0100
> +++ linux-2.6.24-git7/drivers/net/usb/rtl8150.c	2008-01-30 20:29:00.0 +0100
> @@ -925,9 +925,8 @@
>  	netdev->hard_start_xmit = rtl8150_start_xmit;
>  	netdev->set_multicast_list = rtl8150_set_multicast;
>  	netdev->set_mac_address = rtl8150_set_mac_address;
>  	netdev->get_stats = rtl8150_netdev_stats;
> -	netdev->mtu = RTL8150_MTU;
>  	SET_ETHTOOL_OPS(netdev, &ops);
>  	dev->intr_interval = 100;	/* 100ms */
>
>  	if (!alloc_all_urbs(dev)) {

applied
Re: [PATCH] macb: Fix section mismatch and shrink runtime footprint
Haavard Skinnemoen wrote: macb devices are only found integrated on SoCs, so they can't be hotplugged. Thus, the probe() and exit() functions can be __init and __exit, respectively. By using platform_driver_probe() instead of platform_driver_register(), there won't be any references to the discarded probe() function after the driver has loaded. This also fixes a section mismatch due to macb_probe(), defined as __devinit, calling macb_get_hwaddr, defined as __init. Signed-off-by: Haavard Skinnemoen [EMAIL PROTECTED] --- drivers/net/macb.c |9 - 1 files changed, 4 insertions(+), 5 deletions(-) applied -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2.6.24] e1000e: add new wakeup capabilities
Mitch Williams wrote: Ethtool supports wake-on-ARP and wake-on-link, and so does the hardware supported by e1000e. This patch just introduces the two. Signed-off-by: Mitch Williams [EMAIL PROTECTED] applied
Re: [PATCH] PHYLIB: Add BCM5482 PHY support
Nate Case wrote: This Broadcom PHY is similar to other bcm54xx devices. Signed-off-by: Nate Case [EMAIL PROTECTED] --- drivers/net/phy/broadcom.c | 20 1 files changed, 20 insertions(+), 0 deletions(-) applied -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch 2.6.24 1/3]S2io: Support for vlan_rx_kill_vid entry point
Sreenivasa Honnur wrote: - Added s2io_vlan_rx_kill_vid entry point function for unregistering vlan. - Fix to aggregate vlan packets. IP offset is incremented by 4 bytes if the packet contains vlan header. Signed-off-by: Surjit Reang [EMAIL PROTECTED] Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED] applied 1-2, patch #3 failed to apply -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] pasemi_mac: Add support for changing mac address
Olof Johansson wrote: Straightforward. It used to be hardcoded and impossible to override with ifconfig. Signed-off-by: Olof Johansson [EMAIL PROTECTED] applied 1-3
Re: [PATCH 1/6] e1000e: make a function static
applied 1-6
Re: [2.6 patch] remove obsolete tokenring maintainer information
Adrian Bunk wrote: - Peter's email address is bouncing - the project webpage no longer exists - neither Peter nor Mike had a single patch included in the kernel since 2.6.12-rc2 (when the git history begins) Signed-off-by: Adrian Bunk [EMAIL PROTECTED] --- MAINTAINERS | 23 --- 1 file changed, 23 deletions(-) applied -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] ehea: fix ehea.h checkpatch complaints
Jeff Garzik wrote: Doug Maxey wrote: Cc: Jan-Bernd Themann [EMAIL PROTECTED] Signed-off-by: Doug Maxey [EMAIL PROTECTED] --- drivers/net/ehea/ehea.h|3 +++ drivers/net/ehea/ehea_hw.h |8 2 files changed, 7 insertions(+), 4 deletions(-) applied 1-5 BTW please do not reply to [EMAIL PROTECTED] That was a Thunderbird fsck-up on my side. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[BUILD FAILURE]2.6.24-git10 section type conflict
Hi All, I found the following kernel build failure for 2.6.24-git10 on my machine: CC [M] drivers/scsi/lpfc/lpfc_attr.o drivers/net/sis190.c:329: error: sis190_pci_tbl causes a section type conflict make[2]: *** [drivers/net/sis190.o] Error 1 make[1]: *** [drivers/net] Error 2 make[1]: *** Waiting for unfinished jobs CC [M] drivers/scsi/lpfc/lpfc_vport.o machine info: [EMAIL PROTECTED] ~]# uname -a Linux x330d.in.ibm.com 2.6.24-rc8 #1 SMP Thu Jan 31 16:46:34 NPT 2008 i686 i686 i386 GNU/Linux Thanks Sudhir Kumar Linux Technology Centre Bangalore -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] Disable TSO for non standard qdiscs
Right - essentially it is a usability issue: people who know how to use TSO (Peter, for example) will be clueful enough to turn it on, which means the default should be to protect the clueless and turn it off. On Andi's approach: turning TSO off at netdev registration time with a warning would be cleaner IMO. Or, alternatively, introducing a kernel-config "I know what TSO is" option which is then used at netdev registration. From a usability perspective it would make more sense to just keep ethtool as the only way to configure TSO. To me this sounds like the most reasonable approach. I've put out my concerns, so I'll get out of the way now so we can move forward in some direction. :-) Cheers, -PJ Waskiewicz
Re: [BUILD FAILURE]2.6.24-git10 section type conflict
On Fri, Feb 01, 2008 at 11:56:27PM +0530, Sudhir Kumar wrote: Hi All, I found the following kernel build failure for 2.6.24-git10 on my machine: CC [M] drivers/scsi/lpfc/lpfc_attr.o drivers/net/sis190.c:329: error: sis190_pci_tbl causes a section type conflict make[2]: *** [drivers/net/sis190.o] Error 1 make[1]: *** [drivers/net] Error 2 make[1]: *** Waiting for unfinished jobs CC [M] drivers/scsi/lpfc/lpfc_vport.o Known issue with fix already pushed upstream. A workaround is to delete __devinitdata from the corresponding .c (sis190.c) file. Sam -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 0/7] s390: ctc patches for 2.6.25
This patchset breaks git-bisect, by creating interim unbuildable states...
RE: Lots of BUG eth1 code -5 qlen 0 messages in 2.6.24
I've changed the -EIO into NETDEV_TX_BUSY and so far I can't trigger the bug anymore. It was quite easy to trigger within minutes with rsync, but I can't trigger it anymore. Should I send a patch, and if so, to whom? The tulip/xircom_cb driver seems to be orphaned. Perhaps Jeff Garzik is a good place to start. He maintains the driver tree; he can either take the patch or direct you to where it should go. No need to test that, it *is* netif_stopped before the return: /* Uh oh... no free descriptor... drop the packet */ netif_stop_queue(dev); spin_unlock_irqrestore(&card->lock,flags); trigger_transmit(card); return NETDEV_TX_BUSY; trigger_transmit() is a simple function that just writes a single register on the card to trigger a transmit. Eek! No need to make this smarter at this point, but this is pretty basic. Cheers, -PJ Waskiewicz
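The pattern under discussion can be modelled in a few lines of userspace C: when the transmit ring has no free descriptor, stop the queue and report "busy" so that the caller keeps the packet and retries later, instead of returning an error such as -EIO. This is only a sketch of the control flow. Every name here (the toy_ prefix, the ring size, the status values) is hypothetical and is not taken from xircom_cb.

```c
#define TOY_RING_SIZE 4	/* hypothetical number of TX descriptors */

/* Hypothetical status codes, standing in for the NETDEV_TX_* values. */
enum toy_tx_status { TOY_TX_OK = 0, TOY_TX_BUSY = 1 };

struct toy_card {
	int used;		/* descriptors currently in flight */
	int queue_stopped;	/* models the netif_stop_queue() state */
};

/* Models hard_start_xmit(): queue a packet or ask the stack to retry. */
enum toy_tx_status toy_start_xmit(struct toy_card *card)
{
	if (card->used == TOY_RING_SIZE) {
		/* No free descriptor: stop the queue, packet stays upstream. */
		card->queue_stopped = 1;
		return TOY_TX_BUSY;
	}
	card->used++;
	return TOY_TX_OK;
}

/* Models the TX-complete interrupt: free a descriptor, wake the queue. */
void toy_tx_done(struct toy_card *card)
{
	if (card->used > 0)
		card->used--;
	card->queue_stopped = 0;
}
```

A more refined driver would typically stop the queue as soon as the last descriptor is consumed, so the busy path is rarely taken; that is presumably the "smarter" variant alluded to above.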
Re: Typo in net/netfilter/xt_iprange.c (git tree)
Forward to netdev list. --- Forwarded message (begin) Subject: Typo in net/netfilter/xt_iprange.c (git tree) From: Jiri Moravec ... Date: Fri, 01 Feb 2008 15:50:15 +0100 Function iprange_mt4 belongs to the IPv4 family - AF_INET. Right? .name = "iprange", .revision = 1, .family = AF_INET6, <-- Typo? .match = iprange_mt4, .matchsize = sizeof(struct xt_iprange_mtinfo), .me = THIS_MODULE, --- Forwarded message (end)
Re: [PATCH 4/5] ehea: fix phyp checkpatch complaints
On Thu, Jan 31, 2008 at 08:20:50PM -0600, Doug Maxey wrote: /* input param R5 */ -#define H_ALL_RES_QP_EQPO EHEA_BMASK_IBM(9, 11) -#define H_ALL_RES_QP_QPP EHEA_BMASK_IBM(12, 12) -#define H_ALL_RES_QP_RQR EHEA_BMASK_IBM(13, 15) -#define H_ALL_RES_QP_EQEG EHEA_BMASK_IBM(16, 16) -#define H_ALL_RES_QP_LL_QPEHEA_BMASK_IBM(17, 17) -#define H_ALL_RES_QP_DMA128 EHEA_BMASK_IBM(19, 19) -#define H_ALL_RES_QP_HSM EHEA_BMASK_IBM(20, 21) -#define H_ALL_RES_QP_SIGT EHEA_BMASK_IBM(22, 23) -#define H_ALL_RES_QP_TENURE EHEA_BMASK_IBM(48, 55) -#define H_ALL_RES_QP_RES_TYP EHEA_BMASK_IBM(56, 63) +#define H_ALL_RES_QP_EQPO EHEA_BMASK_IBM(9, 11) +#define H_ALL_RES_QP_QPP EHEA_BMASK_IBM(12, 12) +#define H_ALL_RES_QP_RQR EHEA_BMASK_IBM(13, 15) +#define H_ALL_RES_QP_EQEG EHEA_BMASK_IBM(16, 16) +#define H_ALL_RES_QP_LL_QP EHEA_BMASK_IBM(17, 17) +#define H_ALL_RES_QP_DMA128EHEA_BMASK_IBM(19, 19) +#define H_ALL_RES_QP_HSM EHEA_BMASK_IBM(20, 21) +#define H_ALL_RES_QP_SIGT EHEA_BMASK_IBM(22, 23) +#define H_ALL_RES_QP_TENUREEHEA_BMASK_IBM(48, 55) +#define H_ALL_RES_QP_RES_TYP EHEA_BMASK_IBM(56, 63) This was better the way it was (before, it was readable at any tab setting); checkpatch is overeager to complain on tab/space issues (it's a bit hard to distinguish indentation from alignment with a regex). -Scott -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH]: TCP + SACK + skb processing latency
On Fri, 1 Feb 2008, Ilpo Järvinen wrote: @@ -1322,6 +1324,206 @@ static int tcp_sacktag_one(struct sk_buff *skb, struct sock *sk, return flag; } +/* Attempts to shift up to shiftlen worth of bytes from prev to skb. + * Returns number bytes shifted. + * + * TODO: in case the prev runs out of frag space, operation could be + * made to return with a partial result (would allow tighter packing). + */ +static int skb_shift(struct sk_buff *prev, struct sk_buff *skb, + unsigned int shiftlen) +{ + int i, to, merge; + unsigned int todo; + struct skb_frag_struct *from, *fragto; + + if (skb_cloned(skb) || skb_cloned(prev)) + return 0; + + todo = shiftlen; + i = 0; + from = skb_shinfo(skb)-frags[i]; + to = skb_shinfo(prev)-nr_frags; + + merge = to - 1; + if (!skb_can_coalesce(prev, merge + 1, from-page, from-page_offset)) + merge = -1; + if (merge = 0) { + i++; + if (from-size = shiftlen) + goto onlymerge; + todo -= from-size; + } + + /* Skip full, not-fitting skb to avoid expensive operations */ + if ((shiftlen == skb-len) + (skb_shinfo(skb)-nr_frags - merge) (MAX_SKB_FRAGS - to)) Before somebody else notices it: s/merge/i/, a leftover from flag - idx conversion. + return 0; + + while (todo (i skb_shinfo(skb)-nr_frags)) { + if (to == MAX_SKB_FRAGS) + return 0; + -- i.
Re: e1000 full-duplex TCP performance well below wire speed
Hi all Rick Jones wrote: 2) use the aforementioned burst TCP_RR test. This is then a single netperf with data flowing both ways on a single connection so no issue of skew, but perhaps an issue of being one connection and so one process on each end. Since our major goal is to establish a reliable way to test duplex connections this looks like a very good choice. Right now we just run this as a back-to-back test (cable connecting two hosts), but want to move to a high performance network with up to three switches between hosts. For this we want to have a stable test. I doubt that I will be able to finish the tests tonight, but I'll post a follow-up on Monday at the latest. Have a nice weekend and thanks a lot for all the suggestions so far! Cheers Carsten
R8101 driver link issue
L.S. I'm using Linux primarily through Novell ZEN as part of its imaging system. A standard problem is the arrival of new network cards. Procedures to add new drivers to the boot image are well documented. Recently I got problems with a new motherboard using the Realtek RTL8101E PCI Express Fast Ethernet controller. This card is recognised by the r8169 driver but it always gives me link down. To troubleshoot this issue I installed SLED10SP1 on such a machine, which uses the same kernel as the ZEN imaging environment. So I searched for drivers to compile and came across the r8101-1.006.tar.bz2 driver. -1 To get this driver compiled, I have to edit the source at line 3273: u32 mss = skb_shinfo(skb)->tso_size; replace tso_size -> gso_size (or simply remove the if else endif construction altogether :-) -2 Unfortunately this new driver, although it gets loaded, still reports (ethtool eth0) current message level : 0x0..033 (51) Link detected : no I hope I'm writing to the right person to help solve this issue. This link detection issue seems to have troubled Realtek drivers for a long time. -- Thomas Roes Switact Kvk: Leiden 28100283
[PATCH 3/3] Generic HDLC - use random_ether_addr()
Generic HDLC now uses random_ether_addr() for generating MAC addresses for Ethernet-alike interfaces.
Signed-off-by: Krzysztof Halasa [EMAIL PROTECTED]
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 071a64c..7926842 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -42,7 +42,6 @@
 #include <linux/init.h>
 #include <linux/skbuff.h>
 #include <linux/pkt_sched.h>
-#include <linux/random.h>
 #include <linux/inetdevice.h>
 #include <linux/lapb.h>
 #include <linux/rtnetlink.h>
@@ -1122,10 +1121,9 @@ static int fr_add_pvc(struct net_device *frad, unsigned int dlci, int type)
 		return -ENOBUFS;
 	}
 
-	if (type == ARPHRD_ETHER) {
-		memcpy(dev->dev_addr, "\x00\x01", 2);
-		get_random_bytes(dev->dev_addr + 2, ETH_ALEN - 2);
-	} else {
+	if (type == ARPHRD_ETHER)
+		random_ether_addr(dev->dev_addr);
+	else {
 		*(__be16*)dev->dev_addr = htons(dlci);
 		dlci_to_q922(dev->broadcast, dlci);
 	}
diff --git a/drivers/net/wan/hdlc_raw_eth.c b/drivers/net/wan/hdlc_raw_eth.c
index 1a69a9a..104de6f 100644
--- a/drivers/net/wan/hdlc_raw_eth.c
+++ b/drivers/net/wan/hdlc_raw_eth.c
@@ -18,7 +18,6 @@
 #include <linux/init.h>
 #include <linux/skbuff.h>
 #include <linux/pkt_sched.h>
-#include <linux/random.h>
 #include <linux/inetdevice.h>
 #include <linux/lapb.h>
 #include <linux/rtnetlink.h>
@@ -107,8 +106,7 @@ static int raw_eth_ioctl(struct net_device *dev, struct ifreq *ifr)
 	ether_setup(dev);
 	dev->change_mtu = old_ch_mtu;
 	dev->tx_queue_len = old_qlen;
-	memcpy(dev->dev_addr, "\x00\x01", 2);
-	get_random_bytes(dev->dev_addr + 2, ETH_ALEN - 2);
+	random_ether_addr(dev->dev_addr);
 	netif_dormant_off(dev);
 	return 0;
 }
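For readers unfamiliar with the helper: random_ether_addr() fills the address with random bytes, then clears the multicast bit and sets the locally administered bit in the first octet. Below is a minimal userspace sketch of that idea. The function name sketch_random_ether_addr is hypothetical, and rand() is only a stand-in for the kernel's get_random_bytes().

```c
#include <stdlib.h>

#define ETH_ALEN 6	/* octets in an Ethernet address */

/* Hypothetical userspace stand-in for the kernel helper. */
void sketch_random_ether_addr(unsigned char *addr)
{
	int i;

	/* Assumption: rand() replaces the kernel's get_random_bytes(). */
	for (i = 0; i < ETH_ALEN; i++)
		addr[i] = (unsigned char)(rand() & 0xff);

	addr[0] &= 0xfe;	/* clear the multicast (group) bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}
```

Compared with the old fixed "\x00\x01" prefix followed by four random bytes, an address built this way is explicitly marked as locally administered, so it should not collide with a vendor-assigned address.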
Re: [PATCH] Disable TSO for non standard qdiscs
On Fri, Feb 01, 2008 at 01:28:15AM -0800, Waskiewicz Jr, Peter P wrote: ... The TCP layer will generate TSO packets based on the kernel socket features associated with the flow. So if you have two devices, one supporting TSO, the other not, then the flows associated with the non-TSO device will not have their packets built for TSO. This has no bearing on the device supporting TSO, whose feature flags will propagate into the kernel socket for that flow, and cause any TCP flows to that device to be TSO packets. So in a nutshell, disabling TSO is on a per-device level, not a global switch. Fine, but I was rather wondering if there could be something more in the idea of this patch that can't be done with ethtool. And I don't think qdisc code currently treats or should treat TCP special. Jarek P.
Re: [PATCH] Disable TSO for non standard qdiscs
Does this also imply that JumboFrames interacts badly with these qdiscs? Or IPoIB with its 65000ish byte MTU? Correct. Of course it is always relative to the link speed. So if your link is 10x faster and your packets 10x bigger you can get similarly smooth shaping. If the later-in-thread mentioned person shaping for their DSL line happens to have enabled JumboFrames on their GbE network, will/should the qdisc negate that? Or is the qdisc currently assuming that the remote end of the DSL will have asked for a smaller MSS? rick jones
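To put numbers on "relative to the link speed": the time one packet occupies the wire is its size divided by the link rate, and that is the granularity a shaper has to work with. A quick sketch follows; the helper name and the example rates are illustrative assumptions, not figures from the thread.

```c
/* Microseconds needed to serialize `bytes` on a link of `bits_per_sec`. */
double wire_time_us(double bytes, double bits_per_sec)
{
	return bytes * 8.0 / bits_per_sec * 1e6;
}
```

With this formula, a 1500-byte frame at 1 Gbit/s and a 15000-byte frame at 10 Gbit/s both take about 12 us, which is the "10x faster, 10x bigger" case. By contrast, a single 64 KB burst on an assumed 1 Mbit/s DSL uplink would hold the line for roughly half a second.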
[PATCH 2/3] Generic HDLC - remove now unneeded hdlc_device_desc
Removes now unneeded struct hdlc_device_desc Signed-off-by: Krzysztof Halasa [EMAIL PROTECTED] diff --git a/drivers/net/wan/hdlc.c b/drivers/net/wan/hdlc.c index d553e6f..39951d0 100644 --- a/drivers/net/wan/hdlc.c +++ b/drivers/net/wan/hdlc.c @@ -1,7 +1,7 @@ /* * Generic HDLC support routines for Linux * - * Copyright (C) 1999 - 2006 Krzysztof Halasa [EMAIL PROTECTED] + * Copyright (C) 1999 - 2008 Krzysztof Halasa [EMAIL PROTECTED] * * This program is free software; you can redistribute it and/or modify it * under the terms of version 2 of the GNU General Public License @@ -39,7 +39,7 @@ #include net/net_namespace.h -static const char* version = HDLC support module revision 1.21; +static const char* version = HDLC support module revision 1.22; #undef DEBUG_LINK @@ -66,19 +66,15 @@ static struct net_device_stats *hdlc_get_stats(struct net_device *dev) static int hdlc_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *p, struct net_device *orig_dev) { - struct hdlc_device_desc *desc = dev_to_desc(dev); + struct hdlc_device *hdlc = dev_to_hdlc(dev); if (dev-nd_net != init_net) { kfree_skb(skb); return 0; } - if (desc-netif_rx) - return desc-netif_rx(skb); - - desc-stats.rx_dropped++; /* Shouldn't happen */ - dev_kfree_skb(skb); - return NET_RX_DROP; + BUG_ON(!hdlc-proto-netif_rx); + return hdlc-proto-netif_rx(skb); } @@ -87,7 +83,7 @@ static inline void hdlc_proto_start(struct net_device *dev) { hdlc_device *hdlc = dev_to_hdlc(dev); if (hdlc-proto-start) - return hdlc-proto-start(dev); + hdlc-proto-start(dev); } @@ -96,7 +92,7 @@ static inline void hdlc_proto_stop(struct net_device *dev) { hdlc_device *hdlc = dev_to_hdlc(dev); if (hdlc-proto-stop) - return hdlc-proto-stop(dev); + hdlc-proto-stop(dev); } @@ -263,8 +259,7 @@ static void hdlc_setup(struct net_device *dev) struct net_device *alloc_hdlcdev(void *priv) { struct net_device *dev; - dev = alloc_netdev(sizeof(struct hdlc_device_desc) + - sizeof(hdlc_device), hdlc%d, hdlc_setup); + dev = 
alloc_netdev(sizeof(struct hdlc_device), hdlc%d, hdlc_setup); if (dev) dev_to_hdlc(dev)-priv = priv; return dev; @@ -281,7 +276,7 @@ void unregister_hdlc_device(struct net_device *dev) int attach_hdlc_protocol(struct net_device *dev, struct hdlc_proto *proto, -int (*rx)(struct sk_buff *skb), size_t size) +size_t size) { detach_hdlc_protocol(dev); @@ -297,7 +292,6 @@ int attach_hdlc_protocol(struct net_device *dev, struct hdlc_proto *proto, return -ENOBUFS; } dev_to_hdlc(dev)-proto = proto; - dev_to_desc(dev)-netif_rx = rx; return 0; } diff --git a/drivers/net/wan/hdlc_cisco.c b/drivers/net/wan/hdlc_cisco.c index 038a6e7..7133c68 100644 --- a/drivers/net/wan/hdlc_cisco.c +++ b/drivers/net/wan/hdlc_cisco.c @@ -250,7 +250,7 @@ static int cisco_rx(struct sk_buff *skb) return NET_RX_DROP; rx_error: - dev_to_desc(dev)-stats.rx_errors++; /* Mark error */ + dev_to_hdlc(dev)-stats.rx_errors++; /* Mark error */ dev_kfree_skb_any(skb); return NET_RX_DROP; } @@ -314,6 +314,7 @@ static struct hdlc_proto proto = { .stop = cisco_stop, .type_trans = cisco_type_trans, .ioctl = cisco_ioctl, + .netif_rx = cisco_rx, .module = THIS_MODULE, }; @@ -360,7 +361,7 @@ static int cisco_ioctl(struct net_device *dev, struct ifreq *ifr) if (result) return result; - result = attach_hdlc_protocol(dev, proto, cisco_rx, + result = attach_hdlc_protocol(dev, proto, sizeof(struct cisco_state)); if (result) return result; diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index 071a64c..ccd11be 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -957,7 +957,7 @@ static int fr_rx(struct sk_buff *skb) if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL) { - dev_to_desc(frad)-stats.rx_dropped++; + dev_to_hdlc(frad)-stats.rx_dropped++; return NET_RX_DROP; } @@ -1018,7 +1018,7 @@ static int fr_rx(struct sk_buff *skb) } rx_error: - dev_to_desc(frad)-stats.rx_errors++; /* Mark error */ + dev_to_hdlc(frad)-stats.rx_errors++; /* Mark error */ dev_kfree_skb_any(skb); return 
NET_RX_DROP; } @@ -1219,6 +1219,7 @@ static struct hdlc_proto proto = { .stop = fr_stop, .detach = fr_destroy,
[WAN] Generic HDLC - three patches
Hi Jeff, Three patches - the first one fixes a kernel panic in Frame Relay mode (a regression present in 2.6.23 and 2.6.24, noticed only recently). The second one removes the now unneeded struct proliferation. The third one - Ethernet encapsulations now use random_ether_addr() instead of my old invention. PPP (hdlc_ppp using syncppp) is still broken (by the same change as FR); I guess the correct fix now is rewriting syncppp (for hdlc_ppp use). No ETA :-( Tested in action. -- Krzysztof Halasa
[PATCH 1/3] Generic HDLC - fix kernel panic
Fixes kernel panic in Frame Relay mode Signed-off-by: Krzysztof Halasa [EMAIL PROTECTED] diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index 7926842..327b218 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -135,6 +135,10 @@ typedef struct pvc_device_struct { }state; }pvc_device; +struct pvc_desc { + struct net_device_stats stats; + pvc_device *pvc; +}; struct frad_state { fr_proto settings; @@ -170,17 +174,20 @@ static inline void dlci_to_q922(u8 *hdr, u16 dlci) } -static inline struct frad_state * state(hdlc_device *hdlc) +static inline struct frad_state* state(hdlc_device *hdlc) { return(struct frad_state *)(hdlc-state); } - -static __inline__ pvc_device* dev_to_pvc(struct net_device *dev) +static inline struct pvc_desc* pvcdev_to_desc(struct net_device *dev) { return dev-priv; } +static inline struct net_device_stats* pvc_get_stats(struct net_device *dev) +{ + return pvcdev_to_desc(dev)-stats; +} static inline pvc_device* find_pvc(hdlc_device *hdlc, u16 dlci) { @@ -350,7 +357,7 @@ static int fr_hard_header(struct sk_buff **skb_p, u16 dlci) static int pvc_open(struct net_device *dev) { - pvc_device *pvc = dev_to_pvc(dev); + pvc_device *pvc = pvcdev_to_desc(dev)-pvc; if ((pvc-frad-flags IFF_UP) == 0) return -EIO; /* Frad must be UP in order to activate PVC */ @@ -370,7 +377,7 @@ static int pvc_open(struct net_device *dev) static int pvc_close(struct net_device *dev) { - pvc_device *pvc = dev_to_pvc(dev); + pvc_device *pvc = pvcdev_to_desc(dev)-pvc; if (--pvc-open_count == 0) { hdlc_device *hdlc = dev_to_hdlc(pvc-frad); @@ -389,7 +396,7 @@ static int pvc_close(struct net_device *dev) static int pvc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) { - pvc_device *pvc = dev_to_pvc(dev); + pvc_device *pvc = pvcdev_to_desc(dev)-pvc; fr_proto_pvc_info info; if (ifr-ifr_settings.type == IF_GET_PROTO) { @@ -415,17 +422,9 @@ static int pvc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) return -EINVAL; } - 
-static inline struct net_device_stats *pvc_get_stats(struct net_device *dev) -{ - return dev_to_desc(dev)-stats; -} - - - static int pvc_xmit(struct sk_buff *skb, struct net_device *dev) { - pvc_device *pvc = dev_to_pvc(dev); + pvc_device *pvc = pvcdev_to_desc(dev)-pvc; struct net_device_stats *stats = pvc_get_stats(dev); if (pvc-state.active) { @@ -1108,11 +1107,10 @@ static int fr_add_pvc(struct net_device *frad, unsigned int dlci, int type) used = pvc_is_used(pvc); if (type == ARPHRD_ETHER) - dev = alloc_netdev(sizeof(struct net_device_stats), - pvceth%d, ether_setup); + dev = alloc_netdev(sizeof(struct pvc_desc), pvceth%d, + ether_setup); else - dev = alloc_netdev(sizeof(struct net_device_stats), - pvc%d, pvc_setup); + dev = alloc_netdev(sizeof(struct pvc_desc), pvc%d, pvc_setup); if (!dev) { printk(KERN_WARNING %s: Memory squeeze on fr_pvc()\n, @@ -1135,7 +1133,7 @@ static int fr_add_pvc(struct net_device *frad, unsigned int dlci, int type) dev-change_mtu = pvc_change_mtu; dev-mtu = HDLC_MAX_MTU; dev-tx_queue_len = 0; - dev-priv = pvc; + pvcdev_to_desc(dev)-pvc = pvc; result = dev_alloc_name(dev, dev-name); if (result 0) { -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/5] [IPV6]: Reorg struct ifmcaddr6 to save some bytes
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] /home/acme/git/net-2.6/net/ipv6/mcast.c: struct ifmcaddr6 | -8 1 struct changed igmp6_group_dropped | -6 add_grec | -3 mld_ifc_timer_expire | -18 ip6_mc_add_src | -3 ip6_mc_del_src | -3 igmp6_group_added| -3 6 functions changed, 36 bytes removed, diff: -36 ipv6.ko: 6 functions changed, 36 bytes removed, diff: -36 Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/net/if_inet6.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index 66c43e2..b2cfc49 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -112,13 +112,13 @@ struct ifmcaddr6 struct ip6_sf_list *mca_sources; struct ip6_sf_list *mca_tomb; unsigned intmca_sfmode; + unsigned char mca_crcount; unsigned long mca_sfcount[2]; struct timer_list mca_timer; unsignedmca_flags; int mca_users; atomic_tmca_refcnt; spinlock_t mca_lock; - unsigned char mca_crcount; unsigned long mca_cstamp; unsigned long mca_tstamp; }; -- 1.5.3.8 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] [INET6]: Reorganize struct inet6_dev to save 8 bytes
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
And make it a multiple of 64 bytes, reducing cacheline thrashing:
Before:
[EMAIL PROTECTED] net-2.6]$ pahole -C inet6_dev net/dccp/ipv6.o
struct inet6_dev {
	SNIP
	long unsigned int	mc_maxdelay;	/*    48     8 */
	unsigned char		mc_qrv;		/*    56     1 */
	unsigned char		mc_gq_running;	/*    57     1 */
	unsigned char		mc_ifc_count;	/*    58     1 */
	/* XXX 5 bytes hole, try to pack */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct timer_list	mc_gq_timer;	/*    64    48 */
	SNIP
	__u32			if_flags;	/*   180     4 */
	int			dead;		/*   184     4 */
	u8			rndid[8];	/*   188     8 */
	/* XXX 4 bytes hole, try to pack */
	/* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */
	struct timer_list	regen_timer;	/*   200    48 */
	SNIP
	/* size: 456, cachelines: 8 */
	/* sum members: 447, holes: 2, sum holes: 9 */
	/* last cacheline: 8 bytes */
};
After:
net-2.6/net/ipv6/af_inet6.c:
 struct inet6_dev | -8
 1 struct changed
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/net/if_inet6.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index b24508a..66c43e2 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -166,11 +166,11 @@ struct inet6_dev
 	struct ifmcaddr6	*mc_list;
 	struct ifmcaddr6	*mc_tomb;
 	rwlock_t		mc_lock;
-	unsigned long		mc_v1_seen;
-	unsigned long		mc_maxdelay;
 	unsigned char		mc_qrv;
 	unsigned char		mc_gq_running;
 	unsigned char		mc_ifc_count;
+	unsigned long		mc_v1_seen;
+	unsigned long		mc_maxdelay;
 	struct timer_list	mc_gq_timer;	/* general query timer */
 	struct timer_list	mc_ifc_timer;	/* interface change timer */
--
1.5.3.8
-- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
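The kind of hole pahole reports here is easy to reproduce in userspace. The sketch below assumes a typical LP64 ABI, where unsigned long is 8 bytes and 8-byte aligned. The struct and member names are made up for illustration and are not the real inet6_dev layout.

```c
#include <stddef.h>

/* Hypothetical layout with holes: 4-byte lock, then longs, then chars. */
struct before {
	unsigned int	lock;		/* on LP64: 4 bytes of padding follow */
	unsigned long	v1_seen;
	unsigned long	maxdelay;
	unsigned char	qrv;
	unsigned char	gq_running;
	unsigned char	ifc_count;	/* on LP64: 5 bytes of tail padding */
};

/* Same members; the chars now sit in the gap after the lock. */
struct after {
	unsigned int	lock;
	unsigned char	qrv;
	unsigned char	gq_running;
	unsigned char	ifc_count;	/* on LP64: 1 byte of padding follows */
	unsigned long	v1_seen;
	unsigned long	maxdelay;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t bytes_saved(void)
{
	return sizeof(struct before) - sizeof(struct after);
}
```

On an LP64 target the first layout occupies 32 bytes and the second 24, so moving the three chars next to the 4-byte member recovers 8 bytes, the same kind of saving the patch reports. On a 32-bit ABI the two layouts may well come out the same size.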
[PATCH 3/5] [DCCP]: Reorganize struct dccp_sock to save 8 bytes
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] /home/acme/git/net-2.6/net/dccp/ipv6.c: struct dccp_sock | -8 struct dccp6_sock | -8 2 structs changed Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/linux/dccp.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/include/linux/dccp.h b/include/linux/dccp.h index 484e45c..aa07370 100644 --- a/include/linux/dccp.h +++ b/include/linux/dccp.h @@ -525,6 +525,7 @@ struct dccp_sock { __u64 dccps_gsr; __u64 dccps_gar; __be32 dccps_service; + __u32 dccps_mss_cache; struct dccp_service_list*dccps_service_list; __u32 dccps_timestamp_echo; __u32 dccps_timestamp_time; @@ -533,7 +534,6 @@ struct dccp_sock { __u16 dccps_pcslen; __u16 dccps_pcrlen; unsigned long dccps_ndp_count; - __u32 dccps_mss_cache; unsigned long dccps_rate_last; struct dccp_minisockdccps_minisock; struct dccp_ackvec *dccps_hc_rx_ackvec; -- 1.5.3.8 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHES 0/5]: Add hashinfo member to struct proto and get net/ structs back on a diet
Hi David, Please consider pulling from: master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25 There are many more structs that have holes, they seem to crop up, even on DCCP! People should use some paholing 8-) I'll prepare another round after carnival, and also will look at using sparse and/or Uli's libdisasm and/or DeHydra to find struct member accesses in functions to correlate that with struct layout when doing automatic struct member reorgs, so as not to step into Eric's fast path 8-P Best Regards, - Arnaldo include/linux/dccp.h |2 - include/net/if_inet6.h |6 +-- include/net/inet6_hashtables.h |2 - include/net/inet_connection_sock.h |8 + include/net/inet_hashtables.h | 51 +--- include/net/inet_timewait_sock.h |2 - include/net/sock.h |3 + net/dccp/dccp.h|2 - net/dccp/ipv4.c| 18 --- net/dccp/ipv6.c| 20 +--- net/dccp/proto.c | 18 +-- net/ipv4/inet_connection_sock.c|8 + net/ipv4/inet_hashtables.c | 58 ++--- net/ipv4/tcp.c |2 - net/ipv4/tcp_ipv4.c| 31 +-- net/ipv6/inet6_hashtables.c|4 +- net/ipv6/tcp_ipv6.c| 19 +--- 17 files changed, 108 insertions(+), 146 deletions(-) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/5] [INET_TIMEWAIT_SOCK]: Reorganize struct inet_timewait_sock to save some bytes
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c: struct inet_timewait_sock | -8 struct tcp_timewait_sock | -8 2 structs changed tcp_v6_rcv| -6 1 function changed, 6 bytes removed, diff: -6 Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/net/inet_timewait_sock.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h index 67e9250..296547b 100644 --- a/include/net/inet_timewait_sock.h +++ b/include/net/inet_timewait_sock.h @@ -116,6 +116,7 @@ struct inet_timewait_sock { #define tw_hash__tw_common.skc_hash #define tw_prot__tw_common.skc_prot #define tw_net __tw_common.skc_net + int tw_timeout; volatile unsigned char tw_substate; /* 3 bits hole, try to pack */ unsigned char tw_rcv_wscale; @@ -130,7 +131,6 @@ struct inet_timewait_sock { __u8tw_ipv6only:1; /* 15 bits hole, try to pack */ __u16 tw_ipv6_offset; - int tw_timeout; unsigned long tw_ttd; struct inet_bind_bucket *tw_tb; struct hlist_node tw_death_node; -- 1.5.3.8 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/5] [SOCK] proto: Add hashinfo member to struct proto
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] This way we can remove TCP and DCCP specific versions of sk-sk_prot-get_port: both v4 and v6 use inet_csk_get_port sk-sk_prot-hash: inet_hash is directly used, only v6 need a specific version to deal with mapped sockets sk-sk_prot-unhash: both v4 and v6 use inet_hash directly struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so that inet_csk_get_port can find the per family routine. Now only the lookup routines receive as a parameter a struct inet_hashtable. With this we further reuse code, reducing the difference among INET transport protocols. Eventually work has to be done on UDP and SCTP to make them share this infrastructure and get as a bonus inet_diag interfaces so that iproute can be used with these protocols. net-2.6/net/ipv4/inet_hashtables.c: struct proto | +8 struct inet_connection_sock_af_ops | +8 2 structs changed __inet_hash_nolisten | +18 __inet_hash| -210 inet_put_port | +8 inet_bind_bucket_create| +1 __inet_hash_connect| -8 5 functions changed, 27 bytes added, 218 bytes removed, diff: -191 net-2.6/net/core/sock.c: proto_seq_show | +3 1 function changed, 3 bytes added, diff: +3 net-2.6/net/ipv4/inet_connection_sock.c: inet_csk_get_port | +15 1 function changed, 15 bytes added, diff: +15 net-2.6/net/ipv4/tcp.c: tcp_set_state | -7 1 function changed, 7 bytes removed, diff: -7 net-2.6/net/ipv4/tcp_ipv4.c: tcp_v4_get_port| -31 tcp_v4_hash| -48 tcp_v4_destroy_sock| -7 tcp_v4_syn_recv_sock | -2 tcp_unhash | -179 5 functions changed, 267 bytes removed, diff: -267 net-2.6/net/ipv6/inet6_hashtables.c: __inet6_hash | +8 1 function changed, 8 bytes added, diff: +8 net-2.6/net/ipv4/inet_hashtables.c: inet_unhash| +190 inet_hash | +242 2 functions changed, 432 bytes added, diff: +432 vmlinux: 16 functions changed, 485 bytes added, 492 bytes removed, diff: -7 /home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c: tcp_v6_get_port| -31 tcp_v6_hash| -7 tcp_v6_syn_recv_sock | -9 3 functions changed, 47 
bytes removed, diff: -47 /home/acme/git/net-2.6/net/dccp/proto.c: dccp_destroy_sock | -7 dccp_unhash| -179 dccp_hash | -49 dccp_set_state | -7 dccp_done | +1 5 functions changed, 1 bytes added, 242 bytes removed, diff: -241 /home/acme/git/net-2.6/net/dccp/ipv4.c: dccp_v4_get_port | -31 dccp_v4_request_recv_sock | -2 2 functions changed, 33 bytes removed, diff: -33 /home/acme/git/net-2.6/net/dccp/ipv6.c: dccp_v6_get_port | -31 dccp_v6_hash | -7 dccp_v6_request_recv_sock | +5 3 functions changed, 5 bytes added, 38 bytes removed, diff: -33 Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- include/net/inet6_hashtables.h |2 +- include/net/inet_connection_sock.h |8 ++--- include/net/inet_hashtables.h | 51 +-- include/net/sock.h |3 ++ net/dccp/dccp.h|2 - net/dccp/ipv4.c| 18 --- net/dccp/ipv6.c| 20 +--- net/dccp/proto.c | 18 +-- net/ipv4/inet_connection_sock.c|8 ++--- net/ipv4/inet_hashtables.c | 58 +-- net/ipv4/tcp.c |2 +- net/ipv4/tcp_ipv4.c| 31 +-- net/ipv6/inet6_hashtables.c|4 +- net/ipv6/tcp_ipv6.c| 19 +--- 14 files changed, 103 insertions(+), 141 deletions(-) diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h index fdff630..62a5b69 100644 --- a/include/net/inet6_hashtables.h +++ b/include/net/inet6_hashtables.h @@ -49,7 +49,7 @@ static inline int inet6_sk_ehashfn(const struct sock *sk) return inet6_ehashfn(laddr, lport, faddr, fport); } -extern void __inet6_hash(struct inet_hashinfo *hashinfo, struct sock *sk); +extern void __inet6_hash(struct sock *sk); /* * Sockets in TCP_CLOSE state are _always_ taken out of the hash, so diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 133cf30..f00f057 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -29,7 +29,6 @@ #undef INET_CSK_CLEAR_TIMERS struct inet_bind_bucket; -struct inet_hashinfo; struct tcp_congestion_ops; /*
Re: [PATCH] Disable TSO for non standard qdiscs
jamal wrote, On 02/01/2008 01:06 PM:

On Fri, 2008-01-02 at 10:56 +0100, Patrick McHardy wrote:

We don't want to disable TSO for cases where it makes sense, but who is using TBF on 10GbE? The point is that most users of qdiscs which are incapable of dealing with TSO without hacks or special configuration probably don't care, and 10GbE users know about ethtool *and* don't use TBF or HTB (which are probably the only qdiscs which actually have problems, maybe also CBQ).

Right - Essentially it is a usability issue: People who know how to use TSO (Peter for example) will be clueful enough to turn it on. Which means the default should be to protect the clueless and turn it off.

On Andi's approach: Turning TSO off at netdev registration time with a warning would be cleaner IMO. Or alternatively introducing a kernel-config "I know what TSO is" option which is then used at netdev registration.

From a usability perspective it would make more sense to just keep ethtool as the only way to configure TSO.

[I recently spent a few days helping someone debug a problem with IFB because he was redirecting packets from a TSO netdevice and occasionally some multi-packet would be missed in the calculation; my answer was turn off TSO; so there are more use cases for this TSO issue].

I totally disagree with these POVs:

- 10G cards should be treated by default as 10G cards - not DSL modems, and common users shouldn't have to read any warnings or configs to see this.

- tc with TBF or HTB are professional tools; some warnings should be added to the manuals. But trying to change the way they work because we think we know better what users want, and changing BTW some other things (making debugging this later a hell), is simply disrespectful to the target users of these tools. There are some wrappers or creators invented for this.

And, BTW, I think I've seen somewhere a system which does this the other way - with creators for professionals. So, you could be right with this too...

Cheers,
Jarek P.
pull request: wireless-2.6 'fixes' 2008-02-01
Dave, Here are some more fixes suitable for 2.6.25. Also there is a patch that includes the mac80211 alignment warning as a configurable option, which should stop it from annoying normal users. Let me know if there are problems! Thanks, John --- Individual patches are available here: http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/fixes --- The following changes since commit 24e1c13c93cbdd05e4b7ea921c0050b036555adc: Linus Torvalds (1): Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git fixes Johannes Berg (3): mac80211: make alignment warning optional mac80211 rate control: fix section mismatch mac80211: fix initialisation error path John W. Linville (1): ath5k: fix section mismatch warning Reinette Chatre (1): iwlwifi: fix merge sequence: exit on error before state change Ron Rindjunsky (1): iwlwifi: fix sparse warning in iwl 3945 Tomas Winkler (2): iwlwifi: Fix MIMO PS mode iwlwifi: remove ieee80211 types from iwl-helpers.h drivers/net/wireless/ath5k/base.c |6 +++--- drivers/net/wireless/iwlwifi/iwl-3945.c |7 --- drivers/net/wireless/iwlwifi/iwl-4965.c | 23 ++- drivers/net/wireless/iwlwifi/iwl-helpers.h |3 --- drivers/net/wireless/iwlwifi/iwl3945-base.c | 10 +- drivers/net/wireless/iwlwifi/iwl4965-base.c | 10 +- include/linux/ieee80211.h |6 ++ net/mac80211/Kconfig| 12 net/mac80211/ieee80211.c| 14 +++--- net/mac80211/rc80211_pid_algo.c |2 +- net/mac80211/rc80211_simple.c |2 +- net/mac80211/rx.c |7 +++ 12 files changed, 69 insertions(+), 33 deletions(-) diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless/ath5k/base.c index d6599d2..ddc8714 100644 --- a/drivers/net/wireless/ath5k/base.c +++ b/drivers/net/wireless/ath5k/base.c @@ -153,7 +153,7 @@ static int ath5k_pci_resume(struct pci_dev *pdev); #define ath5k_pci_resume NULL #endif /* CONFIG_PM */ -static struct pci_driver ath5k_pci_drv_id = { +static 
struct pci_driver ath5k_pci_driver = { .name = ath5k_pci, .id_table = ath5k_pci_id_table, .probe = ath5k_pci_probe, @@ -329,7 +329,7 @@ init_ath5k_pci(void) ath5k_debug_init(); - ret = pci_register_driver(ath5k_pci_drv_id); + ret = pci_register_driver(ath5k_pci_driver); if (ret) { printk(KERN_ERR ath5k_pci: can't register pci driver\n); return ret; @@ -341,7 +341,7 @@ init_ath5k_pci(void) static void __exit exit_ath5k_pci(void) { - pci_unregister_driver(ath5k_pci_drv_id); + pci_unregister_driver(ath5k_pci_driver); ath5k_debug_finish(); } diff --git a/drivers/net/wireless/iwlwifi/iwl-3945.c b/drivers/net/wireless/iwlwifi/iwl-3945.c index 4fdeb53..8d4d91d 100644 --- a/drivers/net/wireless/iwlwifi/iwl-3945.c +++ b/drivers/net/wireless/iwlwifi/iwl-3945.c @@ -238,9 +238,10 @@ void iwl3945_hw_rx_statistics(struct iwl3945_priv *priv, struct iwl3945_rx_mem_b priv-last_statistics_time = jiffies; } -void iwl3945_add_radiotap(struct iwl3945_priv *priv, struct sk_buff *skb, - struct iwl3945_rx_frame_hdr *rx_hdr, - struct ieee80211_rx_status *stats) +static void iwl3945_add_radiotap(struct iwl3945_priv *priv, +struct sk_buff *skb, +struct iwl3945_rx_frame_hdr *rx_hdr, +struct ieee80211_rx_status *stats) { /* First cache any information we need before we overwrite * the information provided in the skb from the hardware */ diff --git a/drivers/net/wireless/iwlwifi/iwl-4965.c b/drivers/net/wireless/iwlwifi/iwl-4965.c index 569347f..d727de8 100644 --- a/drivers/net/wireless/iwlwifi/iwl-4965.c +++ b/drivers/net/wireless/iwlwifi/iwl-4965.c @@ -4658,17 +4658,30 @@ void iwl4965_set_ht_add_station(struct iwl4965_priv *priv, u8 index, struct ieee80211_ht_info *sta_ht_inf) { __le32 sta_flags; + u8 mimo_ps_mode; if (!sta_ht_inf || !sta_ht_inf-ht_supported) goto done; + mimo_ps_mode = (sta_ht_inf-cap IEEE80211_HT_CAP_MIMO_PS) 2; + sta_flags = priv-stations[index].sta.station_flags; - if (((sta_ht_inf-cap IEEE80211_HT_CAP_MIMO_PS 2)) - == IWL_MIMO_PS_DYNAMIC) + sta_flags = 
~(STA_FLG_RTS_MIMO_PROT_MSK | STA_FLG_MIMO_DIS_MSK); + + switch (mimo_ps_mode) { + case WLAN_HT_CAP_MIMO_PS_STATIC: + sta_flags |= STA_FLG_MIMO_DIS_MSK; +
Re: R8101 driver link issue
Switact - Thomas Roes [EMAIL PROTECTED] :
[...]
Recently I got problems with a new motherboard using the Realtek RTL8101E PCI Express Fast Ethernet controller. This card is recognised by the r8169 driver but it always gives me link down.

Can you send the output of 'lspci -vvxxx' and the dmesg of a recent kernel ? It should include a line like this one:

[...]
eth0: RTL8168b/8111b at 0xf892, 00:13:8f:ea:b1:5d, XID 3800 IRQ 221
^^^
[...]

So i searched for drivers to compile and came across the r8101-1.006.tar.bz2 driver.

Ok.

[...]
-2 but unfortunately, also this new driver, although it gets loaded, it still reports (ethtool eth0)
current message level : 0x0..033 (51)
Link detected : no

Can you send the output of mii-diag and try to force a reset with this driver ?

-- 
Ueimor
Re: Still oopsing in nf_nat_move_storage()
On 01/31/2008 01:03 PM, Chuck Ebbert wrote:
On 01/29/2008 12:18 PM, Patrick McHardy wrote:
Chuck Ebbert wrote:

nf_nat_move_storage():
/usr/src/debug/kernel-2.6.23/linux-2.6.23.i686/net/ipv4/netfilter/nf_nat_core.c:612
  87:	f7 47 64 80 01 00 00	testl  $0x180,0x64(%edi)
  8e:	74 39			je     c9 <nf_nat_move_storage+0x65>

line 612:
	if (!(ct->status & IPS_NAT_DONE_MASK))
		return;

ct is NULL

The current kernel (and 2.6.23-stable) have:

	if (!ct || !(ct->status & IPS_NAT_DONE_MASK))
		return;

so it seems you're using an old version.

So, it is now oopsing after the test for NULL and only x86_64 is catching the invalid address because it is non-canonical.

Checking for NULL is obviously not enough...
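For readers unfamiliar with the term, a rough sketch of what "non-canonical" means on x86_64 may help. It assumes 48-bit virtual addressing (4-level paging), where bits 63..47 of a pointer must all be equal; a dereference of anything else faults immediately, whatever the page tables contain. The helper name is_canonical_48 is made up for this example and is not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is canonical under 48-bit virtual
 * addressing, i.e. bits 63..47 are either all zero or all one. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the upper 17 bits */

    return top == 0 || top == 0x1ffff;
}
```

A stale or overwritten pointer such as 0x6b6b6b6b6b6b6b6b (an illustrative value, not taken from the oops above) passes a NULL test but fails this check, which would explain why only x86_64 traps on it while i686 may silently read whatever is mapped there.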
Re: [PATCH 1/1]: Add support for aes-ctr to ipsec
On Thu, Jan 31, 2008 at 10:59:28AM -0600, Joy Latten wrote:

Very sorry, re-posting as first patch was incomplete. The below patch allows IPsec to use CTR mode with AES encryption algorithm. Tested this using setkey in ipsec-tools.

regards,
Joy

Signed-off-by: Joy Latten [EMAIL PROTECTED]

Acked-by: Herbert Xu [EMAIL PROTECTED]

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH 1/8] ixgbe: remove obsolete irq_sem, add driver state checking code
From: Ayyappan Veeraiyan [EMAIL PROTECTED] After testing we confirmed that the irq_sem can safely be removed from ixgbe. Add strict state checking code to various ethtool parts to properly protect against races between various driver reset paths. Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED] Signed-off-by: Auke Kok [EMAIL PROTECTED] --- drivers/net/ixgbe/ixgbe.h |2 + drivers/net/ixgbe/ixgbe_ethtool.c | 29 -- drivers/net/ixgbe/ixgbe_main.c| 60 ++--- 3 files changed, 49 insertions(+), 42 deletions(-) diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h index a021a6e..7dd9a03 100644 --- a/drivers/net/ixgbe/ixgbe.h +++ b/drivers/net/ixgbe/ixgbe.h @@ -174,7 +174,6 @@ struct ixgbe_adapter { struct vlan_group *vlgrp; u16 bd_number; u16 rx_buf_len; - atomic_t irq_sem; struct work_struct reset_task; /* TX */ @@ -244,6 +243,7 @@ extern const char ixgbe_driver_version[]; extern int ixgbe_up(struct ixgbe_adapter *adapter); extern void ixgbe_down(struct ixgbe_adapter *adapter); +extern void ixgbe_reinit_locked(struct ixgbe_adapter *adapter); extern void ixgbe_reset(struct ixgbe_adapter *adapter); extern void ixgbe_update_stats(struct ixgbe_adapter *adapter); extern void ixgbe_set_ethtool_ops(struct net_device *netdev); diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c index 3635344..9f3cdb8 100644 --- a/drivers/net/ixgbe/ixgbe_ethtool.c +++ b/drivers/net/ixgbe/ixgbe_ethtool.c @@ -179,12 +179,10 @@ static int ixgbe_set_pauseparam(struct net_device *netdev, hw-fc.original_type = hw-fc.type; - if (netif_running(adapter-netdev)) { - ixgbe_down(adapter); - ixgbe_up(adapter); - } else { + if (netif_running(netdev)) + ixgbe_reinit_locked(adapter); + else ixgbe_reset(adapter); - } return 0; } @@ -203,12 +201,10 @@ static int ixgbe_set_rx_csum(struct net_device *netdev, u32 data) else adapter-flags = ~IXGBE_FLAG_RX_CSUM_ENABLED; - if (netif_running(netdev)) { - ixgbe_down(adapter); - ixgbe_up(adapter); - } else { + if 
(netif_running(netdev)) + ixgbe_reinit_locked(adapter); + else ixgbe_reset(adapter); - } return 0; } @@ -662,7 +658,10 @@ static int ixgbe_set_ringparam(struct net_device *netdev, return 0; } - if (netif_running(adapter-netdev)) + while (test_and_set_bit(__IXGBE_RESETTING, adapter-state)) + msleep(1); + + if (netif_running(netdev)) ixgbe_down(adapter); /* @@ -733,6 +732,7 @@ err_setup: if (netif_running(adapter-netdev)) ixgbe_up(adapter); + clear_bit(__IXGBE_RESETTING, adapter-state); return err; } @@ -820,11 +820,8 @@ static int ixgbe_nway_reset(struct net_device *netdev) { struct ixgbe_adapter *adapter = netdev_priv(netdev); - if (netif_running(netdev)) { - ixgbe_down(adapter); - ixgbe_reset(adapter); - ixgbe_up(adapter); - } + if (netif_running(netdev)) + ixgbe_reinit_locked(adapter); return 0; } diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c index 3732dd6..28bb203 100644 --- a/drivers/net/ixgbe/ixgbe_main.c +++ b/drivers/net/ixgbe/ixgbe_main.c @@ -535,7 +535,9 @@ static irqreturn_t ixgbe_msix_lsc(int irq, void *data) if (!test_bit(__IXGBE_DOWN, adapter-state)) mod_timer(adapter-watchdog_timer, jiffies); } - IXGBE_WRITE_REG(adapter-hw, IXGBE_EIMS, IXGBE_EIMS_OTHER); + + if (!test_bit(__IXGBE_DOWN, adapter-state)) + IXGBE_WRITE_REG(hw, IXGBE_EIMS, IXGBE_EIMS_OTHER); return IRQ_HANDLED; } @@ -713,7 +715,6 @@ static irqreturn_t ixgbe_intr(int irq, void *data) if (netif_rx_schedule_prep(netdev, adapter-napi)) { /* Disable interrupts and register for poll. The flush of the * posted write is intentionally left out. 
*/ - atomic_inc(adapter-irq_sem); IXGBE_WRITE_REG(adapter-hw, IXGBE_EIMC, ~0); __netif_rx_schedule(netdev, adapter-napi); } @@ -801,7 +802,6 @@ static void ixgbe_free_irq(struct ixgbe_adapter *adapter) **/ static inline void ixgbe_irq_disable(struct ixgbe_adapter *adapter) { - atomic_inc(adapter-irq_sem); IXGBE_WRITE_REG(adapter-hw, IXGBE_EIMC, ~0); IXGBE_WRITE_FLUSH(adapter-hw); synchronize_irq(adapter-pdev-irq); @@ -813,15 +813,13 @@ static inline void ixgbe_irq_disable(struct ixgbe_adapter *adapter) **/ static inline void ixgbe_irq_enable(struct ixgbe_adapter *adapter) { - if (atomic_dec_and_test(adapter-irq_sem)) { -
[PATCH 2/8] ixbge: remove TX lock and redo TX accounting.
From: Ayyappan Veeraiyan [EMAIL PROTECTED] This ports Herbert Xu's maybe_stop_tx code and removes the tx_lock which is not needed. Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED] Signed-off-by: Auke Kok [EMAIL PROTECTED] --- drivers/net/ixgbe/ixgbe.h |2 - drivers/net/ixgbe/ixgbe_main.c | 110 2 files changed, 76 insertions(+), 36 deletions(-) diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h index 7dd9a03..d0bf206 100644 --- a/drivers/net/ixgbe/ixgbe.h +++ b/drivers/net/ixgbe/ixgbe.h @@ -136,8 +136,6 @@ struct ixgbe_ring { u16 head; u16 tail; - /* To protect race between sender and clean_tx_irq */ - spinlock_t tx_lock; struct ixgbe_queue_stats stats; diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c index 28bb203..b4c9c77 100644 --- a/drivers/net/ixgbe/ixgbe_main.c +++ b/drivers/net/ixgbe/ixgbe_main.c @@ -165,6 +165,15 @@ static inline bool ixgbe_check_tx_hang(struct ixgbe_adapter *adapter, return false; } +#define IXGBE_MAX_TXD_PWR 14 +#define IXGBE_MAX_DATA_PER_TXD (1 IXGBE_MAX_TXD_PWR) + +/* Tx Descriptors needed, worst case */ +#define TXD_USE_COUNT(S) (((S) IXGBE_MAX_TXD_PWR) + \ +(((S) (IXGBE_MAX_DATA_PER_TXD - 1)) ? 
1 : 0)) +#define DESC_NEEDED (TXD_USE_COUNT(IXGBE_MAX_DATA_PER_TXD) /* skb-data */ + \ + MAX_SKB_FRAGS * TXD_USE_COUNT(PAGE_SIZE) + 1) /* for context */ + /** * ixgbe_clean_tx_irq - Reclaim resources after transmit completes * @adapter: board private structure @@ -177,18 +186,34 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_adapter *adapter, struct ixgbe_tx_buffer *tx_buffer_info; unsigned int i, eop; bool cleaned = false; - int count = 0; + unsigned int total_tx_bytes = 0, total_tx_packets = 0; i = tx_ring-next_to_clean; eop = tx_ring-tx_buffer_info[i].next_to_watch; eop_desc = IXGBE_TX_DESC_ADV(*tx_ring, eop); while (eop_desc-wb.status cpu_to_le32(IXGBE_TXD_STAT_DD)) { - for (cleaned = false; !cleaned;) { + cleaned = false; + while (!cleaned) { tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i); tx_buffer_info = tx_ring-tx_buffer_info[i]; cleaned = (i == eop); tx_ring-stats.bytes += tx_buffer_info-length; + if (cleaned) { + struct sk_buff *skb = tx_buffer_info-skb; +#ifdef NETIF_F_TSO + unsigned int segs, bytecount; + segs = skb_shinfo(skb)-gso_segs ?: 1; + /* multiply data chunks by size of headers */ + bytecount = ((segs - 1) * skb_headlen(skb)) + + skb-len; + total_tx_packets += segs; + total_tx_bytes += bytecount; +#else + total_tx_packets++; + total_tx_bytes += skb-len; +#endif + } ixgbe_unmap_and_free_tx_resource(adapter, tx_buffer_info); tx_desc-wb.status = 0; @@ -204,29 +229,34 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_adapter *adapter, eop_desc = IXGBE_TX_DESC_ADV(*tx_ring, eop); /* weight of a sort for tx, avoid endless transmit cleanup */ - if (count++ = tx_ring-work_limit) + if (total_tx_packets = tx_ring-work_limit) break; } tx_ring-next_to_clean = i; -#define TX_WAKE_THRESHOLD 32 - spin_lock(tx_ring-tx_lock); - - if (cleaned netif_carrier_ok(netdev) - (IXGBE_DESC_UNUSED(tx_ring) = TX_WAKE_THRESHOLD) - !test_bit(__IXGBE_DOWN, adapter-state)) - netif_wake_queue(netdev); - - spin_unlock(tx_ring-tx_lock); +#define TX_WAKE_THRESHOLD (DESC_NEEDED * 2) + if 
(total_tx_packets netif_carrier_ok(netdev) + (IXGBE_DESC_UNUSED(tx_ring) = TX_WAKE_THRESHOLD)) { + /* Make sure that anybody stopping the queue after this +* sees the new next_to_clean. +*/ + smp_mb(); + if (netif_queue_stopped(netdev) + !test_bit(__IXGBE_DOWN, adapter-state)) { + netif_wake_queue(netdev); + adapter-restart_queue++; + } + } if (adapter-detect_tx_hung) if (ixgbe_check_tx_hang(adapter, tx_ring, eop, eop_desc)) netif_stop_queue(netdev); - if (count = tx_ring-work_limit) + if (total_tx_packets = tx_ring-work_limit) IXGBE_WRITE_REG(adapter-hw,
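The descriptor accounting macros in this patch are easier to follow as plain functions. The following is a minimal sketch of the same arithmetic, assuming the shift and mask operators that the archive stripped from the macros (">>", "&" and "<<"). The function names and the max_frags and page_size parameters are stand-ins introduced here for MAX_SKB_FRAGS and PAGE_SIZE, not driver code.

```c
#define MAX_TXD_PWR       14                    /* as in the patch */
#define MAX_DATA_PER_TXD  (1u << MAX_TXD_PWR)   /* 16384 bytes per descriptor */

/* Descriptors needed for one buffer of 'size' bytes: size / 16384, rounded up. */
static unsigned int txd_use_count(unsigned int size)
{
    return (size >> MAX_TXD_PWR) +
           ((size & (MAX_DATA_PER_TXD - 1)) ? 1 : 0);
}

/* Worst-case descriptors for one skb. max_frags and page_size are
 * parameters here; the driver uses MAX_SKB_FRAGS and PAGE_SIZE. */
static unsigned int desc_needed(unsigned int max_frags, unsigned int page_size)
{
    return txd_use_count(MAX_DATA_PER_TXD)        /* linear skb->data */
         + max_frags * txd_use_count(page_size)   /* one per page fragment */
         + 1;                                     /* context descriptor */
}
```

As an example, with 18 fragments and 4096-byte pages desc_needed() gives 1 + 18 + 1 = 20, so the TX_WAKE_THRESHOLD of DESC_NEEDED * 2 in the patch would be 40 free descriptors before the queue is woken.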
[PATCH 4/8] ixgbe: Fix pause code for ethtool
From: Ayyappan Veeraiyan [EMAIL PROTECTED]

Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/ixgbe/ixgbe_ethtool.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index b447dd7..a119cbd 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -167,7 +167,7 @@ static void ixgbe_get_pauseparam(struct net_device *netdev,
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 	struct ixgbe_hw *hw = &adapter->hw;
 
-	pause->autoneg = AUTONEG_DISABLE;
+	pause->autoneg = (hw->fc.type == ixgbe_fc_full ? 1 : 0);
 
 	if (hw->fc.type == ixgbe_fc_rx_pause) {
 		pause->rx_pause = 1;
@@ -185,10 +185,8 @@ static int ixgbe_set_pauseparam(struct net_device *netdev,
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 	struct ixgbe_hw *hw = &adapter->hw;
 
-	if (pause->autoneg == AUTONEG_ENABLE)
-		return -EINVAL;
-
-	if (pause->rx_pause && pause->tx_pause)
+	if ((pause->autoneg == AUTONEG_ENABLE) ||
+	    (pause->rx_pause && pause->tx_pause))
 		hw->fc.type = ixgbe_fc_full;
 	else if (pause->rx_pause && !pause->tx_pause)
 		hw->fc.type = ixgbe_fc_rx_pause;
@@ -196,6 +194,8 @@ static int ixgbe_set_pauseparam(struct net_device *netdev,
 		hw->fc.type = ixgbe_fc_tx_pause;
 	else if (!pause->rx_pause && !pause->tx_pause)
 		hw->fc.type = ixgbe_fc_none;
+	else
+		return -EINVAL;
 
 	hw->fc.original_type = hw->fc.type;
 
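Since the patch carries no changelog text, here is a small sketch of the mapping it seems to implement in ixgbe_set_pauseparam(): the ethtool pause request (autoneg, rx_pause, tx_pause) is turned into one flow-control mode. The enum and function names are stand-ins made up for this example, and the inputs are reduced to booleans.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's ixgbe_fc_* values. */
enum fc_type { FC_NONE, FC_RX_PAUSE, FC_TX_PAUSE, FC_FULL };

/* Map an ethtool pause request to a flow-control mode, following the
 * order of the tests in the patch. */
static enum fc_type pause_to_fc(bool autoneg, bool rx_pause, bool tx_pause)
{
    if (autoneg || (rx_pause && tx_pause))
        return FC_FULL;         /* autoneg now means "full", not -EINVAL */
    if (rx_pause)
        return FC_RX_PAUSE;
    if (tx_pause)
        return FC_TX_PAUSE;
    return FC_NONE;
}

/* The get side of the patch reports autoneg as set only in full mode. */
static bool fc_to_autoneg(enum fc_type fc)
{
    return fc == FC_FULL;
}
```

With boolean inputs the four rx/tx combinations are exhaustive, so the sketch has no error path; the final "else return -EINVAL" in the patch looks like a defensive fallback rather than a reachable case.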
[PATCH 5/8] ixgbe: Fix FW init/release, make this code a function
From: Ayyappan Veeraiyan [EMAIL PROTECTED]

A gap was left in the FW release/grab code in up/down path. Fix it by making the release/grab code a function and calling it in appropriate locations.

Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/ixgbe/ixgbe_main.c |   38 --
 1 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index b4c9c77..c814d9b 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -87,6 +87,25 @@ MODULE_VERSION(DRV_VERSION);
 
 #define DEFAULT_DEBUG_LEVEL_SHIFT 3
 
+static void ixgbe_release_hw_control(struct ixgbe_adapter *adapter)
+{
+	u32 ctrl_ext;
+
+	/* Let firmware take over control of h/w */
+	ctrl_ext = IXGBE_READ_REG(&adapter->hw, IXGBE_CTRL_EXT);
+	IXGBE_WRITE_REG(&adapter->hw, IXGBE_CTRL_EXT,
+			ctrl_ext & ~IXGBE_CTRL_EXT_DRV_LOAD);
+}
+
+static void ixgbe_get_hw_control(struct ixgbe_adapter *adapter)
+{
+	u32 ctrl_ext;
+
+	/* Let firmware know the driver has taken over */
+	ctrl_ext = IXGBE_READ_REG(&adapter->hw, IXGBE_CTRL_EXT);
+	IXGBE_WRITE_REG(&adapter->hw, IXGBE_CTRL_EXT,
+			ctrl_ext | IXGBE_CTRL_EXT_DRV_LOAD);
+}
 
 #ifdef DEBUG
 /**
@@ -1204,6 +1223,8 @@ static int ixgbe_up_complete(struct ixgbe_adapter *adapter)
 	u32 txdctl, rxdctl, mhadd;
 	int max_frame = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 
+	ixgbe_get_hw_control(adapter);
+
 	if (adapter->flags & (IXGBE_FLAG_MSIX_ENABLED |
 			      IXGBE_FLAG_MSI_ENABLED)) {
 		if (adapter->flags & IXGBE_FLAG_MSIX_ENABLED) {
@@ -1490,6 +1511,8 @@ static int ixgbe_suspend(struct pci_dev *pdev, pm_message_t state)
 	pci_enable_wake(pdev, PCI_D3hot, 0);
 	pci_enable_wake(pdev, PCI_D3cold, 0);
 
+	ixgbe_release_hw_control(adapter);
+
 	pci_disable_device(pdev);
 
 	pci_set_power_state(pdev, pci_choose_state(pdev, state));
@@ -1891,14 +1914,8 @@ static int ixgbe_open(struct net_device *netdev)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 	int err;
-	u32 ctrl_ext;
 	u32 num_rx_queues = adapter->num_rx_queues;
 
-	/* Let firmware know the driver has taken over */
-	ctrl_ext = IXGBE_READ_REG(&adapter->hw, IXGBE_CTRL_EXT);
-	IXGBE_WRITE_REG(&adapter->hw, IXGBE_CTRL_EXT,
-			ctrl_ext | IXGBE_CTRL_EXT_DRV_LOAD);
-
 try_intr_reinit:
 	/* allocate transmit descriptors */
 	err = ixgbe_setup_all_tx_resources(adapter);
@@ -1949,6 +1966,7 @@ try_intr_reinit:
 	return 0;
 
 err_up:
+	ixgbe_release_hw_control(adapter);
 	ixgbe_free_irq(adapter);
 err_req_irq:
 	ixgbe_free_all_rx_resources(adapter);
@@ -1974,7 +1992,6 @@ err_setup_tx:
 static int ixgbe_close(struct net_device *netdev)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
-	u32 ctrl_ext;
 
 	ixgbe_down(adapter);
 	ixgbe_free_irq(adapter);
@@ -1982,9 +1999,7 @@ static int ixgbe_close(struct net_device *netdev)
 	ixgbe_free_all_tx_resources(adapter);
 	ixgbe_free_all_rx_resources(adapter);
 
-	ctrl_ext = IXGBE_READ_REG(&adapter->hw, IXGBE_CTRL_EXT);
-	IXGBE_WRITE_REG(&adapter->hw, IXGBE_CTRL_EXT,
-			ctrl_ext & ~IXGBE_CTRL_EXT_DRV_LOAD);
+	ixgbe_release_hw_control(adapter);
 
 	return 0;
 }
@@ -2749,6 +2764,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 	return 0;
 
 err_register:
+	ixgbe_release_hw_control(adapter);
 err_hw_init:
 err_sw_init:
 err_eeprom:
@@ -2784,6 +2800,8 @@ static void __devexit ixgbe_remove(struct pci_dev *pdev)
 
 	unregister_netdev(netdev);
 
+	ixgbe_release_hw_control(adapter);
+
 	kfree(adapter->tx_ring);
 	kfree(adapter->rx_ring);
 
[PATCH 6/8] ixgbe: properly return CHECKSUM_NONE, cleanup csum code
From: Ayyappan Veeraiyan [EMAIL PROTECTED]

We were not returning CHECKSUM_NONE in a lot of cases which is wrong. Move common exit points in this function and error code up before the actual work in this function.

Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/ixgbe/ixgbe_main.c |   29 ++---
 1 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index c814d9b..ee5ee10 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -304,25 +304,40 @@ static void ixgbe_receive_skb(struct ixgbe_adapter *adapter,
 	}
 }
 
+/**
+ * ixgbe_rx_checksum - indicate in skb if hw indicated a good cksum
+ * @adapter: address of board private structure
+ * @status_err: hardware indication of status of receive
+ * @skb: skb currently being received and modified
+ **/
 static inline void ixgbe_rx_checksum(struct ixgbe_adapter *adapter,
 				     u32 status_err,
 				     struct sk_buff *skb)
 {
 	skb->ip_summed = CHECKSUM_NONE;
 
-	/* Ignore Checksum bit is set */
+	/* Ignore Checksum bit is set, or rx csum disabled */
 	if ((status_err & IXGBE_RXD_STAT_IXSM) ||
-		     !(adapter->flags & IXGBE_FLAG_RX_CSUM_ENABLED))
+	    !(adapter->flags & IXGBE_FLAG_RX_CSUM_ENABLED))
 		return;
-	/* TCP/UDP checksum error bit is set */
-	if (status_err & (IXGBE_RXDADV_ERR_TCPE | IXGBE_RXDADV_ERR_IPE)) {
-		/* let the stack verify checksum errors */
+
+	/* if IP and error */
+	if ((status_err & IXGBE_RXD_STAT_IPCS) &&
+	    (status_err & IXGBE_RXDADV_ERR_IPE)) {
 		adapter->hw_csum_rx_error++;
 		return;
 	}
+
+	if (!(status_err & IXGBE_RXD_STAT_L4CS))
+		return;
+
+	if (status_err & IXGBE_RXDADV_ERR_TCPE) {
+		adapter->hw_csum_rx_error++;
+		return;
+	}
+
 	/* It must be a TCP or UDP packet with a valid checksum */
-	if (status_err & (IXGBE_RXD_STAT_L4CS | IXGBE_RXD_STAT_UDPCS))
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
 	adapter->hw_csum_rx_good++;
 }
 
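The reworked decision order in ixgbe_rx_checksum() can be modelled outside the kernel. The sketch below follows the same sequence of early returns; the bit values, enum and struct names are placeholders invented for this example (the real definitions live in the driver headers), and the skb field is replaced by a return value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments, not the hardware's real ones. */
#define ST_IXSM   (1u << 0)   /* ignore-checksum status bit */
#define ST_IPCS   (1u << 1)   /* IP checksum was calculated */
#define ST_L4CS   (1u << 2)   /* L4 checksum was calculated */
#define ERR_IPE   (1u << 3)   /* IP checksum error */
#define ERR_TCPE  (1u << 4)   /* TCP/UDP checksum error */

enum csum_result { CSUM_NONE, CSUM_UNNECESSARY };

struct csum_stats {
    unsigned int good;
    unsigned int error;
};

static enum csum_result rx_checksum(uint32_t status_err, bool rx_csum_enabled,
                                    struct csum_stats *st)
{
    /* Hardware said "ignore", or the user disabled rx checksumming. */
    if ((status_err & ST_IXSM) || !rx_csum_enabled)
        return CSUM_NONE;

    /* IP checksum was computed and found bad. */
    if ((status_err & ST_IPCS) && (status_err & ERR_IPE)) {
        st->error++;
        return CSUM_NONE;
    }

    /* No L4 checksum was computed: let the stack verify it. */
    if (!(status_err & ST_L4CS))
        return CSUM_NONE;

    /* L4 checksum was computed and found bad. */
    if (status_err & ERR_TCPE) {
        st->error++;
        return CSUM_NONE;
    }

    st->good++;
    return CSUM_UNNECESSARY;
}
```

The point of the ordering is that every path that does not reach the last statement leaves the packet marked CHECKSUM_NONE, so the stack re-verifies it, and the "good" counter is only bumped when the checksum really was validated by hardware.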
[PATCH 3/8] ixbge: Make ethtool code account for media types
From: Ayyappan Veeraiyan [EMAIL PROTECTED]

The i82598 can support various media types, but so far this ethtool code was only written for fiber.

Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/ixgbe/ixgbe_ethtool.c |   52 ++---
 1 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 9f3cdb8..b447dd7 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -103,21 +103,41 @@ static int ixgbe_get_settings(struct net_device *netdev,
                               struct ethtool_cmd *ecmd)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 link_speed = 0;
+	bool link_up;
 
-	ecmd->supported = (SUPPORTED_10000baseT_Full | SUPPORTED_FIBRE);
-	ecmd->advertising = (ADVERTISED_10000baseT_Full | ADVERTISED_FIBRE);
-	ecmd->port = PORT_FIBRE;
+	ecmd->supported = SUPPORTED_10000baseT_Full;
+	ecmd->autoneg = AUTONEG_ENABLE;
 	ecmd->transceiver = XCVR_EXTERNAL;
+	if (hw->phy.media_type == ixgbe_media_type_copper) {
+		ecmd->supported |= (SUPPORTED_1000baseT_Full |
+				    SUPPORTED_TP | SUPPORTED_Autoneg);
+
+		ecmd->advertising = (ADVERTISED_TP | ADVERTISED_Autoneg);
+		if (hw->phy.autoneg_advertised & IXGBE_LINK_SPEED_10GB_FULL)
+			ecmd->advertising |= ADVERTISED_10000baseT_Full;
+		if (hw->phy.autoneg_advertised & IXGBE_LINK_SPEED_1GB_FULL)
+			ecmd->advertising |= ADVERTISED_1000baseT_Full;
+
+		ecmd->port = PORT_TP;
+	} else {
+		ecmd->supported |= SUPPORTED_FIBRE;
+		ecmd->advertising = (ADVERTISED_10000baseT_Full |
+				     ADVERTISED_FIBRE);
+		ecmd->port = PORT_FIBRE;
+	}
 
-	if (netif_carrier_ok(adapter->netdev)) {
-		ecmd->speed = SPEED_10000;
+	adapter->hw.mac.ops.check_link(hw, &(link_speed), &link_up);
+	if (link_up) {
+		ecmd->speed = (link_speed == IXGBE_LINK_SPEED_10GB_FULL) ?
+			       SPEED_10000 : SPEED_1000;
 		ecmd->duplex = DUPLEX_FULL;
 	} else {
 		ecmd->speed = -1;
 		ecmd->duplex = -1;
 	}
 
-	ecmd->autoneg = AUTONEG_DISABLE;
 	return 0;
 }
 
@@ -125,17 +145,17 @@ static int ixgbe_set_settings(struct net_device *netdev,
                               struct ethtool_cmd *ecmd)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	struct ixgbe_hw *hw = &adapter->hw;
 
-	if (ecmd->autoneg == AUTONEG_ENABLE ||
-	    ecmd->speed + ecmd->duplex != SPEED_10000 + DUPLEX_FULL)
-		return -EINVAL;
-
-	if (netif_running(adapter->netdev)) {
-		ixgbe_down(adapter);
-		ixgbe_reset(adapter);
-		ixgbe_up(adapter);
-	} else {
-		ixgbe_reset(adapter);
+	switch (hw->phy.media_type) {
+	case ixgbe_media_type_fiber:
+		if ((ecmd->autoneg == AUTONEG_ENABLE) ||
+		    (ecmd->speed + ecmd->duplex != SPEED_10000 + DUPLEX_FULL))
+			return -EINVAL;
+		/* in this case we currently only support 10Gb/FULL */
+		break;
+	default:
+		break;
 	}
 
 	return 0;
[PATCH 7/8] ixgbe: fix several counter register errata
From: Ayyappan Veeraiyan [EMAIL PROTECTED]

Several counters behave differently on 82598 causing them to display incorrect values. Adjust the accounting so the reported numbers make sense and do not double count or represent the wrong item.

Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 drivers/net/ixgbe/ixgbe_main.c | 53 +++-
 1 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index ee5ee10..6e7d90e 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -2026,22 +2026,26 @@ static int ixgbe_close(struct net_device *netdev)
 void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
-	u64 good_rx, missed_rx, bprc;
+	u64 total_mpc = 0;
+	u32 i, missed_rx = 0, mpc, bprc, lxon, lxoff, xon_off_tot;
 
 	adapter->stats.crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
-	good_rx = IXGBE_READ_REG(hw, IXGBE_GPRC);
-	missed_rx = IXGBE_READ_REG(hw, IXGBE_MPC(0));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(1));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(2));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(3));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(4));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(5));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(6));
-	missed_rx += IXGBE_READ_REG(hw, IXGBE_MPC(7));
-	adapter->stats.gprc += (good_rx - missed_rx);
-
-	adapter->stats.mpc[0] += missed_rx;
+	for (i = 0; i < 8; i++) {
+		/* for packet buffers not used, the register should read 0 */
+		mpc = IXGBE_READ_REG(hw, IXGBE_MPC(i));
+		missed_rx += mpc;
+		adapter->stats.mpc[i] += mpc;
+		total_mpc += adapter->stats.mpc[i];
+		adapter->stats.rnbc[i] += IXGBE_READ_REG(hw, IXGBE_RNBC(i));
+	}
+	adapter->stats.gprc += IXGBE_READ_REG(hw, IXGBE_GPRC);
+	/* work around hardware counting issue */
+	adapter->stats.gprc -= missed_rx;
+
+	/* 82598 hardware only has a 32 bit counter in the high register */
 	adapter->stats.gorc += IXGBE_READ_REG(hw, IXGBE_GORCH);
+	adapter->stats.gotc += IXGBE_READ_REG(hw, IXGBE_GOTCH);
+	adapter->stats.tor += IXGBE_READ_REG(hw, IXGBE_TORH);
 	bprc = IXGBE_READ_REG(hw, IXGBE_BPRC);
 	adapter->stats.bprc += bprc;
 	adapter->stats.mprc += IXGBE_READ_REG(hw, IXGBE_MPRC);
@@ -2053,28 +2057,34 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 	adapter->stats.prc511 += IXGBE_READ_REG(hw, IXGBE_PRC511);
 	adapter->stats.prc1023 += IXGBE_READ_REG(hw, IXGBE_PRC1023);
 	adapter->stats.prc1522 += IXGBE_READ_REG(hw, IXGBE_PRC1522);
-
 	adapter->stats.rlec += IXGBE_READ_REG(hw, IXGBE_RLEC);
 	adapter->stats.lxonrxc += IXGBE_READ_REG(hw, IXGBE_LXONRXC);
-	adapter->stats.lxontxc += IXGBE_READ_REG(hw, IXGBE_LXONTXC);
 	adapter->stats.lxoffrxc += IXGBE_READ_REG(hw, IXGBE_LXOFFRXC);
-	adapter->stats.lxofftxc += IXGBE_READ_REG(hw, IXGBE_LXOFFTXC);
+	lxon = IXGBE_READ_REG(hw, IXGBE_LXONTXC);
+	adapter->stats.lxontxc += lxon;
+	lxoff = IXGBE_READ_REG(hw, IXGBE_LXOFFTXC);
+	adapter->stats.lxofftxc += lxoff;
 	adapter->stats.ruc += IXGBE_READ_REG(hw, IXGBE_RUC);
 	adapter->stats.gptc += IXGBE_READ_REG(hw, IXGBE_GPTC);
-	adapter->stats.gotc += IXGBE_READ_REG(hw, IXGBE_GOTCH);
-	adapter->stats.rnbc[0] += IXGBE_READ_REG(hw, IXGBE_RNBC(0));
+	adapter->stats.mptc += IXGBE_READ_REG(hw, IXGBE_MPTC);
+	/*
+	 * 82598 errata - tx of flow control packets is included in tx counters
+	 */
+	xon_off_tot = lxon + lxoff;
+	adapter->stats.gptc -= xon_off_tot;
+	adapter->stats.mptc -= xon_off_tot;
+	adapter->stats.gotc -= (xon_off_tot * (ETH_ZLEN + ETH_FCS_LEN));
 	adapter->stats.ruc += IXGBE_READ_REG(hw, IXGBE_RUC);
 	adapter->stats.rfc += IXGBE_READ_REG(hw, IXGBE_RFC);
 	adapter->stats.rjc += IXGBE_READ_REG(hw, IXGBE_RJC);
-	adapter->stats.tor += IXGBE_READ_REG(hw, IXGBE_TORH);
 	adapter->stats.tpr += IXGBE_READ_REG(hw, IXGBE_TPR);
 	adapter->stats.ptc64 += IXGBE_READ_REG(hw, IXGBE_PTC64);
+	adapter->stats.ptc64 -= xon_off_tot;
 	adapter->stats.ptc127 += IXGBE_READ_REG(hw, IXGBE_PTC127);
 	adapter->stats.ptc255 += IXGBE_READ_REG(hw, IXGBE_PTC255);
 	adapter->stats.ptc511 += IXGBE_READ_REG(hw, IXGBE_PTC511);
 	adapter->stats.ptc1023 += IXGBE_READ_REG(hw, IXGBE_PTC1023);
 	adapter->stats.ptc1522 += IXGBE_READ_REG(hw, IXGBE_PTC1522);
-	adapter->stats.mptc += IXGBE_READ_REG(hw, IXGBE_MPTC);
 	adapter->stats.bptc += IXGBE_READ_REG(hw, IXGBE_BPTC);
 
 	/* Fill out the OS statistics structure */
@@ -2090,8 +2100,7 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
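
The flow-control erratum adjustment in the second hunk can be illustrated outside the driver. What follows is only a minimal standalone sketch of the same arithmetic, not driver code: the struct `tx_counters` and the function `correct_for_flow_control()` are hypothetical names, and the two `SKETCH_` constants simply mirror the values of ETH_ZLEN (60) and ETH_FCS_LEN (4) from <linux/if_ether.h>. Only the subtraction pattern itself is taken from the patch.

```c
#include <stdint.h>

/* Mirror ETH_ZLEN and ETH_FCS_LEN from <linux/if_ether.h>. */
#define SKETCH_ETH_ZLEN    60
#define SKETCH_ETH_FCS_LEN 4

/* Hypothetical container for the TX counters the patch corrects. */
struct tx_counters {
	uint64_t gptc;  /* good packets transmitted */
	uint64_t mptc;  /* multicast packets transmitted */
	uint64_t ptc64; /* 64-byte packets transmitted */
	uint64_t gotc;  /* good octets transmitted */
};

/*
 * Remove XON/XOFF pause frames from counters that wrongly include them.
 * lxon and lxoff are the values read from the link XON/XOFF TX registers
 * during the same update pass. Each pause frame is a minimum-size frame,
 * so it accounts for ETH_ZLEN + ETH_FCS_LEN = 64 bytes in the octet count.
 */
static void correct_for_flow_control(struct tx_counters *c,
				     uint32_t lxon, uint32_t lxoff)
{
	uint64_t xon_off_tot = (uint64_t)lxon + lxoff;

	c->gptc -= xon_off_tot;
	c->mptc -= xon_off_tot;
	c->ptc64 -= xon_off_tot;
	c->gotc -= xon_off_tot * (SKETCH_ETH_ZLEN + SKETCH_ETH_FCS_LEN);
}
```

As in the patch, the correction is meant to run after the raw register values for the same pass have been added to the running totals, so the subtraction never exceeds what was just accumulated.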
[PATCH 8/8] ixgbe: add real-time traffic counters
From: Ayyappan Veeraiyan [EMAIL PROTECTED]

Just like our other drivers before we can switch ixgbe to provide real-time packet/byte counters to the stack easily.

Signed-off-by: Ayyappan Veeraiyan [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 drivers/net/ixgbe/ixgbe_main.c | 15 +++
 1 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 6e7d90e..ead49e5 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -275,6 +275,8 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_adapter *adapter,
 	if (total_tx_packets >= tx_ring->work_limit)
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_EICS, tx_ring->eims_value);
 
+	adapter->net_stats.tx_bytes += total_tx_bytes;
+	adapter->net_stats.tx_packets += total_tx_packets;
 	cleaned = total_tx_packets ? true : false;
 	return cleaned;
 }
@@ -443,6 +445,7 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
 	u16 hdr_info, vlan_tag;
 	bool is_vlan, cleaned = false;
 	int cleaned_count = 0;
+	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 
 	i = rx_ring->next_to_clean;
 	upper_len = 0;
@@ -522,6 +525,11 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_adapter *adapter,
 		}
 
 		ixgbe_rx_checksum(adapter, staterr, skb);
+
+		/* probably a little skewed due to removing CRC */
+		total_rx_bytes += skb->len;
+		total_rx_packets++;
+
 		skb->protocol = eth_type_trans(skb, netdev);
 		ixgbe_receive_skb(adapter, skb, is_vlan, vlan_tag);
 		netdev->last_rx = jiffies;
@@ -550,6 +558,9 @@ next_desc:
 	if (cleaned_count)
 		ixgbe_alloc_rx_buffers(adapter, rx_ring, cleaned_count);
 
+	adapter->net_stats.rx_bytes += total_rx_bytes;
+	adapter->net_stats.rx_packets += total_rx_packets;
+
 	return cleaned;
 }
 
@@ -2088,10 +2099,6 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 	adapter->stats.bptc += IXGBE_READ_REG(hw, IXGBE_BPTC);
 
 	/* Fill out the OS statistics structure */
-	adapter->net_stats.rx_packets = adapter->stats.gprc;
-	adapter->net_stats.tx_packets = adapter->stats.gptc;
-	adapter->net_stats.rx_bytes = adapter->stats.gorc;
-	adapter->net_stats.tx_bytes = adapter->stats.gotc;
 	adapter->net_stats.multicast = adapter->stats.mprc;
 
 	/* Rx Errors */
why does DCCP SO_REUSEADDR have to be SOL_DCCP?
Hi - I'm tweaking the netperf omni tests to be able to run over DCCP. I've run across a not-unprecedented problem with getaddrinfo() not grokking either SOCK_DCCP or IPPROTO_DCCP in the hints, and that I can more or less live with - I had to do a kludge for getaddrinfo() for IPPROTO_SCTP under Linux at one point and I can see how the two are not necessarily going to be in sync. And I've worked around the fact that no user-level include files (i.e. without setting __KERNEL__) define the DCCP stuff, and that is OK too, albeit somewhat inconvenient.

My question though is why on earth does an SO_REUSEADDR setsockopt() against a DCCP socket have to be SOL_DCCP? SCTP and TCP are quite happy with SOL_SOCKET, and it might be foolish consistency, but since the option _does_ begin with SO_ I'd have expected it to work for SOL_SOCKET, but (again RHEL5.1, yes, I do plan on getting upstream but have to satisfy several masters) it doesn't seem to be the case - a subsequent listen() or connect() call after an SOL_SOCKET SO_REUSEADDR against a DCCP socket leaves one SOL as it were...

Of course the setsockopt(SO_REUSEADDR) against the DCCP socket using SOL_SOCKET itself doesn't fail, only the later listen() or connect() call...

happy benchmarking,

rick jones
linux-2.6.24 compile error in drivers/net/b44.c
drivers/net/b44.c: In function 'b44_remove_one':
drivers/net/b44.c:2231: error: implicit declaration of function 'ssb_pcihost_set_power_state'
make[2]: *** [drivers/net/b44.o] Error 1
make[1]: *** [drivers/net] Error 2

I think it is caused by:

CONFIG_SSB_PCIHOST=n
CONFIG_B44=y
RE: [PATCH] Disable TSO for non standard qdiscs
I totally disagree with these POVs:

- 10G cards should be treated by default as 10G cards - not DSL modems, and common users shouldn't have to read any warnings or configs to see this.
- tc with TBF or HTB are professional tools; some warnings should be added to the manuals. But trying to change the way they work because we think we know better what users want, and changing BTW some other things (making debugging this later a hell), is simply disrespectful for target users of these tools. There are some wrappers or creators invented for this.

And, BTW, I think I've seen somewhere a system which does this the other way - with creators for professionals. So, you could be right with this too...

Ok, maybe I'm not done quite yet. Jarek is echoing my original point: changing the behavior of the tool automatically (these qdiscs in question) is not good for a normal end user. It might be fine for kernel developers, but not users of these tools, IMO. A less disruptive approach, such as a warning message printed when loading the qdisc if TSO is enabled, and documenting recommended usage, I think is more prudent here.

Cheers,
-PJ Waskiewicz
Re: why does DCCP SO_REUSEADDR have to be SOL_DCCP?
On Fri, Feb 01, 2008 at 05:42:23PM -0800, Rick Jones wrote:
> Hi - I'm tweaking the netperf omni tests to be able to run over DCCP. I've run across a not-unprecedented problem with getaddrinfo() not grokking either SOCK_DCCP or IPPROTO_DCCP in the hints, and that I can more or less live with - I had to do a kludge for getaddrinfo() for IPPROTO_SCTP under Linux at one point and I can see how the two are not necessarily going to be in sync.

See the ttcp patch where we do a crude xgetaddrinfo hack to handle dccp:

http://vger.kernel.org/~acme/dccp/ttcp.c

> And I've worked-around no user-level include files (ie without setting __KERNEL__) define the DCCP stuff, and that is OK too, albeit somewhat inconvenient.

Humm, for what? Again, see the ttcp code above.

> My question though is why on earth does an SO_REUSEADDR setsockopt() against a DCCP socket have to be SOL_DCCP? SCTP and TCP are quite happy with SOL_SOCKET, and it might be foolish consistency, but since the option _does_ begin with SO_ I'd have expected it to work for SOL_SOCKET, but (again RHEL5.1, yes, I do plan on getting upstream but have to satisfy several masters) it doesn't seem to be the case - a subsequent listen() or connect() call after an SOL_SOCKET SO_REUSEADDR against a DCCP socket leaves one SOL as it were...

Strange, lemme check...

 1. sys_socketcall ->
 2.   sys_setsockopt ->
 3.     if (level == SOL_SOCKET) {
 4.       sock_setsockopt:
 5.         case SO_REUSEADDR:
 6.           sk->sk_reuse = valbool;
 7.     } else
 8.       sock->ops->setsockopt = inet_dccp_ops->setsockopt =
 9.       inet_dccp_ops->setsockopt = sock_common_setsockopt ->
10.         sk->sk_prot->setsockopt = dccp_v4_prot->setsockopt =
11.         dccp_setsockopt
12.           if (level != SOL_DCCP)
13.             return inet_csk(sk)->icsk_af_ops->setsockopt() =
14.               ip_setsockopt
15.           return do_dccp_setsockopt()

SO_REUSEADDR is handled in 4, if you pass SOL_SOCKET. If instead you pass SOL_DCCP we'll go down the rabbit hole till do_dccp_setsockopt() and SO_REUSEADDR, that is equal to 2, will be interpreted as DCCP_SOCKOPT_SERVICE, that is also equal to 2, so you'll be setting the service, not changing the SO_REUSEADDR setting.

The problem here is that you need to use:

setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_SERVICE, &service, sizeof(service));

Again, take a look at the ttcp patch; the other patches for iperf, netcat, etc. handle this.

> Of course the setsockopt(SO_REUSEADDR) against the DCCP socket using SOL_SOCKET itself doesn't fail, only the later listen() or connect() call...
>
> happy benchmarking,

Looking forward to a happy DCCP netperf benchmarking session!

Thanks a lot,

- Arnaldo
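
The two calls discussed in this exchange can be put side by side in a small helper. This is only a sketch under stated assumptions, not code from ttcp or netperf: the helper name `dccp_prepare_socket()` is hypothetical, the `#ifndef` fallbacks carry the numeric values I believe the Linux headers use (SOCK_DCCP 6, IPPROTO_DCCP 33, SOL_DCCP 269, DCCP_SOCKOPT_SERVICE 2), and passing the service code in network byte order via htonl() is an assumption about what the kernel expects.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fallbacks for user-space headers that do not define the DCCP names. */
#ifndef SOCK_DCCP
#define SOCK_DCCP 6
#endif
#ifndef IPPROTO_DCCP
#define IPPROTO_DCCP 33
#endif
#ifndef SOL_DCCP
#define SOL_DCCP 269
#endif
#ifndef DCCP_SOCKOPT_SERVICE
#define DCCP_SOCKOPT_SERVICE 2 /* same number as SO_REUSEADDR on most Linux ABIs */
#endif

/*
 * Hypothetical helper: apply both options to an already created
 * socket(AF_INET, SOCK_DCCP, IPPROTO_DCCP) descriptor.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int dccp_prepare_socket(int fd, uint32_t service_code)
{
	int on = 1;
	uint32_t service = htonl(service_code); /* assumption: network order */

	/* Address reuse is a generic socket option: level SOL_SOCKET. */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
		return -1;

	/* The service code is DCCP specific: level SOL_DCCP. */
	return setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_SERVICE,
			  &service, sizeof(service));
}
```

The helper would be called after socket() and before listen() or connect(), which are the calls reported as failing when no service code has been set.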
Re: why does DCCP SO_REUSEADDR have to be SOL_DCCP?
On Sat, Feb 02, 2008 at 12:52:59AM -0200, Arnaldo Carvalho de Melo wrote:
> If instead you pass SOL_DCCP we'll go down the rabbit hole till do_dccp_setsockopt() and SO_REUSEADDR, that is equal to 2, will be interpreted as DCCP_SOCKOPT_SERVICE, that is also equal to 2, so you'll be setting the service, not changing the SO_REUSEADDR setting.
>
> The problem here is that you need to use:
>
> setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_SERVICE, &service, sizeof(service));

Further info on DCCP service codes:

http://www.rfc.net/rfc4340.txt - 8.1.2. Service Codes

> Again, take a look at the ttcp patch; the other patches for iperf, netcat, etc. handle this.

- Arnaldo
Re: [PATCH] Disable TSO for non standard qdiscs
On Fri, Feb 01, 2008 at 01:58:30PM -0800, Rick Jones wrote:
> Does this also imply that JumboFrames interacts badly with these qdiscs? Or IPoIB with its 65000ish byte MTU?

Correct. Of course it is always relative to the link speed. So if your link is 10x faster and your packets 10x bigger you can get similarly smooth shaping.

> If the later-in-thread mentioned person shaping for their DSL line happens to have enabled JumboFrames on their GbE network, will/should the qdisc negate that?

I don't think so, mostly because jumbo frames are not enabled by default. I'm only concerned about usable defaults there -- if you set non default options you should certainly know what you're doing.

There are other reasons to not use jumbo frames anyways; e.g. a lot of cards still do not support SG for them but only process them as a single continuous buffer in memory so you often run into memory fragmentation problems.

> Or is the qdisc currently assuming that the remote end of the DSL will have asked for a smaller MSS?

First there are lots of different qdiscs that all do different things. Take a look at net/sched/*. Then they usually don't strictly require particular MTUs (or know anything about MSS), but tend to work better with smaller MTUs because that allows more choices in packet scheduling. Generally the larger your packets the less they can be scheduled.

-Andi
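
The "relative to the link speed" point can be made concrete by computing how long one packet occupies the wire, since that is the smallest unit a qdisc can schedule. The helper below is an illustrative sketch only; the function name `wire_time_us()` and the example link rates are assumptions, not figures from the thread.

```c
#include <stdint.h>

/*
 * Time in microseconds that one packet occupies the link.
 * link_bps is the link rate in bits per second and must be non-zero.
 * Framing overhead (preamble, inter-frame gap) is ignored.
 */
static uint64_t wire_time_us(uint64_t packet_bytes, uint64_t link_bps)
{
	return packet_bytes * 8u * 1000000u / link_bps;
}
```

With these assumptions a 1500 byte packet on a 1 Mbit/s link takes 12000 us, and a 15000 byte packet on a 10 Mbit/s link takes the same 12000 us, which is the "10x faster link, 10x bigger packets" case. A 65536 byte TSO-sized burst on the 1 Mbit/s link would hold it for 524288 us, roughly half a second during which the shaper can schedule nothing else.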
Re: [Bugme-new] [Bug 9873] New: BUG at net/ipv4/icmp.c:874
On Fri, 1 Feb 2008 17:21:34 -0800 (PST) [EMAIL PROTECTED] wrote:

http://bugzilla.kernel.org/show_bug.cgi?id=9873

Summary: BUG at net/ipv4/icmp.c:874
Product: Networking
Version: 2.5
KernelVersion: 2.6.24-06481-gaa62999
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: IPV4
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]

Latest working kernel version: -
Earliest failing kernel version: 2.6.24-06481-gaa62999
Distribution: Ubuntu
Problem Description: using icmpsic on a box triggers the oops
Steps to reproduce: starting

icmpsic -s 192.168.0.201 -d 192.168.0.201 -p 35000 -r 22361 -k 33000

on the machine containing those interfaces will completely lock it up

[ 360.552115] [ cut here ]
[ 360.552235] kernel BUG at net/ipv4/icmp.c:874!
[ 360.552235] invalid opcode: [#1] PREEMPT DEBUG_PAGEALLOC
[ 360.552235] Modules linked in:
[ 360.552235]
[ 360.552235] Pid: 3899, comm: icmpsic Not tainted (2.6.24-06481-gaa62999 #20)
[ 360.552235] EIP: 0060:[c05eb1b3] EFLAGS: 00010282 CPU: 0
[ 360.552235] EIP is at icmp_timestamp+0x83/0xd0
[ 360.552235] EAX: fff2 EBX: cae7d700 ECX: EDX: fffc
[ 360.552235] ESI: 003f04d0 EDI: caded000 EBP: c08efeb4 ESP: c08efe48
[ 360.552235] DS: 007b ES: 007b FS: GS: 0033 SS: 0068
[ 360.552235] Process icmpsic (pid: 3899, ti=c08ef000 task=cacf task.ti=cacbf000)
[ 360.552235] Stack: 0004 0001 cf449060 c0bfb720 cf0df4f0 cf0df4f0 cf081850 d8053f00
[ 360.552235]        d8053f00 0004 cae7d700 c08efed4 c0899628 c08efe90 c0603442
[ 360.552235]        cf097090 c0603420 c08efeb4 c05b73dd c05c5c10 0001 47a3c2a2
[ 360.552235] Call Trace:
[ 360.552235] [c0603442] ? ipt_hook+0x22/0x30
[ 360.552235] [c0603420] ? ipt_hook+0x0/0x30
[ 360.552235] [c05b73dd] ? nf_iterate+0x5d/0x90
[ 360.552235] [c05c5c10] ? ip_local_deliver_finish+0x0/0x170
[ 360.552235] [c05eadf6] ? icmp_rcv+0xe6/0x200
[ 360.552235] [c05c5c77] ? ip_local_deliver_finish+0x67/0x170
[ 360.552235] [c05c60ed] ? ip_local_deliver+0x2d/0xa0
[ 360.552235] [c05c5c10] ? ip_local_deliver_finish+0x0/0x170
[ 360.552235] [c05c59cf] ? ip_rcv_finish+0xdf/0x320
[ 360.552235] [c05b74ca] ? nf_hook_slow+0xba/0xe0
[ 360.552235] [c05c58f0] ? ip_rcv_finish+0x0/0x320
[ 360.552235] [c05c5feb] ? ip_rcv+0x16b/0x240
[ 360.552235] [c05c58f0] ? ip_rcv_finish+0x0/0x320
[ 360.552235] [c05c5e80] ? do_softirq+0x8a/0xd0
[ 360.552235] [c0128eb4] ? local_bh_enable+0xa4/0x110
[ 360.552235] [c05a43d0] ? dev_queue_xmit+0xa0/0x340
[ 360.552235] [c01547ad] ? __rcu_read_unlock+0x7d/0x90
[ 360.552235] [c05c9d4d] ? ip_finish_output+0x12d/0x2d0
[ 360.552235] [c05ca9a9] ? ip_output+0x79/0xd0
[ 360.552235] [c05e4b40] ? dst_output+0x0/0x10
[ 360.552235] [c05e4f71] ? raw_send_hdrinc+0x121/0x310
[ 360.552235] [c05e4b40] ? dst_output+0x0/0x10
[ 360.552235] [c05e60cd] ? raw_sendmsg+0x36d/0x3a0
[ 360.552235] [c05ee914] ? inet_sendmsg+0x34/0x60
[ 360.552235] [c0595c54] ? sock_sendmsg+0xc4/0xf0
[ 360.552235] [c01373b0] ? autoremove_wake_function+0x0/0x50
[ 360.552235] [c01050d3] ? restore_nocheck+0x12/0x15
[ 360.552235] [c0144bf4] ? trace_hardirqs_on+0xc4/0x150
[ 360.552235] [c01050d3] ? restore_nocheck+0x12/0x15
[ 360.552235] [c043ff06] ? copy_from_user+0x46/0x80
[ 360.552235] [c0595f45] ? __lock_release+0x46/0x70
[ 360.552235] [c01070e5] ? do_softirq+0x55/0xd0
[ 360.552235] [c0596e37] ? sys_socketcall+0x187/0x260
[ 360.552235] [c0104fea] ? sysenter_past_esp+0x5f/0xa5
[ 360.552235] ===
[ 360.552235] Code: f7 ea 69 f6 e8 03 00 00 c1 f9 1f c1 fa 06 29 ca 8d 04 16 31 d2 0f c8 8d 4d ac 89 45 b0 89 45 b4 89 d8 e8 c1 02 89 5d 98 c7 45 9c 00 00 00
[ 360.552235] EIP: [c05eb1b3] icmp_timestamp+0x83/0xd0 SS:ESP 0068:c08efe48
[ 360.552276] Kernel panic - not syncing: Fatal exception in interrupt

Using the icmpsic command from another box doesn't do a thing, using 127.0.0.1 will also work.
Re: [PATCH 4/5] ehea: fix phyp checkpatch complaints
On Fri, 01 Feb 2008 13:23:45 CST, Scott Wood wrote:
> On Thu, Jan 31, 2008 at 08:20:50PM -0600, Doug Maxey wrote:
> >  /* input param R5 */
> > -#define H_ALL_RES_QP_EQPO EHEA_BMASK_IBM(9, 11)
> ...
> > +#define H_ALL_RES_QP_EQPO	EHEA_BMASK_IBM(9, 11)
> ...
>
> This was better the way it was (before, it was readable at any tab setting); checkpatch is overeager to complain on tab/space issues (it's a bit hard to distinguish indentation from alignment with a regex).

In emacs, with no special offsets, the lines appear to still line up. What did happen was spaces were turned to tabs where applicable.

What editor shows a bad alignment?

++doug