PKTGEN: Adds thread limit parameter.
Hello! Really needed? If so -- Wouldn't a concept of a bitmask to control also which CPU's that runs the threads be more general? Cheers. --ro Luiz Fernando Capitulino writes: Currently, pktgen will create one thread for each online CPU in the system. That behaivor may be annoying if you're using pktgen in a large SMP system. This patch adds a new module parameter called 'pg_max_threads', which can be set to the maximum number of threads pktgen should create. For example, if you're playing with pktgen in SMP system with 8 CPUs, but want only two pktgen's threads, you can do: modprobe pktgen pg_max_threads=2 Signed-off-by: Luiz Capitulino [EMAIL PROTECTED] --- net/core/pktgen.c | 23 +-- 1 files changed, 17 insertions(+), 6 deletions(-) e210ad47d0db52496fdaabdd50bfe6ee0bcc53cd diff --git a/net/core/pktgen.c b/net/core/pktgen.c index 37b25a6..994aef1 100644 --- a/net/core/pktgen.c +++ b/net/core/pktgen.c @@ -154,7 +154,7 @@ #include asm/div64.h /* do_div */ #include asm/timex.h -#define VERSION pktgen v2.66: Packet Generator for packet performance testing.\n +#define VERSION pktgen v2.67: Packet Generator for packet performance testing.\n /* #define PG_DEBUG(a) a */ #define PG_DEBUG(a) @@ -488,6 +488,7 @@ static unsigned int fmt_ip6(char *s, con static int pg_count_d = 1000; /* 1000 pkts by default */ static int pg_delay_d; static int pg_clone_skb_d; +static int pg_max_threads; static int debug; static DEFINE_MUTEX(pktgen_thread_lock); @@ -3184,7 +3185,7 @@ static int pktgen_remove_device(struct p static int __init pg_init(void) { -int cpu; +int i, threads; struct proc_dir_entry *pe; printk(version); @@ -3208,15 +3209,24 @@ static int __init pg_init(void) /* Register us to receive netdevice events */ register_netdevice_notifier(pktgen_notifier_block); -for_each_online_cpu(cpu) { +threads = num_online_cpus(); + +/* + * Check if we should have less threads than the number + * of online CPUs + */ +if ((pg_max_threads 0) (pg_max_threads threads)) +threads = pg_max_threads; + +for (i = 0; i threads; i++) { int err; char buf[30]; -sprintf(buf, kpktgend_%i, cpu); -err = pktgen_create_thread(buf, cpu); +sprintf(buf, kpktgend_%i, i); +err = pktgen_create_thread(buf, i); if (err) printk(pktgen: WARNING: Cannot create thread for cpu %d (%d)\n, -cpu, err); +i, err); } if (list_empty(pktgen_threads)) { @@ -3263,4 +3273,5 @@ MODULE_LICENSE(GPL); module_param(pg_count_d, int, 0); module_param(pg_delay_d, int, 0); module_param(pg_clone_skb_d, int, 0); +module_param(pg_max_threads, int, 0); module_param(debug, int, 0); -- 1.2.4.gbe76 -- Luiz Fernando N. Capitulino - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[TCP]: Fix zero port problem in IPv6
Hi Dave: I was looking into the bug report titled [IPv6 since 2.6.12] connect() without bind() can't time out. There are actually a couple of different issues here. Here is a fix to the most obvious one. [TCP]: Fix zero port problem in IPv6 When we link a socket into the hash table, we need to make sure that we set the num/port fields so that it shows us with a non-zero port value in proc/netlink and on the wire. This code and comment is copied over from the IPv4 stack as is. Signed-off-by: Herbert Xu [EMAIL PROTECTED] Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -87,7 +87,7 @@ static int __inet6_check_established(str struct inet_timewait_sock **twp) { struct inet_hashinfo *hinfo = death_row-hashinfo; - const struct inet_sock *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); const struct ipv6_pinfo *np = inet6_sk(sk); const struct in6_addr *daddr = np-rcv_saddr; const struct in6_addr *saddr = np-daddr; @@ -129,6 +129,10 @@ static int __inet6_check_established(str } unique: + /* Must record num and sport now. Otherwise we will see +* in hash table socket with a funny identity. */ + inet-num = lport; + inet-sport = htons(lport); BUG_TRAP(sk_unhashed(sk)); __sk_add_node(sk, head-chain); sk-sk_hash = hash;
Re: Possible bug in PFKEY implementation...
Hi, On Sunday 12 March 2006 23.29, Stjepan Gros wrote: setkey command behaves strangely when SPD is large. Either because I'm doing something wrong or because there is a bug. I believe it's a bug, but who knows... Anyway, after 529 items it simply stops displaying items from SPD with a message recv: Resource temporarily unavailable This has been discussed a couple of times on netdev and on the ipsec-tools development list. You can find more info in these threads, for example: http://marc.theaimsgroup.com/?t=4171168r=1w=2 http://marc.theaimsgroup.com/?t=11387258982r=1w=2 http://sourceforge.net/mailarchive/forum.php?thread_id=9674898forum_id=32000 As a workaround for the problem you could try increasing the size of the socket buffers available for PF_KEY sockets. Unfortunately you still have to patch ipsec-tools for this to work, because for some unknown reason it forces 128K buffers on all pfkey sockets. -- Regards, Krisztian Kovacs - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Router stops routing after changing MAC Address
On eth0 - no. My fudged MAC Address is based on the IP Address. So 1.2.3.50 becomes 001.002.003.050, which turns into 00:10:02:00:30:50. But 1.2.3 is fake - it isn't the one I really use. The other one, 172.16.16.3 - that is a real IP Address that turns into 17:20:16:01:60:03. And here I thought I was pretty clever - it never dawned on me in my wildest dreams that those bits had any special meaning! I will do some homework about what all the bits mean and then put together another scheme for my fudged IP Addresses and post the results here. - Greg -Original Message- From: Chuck Ebbert [mailto:[EMAIL PROTECTED] Sent: Monday, March 13, 2006 12:11 AM To: Greg Scott Cc: linux-kernel; David S. Miller Subject: Re: Router stops routing after changing MAC Address In-Reply-To: [EMAIL PROTECTED] On Fri, 10 Mar 2006 18:33:15 -0600, Greg Scott wrote: How to change MAC addresses is documented well enough - and it works - but when I change MAC addresses, my router stops routing. From the router, I can see the systems on both sides - but the router just refuses to forward packets. Here are my little test scripts to change MAC Addresses. First - ip-fudge-mac.sh [EMAIL PROTECTED] gregs]# more ip-fudge-mac.sh ip link set eth0 down ip link set eth0 address 01:02:03:04:05:06 ^ Bit zero is set, so this is a multicast address. Is that intentional? ip link set eth0 up ip link set eth1 down ip link set eth1 address 17:20:16:01:60:03 ^ Ditto. ip link set eth1 up echo 1 /proc/sys/net/ipv4/ip_forward -- Chuck Penguins don't come from next door, they come from the Antarctic! - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PKTGEN: Adds thread limit parameter.
Hi Robert, On Mon, 13 Mar 2006 10:44:49 +0100 Robert Olsson [EMAIL PROTECTED] wrote: | | Hello! | | Really needed? Well, I wouldn't say it's _really_ needed. But it really avoids having too many thread entries in the pktgen's /proc directory, and as a good result, you will not have pending threads which will never run as well. Also note that the patch is trivial, if you look at it in detail, you'll see that the biggest change we have is the 'if' part. The rest I would call cosmetic because the behaivor is the same. | If so -- Wouldn't a concept of a bitmask to control also which CPU's | that runs the threads be more general? Sounds like a bit complex, and would be my turn to ask if it's really needed. :) BTW Robert, my TODO list for pktgen so far is: * Some minor fixes and cleanups, like functions returns being not checked. * A new command called 'rem_device' to remove one device at a time (currently, we can only remove all devices in one shoot with 'rem_devices_all') * Ports pktgen to use the kernel thread API * cleanup the debug function usage Would be good to hear from you if they really makes sense. Thank you for the feedback, | Luiz Fernando Capitulino writes: | | Currently, pktgen will create one thread for each online CPU in the | system. That behaivor may be annoying if you're using pktgen in a | large SMP system. | | This patch adds a new module parameter called 'pg_max_threads', which | can be set to the maximum number of threads pktgen should create. | | For example, if you're playing with pktgen in SMP system with 8 | CPUs, but want only two pktgen's threads, you can do: | | modprobe pktgen pg_max_threads=2 | | Signed-off-by: Luiz Capitulino [EMAIL PROTECTED] | | --- | |net/core/pktgen.c | 23 +-- |1 files changed, 17 insertions(+), 6 deletions(-) | | e210ad47d0db52496fdaabdd50bfe6ee0bcc53cd | diff --git a/net/core/pktgen.c b/net/core/pktgen.c | index 37b25a6..994aef1 100644 | --- a/net/core/pktgen.c | +++ b/net/core/pktgen.c | @@ -154,7 +154,7 @@ |#include asm/div64.h/* do_div */ |#include asm/timex.h | | -#define VERSION pktgen v2.66: Packet Generator for packet performance testing.\n | +#define VERSION pktgen v2.67: Packet Generator for packet performance testing.\n | |/* #define PG_DEBUG(a) a */ |#define PG_DEBUG(a) | @@ -488,6 +488,7 @@ static unsigned int fmt_ip6(char *s, con |static int pg_count_d = 1000; /* 1000 pkts by default */ |static int pg_delay_d; |static int pg_clone_skb_d; | +static int pg_max_threads; |static int debug; | |static DEFINE_MUTEX(pktgen_thread_lock); | @@ -3184,7 +3185,7 @@ static int pktgen_remove_device(struct p | |static int __init pg_init(void) |{ | - int cpu; | + int i, threads; | struct proc_dir_entry *pe; | | printk(version); | @@ -3208,15 +3209,24 @@ static int __init pg_init(void) | /* Register us to receive netdevice events */ | register_netdevice_notifier(pktgen_notifier_block); | | - for_each_online_cpu(cpu) { | + threads = num_online_cpus(); | + | + /* | + * Check if we should have less threads than the number | + * of online CPUs | + */ | + if ((pg_max_threads 0) (pg_max_threads threads)) | + threads = pg_max_threads; | + | + for (i = 0; i threads; i++) { | int err; | char buf[30]; | | - sprintf(buf, kpktgend_%i, cpu); | - err = pktgen_create_thread(buf, cpu); | + sprintf(buf, kpktgend_%i, i); | + err = pktgen_create_thread(buf, i); | if (err) | printk(pktgen: WARNING: Cannot create thread for cpu %d (%d)\n, | - cpu, err); | + i, err); | } | | if (list_empty(pktgen_threads)) { | @@ -3263,4 +3273,5 @@ MODULE_LICENSE(GPL); |module_param(pg_count_d, int, 0); |module_param(pg_delay_d, int, 0); |module_param(pg_clone_skb_d, int, 0); | +module_param(pg_max_threads, int, 0); |module_param(debug, int, 0); | -- | 1.2.4.gbe76 | | | | -- | Luiz Fernando N. Capitulino -- Luiz Fernando N. Capitulino - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PKTGEN: Adds thread limit parameter.
Luiz Fernando Capitulino writes: Well, I wouldn't say it's _really_ needed. But it really avoids having too many thread entries in the pktgen's /proc directory, and as a good result, you will not have pending threads which will never run as well. Also note that the patch is trivial, if you look at it in detail, you'll see that the biggest change we have is the 'if' part. The rest I would call cosmetic because the behaivor is the same. | If so -- Wouldn't a concept of a bitmask to control also which CPU's | that runs the threads be more general? Sounds like a bit complex, and would be my turn to ask if it's really needed. :) A bit set for each CPU there will a pktgen process running. * Some minor fixes and cleanups, like functions returns being not checked. * A new command called 'rem_device' to remove one device at a time (currently, we can only remove all devices in one shoot with 'rem_devices_all') * Ports pktgen to use the kernel thread API The current model was chosen simplicity and to bound a device to a specific CPU so it never cahanges. A change will need to carefully tested. * cleanup the debug function usage I would like to remove the do_softirq stuff from pktgen... Cheers. --ro - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Announce] Intel PRO/Wireless 3945ABG Network Connection
On Sun, Mar 12, 2006 at 04:01:57PM -0800, David Rosky wrote: (Telecommunications Certification Body).The FCC also maintains a mechanism whereby certification-related questions can be asked directly of them and whereby answers to previous questions can be Can you point me at this? What restrictions (if any) are there on who can access this information? it is not possible to say (except for someone at Intel) whether it was Intel, or the certifying TCB who decided that making it binary-only would be necessary to meet the FCC requirements of being reasonably tamper-proof. Perhaps someone from Intel can answer this? Thanks for the good information! John -- John W. Linville [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PKTGEN: Adds thread limit parameter.
On Mon, 13 Mar 2006 14:29:17 +0100 Robert Olsson [EMAIL PROTECTED] wrote: | | Luiz Fernando Capitulino writes: | |Well, I wouldn't say it's _really_ needed. But it really avoids having | too many thread entries in the pktgen's /proc directory, and as a good | result, you will not have pending threads which will never run as well. | |Also note that the patch is trivial, if you look at it in detail, | you'll see that the biggest change we have is the 'if' part. The rest I | would call cosmetic because the behaivor is the same. | | | If so -- Wouldn't a concept of a bitmask to control also which CPU's | | that runs the threads be more general? | |Sounds like a bit complex, and would be my turn to ask if it's | really needed. :) | | A bit set for each CPU there will a pktgen process running. Looks interesting. Could you give a few more details? Or an example somewhere in the kernel (if any) for what you have in mind. |* Some minor fixes and cleanups, like functions returns being not |checked. | |* A new command called 'rem_device' to remove one device at a time | (currently, we can only remove all devices in one shoot with | 'rem_devices_all') | |* Ports pktgen to use the kernel thread API | | The current model was chosen simplicity and to bound a device | to a specific CPU so it never cahanges. A change will need to | carefully tested. Sure thing. |* cleanup the debug function usage | | I would like to remove the do_softirq stuff from pktgen... Added to the TODO list. :) -- Luiz Fernando N. Capitulino - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: GigE on PowerMac G5
Andreas Schwab [EMAIL PROTECTED] writes: Benjamin Herrenschmidt [EMAIL PROTECTED] writes: At this point, all I can say is... does it work in OS X ? Strange, OS X can't do it either. Looks like I have a hardware problem. It turned out that one of the contacts in the RJ-45 jack was twisted. After straightening it the Gb connection is working now. Andreas. -- Andreas Schwab, SuSE Labs, [EMAIL PROTECTED] SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
kmalloc_node returns unaligned memory
kmalloc_node returns unaligned pointers on powerpc, when CONFIG_DEBUG_SLAB is enabled. This makes iptables very unhappy. It checks the alignment in net/ipv6/netfilter/ip6_tables.c:check_entry_size_and_hooks(). __alignof__(struct ip6t_entry) returns 8. But returned pointers from xt_alloc_table_info() are unaligned: Linux version 2.6.16-rc6-git1-default-iptables-slab ([EMAIL PROTECTED]) (gcc version 4.1.0 (SUSE Linux)) #2 Mon Mar 13 15:19:45 CET 2006 ... xt_alloc_table_info(250) modprobe(1687):c0,j4294904016 newinfo/size cfc82498/0x278 xt_alloc_table_info(265) modprobe(1687):c0,j4294904038 entries[0] c449611c ip_nat_init: can't setup rules. sys_init_module(1960) modprobe(1687):c0,j4294904071 iptable_nat returned -22 ... Any ideas how to fix that? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: kmalloc_node returns unaligned memory
Olaf == Olaf Hering [EMAIL PROTECTED] writes: Olaf kmalloc_node returns unaligned pointers on powerpc, when Olaf CONFIG_DEBUG_SLAB is enabled. This makes iptables very Olaf unhappy. It checks the alignment in Olaf net/ipv6/netfilter/ip6_tables.c:check_entry_size_and_hooks(). Olaf __alignof__(struct ip6t_entry) returns 8. But returned pointers Olaf from xt_alloc_table_info() are unaligned: Hi Olaf, I believe this is expected behavior ;-( We have the same problem with the XPC driver for the SN2 which resulted in a wrapper macro being created for it. Some sort of SLAB_HWCACHE_ALIGN flag to Slab that was always respected for this would be nice. Cheers, Jes - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: PKTGEN: Adds thread limit parameter.
On Mon, 13 Mar 2006, Luiz Fernando Capitulino wrote: BTW Robert, my TODO list for pktgen so far is: * A new command called 'rem_device' to remove one device at a time (currently, we can only remove all devices in one shoot with 'rem_devices_all') This should be very simple to implement - there's now a T_REMDEV flag in addition to T_REMDEVALL, though no rem_device method has been added. (T_REMDEV is currently used only when a NETDEV_UNREGISTER notification is received for the device.) -- Arthur - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [NET] : get rid of 'struct sec_path *' in skbuff if ! CONFIG_XFRM
There is a strange glitch about secpath_put() and secpath_get() calls. secpath_put() is called from __kfree_skb() only if CONFIG_XFRM is defined. secpath_get() is called from skb_clone() and copy_skb_header() if CONFIG_INET is defined (!!!) I think the intention was to use CONFIG_XFRM everywhere ? If yes, we can save a 'struct sec_path' pointer from 'struct sk_buff' if ! CONFIG_XFRM Signed-off-by: Eric Dumazet [EMAIL PROTECTED] --- a/include/linux/skbuff.h2006-03-13 18:30:21.0 +0100 +++ b/include/linux/skbuff.h2006-03-13 18:38:27.0 +0100 @@ -243,7 +243,9 @@ } mac; struct dst_entry *dst; +#ifdef CONFIG_XFRM struct sec_path*sp; +#endif /* * This is the control buffer. It is free to use for every --- a/net/core/skbuff.c 2006-03-13 18:31:03.0 +0100 +++ b/net/core/skbuff.c 2006-03-13 18:39:01.0 +0100 @@ -415,9 +415,8 @@ C(mac); C(dst); dst_clone(skb-dst); - C(sp); -#ifdef CONFIG_INET - secpath_get(skb-sp); +#ifdef CONFIG_XFRM + n-sp = secpath_get(skb-sp); #endif memcpy(n-cb, skb-cb, sizeof(skb-cb)); C(len); @@ -483,7 +482,7 @@ new-priority = old-priority; new-protocol = old-protocol; new-dst= dst_clone(old-dst); -#ifdef CONFIG_INET +#ifdef CONFIG_XFRM new-sp = secpath_get(old-sp); #endif new-h.raw = old-h.raw + offset;
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Eric Molitor wrote: You are correct, the format is as follows. Command Packet * Header o length (4 bytes) o id (4 bytes) o flags (1 byte) o command set (1 byte) o command (1 byte) * data (Variable) Reply Packet * Header o length (4 bytes) o id (4 bytes) o flags (1 byte) o error code (2 bytes) * data (Variable) Source: http://java.sun.com/j2se/1.4.2/docs/guide/jpda/jdwp-spec.html Also maybe useful: http://java.sun.com/j2se/1.5.0/docs/guide/jpda/socketTransport-example.c S, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - ie in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet based cwnd in the stack, but the application is defintely broken. rick jones - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-13-03 at 14:44 +1000, Russell Stuart wrote: On Sat, 2006-03-11 at 08:11 -0500, jamal wrote: On my machine tc does not parse filter sample for the u32 filter. Eg: tc filter add dev eth2 parent 1:0 protocol ip prio 1 u32 ht 801: \ classid 1:3 \ sample ip protocol 1 0xff match ip protocol 1 0xff Illegal sample The syntax is correct but ht 801: must exist - that is why it is being rejected.. So there is absolutely categorically _no way in hell_ your memset would have fixed it ;- Apologies for being overdramatic ;- You are wrong on both counts. I am wrong on why it is being rejected - but what you are seeing is worse than i thought initially. Lets put it this way: The only you will _ever_ get that message is if you had made a syntax error (which you did not). Please look at the code on where that message appears and: a) tell me how you would have got that message to begin with using perfectly legal syntax. a) tell me how a memset would have fixed that. Firstly, the error message came from tc when it parsed the line. If tc gets an error talking to the kernel it says, among other things: We have an error talking to the kernel Yes, I added that message to tc - I should know. And thats what you should have got because nothing is wrong with your syntax. Again, it does not make sense whatsover that you got a message saying it was illegal sample (dont make me go overdramatic again ;-), heres my laptop: # tc -V tc utility, iproute2-ss041019 # tc qdisc add dev eth0 ingress # tc filter add dev eth0 parent : protocol ip prio 1 u32 ht 801: classid 1:3 sample ip protocol 1 0xff match ip protocol 1 0xff RTNETLINK answers: Invalid argument We have an error talking to the kernel #tc filter add dev eth0 parent : protocol ip prio 10 handle 801:: u32 divisor 256 # tc filter add dev eth0 parent : protocol ip prio 1 u32 ht 801: classid 1:3 sample ip protocol 1 0xff match ip protocol 1 0xff now it works. Are you sure you didnt have your own changes in there that might have contributed that? Of course on some machines (perhaps yours?) that random crap might be 0, and then it would work. That is why I said at the start of my patch On my machine tc does not On other machines the bug may not appear. That maybe a different bug not what you stated. Please be a little more reasonable - look at the code. Just send the memset fix to Stephen with a different reason. Your current reason is _wrong_ and i really dont have much time to have this kind of discussion. If you had said I added that memset there because it looks like the right thing to do then we would not have had this discussion. You made claims you fixed a bug. It cant possibly be the bug you fixed. Was it some other bug perhaps and you mixed up the two? Re: sample never worked 100% of the time with that hash. Can you give an example? As far as I know it always worked. Off top of my head i was using something along 3 bits mask. I have to find/look at my notes for an example from about 2 years ago when i did the SUCON presentation [http://www.suug.ch/sucon/04/slides/pkt_cls.pdf]. Both sample and hashkey depend on the hash - I used both at the time for that presentation; if i find my notes i will give you a precise answer. Re: it will work some of the time as it is right now with 2.6.x. Yes - it will work when you are sampling one byte. #-1;#-1; I am sampling port numbers, which are two bytes. It will not work in any case where there are two non-zero bytes. yes, this is another problem. So the issue was it works great if you had an 8 bit selection but was unpredictable if you didnt. The end goal is you want the hash to spread incoming selection evenly into different buckets. Re: Can you post the output of tc -s filter ls on 2.6 with and without your user space change?. Here it is: With my change: tc qdisc add dev eth0 root handle 1: htb tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample tcp src 0x369 0x match tcp src 0x0369 0x tc -s filter show dev eth0 filter parent 1: protocol ip pref 1 u32 filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256 filter parent 1: protocol ip pref 1 u32 fh 801:3:800 order 2048 key ht 801 bkt 3 flowid 1: match 0369/ at nexthdr+0 filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1 On the orginal tc shipped with debian sarge: tc qdisc add dev eth0 root handle 1: htb tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample tcp src 0x369 0x match tcp src 0x0369 0x Illegal sample Ooops. Looks like I can't get out of this without patching and compiling: Now we may
RE: Router stops routing after changing MAC Address
HOT DOGGIES!! I think Chuck found the problem. It turns out that the OUI portion of the MAC Address - those leftmost 6 hex digits that identify the vendor - do also have some other special meaning built in. Chuck, I am indebted to you and the list. If the second hex digit is odd, this means the high-order bit of the OUI is set and that means it's a multicast address. I think I have my bits right. Here is an excerpt from http://www.iana.org/assignments/ethernet-numbers. These addresses are physical station addresses, not multicast nor broadcast, so the second hex digit (reading from the left) will be even, not odd. There are also other sources describing how the bits are arranged and how we display MAC Addresses. Google is our friend. Anyway, one of my fudged MAC Addresses had an odd number in that second hex digit - and that's why the router did not route. The solution - just make sure my fudged MAC Addresses are real unicast MAC Addresses and not multicast addresses. Here is my modified ip-fudge-mac.sh script - note that I also turned rp_filter back on: [EMAIL PROTECTED] gregs]# more ip-fudge-mac.sh /sbin/ip link set eth0 down /sbin/ip link set eth0 address 12:34:56:00:30:50 /sbin/ip link set eth0 up /sbin/ip link set eth1 down /sbin/ip link set eth1 address 12:34:56:01:60:03 /sbin/ip link set eth1 up echo 1 /proc/sys/net/ipv4/ip_forward echo 1 /proc/sys/net/ipv4/conf/eth0/rp_filter echo 1 /proc/sys/net/ipv4/conf/eth1/rp_filter ##6: eth0: BROADCAST,MULTICAST,UP mtu 1500 qdisc pfifo_fast qlen 1000 ##link/ether 00:10:4b:71:20:60 brd ff:ff:ff:ff:ff:ff ##7: eth1: BROADCAST,MULTICAST,UP mtu 1500 qdisc pfifo_fast qlen 1000 ##link/ether 00:60:97:b6:f9:4a brd ff:ff:ff:ff:ff:ff [EMAIL PROTECTED] gregs]# I also learned the IEEE has an easy way for anyone to register their own OUI. You fill out a web form and pay $1650 and 7 days later, you're the proud owner of your own OUI block - with 24 bits to use as you see fit. If $1650 is too steep, you can pay $550 and buy 12 bits of MAC Addresses. For now, I decided to use a fudged OUI of 12-34-56 and then use the rightmost 2 octets of the IP Address with leading zeros to fill out the rest of the MAC Address. I will buy some official numbers from the IEEE later. It is proper to give back when given a gift from the community. So here is my failover-monitor.sh script in its state right now. I will probably do a few more tweaks before going into production. The .conf file referenced defines a bunch of IP Addresses and interface names specific to this site. This little script starts up as a daemon at boot time and sends its output to a log file. It polls the heartbeat NIC every 10 seconds. If the other end does not respond to a ping, it checks all the other NIC interfaces. If no response from the other NICs either, it checks the gateway - the router to the Internet. If the gateway DOES respond, then it assumes the primary role. After assuming the primary role, it polls the gateway every 10 seconds. If the gatway goes offline, it takes itself offline and assumes a backup role - polling every 10 seconds to determine if it should take control again. This hopefully minimizes the probability that both members of the failover pair will try to take control and that both will assume a backup role with nobody taking control.But I may have to tweak the algorithm a bit more after more testing. - Greg Scott #!/bin/bash # failover-monitor.sh # First find out if this node or its partner should be primary by # checking the flag file. If the file exists then this node thinks # it is supposed to be primary, so take control if its partner is # unreachable on all interfaces. # If the flag file does not exist then assume a backup role. # Poll its partner. If its parter is offline then take control. # If its partner is online then sleep for a few seconds and repeat. # # Greg Scott, March 8, 2006 . /firewall-scripts/rcfirewall.conf # # Figure out who we are # if [ $(hostname) = $FW1_HOST ] then ME_HOST=$FW1_HOST ME_HBEAT=$FW1_HBEAT ME_INET=$FW1_INET ME_INETMAC=$FW1_INETMAC ME_TRUSTED=$FW1_TRUSTED ME_TRUSTEDMAC=$FW1_TRUSTEDMAC YOU_HOST=$FW2_HOST YOU_HBEAT=$FW2_HBEAT YOU_INET=$FW2_INET YOU_TRUSTED=$FW2_TRUSTED else ME_HOST=$FW2_HOST ME_HBEAT=$FW2_HBEAT ME_INET=$FW2_INET ME_INETMAC=$FW2_INETMAC ME_TRUSTED=$FW2_TRUSTED ME_TRUSTEDMAC=$FW2_TRUSTEDMAC YOU_HOST=$FW1_HOST YOU_HBEAT=$FW1_HBEAT YOU_INET=$FW1_INET YOU_TRUSTED=$FW1_TRUSTED fi function take_control { # This function is called when the failover partner does not reply # on the YOU_HBEAT IP Address. # # Take over the firewall IP address and special MAC address iff: # This node, ME, can see the Internet gateway and YOU_TRUSTED # and INET_IP do not answer. Remember that INET_IP is the # IP Address of the primary firewall. That is why we test # for INET_IP and not YOU_INET.
Re: [PATCH] TC: bug fixes to the sample clause
jamal wrote: On Mon, 2006-13-03 at 14:44 +1000, Russell Stuart wrote: You are wrong on both counts. I am wrong on why it is being rejected - but what you are seeing is worse than i thought initially. Lets put it this way: The only you will _ever_ get that message is if you had made a syntax error (which you did not). Please look at the code on where that message appears and: a) tell me how you would have got that message to begin with using perfectly legal syntax. a) tell me how a memset would have fixed that. He already told you, pack_key expects the selector to be initialized, otherwise nkeys might contain a value = 128, which would cause exactly this error, if a matching key is not found within the uninitialized memory by accident. Just send the memset fix to Stephen with a different reason. Your current reason is _wrong_ and i really dont have much time to have this kind of discussion. If you had said I added that memset there because it looks like the right thing to do then we would not have had this discussion. You made claims you fixed a bug. It cant possibly be the bug you fixed. Was it some other bug perhaps and you mixed up the two? The patch as well as the description are perfectly fine. BTW, running valgrind on tc shows lots of uses of uninitialized values, it seems like a good idea if someone would go over these and fix them up. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6 v2] IB: IP address based RDMA connection manager
Sean It's dropping the reference on cma_dev, as opposed to Sean id_priv. Duh, sorry. - R. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Router stops routing after changing MAC Address
There still is a bug in the 3c59x driver. It doesn't include any code to handle changing the mac address. It will work if you take the device down, change address, then bring it up. But you shouldn't have to do that. Also, if the driver handles setting mac address, it could have prevented you from using a multicast address. Something like this is needed (untested, I don't have that hardware). --- linux-2.6/drivers/net/3c59x.c.orig 2006-03-13 09:58:25.0 -0800 +++ linux-2.6/drivers/net/3c59x.c 2006-03-13 09:52:47.0 -0800 @@ -895,6 +895,7 @@ static void dump_tx_ring(struct net_devi static void update_stats(void __iomem *ioaddr, struct net_device *dev); static struct net_device_stats *vortex_get_stats(struct net_device *dev); static void set_rx_mode(struct net_device *dev); +static int set_rx_address(struct net_device *dev, void *addr); #ifdef CONFIG_PCI static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd); #endif @@ -1563,6 +1564,7 @@ static int __devinit vortex_probe1(struc #endif dev-ethtool_ops = vortex_ethtool_ops; dev-set_multicast_list = set_rx_mode; + dev-set_mac_address = set_rx_address; dev-tx_timeout = vortex_tx_timeout; dev-watchdog_timeo = (watchdog * HZ) / 1000; #ifdef CONFIG_NET_POLL_CONTROLLER @@ -3150,6 +3152,27 @@ static void set_rx_mode(struct net_devic iowrite16(new_mode, ioaddr + EL3_CMD); } + +static int set_rx_address(struct net_device *dev, void *p) +{ + struct vortex_private *vp = netdev_priv(dev); + void __iomem *ioaddr = vp-ioaddr; + const struct sockaddr *addr = p; + + if (!is_valid_ether_addr(addr-sa_data)) + return -EADDRNOTAVAIL; + + spin_lock_bh(vp-lock); + memcpy(dev-dev_addr, addr-sa_data, ETH_ALEN); + + EL3WINDOW(2); + for (i = 0; i ETH_ALEN; i++) + iowrite8(dev-dev_addr[i], ioaddr + i); + spin_unlock_bh(vp-lock); + + return 0; +} + #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE) /* Setup the card so that it can receive frames with an 802.1q VLAN tag. Note that this must be done after each RxReset due to some backwards - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
On Mon, 13 Mar 2006 09:25:44 -0800 Rick Jones [EMAIL PROTECTED] wrote: Eric Molitor wrote: You are correct, the format is as follows. Command Packet * Header o length (4 bytes) o id (4 bytes) o flags (1 byte) o command set (1 byte) o command (1 byte) * data (Variable) Reply Packet * Header o length (4 bytes) o id (4 bytes) o flags (1 byte) o error code (2 bytes) * data (Variable) Source: http://java.sun.com/j2se/1.4.2/docs/guide/jpda/jdwp-spec.html Also maybe useful: http://java.sun.com/j2se/1.5.0/docs/guide/jpda/socketTransport-example.c S, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - ie in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet based cwnd in the stack, but the application is defintely broken. Even doing it two pieces (header and data) would have worked. --- socketTransport-example.c.orig 2006-03-13 10:14:00.0 -0800 +++ socketTransport-example.c 2006-03-13 10:24:02.0 -0800 @@ -445,7 +445,16 @@ static jdwpTransportError JNICALL socketTransport_writePacket(jdwpTransportEnv* env, const jdwpPacket *packet) { -jint len, data_len, id; +#pragma pack +struct { + jint len; + jint id; + jbyte flags; + union { + jshort errorCode; + jbyte cmd[2]; + }; +} header; jbyte *data; /* packet can't be null */ @@ -453,7 +462,6 @@ RETURN_ERROR(JDWPTRANSPORT_ERROR_ILLEGAL_ARGUMENT, packet is NULL); } -len = packet-type.cmd.len;/* includes header */ data_len = len - 11; /* bad packet */ @@ -461,40 +469,22 @@ RETURN_ERROR(JDWPTRANSPORT_ERROR_ILLEGAL_ARGUMENT, invalid length); } -len = (jint)dbgsysHostToNetworkLong(len); - -if (dbgsysSend(socketFD,(char *)len,sizeof(jint),0) != sizeof(jint)) { - RETURN_IO_ERROR(send failed); -} - -id = (jint)dbgsysHostToNetworkLong(packet-type.cmd.id); - -if (dbgsysSend(socketFD,(char *)(id),sizeof(jint),0) != sizeof(jint)) { - RETURN_IO_ERROR(send failed); -} - -if (dbgsysSend(socketFD,(char *)(packet-type.cmd.flags) - ,sizeof(jbyte),0) != sizeof(jbyte)) { - RETURN_IO_ERROR(send failed); -} +header.len = (jint)dbgsysHostToNetworkLong(packet-type.cmd.len); +header.id = (jint)dbgsysHostToNetworkLong(packet-type.cmd.id); +header.flags = packet-type.cmd.flags; if (packet-type.cmd.flags JDWPTRANSPORT_FLAGS_REPLY) { -jshort errorCode = dbgsysHostToNetworkShort(packet-type.reply.errorCode); -if (dbgsysSend(socketFD,(char *)(errorCode) - ,sizeof(jshort),0) != sizeof(jshort)) { - RETURN_IO_ERROR(send failed); - } + header.errorCode = dbgsysHostToNetworkShort(packet-type.reply.errorCode); } else { -if (dbgsysSend(socketFD,(char *)(packet-type.cmd.cmdSet) - ,sizeof(jbyte),0) != sizeof(jbyte)) { - RETURN_IO_ERROR(send failed); - } -if (dbgsysSend(socketFD,(char *)(packet-type.cmd.cmd) - ,sizeof(jbyte),0) != sizeof(jbyte)) { - RETURN_IO_ERROR(send failed); - } + header.cmd[0] = packet-type.cmd.cmdSet; + header.cmd[1] = packet-type.cmd.cmd; } - + +if (dbgsysSend(socketFD,(char *)header, sizeof(header), 0) + != sizeof(header)) { + RETURN_IO_ERROR(send failed); +} + data = packet-type.cmd.data; if (dbgsysSend(socketFD,(char *)data,data_len,0) != data_len) { RETURN_IO_ERROR(send failed); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-13-03 at 18:55 +0100, Patrick McHardy wrote: jamal wrote: Lets put it this way: The only you will _ever_ get that message is if you had made a syntax error (which you did not). Please look at the code on where that message appears and: a) tell me how you would have got that message to begin with using perfectly legal syntax. a) tell me how a memset would have fixed that. He already told you, pack_key expects the selector to be initialized, otherwise nkeys might contain a value = 128, which would cause exactly this error, if a matching key is not found within the uninitialized memory by accident. oops. Yes - that is possible. BTW, running valgrind on tc shows lots of uses of uninitialized values, it seems like a good idea if someone would go over these and fix them up. It will be valuable to fix so we dont run in these issues again. Russell - I will respond to the rest of the email either tonight or tomorrow. cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
S, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - ie in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet based cwnd in the stack, but the application is defintely broken. Even doing it two pieces (header and data) would have worked. Insofar as it would not have stumbled over the cwnd. Still it should be presenting everything at one time - is there no gathering send to be made? rick jones - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 1/4] natsemi: Add support for using MII port with no PHY
On Sun, Mar 12, 2006 at 01:41:13PM -0800, [EMAIL PROTECTED] wrote: Not that my opinion should hold much weight, having been absent from the driver for some time, but yuck. Is there no better way to do this thatn sprinkling poo all over it? The changes are mostly isolated into check_link(), the fact that half the function gets placed inside a conditional but diff sees it as a bunch of smaller changes makes the changes look a lot more invasive than they actually are. I guess that could be helped by splitting the PHY access code out of check_link() into check_phy_status() or something but I'm not sure how much that really helps. -- You grabbed my hand and we fell into it, like a daydream - or a fever. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 1/4] natsemi: Add support for using MII port with no PHY
On Mon, Mar 13, 2006 at 06:23:31PM +, Mark Brown wrote: On Sun, Mar 12, 2006 at 01:41:13PM -0800, [EMAIL PROTECTED] wrote: Not that my opinion should hold much weight, having been absent from the driver for some time, but yuck. Is there no better way to do this thatn sprinkling poo all over it? The changes are mostly isolated into check_link(), the fact that half the function gets placed inside a conditional but diff sees it as a bunch of smaller changes makes the changes look a lot more invasive than they actually are. I guess that could be helped by splitting the PHY access code out of check_link() into check_phy_status() or something but I'm not sure how much that really helps. It's not terribly offensive it just seems like a hack. :) I'm not sure I really understand the reasoning, so I can't offer anythign better or more general purpose. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: drivers/net/chelsio/sge.c: two array overflows
Adrian, This is a bug. The array should contain 2 elements. Attached is a patch which fixes it. Thanks. Signed-off-by: Scott Bardone [EMAIL PROTECTED] Adrian Bunk wrote: The Coverity checker spotted the following two array overflows in drivers/net/chelsio/sge.c (in both cases, the arrays contain 3 elements): -- snip -- ... static void restart_tx_queues(struct sge *sge) { ... sge-stats.cmdQ_restarted[3]++; ... static int t1_sge_tx(struct sk_buff *skb, struct adapter *adapter, unsigned int qid, struct net_device *dev) { ... sge-stats.cmdQ_full[3]++; ... -- snip -- cu Adrian --- sge.c 2006-02-17 14:23:45.0 -0800 +++ sge.fix.c 2006-03-13 10:51:24.0 -0800 @@ -1021,7 +1021,7 @@ if (test_and_clear_bit(nd-if_port, sge-stopped_tx_queues) netif_running(nd)) { -sge-stats.cmdQ_restarted[3]++; +sge-stats.cmdQ_restarted[2]++; netif_wake_queue(nd); } } @@ -1350,7 +1350,7 @@ if (unlikely(credits count)) { netif_stop_queue(dev); set_bit(dev-if_port, sge-stopped_tx_queues); - sge-stats.cmdQ_full[3]++; + sge-stats.cmdQ_full[2]++; spin_unlock(q-lock); if (!netif_queue_stopped(dev)) CH_ERR(%s: Tx ring full while queue awake!\n, @@ -1358,7 +1358,7 @@ return NETDEV_TX_BUSY; } if (unlikely(credits - count q-stop_thres)) { - sge-stats.cmdQ_full[3]++; + sge-stats.cmdQ_full[2]++; netif_stop_queue(dev); set_bit(dev-if_port, sge-stopped_tx_queues); }
Re: Router stops routing after changing MAC Address
On Mon, 13 Mar 2006, Stephen Hemminger wrote: There still is a bug in the 3c59x driver. It doesn't include any code to handle changing the mac address. It will work if you take the device down, change address, then bring it up. But you shouldn't have to do that. Also, if the driver handles setting mac address, it could have prevented you from using a multicast address. Something like this is needed (untested, I don't have that hardware). --- linux-2.6/drivers/net/3c59x.c.orig2006-03-13 09:58:25.0 -0800 +++ linux-2.6/drivers/net/3c59x.c 2006-03-13 09:52:47.0 -0800 @@ -895,6 +895,7 @@ static void dump_tx_ring(struct net_devi static void update_stats(void __iomem *ioaddr, struct net_device *dev); static struct net_device_stats *vortex_get_stats(struct net_device *dev); static void set_rx_mode(struct net_device *dev); +static int set_rx_address(struct net_device *dev, void *addr); #ifdef CONFIG_PCI static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd); #endif @@ -1563,6 +1564,7 @@ static int __devinit vortex_probe1(struc #endif dev-ethtool_ops = vortex_ethtool_ops; dev-set_multicast_list = set_rx_mode; + dev-set_mac_address = set_rx_address; dev-tx_timeout = vortex_tx_timeout; dev-watchdog_timeo = (watchdog * HZ) / 1000; #ifdef CONFIG_NET_POLL_CONTROLLER @@ -3150,6 +3152,27 @@ static void set_rx_mode(struct net_devic iowrite16(new_mode, ioaddr + EL3_CMD); } + +static int set_rx_address(struct net_device *dev, void *p) +{ + struct vortex_private *vp = netdev_priv(dev); + void __iomem *ioaddr = vp-ioaddr; + const struct sockaddr *addr = p; + + if (!is_valid_ether_addr(addr-sa_data)) + return -EADDRNOTAVAIL; + + spin_lock_bh(vp-lock); + memcpy(dev-dev_addr, addr-sa_data, ETH_ALEN); + + EL3WINDOW(2); + for (i = 0; i ETH_ALEN; i++) + iowrite8(dev-dev_addr[i], ioaddr + i); + spin_unlock_bh(vp-lock); + + return 0; +} + #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE) /* Setup the card so that it can receive frames with an 802.1q VLAN tag. Note that this must be done after each RxReset due to some backwards - Actually, it doesn't make any difference. Changing the IEEE station (physical) address is not an allowed procedure even though hooks are available in many drivers to do this. According to the IEEE 802 physical media specification, this 48-bit address must be unique and must be one of a group assigned by IEEE. Failure to follow this simple protocol can (will) cause an entire network to fail. If you don't care, then you certainly don't care about multicast bits either, basically let them set it to all ones as well. Cheers, Dick Johnson Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. _ The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [EMAIL PROTECTED] - and destroy all copies of this information, including any attachments, without reading or disclosing them. Thank you. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scm: fold __scm_send() into scm_send()
On Mon, Mar 13, 2006 at 09:05:31PM +0100, Ingo Oeser wrote: From: Ingo Oeser [EMAIL PROTECTED] Fold __scm_send() into scm_send() and remove that interface completly from the kernel. Whoa, what are you doing here? Uninlining scm_send() is a Bad Thing to do given that scm_send() is in the AF_UNIX hot path. -ben -- Time is of no importance, Mr. President, only life is important. Don't Email: [EMAIL PROTECTED]. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scm: fold __scm_send() into scm_send()
From: Ingo Oeser [EMAIL PROTECTED] Fold __scm_send() into scm_send() and remove that interface completly from the kernel. Signed-off-by: Ingo Oeser [EMAIL PROTECTED] --- Inspired by the patch to inline scm_send() I did the next logical step :-) Regards Ingo Oeser diff --git a/include/net/scm.h b/include/net/scm.h index eb44e5a..ec8b891 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -26,11 +26,9 @@ struct scm_cookie extern void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm); extern void scm_detach_fds_compat(struct msghdr *msg, struct scm_cookie *scm); -extern int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm); extern void __scm_destroy(struct scm_cookie *scm); extern struct scm_fp_list * scm_fp_dup(struct scm_fp_list *fpl); -extern int scm_send(struct socket *sock, struct msghdr *msg, - struct scm_cookie *scm); +extern int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm); extern void scm_recv(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm, int flags); diff --git a/net/core/scm.c b/net/core/scm.c index b6dee90..6adbe60 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -110,10 +110,21 @@ void __scm_destroy(struct scm_cookie *sc } } -int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *p) +int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm) { struct cmsghdr *cmsg; int err; + struct task_struct *tsk = current; + scm-creds = (struct ucred) { + .uid = tsk-uid, + .gid = tsk-gid, + .pid = tsk-tgid + }; + scm-fp = NULL; + scm-sid = security_sk_sid(sock-sk, NULL, 0); + scm-seq = 0; + if (msg-msg_controllen = 0) + return 0; for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) { @@ -136,15 +147,15 @@ int __scm_send(struct socket *sock, stru switch (cmsg-cmsg_type) { case SCM_RIGHTS: - err=scm_fp_copy(cmsg, p-fp); + err=scm_fp_copy(cmsg, scm-fp); if (err0) goto error; break; case SCM_CREDENTIALS: if (cmsg-cmsg_len != CMSG_LEN(sizeof(struct ucred))) goto error; - memcpy(p-creds, CMSG_DATA(cmsg), sizeof(struct ucred)); - err = scm_check_creds(p-creds); + memcpy(scm-creds, CMSG_DATA(cmsg), sizeof(struct ucred)); + err = scm_check_creds(scm-creds); if (err) goto error; break; @@ -153,15 +164,15 @@ int __scm_send(struct socket *sock, stru } } - if (p-fp !p-fp-count) + if (scm-fp !scm-fp-count) { - kfree(p-fp); - p-fp = NULL; + kfree(scm-fp); + scm-fp = NULL; } return 0; error: - scm_destroy(p); + scm_destroy(scm); return err; } @@ -284,22 +295,6 @@ struct scm_fp_list *scm_fp_dup(struct sc return new_fpl; } -int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm) -{ - struct task_struct *p = current; - scm-creds = (struct ucred) { - .uid = p-uid, - .gid = p-gid, - .pid = p-tgid - }; - scm-fp = NULL; - scm-sid = security_sk_sid(sock-sk, NULL, 0); - scm-seq = 0; - if (msg-msg_controllen = 0) - return 0; - return __scm_send(sock, msg, scm); -} - void scm_recv(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm, int flags) { @@ -332,7 +326,6 @@ void scm_recv(struct socket *sock, struc } EXPORT_SYMBOL(__scm_destroy); -EXPORT_SYMBOL(__scm_send); EXPORT_SYMBOL(scm_send); EXPORT_SYMBOL(scm_recv); EXPORT_SYMBOL(put_cmsg); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Router stops routing after changing MAC Address
But in a failover scenario you want two devices to have the same IEEE (station) Address (or MAC Address or hardware address). So many names for the same thing! When the primary unit fails, you want the backup unit to completely assume the failed unit's identity - right down to the MAC Address. The other way to do it using gratuitous ARPs is not good enough because some cheap router someplace with an ARP cache of several hours will not listen and will never update its own ARP cache. I like to think of this as bending the rules a little bit, not really breaking them. :) - Greg Actually, it doesn't make any difference. Changing the IEEE station (physical) address is not an allowed procedure even though hooks are available in many drivers to do this. According to the IEEE 802 physical media specification, this 48-bit address must be unique and must be one of a group assigned by IEEE. Failure to follow this simple protocol can (will) cause an entire network to fail. If you don't care, then you certainly don't care about multicast bits either, basically let them set it to all ones as well. Cheers, Dick Johnson Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[-mm patch] make drivers/net/tg3.c:tg3_request_irq()
On Sun, Mar 12, 2006 at 03:10:36AM -0800, Andrew Morton wrote: ... Changes since 2.6.16-rc5-mm3: ... git-net.patch ... git trees ... This patch makes the needlessly global function tg3_request_irq() static. Signed-off-by: Adrian Bunk [EMAIL PROTECTED] --- linux-2.6.16-rc6-mm1-full/drivers/net/tg3.c.old 2006-03-13 21:13:31.0 +0100 +++ linux-2.6.16-rc6-mm1-full/drivers/net/tg3.c 2006-03-13 21:14:26.0 +0100 @@ -6531,7 +6531,7 @@ add_timer(tp-timer); } -int tg3_request_irq(struct tg3 *tp) +static int tg3_request_irq(struct tg3 *tp) { irqreturn_t (*fn)(int, void *, struct pt_regs *); unsigned long flags; - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Router stops routing after changing MAC Address
On Mon, 13 Mar 2006, Greg Scott wrote: But in a failover scenario you want two devices to have the same IEEE (station) Address (or MAC Address or hardware address). So many names for the same thing! When the primary unit fails, you want the backup unit to completely assume the failed unit's identity - right down to the MAC Address. The other way to do it using gratuitous ARPs is not good enough because some cheap router someplace with an ARP cache of several hours will not listen and will never update its own ARP cache. I like to think of this as bending the rules a little bit, not really breaking them. :) - Greg Top posting, NotGood(tm). Anyway, if the device fails, you have routers and hosts ARPing the interface, trying to establish a route anyway. Actually, it doesn't make any difference. Changing the IEEE station (physical) address is not an allowed procedure even though hooks are available in many drivers to do this. According to the IEEE 802 physical media specification, this 48-bit address must be unique and must be one of a group assigned by IEEE. Failure to follow this simple protocol can (will) cause an entire network to fail. If you don't care, then you certainly don't care about multicast bits either, basically let them set it to all ones as well. Cheers, Dick Johnson Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. Cheers, Dick Johnson Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. _ The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [EMAIL PROTECTED] - and destroy all copies of this information, including any attachments, without reading or disclosing them. Thank you. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Router stops routing after changing MAC Address
Anyway, if the device fails, you have routers and hosts ARPing the interface, trying to establish a route anyway. But only after what may be a much longer time than the customer is willing to accept or able to configure. I know of a number of HA situations where the new device is given the old MAC just to avoid that speicific situation of ARP caches not being updated except after quite some time. Not necessarily on the end-systems, the issue can be with intermediate devices (routers). And if one has to work with static ARP entries to deal (however imperfectly) with ARP poisioning or whatnot... Indeed, there is a large onus on the software doing the MAC override to make sure it does not break the required uniqueness. Just as if one were using locally administered MAC addresses. rick jones - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Router stops routing after changing MAC Address
On Mon, 13 Mar 2006 15:27:26 -0500 linux-os \(Dick Johnson\) wrote: On Mon, 13 Mar 2006, Stephen Hemminger wrote: There still is a bug in the 3c59x driver. It doesn't include any code to handle changing the mac address. It will work if you take the device down, change address, then bring it up. But you shouldn't have to do that. Also, if the driver handles setting mac address, it could have prevented you from using a multicast address. Something like this is needed (untested, I don't have that hardware). [cut patch] Actually, it doesn't make any difference. Changing the IEEE station (physical) address is not an allowed procedure even though hooks are available in many drivers to do this. According to the IEEE 802 physical media specification, this 48-bit address must be unique and must be one of a group assigned by IEEE. Failure to follow this simple protocol can (will) cause an entire network to fail. If you don't care, then you certainly don't care about multicast bits either, basically let them set it to all ones as well. They used to allow Locally Administered Addresses. Hrm, google still finds 18,000 hits for that phrase. Is that now outlawed? Even ieee.org has hit(s) for it: http://standards.ieee.org/regauth/groupmac/tutorial.html http://en.wikipedia.org/wiki/MAC_address http://www.mynetwatchman.com/pckidiot/chap04.htm --- ~Randy You can't do anything without having to do something else first. -- Belefant's Law - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/15] bridge: netfilter dont use __constant_htons
Only use__constant_htons() for initializers and switch cases. For other uses, it is just as efficient and clearer to use htons Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/net/bridge/br_netfilter.c +++ net-2.6.17/net/bridge/br_netfilter.c @@ -61,14 +61,14 @@ static int brnf_filter_vlan_tagged = 1; #define brnf_filter_vlan_tagged 1 #endif -#define IS_VLAN_IP (skb-protocol == __constant_htons(ETH_P_8021Q) \ - hdr-h_vlan_encapsulated_proto == __constant_htons(ETH_P_IP) \ +#define IS_VLAN_IP (skb-protocol == htons(ETH_P_8021Q) \ + hdr-h_vlan_encapsulated_proto == htons(ETH_P_IP) \ brnf_filter_vlan_tagged) -#define IS_VLAN_IPV6 (skb-protocol == __constant_htons(ETH_P_8021Q) \ - hdr-h_vlan_encapsulated_proto == __constant_htons(ETH_P_IPV6) \ +#define IS_VLAN_IPV6 (skb-protocol == htons(ETH_P_8021Q) \ + hdr-h_vlan_encapsulated_proto == htons(ETH_P_IPV6) \ brnf_filter_vlan_tagged) -#define IS_VLAN_ARP (skb-protocol == __constant_htons(ETH_P_8021Q)\ - hdr-h_vlan_encapsulated_proto == __constant_htons(ETH_P_ARP) \ +#define IS_VLAN_ARP (skb-protocol == htons(ETH_P_8021Q)\ + hdr-h_vlan_encapsulated_proto == htons(ETH_P_ARP) \ brnf_filter_vlan_tagged) /* We need these fake structures to make netfilter happy -- @@ -120,7 +120,7 @@ static int br_nf_pre_routing_finish_ipv6 dst_hold(skb-dst); skb-dev = nf_bridge-physindev; - if (skb-protocol == __constant_htons(ETH_P_8021Q)) { + if (skb-protocol == htons(ETH_P_8021Q)) { skb_push(skb, VLAN_HLEN); skb-nh.raw -= VLAN_HLEN; } @@ -196,7 +196,7 @@ static int br_nf_pre_routing_finish_brid if (!skb-dev) kfree_skb(skb); else { - if (skb-protocol == __constant_htons(ETH_P_8021Q)) { + if (skb-protocol == htons(ETH_P_8021Q)) { skb_pull(skb, VLAN_HLEN); skb-nh.raw += VLAN_HLEN; } @@ -252,7 +252,7 @@ bridged_dnat: nf_bridge-mask |= BRNF_BRIDGED_DNAT; skb-dev = nf_bridge-physindev; if (skb-protocol == - __constant_htons(ETH_P_8021Q)) { + htons(ETH_P_8021Q)) { skb_push(skb, VLAN_HLEN); skb-nh.raw -= VLAN_HLEN; } @@ -271,7 +271,7 @@ bridged_dnat: } skb-dev = nf_bridge-physindev; - if (skb-protocol == __constant_htons(ETH_P_8021Q)) { + if (skb-protocol == htons(ETH_P_8021Q)) { skb_push(skb, VLAN_HLEN); skb-nh.raw -= VLAN_HLEN; } @@ -421,7 +421,7 @@ static unsigned int br_nf_pre_routing(un struct nf_bridge_info *nf_bridge; struct vlan_ethhdr *hdr = vlan_eth_hdr(*pskb); - if (skb-protocol == __constant_htons(ETH_P_IPV6) || IS_VLAN_IPV6) { + if (skb-protocol == htons(ETH_P_IPV6) || IS_VLAN_IPV6) { #ifdef CONFIG_SYSCTL if (!brnf_call_ip6tables) return NF_ACCEPT; @@ -429,7 +429,7 @@ static unsigned int br_nf_pre_routing(un if ((skb = skb_share_check(*pskb, GFP_ATOMIC)) == NULL) goto out; - if (skb-protocol == __constant_htons(ETH_P_8021Q)) { + if (skb-protocol == htons(ETH_P_8021Q)) { skb_pull_rcsum(skb, VLAN_HLEN); skb-nh.raw += VLAN_HLEN; } @@ -440,13 +440,13 @@ static unsigned int br_nf_pre_routing(un return NF_ACCEPT; #endif - if (skb-protocol != __constant_htons(ETH_P_IP) !IS_VLAN_IP) + if (skb-protocol != htons(ETH_P_IP) !IS_VLAN_IP) return NF_ACCEPT; if ((skb = skb_share_check(*pskb, GFP_ATOMIC)) == NULL) goto out; - if (skb-protocol == __constant_htons(ETH_P_8021Q)) { + if (skb-protocol == htons(ETH_P_8021Q)) { skb_pull_rcsum(skb, VLAN_HLEN); skb-nh.raw += VLAN_HLEN; } @@ -523,7 +523,7 @@ static int br_nf_forward_finish(struct s struct net_device *in; struct vlan_ethhdr *hdr = vlan_eth_hdr(skb); - if (skb-protocol != __constant_htons(ETH_P_ARP) !IS_VLAN_ARP) { + if (skb-protocol != htons(ETH_P_ARP) !IS_VLAN_ARP) { in = nf_bridge-physindev; if (nf_bridge-mask BRNF_PKT_TYPE) { skb-pkt_type = PACKET_OTHERHOST; @@ -532,7 +532,7 @@ static int br_nf_forward_finish(struct s } else { in = *((struct net_device **)(skb-cb)); } - if (skb-protocol == __constant_htons(ETH_P_8021Q)) { + if (skb-protocol == htons(ETH_P_8021Q)) { skb_push(skb, VLAN_HLEN);
[PATCH 11/15] bridge: stp timer to jiffies cleanup
Cleanup the get/set of bridge timer value in the packets. It is clearer not to bury the conversion in macro. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- br-2.6.orig/net/bridge/br_stp_bpdu.c +++ br-2.6/net/bridge/br_stp_bpdu.c @@ -19,8 +19,7 @@ #include br_private.h #include br_private_stp.h -#define JIFFIES_TO_TICKS(j) (((j) 8) / HZ) -#define TICKS_TO_JIFFIES(j) (((j) * HZ) 8) +#define STP_HZ 256 static void br_send_bpdu(struct net_bridge_port *p, unsigned char *data, int length) { @@ -57,18 +56,18 @@ static void br_send_bpdu(struct net_brid dev_queue_xmit); } -static __inline__ void br_set_ticks(unsigned char *dest, int jiff) +static inline void br_set_ticks(unsigned char *dest, int j) { - __u16 ticks; - - ticks = JIFFIES_TO_TICKS(jiff); - dest[0] = (ticks 8) 0xFF; - dest[1] = ticks 0xFF; + unsigned long ticks = (STP_HZ * j)/ HZ; + + *((__be16 *) dest) = htons(ticks); } - -static __inline__ int br_get_ticks(unsigned char *dest) + +static inline int br_get_ticks(const unsigned char *src) { - return TICKS_TO_JIFFIES((dest[0] 8) | dest[1]); + unsigned long ticks = ntohs(*(__be16 *)src); + + return (ticks * HZ + STP_HZ - 1) / STP_HZ; } /* called under bridge lock */ -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/15] bridge: use setup_timer
Use the now standard setup_timer function. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] Index: br-2.6/net/bridge/br_stp_timer.c === --- br-2.6.orig/net/bridge/br_stp_timer.c +++ br-2.6/net/bridge/br_stp_timer.c @@ -144,39 +144,30 @@ static void br_hold_timer_expired(unsign spin_unlock(p-br-lock); } -static inline void br_timer_init(struct timer_list *timer, - void (*_function)(unsigned long), - unsigned long _data) -{ - init_timer(timer); - timer-function = _function; - timer-data = _data; -} - void br_stp_timer_init(struct net_bridge *br) { - br_timer_init(br-hello_timer, br_hello_timer_expired, + setup_timer(br-hello_timer, br_hello_timer_expired, (unsigned long) br); - br_timer_init(br-tcn_timer, br_tcn_timer_expired, + setup_timer(br-tcn_timer, br_tcn_timer_expired, (unsigned long) br); - br_timer_init(br-topology_change_timer, + setup_timer(br-topology_change_timer, br_topology_change_timer_expired, (unsigned long) br); - br_timer_init(br-gc_timer, br_fdb_cleanup, (unsigned long) br); + setup_timer(br-gc_timer, br_fdb_cleanup, (unsigned long) br); } void br_stp_port_timer_init(struct net_bridge_port *p) { - br_timer_init(p-message_age_timer, br_message_age_timer_expired, + setup_timer(p-message_age_timer, br_message_age_timer_expired, (unsigned long) p); - br_timer_init(p-forward_delay_timer, br_forward_delay_timer_expired, + setup_timer(p-forward_delay_timer, br_forward_delay_timer_expired, (unsigned long) p); - br_timer_init(p-hold_timer, br_hold_timer_expired, + setup_timer(p-hold_timer, br_hold_timer_expired, (unsigned long) p); } -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/15] bridge: netfilter whitespace
Run br_netfilter through Lindent to fix whitespace. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/net/bridge/br_netfilter.c +++ net-2.6.17/net/bridge/br_netfilter.c @@ -136,7 +136,7 @@ static void __br_dnat_complain(void) if (jiffies - last_complaint = 5 * HZ) { printk(KERN_WARNING Performing cross-bridge DNAT requires IP - forwarding to be enabled\n); + forwarding to be enabled\n); last_complaint = jiffies; } } @@ -218,12 +218,17 @@ static int br_nf_pre_routing_finish(stru nf_bridge-mask ^= BRNF_NF_BRIDGE_PREROUTING; if (dnat_took_place(skb)) { - if (ip_route_input(skb, iph-daddr, iph-saddr, iph-tos, - dev)) { + if (ip_route_input(skb, iph-daddr, iph-saddr, iph-tos, dev)) { struct rtable *rt; - struct flowi fl = { .nl_u = - { .ip4_u = { .daddr = iph-daddr, .saddr = 0 , -.tos = RT_TOS(iph-tos)} }, .proto = 0}; + struct flowi fl = { + .nl_u = { + .ip4_u = { +.daddr = iph-daddr, +.saddr = 0, +.tos = RT_TOS(iph-tos) }, + }, + .proto = 0, + }; if (!ip_route_output_key(rt, fl)) { /* - Bridged-and-DNAT'ed traffic doesn't @@ -257,8 +262,7 @@ bridged_dnat: 1); return 0; } - memcpy(eth_hdr(skb)-h_dest, dev-dev_addr, - ETH_ALEN); + memcpy(eth_hdr(skb)-h_dest, dev-dev_addr, ETH_ALEN); skb-pkt_type = PACKET_HOST; } } else { @@ -297,10 +301,10 @@ static struct net_device *setup_pre_rout /* We only check the length. A bridge shouldn't do any hop-by-hop stuff anyway */ static int check_hbh_len(struct sk_buff *skb) { - unsigned char *raw = (u8*)(skb-nh.ipv6h+1); + unsigned char *raw = (u8 *) (skb-nh.ipv6h + 1); u32 pkt_len; int off = raw - skb-nh.raw; - int len = (raw[1]+1)3; + int len = (raw[1] + 1) 3; if ((raw + len) - skb-data skb_headlen(skb)) goto bad; @@ -309,7 +313,7 @@ static int check_hbh_len(struct sk_buff len -= 2; while (len 0) { - int optlen = skb-nh.raw[off+1]+2; + int optlen = skb-nh.raw[off + 1] + 2; switch (skb-nh.raw[off]) { case IPV6_TLV_PAD0: @@ -320,16 +324,16 @@ static int check_hbh_len(struct sk_buff break; case IPV6_TLV_JUMBO: - if (skb-nh.raw[off+1] != 4 || (off3) != 2) + if (skb-nh.raw[off + 1] != 4 || (off 3) != 2) goto bad; - pkt_len = ntohl(*(u32*)(skb-nh.raw+off+2)); + pkt_len = ntohl(*(u32 *) (skb-nh.raw + off + 2)); if (pkt_len = IPV6_MAXPLEN || skb-nh.ipv6h-payload_len) goto bad; if (pkt_len skb-len - sizeof(struct ipv6hdr)) goto bad; if (pskb_trim_rcsum(skb, - pkt_len+sizeof(struct ipv6hdr))) + pkt_len + sizeof(struct ipv6hdr))) goto bad; break; default: @@ -350,8 +354,10 @@ bad: /* Replicate the checks that IPv6 does on packet reception and pass the packet * to ip6tables, which doesn't support NAT, so things are fairly simple. */ static unsigned int br_nf_pre_routing_ipv6(unsigned int hook, - struct sk_buff *skb, const struct net_device *in, - const struct net_device *out, int (*okfn)(struct sk_buff *)) + struct sk_buff *skb, + const struct net_device *in, + const struct net_device *out, + int (*okfn)(struct sk_buff *)) { struct ipv6hdr *hdr; u32 pkt_len; @@ -381,9 +387,9 @@ static unsigned int br_nf_pre_routing_ip } } if (hdr-nexthdr == NEXTHDR_HOP check_hbh_len(skb)) - goto inhdr_error; + goto inhdr_error; - nf_bridge_put(skb-nf_bridge); + nf_bridge_put(skb-nf_bridge); if ((nf_bridge = nf_bridge_alloc(skb)) == NULL)
[PATCH 04/15] bridge: use kzalloc
Use kzalloc versus kmalloc+memset. Also don't need to do memset() of bridge address since it is in netdev private data that is already zero'd in alloc_netdev. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/net/bridge/br_if.c +++ net-2.6.17/net/bridge/br_if.c @@ -210,7 +210,8 @@ static struct net_device *new_bridge_dev br-bridge_id.prio[0] = 0x80; br-bridge_id.prio[1] = 0x00; - memset(br-bridge_id.addr, 0, ETH_ALEN); + + memcpy(br-group_addr, br_group_address, ETH_ALEN); br-feature_mask = dev-features; br-stp_enabled = 0; @@ -263,11 +264,10 @@ static struct net_bridge_port *new_nbp(s if (index 0) return ERR_PTR(index); - p = kmalloc(sizeof(*p), GFP_KERNEL); + p = kzalloc(sizeof(*p), GFP_KERNEL); if (p == NULL) return ERR_PTR(-ENOMEM); - memset(p, 0, sizeof(*p)); p-br = br; dev_hold(dev); p-dev = dev; -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/15] bridge: forwarding remove unneeded preempt and bh diasables
Optimize the forwarding and transmit paths. Both places are called with bottom half/no preempt so there is no need to use spin_lock_bh or rcu_read_lock. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- br-2.6.orig/net/bridge/br_fdb.c +++ br-2.6/net/bridge/br_fdb.c @@ -341,7 +341,6 @@ void br_fdb_update(struct net_bridge *br if (hold_time(br) == 0) return; - rcu_read_lock(); fdb = fdb_find(head, addr); if (likely(fdb)) { /* attempt to update an entry for a local interface */ @@ -356,13 +355,12 @@ void br_fdb_update(struct net_bridge *br fdb-ageing_timer = jiffies; } } else { - spin_lock_bh(br-hash_lock); + spin_lock(br-hash_lock); if (!fdb_find(head, addr)) fdb_create(head, source, addr, 0); /* else we lose race and someone else inserts * it first, don't bother updating */ - spin_unlock_bh(br-hash_lock); + spin_unlock(br-hash_lock); } - rcu_read_unlock(); } --- br-2.6.orig/net/bridge/br_device.c +++ br-2.6/net/bridge/br_device.c @@ -27,6 +27,7 @@ static struct net_device_stats *br_dev_g return br-statistics; } +/* net device transmit always called with no BH (preempt_disabled) */ int br_dev_xmit(struct sk_buff *skb, struct net_device *dev) { struct net_bridge *br = netdev_priv(dev); @@ -39,7 +40,6 @@ int br_dev_xmit(struct sk_buff *skb, str skb-mac.raw = skb-data; skb_pull(skb, ETH_HLEN); - rcu_read_lock(); if (dest[0] 1) br_flood_deliver(br, skb, 0); else if ((dst = __br_fdb_get(br, dest)) != NULL) @@ -47,7 +47,6 @@ int br_dev_xmit(struct sk_buff *skb, str else br_flood_deliver(br, skb, 0); - rcu_read_unlock(); return 0; } -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/15] bridge: netfilter inline cleanup
Move nf_bridge_alloc from header file to the one place it is used and optimize it. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/include/linux/netfilter_bridge.h +++ net-2.6.17/include/linux/netfilter_bridge.h @@ -47,22 +47,6 @@ enum nf_br_hook_priorities { #define BRNF_BRIDGED 0x08 #define BRNF_NF_BRIDGE_PREROUTING 0x10 -static inline -struct nf_bridge_info *nf_bridge_alloc(struct sk_buff *skb) -{ - struct nf_bridge_info **nf_bridge = (skb-nf_bridge); - - if ((*nf_bridge = kmalloc(sizeof(**nf_bridge), GFP_ATOMIC)) != NULL) { - atomic_set((*nf_bridge)-use, 1); - (*nf_bridge)-mask = 0; - (*nf_bridge)-physindev = (*nf_bridge)-physoutdev = NULL; -#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE) - (*nf_bridge)-netoutdev = NULL; -#endif - } - - return *nf_bridge; -} /* Only used in br_forward.c */ static inline @@ -77,17 +61,6 @@ void nf_bridge_maybe_copy_header(struct } } -static inline -void nf_bridge_save_header(struct sk_buff *skb) -{ -int header_size = 16; - - if (skb-protocol == __constant_htons(ETH_P_8021Q)) - header_size = 18; - - memcpy(skb-nf_bridge-data, skb-data - header_size, header_size); -} - /* This is called by the IP fragmenting code and it ensures there is * enough room for the encapsulating header (if there is one). */ static inline --- net-2.6.17.orig/net/bridge/br_netfilter.c +++ net-2.6.17/net/bridge/br_netfilter.c @@ -113,6 +113,25 @@ static inline struct net_device *bridge_ return port ? port-br-dev : NULL; } +static inline struct nf_bridge_info *nf_bridge_alloc(struct sk_buff *skb) +{ + skb-nf_bridge = kzalloc(sizeof(struct nf_bridge_info), GFP_ATOMIC); + if (likely(skb-nf_bridge)) + atomic_set((skb-nf_bridge-use), 1); + + return skb-nf_bridge; +} + +static inline void nf_bridge_save_header(struct sk_buff *skb) +{ +int header_size = 16; + + if (skb-protocol == htons(ETH_P_8021Q)) + header_size = 18; + + memcpy(skb-nf_bridge-data, skb-data - header_size, header_size); +} + /* PF_BRIDGE/PRE_ROUTING */ /* Undo the changes made for ip6tables PREROUTING and continue the * bridge PRE_ROUTING hook. */ @@ -371,7 +390,6 @@ static unsigned int br_nf_pre_routing_ip { struct ipv6hdr *hdr; u32 pkt_len; - struct nf_bridge_info *nf_bridge; if (skb-len sizeof(struct ipv6hdr)) goto inhdr_error; @@ -400,7 +418,7 @@ static unsigned int br_nf_pre_routing_ip goto inhdr_error; nf_bridge_put(skb-nf_bridge); - if ((nf_bridge = nf_bridge_alloc(skb)) == NULL) + if (!nf_bridge_alloc(skb)) return NF_DROP; if (!setup_pre_routing(skb)) return NF_DROP; @@ -428,7 +446,6 @@ static unsigned int br_nf_pre_routing(un struct iphdr *iph; __u32 len; struct sk_buff *skb = *pskb; - struct nf_bridge_info *nf_bridge; if (skb-protocol == htons(ETH_P_IPV6) || IS_VLAN_IPV6(skb)) { #ifdef CONFIG_SYSCTL @@ -485,7 +502,7 @@ static unsigned int br_nf_pre_routing(un } nf_bridge_put(skb-nf_bridge); - if ((nf_bridge = nf_bridge_alloc(skb)) == NULL) + if (!nf_bridge_alloc(skb)) return NF_DROP; if (!setup_pre_routing(skb)) return NF_DROP; -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/15] bridge: use kcalloc
Use kcalloc rather than kmalloc + memset. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] Index: br-2.6/net/bridge/br_if.c === --- br-2.6.orig/net/bridge/br_if.c +++ br-2.6/net/bridge/br_if.c @@ -237,12 +237,11 @@ static int find_portno(struct net_bridge struct net_bridge_port *p; unsigned long *inuse; - inuse = kmalloc(BITS_TO_LONGS(BR_MAX_PORTS)*sizeof(unsigned long), + inuse = kcalloc(BITS_TO_LONGS(BR_MAX_PORTS), sizeof(unsigned long), GFP_KERNEL); if (!inuse) return -ENOMEM; - memset(inuse, 0, BITS_TO_LONGS(BR_MAX_PORTS)*sizeof(unsigned long)); set_bit(0, inuse); /* zero is reserved */ list_for_each_entry(p, br-port_list, list) { set_bit(p-port_no, inuse); -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/15] bridge: allow show/store of group multicast address
Bridge's communicate with each other using Spanning Tree Protocol over a standard multicast address. There are times when testing or layering bridges over existing topologies or tunnels, when it is useful to use alternative multicast addresses for STP packets. The 802.1d standard has some unused addresses, that can be used for this. This patch is restrictive in that it only allows one of the possible addresses in the standard. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/net/bridge/br_input.c +++ net-2.6.17/net/bridge/br_input.c @@ -19,7 +19,8 @@ #include linux/netfilter_bridge.h #include br_private.h -const unsigned char bridge_ula[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 }; +/* Bridge group multicast address 802.1d (pg 51). */ +const u8 br_group_address[ETH_ALEN] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 }; static void br_pass_frame_up(struct net_bridge *br, struct sk_buff *skb) { @@ -108,9 +109,9 @@ static int br_handle_local_finish(struct /* Does address match the link local multicast address. * 01:80:c2:00:00:0X */ -static inline int is_link_local(const unsigned char *dest) +static inline int is_link_local(const const unsigned char *dest) { - return memcmp(dest, bridge_ula, 5) == 0 (dest[5] 0xf0) == 0; + return memcmp(dest, br_group_address, 5) == 0 (dest[5] 0xf0) == 0; } /* --- net-2.6.17.orig/net/bridge/br_private.h +++ net-2.6.17/net/bridge/br_private.h @@ -109,6 +109,7 @@ struct net_bridge unsigned long bridge_hello_time; unsigned long bridge_forward_delay; + u8 group_addr[ETH_ALEN]; u16 root_port; unsigned char stp_enabled; unsigned char topology_change; @@ -122,7 +123,7 @@ struct net_bridge }; extern struct notifier_block br_device_notifier; -extern const unsigned char bridge_ula[6]; +extern const u8 br_group_address[ETH_ALEN]; /* called under bridge lock */ static inline int br_is_root_bridge(const struct net_bridge *br) --- net-2.6.17.orig/net/bridge/br_stp_bpdu.c +++ net-2.6.17/net/bridge/br_stp_bpdu.c @@ -47,7 +47,7 @@ static void br_send_bpdu(struct net_brid skb-dev = dev; skb-protocol = htons(ETH_P_802_2); skb-mac.raw = skb_put(skb, size); - memcpy(skb-mac.raw, bridge_ula, ETH_ALEN); + memcpy(skb-mac.raw, p-br-group_addr, ETH_ALEN); memcpy(skb-mac.raw+ETH_ALEN, dev-dev_addr, ETH_ALEN); skb-mac.raw[2*ETH_ALEN] = 0; skb-mac.raw[2*ETH_ALEN+1] = length; @@ -171,7 +171,7 @@ int br_stp_rcv(struct sk_buff *skb, stru || !(br-dev-flags IFF_UP)) goto out; - if (compare_ether_addr(dest, bridge_ula) != 0) + if (compare_ether_addr(dest, br-group_addr) != 0) goto out; buf = skb_pull(skb, 3); --- net-2.6.17.orig/net/bridge/br_sysfs_br.c +++ net-2.6.17/net/bridge/br_sysfs_br.c @@ -242,6 +242,54 @@ static ssize_t show_gc_timer(struct clas } static CLASS_DEVICE_ATTR(gc_timer, S_IRUGO, show_gc_timer, NULL); +static ssize_t show_group_addr(struct class_device *cd, char *buf) +{ + struct net_bridge *br = to_bridge(cd); + return sprintf(buf, %x:%x:%x:%x:%x:%x\n, + br-group_addr[0], br-group_addr[1], + br-group_addr[2], br-group_addr[3], + br-group_addr[4], br-group_addr[5]); +} + +static ssize_t store_group_addr(struct class_device *cd, const char *buf, + size_t len) +{ + struct net_bridge *br = to_bridge(cd); + unsigned new_addr[6]; + int i; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + if (sscanf(buf, %x:%x:%x:%x:%x:%x, + new_addr[0], new_addr[1], new_addr[2], + new_addr[3], new_addr[4], new_addr[5]) != 6) + return -EINVAL; + + /* Must be 01:80:c2:00:00:0X */ + for (i = 0; i 5; i++) + if (new_addr[i] != br_group_address[i]) + return -EINVAL; + + if (new_addr[5] ~0xf) + return -EINVAL; + + if (new_addr[5] == 1/* 802.3x Pause address */ + || new_addr[5] == 2 /* 802.3ad Slow protocols */ + || new_addr[5] == 3) /* 802.1X PAE address */ + return -EINVAL; + + spin_lock_bh(br-lock); + for (i = 0; i 6; i++) + br-group_addr[i] = new_addr[i]; + spin_unlock_bh(br-lock); + return len; +} + +static CLASS_DEVICE_ATTR(group_addr, S_IRUGO | S_IWUSR, +show_group_addr, store_group_addr); + + static struct attribute *bridge_attrs[] = { class_device_attr_forward_delay.attr, class_device_attr_hello_time.attr, @@ -259,6 +307,7 @@ static struct attribute *bridge_attrs[] class_device_attr_tcn_timer.attr,
[PATCH 15/15] bridge: use LLC to send STP
The bridge code can use existing LLC output code when building spanning tree protocol packets. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] Index: net-2.6.17/net/bridge/br_stp_bpdu.c === --- net-2.6.17.orig/net/bridge/br_stp_bpdu.c +++ net-2.6.17/net/bridge/br_stp_bpdu.c @@ -17,6 +17,7 @@ #include linux/netfilter_bridge.h #include linux/etherdevice.h #include linux/llc.h +#include net/llc.h #include net/llc_pdu.h #include br_private.h @@ -24,36 +25,31 @@ #define STP_HZ 256 -static void br_send_bpdu(struct net_bridge_port *p, unsigned char *data, int length) +#define LLC_RESERVE sizeof(struct llc_pdu_un) + +static void br_send_bpdu(struct net_bridge_port *p, +const unsigned char *data, int length) { - struct net_device *dev; struct sk_buff *skb; - int size; if (!p-br-stp_enabled) return; - size = length + 2*ETH_ALEN + 2; - if (size 60) - size = 60; - - dev = p-dev; - - if ((skb = dev_alloc_skb(size)) == NULL) { - printk(KERN_INFO br: memory squeeze!\n); + skb = dev_alloc_skb(length+LLC_RESERVE); + if (!skb) return; - } - skb-dev = dev; + skb-dev = p-dev; skb-protocol = htons(ETH_P_802_2); - skb-mac.raw = skb_put(skb, size); - memcpy(skb-mac.raw, p-br-group_addr, ETH_ALEN); - memcpy(skb-mac.raw+ETH_ALEN, dev-dev_addr, ETH_ALEN); - skb-mac.raw[2*ETH_ALEN] = 0; - skb-mac.raw[2*ETH_ALEN+1] = length; - skb-nh.raw = skb-mac.raw + 2*ETH_ALEN + 2; - memcpy(skb-nh.raw, data, length); - memset(skb-nh.raw + length, 0xa5, size - length - 2*ETH_ALEN - 2); + + skb_reserve(skb, LLC_RESERVE); + memcpy(__skb_put(skb, length), data, length); + + llc_pdu_header_init(skb, LLC_PDU_TYPE_U, LLC_SAP_BSPAN, + LLC_SAP_BSPAN, LLC_PDU_CMD); + llc_pdu_init_as_ui_cmd(skb); + + llc_mac_hdr_init(skb, p-dev-dev_addr, p-br-group_addr); NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_OUT, skb, NULL, skb-dev, dev_queue_xmit); @@ -76,60 +72,54 @@ static inline int br_get_ticks(const uns /* called under bridge lock */ void br_send_config_bpdu(struct net_bridge_port *p, struct br_config_bpdu *bpdu) { - unsigned char buf[38]; + unsigned char buf[35]; - buf[0] = 0x42; - buf[1] = 0x42; - buf[2] = 0x03; - buf[3] = 0; - buf[4] = 0; - buf[5] = 0; - buf[6] = BPDU_TYPE_CONFIG; - buf[7] = (bpdu-topology_change ? 0x01 : 0) | + buf[0] = 0; + buf[1] = 0; + buf[2] = 0; + buf[3] = BPDU_TYPE_CONFIG; + buf[4] = (bpdu-topology_change ? 0x01 : 0) | (bpdu-topology_change_ack ? 0x80 : 0); - buf[8] = bpdu-root.prio[0]; - buf[9] = bpdu-root.prio[1]; - buf[10] = bpdu-root.addr[0]; - buf[11] = bpdu-root.addr[1]; - buf[12] = bpdu-root.addr[2]; - buf[13] = bpdu-root.addr[3]; - buf[14] = bpdu-root.addr[4]; - buf[15] = bpdu-root.addr[5]; - buf[16] = (bpdu-root_path_cost 24) 0xFF; - buf[17] = (bpdu-root_path_cost 16) 0xFF; - buf[18] = (bpdu-root_path_cost 8) 0xFF; - buf[19] = bpdu-root_path_cost 0xFF; - buf[20] = bpdu-bridge_id.prio[0]; - buf[21] = bpdu-bridge_id.prio[1]; - buf[22] = bpdu-bridge_id.addr[0]; - buf[23] = bpdu-bridge_id.addr[1]; - buf[24] = bpdu-bridge_id.addr[2]; - buf[25] = bpdu-bridge_id.addr[3]; - buf[26] = bpdu-bridge_id.addr[4]; - buf[27] = bpdu-bridge_id.addr[5]; - buf[28] = (bpdu-port_id 8) 0xFF; - buf[29] = bpdu-port_id 0xFF; - - br_set_ticks(buf+30, bpdu-message_age); - br_set_ticks(buf+32, bpdu-max_age); - br_set_ticks(buf+34, bpdu-hello_time); - br_set_ticks(buf+36, bpdu-forward_delay); + buf[5] = bpdu-root.prio[0]; + buf[6] = bpdu-root.prio[1]; + buf[7] = bpdu-root.addr[0]; + buf[8] = bpdu-root.addr[1]; + buf[9] = bpdu-root.addr[2]; + buf[10] = bpdu-root.addr[3]; + buf[11] = bpdu-root.addr[4]; + buf[12] = bpdu-root.addr[5]; + buf[13] = (bpdu-root_path_cost 24) 0xFF; + buf[14] = (bpdu-root_path_cost 16) 0xFF; + buf[15] = (bpdu-root_path_cost 8) 0xFF; + buf[16] = bpdu-root_path_cost 0xFF; + buf[17] = bpdu-bridge_id.prio[0]; + buf[18] = bpdu-bridge_id.prio[1]; + buf[19] = bpdu-bridge_id.addr[0]; + buf[20] = bpdu-bridge_id.addr[1]; + buf[21] = bpdu-bridge_id.addr[2]; + buf[22] = bpdu-bridge_id.addr[3]; + buf[23] = bpdu-bridge_id.addr[4]; + buf[24] = bpdu-bridge_id.addr[5]; + buf[25] = (bpdu-port_id 8) 0xFF; + buf[26] = bpdu-port_id 0xFF; + + br_set_ticks(buf+27, bpdu-message_age); + br_set_ticks(buf+29, bpdu-max_age); +
[PATCH 00/15] Bridge update for 2.6.17
This patches includes previous bridge changes for 2.6.17 and new patches to enable using LLC to handle bridge STP packets. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/15] llc: llc_mac_hdr_init const arguments
Cleanup of LLC. llc_mac_hdr_init can take constant arguments, and it is defined twice once in llc_output.h that is otherwise unused. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/include/net/llc.h +++ net-2.6.17/include/net/llc.h @@ -71,7 +71,7 @@ extern int llc_rcv(struct sk_buff *skb, struct packet_type *pt, struct net_device *orig_dev); extern int llc_mac_hdr_init(struct sk_buff *skb, - unsigned char *sa, unsigned char *da); + const unsigned char *sa, const unsigned char *da); extern void llc_add_pack(int type, void (*handler)(struct llc_sap *sap, struct sk_buff *skb)); --- net-2.6.17.orig/net/llc/llc_c_ac.c +++ net-2.6.17/net/llc/llc_c_ac.c @@ -27,7 +27,6 @@ #include net/llc_pdu.h #include net/llc.h -#include llc_output.h static int llc_conn_ac_inc_vs_by_1(struct sock *sk, struct sk_buff *skb); static void llc_process_tmr_ev(struct sock *sk, struct sk_buff *skb); --- net-2.6.17.orig/net/llc/llc_output.c +++ net-2.6.17/net/llc/llc_output.c @@ -30,7 +30,8 @@ * Fills MAC header fields, depending on MAC type. Returns 0, If MAC type * is a valid type and initialization completes correctly 1, otherwise. */ -int llc_mac_hdr_init(struct sk_buff *skb, unsigned char *sa, unsigned char *da) +int llc_mac_hdr_init(struct sk_buff *skb, +const unsigned char *sa, const unsigned char *da) { int rc = 0; --- net-2.6.17.orig/net/llc/llc_output.h +++ /dev/null @@ -1,20 +0,0 @@ -#ifndef LLC_OUTPUT_H -#define LLC_OUTPUT_H -/* - * Copyright (c) 1997 by Procom Technology, Inc. - * 2001-2003 by Arnaldo Carvalho de Melo [EMAIL PROTECTED] - * - * This program can be redistributed or modified under the terms of the - * GNU General Public License version 2 as published by the Free Software - * Foundation. - * This program is distributed without any warranty or implied warranty - * of merchantability or fitness for a particular purpose. - * - * See the GNU General Public License version 2 for more details. - */ - -struct sk_buff; - -int llc_mac_hdr_init(struct sk_buff *skb, unsigned char *sa, unsigned char *da); - -#endif /* LLC_OUTPUT_H */ --- net-2.6.17.orig/net/llc/llc_s_ac.c +++ net-2.6.17/net/llc/llc_s_ac.c @@ -24,7 +24,7 @@ #include net/llc_s_ac.h #include net/llc_s_ev.h #include net/llc_sap.h -#include llc_output.h + /** * llc_sap_action_unit_data_ind - forward UI PDU to network layer -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/15] bridge: optimize frame pass up
The netfilter hook that is used to receive frames doesn't need to be a stub. It is only called in two ways, both of which ignore the return value. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- net-2.6.17.orig/net/bridge/br_input.c +++ net-2.6.17/net/bridge/br_input.c @@ -21,12 +21,6 @@ const unsigned char bridge_ula[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 }; -static int br_pass_frame_up_finish(struct sk_buff *skb) -{ - netif_receive_skb(skb); - return 0; -} - static void br_pass_frame_up(struct net_bridge *br, struct sk_buff *skb) { struct net_device *indev; @@ -38,7 +32,7 @@ static void br_pass_frame_up(struct net_ skb-dev = br-dev; NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, indev, NULL, - br_pass_frame_up_finish); + netif_receive_skb); } /* note: already called with rcu_read_lock (preempt_disabled) */ -- - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/15] bridge: netfilter VLAN macro cleanup
Fix the VLAN macros in bridge netfilter code. Macros should not depend on magic variables. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] Index: net-2.6.17/net/bridge/br_netfilter.c === --- net-2.6.17.orig/net/bridge/br_netfilter.c +++ net-2.6.17/net/bridge/br_netfilter.c @@ -61,15 +61,25 @@ static int brnf_filter_vlan_tagged = 1; #define brnf_filter_vlan_tagged 1 #endif -#define IS_VLAN_IP (skb-protocol == htons(ETH_P_8021Q) \ - hdr-h_vlan_encapsulated_proto == htons(ETH_P_IP) \ - brnf_filter_vlan_tagged) -#define IS_VLAN_IPV6 (skb-protocol == htons(ETH_P_8021Q) \ - hdr-h_vlan_encapsulated_proto == htons(ETH_P_IPV6) \ - brnf_filter_vlan_tagged) -#define IS_VLAN_ARP (skb-protocol == htons(ETH_P_8021Q)\ - hdr-h_vlan_encapsulated_proto == htons(ETH_P_ARP) \ - brnf_filter_vlan_tagged) +static __be16 inline vlan_proto(const struct sk_buff *skb) +{ + return vlan_eth_hdr(skb)-h_vlan_encapsulated_proto; +} + +#define IS_VLAN_IP(skb) \ + (skb-protocol == htons(ETH_P_8021Q) \ +vlan_proto(skb) == htons(ETH_P_IP) \ +brnf_filter_vlan_tagged) + +#define IS_VLAN_IPV6(skb) \ + (skb-protocol == htons(ETH_P_8021Q) \ +vlan_proto(skb) == htons(ETH_P_IPV6) \ +brnf_filter_vlan_tagged) + +#define IS_VLAN_ARP(skb) \ + (skb-protocol == htons(ETH_P_8021Q) \ +vlan_proto(skb) == htons(ETH_P_ARP) \ +brnf_filter_vlan_tagged) /* We need these fake structures to make netfilter happy -- * lots of places assume that skb-dst != NULL, which isn't @@ -419,9 +429,8 @@ static unsigned int br_nf_pre_routing(un __u32 len; struct sk_buff *skb = *pskb; struct nf_bridge_info *nf_bridge; - struct vlan_ethhdr *hdr = vlan_eth_hdr(*pskb); - if (skb-protocol == htons(ETH_P_IPV6) || IS_VLAN_IPV6) { + if (skb-protocol == htons(ETH_P_IPV6) || IS_VLAN_IPV6(skb)) { #ifdef CONFIG_SYSCTL if (!brnf_call_ip6tables) return NF_ACCEPT; @@ -440,7 +449,7 @@ static unsigned int br_nf_pre_routing(un return NF_ACCEPT; #endif - if (skb-protocol != htons(ETH_P_IP) !IS_VLAN_IP) + if (skb-protocol != htons(ETH_P_IP) !IS_VLAN_IP(skb)) return NF_ACCEPT; if ((skb = skb_share_check(*pskb, GFP_ATOMIC)) == NULL) @@ -521,9 +530,8 @@ static int br_nf_forward_finish(struct s { struct nf_bridge_info *nf_bridge = skb-nf_bridge; struct net_device *in; - struct vlan_ethhdr *hdr = vlan_eth_hdr(skb); - if (skb-protocol != htons(ETH_P_ARP) !IS_VLAN_ARP) { + if (skb-protocol != htons(ETH_P_ARP) !IS_VLAN_ARP(skb)) { in = nf_bridge-physindev; if (nf_bridge-mask BRNF_PKT_TYPE) { skb-pkt_type = PACKET_OTHERHOST; @@ -553,7 +561,6 @@ static unsigned int br_nf_forward_ip(uns { struct sk_buff *skb = *pskb; struct nf_bridge_info *nf_bridge; - struct vlan_ethhdr *hdr = vlan_eth_hdr(skb); struct net_device *parent; int pf; @@ -564,7 +571,7 @@ static unsigned int br_nf_forward_ip(uns if (!parent) return NF_DROP; - if (skb-protocol == htons(ETH_P_IP) || IS_VLAN_IP) + if (skb-protocol == htons(ETH_P_IP) || IS_VLAN_IP(skb)) pf = PF_INET; else pf = PF_INET6; @@ -596,7 +603,6 @@ static unsigned int br_nf_forward_arp(un int (*okfn)(struct sk_buff *)) { struct sk_buff *skb = *pskb; - struct vlan_ethhdr *hdr = vlan_eth_hdr(skb); struct net_device **d = (struct net_device **)(skb-cb); #ifdef CONFIG_SYSCTL @@ -605,14 +611,14 @@ static unsigned int br_nf_forward_arp(un #endif if (skb-protocol != htons(ETH_P_ARP)) { - if (!IS_VLAN_ARP) + if (!IS_VLAN_ARP(skb)) return NF_ACCEPT; skb_pull(*pskb, VLAN_HLEN); (*pskb)-nh.raw += VLAN_HLEN; } if (skb-nh.arph-ar_pln != 4) { - if (IS_VLAN_ARP) { + if (IS_VLAN_ARP(skb)) { skb_push(*pskb, VLAN_HLEN); (*pskb)-nh.raw -= VLAN_HLEN; } @@ -667,13 +673,12 @@ static unsigned int br_nf_local_out(unsi struct net_device *realindev, *realoutdev; struct sk_buff *skb = *pskb; struct nf_bridge_info *nf_bridge; - struct vlan_ethhdr *hdr = vlan_eth_hdr(skb); int pf; if (!skb-nf_bridge) return NF_ACCEPT; - if (skb-protocol == htons(ETH_P_IP) || IS_VLAN_IP) + if (skb-protocol == htons(ETH_P_IP) || IS_VLAN_IP(skb)) pf = PF_INET; else pf = PF_INET6; @@ -752,7 +757,6 @@ static unsigned int br_nf_post_routing(u { struct sk_buff *skb = *pskb;
Re: [PATCH 14/15] llc: llc_mac_hdr_init const arguments
On 3/13/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Cleanup of LLC. llc_mac_hdr_init can take constant arguments, and it is defined twice once in llc_output.h that is otherwise unused. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] Acked-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] - Arnaldo - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ES-API?
From: Zach Brown [EMAIL PROTECTED] Date: Mon, 13 Mar 2006 14:25:08 -0800 The API adds support for things like memory registration, async operations completed through event queues, standard sendfile() and async poll(), etc. Yes, this is how these standard groups try to force a particular approach to high speed networking down everyone's throats. Memory registration is all about VIA (Virtual Interface Architecture), RDMA, and the like. Neither of which is necessarily the way to get the best high performance networking. I think we'll pass on this stuff. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ES-API?
Yes, this is how these standard groups try to force a particular approach to high speed networking down everyone's throats. Yeah, from what I understand there's a lot of RDMA/IB people involved in that ``ICSC'' thing. I think we'll pass on this stuff. OK, so noted. Thanks. - z - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [TCP]: Fix zero port problem in IPv6
From: Herbert Xu [EMAIL PROTECTED] Date: Mon, 13 Mar 2006 22:50:31 +1100 I was looking into the bug report titled [IPv6 since 2.6.12] connect() without bind() can't time out. There are actually a couple of different issues here. Here is a fix to the most obvious one. Thanks for looking into this Herbert. [TCP]: Fix zero port problem in IPv6 When we link a socket into the hash table, we need to make sure that we set the num/port fields so that it shows us with a non-zero port value in proc/netlink and on the wire. This code and comment is copied over from the IPv4 stack as is. Signed-off-by: Herbert Xu [EMAIL PROTECTED] Applied, thanks a lot. I'll push this into 2.6.16 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ES-API?
Also, the following scares me: This specification has not been verified for avoidance of possible third-party proprietary rights. In implementing this specification, usual procedures to ensure the respect of possible third-party intellectual property rights should be followed. Scary, indeed. I'll try and ask people about that bit. - z - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 3/8] [I/OAT] Setup the networking subsystem as a DMA client
+#ifdef CONFIG_NET_DMA +#include linux/dmaengine.h +#endif There are still a number of instances of this in the patch series. Did you decide to keep the ifdefs in there for some reason? No, I just missed them in header files. Thanks. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[2.6 patch] hostap_{pci,plx}.c: fix memory leaks
This patch fixes two memotry leaks spotted by the Coverity checker. Signed-off-by: Adrian Bunk [EMAIL PROTECTED] --- drivers/net/wireless/hostap/hostap_pci.c |6 +++--- drivers/net/wireless/hostap/hostap_plx.c |6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) --- linux-2.6.16-rc6-mm1-full/drivers/net/wireless/hostap/hostap_pci.c.old 2006-03-13 22:34:30.0 +0100 +++ linux-2.6.16-rc6-mm1-full/drivers/net/wireless/hostap/hostap_pci.c 2006-03-13 22:37:57.0 +0100 @@ -301,14 +301,14 @@ static int prism2_pci_probe(struct pci_d struct hostap_interface *iface; struct hostap_pci_priv *hw_priv; + if (pci_enable_device(pdev)) + return -EIO; + hw_priv = kmalloc(sizeof(*hw_priv), GFP_KERNEL); if (hw_priv == NULL) return -ENOMEM; memset(hw_priv, 0, sizeof(*hw_priv)); - if (pci_enable_device(pdev)) - return -EIO; - phymem = pci_resource_start(pdev, 0); if (!request_mem_region(phymem, pci_resource_len(pdev, 0), Prism2)) { --- linux-2.6.16-rc6-mm1-full/drivers/net/wireless/hostap/hostap_plx.c.old 2006-03-13 22:39:40.0 +0100 +++ linux-2.6.16-rc6-mm1-full/drivers/net/wireless/hostap/hostap_plx.c 2006-03-13 22:40:09.0 +0100 @@ -446,14 +446,14 @@ static int prism2_plx_probe(struct pci_d int tmd7160; struct hostap_plx_priv *hw_priv; + if (pci_enable_device(pdev)) + return -EIO; + hw_priv = kmalloc(sizeof(*hw_priv), GFP_KERNEL); if (hw_priv == NULL) return -ENOMEM; memset(hw_priv, 0, sizeof(*hw_priv)); - if (pci_enable_device(pdev)) - return -EIO; - /* National Datacomm NCP130 based on TMD7160, not PLX9052. */ tmd7160 = (pdev-vendor == 0x15e8) (pdev-device == 0x0131); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Router stops routing after changing MAC Address
On Mon, 13 Mar 2006, Greg Scott wrote: Yup. I had a situation 2 weeks ago where a customer connected a system to the Internet with an IP Address he should not have used. And the little Cisco router on the frontend dutifully recorded it in its ARP cache - forever, with no TTL! This took down their webmail for most of a day until we finally had to cycle the power on that nasty little Cisco 678. Bigger routers do it too. I've had several situations over the years where I replaced an older firewall with a newer one with the same IP Addresses. All the internal servers find it soon enough. But I've waited literally hours for the routers to finally purge their ARP caches so they would see my replacement systems - often with the customer looking over my shoulders getting more and more nervous by the minute. And sometimes the routers are not accessible - you can't cycle them even if you had permission. Consider the cases of bridged DSL service - Bst... Not! There are not any MAC addresses associated with any of the intercity links, usually not even in WANs! MAC is for Ethernet! Once you go to fiber, ATM, T-N, etc., there are no MAC addresses. That's why there are bridges and routers, you got to connect your tiny time-slot to your LAN and that first device contains the MAC address that all your other stuff talks to. where the real router could be on the other side of the country. Try calling an ISP and asking the tech on the other end to purge an ARP cache on a router. So the same IP Addresses but different MAC addresses, all you can do is wait for the passage of (lots of) time. That happened to me in my own network once. I accidently took down my email server for something like 4 hours one time when I got careless. Indeed, there is a large onus on the software doing the MAC override to make sure it does not break the required uniqueness. Just as if one were using locally administered MAC addresses. Yes. My 12:34:56 OUI scheme will work for this project but it is definitely not good for the long term. I really really hope I have to spend some money with the IEEE soon to support lots and lots of rollouts. :) - Greg Scott -Original Message- From: Rick Jones [mailto:[EMAIL PROTECTED] Sent: Monday, March 13, 2006 3:50 PM To: linux-os (Dick Johnson) Cc: Greg Scott; Chuck Ebbert; linux-kernel; netdev@vger.kernel.org; Bart Samwel; Alan Cox; Simon Mackinlay Subject: Re: Router stops routing after changing MAC Address Anyway, if the device fails, you have routers and hosts ARPing the interface, trying to establish a route anyway. But only after what may be a much longer time than the customer is willing to accept or able to configure. I know of a number of HA situations where the new device is given the old MAC just to avoid that speicific situation of ARP caches not being updated except after quite some time. Not necessarily on the end-systems, the issue can be with intermediate devices (routers). And if one has to work with static ARP entries to deal (however imperfectly) with ARP poisioning or whatnot... Indeed, there is a large onus on the software doing the MAC override to make sure it does not break the required uniqueness. Just as if one were using locally administered MAC addresses. rick jones - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ Cheers, Dick Johnson Penguin : Linux version 2.6.15.4 on an i686 machine (5589.54 BogoMips). Warning : 98.36% of all statistics are fiction, book release in April. _ The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [EMAIL PROTECTED] - and destroy all copies of this information, including any attachments, without reading or disclosing them. Thank you. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Writing a Rate based transport protocol
Hi All, I am trying to write a new rate based transport protocol in linux kernel (either as a module or directly within the kernel). Basically it would be similar to UDP but with features like dynamic rate control, connection and state management, error control like TCP. Is there any established framework which i can use? I know there is one for window based protocols like TCP where one can dynamically register different congestion control mechanisms. I would appreciate if somebody can give me some direction in this regard. Thanks in advance. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[Patch 8/9] generic netlink utility functions
genetlink-utils.patch Two utilities for simplifying usage of NETLINK_GENERIC interface. Signed-off-by: Balbir Singh [EMAIL PROTECTED] Signed-off-by: Shailabh Nagar [EMAIL PROTECTED] include/net/genetlink.h | 20 1 files changed, 20 insertions(+) Index: linux-2.6.16-rc5/include/net/genetlink.h === --- linux-2.6.16-rc5.orig/include/net/genetlink.h 2006-03-11 07:41:32.0 -0500 +++ linux-2.6.16-rc5/include/net/genetlink.h2006-03-11 07:41:41.0 -0500 @@ -150,4 +150,24 @@ static inline int genlmsg_unicast(struct return nlmsg_unicast(genl_sock, skb, pid); } +/** + * gennlmsg_data - head of message payload + * @gnlh: genetlink messsage header + */ +static inline void *genlmsg_data(const struct genlmsghdr *gnlh) +{ + return ((unsigned char *) gnlh + GENL_HDRLEN); +} + +/** + * genlmsg_len - length of message payload + * @gnlh: genetlink message header + */ +static inline int genlmsg_len(const struct genlmsghdr *gnlh) +{ + struct nlmsghdr *nlh = (struct nlmsghdr *)((unsigned char *)gnlh - + NLMSG_HDRLEN); + return (nlh-nlmsg_len - GENL_HDRLEN - NLMSG_HDRLEN); +} + #endif /* __NET_GENERIC_NETLINK_H */ - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scm: fold __scm_send() into scm_send()
Ingo Oeser [EMAIL PROTECTED] wrote: -int scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie *scm) -{ -struct task_struct *p = current; -scm-creds = (struct ucred) { -.uid = p-uid, -.gid = p-gid, -.pid = p-tgid -}; -scm-fp = NULL; -scm-sid = security_sk_sid(sock-sk, NULL, 0); -scm-seq = 0; -if (msg-msg_controllen = 0) -return 0; -return __scm_send(sock, msg, scm); -} It's worth noting that scm_send() will call security_sk_sid() even if (msg-msg_controllen = 0). If that test is likely to be true with any frequency then perhaps we can optimise things... - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [NET] : get rid of 'struct sec_path *' in skbuff if ! CONFIG_XFRM
Eric Dumazet [EMAIL PROTECTED] wrote: - C(sp); -#ifdef CONFIG_INET - secpath_get(skb-sp); +#ifdef CONFIG_XFRM + n-sp = secpath_get(skb-sp); #endif Please move the ifdef into xfrm.h while you're at it. That is, make secpath_get a no-op if CONFIG_XFRM is not defined. Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
If you dont mind please stop ccing lartc - they keep bouncing my mail. On Tue, 2006-14-03 at 10:31 +1000, Russell Stuart wrote: Anyway, jokes aside, the situation we have now is the current tc doesn't work with the current kernel. Slow down: The two perceived problematic combos are: a) old hash in tc for the case of sample with kernel 2.6 with current hash. - This is not a problem if you use a single byte. Usage of anything other than one byte is not documented anywhere as you point out. Therefore the issue of backward compatibility is not a big deal IMO because it is hardly used as a feature. People who are brave such as yourself, who go beyond the 1 byte or less than a byte can be brave enough to upgrade as well. b) new hash (if it is to be upgraded) with 2.4.x Again non-issue with 1 byte. Issues show up if you use or 1 byte. And besides if you really insist, look at using hashkey - it will work a lot better since the dependency is only at the kernel and none at user space. [You made assertions that the old hash was better - can we please defer that discussion to later and resolve this first? There are many variables that you ignored in your derivation (we can discuss what those are later)] So my take on this is: either forget about making any changes at all or fix things so going forward they will work(which is the recommendation i have made). Backward compatibility is a less important issue for something that perhaps a handful of people use (I consider myself a nig user of u32 and hardly use this feature). cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Tue, 2006-14-03 at 12:29 +1000, Russell Stuart wrote: On Sat, 2006-03-11 at 08:11 -0500, jamal wrote: On Fri, 2006-10-02 at 12:33 +1000, Russell Stuart wrote: This patch adds a divisor option to tc's sample clause: While this looks right - can we have more test data with tc filter ls both before and after your fix? Specify divisor of 256 and 16 for example. Show that for the 256 it is the same as before and for 16 it does the right thing. With patch, divisor 256: Thanks. Looks like it is doing the right thing - Although sometimes the only way to really see things is to send some traffic. cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Tue, 2006-14-03 at 12:45 +1000, Russell Stuart wrote: On Sat, 2006-03-11 at 10:56 -0500, jamal wrote: [1] Essentially, if you teach u32 in 2.4 to spread rules over different buckets it will not do so evenly. Off top of my head i remember that the best you could ever do is have bucket selection in increments of 4 (so bucket 0, 4, 8,..) with certain divisor sizes - mostly works with 256; to work properly you need it to be (0,1,2,3,..). implies -- When it does work, it is detrimental to lookup speed when you hit thousands of rules because some buckets get overloaded and others are totaly unused and when it doesnt work, it results in classifier misses for rules you inserted. Hmm. I can't see how this could be so. Can you give specific examples. I will try to find the scripts. The only time I can think of where the 2.4 XOR hash would be worse is where there is a correlation between the bits in different bytes. actually if memory serves me right, thats what part of it ;- I can't think of a real life example of where that would be so. The end goal is to reduce the total lookups at the expense of more RAM. I dont see using 20M as a big deal for example. Imagine a few thousand hosts... and you want to distribute these across many hash tables (i recall the test i had was for 64K unique hosts). One of the techniques you could use is to have several layers of hash tables before eventually hitting the terminal; at each then you could spread the bits (of lets say the 3rd octet of src IP in 3 bit hash masks) in the same hash level at the diagonal level. The result is many hash tables created (a lot of memory consumed) but faster lookups. In such a case a selection on a hash table could be made with bit pattern 101 or 111 etc as the mask with a hash table with only 8 buckets. I am not sure if that made sense. In every other case it will be as good as, but typically better than, the 2.6 algorithm. That could be debated. There are many variables you need to consider with u32 before reaching that conclusion, heres what comes to mind: 1) the number of buckets at the hash table - you picked 256 in your example. 2) the mask and size (as you have found out with 1 or / 1 byte ) and whether it is contigous. 3) the bit offset of where the ANDing happens. Different bit positions will result in different distributions. To properly analyze things you need to have a variation of these. When i last compared the two, the all round winner was the 2.6.x one. The equation you provided for evaluating the different hashes does look clever but i have never seen it used anywhere. What i did was to compute the std deviation from the best value. cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [2.6 patch] hostap_{pci,plx}.c: fix memory leaks
Adrian Bunk [EMAIL PROTECTED] wrote: + if (pci_enable_device(pdev)) + return -EIO; + hw_priv = kmalloc(sizeof(*hw_priv), GFP_KERNEL); if (hw_priv == NULL) return -ENOMEM; memset(hw_priv, 0, sizeof(*hw_priv)); - if (pci_enable_device(pdev)) - return -EIO; - You've just turned it into a leak of a different kind. Why not jump to the error exit instead? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch 9/9] Generic netlink interface for delay accounting
On Mon, 2006-13-03 at 19:56 -0500, Shailabh Nagar wrote: delayacct-genetlink.patch Create a generic netlink interface (NETLINK_GENERIC family), called taskstats, for getting delay and cpu statistics of tasks and thread groups during their lifetime and when they exit. Comments addressed (all in response to Jamal) Note, you are still not following the standard scheme of doing things. Example: using command = GET and the message carrying the TGID to note which TGID is of interest. Instead you have command = TGID. cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch 9/9] Generic netlink interface for delay accounting
On Mon, 2006-03-13 at 19:56 -0500, Shailabh Nagar wrote: snip Comments addressed (all in response to Jamal) - Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE The enums for these are still in the patch. See below. snip +/* + * Commands sent from userspace + * Not versioned. New commands should only be inserted at the enum's end + */ + +enum { + TASKSTATS_CMD_UNSPEC, /* Reserved */ + TASKSTATS_CMD_NONE, /* Not a valid cmd to send +* Marks data sent on task/tgid exit */ + TASKSTATS_CMD_LISTEN, /* Start listening */ + TASKSTATS_CMD_IGNORE, /* Stop listening */ From the description I thought you had eliminated these. + TASKSTATS_CMD_PID, /* Send stats for a pid */ + TASKSTATS_CMD_TGID, /* Send stats for a tgid */ +}; Jamal, was your Mon, 13 Mar 2006 21:29:09 -0500 reply: Note, you are still not following the standard scheme of doing things. Example: using command = GET and the message carrying the TGID to note which TGID is of interest. Instead you have command = TGID. cheers, jamal meant to suggest that TASKSTATS_CMD_(P|TG)ID should be renamed to TASKSTATS_CMD_GET_(P|TG)ID ? Is that sufficient? Or am I misunderstanding? snip Cheers, -Matt Helsley - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch 9/9] Generic netlink interface for delay accounting
On Mon, 2006-13-03 at 18:33 -0800, Matt Helsley wrote: On Mon, 2006-03-13 at 19:56 -0500, Shailabh Nagar wrote: Jamal, was your Mon, 13 Mar 2006 21:29:09 -0500 reply: Note, you are still not following the standard scheme of doing things. Example: using command = GET and the message carrying the TGID to note which TGID is of interest. Instead you have command = TGID. meant to suggest that TASKSTATS_CMD_(P|TG)ID should be renamed to TASKSTATS_CMD_GET_(P|TG)ID ? Is that sufficient? Or am I misunderstanding? I had a long description in an earlier email feedback; but the summary of it is the GET command is generic like TASKSTATS_CMD_GET; the message itself carries TLVs of what needs to be gotten which are either PID and/or TGID etc. Anyways, theres a long spill of what i am saying in that earlier email. Perhaps the current patch is a transition towards that? cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 13 Mar 2006 18:55:35 +0100 Patrick McHardy [EMAIL PROTECTED] wrote: jamal wrote: On Mon, 2006-13-03 at 14:44 +1000, Russell Stuart wrote: You are wrong on both counts. I am wrong on why it is being rejected - but what you are seeing is worse than i thought initially. Lets put it this way: The only you will _ever_ get that message is if you had made a syntax error (which you did not). Please look at the code on where that message appears and: a) tell me how you would have got that message to begin with using perfectly legal syntax. a) tell me how a memset would have fixed that. He already told you, pack_key expects the selector to be initialized, otherwise nkeys might contain a value = 128, which would cause exactly this error, if a matching key is not found within the uninitialized memory by accident. Just send the memset fix to Stephen with a different reason. Your current reason is _wrong_ and i really dont have much time to have this kind of discussion. If you had said I added that memset there because it looks like the right thing to do then we would not have had this discussion. You made claims you fixed a bug. It cant possibly be the bug you fixed. Was it some other bug perhaps and you mixed up the two? The memset fix is in current CVS. I just wasn't going to take the patch that looked at utsname to decide what hash to use. The patch as well as the description are perfectly fine. BTW, running valgrind on tc shows lots of uses of uninitialized values, it seems like a good idea if someone would go over these and fix them up. If we had a test script of commands (code coverage), that would help. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
Stephen Hemminger wrote: BTW, running valgrind on tc shows lots of uses of uninitialized values, it seems like a good idea if someone would go over these and fix them up. If we had a test script of commands (code coverage), that would help. Actually the Coverity scanner is quite good at spotting these problems, so I've requested iproute to be added to the list of regulary scanned projects. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
Stephen Hemminger wrote: The memset fix is in current CVS. I just wasn't going to take the patch that looked at utsname to decide what hash to use. Yes, that sucks. We have another incompatibility, old iproute versions (like the one shipped by Debian) show garbage statistics for HFSC. I think it still works properly, but I wasn't able to track down the cause. I have a feeling its somehow related to the gen_stats changes. BTW, running valgrind on tc shows lots of uses of uninitialized values, it seems like a good idea if someone would go over these and fix them up. If we had a test script of commands (code coverage), that would help. Getting good coverage looks like a lot of work, but I'll put it on my TODO list for a boring day :) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-03-13 at 10:04 -0800, Stephen Hemminger wrote: The memset fix is in current CVS. I just wasn't going to take the patch that looked at utsname to decide what hash to use. Stephen, could you describe your objections to it please? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Tue, 14 Mar 2006 07:43:57 +1000 Russell Stuart [EMAIL PROTECTED] wrote: On Mon, 2006-03-13 at 10:04 -0800, Stephen Hemminger wrote: The memset fix is in current CVS. I just wasn't going to take the patch that looked at utsname to decide what hash to use. Stephen, could you describe your objections to it please? Because it means that the API for netlink isn't being used properly. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-03-13 at 13:50 -0800, Stephen Hemminger wrote: On Tue, 14 Mar 2006 07:43:57 +1000 Russell Stuart [EMAIL PROTECTED] wrote: On Mon, 2006-03-13 at 10:04 -0800, Stephen Hemminger wrote: The memset fix is in current CVS. I just wasn't going to take the patch that looked at utsname to decide what hash to use. Stephen, could you describe your objections to it please? Because it means that the API for netlink isn't being used properly. Do you mean the binary API between the kernel and the applications has been broken; this is bad as it transgresses some unwritten law; and the patch is bad because it hides this problem rather than addressing it? Does the fact that the binary API changed during a major kernel release (between 2.4 and 2.6) give us some wriggle room here? Anyway, jokes aside, the situation we have now is the current tc doesn't work with the current kernel. This situation can't be allowed to continue, I presume. Ergo, here is a list of things we could do to fix this. All you (plural) have to do is choose one, or some combination: 1. Change the hashing algorithm back to the 2.4 way. (IMHO, the 2.4 algorithm was better.) Disadvantage: anybody who had a u32 filter that hashed on more than one non-zero byte will have their u32 filters suddenly break. However, since how the hashing algorithm was never documented beyond HOWTO's that showed how to hash on one byte, and every example of hashing I have seen has been a copy paste of said HOWTO's, my guess is there are stuff all people who will be effected. Another disadvantage: people who are using older 2.6 kernels (eg me on my production machines) won't have the problem fixed. 2. Change the hashing algorithm in tc to match the current kernel. Disadvantage: tc will no longer work with 2.4 kernels. 3. Change the hashing algorithm in tc to match the current kernel, and change the 2.4 kernel's hashing algorithm to match the 2.6 kernel. This is Jamal's proposed solution. Disadvantage: 2.4 is a stable kernel, produced when we made a real effort to distinguish between between stable and development kernels. This change would break API compatibility in a stable series. Yuk! 4. Make tc look at the kernel version, and choose the appropriate hashing function. This was my solution, and everybody seems to hate it bar me :(. Disadvantages: None, other than it hides a violation of an unwritten law. 5. A combination of 1 4. Change the hashing in 2.6 algorithm back to what it was in 2.4, and hide the change by making tc check the kernel version and choose the matching hash.Disadvantages: None, other than now we have wilfully broken the unwritten law twice. My personal preference is to have a tc in CVS that works with _all_ kernel versions. Yes - the netlink interface has been broken. But is was done, and now can't be undone. No matter why we do, there will forever more be kernel versions with incompatible netlink interfaces. We can't fix it. We just have to limit the damage. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Sat, 2006-03-11 at 08:11 -0500, jamal wrote: On Fri, 2006-10-02 at 12:33 +1000, Russell Stuart wrote: This patch adds a divisor option to tc's sample clause: While this looks right - can we have more test data with tc filter ls both before and after your fix? Specify divisor of 256 and 16 for example. Show that for the 256 it is the same as before and for 16 it does the right thing. With patch, divisor 256: tc qdisc del dev eth0 root tc qdisc add dev eth0 root handle 1: htb tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x01 0xff divisor 256 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x0f 0xff divisor 256 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x10 0xff divisor 256 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0xef 0xff divisor 256 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0xf0 0xff divisor 256 match u32 0 0 tc -s filter show dev eth0 filter parent 1: protocol ip pref 1 u32 filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256 filter parent 1: protocol ip pref 1 u32 fh 801:1:800 order 2048 key ht 801 bkt 1 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:f:800 order 2048 key ht 801 bkt f flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:10:800 order 2048 key ht 801 bkt 10 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:ef:800 order 2048 key ht 801 bkt ef flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:f0:800 order 2048 key ht 801 bkt f0 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1 Without patch, divisor 256: tc qdisc add dev eth0 root handle 1: htb tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 256 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x01 0xff match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x0f 0xff match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x10 0xff match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0xef 0xff match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0xf0 0xff match u32 0 0 tc -s filter show dev eth0 filter parent 1: protocol ip pref 1 u32 filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 256 filter parent 1: protocol ip pref 1 u32 fh 801:1:800 order 2048 key ht 801 bkt 1 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:f:800 order 2048 key ht 801 bkt f flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:10:800 order 2048 key ht 801 bkt 10 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:ef:800 order 2048 key ht 801 bkt ef flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:f0:800 order 2048 key ht 801 bkt f0 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1 With patch, divisor 16: tc qdisc del dev eth0 root tc qdisc add dev eth0 root handle 1: htb tc filter add dev eth0 parent 1:0 prio 1 protocol ip u32 ht 801: divisor 16 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x01 0xff divisor 16 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x0f 0xff divisor 16 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0x10 0xff divisor 16 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0xef 0xff divisor 16 match u32 0 0 tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 ht 801: classid 1:0 sample u8 0xf0 0xff divisor 16 match u32 0 0 tc -s filter show dev eth0 filter parent 1: protocol ip pref 1 u32 filter parent 1: protocol ip pref 1 u32 fh 801: ht divisor 16 filter parent 1: protocol ip pref 1 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801::801 order 2049 key ht 801 bkt 0 flowid 1: match / at 0 filter parent 1: protocol ip pref 1 u32 fh 801:1:800 order 2048 key ht 801 bkt 1 flowid 1: match / at 0
Re: [2.6 patch] drivers/net/wireless/ipw2200.c: fix an array overun
On Sat, 2006-03-11 at 04:42 +0100, Adrian Bunk wrote: This patch fixes a big array overun found by the Coverity checker. Signed-off-by: Adrian Bunk [EMAIL PROTECTED] --- linux-2.6.16-rc5-mm3-full/drivers/net/wireless/ipw2200.c.old 2006-03-11 02:41:23.0 +0100 +++ linux-2.6.16-rc5-mm3-full/drivers/net/wireless/ipw2200.c 2006-03-11 02:42:04.0 +0100 @@ -9956,9 +9956,8 @@ static int ipw_ethtool_set_eeprom(struct return -EINVAL; mutex_lock(p-mutex); memcpy(p-eeprom[eeprom-offset], bytes, eeprom-len); - for (i = IPW_EEPROM_DATA; - i IPW_EEPROM_DATA + IPW_EEPROM_IMAGE_SIZE; i++) - ipw_write8(p, i, p-eeprom[i]); + for (i = 0; i IPW_EEPROM_IMAGE_SIZE; i++) + ipw_write8(p, i + IPW_EEPROM_DATA, p-eeprom[i]); mutex_unlock(p-mutex); return 0; } Acked-by: Zhu Yi [EMAIL PROTECTED] Thanks, -yi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-03-13 at 21:39 -0500, jamal wrote: On Tue, 2006-14-03 at 12:29 +1000, Russell Stuart wrote: On Sat, 2006-03-11 at 08:11 -0500, jamal wrote: On Fri, 2006-10-02 at 12:33 +1000, Russell Stuart wrote: This patch adds a divisor option to tc's sample clause: While this looks right - can we have more test data with tc filter ls both before and after your fix? Specify divisor of 256 and 16 for example. Show that for the 256 it is the same as before and for 16 it does the right thing. With patch, divisor 256: Thanks. Looks like it is doing the right thing - Although sometimes the only way to really see things is to send some traffic. Take my word for it that I have sent mega bytes of traffic through it. I also used the patch below to view how the u32 filter was classifying things. Stephen - it looks like jamal is happy with this patch. (Jamal - I hope I am not putting words into your mouth.) Could you detail why you didn't add it to CVS, so I can attempt to fix any problems you have with it? Just to be clear, the patch below produces trace. It is an excellent way of debugging a complex u32 filter - but should kept away from production systems. diff -Nur kernel-source-2.6.15.orig/net/sched/cls_u32.c kernel-source-2.6.15/net/sched/cls_u32.c --- kernel-source-2.6.15.orig/net/sched/cls_u32.c 2006-01-03 13:21:10.0 +1000 +++ kernel-source-2.6.15/net/sched/cls_u32.c2006-02-23 17:39:27.0 +1000 @@ -58,6 +58,8 @@ #include net/act_api.h #include net/pkt_cls.h +#defineU32_TRACE(x, args...) printk(x, ## args) + struct tc_u_knode { struct tc_u_knode *next; @@ -131,6 +133,11 @@ #endif int i, r; + U32_TRACE( + u32_classify dev=%s len=%04x proto=%02x src=%08x dst=%08x ports=%08x skblen=%04x , + skb-dev-name, ((u16*)ptr)[1], ptr[9], + ((u32*)ptr)[3], ((u32*)ptr)[4], ((u32*)ptr)[5], skb-len); + next_ht: n = ht-ht[sel]; @@ -138,6 +145,9 @@ if (n) { struct tc_u32_key *key = n-sel.keys; + + U32_TRACE( handle=%08x, n-handle); + #ifdef CONFIG_CLS_U32_PERF n-pf-rcnt +=1; j = 0; @@ -146,6 +156,7 @@ #ifdef CONFIG_CLS_U32_MARK if ((skb-nfmark n-mark.mask) != n-mark.val) { n = n-next; + U32_TRACE(~mark); goto next_knode; } else { n-mark.success++; @@ -156,8 +167,12 @@ if ((*(u32*)(ptr+key-off+(off2key-offmask))^key-val)key-mask) { n = n-next; + U32_TRACE(~%08x=%08x%08x, + key-val, *(u32*)(ptr+key-off+(off2key-offmask)), key-mask); goto next_knode; } + U32_TRACE(,%08x=%08x%08x, + key-val, *(u32*)(ptr+key-off+(off2key-offmask)), key-mask); #ifdef CONFIG_CLS_U32_PERF n-pf-kcnts[j] +=1; j++; @@ -171,6 +186,7 @@ #ifdef CONFIG_NET_CLS_IND if (!tcf_match_indev(skb, n-indev)) { n = n-next; + U32_TRACE(~in-dev); goto next_knode; } #endif @@ -180,11 +196,15 @@ r = tcf_exts_exec(skb, n-exts, res); if (r 0) { n = n-next; + U32_TRACE(~exts); goto next_knode; } + U32_TRACE(~OK classid=%08x\n, n-res.classid); + return r; } + U32_TRACE(~class); n = n-next; goto next_knode; } @@ -198,11 +218,20 @@ ht = n-ht_down; sel = 0; - if (ht-divisor) + + U32_TRACE(,link-%08x, ht-handle); + if (ht-divisor) { + U32_TRACE((%08x%08x%d), + *(u32*)(ptr+n-sel.hoff), + n-sel.hmask, n-fshift); sel = ht-divisoru32_hash_fold(*(u32*)(ptr+n-sel.hoff), n-sel,n-fshift); + } + U32_TRACE([%02x], sel); - if (!(n-sel.flags(TC_U32_VAROFFSET|TC_U32_OFFSET|TC_U32_EAT))) + if (!(n-sel.flags(TC_U32_VAROFFSET|TC_U32_OFFSET|TC_U32_EAT))) { + U32_TRACE(:+%d+%d, ptr-skb-nh.raw, off2); goto next_ht; + } if (n-sel.flags(TC_U32_OFFSET|TC_U32_VAROFFSET)) {
Re: [PATCH] TC: bug fixes to the sample clause
On Tue, 2006-14-03 at 13:16 +1000, Russell Stuart wrote: On Mon, 2006-03-13 at 21:24 -0500, jamal wrote: [..] And besides if you really insist, look at using hashkey - it will work a lot better since the dependency is only at the kernel and none at user space. Hashkey and sample do different things. Briefly, hashkey is used to direct a link operation to the correct bucket at runtime, whereas sample is used put a u32 filter line in the correct bucket at compile time. If you want more detail read the doco I wrote. Trust me - I understand this stuff;- (I have described these two to many people). And i will take a look at your doc at some point. What i mean is: that you could achieve the result of stitching hashtables together to direct a packet to eventually reach a terminal node with either scheme. Sample is supposed to be more intuitive but you dont need to use it. As it happens, I agree with removing any dependency on the hashing algorithm at user level - if not at user space. tc lives in user space, and is the exception. I tend to view it like iptables - if you want it to work with the latest kernel, get the latest version of iptables. Ditto for tc. tc hides the details of the kernel. This hiding the details in tc gives the kernel developers the freedom to change the hashing algorithm, and indeed other details, any time they like. [..] So my take on this is: either forget about making any changes at all And leave the sample clause of u32 broken? I don't view that as a reasonable option. I am surprised you do. Given that the change is controversial and that other than you or i (and of course Alexey) - I dont know of anybody else who is even aware of this feature; i consider it as an option, yes. You said to do two things: a. Modify tc to use the 2.6 algorithm only. b. Modify the 2.4 kernel to use the 2.6 hashing algorithm. I viewed these two things are part of the same package. They have to be, otherwise we don't have a tc in CVS that works with all maintained kernels. Agreed. But you didnt seem to agree with this earlier. The issue I have with it is that if I were the 2.4 kernel maintainer, there is no way I would allow a change that breaks 2.4 binary compatibility. So to me, the package is a non-flyer. It breaks it for a small number of users - ones who are actually pretty clueful and will be able to do an upgrade. Actually, I am having trouble understanding why you dislike the utsname patch. Binary compatibility was broken between 2.4 and 2.6. Fine. So the assumption is the user space apps have to deal with it, I presume. If you don't like dealing with it using utsname, then what other method do you suggest we use? Ignoring the problem and saying it will only effect a few users isn't dealing with it. To me it looks more like pretending it didn't happen, in hopes that no-one will notice. If you can find 2 other people who use sample in scripts in the manner in which you and i use them (ok maybe i should lower it to 1 other person), then please send me a shipping address and I will send you a gift certificate for a drink of your choice. For the common use of 1 byte (for the cutnpasters out there) i think we agree it is a non-issue. So lets assume a worst case where there is no 2.4.x fix, it is still fine. cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Tue, 2006-14-03 at 13:27 +1000, Russell Stuart wrote: On Mon, 2006-03-13 at 21:39 -0500, jamal wrote: Looks like it is doing the right thing - Although sometimes the only way to really see things is to send some traffic. Take my word for it that I have sent mega bytes of traffic through it. I also used the patch below to view how the u32 filter was classifying things. Stephen - it looks like jamal is happy with this patch. (Jamal - I hope I am not putting words into your mouth.) Yes, I am ACKing that patch. Could you detail why you didn't add it to CVS, so I can attempt to fix any problems you have with it? Just to be clear, the patch below produces trace. It is an excellent way of debugging a complex u32 filter - but should kept away from production systems. BTW, another way to track if a rule is being hit or not is to add a simple action to accept using gact action. cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-03-13 at 22:10 -0500, jamal wrote: I will try to find the scripts. Just the data sets will do. I can then pump it through my python script. But before launching into this, I don't see the choice of algorithm as very important. Yes, I believe the 2.4 one was better. But not by so much that the greatly alters the effectiveness of u32. The end goal is to reduce the total lookups at the expense of more RAM. I dont see using 20M as a big deal for example. Sigh. No its not .. now. Imagine a few thousand hosts... and you want to distribute these across many hash tables (i recall the test i had was for 64K unique hosts). One of the techniques you could use is to have several layers of hash tables before eventually hitting the terminal; at each then you could spread the bits (of lets say the 3rd octet of src IP in 3 bit hash masks) in the same hash level at the diagonal level. The result is many hash tables created (a lot of memory consumed) but faster lookups. In such a case a selection on a hash table could be made with bit pattern 101 or 111 etc as the mask with a hash table with only 8 buckets. I am not sure if that made sense. If you use a tree of hash tables, where each hash table is indexed off one byte - as I think you are suggesting, then the two algorithms are very similar. Identical, in fact, unless the byte you are indexing off is split across a natural 8 bit boundary. If that is the case the two algorithms will put things in different buckets, but there still be only one filter element per bucket. So the only case work comparing is where you are hashing on more than 8 bits. If the code space is dense (by that I mean if you have N bits then the number of items (E say) you have to distinguish between 2^(N-1)), then you are better off using a tree like you said. Effectively you aren't hashing - you are indexing into an array (if that makes any sense). So it is only the non-dense cases, where E 2^N where the hashing algorithm becomes important. In my case, I was indexing off the tcp/udp port. E==400 odd, and N==16. This is a very sparse use of the code space, and so the hashing mattered. I was so concerned that I would end up with 50 or 60 items in one bucket, I actually went to the trouble of plotting it using a python program. I was amazed at how well the original 2.4 hash worked. I then stunned that the resulting u32 filter didn't work - because of the bugs in sample. (This is how I ended up here - posting patches to netdev.) When I check the results using 2.6 hashing function the results were different, and worse, but no 50 item bucket disasters. This is all navel gazing of course - based on one example. Yours may reveal a different story. The equation you provided for evaluating the different hashes does look clever but i have never seen it used anywhere. What i did was to compute the std deviation from the best value. Pulled it out of my arse :(. However, it is a standard deviation of sorts. Plot the buckets on a graph. The bucket number is the X axis. The number of items in the bucket is the Y axis. Since every bucket should have the same number of items, the resulting graph for a perfect hash should be a straight line, parallel to the X axis. Perhaps not exact, because you can't have 1/2 an item in the bucket, so you may get a 1 item difference between some buckets. The figure I calculated is the standard deviation of the actual graph from this perfect one. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch 9/9] Generic netlink interface for delay accounting
jamal wrote: On Mon, 2006-13-03 at 18:33 -0800, Matt Helsley wrote: On Mon, 2006-03-13 at 19:56 -0500, Shailabh Nagar wrote: Jamal, was your Mon, 13 Mar 2006 21:29:09 -0500 reply: Note, you are still not following the standard scheme of doing things. Example: using command = GET and the message carrying the TGID to note which TGID is of interest. Instead you have command = TGID. meant to suggest that TASKSTATS_CMD_(P|TG)ID should be renamed to TASKSTATS_CMD_GET_(P|TG)ID ? Is that sufficient? Or am I misunderstanding? I had a long description in an earlier email feedback; but the summary of it is the GET command is generic like TASKSTATS_CMD_GET; the message itself carries TLVs of what needs to be gotten which are either PID and/or TGID etc. Anyways, theres a long spill of what i am saying in that earlier email. Perhaps the current patch is a transition towards that? Yes, the comments you'd made in the previous mail have not been incorporated and this is still the older version of the patch. We'd been avoiding TLV usage so far :-) cheers, jamal - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Patch 9/9] Generic netlink interface for delay accounting
Matt Helsley wrote: On Mon, 2006-03-13 at 19:56 -0500, Shailabh Nagar wrote: snip Comments addressed (all in response to Jamal) - Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE The enums for these are still in the patch. See below. snip +/* + * Commands sent from userspace + * Not versioned. New commands should only be inserted at the enum's end + */ + +enum { + TASKSTATS_CMD_UNSPEC, /* Reserved */ + TASKSTATS_CMD_NONE, /* Not a valid cmd to send +* Marks data sent on task/tgid exit */ + TASKSTATS_CMD_LISTEN, /* Start listening */ + TASKSTATS_CMD_IGNORE, /* Stop listening */ From the description I thought you had eliminated these. Yup, typo. the commands aren't registered but the enums hang on. Will fix. --Shailabh - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[Fwd: Re: [PATCH 15/16] ipw2200: switch to the new ipw2200-fw-3.0 image format]
---BeginMessage--- I don't see how to verify the image being loaded is appropriate for the operating mode. The old fw header had a mode field (0 bss, 1 ibss, 2 monitor) but the new one does not--unless it's encoded in the version field? Sam ---End Message---
Re: [PATCH] TC: bug fixes to the sample clause
On Mon, 2006-03-13 at 22:39 -0500, jamal wrote: Trust me - I understand this stuff;- (I have described these two to many people). And i will take a look at your doc at some point. What i mean is: that you could achieve the result of stitching hashtables together to direct a packet to eventually reach a terminal node with either scheme. Sample is supposed to be more intuitive but you dont need to use it. OK. But what I can replace sample with is directly specifying the hash using the ht option. This requires me to manually calculate the hash. Since sample hasn't worked since 2.6, I guess I must be the only person on the planet who isn't using ht. I will also that this to mean that I am almost certainly the only person on the planet hashing more than one byte. If you were you hashing more than one byte, then who wants to manually calculate a hash? Not me anyway - that is what computers are for. Unless, of course, you happen to be the author of the said hashing algorithm. Then you know that manually calculating it is trivial in 2.6 - it can be done by inspection. Making the user manually calculate the hash puts the said author in a bit of a bind, of course. It now means every shell script out there effectively knows, and depends on knowing, the hash algorithm used by the kernel. Ergo, it can't be changed without causing huge amounts of pain. You might get away with it between major releases, 2.4 to 2.6 say. But now it appears the kernel has done away with major releases, so changing the hash function becomes a near impossibility. Personally, I would never of allowed the user to specify the bucket using the ht option, and reserved the right to changing the hash algorithm in the kernel and user space tools whenever I wanted to. It is still possible to move in that direction, btw. You could assume the bucket passed to the ht option is a single byte value (as calculated by sample u3 VALUE 0xff say), which is to be hashed. It would work because the current hash functions on single byte values are identities. Given that the change is controversial and that other than you or i (and of course Alexey) - I dont know of anybody else who is even aware of this feature; i consider it as an option, yes. It is controversial with you, and possibly Stephen (I haven't really understood why he rejected the patch, he hasn't replied yet to my last email), and me of course. As you point out, we seem to be the only people in the Universe who care. You didn't really answer my question below about why you dislike the patch. I can tell you why I want it. I run production boxes, some with 2.4 and some with 2.6. I am currently carrying patches for tc. I would rather not do that - I don't get security updates for example. Purely selfish reasons I know - but isn't that how these things are meant to work - if you have an itch to scratch...? I viewed these two things are part of the same package. They have to be, otherwise we don't have a tc in CVS that works with all maintained kernels. Agreed. But you didnt seem to agree with this earlier. I didn't mean to come across like that. Sorry. If you can find 2 other people who use sample in scripts in the manner in which you and i use them (ok maybe i should lower it to 1 other person), then please send me a shipping address and I will send you a gift certificate for a drink of your choice. You may of well asked me to find the holy grail, I think. :(. For the common use of 1 byte (for the cutnpasters out there) i think we agree it is a non-issue. So lets assume a worst case where there is no 2.4.x fix, it is still fine. Assuming you are not swayed by the arguments above, I think I have lost this one. I presume you have no objections to changing tc to use the 2.6 hash and leave it at that. Yes? Before I can submit a patch to do that, you have to decide whether reverting to the 2.4 hash function is a good idea. If so it is the kernel that has to change, not tc. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [NET] : get rid of 'struct sec_path *' in skbuff if ! CONFIG_XFRM
Herbert Xu a écrit : Eric Dumazet [EMAIL PROTECTED] wrote: - C(sp); -#ifdef CONFIG_INET - secpath_get(skb-sp); +#ifdef CONFIG_XFRM + n-sp = secpath_get(skb-sp); #endif Please move the ifdef into xfrm.h while you're at it. That is, make secpath_get a no-op if CONFIG_XFRM is not defined. Thanks, Hum, but then we need a new macro or prototype, because n-sp is not valid n-sp = secpath_get(skb-sp); would still miscompile, even if secpath_get() is a no-op Maybe a new secpath_copy() declared in include/net/xfrm.h ? static inline void secpath_copy(struct sk_buff *dst, struct_sk_buff *src) { #ifdef CONFIG_XFRM dst-sp = secpath_get(src); #endif } Eric - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html