Re: [RFC PATCH]: Dynamically sized routing cache hash table.
David Miller wrote:
> From: Eric Dumazet <[EMAIL PROTECTED]>
> Date: Tue, 06 Mar 2007 08:14:46 +0100
>
> > I wonder... are you sure this has no relation with the size of
> > rt_hash_locks / RT_HASH_LOCK_SZ ?
> > One entry must have the same lock in the two tables when resizing
> > is in flight.
> > #define MIN_RTHASH_SHIFT LOG2(RT_HASH_LOCK_SZ)
>
> Good point.
>
> > > +static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
> > > +{
> > > +	struct rt_hash_bucket *n;
> > > +
> > > +	if (sz <= PAGE_SIZE)
> > > +		n = kmalloc(sz, GFP_KERNEL);
> > > +	else if (hashdist)
> > > +		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> > > +	else
> > > +		n = (struct rt_hash_bucket *)
> > > +			__get_free_pages(GFP_KERNEL, get_order(sz));
> >
> > I don't feel well with this.
> > Maybe we could try a __get_free_pages(), and in case of failure,
> > fallback to vmalloc(). Then keep a flag to be able to free memory
> > correctly. Anyway, if (get_order(sz)>=MAX_ORDER) we know
> > __get_free_pages() will fail.
>
> We have to use vmalloc() for the hashdist case so that the pages
> are spread out properly on NUMA systems. That's exactly what the
> large system hash allocator is going to do on bootup anyways.

Yes, but on bootup you have an appropriate NUMA active policy. (Well... we
hope so, but it broke several times in the past.)
I am not sure what kind of mm policy is active for scheduled works.

Anyway I have some XX GB machines, non NUMA, and I would love to be able to
have a 2^20 slots hash table, without having to increase MAX_ORDER.

> Look, either both are right or both are wrong. I'm just following
> protocol above and you'll note the PRECISE same logic exists in other
> dynamically growing hash table implementations such as
> net/xfrm/xfrm_hash.c

Yes, they are both wrong/dumb :)

Can we be smarter, or do we have to stay dumb ? :)

	struct rt_hash_bucket *n = NULL;

	if (sz <= PAGE_SIZE) {
		n = kmalloc(sz, GFP_KERNEL);
		*kind = allocated_by_kmalloc;
	}
	else if (!hashdist) {
		n = (struct rt_hash_bucket *)
			__get_free_pages(GFP_KERNEL, get_order(sz));
		*kind = allocated_by_get_free_pages;
	}
	if (!n) {
		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
		*kind = allocated_by_vmalloc;
	}

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
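The fallback proposal above only shows the allocation half; the point of recording `kind` is that the free path can then pick the matching release call. A minimal user-space sketch of that pairing follows. Everything in it is an assumption made for illustration: the names `table_alloc`, `table_free`, `SMALL_LIMIT` and `CONTIG_LIMIT` are hypothetical, and `calloc()`/`free()` stand in for the kernel's `kmalloc()`, `__get_free_pages()` and `__vmalloc()` and their release counterparts. It is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tags mirroring the "kind" flag from the proposal. */
enum alloc_kind { ALLOC_SMALL, ALLOC_CONTIG, ALLOC_FALLBACK };

#define SMALL_LIMIT  4096u           /* assumed stand-in for PAGE_SIZE */
#define CONTIG_LIMIT (4096u << 10)   /* assumed stand-in for the MAX_ORDER limit */

/*
 * Try the cheapest allocator that fits, fall back to the "virtual"
 * allocator otherwise, and record which path produced the memory.
 * All three paths use calloc() here; only the bookkeeping is real.
 */
void *table_alloc(size_t sz, int spread, enum alloc_kind *kind)
{
	void *p = NULL;

	if (sz <= SMALL_LIMIT) {
		p = calloc(1, sz);          /* kernel: kmalloc() */
		*kind = ALLOC_SMALL;
	} else if (!spread && sz <= CONTIG_LIMIT) {
		p = calloc(1, sz);          /* kernel: __get_free_pages() */
		*kind = ALLOC_CONTIG;
	}
	if (!p) {
		p = calloc(1, sz);          /* kernel: __vmalloc() */
		*kind = ALLOC_FALLBACK;
	}
	return p;
}

/* Release memory using the path recorded at allocation time. */
void table_free(void *p, enum alloc_kind kind)
{
	switch (kind) {
	case ALLOC_SMALL:    free(p); break;   /* kernel: kfree() */
	case ALLOC_CONTIG:   free(p); break;   /* kernel: free_pages() */
	case ALLOC_FALLBACK: free(p); break;   /* kernel: vfree() */
	}
}
```

Storing the tag next to the table pointer means the free path no longer has to re-derive the allocator from `sz` and `hashdist`, which is what makes a runtime fallback safe.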
Re: [RFC PATCH]: Dynamically sized routing cache hash table.
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Tue, 06 Mar 2007 08:14:46 +0100

> I wonder... are you sure this has no relation with the size of
> rt_hash_locks / RT_HASH_LOCK_SZ ?
> One entry must have the same lock in the two tables when resizing is in
> flight.
> #define MIN_RTHASH_SHIFT LOG2(RT_HASH_LOCK_SZ)

Good point.

> > +static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
> > +{
> > +	struct rt_hash_bucket *n;
> > +
> > +	if (sz <= PAGE_SIZE)
> > +		n = kmalloc(sz, GFP_KERNEL);
> > +	else if (hashdist)
> > +		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> > +	else
> > +		n = (struct rt_hash_bucket *)
> > +			__get_free_pages(GFP_KERNEL, get_order(sz));
>
> I don't feel well with this.
> Maybe we could try a __get_free_pages(), and in case of failure, fallback to
> vmalloc(). Then keep a flag to be able to free memory correctly. Anyway, if
> (get_order(sz)>=MAX_ORDER) we know __get_free_pages() will fail.

We have to use vmalloc() for the hashdist case so that the pages
are spread out properly on NUMA systems. That's exactly what the
large system hash allocator is going to do on bootup anyways.

Look, either both are right or both are wrong. I'm just following
protocol above and you'll note the PRECISE same logic exists in other
dynamically growing hash table implementations such as
net/xfrm/xfrm_hash.c

> Could you add const qualifiers to 'struct rt_hash *' in prototypes where
> appropriate ?

Sure, no problem.

> Maybe that for small tables (less than PAGE_SIZE/2), we could embed them in
> 'struct rt_hash'

Not worth the pain nor the in-kernel-image-space it would chew up,
in my opinion. After you visit a handful of web sites you'll get
beyond that threshold.

> Could we group all static vars at the beginning of this file, so that we
> clearly see where we should place them, to avoid false sharing.

Sure.

> > +
> > +static void rt_hash_resize(unsigned int new_shift)

Damn, please don't quote such huge portions of a patch without any
comments, this has to go out to several thousand recipients you know
:-/
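The "same lock in the two tables" point can be checked with a small stand-alone sketch. It assumes, as an illustration only, that the lock for a bucket is chosen as `bucket & (LOCK_SZ - 1)` with `LOCK_SZ` a power of two; the names `LOCK_SZ`, `bucket_index`, `lock_index` and `same_lock_for_all` are hypothetical and the value 256 is an arbitrary stand-in for `RT_HASH_LOCK_SZ`. Under that assumption, an entry keeps its lock across a resize only when both table sizes are at least `LOCK_SZ`, which is why the minimum shift would be tied to the lock array size.

```c
#include <assert.h>

#define LOCK_SZ 256u   /* assumed power of two, stands in for RT_HASH_LOCK_SZ */

/* Bucket chosen for a hash value in a table of 2^shift slots. */
unsigned int bucket_index(unsigned int hash, unsigned int shift)
{
	return hash & ((1u << shift) - 1u);
}

/* Lock guarding a bucket: low bits of the bucket number (assumption). */
unsigned int lock_index(unsigned int bucket)
{
	return bucket & (LOCK_SZ - 1u);
}

/*
 * Returns 1 if every hash value in [0, limit) maps to the same lock in
 * the old and the new table, 0 as soon as one value disagrees.
 */
int same_lock_for_all(unsigned int old_shift, unsigned int new_shift,
		      unsigned int limit)
{
	unsigned int h;

	for (h = 0; h < limit; h++) {
		unsigned int old_lock = lock_index(bucket_index(h, old_shift));
		unsigned int new_lock = lock_index(bucket_index(h, new_shift));

		if (old_lock != new_lock)
			return 0;
	}
	return 1;
}
```

With `LOCK_SZ` at 256, shifts 8 and 9 agree for every hash, while shifts 4 and 5 do not: a table of 16 slots cannot supply the low 8 bits that select the lock.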
Re: [RFC PATCH]: Dynamically sized routing cache hash table.
David Miller wrote:
> This is essentially a "port" of Nick Piggin's dcache hash table
> patches to the routing cache. It solves the locking issues
> during table grow/shrink that I couldn't handle properly last
> time I tried to code up a patch like this.
>
> But one of the core issues of this kind of change still remains.
> There is a conflict between the desire of routing cache garbage
> collection to reach a state of equilibrium and the hash table
> grow code's desire to match the table size to the current state
> of affairs.
>
> Actually, more accurately, the conflict exists in how this GC
> logic is implemented. The core issue is that hash table size
> guides the GC processing, and hash table growth therefore
> modifies those GC goals. So with the patch below we'll just
> keep growing the hash table instead of giving GC some time to
> try to keep the working set in equilibrium before doing the
> hash grow.
>
> One idea is to put the hash grow check in the garbage collector,
> and put the hash shrink check in rt_del().
>
> In fact, it would be a good time to perhaps hack up some entirely
> new passive GC logic for the routing cache.
>
> BTW, another thing that plays into this is that Robert's TRASH work
> could make this patch not necessary :-)

Well, maybe... but after looking at Robert's trash, I discovered its model is
essentially a big (2^18 slots) root node (our hash table), and very few
order:1,2,3 nodes.

Almost all leaves... work in progress anyway.

Please find my comments in your patch

> Finally, I know that (due to some of Nick's helpful comments the
> other day) that I'm missing some rcu_assign_pointer()'s in here.
> Fixes in this area are most welcome.
>
> This patch passes basic testing on UP sparc64, but please handle
> with care :)
>
> Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 0b3d7bf..57e004a 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -92,6 +92,9 @@
>  #include
>  #include
>  #include
> +#include
> +#include
> +#include
>  #include
>  #include
>  #include
> @@ -242,28 +245,195 @@ static spinlock_t	*rt_hash_locks;
>  # define rt_hash_lock_init()
>  #endif
>
> -static struct rt_hash_bucket	*rt_hash_table;
> -static unsigned			rt_hash_mask;
> -static int			rt_hash_log;
> -static unsigned int		rt_hash_rnd;
> +#define MIN_RTHASH_SHIFT 4

I wonder... are you sure this has no relation with the size of rt_hash_locks /
RT_HASH_LOCK_SZ ?
One entry must have the same lock in the two tables when resizing is in flight.
#define MIN_RTHASH_SHIFT LOG2(RT_HASH_LOCK_SZ)

> +#if BITS_PER_LONG == 32
> +#define MAX_RTHASH_SHIFT 24
> +#else
> +#define MAX_RTHASH_SHIFT 30
> +#endif
> +
> +struct rt_hash {
> +	struct rt_hash_bucket	*table;
> +	unsigned int		mask;
> +	unsigned int		log;
> +};
> +
> +struct rt_hash *rt_hash __read_mostly;
> +struct rt_hash *old_rt_hash __read_mostly;
> +static unsigned int rt_hash_rnd __read_mostly;
> +static DEFINE_SEQLOCK(resize_transfer_lock);
> +static DEFINE_MUTEX(resize_mutex);

I think a better model would be a structure, with a part containing 'read
mostly' data, and part of 'highly modified' data with appropriate
align_to_cache

For example, resize_transfer_lock should be in the first part, like rt_hash and
old_rt_hash, don't you think ?

All static data of this file should be placed on this single structure so that
we can easily avoid false sharing and have optimal placement.

>  static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
>  #define RT_CACHE_STAT_INC(field) \
>  	(__raw_get_cpu_var(rt_cache_stat).field++)
>
> -static int rt_intern_hash(unsigned hash, struct rtable *rth,
> -			  struct rtable **res);
> +static void rt_hash_resize(unsigned int new_shift);
> +static void check_nr_rthash(void)
> +{
> +	unsigned int sz = rt_hash->mask + 1;
> +	unsigned int nr = atomic_read(&ipv4_dst_ops.entries);
> +
> +	if (unlikely(nr > (sz + (sz >> 1))))
> +		rt_hash_resize(rt_hash->log + 1);
> +	else if (unlikely(nr < (sz >> 1)))
> +		rt_hash_resize(rt_hash->log - 1);
> +}
>
> -static unsigned int rt_hash_code(u32 daddr, u32 saddr)
> +static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
> +{
> +	struct rt_hash_bucket *n;
> +
> +	if (sz <= PAGE_SIZE)
> +		n = kmalloc(sz, GFP_KERNEL);
> +	else if (hashdist)
> +		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> +	else
> +		n = (struct rt_hash_bucket *)
> +			__get_free_pages(GFP_KERNEL, get_order(sz));

I don't feel well with this.

Maybe we could try a __get_free_pages(), and in case of failure, fallback to
vmalloc(). Then keep a flag to be able to free memory correctly. Anyway, if
(get_order(sz)>=MAX_ORDER) we know __get_free_pages() will fail.

> +
> +	if (n)
> +		memset(n, 0, sz);
> +
> +	return n;
> +}
> +
> +static void rthash_free(struct rt_hash_bucket *r, unsigned int sz)
> +{
> +	if (sz <= PAGE_SIZE)
> +
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 08:03:50PM -0800, Greg KH wrote:
> On Mon, Mar 05, 2007 at 09:39:47PM -0600, Matt Mackall wrote:
> > On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> > > If so, can you disable the option and strace it to see what program is
> > > trying to access what? That will put the
> > > HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> > > quickly :)
> >
> > Ok, I've got straces of both good and bad (>5M each). Filtered out
> > random pointer values and the like, diffed, and filtered for /sys/,
> > and the result's still 1.5M. What should I be looking for?
>
> Failures when trying to read from /sys/class/net/
>
> Or opening the directory and iterating over the subdirs in there. Or
> something like that.
>
> But the /sys/class/net/ stuff should hopefully help narrow it down.

Works:

6857 open("/sys/class/net", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 13
6857 fstat64(13, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
6857 fcntl64(13, F_SETFD, FD_CLOEXEC) = 0
6857 getdents64(13, /* 5 entries */, 4096) = 120
6857 readlink("/sys/class/net/eth1", 0x80a2450, 256) = -1 EINVAL (Invalid argument)
6857 readlink("/sys/class/net/eth1/device", "../../../devices/pci:00/:00:1e.0/:02:02.0", 256) = 53
6857 readlink("/sys/class/net/lo", 0x80a2450, 256) = -1 EINVAL (Invalid argument)
6857 readlink("/sys/class/net/lo/device", 0x80a2450, 256) = -1 ENOENT (No such file or directory)
6857 readlink("/sys/class/net/eth0", 0x80a2450, 256) = -1 EINVAL (Invalid argument)
6857 readlink("/sys/class/net/eth0/device", "../../../devices/pci:00/:00:1e.0/:02:01.0", 256) = 53
6857 getdents64(13, /* 0 entries */, 4096) = 0
6857 close(13) = 0

Breaks:

3620 open("/sys/class/net", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 13
3620 fstat64(13, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
3620 fcntl64(13, F_SETFD, FD_CLOEXEC) = 0
3620 getdents64(13, /* 5 entries */, 4096) = 120
3620 readlink("/sys/class/net/eth1", "../../devices/pci:00/:00:1e.0/0000:02:02.0/eth1", 256) = 55
3620 readlink("/sys/devices/pci:00/:00:1e.0/:02:02.0/eth1/device", 0x809e910, 256) = -1 ENOENT (No such file or directory)
3620 readlink("/sys/class/net/lo", "../../devices/virtual/net/lo", 256) = 28
3620 readlink("/sys/devices/virtual/net/lo/device", 0x809e960, 256) = -1 ENOENT (No such file or directory)
3620 readlink("/sys/class/net/eth0", "../../devices/pci:00/:00:1e.0/0000:02:01.0/eth0", 256) = 55
3620 readlink("/sys/devices/pci:00/:00:1e.0/:02:01.0/eth0/device", 0x809e960, 256) = -1 ENOENT (No such file or directory)
3620 getdents64(13, /* 0 entries */, 4096) = 0
3620 close(13) = 0

--
Mathematics is the supreme nostalgia of our time.
Re: [PATCH] LVS: Send ICMP unreachable responses to end-users when real-servers are removed
From: Horms <[EMAIL PROTECTED]>
Date: Sun, 11 Feb 2007 12:04:43 +0900

> this is a small patch by Janusz Krzysztofik to ip_route_output_slow()
> that allows VIP-less LVS linux director to generate packets originating
> From VIP if sysctl_ip_nonlocal_bind is set.
>
> In a nutshell, the intention is for an LVS linux director to be able
> to send ICMP unreachable responses to end-users when real-servers are
> removed.
>
> http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00106.html
>
> I'm not really sure about the correctness of this approach,
> so I am sending it here to netdev for review
>
> Cc: Janusz Krzysztofik <[EMAIL PROTECTED]>
> Signed-off-by: Simon Horman <[EMAIL PROTECTED]>

I'm not against this patch or the idea, I just want to think
about it some more to make sure there are not bad unintended
side effects to allowing this.

If someone else could provide some feedback or comments, I'd
very much appreciate that as well.
Re: [RFC] Arp announce (for Xen)
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Thu, 1 Mar 2007 17:30:30 -0800

> What about implementing the unused arp_announce flag on the inetdevice?
> Something like the following. Totally untested...
>
> Looks like it either was there (and got removed) or was planned but
> never implemented.

This idea is fine. But:

> +	case NETDEV_CHANGEADDR:
> +		/* Send gratuitous ARP in case of address change or new device */
> +		if (IN_DEV_ARP_ANNOUNCE(in_dev))
> +			arp_send(ARPOP_REQUEST, ETH_P_ARP,
> +				 in_dev->ifa_list->ifa_address, dev,
> +				 in_dev->ifa_list->ifa_address, NULL,
> +				 dev->dev_addr, NULL);

We'll need to make sure the appropriate 'arp_announce' address
selection is employed here. One idea is to change arp_solicit()
such that it can be invoked in this context, or provide a new
helper function which will do the source address selection
rules of 'arp_announce' and then invoke arp_send() as appropriate
for us.
[RFC PATCH]: Dynamically sized routing cache hash table.
This is essentially a "port" of Nick Piggin's dcache hash table
patches to the routing cache. It solves the locking issues
during table grow/shrink that I couldn't handle properly last
time I tried to code up a patch like this.

But one of the core issues of this kind of change still remains.
There is a conflict between the desire of routing cache garbage
collection to reach a state of equilibrium and the hash table
grow code's desire to match the table size to the current state
of affairs.

Actually, more accurately, the conflict exists in how this GC
logic is implemented. The core issue is that hash table size
guides the GC processing, and hash table growth therefore
modifies those GC goals. So with the patch below we'll just
keep growing the hash table instead of giving GC some time to
try to keep the working set in equilibrium before doing the
hash grow.

One idea is to put the hash grow check in the garbage collector,
and put the hash shrink check in rt_del().

In fact, it would be a good time to perhaps hack up some entirely
new passive GC logic for the routing cache.

BTW, another thing that plays into this is that Robert's TRASH work
could make this patch not necessary :-)

Finally, I know that (due to some of Nick's helpful comments the
other day) that I'm missing some rcu_assign_pointer()'s in here.
Fixes in this area are most welcome.

This patch passes basic testing on UP sparc64, but please handle
with care :)

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0b3d7bf..57e004a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -92,6 +92,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include
 #include
 #include
@@ -242,28 +245,195 @@ static spinlock_t	*rt_hash_locks;
 # define rt_hash_lock_init()
 #endif

-static struct rt_hash_bucket	*rt_hash_table;
-static unsigned			rt_hash_mask;
-static int			rt_hash_log;
-static unsigned int		rt_hash_rnd;
+#define MIN_RTHASH_SHIFT 4
+#if BITS_PER_LONG == 32
+#define MAX_RTHASH_SHIFT 24
+#else
+#define MAX_RTHASH_SHIFT 30
+#endif
+
+struct rt_hash {
+	struct rt_hash_bucket	*table;
+	unsigned int		mask;
+	unsigned int		log;
+};
+
+struct rt_hash *rt_hash __read_mostly;
+struct rt_hash *old_rt_hash __read_mostly;
+static unsigned int rt_hash_rnd __read_mostly;
+static DEFINE_SEQLOCK(resize_transfer_lock);
+static DEFINE_MUTEX(resize_mutex);

 static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
 #define RT_CACHE_STAT_INC(field) \
 	(__raw_get_cpu_var(rt_cache_stat).field++)

-static int rt_intern_hash(unsigned hash, struct rtable *rth,
-			  struct rtable **res);
+static void rt_hash_resize(unsigned int new_shift);
+static void check_nr_rthash(void)
+{
+	unsigned int sz = rt_hash->mask + 1;
+	unsigned int nr = atomic_read(&ipv4_dst_ops.entries);
+
+	if (unlikely(nr > (sz + (sz >> 1))))
+		rt_hash_resize(rt_hash->log + 1);
+	else if (unlikely(nr < (sz >> 1)))
+		rt_hash_resize(rt_hash->log - 1);
+}

-static unsigned int rt_hash_code(u32 daddr, u32 saddr)
+static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
+{
+	struct rt_hash_bucket *n;
+
+	if (sz <= PAGE_SIZE)
+		n = kmalloc(sz, GFP_KERNEL);
+	else if (hashdist)
+		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
+	else
+		n = (struct rt_hash_bucket *)
+			__get_free_pages(GFP_KERNEL, get_order(sz));
+
+	if (n)
+		memset(n, 0, sz);
+
+	return n;
+}
+
+static void rthash_free(struct rt_hash_bucket *r, unsigned int sz)
+{
+	if (sz <= PAGE_SIZE)
+		kfree(r);
+	else if (hashdist)
+		vfree(r);
+	else
+		free_pages((unsigned long)r, get_order(sz));
+}
+
+static unsigned int rt_hash_code(struct rt_hash *hashtable,
+				 u32 daddr, u32 saddr)
 {
 	return (jhash_2words(daddr, saddr, rt_hash_rnd)
-		& rt_hash_mask);
+		& hashtable->mask);
 }

-#define rt_hash(daddr, saddr, idx) \
-	rt_hash_code((__force u32)(__be32)(daddr),\
+#define rt_hashfn(htab, daddr, saddr, idx) \
+	rt_hash_code(htab, (__force u32)(__be32)(daddr),\
 		     (__force u32)(__be32)(saddr) ^ ((idx) << 5))

+static unsigned int resize_new_shift;
+
+static void rt_hash_resize_work(struct work_struct *work)
+{
+	struct rt_hash *new_hash, *old_hash;
+	unsigned int new_size, old_size, transferred;
+	int i;
+
+	if (!mutex_trylock(&resize_mutex))
+		goto out;
+
+	new_hash = kmalloc(sizeof(struct rt_hash), GFP_KERNEL);
+	if (!new_hash)
+		goto out_unlock;
+
+	new_hash->log = resize_new_shift;
+	new_size = 1 << new_hash->log;
+	new_hash->mask = new_siz
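The grow and shrink thresholds used by check_nr_rthash() in the patch above (grow when the entry count exceeds 1.5 times the table size, shrink when it drops below half) can be exercised in isolation. The sketch below is a user-space illustration, not the kernel code: the function name `next_shift` is hypothetical, and clamping to the minimum and maximum shift inside this function is an assumption, since the part of the patch that would show where the clamp happens is cut off.

```c
#include <assert.h>

#define MIN_SHIFT 4u    /* mirrors MIN_RTHASH_SHIFT in the patch */
#define MAX_SHIFT 24u   /* mirrors the 32-bit MAX_RTHASH_SHIFT */

/*
 * Given the current table shift and the number of cached entries,
 * return the shift the table should move to:
 *   grow   when nr > sz + sz/2
 *   shrink when nr < sz/2
 * Clamping to [MIN_SHIFT, MAX_SHIFT] here is an assumption.
 */
unsigned int next_shift(unsigned int shift, unsigned int nr)
{
	unsigned int sz = 1u << shift;

	if (nr > sz + (sz >> 1) && shift < MAX_SHIFT)
		return shift + 1;
	if (nr < (sz >> 1) && shift > MIN_SHIFT)
		return shift - 1;
	return shift;
}
```

The gap between the two thresholds gives some hysteresis: a table that has just grown from 1024 to 2048 slots at 1537 entries sits above the new shrink threshold of 1024, so it does not immediately shrink again. That hysteresis does not address the concern raised in the message, which is that garbage collection targets derived from the table size move every time the table grows.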
Re: [PATCH] natsemi: netpoll fixes
Mark Brown wrote:
> [Once more with CCs]
>
> On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:
>
> >  #ifdef CONFIG_NET_POLL_CONTROLLER
> >  static void natsemi_poll_controller(struct net_device *dev)
> >  {
> > +	struct netdev_private *np = netdev_priv(dev);
> > +
> >  	disable_irq(dev->irq);
> > -	intr_handler(dev->irq, dev);
> > +
> > +	/*
> > +	 * A real interrupt might have already reached us at this point
> > +	 * but NAPI might still haven't called us back. As the interrupt
> > +	 * status register is cleared by reading, we should prevent an
> > +	 * interrupt loss in this case...
> > +	 */
> > +	if (!np->intr_status)
> > +		intr_handler(dev->irq, dev);
> > +
> >  	enable_irq(dev->irq);
>
> Is it possible for this to run at the same time as the NAPI poll? If so
> then it is possible for the netpoll poll to run between np->intr_status
> being cleared and netif_rx_complete() being called. If the hardware
> asserts an interrupt at the wrong moment then this could cause the

Well, there is a whole task of analyzing the netpoll conditions under
smp. There appears to me to be a race with netpoll and NAPI on another
processor, given that netpoll can be called with virtually any system
condition on a debug breakpoint or crash dump initiation. I'm spending
some time looking into it, but don't have a smoking gun immediately.
Regardless, if such a condition does exist, it is shared across many or
all of the potential netpolled devices. Since that is exactly the
condition the suggested patch purports to solve, it is pointless if the
whole NAPI/netpoll race exists. Such a race would lead to various and
imaginative failures in the system. So don't fix that problem in a
particular driver. If it exists, fix it generally in the netpoll/NAPI
infrastructure.

> In any case, this is a problem independently of netpoll if the chip
> shares an interrupt with anything so the interrupt handler should be
> fixed to cope with this situation instead.

Yes, that would appear so. If an interrupt line is shared with this
device, then the interrupt handler can be called again, even though the
device's interrupts are disabled on the interface. So, in the actual
interrupt handler, check the dev->state __LINK_STATE_SCHED flag - if
it's set, leave immediately, it can't be our interrupt. If it's clear,
read the irq enable hardware register. If enabled, do the rest of the
interrupt handler. Since the isr is disabled only by the interrupt
handler, and enabled only by the poll routine, the race on the interrupt
cause register is prevented. And, as a byproduct, the netpoll race is
also prevented. You could just always read the isr enable hardware
register, but that means you always do an operation to the chip, which
can be painfully slow. I guess the tradeoff depends on the probability
of getting the isr called when NAPI is active for the device.

If this results in netpoll not getting a packet right away, that's okay,
since the netpoll users should try again.

Mark Huth
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 09:39:47PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> > If so, can you disable the option and strace it to see what program is
> > trying to access what? That will put the
> > HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> > quickly :)
>
> Ok, I've got straces of both good and bad (>5M each). Filtered out
> random pointer values and the like, diffed, and filtered for /sys/,
> and the result's still 1.5M. What should I be looking for?

Failures when trying to read from /sys/class/net/

Or opening the directory and iterating over the subdirs in there. Or
something like that.

But the /sys/class/net/ stuff should hopefully help narrow it down.

thanks,

greg k-h
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> If so, can you disable the option and strace it to see what program is
> trying to access what? That will put the
> HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> quickly :)

Ok, I've got straces of both good and bad (>5M each). Filtered out
random pointer values and the like, diffed, and filtered for /sys/,
and the result's still 1.5M. What should I be looking for?

--
Mathematics is the supreme nostalgia of our time.
Re: when having to acquire an SA, ipsec drops the packet
On Mon, 5 Mar 2007, Joy Latten wrote:

> 5. Around the time the set of SAs for OUT direction are to be
>    inserted into SAD, I see another ACQUIRE happening.
>
>    I have not yet figured out where this second ACQUIRE comes from
>    and why it happens. As long as the minimal SA or set of valid outgoing
>    SAs exist in SAD, an ACQUIRE should not happen.

I saw something similar to this some time ago when testing various failure
modes, and discussed it with Herbert.

IIRC, there's a larval SA which is not torn down properly by Racoon once
the full SA is established, and the larval SA keeps resending until it
times out.

- James
--
James Morris <[EMAIL PROTECTED]>
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> Wait, have you confirmed that if you enable this config option,
> NetworkManager starts back up again and works properly?

Yep, probably should have mentioned that.

> If so, can you disable the option and strace it to see what program is
> trying to access what? That will put the
> HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> quickly :)

Did that a few hours ago, got a very large dump from both programs. No
smoking guns to my eye, but I'll send you the logs later.

--
Mathematics is the supreme nostalgia of our time.
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 02:39:00PM -0800, Greg KH wrote:

> Ok, I only named HAL as that is what people have told me the problem is.
> I have been running this change on my boxes, without
> CONFIG_SYSFS_DEPRECATED since last July or so.
>
> But I don't use NetworkManager here for the most part, but I have tried
> this in the OpenSuse10.3 alpha releases and it seems to work just fine
> with whatever version of NetworkManager it uses.

At a guess, you're carrying either a git snapshot or have backports from
git. Several distributions do this, but until there's actually been a
released version that works, it's a bit early to set a timescale.

--
Matthew Garrett | [EMAIL PROTECTED]
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 07:30:21PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
> > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > > >
> > > > Ok, how about the following patch. Is it acceptable to everyone?
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > > >
> > > > ---
> > > >  init/Kconfig | 13 +++--
> > > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > >
> > > > --- gregkh-2.6.orig/init/Kconfig
> > > > +++ gregkh-2.6/init/Kconfig
> > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> > > >  	  that belong to a class, back into the /sys/class heirachy, in
> > > >  	  order to support older versions of udev.
> > > >
> > > > -	  If you are using a distro that was released in 2006 or later,
> > > > -	  it should be safe to say N here.
> > > > +	  If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > > > +	  release from 2007 or later, it should be safe to say N here.
> > > > +
> > > > +	  If you are using Debian or other distros that are slow to
> > > > +	  update HAL, please say Y here.
> > > >...
> > >
> > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally
> > > for all users, and schedule it's removal for mid-2008 (or later).
> > >
> > > 12 months after the first _release_ of a HAL that can live without seems
> > > to be the first time when we can consider getting rid of it, since all
> > > distributions with at least one release a year should ship it by then.
> > >
> > > Currently, SYSFS_DEPRECATED is only a trap for users.
> >
> > Huh?
> >
> > No, again, I've been using this just fine for about 6 months now.
> >
> > And what about all of the servers not using HAL/NetworkManager?
> > And what about all of the embedded systems not using either?
> >
> > So to not allow this to be turned off by people who might want to (we
> > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
> > other distros released this year), is pretty heavy-handed.
> >
> > It also will work in OpenSuSE 10.2 which is already released, and I
> > think Fedora 6, but I've only limited experience with these.
> >
> > Oh, and Gentoo works just fine, and has been for the past 6 months.
> >
> > I would just prefer to come up with an acceptable set of wording that
> > will work to properly warn people.
> >
> > I proposed one such wording which some people took as a slam against
> > Debian, which it really was not at all.
> >
> > Does someone else want to propose some other wording instead?
>
> Back up a bit. Let's review:
>
> Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable

Wait, have you confirmed that if you enable this config option,
NetworkManager starts back up again and works properly?

If so, can you disable the option and strace it to see what program is
trying to access what? That will put the
HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
quickly :)

thanks,

greg k-h
[PATCH] pcnet32: only allocate init_block dma consistent
The patch below moves the init_block out of the private struct and
only allocates the init block with pci_alloc_consistent.

This has two effects:

1. Performance increase for non cache coherent machines, because the
   CPU-only data in the private struct is now cached

2. Locks now work on platforms that need to have locks in
   cached memory

Also use netdev_priv() instead of dev->priv

Signed-off-by: Thomas Bogendoerfer <[EMAIL PROTECTED]>
Acked-by: Don Fry <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
index 36f9d98..8498c3b 100644
--- a/drivers/net/pcnet32.c
+++ b/drivers/net/pcnet32.c
@@ -253,12 +253,12 @@ struct pcnet32_access {
  * so the structure should be allocated using pci_alloc_consistent().
  */
 struct pcnet32_private {
-	struct pcnet32_init_block init_block;
+	struct pcnet32_init_block *init_block;
 	/* The Tx and Rx ring entries must be aligned on 16-byte boundaries in 32bit mode. */
 	struct pcnet32_rx_head *rx_ring;
 	struct pcnet32_tx_head *tx_ring;
-	dma_addr_t dma_addr;/* DMA address of beginning of this
-			       object, returned by pci_alloc_consistent */
+	dma_addr_t init_dma_addr;/* DMA address of beginning of the init block,
+				    returned by pci_alloc_consistent */
 	struct pci_dev *pci_dev;
 	const char *name;
 	/* The saved address of a sent-in-place packet/buffer, for skfree(). */
@@ -653,7 +653,7 @@ static void pcnet32_realloc_rx_ring(struct net_device *dev,

 static void pcnet32_purge_rx_ring(struct net_device *dev)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	int i;

 	/* free all allocated skbuffs */
@@ -681,7 +681,7 @@ static void pcnet32_poll_controller(struct net_device *dev)

 static int pcnet32_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	unsigned long flags;
 	int r = -EOPNOTSUPP;

@@ -696,7 +696,7 @@ static int pcnet32_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)

 static int pcnet32_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	unsigned long flags;
 	int r = -EOPNOTSUPP;

@@ -711,7 +711,7 @@ static int pcnet32_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 static void pcnet32_get_drvinfo(struct net_device *dev,
 				struct ethtool_drvinfo *info)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);

 	strcpy(info->driver, DRV_NAME);
 	strcpy(info->version, DRV_VERSION);
@@ -723,7 +723,7 @@ static void pcnet32_get_drvinfo(struct net_device *dev,

 static u32 pcnet32_get_link(struct net_device *dev)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	unsigned long flags;
 	int r;

@@ -743,19 +743,19 @@ static u32 pcnet32_get_link(struct net_device *dev)

 static u32 pcnet32_get_msglevel(struct net_device *dev)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	return lp->msg_enable;
 }

 static void pcnet32_set_msglevel(struct net_device *dev, u32 value)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	lp->msg_enable = value;
 }

 static int pcnet32_nway_reset(struct net_device *dev)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	unsigned long flags;
 	int r = -EOPNOTSUPP;

@@ -770,7 +770,7 @@ static int pcnet32_nway_reset(struct net_device *dev)
 static void pcnet32_get_ringparam(struct net_device *dev,
 				  struct ethtool_ringparam *ering)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);

 	ering->tx_max_pending = TX_MAX_RING_SIZE;
 	ering->tx_pending = lp->tx_ring_size;
@@ -781,7 +781,7 @@ static void pcnet32_get_ringparam(struct net_device *dev,
 static int pcnet32_set_ringparam(struct net_device *dev,
 				 struct ethtool_ringparam *ering)
 {
-	struct pcnet32_private *lp = dev->priv;
+	struct pcnet32_private *lp = netdev_priv(dev);
 	unsigned long flags;
 	unsigned int size;
 	ulong ioaddr = dev->base_addr;
@@ -847,7 +847,7 @@ static int pcnet32_self_test_count(struct net_device *dev)
 static void pcnet32_ethtool_test(struct net_device *dev,
 				 struct ethtool_test *test, u64 * data)
 {
-	struct pcnet32_private *lp = dev->p
[PATCH ] pcnet32: Fix PCnet32 performance bug on non-coherent architectures
The PCnet32 driver always passed the size of the largest possible packet to pci_dma_sync_single_for_cpu and pci_dma_sync_single_for_device. This results in fairly large "collateral damage" in the caches and makes the flush operation itself much slower. On a system with a 40MHz CPU this patch increases network bandwidth by about 12%.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>
Acked-by: Don Fry <[EMAIL PROTECTED]>

diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
index 36f9d98..4d94ba7 100644
--- a/drivers/net/pcnet32.c
+++ b/drivers/net/pcnet32.c
@@ -1234,14 +1234,14 @@ static void pcnet32_rx_entry(struct net_device *dev,
 			skb_put(skb, pkt_len);	/* Make room */
 			pci_dma_sync_single_for_cpu(lp->pci_dev,
 						    lp->rx_dma_addr[entry],
-						    PKT_BUF_SZ - 2,
+						    pkt_len,
 						    PCI_DMA_FROMDEVICE);
 			eth_copy_and_sum(skb,
 					 (unsigned char *)(lp->rx_skbuff[entry]->data),
 					 pkt_len, 0);
 			pci_dma_sync_single_for_device(lp->pci_dev,
 						       lp->rx_dma_addr[entry],
-						       PKT_BUF_SZ - 2,
+						       pkt_len,
 						       PCI_DMA_FROMDEVICE);
 		}
 		lp->stats.rx_bytes += skb->len;
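To get a rough feel for why the sync length matters on a non-coherent machine, the number of cache lines a sync has to touch can be estimated as below. This is only a back-of-the-envelope sketch: the buffer size of 1544 bytes and the 32-byte cache line are assumptions made for illustration, the buffer is assumed to start on a cache-line boundary, and cache_lines_covered() is a hypothetical helper, not something in the driver.

```c
#include <stddef.h>

/* Assumed values, for illustration only. */
#define TOY_PKT_BUF_SZ  1544	/* assumed receive buffer size in bytes */
#define TOY_CACHE_LINE  32	/* assumed cache line size in bytes */

/*
 * Number of cache lines spanned by the first len bytes of a buffer
 * that starts on a cache-line boundary (rounds up).
 */
static size_t cache_lines_covered(size_t len, size_t line_size)
{
	if (line_size == 0)
		return 0;
	return (len + line_size - 1) / line_size;
}
```

Under these assumptions, syncing TOY_PKT_BUF_SZ - 2 bytes covers 49 lines, while syncing a 64-byte packet covers 2, so short packets are where syncing only pkt_len saves the most work.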
Re: when having to acquire an SA, ipsec drops the packet
>From: Joy Latten <[EMAIL PROTECTED]>
>Date: Mon, 05 Feb 2007 14:53:39 -0600
>
>> I can run some tests with this patch and report any results...
>
>Please check out the two most recent patches I posted:
>
>1) Updated core patch with ipv6 side added.
>2) Fix for thinko noticed by Venkat.

I have been testing this a lot in the lspp kernel. I plan to test it in the upstream kernel as well. I am seeing a second ACQUIRE occur while establishing the SAs.

My scenario: my policy states to use both the ESP and AH protocols (this may not make much sense, but it was for testing purposes). I get double SAs, with the only difference being the SPI.

Here is what I see happening...

1. Trigger first ACQUIRE via ping or netperf.

2. xfrm_lookup() calls xfrm_tmpl_resolve(), which calls xfrm_state_find(). The first time around, we need to establish an SA, so a minimal SA gets allocated and put in the SAD, a timer is set for the minimal SA to be ACQUIRED, and km_query() gets called.

3. xfrm_tmpl_resolve() returns -EAGAIN, causing add_wait_queue(&km_waitq, &wait) and the code that follows it to be called, waiting for the SA to be established. As long as the minimal SA with XFRM_STATE_ACQUIRE is in the SAD, we keep waiting...

4. The first set of SAs (one each for AH and ESP) for the IN direction gets inserted into the SAD.

5. Around the time the set of SAs for the OUT direction is to be inserted into the SAD, I see another ACQUIRE happening. I have not yet figured out where this second ACQUIRE comes from and why it happens. As long as the minimal SA or a set of valid outgoing SAs exists in the SAD, an ACQUIRE should not happen. The minimal SA does not get removed from the SAD until the set of SAs for OUT gets added and the xfrm_state_lock is released. And the lock pretty much guarantees no one else can step through the SAD until after the new SAs are added... and if someone gets the lock to step through the SAD before the OUT SAs are added, the minimal SA is still there...

6. Since this second ACQUIRE was able to happen, the result is identical sets of SAs for the traffic stream. The SPIs are the only difference.

7. Noticed something while pasting the log info below. Perhaps when the outgoing AH SA is added, wake_up(&km_waitq) gets called, the lock is released, and the minimal SA is deleted (xfrm_state_add()); then xfrm_tmpl_resolve() is called and it looks first for the outgoing ESP SA. Since that is not there yet and there is no minimal SA, km_query() results in an ACQUIRE just before the outgoing ESP SA gets added. It would explain why I only see it when both ESP and AH are specified... that is, if I am thinking correctly...

Regards,
Joy Latten

From my log file:

Mar 5 19:10:02 racoon: INFO: initiate new phase 2 negotiation: 9.3.192.210[500]<=>9.3.189.55[500]
Mar 5 19:10:03 racoon: INFO: IPsec-SA established: AH/Transport 9.3.189.55[0]->9.3.192.210[0] spi=137942922(0x838d78a)
Mar 5 19:10:03 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.189.55[0]->9.3.192.210[0] spi=244321490(0xe900cd2)
Mar 5 19:10:03 racoon: INFO: IPsec-SA established: AH/Transport 9.3.192.210[0]->9.3.189.55[0] spi=38721750(0x24ed8d6)
Mar 5 19:10:03 racoon: INFO: initiate new phase 2 negotiation: 9.3.192.210[500]<=>9.3.189.55[500]
Mar 5 19:10:03 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.192.210[0]->9.3.189.55[0] spi=265079770(0xfcccbda)
Mar 5 19:10:05 racoon: INFO: IPsec-SA established: AH/Transport 9.3.189.55[0]->9.3.192.210[0] spi=108627618(0x67986a2)
Mar 5 19:10:05 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.189.55[0]->9.3.192.210[0] spi=182973856(0xae7f5a0)
Mar 5 19:10:05 racoon: INFO: IPsec-SA established: AH/Transport 9.3.192.210[0]->9.3.189.55[0] spi=58486297(0x37c6e19)
Mar 5 19:10:05 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.192.210[0]->9.3.189.55[0] spi=268295215(0xffddc2f)
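The window suspected in step 7 can be written down as a toy model. This is purely a sketch of the hypothesis stated above, not kernel code: struct toy_sad, toy_lookup_sends_acquire() and toy_add_out_sa() are hypothetical names, and the real xfrm state handling is far more involved. The model only assumes what the message describes: templates are resolved ESP first, then AH, and a missing SA triggers an ACQUIRE only when no minimal (placeholder) SA is present.

```c
#include <stdbool.h>

/* Toy SAD: the minimal "ACQUIRE in progress" SA plus the two outbound SAs. */
struct toy_sad {
	bool acquire_placeholder;	/* minimal SA still in the SAD */
	bool out_ah;			/* outbound AH SA present */
	bool out_esp;			/* outbound ESP SA present */
};

/*
 * Toy lookup: walk the templates in order (ESP, then AH). The first
 * missing SA decides the outcome: with a placeholder present the caller
 * just keeps waiting, without one a new ACQUIRE would be sent.
 * Returns true if this lookup would send an ACQUIRE.
 */
static bool toy_lookup_sends_acquire(const struct toy_sad *sad)
{
	const bool present[2] = { sad->out_esp, sad->out_ah };
	int i;

	for (i = 0; i < 2; i++) {
		if (!present[i])
			return !sad->acquire_placeholder;
	}
	return false;	/* every template resolved */
}

/*
 * Toy insertion of one outbound SA. drop_placeholder models the minimal
 * SA being deleted as part of the insertion.
 */
static void toy_add_out_sa(struct toy_sad *sad, bool is_esp, bool drop_placeholder)
{
	if (is_esp)
		sad->out_esp = true;
	else
		sad->out_ah = true;
	if (drop_placeholder)
		sad->acquire_placeholder = false;
}
```

In this model, adding the AH SA with drop_placeholder set leaves a state in which a lookup sends a second ACQUIRE, whereas keeping the placeholder until the ESP SA is also inserted does not. Whether the kernel actually behaves this way is exactly the open question in step 7.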
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 07:30:21PM -0600, Matt Mackall wrote: > On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote: > > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote: > > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > > > > > > > Ok, how about the following patch. Is it acceptable to everyone? > > > > > > > > thanks, > > > > > > > > greg k-h > > > > > > > > --- > > > > init/Kconfig | 13 +++-- > > > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > > > > > --- gregkh-2.6.orig/init/Kconfig > > > > +++ gregkh-2.6/init/Kconfig > > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED > > > > that belong to a class, back into the /sys/class heirachy, in > > > > order to support older versions of udev. > > > > > > > > - If you are using a distro that was released in 2006 or later, > > > > - it should be safe to say N here. > > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > > > > + release from 2007 or later, it should be safe to say N here. > > > > + > > > > + If you are using Debian or other distros that are slow to > > > > + update HAL, please say Y here. > > > >... > > > > > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally > > > for all users, and schedule it's removal for mid-2008 (or later). > > > > > > 12 months after the first _release_ of a HAL that can live without seems > > > to be the first time when we can consider getting rid of it, since all > > > distributions with at least one release a year should ship it by then. > > > > > > Currently, SYSFS_DEPRECATED is only a trap for users. > > > > Huh? > > > > No, again, I've been using this just fine for about 6 months now. > > > > And what about all of the servers not using HAL/NetworkManager? > > And what about all of the embedded systems not using either? 
> > > > So to not allow this to be turned off by people who might want to (we > > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will > > other distros released this year), is pretty heavy-handed. > > > > It also will work in OpenSuSE 10.2 which is already released, and I > > think Fedora 6, but I've only limited experience with these. > > > > Oh, and Gentoo works just fine, and has been for the past 6 months. > > > > I would just prefer to come up with an acceptable set of wording that > > will work to properly warn people. > > > > I proposed one such wording which some people took as a slam against > > Debian, which it really was not at all. > > > > Does someone else want to propose some other wording instead? > > Back up a bit. Let's review: > > Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable > > Theory A: It broke because I'm not running an as-yet-unreleased HAL. > > Then we should revert the patch pronto because it's an unqualified > regression. > > Theory B: It broke because I'm not running relatively recent HAL. > > By all accounts I'm running the latest and greatest HAL and Network > Manager, more than recent enough to work. > > Theory C: It broke because I've got some goofy config. > > My setup passes no arguments to either. The HAL config file is > completely bare-bones and there's no sign of any configuration files > for Network Manager. > > Theory D: It broke for some nebulous Debian-related reason. > > That's a bunch of unhelpful crap. > > Can we come up with an actual theory for what's wrong with my setup, please? > Like, perhaps: > > Theory E: There's some undiagnosed new breakage that this introduces > that no else hit until it went into mainline. Theory F: It broke because you are using NetworkManager for your network devices and the patches that fix this have not made it into a real release? I'm just guessing, but does anyone who is having this problem, NOT using NetworkManager? 
I'm running an old version of HAL just fine, but I'm not using NetworkManager here. I am using NetworkManager on an OpenSuSE 10.3 release, but SuSE's version of NetworkManager is well known to not be anywhere near what is released as a tarball :(

thanks,

greg k-h
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote: > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote: > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > > > > > Ok, how about the following patch. Is it acceptable to everyone? > > > > > > thanks, > > > > > > greg k-h > > > > > > --- > > > init/Kconfig | 13 +++-- > > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > > > --- gregkh-2.6.orig/init/Kconfig > > > +++ gregkh-2.6/init/Kconfig > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED > > > that belong to a class, back into the /sys/class heirachy, in > > > order to support older versions of udev. > > > > > > - If you are using a distro that was released in 2006 or later, > > > - it should be safe to say N here. > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > > > + release from 2007 or later, it should be safe to say N here. > > > + > > > + If you are using Debian or other distros that are slow to > > > + update HAL, please say Y here. > > >... > > > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally > > for all users, and schedule it's removal for mid-2008 (or later). > > > > 12 months after the first _release_ of a HAL that can live without seems > > to be the first time when we can consider getting rid of it, since all > > distributions with at least one release a year should ship it by then. > > > > Currently, SYSFS_DEPRECATED is only a trap for users. > > Huh? > > No, again, I've been using this just fine for about 6 months now. > > And what about all of the servers not using HAL/NetworkManager? > And what about all of the embedded systems not using either? > > So to not allow this to be turned off by people who might want to (we > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will > other distros released this year), is pretty heavy-handed. 
>
> It also will work in OpenSuSE 10.2 which is already released, and I
> think Fedora 6, but I've only limited experience with these.
>
> Oh, and Gentoo works just fine, and has been for the past 6 months.
>
> I would just prefer to come up with an acceptable set of wording that
> will work to properly warn people.
>
> I proposed one such wording which some people took as a slam against
> Debian, which it really was not at all.
>
> Does someone else want to propose some other wording instead?

Back up a bit. Let's review:

Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable

Theory A: It broke because I'm not running an as-yet-unreleased HAL.

Then we should revert the patch pronto because it's an unqualified regression.

Theory B: It broke because I'm not running relatively recent HAL.

By all accounts I'm running the latest and greatest HAL and Network Manager, more than recent enough to work.

Theory C: It broke because I've got some goofy config.

My setup passes no arguments to either. The HAL config file is completely bare-bones and there's no sign of any configuration files for Network Manager.

Theory D: It broke for some nebulous Debian-related reason.

That's a bunch of unhelpful crap.

Can we come up with an actual theory for what's wrong with my setup, please? Like, perhaps:

Theory E: There's some undiagnosed new breakage that this introduces that no one else hit until it went into mainline.

Hmmm, this one sounds more promising.

--
Mathematics is the supreme nostalgia of our time.
[UDP]: Clean up UDP-Lite receive checksum
Hi Dave:

[UDP]: Clean up UDP-Lite receive checksum

This patch eliminates some duplicate code for the verification of receive checksums between UDP-Lite and UDP. It does this by introducing __skb_checksum_complete_head which is identical to __skb_checksum_complete apart from the fact that it takes a length parameter rather than computing the first skb->len bytes.

As a result UDP-Lite will be able to use hardware checksum offload for packets which do not use partial coverage checksums. It also means that UDP-Lite loopback no longer does unnecessary checksum verification.

If any NICs start supporting UDP-Lite this would also start working automatically.

This patch removes the assumption that msg_flags has MSG_TRUNC clear upon entry in recvmsg.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
b830b85a68b42ce10139a7a9e405622e809b8de7
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4ff3940..658dfad 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1381,6 +1381,7 @@ static inline void skb_set_timestamp(struct sk_buff *skb, const struct timeval * extern void __net_timestamp(struct sk_buff *skb); +extern __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len); extern __sum16 __skb_checksum_complete(struct sk_buff *skb); /** diff --git a/include/net/udp.h b/include/net/udp.h index 1b921fa..4a9699f 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -72,10 +72,7 @@ struct sk_buff; */ static inline __sum16 __udp_lib_checksum_complete(struct sk_buff *skb) { - if (! 
UDP_SKB_CB(skb)->partial_cov) - return __skb_checksum_complete(skb); - return csum_fold(skb_checksum(skb, 0, UDP_SKB_CB(skb)->cscov, - skb->csum)); + return __skb_checksum_complete_head(skb, UDP_SKB_CB(skb)->cscov); } static inline int udp_lib_checksum_complete(struct sk_buff *skb) diff --git a/include/net/udplite.h b/include/net/udplite.h index 67ac514..89aa2bd 100644 --- a/include/net/udplite.h +++ b/include/net/udplite.h @@ -47,11 +47,10 @@ static inline int udplite_checksum_init(struct sk_buff *skb, struct udphdr *uh) return 1; } -UDP_SKB_CB(skb)->partial_cov = 0; cscov = ntohs(uh->len); if (cscov == 0) /* Indicates that full coverage is required. */ - cscov = skb->len; + ; else if (cscov < 8 || cscov > skb->len) { /* * Coverage length violates RFC 3828: log and discard silently. @@ -60,42 +59,16 @@ static inline int udplite_checksum_init(struct sk_buff *skb, struct udphdr *uh) cscov, skb->len); return 1; - } else if (cscov < skb->len) + } else if (cscov < skb->len) { UDP_SKB_CB(skb)->partial_cov = 1; - -UDP_SKB_CB(skb)->cscov = cscov; - - /* -* There is no known NIC manufacturer supporting UDP-Lite yet, -* hence ip_summed is always (re-)set to CHECKSUM_NONE. 
-*/ - skb->ip_summed = CHECKSUM_NONE; + UDP_SKB_CB(skb)->cscov = cscov; + if (skb->ip_summed == CHECKSUM_COMPLETE) + skb->ip_summed = CHECKSUM_NONE; +} return 0; } -static __inline__ int udplite4_csum_init(struct sk_buff *skb, struct udphdr *uh) -{ - int rc = udplite_checksum_init(skb, uh); - - if (!rc) - skb->csum = csum_tcpudp_nofold(skb->nh.iph->saddr, - skb->nh.iph->daddr, - skb->len, IPPROTO_UDPLITE, 0); - return rc; -} - -static __inline__ int udplite6_csum_init(struct sk_buff *skb, struct udphdr *uh) -{ - int rc = udplite_checksum_init(skb, uh); - - if (!rc) - skb->csum = ~csum_unfold(csum_ipv6_magic(&skb->nh.ipv6h->saddr, -&skb->nh.ipv6h->daddr, -skb->len, IPPROTO_UDPLITE, 0)); - return rc; -} - static inline int udplite_sender_cscov(struct udp_sock *up, struct udphdr *uh) { int cscov = up->len; diff --git a/net/core/datagram.c b/net/core/datagram.c index 186212b..cb056f4 100644 --- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -411,11 +411,11 @@ fault: return -EFAULT; } -__sum16 __skb_checksum_complete(struct sk_buff *skb) +__sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len) { __sum16 sum; - sum = csum_fold(skb_checksum(skb, 0, skb->len, skb->csum)); + sum = csum_fold(skb_checksum(skb, 0, len, skb->csum)); if (likely(!sum)) { if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
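The idea of verifying a checksum over only the first len bytes, which is what the patch description says __skb_checksum_complete_head adds, can be illustrated in userspace. The following is a minimal sketch and not the kernel implementation: it works on a flat byte array instead of an sk_buff, ignores ip_summed handling entirely, and the names csum_fold32(), csum_add_bytes(), checksum_complete_head() and checksum_complete() are hypothetical stand-ins. The pseudo argument plays the role of the pseudo-header sum seeded into skb->csum.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit running sum into a complemented 16-bit Internet checksum. */
static uint16_t csum_fold32(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Add the first len bytes of buf to sum, as big-endian 16-bit words. */
static uint32_t csum_add_bytes(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)buf[i] << 8;
	return sum;
}

/*
 * Verify only the first len bytes (partial coverage).
 * Returns 0 when the covered bytes, including the stored checksum
 * field, are consistent.
 */
static uint16_t checksum_complete_head(const uint8_t *pkt, size_t len,
				       uint32_t pseudo)
{
	return csum_fold32(csum_add_bytes(pkt, len, pseudo));
}

/* Full coverage is simply the special case len == packet length. */
static uint16_t checksum_complete(const uint8_t *pkt, size_t pkt_len,
				  uint32_t pseudo)
{
	return checksum_complete_head(pkt, pkt_len, pseudo);
}
```

With a coverage of 8 bytes on a 12-byte packet, corrupting any of the last 4 bytes leaves checksum_complete_head(pkt, 8, pseudo) at 0, which is the behaviour partial coverage is meant to give. Having the full-length check reduce to the head variant with len set to the packet length mirrors how the patch description characterises the two functions.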
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, 5 Mar 2007 17:17:09 -0800 Greg KH <[EMAIL PROTECTED]> wrote:

> On Mon, Mar 05, 2007 at 05:08:49PM -0800, Andrew Morton wrote:
> > On Mon, 5 Mar 2007 19:56:25 -0500
> > Theodore Tso <[EMAIL PROTECTED]> wrote:
> >
> > > So the question really is are we really done making changes to sysfs,
> > > or maybe what we should do is talk about major version numbers to
> > > sysfs.
> >
> > Perhaps using a config option wasn't the right way to do this - a kernel
> > boot parameter might be better.
>
> Ok, I have no problem with that if people really want it. But give me
> the option to also make it a config option so I don't have to change our
> bootloaders too.

Sometimes we provide a config option which provides the default version of the boot option. So:

	CONFIG_SYSFS_VERSION=1.2

and

	if (user_provided_sysfs_version == NULL)
		user_provided_sysfs_version = CONFIG_SYSFS_VERSION;

> Does that sound acceptable?

If we make CONFIG_SYSFS_DEPRECATED just a boolean boot option then that fixes this problem (we hope) but won't help us next time we want to change something. It all depends on whether sysfs is finished yet ;)
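The "config option supplies the default, boot option overrides it" pattern can be sketched in plain C. This is a userspace illustration only: neither CONFIG_SYSFS_VERSION nor a "sysfs_version=" boot parameter exists as far as this thread shows (both are proposals here), and find_param() and effective_sysfs_version() are hypothetical helpers rather than kernel parameter-parsing APIs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compile-time default, standing in for a Kconfig value. */
#define CONFIG_SYSFS_VERSION "1.2"

/*
 * Find "key" at the start of a word in a space-separated command line.
 * Returns a pointer to the text right after the key, or NULL.
 */
static const char *find_param(const char *cmdline, const char *key)
{
	size_t klen = strlen(key);
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		if (p == cmdline || p[-1] == ' ')
			return p + klen;
		p += klen;	/* matched inside another word, keep looking */
	}
	return NULL;
}

/*
 * Return the version given on the command line, copied into buf,
 * or the compile-time default when none (or an empty one) was given.
 */
static const char *effective_sysfs_version(const char *cmdline,
					   char *buf, size_t buflen)
{
	const char *val = find_param(cmdline, "sysfs_version=");
	size_t n;

	if (val == NULL || buflen == 0)
		return CONFIG_SYSFS_VERSION;
	n = strcspn(val, " ");
	if (n == 0)
		return CONFIG_SYSFS_VERSION;
	if (n >= buflen)
		n = buflen - 1;
	memcpy(buf, val, n);
	buf[n] = '\0';
	return buf;
}
```

A distro that cannot touch its bootloader configuration would rely on the compiled-in default, while anyone else could pass the parameter at boot to override it.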
[UDP6]: Restore sk_filter optimisation
Hi Dave:

[UDP6]: Restore sk_filter optimisation

This reverts the changeset [IPV6]: UDPv6 checksum. We always need to check UDPv6 checksum because it is mandatory. The sk_filter optimisation has nothing to do with whether we verify the checksum. It simply postpones it to the point when the user calls recv or poll.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0ad4719..4474480 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -279,8 +279,10 @@ int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb)
 		}
 	}
 
-	if (udp_lib_checksum_complete(skb))
-		goto drop;
+	if (sk->sk_filter) {
+		if (udp_lib_checksum_complete(skb))
+			goto drop;
+	}
 
 	if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) {
 		/* Note that an ENOMEM error is charged twice */
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 05:08:49PM -0800, Andrew Morton wrote:
> On Mon, 5 Mar 2007 19:56:25 -0500
> Theodore Tso <[EMAIL PROTECTED]> wrote:
>
> > So the question really is are we really done making changes to sysfs,
> > or maybe what we should do is talk about major version numbers to
> > sysfs.
>
> Perhaps using a config option wasn't the right way to do this - a kernel
> boot parameter might be better.

Ok, I have no problem with that if people really want it. But give me the option to also make it a config option so I don't have to change our bootloaders too.

Does that sound acceptable?

thanks,

greg k-h
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 07:56:25PM -0500, Theodore Tso wrote: > On Mon, Mar 05, 2007 at 04:37:15PM -0800, Greg KH wrote: > > But I AM TRYING TO MAKE IT COMPATIBLE!!! > > > > That's what that config option is there for. If you happen to be > > running a newer userspace, a different distro than what is in Debian > > right now, or don't use HAL and Networkmanager, then disable that > > option. Then all of sysfs looks just like it used to, no user visble > > changes at all. It doesn't get any more compatible than that. > > This is great, but I think the real problem isn't the config option, > but what is changing if the config option isn't enabled. The claim > which some, including Matt and Bron, seem to be making is that if you > turn *off* CONFIG_SYSFS_DEPRECATED, you must be using at least hal > 0.5.9-rc1, released ***yesterday***, or suffer breakages for at least > some system configurations. Ok, well that has been proven incorrect. I originally thought it was HAL that had the problem, but I think that is not true, as I am using the older version of hal here (0.5.7.1) just fine. > So the problem with putting a date in Kconfig.txt help file, or in > Documentation/feature-removal-schedule.txt, is that if there are other > incompatible changes which are added to sysfs in say, December 2007 or > January 2008, but which are papered over with CONFIG_SYSFS_DEPRECATED, > and then come June 2008, CONFIG_SYSFS_DEPRECATED is unceremoniously > ripped out, then users will get screwed. > > So the question really is are we really done making changes to sysfs, > or maybe what we should do is talk about major version numbers to > sysfs. Call what we have currently not CONFIG_SYSFS_DEPRECATED, but > rather CONFIG_SYSFS_LAYOUT_1. At the moment, CONFIG_SYSFS_LAYOUT_2 is > undergoing changes, but at some point we need to lock down and state > that Layout version 2 is never going to change, and then people who > want changes can go work on CONFIG_SYSFS_LAYOUT_3. 
>
> The problem with calling CONFIG_SYSFS_DEPRECATED is that people think
> that since it's deprecated, it should be turned off, but if we have
> staged major version numbers, with guarantees of absolute stability
> once a particular major version number is locked down, then it may
> make it a lot easier to talk about what version of hal and udev and
> Network Manager is really needed for different versions.

This is what Documentation/ABI/ has tried to nail down. Unfortunately, it has turned out to be very hard to track down all of the odd userspace programs that use sysfs and see what they are relying on. We are slowly fixing things, as is proof in the OpenSuSE and Gentoo releases.

And I'll be the first to admit that the ABI/ directory needs some fleshing out...

And it isn't really a whole different layout. The only problem here is that a directory has turned into a symlink, so programs that were not written that well (and I'll be the first to admit that I made the same mistake in udev many years ago) can't handle the change.

So numerous programs "just work" fine, but a limited few have problems, hence the config option so that nothing will break.

And if you look in the ABI/ directory, it describes this usage of the class devices in sysfs. But again, no one is flushing out the users of these features, or even reading the stuff that is there...

So, again, a better wording for the CONFIG help text anyone? Or a better name for the CONFIG value itself?

thanks,

greg k-h
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, 5 Mar 2007 19:56:25 -0500 Theodore Tso <[EMAIL PROTECTED]> wrote:

> So the question really is are we really done making changes to sysfs,
> or maybe what we should do is talk about major version numbers to
> sysfs.

Perhaps using a config option wasn't the right way to do this - a kernel boot parameter might be better.

In fact, one could envisage a kernel boot parameter "sysfs_version=N" which will allow distro people to select the sysfs-of-the-day which works with their userspace.

Because it does appear that we need _something_ which will get us away from this ongoing problem of needing to keep the kernel and userspace synchronised across sysfs changes.
Re: [UDP]: Reread uh pointer after pskb_trim
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Tue, 6 Mar 2007 12:00:20 +1100

> Hi Dave:
>
> [UDP]: Reread uh pointer after pskb_trim
>
> The header may have moved when trimming.
>
> Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Good catch, I'll apply this and push to -stable, thanks Herbert.
[UDP]: Reread uh pointer after pskb_trim
Hi Dave:

[UDP]: Reread uh pointer after pskb_trim

The header may have moved when trimming.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ce6c460..fc620a7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1215,6 +1215,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 		if (ulen < sizeof(*uh) || pskb_trim_rcsum(skb, ulen))
 			goto short_packet;
+		uh = skb->h.uh;
 		udp4_csum_init(skb, uh);
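The class of bug being fixed, a header pointer cached before an operation that may relocate the underlying data, has a simple userspace analogue. The sketch below is only an analogy under stated assumptions: struct toy_buf, toy_trim() and parse_after_trim() are hypothetical, and toy_trim() deliberately always moves the data to a fresh allocation so the effect is deterministic, which is not a claim about when pskb_trim_rcsum() relocates anything.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: data may be reallocated by toy_trim(). */
struct toy_buf {
	unsigned char *data;
	size_t len;
};

/* Toy 8-byte header made of byte arrays, so any address is suitably aligned. */
struct toy_hdr {
	unsigned char src[2], dst[2], len[2], check[2];
};

static struct toy_hdr *toy_hdr_of(struct toy_buf *b)
{
	return (struct toy_hdr *)b->data;
}

/*
 * Trim the buffer to new_len bytes. The data is always copied to a new
 * allocation, so every pointer into the old data becomes invalid.
 * Returns 0 on success, -1 on failure.
 */
static int toy_trim(struct toy_buf *b, size_t new_len)
{
	unsigned char *n;

	if (new_len > b->len)
		return -1;
	n = malloc(new_len ? new_len : 1);
	if (n == NULL)
		return -1;
	memcpy(n, b->data, new_len);
	free(b->data);
	b->data = n;
	b->len = new_len;
	return 0;
}

/* Trim to ulen bytes, then read the destination port from the header. */
static int parse_after_trim(struct toy_buf *b, size_t ulen, unsigned *dst_port)
{
	struct toy_hdr *h = toy_hdr_of(b);

	if (ulen < sizeof(*h) || toy_trim(b, ulen))
		return -1;
	h = toy_hdr_of(b);	/* reread: the old pointer refers to freed memory */
	*dst_port = ((unsigned)h->dst[0] << 8) | h->dst[1];
	return 0;
}
```

Dropping the reread in parse_after_trim() would turn the port read into a use-after-free, which is the userspace counterpart of using a stale uh after the trim.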
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 04:37:15PM -0800, Greg KH wrote: > But I AM TRYING TO MAKE IT COMPATIBLE!!! > > That's what that config option is there for. If you happen to be > running a newer userspace, a different distro than what is in Debian > right now, or don't use HAL and Networkmanager, then disable that > option. Then all of sysfs looks just like it used to, no user visble > changes at all. It doesn't get any more compatible than that. This is great, but I think the real problem isn't the config option, but what is changing if the config option isn't enabled. The claim which some, including Matt and Bron, seem to be making is that if you turn *off* CONFIG_SYSFS_DEPRECATED, you must be using at least hal 0.5.9-rc1, released ***yesterday***, or suffer breakages for at least some system configurations. So the problem with putting a date in Kconfig.txt help file, or in Documentation/feature-removal-schedule.txt, is that if there are other incompatible changes which are added to sysfs in say, December 2007 or January 2008, but which are papered over with CONFIG_SYSFS_DEPRECATED, and then come June 2008, CONFIG_SYSFS_DEPRECATED is unceremoniously ripped out, then users will get screwed. So the question really is are we really done making changes to sysfs, or maybe what we should do is talk about major version numbers to sysfs. Call what we have currently not CONFIG_SYSFS_DEPRECATED, but rather CONFIG_SYSFS_LAYOUT_1. At the moment, CONFIG_SYSFS_LAYOUT_2 is undergoing changes, but at some point we need to lock down and state that Layout version 2 is never going to change, and then people who want changes can go work on CONFIG_SYSFS_LAYOUT_3. 
The problem with calling it CONFIG_SYSFS_DEPRECATED is that people think that since it's deprecated, it should be turned off, but if we have staged major version numbers, with guarantees of absolute stability once a particular major version number is locked down, then it may make it a lot easier to talk about what version of hal and udev and Network Manager is really needed for different versions.

- Ted
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 03:14:25PM -0600, Matt Mackall wrote: > On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote: > > > That's not the point. The point is that Debian/unstable as of _this > > > morning_ doesn't work. For reference, I'm running both the latest > > > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And > > > there are people telling me I need a copy of HAL out of git that > > > hasn't even been released for Debian to package. Debian isn't the > > > problem here. > > > > hal 0.5.9-rc1 (released, not from git) should work. It will be > > problably released soon and picked by sane distributions. Debian is very > > irritating corner case. > > Presumably the -rc1 stands for "release candidate". Which means "not > yet released". And when did it show up? 04-Mar-2007 at 18:31. That's > right, YESTERDAY. Almost a full month after Greg's commit. > > For the last time, DEBIAN IS NOT THE PROBLEM. Can I please second this (having been burned by hell that was udev of the 0.5ish era) - Greg, please try to make changes in a cross-compatible way so that versions of userspace and kernel are not so closely dependant on tracking each other. The whole 2.6.8 -> 2.6.12 series of kernels and associated udevs are fraught with race conditions where upgrading one but not the other will leave your machine unbootable. I read the "manifesto" for udev showing how crap devfs was, it was broken, it could never be fixed etc - yet my experience was that devfs systems "just worked"[tm] and udev was very dangerous. My thinking is going to be tarnished by that for a while and my mental image of udev is "unreliable POS". I'm hoping enough good experiences with udev might make me feel less scared whenever I have to deal with it. Similarly, I'm hoping I don't have to think "oh shit, will this break boot" every time I upgrade either a kernel or hal version for the next year, because it would really suck to do that all over again. 
It contributes to the meme that linux is unreliable and perpetually unstable. Regards, Bron. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Tue, Mar 06, 2007 at 01:35:41AM +0100, Adrian Bunk wrote: > On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote: > > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote: > > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > > > > > > > Ok, how about the following patch. Is it acceptable to everyone? > > > > > > > > thanks, > > > > > > > > greg k-h > > > > > > > > --- > > > > init/Kconfig | 13 +++-- > > > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > > > > > --- gregkh-2.6.orig/init/Kconfig > > > > +++ gregkh-2.6/init/Kconfig > > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED > > > > that belong to a class, back into the /sys/class heirachy, in > > > > order to support older versions of udev. > > > > > > > > - If you are using a distro that was released in 2006 or later, > > > > - it should be safe to say N here. > > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > > > > + release from 2007 or later, it should be safe to say N here. > > > > + > > > > + If you are using Debian or other distros that are slow to > > > > + update HAL, please say Y here. > > > >... > > > > > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally > > > for all users, and schedule it's removal for mid-2008 (or later). > > > > > > 12 months after the first _release_ of a HAL that can live without seems > > > to be the first time when we can consider getting rid of it, since all > > > distributions with at least one release a year should ship it by then. > > > > > > Currently, SYSFS_DEPRECATED is only a trap for users. > > > > Huh? > > > > No, again, I've been using this just fine for about 6 months now. > > > > And what about all of the servers not using HAL/NetworkManager? > > On a server, it shouldn't harm. But if they wanted that option enabled? > > And what about all of the embedded systems not using either? > > If it was much code, I would have sent a patch that allowed disabling it > if EMBEDDED=y. 
It's not a code size issue. In fact, if the option is enabled, like you have done, it builds more code into the kernel than before. > > So to not allow this to be turned off by people who might want to (we > > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will > > other distros released this year), is pretty heavy-handed. > > > > It also will work in OpenSuSE 10.2 which is already released, and I > > think Fedora 6, but I've only limited experience with these. > > > > Oh, and Gentoo works just fine, and has been for the past 6 months. > > For most people, it simply doesn't matter whether SYSFS_DEPRECATED is > on or off. Exactly. > But accidentally disabling SYSFS_DEPRECATED has proven to be a trap > people sometimes fall into - and tracking them down to > SYSFS_DEPRECATED=n sometimes takes some time. So how do I put up the warning flag any larger than I have? I do not want this always enabled, that option is not acceptable to me, or to the zillions of people who are running a distro that this option works just fine on (see above list...) thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Tue, Mar 06, 2007 at 11:24:57AM +1100, Bron Gondwana wrote: > On Mon, Mar 05, 2007 at 03:14:25PM -0600, Matt Mackall wrote: > > On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote: > > > > That's not the point. The point is that Debian/unstable as of _this > > > > morning_ doesn't work. For reference, I'm running both the latest > > > > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And > > > > there are people telling me I need a copy of HAL out of git that > > > > hasn't even been released for Debian to package. Debian isn't the > > > > problem here. > > > > > > hal 0.5.9-rc1 (released, not from git) should work. It will be > > > problably released soon and picked by sane distributions. Debian is very > > > irritating corner case. > > > > Presumably the -rc1 stands for "release candidate". Which means "not > > yet released". And when did it show up? 04-Mar-2007 at 18:31. That's > > right, YESTERDAY. Almost a full month after Greg's commit. > > > > For the last time, DEBIAN IS NOT THE PROBLEM. > > Can I please second this (having been burned by hell that was udev of > the 0.5ish era) - Greg, please try to make changes in a cross-compatible > way so that versions of userspace and kernel are not so closely > dependant on tracking each other. The whole 2.6.8 -> 2.6.12 series of > kernels and associated udevs are fraught with race conditions where > upgrading one but not the other will leave your machine unbootable. But I AM TRYING TO MAKE IT COMPATIBLE!!! That's what that config option is there for. If you happen to be running a newer userspace, a different distro than what is in Debian right now, or don't use HAL and Networkmanager, then disable that option. Then all of sysfs looks just like it used to, no user visble changes at all. It doesn't get any more compatible than that. Again, I've pointed out distros that work just fine many times in this thread... 
It's been there since 2.6.20 I think, no one seemed to have noticed it then for an odd reason... And the default is enabled, you have to manually turn it off in order to break your machine. Again, how can I word this in a manner that would be sufficient to keep this misunderstanding from happening again? greg k-h - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa
On Fri, 2 Mar 2007, Eric Paris wrote:

> Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if
> there was any permission/security failures in attempting to do the del
> operation (such as permission denied from security_xfrm_state_delete).
> This patch moves the audit hook to the exit path such that all failures
> (and successes) will actually get audited.
>
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>

--
James Morris <[EMAIL PROTECTED]>
Re: [PATCH] Add xfrm policy change auditing to pfkey_spdget
On Fri, 2 Mar 2007, Eric Paris wrote:

> pfkey_spdget neither had an LSM security hook nor auditing for the
> removal of xfrm_policy structs. The security hook was added when it was
> moved into xfrm_policy_byid instead of the callers to that function by
> my earlier patch and this patch adds the auditing hooks as well.
>
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>

--
James Morris <[EMAIL PROTECTED]>
Re: [PATCH] xfrm_policy delete security check misplaced
On Fri, 2 Mar 2007, Eric Paris wrote:

> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>

--
James Morris <[EMAIL PROTECTED]>
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote: > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote: > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > > > > > Ok, how about the following patch. Is it acceptable to everyone? > > > > > > thanks, > > > > > > greg k-h > > > > > > --- > > > init/Kconfig | 13 +++-- > > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > > > --- gregkh-2.6.orig/init/Kconfig > > > +++ gregkh-2.6/init/Kconfig > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED > > > that belong to a class, back into the /sys/class heirachy, in > > > order to support older versions of udev. > > > > > > - If you are using a distro that was released in 2006 or later, > > > - it should be safe to say N here. > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > > > + release from 2007 or later, it should be safe to say N here. > > > + > > > + If you are using Debian or other distros that are slow to > > > + update HAL, please say Y here. > > >... > > > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally > > for all users, and schedule it's removal for mid-2008 (or later). > > > > 12 months after the first _release_ of a HAL that can live without seems > > to be the first time when we can consider getting rid of it, since all > > distributions with at least one release a year should ship it by then. > > > > Currently, SYSFS_DEPRECATED is only a trap for users. > > Huh? > > No, again, I've been using this just fine for about 6 months now. > > And what about all of the servers not using HAL/NetworkManager? On a server, it shouldn't harm. > And what about all of the embedded systems not using either? If it was much code, I would have sent a patch that allowed disabling it if EMBEDDED=y. > So to not allow this to be turned off by people who might want to (we > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will > other distros released this year), is pretty heavy-handed. 
> > It also will work in OpenSuSE 10.2 which is already released, and I > think Fedora 6, but I've only limited experience with these. > > Oh, and Gentoo works just fine, and has been for the past 6 months. For most people, it simply doesn't matter whether SYSFS_DEPRECATED is on or off. But accidentally disabling SYSFS_DEPRECATED has proven to be a trap people sometimes fall into - and tracking them down to SYSFS_DEPRECATED=n sometimes takes some time. > I would just prefer to come up with an acceptable set of wording that > will work to properly warn people. > > I proposed one such wording which some people took as a slam against > Debian, which it really was not at all. > > Does someone else want to propose some other wording instead? > > thanks, > > greg k-h cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().
On Tue, 6 Mar 2007, Herbert Xu wrote:

> It's just too error-prone to rely on it to not have MSG_TRUNC set.

Agreed.

> I'm going to clean this up for UDP and improve the UDP-lite checksum
> handling while I'm at it.

Great. It'll be good to get this years-old UDP bug fixed.

Thanks,
Jim
Re: [RFC] div64_64 support
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Mon, 5 Mar 2007 15:57:14 -0800

> I tried the code from Hacker's Delight.
> It is cool, but performance is CPU (and data) dependent:
>
> Average # of usecs per operation:

Interesting results.

The problem with these algorithms that trade off one or more multiplies in order to avoid a divide is that they don't give anything and often lose when both multiplies and divides are emulated in software.

This is particularly true in this cube-root case from Hacker's Delight, because it's using 3 multiplies per iteration in place of one divide per iteration.

Actually, sorry, there is only one real multiply in there since the other two can be computed using addition and shifts.

Another thing is that the non-Hacker's Delight version iterates differently for different input values, so the input value space is very important to consider when comparing these two pieces of code.
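For readers who do not have the book at hand, here is a minimal sketch of the division-free integer cube root as it is usually presented for 32-bit inputs. It is not the code that was benchmarked in this thread, and the function name icbrt is just the conventional one. The loop always runs 11 iterations regardless of the input, which is the contrast with the Newton version mentioned above.

```c
#include <stdint.h>

/*
 * Sketch of the shift-and-subtract cube root for 32-bit inputs.
 * Returns floor(cbrt(x)) and uses no divide instruction.
 * Illustration only; the thread itself concerns 64-bit values.
 */
uint32_t icbrt(uint32_t x)
{
    uint32_t y = 0;   /* result, built one bit per iteration */
    uint32_t b;
    int s;

    for (s = 30; s >= 0; s -= 3) {
        y = 2 * y;
        /* 3*y*(y+1)+1 is (y+1)^3 - y^3; 3*y and the +1 terms
         * can be formed with shifts and adds, leaving one real
         * multiply per iteration. */
        b = (3 * y * (y + 1) + 1) << s;
        if (x >= b) {
            x -= b;
            y += 1;
        }
    }
    return y;
}
```

Because the iteration count is fixed, its cost does not depend on the input value, only on how fast the target CPU does the multiply and shifts.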
Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote: > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > > > Ok, how about the following patch. Is it acceptable to everyone? > > > > thanks, > > > > greg k-h > > > > --- > > init/Kconfig | 13 +++-- > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > --- gregkh-2.6.orig/init/Kconfig > > +++ gregkh-2.6/init/Kconfig > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED > > that belong to a class, back into the /sys/class heirachy, in > > order to support older versions of udev. > > > > - If you are using a distro that was released in 2006 or later, > > - it should be safe to say N here. > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > > + release from 2007 or later, it should be safe to say N here. > > + > > + If you are using Debian or other distros that are slow to > > + update HAL, please say Y here. > >... > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally > for all users, and schedule it's removal for mid-2008 (or later). > > 12 months after the first _release_ of a HAL that can live without seems > to be the first time when we can consider getting rid of it, since all > distributions with at least one release a year should ship it by then. > > Currently, SYSFS_DEPRECATED is only a trap for users. Huh? No, again, I've been using this just fine for about 6 months now. And what about all of the servers not using HAL/NetworkManager? And what about all of the embedded systems not using either? So to not allow this to be turned off by people who might want to (we want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will other distros released this year), is pretty heavy-handed. It also will work in OpenSuSE 10.2 which is already released, and I think Fedora 6, but I've only limited experience with these. Oh, and Gentoo works just fine, and has been for the past 6 months. 
I would just prefer to come up with an acceptable set of wording that will work to properly warn people. I proposed one such wording which some people took as a slam against Debian, which it really was not at all. Does someone else want to propose some other wording instead? thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().
On Tue, Mar 06, 2007 at 10:34:49AM +1100, Herbert Xu wrote:
> >
> > That's not true. Please see my post.
> >
> > Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that
> > udp_recvmsg() can randomly ignore whether the HW has computed a checksum
> > and compute it in SW redundantly.
>
> Sorry, you're right. This bug has been there for years.

Actually I think we should fix UDP regardless of whether we initialise msg_flags to zero here. It's just too error-prone to rely on it to not have MSG_TRUNC set.

I'm going to clean this up for UDP and improve the UDP-lite checksum handling while I'm at it.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
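As a side note for people reading along from user space: msg_flags is an output field of recvmsg(), and the concern in this thread is kernel code looking at its value before anything has set it. The sketch below only shows the caller's side of that contract. The helper name recv_dgram is made up for illustration, and zeroing the msghdr is a defensive habit, not the kernel fix being discussed.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: receive one datagram into buf and report
 * through *truncated whether the kernel set MSG_TRUNC.
 */
ssize_t recv_dgram(int fd, void *buf, size_t len, int *truncated)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t n;

    /* Start from a fully zeroed header so no stale flag bits exist. */
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    n = recvmsg(fd, &msg, 0);
    if (n >= 0 && truncated)
        *truncated = (msg.msg_flags & MSG_TRUNC) != 0;
    return n;
}
```

A caller that passes a buffer smaller than the datagram should see *truncated set to 1 on return.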
Re: [RFC] div64_64 support
On 03 Mar 2007 03:31:52 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger <[EMAIL PROTECTED]> writes:
>
> > Here is another way to handle the 64 bit divide case.
> > It allows full 64 bit divide by adding the support routine
> > GCC needs.
>
> Not supplying that was intentional by Linus so that people
> think twice (or more often) before they using such expensive
> operations. A plain / looks too innocent.
>
> Is it really needed by CUBIC anyways? It uses it for getting
> the cubic root, but the algorithm recommended by Hacker's Delight
> (great book) doesn't use any divisions at all. Probably better
> to use a better algorithm without divisions.
>

I tried the code from Hacker's Delight.
It is cool, but performance is CPU (and data) dependent:

Average # of usecs per operation:

                Hacker      Newton
Pentium 3       68.6    <   90.4
T2050           98.6    >   92.0
U1400           450     >   415
Xeon            70      <   90
Xeon (newer)    71      <   78
EM64T           21.8    <   24.6
AMD64           23.4    <   32.0

It might be worth the change for code size reduction though.

--
Stephen Hemminger <[EMAIL PROTECTED]>
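To make the "Newton" column concrete, here is a rough user-space sketch of an integer Newton iteration for the cube root. It is not the CUBIC code from the kernel tree; the name newton_cbrt and the fixed starting guess are assumptions chosen to keep the example short. The point is only that each step needs a divide by a variable 64-bit value, and that the number of steps depends on the input.

```c
#include <stdint.h>

/*
 * Sketch only: floor(cbrt(a)) by Newton's method on integers,
 * x' = (2*x + a / x^2) / 3, starting from an overestimate.
 */
uint64_t newton_cbrt(uint64_t a)
{
    uint64_t x, y;

    if (a == 0)
        return 0;               /* avoids a divide by zero below */

    x = (uint64_t)1 << 22;      /* (2^22)^3 = 2^66, above any 64-bit a */
    for (;;) {
        /* a / (x * x) is the variable 64-bit divide under discussion. */
        y = (2 * x + a / (x * x)) / 3;
        if (y >= x)
            return x;           /* stopped decreasing: x is the floor */
        x = y;
    }
}
```

Starting from an overestimate, the sequence decreases strictly until it reaches the floor of the cube root, so the loop terminates, but small inputs take more steps from this fixed starting guess than large ones.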
Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying
On Monday March 5, [EMAIL PROTECTED] wrote: > On Friday 02 March 2007 05:28, NeilBrown wrote: > > The sunrpc server code needs to know the source and destination address > > for UDP packets so it can reply properly. > > It currently copies code out of the network stack to pick the pieces out > > of the skb. > > This is ugly and causes compile problems with the IPv6 stuff. > > ... and this IPv6 code could never have worked anyway: :-( It's hard to test the IPv6 server until we have an IPv6 client I guess, so thanks for the code review, even though we aren't going to end up using that code... > > But I find using recvmsg just for getting at the addresses > a little awkward too. Do you? It's surely a lot better than code duplication, and it is exactly how you would get the information from user-space. > And I think to be on the safe side, you > should check that you're really looking at a PKTINFO cmsg > rather than something else. Maybe. But is there really a chance that it might not be PKTINFO? And what do you do if it isn't? Log an error and drop the packet I guess. I'll see what I can do. NeilBrown - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
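Since the comparison above is with how user space gets this information, here is a small user-space sketch of the check being asked for: walk the control messages returned by recvmsg() and accept only one whose level and type are IPPROTO_IP / IP_PKTINFO. The helper name find_pktinfo is hypothetical, and what to do on failure (log and drop, as suggested above) is left to the caller. This is not the sunrpc code.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: copy the IP_PKTINFO payload out of msg's
 * control data. Returns 0 on success, -1 if no such cmsg is present.
 */
int find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        /* Check level, type and length before trusting the payload. */
        if (c->cmsg_level == IPPROTO_IP &&
            c->cmsg_type == IP_PKTINFO &&
            c->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 0;
        }
    }
    return -1;
}
```

The socket would need the IP_PKTINFO option enabled with setsockopt() for the kernel to attach this control message in the first place.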
[2.6.21 patch] unconditionally enable SYSFS_DEPRECATED
On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > Ok, how about the following patch. Is it acceptable to everyone? > > thanks, > > greg k-h > > --- > init/Kconfig | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > --- gregkh-2.6.orig/init/Kconfig > +++ gregkh-2.6/init/Kconfig > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED > that belong to a class, back into the /sys/class heirachy, in > order to support older versions of udev. > > - If you are using a distro that was released in 2006 or later, > - it should be safe to say N here. > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > + release from 2007 or later, it should be safe to say N here. > + > + If you are using Debian or other distros that are slow to > + update HAL, please say Y here. >... The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally for all users, and schedule it's removal for mid-2008 (or later). 12 months after the first _release_ of a HAL that can live without seems to be the first time when we can consider getting rid of it, since all distributions with at least one release a year should ship it by then. Currently, SYSFS_DEPRECATED is only a trap for users. Suggested patch below. cu Adrian <-- snip --> unconditionally enable SYSFS_DEPRECATED This patch unconditionally enables SYSFS_DEPRECATED and schedules it's removal for July 2008. Currently, SYSFS_DEPRECATED is only a trap for users accidentally disabling it. In July 2008, all distributions with at least one release a year should be able to run without SYSFS_DEPRECATED. 
Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index c3b1430..b0bce93 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -316,3 +316,13 @@ Why: The option/code is
 Who: Johannes Berg <[EMAIL PROTECTED]>
 
 ---
+
+What: deprecated sysfs files (CONFIG_SYSFS_DEPRECATED)
+When: July 2008
+Why: None of these features or values should be used any longer,
+ as they export driver core implementation details to userspace
+ or export properties which can't be kept stable across kernel
+ releases.
+Who: Greg KH <[EMAIL PROTECTED]>
+
+---
diff --git a/init/Kconfig b/init/Kconfig
index f977086..f652b6f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -274,24 +274,9 @@ config CPUSETS
 	  Say N if unsure.
 
 config SYSFS_DEPRECATED
-	bool "Create deprecated sysfs files"
+	bool
 	default y
 	help
-	  This option creates deprecated symlinks such as the
-	  "device"-link, the :-link, and the
-	  "bus"-link. It may also add deprecated key in the
-	  uevent environment.
-	  None of these features or values should be used today, as
-	  they export driver core implementation details to userspace
-	  or export properties which can't be kept stable across kernel
-	  releases.
-
-	  If enabled, this option will also move any device structures
-	  that belong to a class, back into the /sys/class heirachy, in
-	  order to support older versions of udev.
-
-	  If you are using a distro that was released in 2006 or later,
-	  it should be safe to say N here.
 
 config RELAY
 	bool "Kernel->user space relay support (formerly relayfs)"
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
Greg KH wrote: On Mon, Mar 05, 2007 at 07:59:50AM -0500, Theodore Tso wrote: Ok, how about the following patch. Is it acceptable to everyone? thanks, greg k-h --- init/Kconfig | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) --- gregkh-2.6.orig/init/Kconfig +++ gregkh-2.6/init/Kconfig @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED that belong to a class, back into the /sys/class heirachy, in order to support older versions of udev. - If you are using a distro that was released in 2006 or later, - it should be safe to say N here. + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora + release from 2007 or later, it should be safe to say N here. + + If you are using Debian or other distros that are slow to + update HAL, please say Y here. + + If you have any problems with devices not being found properly + from userspace programs, and this option is disabled, say Y + here. + + If you are unsure about this at all, say Y. config RELAY bool "Kernel->user space relay support (formerly relayfs)" Since it appears you're trying to offend people with this patch, it would seem appropriate to call someone's mother a "bad" name. This may be in the style guide; perhaps I should submit a patch. -- Jeffrey Hundstad PS: Humor (really!) relax. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().
On Mon, Mar 05, 2007 at 01:01:16PM -0800, Jim Chow wrote:
> On Tue, 6 Mar 2007, Herbert Xu wrote:
> > msg_flags [...] its initial value is not used.
>
> That's not true. Please see my post.
>
> Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that
> udp_recvmsg() can randomly ignore whether the HW has computed a checksum
> and compute it in SW redundantly.

Sorry, you're right. This bug has been there for years.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [Bugme-new] [Bug 8132] New: pptp server lockup in ppp_asynctty_receive()
On Mon, 5 Mar 2007 14:26:30 -0800 [EMAIL PROTECTED] wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=8132 > >Summary: pptp server lockup in ppp_asynctty_receive() > Kernel Version: 2.6.20 > Status: NEW > Severity: high > Owner: [EMAIL PROTECTED] > Submitter: [EMAIL PROTECTED] > CC: [EMAIL PROTECTED] > > > Already several kernel releases i've expirienced different lockups of vpn > (pptp) server. > There is more then 200 ppp connections sometimes. > With kernel debug i was able to retrive next information: > > First: > Showing all locks held in the system: > 1 lock held by agetty/4486: > #0: (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b > 1 lock held by agetty/4487: > #0: (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b > 1 lock held by agetty/4488: > #0: (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b > 2 locks held by pptpctrl/4500: > #0: (&tty->atomic_write_lock){--..}, at: [] tty_write+0x83/0x1d0 > #1: (&ap->recv_lock){}, at: [] > ppp_asynctty_receive+0x2e/0x710 > > = > BUG: spinlock lockup on CPU#1, pppd/4504, df5048c4 > [] _raw_spin_lock+0x100/0x134 > [] ppp_async_ioctl+0xa7/0x1d0 > [] ppp_ioctl+0xa5/0xbff > [] down_read+0x29/0x3a > [] ppp_async_ioctl+0x0/0x1d0 > [] ppp_ioctl+0xce/0xbff > [] _spin_unlock+0x14/0x1c > [] do_wp_page+0x256/0x4ba > [] __handle_mm_fault+0x74e/0xa22 > [] do_ioctl+0x64/0x6d > [] vfs_ioctl+0x50/0x273 > [] sys_ioctl+0x34/0x50 > [] sysenter_past_esp+0x5f/0x99 > === > BUG: soft lockup detected on CPU#0! 
> [] softlockup_tick+0x8d/0xbc > [] update_process_times+0x28/0x5e > [] smp_apic_timer_interrupt+0x80/0x9c > [] apic_timer_interrupt+0x33/0x38 > [] delay_tsc+0x9/0x13 > [] __delay+0x6/0x7 > [] _raw_spin_lock+0xa9/0x134 > [] tty_write+0x83/0x1d0 > [] tty_ldisc_try+0x2f/0x33 > [] lock_kernel+0x19/0x24 > [] tty_write+0x10b/0x1d0 > [] write_chan+0x0/0x320 > [] vfs_write+0x87/0xf0 > [] tty_write+0x0/0x1d0 > [] sys_write+0x41/0x6a > [] sysenter_past_esp+0x5f/0x99 > === > > > Second) > <0>BUG: spinlock lockup on CPU#0, pppd/5209, de3e2884 > [] _raw_spin_lock+0x100/0x134 > BUG: spinlock lockup on CPU#1, ip-down/7524, c0353300 > [] _raw_spin_lock+0x100/0x134 > [] lock_kernel+0x19/0x24 > [] chrdev_open+0x8a/0x16e > [] chrdev_open+0x0/0x16e > [] __dentry_open+0xaf/0x1a0 > [] nameidata_to_filp+0x31/0x3a > [] do_filp_open+0x39/0x40 > [] _spin_unlock+0x14/0x1c > [] get_unused_fd+0xaa/0xbb > [] do_sys_open+0x3a/0x6d > [] sys_open+0x1c/0x20 > [] sysenter_past_esp+0x5f/0x99 > === > [] ppp_async_ioctl+0xa7/0x1d0 > [] ppp_ioctl+0xa5/0xbff > [] down_read+0x29/0x3a > [] ppp_async_ioctl+0x0/0x1d0 > [] ppp_ioctl+0xce/0xbff > [] _spin_unlock+0x14/0x1c > [] do_wp_page+0x256/0x4ba > [] __handle_mm_fault+0x74e/0xa22 > [] do_ioctl+0x64/0x6d > [] vfs_ioctl+0x50/0x273 > [] sys_ioctl+0x34/0x50 > [] sysenter_past_esp+0x5f/0x99 > === > > Third) > BUG: soft lockup detected on CPU#0! > [] softlockup_tick+0x8d/0xbc > [] update_process_times+0x28/0x5e > [] smp_apic_timer_interrupt+0x80/0x9c > [] apic_timer_interrupt+0x33/0x38 > [] delay_tsc+0x9/0x13 > [] __delay+0x6/0x7 > [] _raw_spin_lock+0xa9/0x134 > [] tty_ldisc_try+0x2f/0x33 > [] lock_kernel+0x19/0x24 > [] tty_read+0x5a/0xbe > [] vfs_read+0x85/0xee > [] tty_read+0x0/0xbe > [] sys_read+0x41/0x6a > [] sysenter_past_esp+0x5f/0x99 > === > BUG: soft lockup detected on CPU#0! 
> [] softlockup_tick+0x8d/0xbc > [] update_process_times+0x28/0x5e > [] smp_apic_timer_interrupt+0x80/0x9c > [] apic_timer_interrupt+0x33/0x38 > [] prio_tree_insert+0xe8/0x23b > [] _raw_spin_lock+0xaf/0x134 > [] tty_ldisc_try+0x2f/0x33 > [] lock_kernel+0x19/0x24 > [] tty_read+0x5a/0xbe > [] vfs_read+0x85/0xee > [] tty_read+0x0/0xbe > [] sys_read+0x41/0x6a > [] sysenter_past_esp+0x5f/0x99 > > > Next via SysRq: > > Showing all locks held in the system: > 1 lock held by agetty/5057: > #0: (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b > 1 lock held by agetty/5058: > #0: (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b > 1 lock held by agetty/5059: > #0: (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b > 2 locks held by pptpctrl/5071: > #0: (&tty->atomic_write_lock){--..}, at: [] tty_write+0x83/0x1d0 > #1: (&ap->recv_lock){}, at: [] > ppp_asynctty_receive+0x2e/0x710 > > > ~#SysRq : Show Blocked State > > freesibling > task PCstack pid father child younger older > pptpctrl D C02A18E0 0 5071 4646 50745094 5064 (L-TLB) >df3a3bd0 0082 0029b837 c02a18e0 0246 dd4f131c > dd563cac >def86030 c140864c 0009 def86030 2ccaa8e5 > 017d >
ignore; Re: "skge 0000:01:0a.0: unsupported phy type 0x0"
Ignore this. I rebooted into the wrong kernel and was testing with 2.6.16 instead of 2.6.20. It works fine with 2.6.20.

-Chris

On Mon, 5 Mar 2007, Chris Stromsoe wrote:

I have a bunch of dual-port SK 98xx cards that work with sk98lin but not with skge. After loading skge, I get

ACPI: PCI Interrupt :01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> IRQ 10
skge :01:0a.0: unsupported phy type 0x0
ACPI: PCI interrupt for device :01:0a.0 disabled
skge: probe of :01:0a.0 failed with error -95

lspci -vv output for the card:

:01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx Gigabit Ethernet Server Adapter (rev 12)
    Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet Server Adapter (SK-NET GE-SX dual link)
    Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-
    Interrupt: pin A routed to IRQ 10
    Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K]
    Region 1: I/O ports at d800 [size=256]
    Expansion ROM at ff40 [disabled] [size=128K]
    Capabilities: [48] Power Management version 1
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [50] Vital Product Data

-Chris
Re: [PATCH] natsemi: netpoll fixes
[Once more with CCs]

On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:

> #ifdef CONFIG_NET_POLL_CONTROLLER
> static void natsemi_poll_controller(struct net_device *dev)
> {
> +	struct netdev_private *np = netdev_priv(dev);
> +
> 	disable_irq(dev->irq);
> -	intr_handler(dev->irq, dev);
> +
> +	/*
> +	 * A real interrupt might have already reached us at this point
> +	 * but NAPI might still haven't called us back. As the interrupt
> +	 * status register is cleared by reading, we should prevent an
> +	 * interrupt loss in this case...
> +	 */
> +	if (!np->intr_status)
> +		intr_handler(dev->irq, dev);
> +
> 	enable_irq(dev->irq);

Is it possible for this to run at the same time as the NAPI poll? If so then it is possible for the netpoll poll to run between np->intr_status being cleared and netif_rx_complete() being called. If the hardware asserts an interrupt at the wrong moment then this could cause the

In any case, this is a problem independently of netpoll if the chip shares an interrupt with anything so the interrupt handler should be fixed to cope with this situation instead.

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."
RE: [PATCH] s2io: add PCI error recovery support
Comments on this patch - 1. device_close_flag is unused and is not required. > +static pci_ers_result_t s2io_io_error_detected(struct pci_dev *pdev, > + pci_channel_state_t state) > +{ ... > + do_s2io_card_down(sp, 0); > + sp->device_close_flag = TRUE; /* Device is shut down. */ 2. s2io_reset can fail to reset the device. Ideally s2io_reset should return a failure in this case (return is void now) and in this case could s2io_io_slot_reset() be called again, maybe try thrice, in total, before failing to reset the slot? Ram > -Original Message- > From: Linas Vepstas [mailto:[EMAIL PROTECTED] > Sent: Thursday, February 15, 2007 3:09 PM > To: Ramkrishna Vepa; Raghavendra Koushik; Ananda Raju > Cc: Wen Xiong; linux-kernel@vger.kernel.org; linux- > [EMAIL PROTECTED]; netdev@vger.kernel.org; Jeff Garzik; Andrew > Morton > Subject: [PATCH] s2io: add PCI error recovery support > > > Koushik, Raju, > > Please review, comment, and if you find this acceptable, > please forward upstream. This patch incorporates all of > fixes resulting from the last set of discussions, circa > November 2006. > > --linas > > This patch adds PCI error recovery support to the > s2io 10-Gigabit ethernet device driver. Fourth revision, > blocks interrupts and the watchdog. Adds a flag to > s2io_down(), to avoid doing I/O when PCI bus is offline. > > Tested, seems to work well. 
> > Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]> > Acked-by: Ramkrishna Vepa <[EMAIL PROTECTED]> > Cc: Raghavendra Koushik <[EMAIL PROTECTED]> > Cc: Ananda Raju <[EMAIL PROTECTED]> > Cc: Wen Xiong <[EMAIL PROTECTED]> > > > drivers/net/s2io.c | 116 > ++--- > drivers/net/s2io.h |5 ++ > 2 files changed, 116 insertions(+), 5 deletions(-) > > Index: linux-2.6.20-git4/drivers/net/s2io.c > === > --- linux-2.6.20-git4.orig/drivers/net/s2io.c 2007-02-15 > 15:39:35.0 -0600 > +++ linux-2.6.20-git4/drivers/net/s2io.c 2007-02-15 16:15:10.0 - > 0600 > @@ -435,11 +435,18 @@ static struct pci_device_id s2io_tbl[] _ > > MODULE_DEVICE_TABLE(pci, s2io_tbl); > > +static struct pci_error_handlers s2io_err_handler = { > + .error_detected = s2io_io_error_detected, > + .slot_reset = s2io_io_slot_reset, > + .resume = s2io_io_resume, > +}; > + > static struct pci_driver s2io_driver = { >.name = "S2IO", >.id_table = s2io_tbl, >.probe = s2io_init_nic, >.remove = __devexit_p(s2io_rem_nic), > + .err_handler = &s2io_err_handler, > }; > > /* A simplifier macro used both by init and free shared_mem Fns(). 
*/ > @@ -2577,6 +2584,9 @@ static void s2io_netpoll(struct net_devi > u64 val64 = 0xULL; > int i; > > + if (pci_channel_offline(nic->pdev)) > + return; > + > disable_irq(dev->irq); > > atomic_inc(&nic->isr_cnt); > @@ -3079,6 +3089,8 @@ static void alarm_intr_handler(struct s2 > int i; > if (atomic_read(&nic->card_state) == CARD_DOWN) > return; > + if (pci_channel_offline(nic->pdev)) > + return; > nic->mac_control.stats_info->sw_stat.ring_full_cnt = 0; > /* Handling the XPAK counters update */ > if(nic->mac_control.stats_info->xpak_stat.xpak_timer_count < 72000) > { > @@ -4117,6 +4129,10 @@ static irqreturn_t s2io_isr(int irq, voi > struct mac_info *mac_control; > struct config_param *config; > > + /* Pretend we handled any irq's from a disconnected card */ > + if (pci_channel_offline(sp->pdev)) > + return IRQ_NONE; > + > atomic_inc(&sp->isr_cnt); > mac_control = &sp->mac_control; > config = &sp->config; > @@ -6188,7 +6204,7 @@ static void s2io_rem_isr(struct s2io_nic > } while(cnt < 5); > } > > -static void s2io_card_down(struct s2io_nic * sp) > +static void do_s2io_card_down(struct s2io_nic * sp, int do_io) > { > int cnt = 0; > struct XENA_dev_config __iomem *bar0 = sp->bar0; > @@ -6203,7 +6219,8 @@ static void s2io_card_down(struct s2io_n > atomic_set(&sp->card_state, CARD_DOWN); > > /* disable Tx and Rx traffic on the NIC */ > - stop_nic(sp); > + if (do_io) > + stop_nic(sp); > > s2io_rem_isr(sp); > > @@ -6211,7 +6228,7 @@ static void s2io_card_down(struct s2io_n > tasklet_kill(&sp->task); > > /* Check if the device is Quiescent and then Reset the NIC */ > - do { > + while(do_io) { > /* As per the HW requirement we need to replenish the >* receive buffer to avoid the ring bump. Since there is >* no intention of processing the Rx frame at this pointwe are > @@ -6236,8 +6253,9 @@ static void s2io_card_down(struct s2io_n > (unsigned long long) val64); >
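Point 2 of the review comments above (have the reset report failure, and try up to three times in total before giving up on the slot) could be sketched roughly as below. This is only an illustration in plain C: reset_fn, reset_with_retries() and flaky_reset() are hypothetical names that do not exist in the s2io driver, and the real s2io_reset() returns void in the patch under review.

```c
#include <stddef.h>

/* Hypothetical reset callback: returns 0 on success, nonzero on failure. */
typedef int (*reset_fn)(void *ctx);

#define MAX_RESET_ATTEMPTS 3 /* "try thrice, in total" */

/*
 * Call reset() until it succeeds or MAX_RESET_ATTEMPTS is reached.
 * Returns 0 on success, otherwise the error from the last attempt.
 */
static int reset_with_retries(reset_fn reset, void *ctx)
{
	int err = -1;
	int attempt;

	for (attempt = 0; attempt < MAX_RESET_ATTEMPTS; attempt++) {
		err = reset(ctx);
		if (err == 0)
			break;
	}
	return err;
}

/*
 * Illustration-only callback: fails while *remaining_failures is
 * positive, then succeeds. -5 stands in for an -EIO style error.
 */
static int flaky_reset(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -5;
	}
	return 0;
}
```

A slot_reset handler built this way would only report a failed recovery after the last attempt has failed, which seems to be the behaviour the comment asks about.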
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 01:55:30PM -0600, Matt Mackall wrote: > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > > Ok, how about the following patch. Is it acceptable to everyone? > > > > - If you are using a distro that was released in 2006 or later, > > - it should be safe to say N here. > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > > + release from 2007 or later, it should be safe to say N here. > > + > > + If you are using Debian or other distros that are slow to > > + update HAL, please say Y here. > > What HAL version do you think Debian ought to have, pray tell? And > what the hell version do those other distros have? > > The last HAL release was 0.5.8 on 11-Sep-2006. It showed up in > Debian/unstable on 2-Oct. There have been six Debian bugfix releases, > the most recent on 12-Feb. > > http://people.freedesktop.org/~david/dist/ > http://packages.debian.org/changelogs/pool/main/h/hal/hal_0.5.8.1-6.1/changelog Ok, I only named HAL as that is what people have told me the problem is. I have been running this change on my boxes, without CONFIG_SYSFS_DEPRECATED, since last July or so. I don't use NetworkManager here for the most part, but I have tried this in the OpenSuse10.3 alpha releases and it seems to work just fine with whatever version of NetworkManager it uses. So perhaps it's some wrapper scripts somewhere? I think SuSE had some odd things hard coded somewhere that prevented 10.1 from working properly with this change. Ok, so I'll drop the HAL wording above, what should I say instead? thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: "skge 0000:01:0a.0: unsupported phy type 0x0"
On Mon, 5 Mar 2007, Stephen Hemminger wrote: What kernel version. Type 0 is XMAC support, and that was added to a fairly recent kernel (2.6.19?) It was an old kernel. I booted into 2.6.16 instead of 2.6.20. See my follow-up (and ignore the report). -Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: "skge 0000:01:0a.0: unsupported phy type 0x0"
On Mon, 5 Mar 2007 13:48:29 -0800 (PST) Chris Stromsoe <[EMAIL PROTECTED]> wrote: > I have a bunch of dual-port SK 98xx cards that work with sk98lin but not > with skge. After loading skge, I get > > ACPI: PCI Interrupt :01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> > IRQ 10 > skge :01:0a.0: unsupported phy type 0x0 > ACPI: PCI interrupt for device :01:0a.0 disabled > skge: probe of :01:0a.0 failed with error -95 > > > lspci -vv output for the card: > > :01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx > Gigabit Ethernet Server Adapter (rev 12) > Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet > Server Adapter (SK-NET GE-SX dual link) > Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- > Stepping- SERR+ FastB2B- > Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- > SERR- Interrupt: pin A routed to IRQ 10 > Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K] > Region 1: I/O ports at d800 [size=256] > Expansion ROM at ff40 [disabled] [size=128K] > Capabilities: [48] Power Management version 1 > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA > PME(D0-,D1-,D2-,D3hot-,D3cold-) > Status: D0 PME-Enable- DSel=0 DScale=1 PME- > Capabilities: [50] Vital Product Data > What kernel version. Type 0 is XMAC support, and that was added to a fairly recent kernel (2.6.19?) -- Stephen Hemminger <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().
On Tue, 6 Mar 2007, Herbert Xu wrote: > msg_flags [...] its initial value is not used. That's not true. Please see my post. Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that udp_recvmsg() can randomly ignore whether the HW has computed a checksum and compute it in SW redundantly. Jim - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote: > On Mon, Mar 05, 2007 at 01:13:26AM -0600, Matt Mackall wrote: > > That's not the point. The point is that Debian/unstable as of _this > > morning_ doesn't work. For reference, I'm running both the latest > > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And > > there are people telling me I need a copy of HAL out of git that > > hasn't even been released for Debian to package. Debian isn't the > > problem here. > > hal 0.5.9-rc1 (released, not from git) should work. It will be > problably released soon and picked by sane distributions. Debian is very > irritating corner case. As of right now, Fedora Core 6 has hal-0.5.8.1-6.fc6. This is also too old. Please, stop claiming that Debian unstable is some corner case. No one is talking about Debian stable here. No one is talking about the Enterprise versions of Red Hat or SuSE (you'd find them just as irritating with modern kernels). Debian unstable tracks released code as fast or faster than Fedora and OpenSuSE. They all keep up with releases. But the last release of hal is 0.5.8.1. _Release_, not "release candidate". You can't break that. You can't break it for a while, if you want a sane deprecation schedule. These are userspace interfaces. Matt is absolutely correct that you shouldn't deprecate a userspace<->kernel interface before you've even provided a release of the tool that detects the change! Joel -- "When ideas fail, words come in very handy." - Goethe Joel Becker Principal Software Developer Oracle E-mail: [EMAIL PROTECTED] Phone: (650) 506-8127 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().
Jim Chow <[EMAIL PROTECTED]> wrote: > After inspection of some networking code, it seems there is a use of > uninitialized data in udp_recvmsg(), > linux-2.6.20.1/net/ipv4/udp.c:843, while testing msg->msg_flags (see > the backtrace below). It looks like sys_recvfrom() is not msg_flags is set on return and its initial value is not used. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
"skge 0000:01:0a.0: unsupported phy type 0x0"
I have a bunch of dual-port SK 98xx cards that work with sk98lin but not with skge. After loading skge, I get ACPI: PCI Interrupt :01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> IRQ 10 skge :01:0a.0: unsupported phy type 0x0 ACPI: PCI interrupt for device :01:0a.0 disabled skge: probe of :01:0a.0 failed with error -95 lspci -vv output for the card: :01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx Gigabit Ethernet Server Adapter (rev 12) Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet Server Adapter (SK-NET GE-SX dual link) Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- http://vger.kernel.org/majordomo-info.html
Re: [PATCH] twcal_jiffie should be unsigned long, not int
From: Eric Dumazet <[EMAIL PROTECTED]> Date: Mon, 5 Mar 2007 16:09:21 +0100 > While browsing include/net/inet_timewait_sock.h, I found this buggy > definition > of twcal_jiffie. > > int twcal_jiffie; > > I wonder how inet_twdr_twcal_tick() can really works on x86_64 > > This seems quite an old bug, it was there before introduction of > inet_timewait_death_row made by Arnaldo Carvalho de Melo. > > [PATCH] twcal_jiffie should be unsigned long, not int > > Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]> Grrr, good catch Eric. I'll push this fix to -stable too. Thanks a lot. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
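To make the problem with an int-typed jiffies field concrete, here is a small userspace sketch. It is not the kernel code: time_after64() and stored_in_int() are stand-ins written for this note, and a 64-bit counter is modelled with uint64_t so the example behaves the same on any host.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in for the kernel's time_after() on a 64-bit jiffies counter:
 * true when a is later than b, tolerant of counter wraparound.
 */
static bool time_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Models a deadline kept in an `int` field: the upper 32 bits are lost
 * on the store (implementation-defined, modulo 2^32 on common compilers)
 * and the value is sign-extended when it is read back for comparison.
 */
static uint64_t stored_in_int(uint64_t deadline)
{
	int field = (int)deadline;

	return (uint64_t)(int64_t)field;
}
```

Once the counter has passed 2^32, a deadline 500 ticks in the future reads back from the int field as a small number, so the comparison sees it as long expired. With an unsigned long field the value survives intact.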
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote: > > That's not the point. The point is that Debian/unstable as of _this > > morning_ doesn't work. For reference, I'm running both the latest > > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And > > there are people telling me I need a copy of HAL out of git that > > hasn't even been released for Debian to package. Debian isn't the > > problem here. > > hal 0.5.9-rc1 (released, not from git) should work. It will be > problably released soon and picked by sane distributions. Debian is very > irritating corner case. Presumably the -rc1 stands for "release candidate". Which means "not yet released". And when did it show up? 04-Mar-2007 at 18:31. That's right, YESTERDAY. Almost a full month after Greg's commit. For the last time, DEBIAN IS NOT THE PROBLEM. -- Mathematics is the supreme nostalgia of our time. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] Use of uninitialized data in udp_recvmsg().
After inspection of some networking code, it seems there is a use of uninitialized data in udp_recvmsg(), linux-2.6.20.1/net/ipv4/udp.c:843, while testing msg->msg_flags (see the backtrace below). It looks like sys_recvfrom() is not initializing msg.msg_flags and, along the path given below, msg_flags is tested (at #0) without (necessarily) being written to. A simple fix for this particular problem is given below. Alternatively, udp_recvmsg() could be changed to initialize msg_flags for its caller, since udp_recvmsg() (always? [*]) uses msg_flags as an output argument. In any case, I wanted to verify the bug with the networking gurus to see if they agree. #0 udp_recvmsg (linux-2.6.20.1/net/ipv4/udp.c:843) #1 sock_common_recvmsg (linux-2.6.20.1/net/core/sock.c:1617) #2 sock_recvmsg (linux-2.6.20.1/net/socket.c:630) #3 sys_recvfrom (linux-2.6.20.1/net/socket.c:1608) #4 sys_socketcall (linux-2.6.20.1/net/socket.c:2007) #5 syscall_call (linux-2.6.20.1/arch/i386/kernel/entry.S:0) Index: linux-2.6.20.1/net/socket.c === --- linux-2.6.20.1.orig/net/socket.c +++ linux-2.6.20.1/net/socket.c @@ -1601,6 +1601,7 @@ iov.iov_base = ubuf; msg.msg_name = address; msg.msg_namelen = MAX_SOCK_ADDR; + msg.msg_flags = 0; if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; err = sock_recvmsg(sock, &msg, size, flags); -- [*] Although do_sock_read() linux-2.6.20.1/net/socket.c:704, for one, seems to want to initialize msg_flags nonzero, so maybe not. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
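For comparison, the same defensive pattern in userspace C: zero the whole struct msghdr before filling in the fields the receive path needs, so msg_flags never starts out as stack garbage. This is a sketch, not the kernel fix above (which only adds msg.msg_flags = 0); init_recv_msghdr() is a hypothetical helper name.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Prepare a msghdr for a single-buffer receive. Zeroing the struct
 * first means msg_flags and the control-message fields start from a
 * known value even if the caller's storage was uninitialized.
 */
static void init_recv_msghdr(struct msghdr *msg, struct iovec *iov,
			     void *buf, size_t len,
			     void *name, socklen_t namelen)
{
	memset(msg, 0, sizeof(*msg));
	iov->iov_base = buf;
	iov->iov_len = len;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_name = name;
	msg->msg_namelen = namelen;
}
```

Zeroing everything costs a few extra stores compared with setting the one field, but it also covers any member added to the structure later.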
RE: [PATCH] xfrm_policy delete security check misplaced
On Mon, 2007-03-05 at 11:39 -0500, James Morris wrote: > On Mon, 5 Mar 2007, Venkat Yekkirala wrote: > > > > > > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]> > > Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> > > What about your previous comment: > > "I guess you meant to do this here? > else if (err) > return err; " That also gets taken care of in the pfkey_spdget cleanup in a later patch. The return isn't in that same place venkat suggested it instead happens inside the new if (delete) block. (err is only non-zero on delete operations so there is no need to check it otherwise) -Eric - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying
On Monday March 5, [EMAIL PROTECTED] wrote: > > Hi Neil, > > here's another minor comment: > > On Friday 02 March 2007 05:28, NeilBrown wrote: > > +static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp, > > + struct cmsghdr *cmh) > > { > > switch (rqstp->rq_sock->sk_sk->sk_family) { > > case AF_INET: { > > + struct in_pktinfo *pki = CMSG_DATA(cmh); > > + rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr; > > break; > > + } > ... > > The daddr that is extracted here will only ever be used to build > another PKTINFO cmsg when sending the reply. So it would be > much easier to just store the raw control message in the svc_rqst, > without looking at its contents, and send it out along with the reply, > unchanged. Yes, sounds tempting, doesn't it? Unfortunately it isn't that simple, as I found out when the sunrpc code in glibc did exactly that. You see, sendmsg will use the interface-number as well as the source address from the PKTINFO structure. Suppose my server has two interfaces (A and B) on two subnets that both are connected to some router which is connected to a third subnet that my client is on. Further, suppose my server has only one default route, out interface A. The client chooses the IP address of interface B and sends a request. It arrives on interface B and is processed. If the PKTINFO received is passed unchanged to sendmsg, the packet will be sent out interface B. But interface B doesn't have a route to that client, so the packet is dropped. This is exactly what was happening for me with mountd a few years ago. So yes, we could just zero the interface field, but I think it is clearer to extract the wanted data, then re-insert it. They really are different structures with different meanings (send versus receive) which happen to have the same layout. 
Thanks, NeilBrown - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
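The "extract the wanted data, then re-insert it" approach described above might look like this in userspace terms. A sketch only: reply_pktinfo() is a hypothetical helper, and it assumes Linux's struct in_pktinfo from <netinet/in.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* some libc configurations need this for struct in_pktinfo */
#endif
#include <netinet/in.h>
#include <string.h>

/*
 * Build the PKTINFO to attach to a reply from the one received with
 * the request. Only the local address is carried over; ipi_ifindex is
 * left at 0 so the routing table, not the arrival interface, decides
 * where the reply goes out.
 */
static struct in_pktinfo reply_pktinfo(const struct in_pktinfo *rx)
{
	struct in_pktinfo tx;

	memset(&tx, 0, sizeof(tx));
	tx.ipi_spec_dst = rx->ipi_spec_dst;
	return tx;
}
```

In the two-interface scenario above, the reply would then carry interface B's address as its source while still leaving through interface A, where the default route is.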
[PATCH] natsemi: netpoll fixes
Fix two issues in this driver's netpoll path: one usual, with spin_unlock_irq() enabling interrupts which nobody asks it to do (that has been fixed recently in a number of drivers) and one unusual, with poll_controller() method possibly causing loss of interrupts due to the interrupt status register being cleared by a simple read and the interrupt handler simply storing it, not accumulating. Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]> --- drivers/net/natsemi.c | 24 +++- 1 files changed, 19 insertions(+), 5 deletions(-) Index: linux-2.6/drivers/net/natsemi.c === --- linux-2.6.orig/drivers/net/natsemi.c +++ linux-2.6/drivers/net/natsemi.c @@ -2024,6 +2024,7 @@ static int start_tx(struct sk_buff *skb, struct netdev_private *np = netdev_priv(dev); void __iomem * ioaddr = ns_ioaddr(dev); unsigned entry; + unsigned long flags; /* Note: Ordering is important here, set the field with the "ownership" bit last, and only then increment cur_tx. */ @@ -2037,7 +2038,7 @@ static int start_tx(struct sk_buff *skb, np->tx_ring[entry].addr = cpu_to_le32(np->tx_dma[entry]); - spin_lock_irq(&np->lock); + spin_lock_irqsave(&np->lock, flags); if (!np->hands_off) { np->tx_ring[entry].cmd_status = cpu_to_le32(DescOwn | skb->len); @@ -2056,7 +2057,7 @@ static int start_tx(struct sk_buff *skb, dev_kfree_skb_irq(skb); np->stats.tx_dropped++; } - spin_unlock_irq(&np->lock); + spin_unlock_irqrestore(&np->lock, flags); dev->trans_start = jiffies; @@ -,6 +2223,8 @@ static void netdev_rx(struct net_device pkt_len = (desc_status & DescSizeMask) - 4; if ((desc_status&(DescMore|DescPktOK|DescRxLong)) != DescPktOK){ if (desc_status & DescMore) { + unsigned long flags; + if (netif_msg_rx_err(np)) printk(KERN_WARNING "%s: Oversized(?) Ethernet " @@ -2236,12 +2239,12 @@ static void netdev_rx(struct net_device * reset procedure documented in * AN-1287. 
*/ - spin_lock_irq(&np->lock); + spin_lock_irqsave(&np->lock, flags); reset_rx(dev); reinit_rx(dev); writel(np->ring_dma, ioaddr + RxRingPtr); check_link(dev); - spin_unlock_irq(&np->lock); + spin_unlock_irqrestore(&np->lock, flags); /* We'll enable RX on exit from this * function. */ @@ -2396,8 +2399,19 @@ static struct net_device_stats *get_stat #ifdef CONFIG_NET_POLL_CONTROLLER static void natsemi_poll_controller(struct net_device *dev) { + struct netdev_private *np = netdev_priv(dev); + disable_irq(dev->irq); - intr_handler(dev->irq, dev); + + /* +* A real interrupt might have already reached us at this point +* but NAPI might still haven't called us back. As the interrupt +* status register is cleared by reading, we should prevent an +* interrupt loss in this case... +*/ + if (!np->intr_status) + intr_handler(dev->irq, dev); + enable_irq(dev->irq); } #endif - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
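The second issue (a clear-on-read status register combined with a handler that stores rather than accumulates) can be shown with a toy model. Nothing here is natsemi code: struct toy_nic and the three functions are made up for illustration, and only the guard in poll_guarded() mirrors the check added by the patch.

```c
#include <stdint.h>

/* Toy model of a NIC with a clear-on-read interrupt status register. */
struct toy_nic {
	uint32_t hw_status;   /* bits latched by the "hardware" */
	uint32_t intr_status; /* driver's saved copy, consumed later by poll */
};

/* Reading the register returns the pending bits and clears them. */
static uint32_t read_and_clear(struct toy_nic *nic)
{
	uint32_t v = nic->hw_status;

	nic->hw_status = 0;
	return v;
}

/* Handler that simply stores what it read, overwriting any earlier value. */
static void handler_store(struct toy_nic *nic)
{
	nic->intr_status = read_and_clear(nic);
}

/*
 * poll_controller-style entry point with the guard from the patch:
 * skip the handler while an earlier status value is still unprocessed.
 */
static void poll_guarded(struct toy_nic *nic)
{
	if (!nic->intr_status)
		handler_store(nic);
}
```

If a real interrupt has already saved a status value and the handler then runs a second time before the poll routine consumes it, the second read returns 0 and overwrites the saved bits. The guard avoids that particular overwrite. Whether it closes every race is a separate question, raised in the replies to this patch.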
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote: > Ok, how about the following patch. Is it acceptable to everyone? > > - If you are using a distro that was released in 2006 or later, > - it should be safe to say N here. > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora > + release from 2007 or later, it should be safe to say N here. > + > + If you are using Debian or other distros that are slow to > + update HAL, please say Y here. What HAL version do you think Debian ought to have, pray tell? And what the hell version do those other distros have? The last HAL release was 0.5.8 on 11-Sep-2006. It showed up in Debian/unstable on 2-Oct. There have been six Debian bugfix releases, the most recent on 12-Feb. http://people.freedesktop.org/~david/dist/ http://packages.debian.org/changelogs/pool/main/h/hal/hal_0.5.8.1-6.1/changelog The last NetworkManager is 0.6.4 released 13-Jul-2006. It showed up in Debian/unstable on 8-Aug. There have been five bugfix releases, the most recent on 30-Nov. http://ftp.gnome.org/pub/GNOME/sources/NetworkManager/0.6/ http://packages.debian.org/changelogs/pool/main/n/network-manager/network-manager_0.6.4-6/changelog Debian is NOT the problem. -- Mathematics is the supreme nostalgia of our time. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers
Stephen Hemminger wrote: > Don't bother changing netem. I have a version that uses hrtimer's > and doesn't use PSCHED() clock source anymore. Me too :) I'm going to send it with my other patches soon, if you don't like it we can still drop it. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers
On Mon, 05 Mar 2007 18:42:26 +0100 Patrick McHardy <[EMAIL PROTECTED]> wrote: > David Miller wrote: > > Frankly, I think now that we have ktime and all of the proper generic > > infrastructure to do this stuff properly, I think we should just use > > ktime for the packet scheduler across the board and just delete all of > > that old by-hand timekeeping selection crap from pkt_sched.h > > Sounds good, I'm going to remove all other clock sources. > Will resend in a couple of days after fixing a few more > problems I noticed. > Don't bother changing netem. I have a version that uses hrtimer's and doesn't use PSCHED() clock source anymore. -- Stephen Hemminger <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying
Hi Neil, here's another minor comment: On Friday 02 March 2007 05:28, NeilBrown wrote: > +static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp, > + struct cmsghdr *cmh) > { > switch (rqstp->rq_sock->sk_sk->sk_family) { > case AF_INET: { > + struct in_pktinfo *pki = CMSG_DATA(cmh); > + rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr; > break; > + } ... The daddr that is extracted here will only ever be used to build another PKTINFO cmsg when sending the reply. So it would be much easier to just store the raw control message in the svc_rqst, without looking at its contents, and send it out along with the reply, unchanged. Olaf -- Olaf Kirch | --- o --- Nous sommes du soleil we love when we play [EMAIL PROTECTED] |/ | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 07:59:50AM -0500, Theodore Tso wrote: > On Sun, Mar 04, 2007 at 05:17:29PM -0800, Greg KH wrote: > > I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is > > enabled with that patch. If that is enabled, and that patch still > > causes problems, please let me know. > > But we still need to update the help text for CONFIG_SYS_DEPRECATED to > make it clear that its deprecation schedule still needs to be 2009 to > 2011 (depending on whether we want to accomodate Debian's glacial > release schedule). Certainly the 2006 date which is currently there > simply isn't accurate. Ok, how about the following patch. Is it acceptable to everyone? thanks, greg k-h --- init/Kconfig | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) --- gregkh-2.6.orig/init/Kconfig +++ gregkh-2.6/init/Kconfig @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED that belong to a class, back into the /sys/class heirachy, in order to support older versions of udev. - If you are using a distro that was released in 2006 or later, - it should be safe to say N here. + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora + release from 2007 or later, it should be safe to say N here. + + If you are using Debian or other distros that are slow to + update HAL, please say Y here. + + If you have any problems with devices not being found properly + from userspace programs, and this option is disabled, say Y + here. + + If you are unsure about this at all, say Y. config RELAY bool "Kernel->user space relay support (formerly relayfs)" - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying
On Friday 02 March 2007 05:28, NeilBrown wrote: > The sunrpc server code needs to know the source and destination address > for UDP packets so it can reply properly. > It currently copies code out of the network stack to pick the pieces out > of the skb. > This is ugly and causes compile problems with the IPv6 stuff. ... and this IPv6 code could never have worked anyway: > case AF_INET6: { ... > - rqstp->rq_addrlen = sizeof(struct sockaddr_in); ... this should have been sizeof(sockaddr_in6)... > - /* Remember which interface received this request */ > - ipv6_addr_copy(&rqstp->rq_daddr.addr6, > - &skb->nh.ipv6h->saddr); and this should have copied from daddr, not saddr. But I find using recvmsg just for getting at the addresses a little awkward too. And I think to be on the safe side, you should check that you're really looking at a PKTINFO cmsg rather than something else. Olaf -- Olaf Kirch | --- o --- Nous sommes du soleil we love when we play [EMAIL PROTECTED] |/ | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
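The check suggested at the end of this message (make sure the control message really is a PKTINFO before interpreting its payload) could look like the following in userspace C. This is a sketch under assumptions: pktinfo_from_cmsg() is a hypothetical helper, and the constants are the Linux ones from <netinet/in.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* some libc configurations need this for struct in_pktinfo */
#endif
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Return the payload of cmh if it is an IPv4 IP_PKTINFO control
 * message large enough to hold a struct in_pktinfo, otherwise NULL.
 */
static const struct in_pktinfo *pktinfo_from_cmsg(const struct cmsghdr *cmh)
{
	if (cmh == NULL)
		return NULL;
	if (cmh->cmsg_level != IPPROTO_IP || cmh->cmsg_type != IP_PKTINFO)
		return NULL;
	if (cmh->cmsg_len < CMSG_LEN(sizeof(struct in_pktinfo)))
		return NULL;
	return (const struct in_pktinfo *)CMSG_DATA(cmh);
}
```

A caller would fall back to some default (or drop the request) when this returns NULL instead of reading whatever bytes happen to follow the header.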
Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers
David Miller wrote: > Frankly, I think now that we have ktime and all of the proper generic > infrastructure to do this stuff properly, I think we should just use > ktime for the packet scheduler across the board and just delete all of > that old by-hand timekeeping selection crap from pkt_sched.h Sounds good, I'm going to remove all other clock sources. Will resend in a couple of days after fixing a few more problems I noticed. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] xfrm_policy delete security check misplaced
> > > > > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]> > > Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> > > What about your previous comment: > > "I guess you meant to do this here? > else if (err) > return err; " I saw that this was taken care of in patch-2 for the delete case, but while err isn't currently applicable to the non-delete case, it would be proper/complete for err to still be handled for the non-delete case. Thanks for asking. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SWS for rcvbuf < MTU
On March 3, 2007 06:40:12 pm John Heffner wrote: > David Miller wrote: > > From: John Heffner <[EMAIL PROTECTED]> > > Date: Fri, 02 Mar 2007 16:16:39 -0500 > > > >> Please don't apply the patch I sent. I've been thinking about this a > >> bit harder, and it may not fix this particular problem. (Hard to say > >> without knowing exactly what it is.) As the comment above > >> __tcp_select_window() states, we do not do full receive-side SWS > >> avoidance because of header prediction. > >> > >> Alex, you're right I missed that special zero-window case. I'm still > >> not quite sure I'm completely happy with this patch. I'd like to think > >> about this a little bit harder... > > > > Ok > > Alright, I've thought about it a bit more, and I think the patch I sent > should work. Alex, any opinion? Any way you can test this out? Here are the values from live kernel (obtained with 'crash') when the host was in SWS state: full_space=708 full_space/2=354 free_space=393 window=76 In this case the test from my original fix, (window < full_space/2), succeeds. But John's test free_space > window + full_space/2 393 430 does not. So I suspect that the new fix will not always work. From tcpdump traces we can see that both hosts exchange with 76-byte packets for a long time. From customer's application log we see that it continues to read 76-byte chunks per each read() call - even though more than that is available in the receive buffer. Technically it's OK for read() to return even after reading one byte, so if sk->receive_queue contains multiple 76-byte skbuffs we may return after processing just one skbuff (but we we don't understand the details of why this happens on customer's system). Are there any particular reasons why you want to postpone window update until free_space becomes > window + full_space/2 and not as soon as free_space > full_space/2? 
As the only real-life occurrence of SWS shows free_space oscillating slightly above full_space/2, I created the fix specifically to match this phenomenon as seen on the customer's host. We reach the modified section only when (free_space > full_space/2), so it should be OK to update the window at this point if mss==full_space. So yes, we can test John's fix on the customer's host, but I doubt it will work for the reasons mentioned above. In brief: 'window = free_space' instead of 'window = full_space/2' is OK, but the test 'free_space > window + full_space/2' is not, for the specific pattern the customer sees on his hosts. Thanks, Alex -- Alexandre Sidorenko email: [EMAIL PROTECTED] Global Solutions Engineering: Unix Networking Hewlett-Packard (Canada)
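The arithmetic in the message above can be checked with a small standalone C sketch. The function names original_test and revised_test are made up for illustration; they only restate the two conditions quoted in the thread and are not the kernel code in __tcp_select_window().

```c
#include <stdbool.h>

/* Condition from Alex's original fix: the advertised window is
 * smaller than half of the receive buffer. */
static bool original_test(int window, int full_space)
{
    return window < full_space / 2;
}

/* Condition from John's patch: free space must exceed the current
 * window plus half of the receive buffer. */
static bool revised_test(int free_space, int window, int full_space)
{
    return free_space > window + full_space / 2;
}
```

With the values reported from the customer's host (full_space=708, free_space=393, window=76), original_test(76, 708) compares 76 against 354 and holds, while revised_test(393, 76, 708) compares 393 against 430 and does not.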
RE: [PATCH] xfrm_policy delete security check misplaced
On Mon, 5 Mar 2007, Venkat Yekkirala wrote: > > > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]> > Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> What about your previous comment: "I guess you meant to do this here? else if (err) return err; " -- James Morris <[EMAIL PROTECTED]>
Why should we teach students Linux??
Hello listers, I'm a tutor in the Faculty of ICT, department NID. This is a bachelor's degree program, and we are preparing our students to become something more than just system administrators (such as managers, consultants, etc.). Since this department is part of the Microsoft camp, the students are educated mostly in this direction, which I think is not a bad thing. It would be better still, in my opinion, if we could give our students the opportunity to meet both systems on the same level. To change the curriculum of a study program, I need a solid case. So if somebody knows a link or document about why we should educate our students in the Linux OS, or an article about the usage of Linux in companies, please send it. I hope you will all take some time to send me your best links and documents. With best regards, Roel Bindels
RE: [PATCH] xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa
> Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if > there was any permission/security failures in attempting to do the del > operation (such as permission denied from security_xfrm_state_delete). > This patch moves the audit hook to the exit path such that > all failures > (and successes) will actually get audited. I'm not sure ALL failures are being audited this way elsewhere, but I guess they will catch up in due course. > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]> Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]>
RE: [PATCH] Add xfrm policy change auditing to pfkey_spdget
> pfkey_spdget neither had an LSM security hook nor auditing for the > removal of xfrm_policy structs. The security hook was added > when it was > moved into xfrm_policy_byid instead of the callers to that function by > my earlier patch and this patch adds the auditing hooks as well. > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]> Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]>
RE: [PATCH] xfrm_policy delete security check misplaced
> > Signed-off-by: Eric Paris <[EMAIL PROTECTED]> Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]>
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 01:13:26AM -0600, Matt Mackall wrote: > On Sun, Mar 04, 2007 at 11:02:48PM -0800, Greg KH wrote: > > On Mon, Mar 05, 2007 at 12:42:29AM -0600, Matt Mackall wrote: > > > On Sun, Mar 04, 2007 at 05:16:25PM -0800, Greg KH wrote: > > > > On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote: > > > > > Recent kernels are having troubles with wireless for me. Two seemingly > > > > > related problems: > > > > > > > > > > a) NetworkManager seems oblivious to the existence of my IPW2200 > > > > > b) Manual iwconfig waits for 60s and then reports: > > > > > > > > > > Error for wireless request "Set Encode" (8B2A) : > > > > > SET failed on device eth1 ; Operation not supported. > > > > > > > > Do you have CONFIG_SYSFS_DEPRECATED enabled? If not, please do as that > > > > will keep you from having to change any userspace code. > > > > > > No, it's disabled. Will test once I'm done tracking down the iwconfig > > > problem. From the help text for SYSFS_DEPRECATED: > > > > > > If you are using a distro that was released in 2006 or > > > later, it should be safe to say N here. > > > > > > If we need an as-yet-unreleased HAL without it, I would say the above > > > should be changed to 2008 or so. If Debian actually cuts a release in > > > the next few months, you might make that 2010. > > > > Well, just because Debian has such a slow release cycle, should the rest > > of the world be forced to follow suit? :) > > > > When I originally wrote that, I thought Debian would have already done > > their release, my mistake... > > That's not the point. The point is that Debian/unstable as of _this > morning_ doesn't work. For reference, I'm running both the latest > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And > there are people telling me I need a copy of HAL out of git that > hasn't even been released for Debian to package. Debian isn't the > problem here. hal 0.5.9-rc1 (released, not from git) should work. 
It will probably be released soon and picked up by sane distributions. Debian is a very irritating corner case. -- Tomasz Torcz [EMAIL PROTECTED] Only gods can safely risk perfection, it's a dangerous thing for a man. -- Alia
[PATCH/RFC 13/13] iptables tproxy match
Implements an iptables module which matches packets which have the tproxy flag set, that is, packets diverted in the tproxy table. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- net/netfilter/Kconfig |9 + net/netfilter/Makefile|1 + net/netfilter/xt_tproxy.c | 77 + 3 files changed, 87 insertions(+), 0 deletions(-) diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig index 253fce3..b22346e 100644 --- a/net/netfilter/Kconfig +++ b/net/netfilter/Kconfig @@ -603,6 +603,15 @@ config NETFILTER_XT_MATCH_QUOTA If you want to compile it as a module, say M here and read . If unsure, say `N'. +config NETFILTER_XT_MATCH_TPROXY + tristate '"tproxy" match support' + depends on NETFILTER_XTABLES + help + This option adds a `tproxy' match, which allows you to match + packets which have been diverted to local sockets by TProxy. + + To compile it as a module, choose M here. If unsure, say N. + config NETFILTER_XT_MATCH_REALM tristate '"realm" match support' depends on NETFILTER_XTABLES diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index b2b5c75..83b2fd9 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -64,6 +64,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_MARK) += xt_mark.o obj-$(CONFIG_NETFILTER_XT_MATCH_MULTIPORT) += xt_multiport.o obj-$(CONFIG_NETFILTER_XT_MATCH_POLICY) += xt_policy.o obj-$(CONFIG_NETFILTER_XT_MATCH_PKTTYPE) += xt_pkttype.o +obj-$(CONFIG_NETFILTER_XT_MATCH_TPROXY) += xt_tproxy.o obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o obj-$(CONFIG_NETFILTER_XT_MATCH_SCTP) += xt_sctp.o diff --git a/net/netfilter/xt_tproxy.c b/net/netfilter/xt_tproxy.c new file mode 100644 index 000..53f8bee --- /dev/null +++ b/net/netfilter/xt_tproxy.c @@ -0,0 +1,77 @@ +/* + * Transparent proxy support for Linux/iptables + * + * Copyright (c) 2007 BalaBit IT Ltd. 
+ * Author: Krisztian Kovacs + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#include +#include + +#include + +static int +match(const struct sk_buff *skb, + const struct net_device *in, + const struct net_device *out, + const struct xt_match *match, + const void *matchinfo, + int offset, + unsigned int protoff, + int *hotdrop) +{ + return skb->ip_tproxy; +} + +static int +check(const char *tablename, + const void *entry, + const struct xt_match *match, + void *matchinfo, + unsigned int hook_mask) +{ + return 1; +} + +static struct xt_match tproxy_matches[] = { + { + .name = "tproxy", + .match = match, + .matchsize = 0, + .checkentry = check, + .family = AF_INET, + .me = THIS_MODULE, + }, + { + .name = "tproxy", + .match = match, + .matchsize = 0, + .checkentry = check, + .family = AF_INET6, + .me = THIS_MODULE, + }, +}; + +static int __init xt_tproxy_init(void) +{ + return xt_register_matches(tproxy_matches, ARRAY_SIZE(tproxy_matches)); +} + +static void __exit xt_tproxy_fini(void) +{ + xt_unregister_matches(tproxy_matches, ARRAY_SIZE(tproxy_matches)); +} + +module_init(xt_tproxy_init); +module_exit(xt_tproxy_fini); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Krisztian Kovacs <[EMAIL PROTECTED]>"); +MODULE_DESCRIPTION("iptables tproxy match module"); +MODULE_ALIAS("ipt_tproxy"); +MODULE_ALIAS("ip6t_tproxy"); - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 12/13] iptables TPROXY target
The TPROXY target implements redirection of non-local TCP/UDP traffic to local sockets. It is simply a wrapper around functionality exported from iptable_tproxy. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/linux/netfilter_ipv4/ipt_TPROXY.h |9 +++ net/ipv4/netfilter/Kconfig| 11 +++ net/ipv4/netfilter/Makefile |1 net/ipv4/netfilter/ipt_TPROXY.c | 92 + 4 files changed, 113 insertions(+), 0 deletions(-) diff --git a/include/linux/netfilter_ipv4/ipt_TPROXY.h b/include/linux/netfilter_ipv4/ipt_TPROXY.h new file mode 100644 index 000..d05c956 --- /dev/null +++ b/include/linux/netfilter_ipv4/ipt_TPROXY.h @@ -0,0 +1,9 @@ +#ifndef _IPT_TPROXY_H_target +#define _IPT_TPROXY_H_target + +struct ipt_tproxy_target_info { + u_int16_t lport; + u_int32_t laddr; +}; + +#endif diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig index 17c3ec8..ecd8da5 100644 --- a/net/ipv4/netfilter/Kconfig +++ b/net/ipv4/netfilter/Kconfig @@ -638,6 +638,17 @@ config IP_NF_TPROXY To compile it as a module, choose M here. If unsure, say N. +config IP_NF_TARGET_TPROXY + tristate "TPROXY target support" + depends on IP_NF_TPROXY + help + This option adds a `TPROXY' target, which is somewhat similar to + REDIRECT. It can only be used in the tproxy table and is useful + to redirect traffic to a transparent proxy. It does _not_ depend + on Netfilter connection tracking. + + To compile it as a module, choose M here. If unsure, say N. 
+ # ARP tables config IP_NF_ARPTABLES tristate "ARP tables support" diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile index 21a29f4..a50a64e 100644 --- a/net/ipv4/netfilter/Makefile +++ b/net/ipv4/netfilter/Makefile @@ -106,6 +106,7 @@ obj-$(CONFIG_IP_NF_TARGET_LOG) += ipt_LOG.o obj-$(CONFIG_IP_NF_TARGET_ULOG) += ipt_ULOG.o obj-$(CONFIG_IP_NF_TARGET_CLUSTERIP) += ipt_CLUSTERIP.o obj-$(CONFIG_IP_NF_TARGET_TTL) += ipt_TTL.o +obj-$(CONFIG_IP_NF_TARGET_TPROXY) += ipt_TPROXY.o # generic ARP tables obj-$(CONFIG_IP_NF_ARPTABLES) += arp_tables.o diff --git a/net/ipv4/netfilter/ipt_TPROXY.c b/net/ipv4/netfilter/ipt_TPROXY.c new file mode 100644 index 000..89a08b1 --- /dev/null +++ b/net/ipv4/netfilter/ipt_TPROXY.c @@ -0,0 +1,92 @@ +/* + * Transparent proxy support for Linux/iptables + * + * Copyright (c) 2006-2007 BalaBit IT Ltd. + * Author: Balazs Scheidler, Krisztian Kovacs + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +static unsigned int +target(struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + unsigned int hooknum, + const struct xt_target *target, + const void *targinfo) +{ + const struct iphdr *iph = (*pskb)->nh.iph; + const struct ipt_tproxy_target_info *tgi = + (const struct ipt_tproxy_target_info *) targinfo; + unsigned int verdict = NF_ACCEPT; + struct sk_buff *skb = *pskb; + struct udphdr _hdr, *hp; + struct sock *sk; + __be32 daddr; + __be16 dport; + + /* TCP/UDP only */ + if ((iph->protocol != IPPROTO_TCP) && + (iph->protocol != IPPROTO_UDP)) + return NF_ACCEPT; + + hp = skb_header_pointer(*pskb, iph->ihl * 4, sizeof(_hdr), &_hdr); + if (hp == NULL) + return NF_DROP; + + daddr = tgi->laddr ? : iph->daddr; + dport = tgi->lport ? 
: hp->dest; + sk = ip_tproxy_get_sock(iph->protocol, + iph->saddr, daddr, + hp->source, dport, in); + if (sk != NULL) { + if (ip_tproxy_do_divert(skb, sk, 0, in) < 0) + verdict = NF_DROP; + + if ((iph->protocol == IPPROTO_TCP) && (sk->sk_state == TCP_TIME_WAIT)) + inet_twsk_put(inet_twsk(sk)); + else + sock_put(sk); + } + + return verdict; +} + +static struct xt_target ipt_tproxy_reg = { + .name = "TPROXY", + .family = AF_INET, + .target = target, + .targetsize = sizeof(struct ipt_tproxy_target_info), + .table = "tproxy", + .me = THIS_MODULE, +}; + +static int __init init(void) +{ + return xt_register_target(&ipt_tproxy_reg); +} + +static void __exit fini(void) +{ + xt_unregister_target(&ipt_tproxy_reg); +} + +module_init(init); +module_exit(fini); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Krisztian Kovacs <[EMAIL PROTECTED]>"); +MODULE_DESCRIPTION("Netfilter transparent proxy TPROXY target modu
[PATCH/RFC 11/13] iptables tproxy table
The iptables tproxy table registers a new hook on PRE_ROUTING and for each incoming TCP/UDP packet performs as follows: 1. Does IPv4 fragment reassembly. We need this to be able to do TCP/UDP header processing. 2. Does a TCP/UDP socket hash lookup to decide whether or not the packet is sent to a non-local bound socket. If a matching socket is found and the socket has the IP_TRANSPARENT socket option enabled the skb is diverted locally and the socket reference is stored in the skb. 3. If no matching socket was found, the PREROUTING chain of the iptables tproxy table is consulted. Matching rules with the TPROXY target can do transparent redirection here. (In this case it is not necessary to have the IP_TRANSPARENT socket option enabled for the target socket, redirection takes place even for "regular" sockets. This way no modification of the application is necessary.) Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/linux/netfilter_ipv4.h |1 include/linux/netfilter_ipv4/ip_tproxy.h | 20 ++ include/net/ip.h |3 net/ipv4/netfilter/Kconfig | 10 + net/ipv4/netfilter/Makefile |1 net/ipv4/netfilter/iptable_tproxy.c | 267 ++ 6 files changed, 301 insertions(+), 1 deletions(-) diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h index ceae87a..cc4d83b 100644 --- a/include/linux/netfilter_ipv4.h +++ b/include/linux/netfilter_ipv4.h @@ -58,6 +58,7 @@ enum nf_ip_hook_priorities { NF_IP_PRI_SELINUX_FIRST = -225, NF_IP_PRI_CONNTRACK = -200, NF_IP_PRI_MANGLE = -150, + NF_IP_PRI_TPROXY = -125, NF_IP_PRI_NAT_DST = -100, NF_IP_PRI_FILTER = 0, NF_IP_PRI_NAT_SRC = 100, diff --git a/include/linux/netfilter_ipv4/ip_tproxy.h b/include/linux/netfilter_ipv4/ip_tproxy.h new file mode 100644 index 000..ae890e3 --- /dev/null +++ b/include/linux/netfilter_ipv4/ip_tproxy.h @@ -0,0 +1,20 @@ +#ifndef _IP_TPROXY_H +#define _IP_TPROXY_H + +#include + +/* look up and get a reference to a matching socket */ +extern struct sock * +ip_tproxy_get_sock(const u8 
protocol, + const __be32 saddr, const __be32 daddr, + const __be16 sport, const __be16 dport, + const struct net_device *in); + +/* divert skb to a given socket */ +extern int +ip_tproxy_do_divert(struct sk_buff *skb, + const struct sock *sk, + const int require_freebind, + const struct net_device *in); + +#endif diff --git a/include/net/ip.h b/include/net/ip.h index 8b71991..a589e6e 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -321,7 +321,8 @@ enum ip_defrag_users IP_DEFRAG_CONNTRACK_OUT, IP_DEFRAG_VS_IN, IP_DEFRAG_VS_OUT, - IP_DEFRAG_VS_FWD + IP_DEFRAG_VS_FWD, + IP_DEFRAG_TP_IN, }; struct sk_buff *ip_defrag(struct sk_buff *skb, u32 user); diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig index 601808c..17c3ec8 100644 --- a/net/ipv4/netfilter/Kconfig +++ b/net/ipv4/netfilter/Kconfig @@ -628,6 +628,16 @@ config IP_NF_RAW If you want to compile it as a module, say M here and read . If unsure, say `N'. +# tproxy table +config IP_NF_TPROXY + tristate "Transparent proxying" + depends on IP_NF_IPTABLES + help + Transparent proxying. For more information see + http://www.balabit.com/downloads/tproxy. + + To compile it as a module, choose M here. If unsure, say N. 
+ # ARP tables config IP_NF_ARPTABLES tristate "ARP tables support" diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile index 6625ec6..21a29f4 100644 --- a/net/ipv4/netfilter/Makefile +++ b/net/ipv4/netfilter/Makefile @@ -81,6 +81,7 @@ obj-$(CONFIG_IP_NF_MANGLE) += iptable_mangle.o obj-$(CONFIG_IP_NF_NAT) += iptable_nat.o obj-$(CONFIG_NF_NAT) += iptable_nat.o obj-$(CONFIG_IP_NF_RAW) += iptable_raw.o +obj-$(CONFIG_IP_NF_TPROXY) += iptable_tproxy.o # matches obj-$(CONFIG_IP_NF_MATCH_IPRANGE) += ipt_iprange.o diff --git a/net/ipv4/netfilter/iptable_tproxy.c b/net/ipv4/netfilter/iptable_tproxy.c new file mode 100644 index 000..a241f11 --- /dev/null +++ b/net/ipv4/netfilter/iptable_tproxy.c @@ -0,0 +1,267 @@ +/* + * Transparent proxy support for Linux/iptables + * + * Copyright (c) 2006-2007 BalaBit IT Ltd. + * Author: Balazs Scheidler, Krisztian Kovacs + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#define TPROXY_VALID_HOOKS (1 << NF_IP_PRE_ROUTING) + +#if 1 +#define DEBUGP printk +#else +#define DE
[PATCH/RFC 09/13] Create a tproxy flag in struct sk_buff
We would like to be able to match on whether or not a given packet has been diverted by tproxy. To make this possible we need a flag in sk_buff. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/linux/skbuff.h | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4ff3940..6d7f5c7 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -284,7 +284,8 @@ struct sk_buff { nfctinfo:3; __u8 pkt_type:3, fclone:2, - ipvs_property:1; + ipvs_property:1, + ip_tproxy:1; __be16 protocol; void (*destructor)(struct sk_buff *skb);
[PATCH/RFC 10/13] Export UDP socket lookup function
The iptables tproxy code has to be able to do UDP socket hash lookups, so we have to provide an exported lookup function for this purpose. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/net/udp.h |4 net/ipv4/udp.c|8 2 files changed, 12 insertions(+), 0 deletions(-) diff --git a/include/net/udp.h b/include/net/udp.h index 1b921fa..ea5aa31 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -141,6 +141,10 @@ extern int udp_lib_setsockopt(struct sock *sk, int level, int optname, char __user *optval, int optlen, int (*push_pending_frames)(struct sock *)); +extern struct sock *udp4_lib_lookup(__be32 saddr, __be16 sport, + __be32 daddr, __be16 dport, + int dif); + DECLARE_SNMP_STAT(struct udp_mib, udp_statistics); /* * SNMP statistics for UDP and UDP-Lite diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 1d15edc..52695a6 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -285,6 +285,14 @@ static struct sock *__udp4_lib_lookup(__be32 saddr, __be16 sport, return result; } +struct sock *udp4_lib_lookup(__be32 saddr, __be16 sport, +__be32 daddr, __be16 dport, +int dif) +{ + return __udp4_lib_lookup(saddr, sport, daddr, dport, dif, udp_hash); +} +EXPORT_SYMBOL_GPL(udp4_lib_lookup); + static inline struct sock *udp_v4_mcast_next(struct sock *sk, __be16 loc_port, __be32 loc_addr, __be16 rmt_port, __be32 rmt_addr, - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC 08/13] Handle TCP SYN+ACK/ACK/RST transparency
The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to incoming packets. The non-local source address check on output bites us again, as replies for transparently redirected traffic won't have a chance to leave the node. This patch selectively sets the FLOWI_FLAG_TRANSPARENT flag when doing the route lookup for those replies. Transparent replies are enabled if the listening socket has the transparent socket flag set. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/net/ip.h|3 +++ include/net/request_sock.h |3 ++- net/ipv4/inet_connection_sock.c |2 ++ net/ipv4/ip_output.c|6 +- net/ipv4/syncookies.c |2 ++ net/ipv4/tcp_ipv4.c | 16 ++-- net/ipv4/tcp_minisocks.c|3 ++- 7 files changed, 26 insertions(+), 9 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index e79c3e3..8b71991 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -133,8 +133,11 @@ static inline void ip_tr_mc_map(__be32 addr, char *buf) buf[5]=0x00; } +#define IP_REPLY_ARG_NOSRCCHECK 1 + struct ip_reply_arg { struct kvec iov[1]; + int flags; __wsum csum; int csumoffset; /* u16 offset of csum in iov[0].iov_base */ /* -1 if not needed */ diff --git a/include/net/request_sock.h b/include/net/request_sock.h index 7aed02c..b9c8974 100644 --- a/include/net/request_sock.h +++ b/include/net/request_sock.h @@ -34,7 +34,8 @@ struct request_sock_ops { struct request_sock *req, struct dst_entry *dst); void(*send_ack)(struct sk_buff *skb, - struct request_sock *req); + struct request_sock *req, + int reply_flags); void(*send_reset)(struct sock *sk, struct sk_buff *skb); void(*destructor)(struct request_sock *req); diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 83ad972..90459a1 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -323,6 +323,8 @@ struct dst_entry* inet_csk_route_req(struct sock *sk, .saddr = ireq->loc_addr, .tos = RT_CONN_FLAGS(sk) } }, .proto = sk->sk_protocol, + .flags = 
inet_sk(sk)->transparent ? + FLOWI_FLAG_TRANSPARENT : 0, .uli_u = { .ports = { .sport = inet_sk(sk)->sport, .dport = ireq->rmt_port } } }; diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index d096332..7af25d4 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -312,6 +312,8 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok) .saddr = inet->saddr, .tos = RT_CONN_FLAGS(sk) } }, .proto = sk->sk_protocol, + .flags = inet->transparent ? +FLOWI_FLAG_TRANSPARENT : 0, .uli_u = { .ports = { .sport = inet->sport, .dport = inet->dport } } }; @@ -1357,7 +1359,9 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar .uli_u = { .ports = { .sport = skb->h.th->dest, .dport = skb->h.th->source } }, - .proto = sk->sk_protocol }; + .proto = sk->sk_protocol, + .flags = (arg->flags & IP_REPLY_ARG_NOSRCCHECK) ? + FLOWI_FLAG_TRANSPARENT : 0 }; security_skb_classify_flow(skb, &fl); if (ip_route_output_key(&rt, &fl)) return; diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 431c81d..08d8920 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -261,6 +261,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, .saddr = ireq->loc_addr, .tos = RT_CONN_FLAGS(sk) } }, .proto = IPPROTO_TCP, + .flags = inet_sk(sk)->transparent ? +
[PATCH/RFC 07/13] Conditionally enable transparent flow flag when connecting
Set FLOWI_FLAG_TRANSPARENT in flowi->flags if the socket has the transparent socket option set. This way we selectively enable certain connections with non-local source addresses to be routed. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/net/route.h |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/include/net/route.h b/include/net/route.h index 13da592..4dff368 100644 --- a/include/net/route.h +++ b/include/net/route.h @@ -161,6 +161,10 @@ static inline int ip_route_connect(struct rtable **rp, __be32 dst, .dport = dport } } }; int err; + + if (inet_sk(sk)->transparent) + fl.flags |= FLOWI_FLAG_TRANSPARENT; + if (!dst || !src) { err = __ip_route_output_key(rp, &fl); if (err)
[PATCH/RFC 06/13] Implement IP_TRANSPARENT socket option
This patch introduces the IP_TRANSPARENT socket option: enabling that will make the IPv4 routing omit the non-local source address check on output. Setting IP_TRANSPARENT requires NET_ADMIN capability. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/linux/in.h |1 + include/net/inet_sock.h |3 ++- include/net/inet_timewait_sock.h |3 ++- include/net/route.h |1 + net/ipv4/inet_timewait_sock.c|1 + net/ipv4/ip_sockglue.c | 12 +++- 6 files changed, 18 insertions(+), 3 deletions(-) diff --git a/include/linux/in.h b/include/linux/in.h index 1912e7c..66be615 100644 --- a/include/linux/in.h +++ b/include/linux/in.h @@ -75,6 +75,7 @@ struct in_addr { #define IP_IPSEC_POLICY16 #define IP_XFRM_POLICY 17 #define IP_PASSSEC 18 +#define IP_TRANSPARENT 19 /* BSD compatibility */ #define IP_RECVRETOPTS IP_RETOPTS diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index 0bd167b..14b597d 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -128,7 +128,8 @@ struct inet_sock { is_icsk:1, freebind:1, hdrincl:1, - mc_loop:1; + mc_loop:1, + transparent:1; int mc_index; __be32 mc_addr; struct ip_mc_socklist *mc_list; diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h index f7be1ac..e30dd61 100644 --- a/include/net/inet_timewait_sock.h +++ b/include/net/inet_timewait_sock.h @@ -126,7 +126,8 @@ struct inet_timewait_sock { __be16 tw_dport; __u16 tw_num; /* And these are ours. 
*/ - __u8tw_ipv6only:1; + __u8tw_ipv6only:1, + tw_transparent:1; /* 15 bits hole, try to pack */ __u16 tw_ipv6_offset; int tw_timeout; diff --git a/include/net/route.h b/include/net/route.h index efaa6b2..13da592 100644 --- a/include/net/route.h +++ b/include/net/route.h @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c index a73cf93..f57f81a 100644 --- a/net/ipv4/inet_timewait_sock.c +++ b/net/ipv4/inet_timewait_sock.c @@ -108,6 +108,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat tw->tw_reuse= sk->sk_reuse; tw->tw_hash = sk->sk_hash; tw->tw_ipv6only = 0; + tw->tw_transparent = inet->transparent; tw->tw_prot = sk->sk_prot_creator; atomic_set(&tw->tw_refcnt, 1); inet_twsk_dead_node_init(tw); diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index 23048d9..02e8d9f 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -414,7 +414,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, (1<= sizeof(int)) { @@ -875,6 +875,16 @@ mc_msf_out: err = xfrm_user_policy(sk, optname, optval, optlen); break; + case IP_TRANSPARENT: + if (!capable(CAP_NET_ADMIN)) { + err = -EPERM; + break; + } + if (optlen < 1) + goto e_inval; + inet->transparent = !!val; + break; + default: err = -ENOPROTOOPT; break; - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
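From userspace, the option added by this patch would be enabled with an ordinary setsockopt() call. The following is a minimal sketch, not part of the patch: the helper name enable_transparent is hypothetical, and the fallback definition of IP_TRANSPARENT simply copies the value 19 from the include/linux/in.h hunk above for systems whose headers do not define it.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Value copied from the patch; only used if the system headers lack it. */
#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19
#endif

/* Hypothetical helper: turn on IP_TRANSPARENT for an IPv4 socket.
 * Returns 0 on success or a negative errno value on failure. */
static int enable_transparent(int fd)
{
    int one = 1;

    if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0)
        return -errno;
    return 0;
}
```

Given the capable(CAP_NET_ADMIN) check in the do_ip_setsockopt() hunk, a caller without that capability should expect this helper to return -EPERM.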
[PATCH/RFC 04/13] Don't do the UDP socket lookup if we already have one attached
The UDP input code path looks up the UDP socket hash tables to find a socket matching the incoming packet. However, because iptable_tproxy does socket lookups early, the skb may already have the appropriate reference attached; in that case we steal that reference instead of doing the lookup. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- net/ipv4/udp.c | 11 +-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index ce6c460..1d15edc 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1226,8 +1226,15 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[], if(rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST)) return __udp4_lib_mcast_deliver(skb, uh, saddr, daddr, udptable); - sk = __udp4_lib_lookup(saddr, uh->source, daddr, uh->dest, - skb->dev->ifindex, udptable); + if (skb->sk) { + /* steal reference */ + sk = skb->sk; + skb->destructor = NULL; + skb->sk = NULL; + } else { + sk = __udp4_lib_lookup(saddr, uh->source, daddr, uh->dest, + skb->dev->ifindex, udptable); + } if (sk != NULL) { int ret = udp_queue_rcv_skb(sk, skb);
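The "steal reference" idea can be modelled outside the kernel. The sketch below is only an illustration under stated assumptions: struct sock_model, struct skb_model, lookup_and_hold and get_sock are hypothetical stand-ins, not kernel types or functions. It shows that when the packet already carries a referenced socket, ownership of that reference moves to the caller without touching the count, whereas a fresh lookup takes a new reference.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sock: just a reference count. */
struct sock_model {
    int refcnt;
};

/* Hypothetical stand-in for struct sk_buff. */
struct skb_model {
    struct sock_model *sk;
    void (*destructor)(struct skb_model *skb);
};

/* Hypothetical lookup: returns the table entry with an extra reference held. */
static struct sock_model *lookup_and_hold(struct sock_model *table_entry)
{
    if (table_entry != NULL)
        table_entry->refcnt++;
    return table_entry;
}

/* Return a referenced socket for the packet, preferring one already attached. */
static struct sock_model *get_sock(struct skb_model *skb,
                                   struct sock_model *table_entry)
{
    struct sock_model *sk;

    if (skb->sk != NULL) {
        /* steal reference: the skb gives up ownership, count is unchanged */
        sk = skb->sk;
        skb->destructor = NULL;
        skb->sk = NULL;
        return sk;
    }
    return lookup_and_hold(table_entry);
}
```

Clearing the destructor together with the pointer matters in this model for the same reason as in the patch: otherwise the reference would be dropped a second time when the packet is freed.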
[PATCH/RFC 05/13] Loosen source address check on IPv4 output
ip_route_output() contains a check to make sure that no flows with non-local source IP addresses are routed. This obviously makes using such addresses impossible. This patch introduces a flowi flag which makes omitting this check possible. The new flag provides a way of handling transparent and non-transparent connections differently. Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]> --- include/net/flow.h |1 + net/ipv4/route.c |8 ++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/include/net/flow.h b/include/net/flow.h index ce4b10d..9eb91f2 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -49,6 +49,7 @@ struct flowi { __u8proto; __u8flags; #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 +#define FLOWI_FLAG_TRANSPARENT 0x02 union { struct { __be16 sport; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index c526fb2..8091a96 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -572,7 +572,8 @@ static inline int compare_keys(struct flowi *fl1, struct flowi *fl2) (*(u16 *)&fl1->nl_u.ip4_u.tos ^ *(u16 *)&fl2->nl_u.ip4_u.tos) | (fl1->oif ^ fl2->oif) | - (fl1->iif ^ fl2->iif)) == 0; + (fl1->iif ^ fl2->iif) | + ((fl1->flags ^ fl2->flags) & FLOWI_FLAG_TRANSPARENT)) == 0; } #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED @@ -2338,6 +2339,7 @@ static inline int __mkroute_output(struct rtable **result, rth->fl.fl4_src = oldflp->fl4_src; rth->fl.oif = oldflp->oif; rth->fl.mark= oldflp->mark; + rth->fl.flags = oldflp->flags; rth->rt_dst = fl->fl4_dst; rth->rt_src = fl->fl4_src; rth->rt_iif = oldflp->oif ? 
: dev_out->ifindex; @@ -2482,6 +2484,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) RT_SCOPE_LINK : RT_SCOPE_UNIVERSE), } }, + .flags = oldflp->flags, .mark = oldflp->mark, .iif = loopback_dev.ifindex, .oif = oldflp->oif }; @@ -2506,7 +2509,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ dev_out = ip_dev_find(oldflp->fl4_src); - if (dev_out == NULL) + if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_TRANSPARENT)) goto out; /* I removed check for oif == dev_out->oif here. @@ -2678,6 +2681,7 @@ int __ip_route_output_key(struct rtable **rp, const struct flowi *flp) rth->fl.iif == 0 && rth->fl.oif == flp->oif && rth->fl.mark == flp->mark && + !((rth->fl.flags ^ flp->flags) & FLOWI_FLAG_TRANSPARENT) && !((rth->fl.fl4_tos ^ flp->fl4_tos) & (IPTOS_RT_MASK | RTO_ONLINK))) { - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
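The two behavioural changes in this patch, the flag-aware cache key comparison and the relaxed source check, can be sketched in plain C. This is a simplified model, not the kernel code: struct flow_key, keys_match and source_allowed are hypothetical names, and only the two flag values are taken from the include/net/flow.h hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the patch. */
#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
#define FLOWI_FLAG_TRANSPARENT       0x02

/* Hypothetical, much reduced stand-in for struct flowi. */
struct flow_key {
    uint32_t dst;
    uint32_t src;
    int      oif;
    uint8_t  flags;
};

/* XOR-and-OR comparison in the style of compare_keys(): the keys match
 * only if every field is equal and the TRANSPARENT bit agrees. Other
 * flag bits are masked out and do not affect the result. */
static bool keys_match(const struct flow_key *a, const struct flow_key *b)
{
    return ((a->dst ^ b->dst) |
            (a->src ^ b->src) |
            (uint32_t)(a->oif ^ b->oif) |
            (uint32_t)((a->flags ^ b->flags) & FLOWI_FLAG_TRANSPARENT)) == 0;
}

/* Source check: a non-local source address is rejected unless the
 * flow carries the transparent flag. */
static bool source_allowed(bool src_is_local, uint8_t flags)
{
    return src_is_local || (flags & FLOWI_FLAG_TRANSPARENT) != 0;
}
```

Including the flag in the key comparison keeps a cached route created for a transparent flow from being returned to a non-transparent one, which would otherwise sidestep the source check.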
[PATCH/RFC 03/13] Don't do the TCP socket lookup if we already have one attached
TCP input code path looks up the TCP socket hash tables to find a socket
matching the incoming packet. However, as iptable_tproxy does socket
lookups early, the skb may already have the appropriate reference
attached; in that case we steal that reference instead of doing the
lookup.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_ipv4.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0ba74bb..536db7b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1647,9 +1647,16 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags	 = skb->nh.iph->tos;
 	TCP_SKB_CB(skb)->sacked	 = 0;
 
-	sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, th->source,
-			   skb->nh.iph->daddr, th->dest,
-			   inet_iif(skb));
+	if (unlikely(skb->sk)) {
+		/* steal reference */
+		sk = skb->sk;
+		skb->destructor = NULL;
+		skb->sk = NULL;
+	} else {
+		sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, th->source,
+				   skb->nh.iph->daddr, th->dest,
+				   inet_iif(skb));
+	}
 
 	if (!sk)
 		goto no_tcp_socket;
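"Stealing" the reference means ownership of the already-counted socket reference moves from the skb to the caller, which is why both skb->sk and skb->destructor are cleared: otherwise the reference would be dropped a second time when the skb is freed. The following toy model sketches that ownership transfer in userspace. All `toy_*` names are hypothetical stand-ins and none of this is kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for struct sock / struct sk_buff. */
struct toy_sock {
	int refcnt;
};

struct toy_skb {
	struct toy_sock *sk;
	void (*destructor)(struct toy_skb *skb);
};

/* Destructor that would drop the skb's socket reference when the skb is freed. */
static inline void toy_sock_release(struct toy_skb *skb)
{
	if (skb->sk) {
		skb->sk->refcnt--;
		skb->sk = NULL;
	}
}

/* Stand-in for the hash-table lookup: returns the entry with a new reference held. */
static inline struct toy_sock *toy_lookup(struct toy_sock *table_entry)
{
	if (table_entry)
		table_entry->refcnt++;
	return table_entry;
}

/* Prefer the socket already attached to the skb, else fall back to the lookup. */
static inline struct toy_sock *toy_find_sock(struct toy_skb *skb,
					     struct toy_sock *table_entry)
{
	struct toy_sock *sk;

	if (skb->sk) {
		/* steal reference: the refcount is unchanged, only the owner changes */
		sk = skb->sk;
		skb->destructor = NULL;
		skb->sk = NULL;
		return sk;
	}
	return toy_lookup(table_entry);
}
```

In both branches the caller ends up holding exactly one reference, so the code after the lookup can release it the same way regardless of where the socket came from.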
[PATCH/RFC 02/13] Port redirection support for TCP
Current TCP code relies on the local port of the listening socket
being the same as the destination port of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port.

This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>
---
 include/net/inet_sock.h         |    1 +
 include/net/tcp.h               |    1 +
 net/ipv4/inet_connection_sock.c |    2 ++
 net/ipv4/syncookies.c           |    1 +
 net/ipv4/tcp_output.c           |    2 +-
 5 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index ce6da97..0bd167b 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -64,6 +64,7 @@ struct inet_request_sock {
 #endif
 	__be32			loc_addr;
 	__be32			rmt_addr;
+	__be16			loc_port;
 	__be16			rmt_port;
 	u16			snd_wscale : 4,
 				rcv_wscale : 4,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5c472f2..e1cb3d0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -982,6 +982,7 @@ static inline void tcp_openreq_init(struct request_sock *req,
 	ireq->acked = 0;
 	ireq->ecn_ok = 0;
 	ireq->rmt_port = skb->h.th->source;
+	ireq->loc_port = skb->h.th->dest;
 }
 
 extern void tcp_enter_memory_pressure(void);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 43fb160..83ad972 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -502,6 +502,8 @@ struct sock *inet_csk_clone(struct sock *sk, const struct request_sock *req,
 		newicsk->icsk_bind_hash = NULL;
 
 		inet_sk(newsk)->dport = inet_rsk(req)->rmt_port;
+		inet_sk(newsk)->num = ntohs(inet_rsk(req)->loc_port);
+		inet_sk(newsk)->sport = inet_rsk(req)->loc_port;
 		newsk->sk_write_space = sk_stream_write_space;
 
 		newicsk->icsk_retransmits = 0;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 33016cc..431c81d 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -223,6 +223,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	treq->rcv_isn		= ntohl(skb->h.th->seq) - 1;
 	treq->snt_isn		= cookie;
 	req->mss		= mss;
+	ireq->loc_port		= skb->h.th->dest;
 	ireq->rmt_port		= skb->h.th->source;
 	ireq->loc_addr		= skb->nh.iph->daddr;
 	ireq->rmt_addr		= skb->nh.iph->saddr;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index dc15113..a3ea7a1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2135,7 +2135,7 @@ struct sk_buff * tcp_make_synack(struct sock *sk, struct dst_entry *dst,
 	th->syn = 1;
 	th->ack = 1;
 	TCP_ECN_make_synack(req, th);
-	th->source = inet_sk(sk)->sport;
+	th->source = ireq->loc_port;
 	th->dest = ireq->rmt_port;
 	TCP_SKB_CB(skb)->seq = tcp_rsk(req)->snt_isn;
 	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(skb)->seq + 1;
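The effect of the patch can be sketched in userspace: the request record remembers the port the client actually addressed, and the SYN-ACK is sent from that port rather than from the listener's own port. The `toy_*` types and functions below are hypothetical simplifications and not the kernel structures; ports are kept in network byte order, as the __be16 fields in the patch are.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified request record; both ports in network byte order. */
struct toy_request {
	uint16_t loc_port;
	uint16_t rmt_port;
};

/* Hypothetical, simplified TCP header carrying only the two ports. */
struct toy_tcphdr {
	uint16_t source;
	uint16_t dest;
};

/* Record both ports of an incoming SYN. */
static inline void toy_openreq_init(struct toy_request *req,
				    const struct toy_tcphdr *syn)
{
	req->rmt_port = syn->source;
	req->loc_port = syn->dest;	/* port the client addressed */
}

/* Build the reply ports: answer from the port the client addressed. */
static inline void toy_make_synack(struct toy_tcphdr *reply,
				   const struct toy_request *req)
{
	reply->source = req->loc_port;
	reply->dest = req->rmt_port;
}
```

The client sees a reply from the port it connected to even if the listening socket is bound to a different one. The host-order value, as assigned to inet_sk(newsk)->num in the patch, is obtained with ntohs() on the stored port.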
[PATCH/RFC 01/13] Implement local diversion of IPv4 skbs
The input path for non-local bound sockets requires diverting certain
packets locally, even if their destination IP address is not
considered local. We achieve this by assigning a specially crafted dst
entry to these skbs, and optionally also attaching a socket to the skb
so that the upper layer code does not need to redo the socket lookup.

We also have to be able to differentiate between these fake entries
and "real" entries in the cache: it is perfectly legal that the
diversion is done only for certain TCP or UDP packets and not for all
packets of the flow. Since these special dst entries are used only by
the iptables tproxy code, and that code uses exclusively these
entries, simply flagging these entries as DST_DIVERTED is OK. All
other cache lookup paths skip diverted entries, while our new
ip_divert_local() function uses exclusively diverted dst entries.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>
---
 include/net/dst.h   |    1
 include/net/route.h |    2 +
 net/ipv4/route.c    |  113 +++
 3 files changed, 115 insertions(+), 1 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index e12a8ce..4cd0745 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -48,6 +48,7 @@ struct dst_entry
 #define DST_NOPOLICY		4
 #define DST_NOHASH		8
 #define DST_BALANCED		0x10
+#define DST_DIVERTED		0x20
 	unsigned long		expires;
 
 	unsigned short		header_len;	/* more space at head required */
diff --git a/include/net/route.h b/include/net/route.h
index 749e4df..efaa6b2 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -125,6 +125,8 @@ extern int ip_rt_ioctl(unsigned int cmd, void __user *arg);
 extern void		ip_rt_get_source(u8 *src, struct rtable *rt);
 extern int		ip_rt_dump(struct sk_buff *skb, struct netlink_callback *cb);
 
+extern int		ip_divert_local(struct sk_buff *skb, const struct in_device *in, struct sock *sk);
+
 struct in_ifaddr;
 extern void fib_add_ifaddr(struct in_ifaddr *);
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 37e0d4d..c526fb2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -100,6 +100,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -941,9 +942,11 @@ restart:
 	while ((rth = *rthp) != NULL) {
 #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED
 		if (!(rth->u.dst.flags & DST_BALANCED) &&
+		    !((rt->u.dst.flags ^ rth->u.dst.flags) & DST_DIVERTED) &&
 		    compare_keys(&rth->fl, &rt->fl)) {
 #else
-		if (compare_keys(&rth->fl, &rt->fl)) {
+		if (!((rt->u.dst.flags ^ rth->u.dst.flags) & DST_DIVERTED) &&
+		    compare_keys(&rth->fl, &rt->fl)) {
 #endif
 			/* Put it first */
 			*rthp = rth->u.dst.rt_next;
@@ -1165,6 +1168,7 @@ void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 new_gw,
 				if (rth->fl.fl4_dst != daddr ||
 				    rth->fl.fl4_src != skeys[i] ||
 				    rth->fl.oif != ikeys[k] ||
+				    (rth->u.dst.flags & DST_DIVERTED) ||
 				    rth->fl.iif != 0) {
 					rthp = &rth->u.dst.rt_next;
 					continue;
@@ -1525,6 +1529,111 @@ static int ip_rt_bug(struct sk_buff *skb)
 	return 0;
 }
 
+static void ip_divert_free_sock(struct sk_buff *skb)
+{
+	struct sock *sk = skb->sk;
+
+	skb->sk = NULL;
+	skb->destructor = NULL;
+
+	if (sk) {
+		/* TIME_WAIT inet sockets have to be handled differently */
+		if (((sk->sk_protocol == IPPROTO_TCP) && (sk->sk_state == TCP_TIME_WAIT)) ||
+		    ((sk->sk_protocol == IPPROTO_DCCP) && (sk->sk_state == DCCP_TIME_WAIT)))
+			inet_twsk_put(inet_twsk(sk));
+		else
+			sock_put(sk);
+	}
+}
+
+int ip_divert_local(struct sk_buff *skb, const struct in_device *in, struct sock *sk)
+{
+	struct iphdr *iph = skb->nh.iph;
+	struct rtable *rth, *rtres;
+	unsigned hash;
+	const int iif = in->dev->ifindex;
+	u_int8_t tos;
+	int err;
+
+	/* look up hash first */
+	tos = iph->tos & IPTOS_RT_MASK;
+	hash = rt_hash_code(iph->daddr, iph->saddr ^ (iif << 5));
+
+	rcu_read_lock();
+	for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
+	     rth = rcu_dereference(rth->u.dst.rt_next)) {
+		if (rth->fl.fl4_dst == iph->daddr &&
+		    rth->fl.fl4_src == iph->saddr &&
+		    rth->fl.iif == iif &&
+		    rth->fl.oif == 0 &&
+		    (rth->u.dst.flags & DST_DIVERTED)) {
+			rth->u.dst.lastuse = jiffies;
+
[PATCH/RFC 00/13] Transparent proxying patches, take two
Hi,

These patches are my second try at providing Linux 2.2-like
transparent proxying support for Linux 2.6.

Major changes since the first version:

 - iptable_tproxy now does IPv4 fragment reassembly (necessary for
   processing TCP/UDP header)
 - The removal of the source address check in ip_route_output() was
   incorrect. Instead, I've implemented a separate setsockopt-settable
   per-socket flag (setting it requires CAP_NET_ADMIN) to selectively
   loosen that check in ip_route_output().

Besides these, I've tried to fix all the problems raised on netdev@ in
January.

Unfortunately the newly introduced IP_TRANSPARENT socket option leads
to a quite intrusive set of patches touching core IPv4 routing and TCP
code; however, this was necessary as DaveM rejected our idea of using
IP_FREEBIND instead (and he's right, of course, as it would have
caused ABI breakage). The current approach works by adding a new bit
to the flag field in "struct flowi".

Furthermore, I haven't removed the IPv4 routing local diversion code
(caching socket lookups in the skb) yet. Patrick recommended throwing
it out altogether and using mark-based policy routing instead, but I
still think that would harm usability, as the user would need to
harmonize the configuration in order to have two completely
independent subsystems interoperate.

-- 
Regards,
Krisztian Kovacs
RE: [PATCH] xfrm_policy delete security check misplaced
> Also, [Joy cc'd] deletions here needn't be audited?

OK, I see the next patch addressed this :)
RE: [PATCH] xfrm_policy delete security check misplaced
> @@ -2552,7 +2550,7 @@ static int pfkey_spdget(struct sock *sk, struct sk_buff *skb, struct sadb_msg *h
>  		return -EINVAL;
>
>  	xp = xfrm_policy_byid(XFRM_POLICY_TYPE_MAIN, dir, pol->sadb_x_policy_id,
> -			      hdr->sadb_msg_type == SADB_X_SPDDELETE2);
> +			      hdr->sadb_msg_type == SADB_X_SPDDELETE2, &err);
>  	if (xp == NULL)
>  		return -ENOENT;

I guess you meant to do this here?

	else if (err)
		return err;

Also, [Joy cc'd] deletions here needn't be audited?
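The point of the review comment is that a lookup which reports failure through an out-parameter has two failure channels, and the caller has to check both: a NULL return for "not found" and a non-zero err for "found, but the operation was refused". The sketch below illustrates the pattern under stated assumptions: `toy_policy_byid()`, `toy_spdget()`, the `deletable` field and the choice of -EPERM are all hypothetical and only loosely modelled on the quoted hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical policy entry; "deletable" stands in for a security check. */
struct toy_policy {
	int id;
	int deletable;
};

/*
 * Hypothetical lookup: returns the matching entry or NULL, and reports a
 * refused delete through *err while still returning the entry.
 */
static inline struct toy_policy *toy_policy_byid(struct toy_policy *tbl, size_t n,
						 int id, int delete, int *err)
{
	size_t i;

	*err = 0;
	for (i = 0; i < n; i++) {
		if (tbl[i].id != id)
			continue;
		if (delete && !tbl[i].deletable)
			*err = -EPERM;
		return &tbl[i];
	}
	return NULL;
}

/* Caller checks both channels, in the order suggested in the review. */
static inline int toy_spdget(struct toy_policy *tbl, size_t n, int id, int delete)
{
	int err;
	struct toy_policy *xp = toy_policy_byid(tbl, n, id, delete, &err);

	if (xp == NULL)
		return -ENOENT;
	else if (err)
		return err;
	return 0;
}
```

Without the second check, a refused delete would be indistinguishable from a successful lookup and the caller would carry on as if nothing had gone wrong.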
Re: [PATCH RFC 17/31] net: Factor out __dev_alloc_name from dev_alloc_name
Hello Eric,

See comments about __dev_alloc_name() below.

Regards,
Benjamin

Eric W. Biederman wrote:
> From: Eric W. Biederman <[EMAIL PROTECTED]> - unquoted
>
> When forcibly changing the network namespace of a device
> I need something that can generate a name for the device
> in the new namespace without overwriting the old name.
>
> __dev_alloc_name provides me that functionality.
>
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> ---
>  net/core/dev.c |   44 +++++++++++++++++++++++++++++++++-----------
>  1 files changed, 33 insertions(+), 11 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 32fe905..fc0d2af 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -655,9 +655,10 @@ int dev_valid_name(const char *name)
>  }
>
>  /**
> - *	dev_alloc_name - allocate a name for a device
> - *	@dev: device
> + *	__dev_alloc_name - allocate a name for a device
> + *	@net: network namespace to allocate the device name in
>   *	@name: name format string
> + *	@buf:  scratch buffer and result name string
>   *
>   *	Passed a format string - eg "lt%d" it will try and find a suitable
>   *	id. It scans list of devices to build up a free map, then chooses
> @@ -668,18 +669,13 @@ int dev_valid_name(const char *name)
>   *	Returns the number of the unit assigned or a negative errno code.
>   */
>
> -int dev_alloc_name(struct net_device *dev, const char *name)
> +static int __dev_alloc_name(net_t net, const char *name, char buf[IFNAMSIZ])

IMHO the third parameter should be: char *buf

Indeed using "char buf[IFNAMSIZ]" is misleading because later in the
routine sizeof(buf) is used (with an expected result of IFNAMSIZ).
Unfortunately this is no longer the case: sizeof(buf) is only 4 now
(buf is a pointer parameter, so sizeof yields the pointer size, which
is 4 bytes on a 32-bit machine).
This corrupts the registration of network devices (now I understand why
only one of my e1000 showed up after each reboot :).

Also sizeof(buf) should be replaced by IFNAMSIZ in this new routine.
(See below)

>  {
>  	int i = 0;
> -	char buf[IFNAMSIZ];
>  	const char *p;
>  	const int max_netdevices = 8*PAGE_SIZE;
>  	long *inuse;
>  	struct net_device *d;
> -	net_t net;
> -
> -	BUG_ON(null_net(dev->nd_net));
> -	net = dev->nd_net;
>
>  	p = strnchr(name, IFNAMSIZ-1, '%');
>  	if (p) {
> @@ -713,10 +709,8 @@ int dev_alloc_name(struct net_device *dev, const char *name)
>  	}
>
>  	snprintf(buf, sizeof(buf), name, i);

Replace this with "snprintf(buf, IFNAMSIZ, name, i);" or i will never
be appended to name and all your ethernet devices will try to register
the name "eth".

There is another occurrence of "snprintf(buf, sizeof(buf), ...)" to
replace in the for loop above.

> -	if (!__dev_get_by_name(net, buf)) {
> -		strlcpy(dev->name, buf, IFNAMSIZ);
> +	if (!__dev_get_by_name(net, buf))
>  		return i;
> -	}
>
>  	/* It is possible to run out of possible slots
>  	 * when the name is long and there isn't enough space left
> @@ -725,6 +719,34 @@ int dev_alloc_name(struct net_device *dev, const char *name)
>  	return -ENFILE;
>  }
>
> +/**
> + *	dev_alloc_name - allocate a name for a device
> + *	@dev: device
> + *	@name: name format string
> + *
> + *	Passed a format string - eg "lt%d" it will try and find a suitable
> + *	id. It scans list of devices to build up a free map, then chooses
> + *	the first empty slot. The caller must hold the dev_base or rtnl lock
> + *	while allocating the name and adding the device in order to avoid
> + *	duplicates.
> + *	Limited to bits_per_byte * page size devices (ie 32K on most platforms).
> + *	Returns the number of the unit assigned or a negative errno code.
> + */
> +
> +int dev_alloc_name(struct net_device *dev, const char *name)
> +{
> +	char buf[IFNAMSIZ];
> +	net_t net;
> +	int ret;
> +
> +	BUG_ON(null_net(dev->nd_net));
> +	net = dev->nd_net;
> +	ret = __dev_alloc_name(net, name, buf);
> +	if (ret >= 0)
> +		strlcpy(dev->name, buf, IFNAMSIZ);
> +	return ret;
> +}
> +
>  /**
>   *	dev_change_name - change name of a device

-- 
B e n j a m i n   T h e r y - BULL/DT/Open Software R&D

http://www.bull.com
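The sizeof problem described in this review is easy to reproduce in userspace. In C, a parameter declared as "char buf[IFNAMSIZ]" is adjusted to "char *buf", so sizeof(buf) inside the callee is the pointer size, not IFNAMSIZ. The sketch below is only an illustration: the three helper functions are hypothetical, and IFNAMSIZ is defined locally with the value 16, which I believe matches the kernel's definition. Compilers normally warn about exactly this mistake, so the warning is silenced here to show the behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* The compiler flags sizeof on an array parameter; silenced for the demo. */
#pragma GCC diagnostic ignored "-Wsizeof-array-argument"

#define IFNAMSIZ 16	/* assumed to match the kernel's value */

/* An array parameter is adjusted to a pointer, so this is sizeof(char *). */
static inline size_t size_seen_by_callee(char buf[IFNAMSIZ])
{
	return sizeof(buf);
}

/* Buggy variant: the limit is the pointer size, so long names are cut short. */
static inline void format_name_buggy(char buf[IFNAMSIZ], const char *fmt, int unit)
{
	snprintf(buf, sizeof(buf), fmt, unit);
}

/* Fixed variant, as suggested in the review: pass the real buffer size. */
static inline void format_name_fixed(char *buf, const char *fmt, int unit)
{
	snprintf(buf, IFNAMSIZ, fmt, unit);
}
```

With a 4-byte pointer, the buggy variant leaves room for only three characters plus the terminating NUL, which is how "eth%d" ends up as "eth" for every device.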
[PATCH] twcal_jiffie should be unsigned long, not int
Hi David

While browsing include/net/inet_timewait_sock.h, I found this buggy
definition of twcal_jiffie.

	int twcal_jiffie;

I wonder how inet_twdr_twcal_tick() can really work on x86_64.

This seems to be quite an old bug; it was there before the introduction
of inet_timewait_death_row made by Arnaldo Carvalho de Melo.

[PATCH] twcal_jiffie should be unsigned long, not int

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index f7be1ac..09a2532 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -66,7 +66,7 @@ #define INET_TWDR_TWKILL_QUOTA 100
 struct inet_timewait_death_row {
 	/* Short-time timewait calendar */
 	int			twcal_hand;
-	int			twcal_jiffie;
+	unsigned long		twcal_jiffie;
 	struct timer_list	twcal_timer;
 	struct hlist_head	twcal_row[INET_TWDR_RECYCLE_SLOTS];
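To see why an int field is a problem on a 64-bit machine, here is a small userspace model. It uses fixed-width types to stand in for a 64-bit unsigned long and a 32-bit int. `toy_time_after()` mirrors the usual signed-difference comparison used for jiffies, and `through_int_field()` is a hypothetical helper that models storing a tick count in the old field and reading it back. Neither function is kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a wrap-safe "a is after b" test on a 64-bit tick counter. */
static inline int toy_time_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Round-trip a 64-bit tick count through a 32-bit int field.
 * The narrowing conversion is implementation-defined for out-of-range
 * values; common compilers keep the low 32 bits.
 */
static inline uint64_t through_int_field(uint64_t ticks)
{
	int32_t field = (int32_t)ticks;		/* upper 32 bits are lost */

	return (uint64_t)(int64_t)field;	/* sign-extended on read-back */
}
```

Once the tick counter no longer fits in 32 bits, a deadline that is still in the future can compare as already passed after the round trip, while small values survive unchanged, which is why the bug can go unnoticed on 32-bit machines.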
Re: TCP 2MSL on loopback
On Monday 05 March 2007 12:20, Howard Chu wrote:
> Why is the Maximum Segment Lifetime a global parameter? Surely the
> maximum possible lifetime of a particular TCP segment depends on the
> actual connection. At the very least, it would be useful to be able to
> set it on a per-interface basis. E.g., in the case of the loopback
> interface, it would be useful to be able to set it to a very small
> duration.

Hi Howard

I think you should address these questions on netdev instead of
linux-kernel.

>
> As I note in this draft
> http://www.ietf.org/internet-drafts/draft-chu-ldap-ldapi-00.txt
> when doing a connection soak test of OpenLDAP using clients connected
> through localhost, the entire port range is exhausted in well under a
> second, at which point the test stalls until a port comes out of
> TIME_WAIT state so the next connection can be opened.
>
> These days it's not uncommon for an OpenLDAP slapd server to handle tens
> of thousands of connections per second in real use (e.g., at Google, or
> at various telcos). While the LDAP server is fast enough to saturate
> even 10gbit ethernet using contemporary CPUs, we have to resort to
> multiple virtual interfaces just to make sure we have enough port
> numbers available.
>

I don't understand... doesn't the slapd server listen for connections
on a given port, like http? Or is it making connections like an ftp
server?

Of course, if you want to open more than 60,000 concurrent connections
using the 127.0.0.1 address, you might have a problem...

> Ideally the 2MSL parameter would be dynamically adjusted based on the
> route to the destination and the weights associated with those routes.
> In the simplest case, connections between machines on the same subnet
> (i.e., no router hops involved) should have a much smaller default value
> than connections that traverse any routers. I'd settle for a two-level
> setting - with no router hops, use the small value; with any router hops
> use the large value.

Well, is it really an MSL problem? I did a small test (linux-2.6.21-rc1)
and was able to get 1,000,000 connections on localhost on my dual proc
machine in one minute, without an error.
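For reference, the port-exhaustion concern can be put into numbers with some back-of-the-envelope arithmetic. The sketch below assumes that every closed connection pins a distinct local port for the whole TIME_WAIT period, which is a pessimistic simplification. The figures used in the usage note (a 60 second TIME_WAIT and a local port range of 32768 to 61000) are my assumptions about typical Linux defaults and do not appear in this thread. `max_conn_rate()` is a hypothetical helper.

```c
#include <assert.h>

/*
 * Sustained connections per second towards one destination address and
 * port, if each connection pins a distinct local port for
 * time_wait_secs seconds. Integer division rounds down.
 */
static inline unsigned int max_conn_rate(unsigned int port_lo,
					 unsigned int port_hi,
					 unsigned int time_wait_secs)
{
	unsigned int ports = port_hi - port_lo + 1;

	return ports / time_wait_secs;
}
```

Under those assumptions, max_conn_rate(32768, 61000, 60) comes to a few hundred connections per second. The model ignores which side of the connection ends up in TIME_WAIT and any reuse of ports, so it is an estimate of the worst case rather than a prediction, and it need not contradict the test result reported above.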
Re: [RFT] sky2 auto negotiation PHY errata
On Tue, Feb 20, 2007 at 11:00:53AM -0800, Stephen Hemminger wrote:
> You need the flow control fix and the tx_timeout fix posted for 2.6.20
> (stable) and current git tree.

sky2 1.13 has been far better than 1.10; there have been no system
hangs or permanent sky2 failures. However, the following two incidents
were in syslog:

Feb 27 07:08:21 btd kernel: Linux version 2.6.20.sky2.1.13-btd3 ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP PREEMPT Tue Feb 27 00:07:34 MST 2007
Feb 27 07:08:21 btd kernel: sky2 :04:00.0: v1.13 addr 0xfa9fc000 irq 17 Yukon-EC (0xb6) rev 2
Feb 27 07:08:21 btd kernel: sky2 eth0: addr 00:1a:92:23:52:4d
Feb 27 07:08:21 btd kernel: sky2 :03:00.0: v1.13 addr 0xfa8fc000 irq 16 Yukon-EC (0xb6) rev 2
Feb 27 07:08:21 btd kernel: sky2 eth1: addr 00:1a:92:23:4b:a6
Feb 27 07:08:21 btd kernel: sky2 eth0: enabling interface
Feb 27 07:08:21 btd kernel: sky2 eth0: ram buffer 48K
Feb 27 07:08:21 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
Feb 27 19:48:34 btd kernel: sky2 :04:00.0: v1.13 addr 0xfa9fc000 irq 17 Yukon-EC (0xb6) rev 2
Feb 27 19:48:34 btd kernel: sky2 eth0: addr 00:1a:92:23:52:4d
Feb 27 19:48:34 btd kernel: sky2 :03:00.0: v1.13 addr 0xfa8fc000 irq 16 Yukon-EC (0xb6) rev 2
Feb 27 19:48:34 btd kernel: sky2 eth1: addr 00:1a:92:23:4b:a6
Feb 27 19:48:34 btd kernel: sky2 eth0: enabling interface
Feb 27 19:48:34 btd kernel: sky2 eth0: ram buffer 48K
Feb 27 19:48:34 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
Feb 28 19:06:57 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
Feb 28 19:06:57 btd kernel: sky2 eth0: tx timeout
Feb 28 19:06:57 btd kernel: sky2 eth0: transmit ring 133 .. 110 report=133 done=133
Feb 28 19:06:57 btd kernel: sky2 eth0: disabling interface
Feb 28 19:06:57 btd kernel: sky2 eth0: enabling interface
Feb 28 19:06:57 btd kernel: sky2 eth0: ram buffer 48K
Feb 28 19:07:00 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
Mar  4 13:58:31 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar  4 13:58:31 btd kernel: sky2 eth0: tx timeout
Mar  4 13:58:31 btd kernel: sky2 eth0: transmit ring 353 .. 330 report=353 done=353
Mar  4 13:58:31 btd kernel: sky2 eth0: disabling interface
Mar  4 13:58:31 btd kernel: sky2 eth0: enabling interface
Mar  4 13:58:31 btd kernel: sky2 eth0: ram buffer 48K
Mar  4 13:58:34 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both

I only noticed the second of the two.

-- 
Rob
Re: [PATCH 3/3] NetXen: Make driver use multi PCI functions
On Saturday 03 March 2007 06:35, Jeff Garzik wrote:
> Linsys Contractor Mithlesh Thukral wrote:
> > NetXen: Make driver use multi PCI functions.
> >
> > Signed-off by: Mithlesh Thukral <[EMAIL PROTECTED]>
> >
> > ---
> >
> >  netxen_nic.h          |  126 +---
> >  netxen_nic_ethtool.c  |   80 +++
> >  netxen_nic_hdr.h      |    8
> >  netxen_nic_hw.c       |  213 +++-
> >  netxen_nic_hw.h       |   18 -
> >  netxen_nic_init.c     |  115 +++---
> >  netxen_nic_isr.c      |   80 +++
> >  netxen_nic_main.c     |  523 +-
> >  netxen_nic_niu.c      |   27 +-
> >  netxen_nic_phan_reg.h |  125 ---
> >  10 files changed, 631 insertions(+), 684 deletions(-)
>
> all three patches in this patchset contained nothing but one-line
> summaries of the changes included in them, and are overall very poorly
> and vaguely described.
>
> This patch is far too big, with far too little description and
> justification to go along with it.
>
> If you are not going to make the effort to write a paragraph or two
> describing such huge changes, then I'm not going to make the effort to
> review and apply it. NAK.

My apologies for the insufficient explanation of the patch.
I resent this patch some time ago.

Regards,
Mithlesh Thukral
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, Mar 04, 2007 at 05:17:29PM -0800, Greg KH wrote:
> I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is
> enabled with that patch. If that is enabled, and that patch still
> causes problems, please let me know.

But we still need to update the help text for CONFIG_SYSFS_DEPRECATED
to make it clear that its deprecation schedule still needs to be 2009
to 2011 (depending on whether we want to accommodate Debian's glacial
release schedule). Certainly the 2006 date which is currently there
simply isn't accurate.

						- Ted
[PATCH 3/3] NetXen: Fix ping failure of Jumbo frames on MEZ cards.
NetXen: Fix ping failure of Jumbo frames on MEZ cards.

Signed-off by: Mithlesh Thukral <[EMAIL PROTECTED]>
---
 drivers/net/netxen/netxen_nic_hw.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/netxen/netxen_nic_hw.c b/drivers/net/netxen/netxen_nic_hw.c
index 693d01a..81ebc81 100644
--- a/drivers/net/netxen/netxen_nic_hw.c
+++ b/drivers/net/netxen/netxen_nic_hw.c
@@ -962,7 +962,12 @@ int netxen_nic_set_mtu_gb(struct netxen_
 int netxen_nic_set_mtu_xgb(struct netxen_adapter *adapter, int new_mtu)
 {
 	new_mtu += NETXEN_NIU_HDRSIZE + NETXEN_NIU_TLRSIZE;
-	netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE, new_mtu);
+	if (adapter->portnum == 0)
+		netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE,
+				    new_mtu);
+	else if (adapter->portnum == 1)
+		netxen_nic_write_w0(adapter, NETXEN_NIU_XG1_MAX_FRAME_SIZE,
+				    new_mtu);
 	return 0;
 }