Re: [RFC PATCH]: Dynamically sized routing cache hash table.

2007-03-05 Thread Eric Dumazet

David Miller wrote:
> From: Eric Dumazet <[EMAIL PROTECTED]>
> Date: Tue, 06 Mar 2007 08:14:46 +0100
>
> > I wonder... are you sure this has no relation with the size of
> > rt_hash_locks / RT_HASH_LOCK_SZ ?
> > One entry must have the same lock in the two tables when resizing is in
> > flight.
> > #define MIN_RTHASH_SHIFT LOG2(RT_HASH_LOCK_SZ)
>
> Good point.
>
> > > +static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
> > > +{
> > > +   struct rt_hash_bucket *n;
> > > +
> > > +   if (sz <= PAGE_SIZE)
> > > +       n = kmalloc(sz, GFP_KERNEL);
> > > +   else if (hashdist)
> > > +       n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> > > +   else
> > > +       n = (struct rt_hash_bucket *)
> > > +           __get_free_pages(GFP_KERNEL, get_order(sz));
> >
> > I don't feel comfortable with this.
> > Maybe we could try __get_free_pages() and, in case of failure, fall back to
> > vmalloc(). Then keep a flag to be able to free memory correctly. Anyway, if
> > (get_order(sz)>=MAX_ORDER) we know __get_free_pages() will fail.
>
> We have to use vmalloc() for the hashdist case so that the pages
> are spread out properly on NUMA systems.  That's exactly what the
> large system hash allocator is going to do on bootup anyways.

Yes, but on bootup you have an appropriate NUMA active policy. (Well... we
hope so, but it broke several times in the past.)

I am not sure what kind of mm policy is active for scheduled work.

Anyway, I have some XX GB machines, non-NUMA, and I would love to be able to
have a 2^20-slot hash table without having to increase MAX_ORDER.
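
To put rough numbers on the MAX_ORDER concern, here is a minimal user-space
sketch of the get_order() arithmetic. It assumes 4 KiB pages, 8-byte buckets
(one pointer on a 64-bit machine) and MAX_ORDER = 11; all three are
assumptions made for illustration, not values taken from the patch, and
order_for_size() is a hypothetical stand-in for the kernel's get_order().

```c
/* Assumed values for illustration only; real kernels differ by arch/config. */
#define SKETCH_PAGE_SIZE 4096UL /* assumed 4 KiB pages */
#define SKETCH_MAX_ORDER 11     /* assumed; largest usable order is MAX_ORDER - 1 */

/*
 * Hypothetical stand-in for get_order(): the smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size.
 */
static unsigned int order_for_size(unsigned long size)
{
    unsigned int order = 0;
    unsigned long span = SKETCH_PAGE_SIZE;

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Returns 1 if a physically contiguous allocation of 'size' bytes can even be attempted. */
static int page_alloc_possible(unsigned long size)
{
    return order_for_size(size) < SKETCH_MAX_ORDER;
}
```

Under these assumptions a 2^20-slot table needs 8 MiB, which is order 11 and
therefore out of reach of the page allocator, while a 2^19-slot table
(4 MiB, order 10) would still fit.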




> Look, either both are right or both are wrong.  I'm just following
> protocol above and you'll note the PRECISE same logic exists in other
> dynamically growing hash table implementations such as
> net/xfrm/xfrm_hash.c




Yes, they are both wrong/dumb :)

Can we be smarter, or do we have to stay dumb ? :)

struct rt_hash_bucket *n = NULL;

if (sz <= PAGE_SIZE) {
    n = kmalloc(sz, GFP_KERNEL);
    *kind = allocated_by_kmalloc;
}
else if (!hashdist) {
    n = (struct rt_hash_bucket *)
        __get_free_pages(GFP_KERNEL, get_order(sz));
    *kind = allocated_by_get_free_pages;
}
if (!n) {
    n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
    *kind = allocated_by_vmalloc;
}
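
The matching free path is the reason for the 'kind' flag. The following is
only a user-space model of the idea, not kernel code: malloc() stands in for
both allocators, PRIMARY_LIMIT is a made-up threshold that mimics "contiguous
allocations above this size fail", and every name here is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Which allocator produced the table; recorded so the free path can match it. */
enum alloc_kind {
    ALLOCATED_BY_PRIMARY,  /* stands in for kmalloc()/__get_free_pages() */
    ALLOCATED_BY_FALLBACK  /* stands in for vmalloc() */
};

/* Hypothetical limit above which the "contiguous" allocator refuses. */
#define PRIMARY_LIMIT (4096UL * 1024UL)

/* Outstanding allocations per allocator, to check that frees are paired. */
static unsigned long primary_live, fallback_live;

static void *primary_alloc(size_t sz)
{
    void *p;

    if (sz > PRIMARY_LIMIT)
        return NULL;
    p = malloc(sz);
    if (p)
        primary_live++;
    return p;
}

static void *fallback_alloc(size_t sz)
{
    void *p = malloc(sz);

    if (p)
        fallback_live++;
    return p;
}

/* Try the primary allocator first, fall back on failure, remember which one won. */
static void *table_alloc(size_t sz, enum alloc_kind *kind)
{
    void *n = primary_alloc(sz);

    *kind = ALLOCATED_BY_PRIMARY;
    if (!n) {
        n = fallback_alloc(sz);
        *kind = ALLOCATED_BY_FALLBACK;
    }
    if (n)
        memset(n, 0, sz);
    return n;
}

/* Free through the allocator recorded at allocation time, not one guessed from the size. */
static void table_free(void *p, enum alloc_kind kind)
{
    if (!p)
        return;
    if (kind == ALLOCATED_BY_PRIMARY)
        primary_live--;
    else
        fallback_live--;
    free(p);
}
```

The point of the flag is that the free side no longer has to re-derive the
allocator from sz and hashdist, which would be wrong as soon as a fallback
has happened.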

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH]: Dynamically sized routing cache hash table.

2007-03-05 Thread David Miller
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Tue, 06 Mar 2007 08:14:46 +0100

> I wonder... are you sure this has no relation with the size of
> rt_hash_locks / RT_HASH_LOCK_SZ ?
> One entry must have the same lock in the two tables when resizing is in
> flight.
> #define MIN_RTHASH_SHIFT LOG2(RT_HASH_LOCK_SZ)

Good point.

> > +static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
> > +{
> > +   struct rt_hash_bucket *n;
> > +
> > +   if (sz <= PAGE_SIZE)
> > +       n = kmalloc(sz, GFP_KERNEL);
> > +   else if (hashdist)
> > +       n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> > +   else
> > +       n = (struct rt_hash_bucket *)
> > +           __get_free_pages(GFP_KERNEL, get_order(sz));
> 
> I don't feel comfortable with this.
> Maybe we could try __get_free_pages() and, in case of failure, fall back to
> vmalloc(). Then keep a flag to be able to free memory correctly. Anyway, if
> (get_order(sz)>=MAX_ORDER) we know __get_free_pages() will fail.

We have to use vmalloc() for the hashdist case so that the pages
are spread out properly on NUMA systems.  That's exactly what the
large system hash allocator is going to do on bootup anyways.

Look, either both are right or both are wrong.  I'm just following
protocol above and you'll note the PRECISE same logic exists in other
dynamically growing hash table implementations such as
net/xfrm/xfrm_hash.c

> Could you add const qualifiers to 'struct rt_hash *' in prototypes where 
> appropriate ?

Sure, no problem.

> Maybe for small tables (less than PAGE_SIZE/2), we could embed them in
> 'struct rt_hash'

Not worth the pain nor the in-kernel-image-space it would chew up,
in my opinion.

After you visit a handful of web sites you'll get beyond that
threshold.

> Could we group all static vars at the beginning of this file, so that we
> clearly see where we should place them, to avoid false sharing.

Sure.

> > +
> > +static void rt_hash_resize(unsigned int new_shift)

Damn, please don't quote such huge portions of a patch without any
comments, this has to go out to several thousand recipients you know
:-/


Re: [RFC PATCH]: Dynamically sized routing cache hash table.

2007-03-05 Thread Eric Dumazet

David Miller wrote:
> This is essentially a "port" of Nick Piggin's dcache hash table
> patches to the routing cache.  It solves the locking issues
> during table grow/shrink that I couldn't handle properly last
> time I tried to code up a patch like this.
>
> But one of the core issues of this kind of change still remains.
> There is a conflict between the desire of routing cache garbage
> collection to reach a state of equilibrium and the hash table
> grow code's desire to match the table size to the current state
> of affairs.
>
> Actually, more accurately, the conflict exists in how this GC
> logic is implemented.  The core issue is that hash table size
> guides the GC processing, and hash table growth therefore
> modifies those GC goals.  So with the patch below we'll just
> keep growing the hash table instead of giving GC some time to
> try to keep the working set in equilibrium before doing the
> hash grow.
>
> One idea is to put the hash grow check in the garbage collector,
> and put the hash shrink check in rt_del().
>
> In fact, it would be a good time to perhaps hack up some entirely
> new passive GC logic for the routing cache.
>
> BTW, another thing that plays into this is that Robert's TRASH work
> could make this patch not necessary :-)


Well, maybe... but after looking at Robert's TRASH, I discovered its model is
essentially a big (2^18 slots) root node (our hash table), and very few
order:1,2,3 nodes.

Almost all leaves... work in progress anyway.

Please find my comments in your patch.


> Finally, I know that (due to some of Nick's helpful comments the
> other day) that I'm missing some rcu_assign_pointer()'s in here.
> Fixes in this area are most welcome.
>
> This patch passes basic testing on UP sparc64, but please handle
> with care :)
>
> Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 0b3d7bf..57e004a 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -92,6 +92,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -242,28 +245,195 @@ static spinlock_t *rt_hash_locks;
>  # define rt_hash_lock_init()
>  #endif
>  
> -static struct rt_hash_bucket   *rt_hash_table;
> -static unsigned                rt_hash_mask;
> -static int                     rt_hash_log;
> -static unsigned int            rt_hash_rnd;
> +#define MIN_RTHASH_SHIFT 4


I wonder... are you sure this has no relation with the size of rt_hash_locks / 
RT_HASH_LOCK_SZ ?

One entry must have the same lock in the two tables when resizing is in flight.
#define MIN_RTHASH_SHIFT LOG2(RT_HASH_LOCK_SZ)
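
A small user-space sketch of why the minimum table size matters here. It
assumes that the bucket lock is selected by masking the bucket index with
RT_HASH_LOCK_SZ - 1; LOCK_SZ = 256 and the helper names are assumptions
chosen for illustration only.

```c
#define LOCK_SZ 256u /* assumed power-of-two number of bucket locks */

/* Bucket index of a hash value in a table with 2^shift buckets. */
static unsigned int bucket_of(unsigned int hash, unsigned int shift)
{
    return hash & ((1u << shift) - 1u);
}

/* Lock index derived from the bucket index, as a striped lock array would do. */
static unsigned int lock_of(unsigned int bucket)
{
    return bucket & (LOCK_SZ - 1u);
}

/* Returns 1 if 'hash' is protected by the same lock in the old and the new table. */
static int same_lock(unsigned int hash, unsigned int old_shift,
                     unsigned int new_shift)
{
    return lock_of(bucket_of(hash, old_shift)) ==
           lock_of(bucket_of(hash, new_shift));
}
```

As long as both tables have at least LOCK_SZ buckets, the lock index depends
only on the low bits of the hash and is identical in both tables. With
smaller tables (for example shift 4 versus shift 5) the same entry can end up
under two different locks while the transfer is in flight.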


> +#if BITS_PER_LONG == 32
> +#define MAX_RTHASH_SHIFT 24
> +#else
> +#define MAX_RTHASH_SHIFT 30
> +#endif
> +
> +struct rt_hash {
> +   struct rt_hash_bucket   *table;
> +   unsigned int            mask;
> +   unsigned int            log;
> +};
> +
> +struct rt_hash *rt_hash __read_mostly;
> +struct rt_hash *old_rt_hash __read_mostly;
> +static unsigned int rt_hash_rnd __read_mostly;
> +static DEFINE_SEQLOCK(resize_transfer_lock);
> +static DEFINE_MUTEX(resize_mutex);


I think a better model would be a structure, with one part containing 'read
mostly' data and another part containing 'highly modified' data, with
appropriate cache-line alignment.

For example, resize_transfer_lock should be in the first part, like rt_hash
and old_rt_hash, don't you think ?

All static data of this file should be placed in this single structure so that
we can easily avoid false sharing and have optimal placement.
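
Roughly what such a layout could look like, written as portable C11 so it can
be compiled outside the kernel. The structure name, the field names and the
64-byte line size are all assumptions; in the kernel the alignment would
presumably be expressed with ____cacheline_aligned_in_smp rather than
_Alignas.

```c
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache-line size */

struct table; /* opaque stand-in for struct rt_hash */

/* Hypothetical single home for the file's static data. */
struct rt_cache_state {
    /* Read-mostly part: written on resize, read on every lookup. */
    struct table *cur;
    struct table *old;
    unsigned int rnd;
    unsigned int resize_seq; /* stands in for the seqlock counter */

    /* Frequently written part, pushed onto its own cache line. */
    _Alignas(CACHE_LINE) unsigned long gc_last_run;
    unsigned long entries;
};
```

With this layout a writer updating the second part does not invalidate the
cache line that lookups read from, which is the false sharing being
discussed.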


 
>  static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
>  #define RT_CACHE_STAT_INC(field) \
>     (__raw_get_cpu_var(rt_cache_stat).field++)
>  
> -static int rt_intern_hash(unsigned hash, struct rtable *rth,
> -   struct rtable **res);
> +static void rt_hash_resize(unsigned int new_shift);
> +static void check_nr_rthash(void)
> +{
> +   unsigned int sz = rt_hash->mask + 1;
> +   unsigned int nr = atomic_read(&ipv4_dst_ops.entries);
> +
> +   if (unlikely(nr > (sz + (sz >> 1))))
> +       rt_hash_resize(rt_hash->log + 1);
> +   else if (unlikely(nr < (sz >> 1)))
> +       rt_hash_resize(rt_hash->log - 1);
> +}
>  
> -static unsigned int rt_hash_code(u32 daddr, u32 saddr)
> +static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
> +{
> +   struct rt_hash_bucket *n;
> +
> +   if (sz <= PAGE_SIZE)
> +       n = kmalloc(sz, GFP_KERNEL);
> +   else if (hashdist)
> +       n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> +   else
> +       n = (struct rt_hash_bucket *)
> +           __get_free_pages(GFP_KERNEL, get_order(sz));


I don't feel comfortable with this.
Maybe we could try __get_free_pages() and, in case of failure, fall back to
vmalloc(). Then keep a flag to be able to free memory correctly. Anyway, if
(get_order(sz)>=MAX_ORDER) we know __get_free_pages() will fail.





> +
> +   if (n)
> +       memset(n, 0, sz);
> +
> +   return n;
> +}
> +
> +static void rthash_free(struct rt_hash_bucket *r, unsigned int sz)
> +{
> +   if (sz <= PAGE_SIZE)
> +   

Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 08:03:50PM -0800, Greg KH wrote:
> On Mon, Mar 05, 2007 at 09:39:47PM -0600, Matt Mackall wrote:
> > On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> > > If so, can you disable the option and strace it to see what program is
> > > trying to access what?  That will put the
> > > HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> > > quickly :)
> > 
> > Ok, I've got straces of both good and bad (>5M each). Filtered out
> > random pointer values and the like, diffed, and filtered for /sys/,
> > and the result's still 1.5M. What should I be looking for?
> 
> Failures when trying to read from /sys/class/net/
> 
> Or opening the directory and iterating over the subdirs in there.  Or
> something like that.
> 
> But the /sys/class/net/ stuff should hopefully help narrow it down.

Works:

6857  open("/sys/class/net", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 13
6857  fstat64(13, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
6857  fcntl64(13, F_SETFD, FD_CLOEXEC)  = 0
6857  getdents64(13, /* 5 entries */, 4096) = 120
6857  readlink("/sys/class/net/eth1", 0x80a2450, 256) = -1 EINVAL (Invalid argument)
6857  readlink("/sys/class/net/eth1/device", "../../../devices/pci0000:00/0000:00:1e.0/0000:02:02.0", 256) = 53
6857  readlink("/sys/class/net/lo", 0x80a2450, 256) = -1 EINVAL (Invalid argument)
6857  readlink("/sys/class/net/lo/device", 0x80a2450, 256) = -1 ENOENT (No such file or directory)
6857  readlink("/sys/class/net/eth0", 0x80a2450, 256) = -1 EINVAL (Invalid argument)
6857  readlink("/sys/class/net/eth0/device", "../../../devices/pci0000:00/0000:00:1e.0/0000:02:01.0", 256) = 53
6857  getdents64(13, /* 0 entries */, 4096) = 0
6857  close(13) = 0

Breaks:

3620  open("/sys/class/net", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 13
3620  fstat64(13, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
3620  fcntl64(13, F_SETFD, FD_CLOEXEC)  = 0
3620  getdents64(13, /* 5 entries */, 4096) = 120
3620  readlink("/sys/class/net/eth1", "../../devices/pci0000:00/0000:00:1e.0/0000:02:02.0/eth1", 256) = 55
3620  readlink("/sys/devices/pci0000:00/0000:00:1e.0/0000:02:02.0/eth1/device", 0x809e910, 256) = -1 ENOENT (No such file or directory)
3620  readlink("/sys/class/net/lo", "../../devices/virtual/net/lo", 256) = 28
3620  readlink("/sys/devices/virtual/net/lo/device", 0x809e960, 256) = -1 ENOENT (No such file or directory)
3620  readlink("/sys/class/net/eth0", "../../devices/pci0000:00/0000:00:1e.0/0000:02:01.0/eth0", 256) = 55
3620  readlink("/sys/devices/pci0000:00/0000:00:1e.0/0000:02:01.0/eth0/device", 0x809e960, 256) = -1 ENOENT (No such file or directory)
3620  getdents64(13, /* 0 entries */, 4096) = 0
3620  close(13) = 0
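
For anyone reading the two traces: readlink() failing with EINVAL means the
path exists but is not a symlink, so in the working case the entries under
/sys/class/net are real directories, while in the broken case they are
symlinks into /sys/devices. A minimal sketch of that check follows;
classify_path() and the enum are hypothetical names, and the 256-byte buffer
simply mirrors the size seen in the traces.

```c
#include <errno.h>
#include <unistd.h>

enum path_kind { PATH_SYMLINK, PATH_NOT_SYMLINK, PATH_MISSING, PATH_ERROR };

/* Classify a path the same way the traced program implicitly does via readlink(). */
static enum path_kind classify_path(const char *path)
{
    char target[256];
    ssize_t n = readlink(path, target, sizeof(target));

    if (n >= 0)
        return PATH_SYMLINK;     /* new layout: class entry is a symlink */
    if (errno == EINVAL)
        return PATH_NOT_SYMLINK; /* old layout: class entry is a directory */
    if (errno == ENOENT)
        return PATH_MISSING;
    return PATH_ERROR;
}
```

Calling classify_path("/sys/class/net/eth0") on the two kernels would be
expected to give PATH_NOT_SYMLINK and PATH_SYMLINK respectively, going by the
traces above.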

-- 
Mathematics is the supreme nostalgia of our time.


Re: [PATCH] LVS: Send ICMP unreachable responses to end-users when real-servers are removed

2007-03-05 Thread David Miller
From: Horms <[EMAIL PROTECTED]>
Date: Sun, 11 Feb 2007 12:04:43 +0900

> this is a small patch by Janusz Krzysztofik to ip_route_output_slow()
> that allows VIP-less LVS linux director to generate packets originating
> from VIP if sysctl_ip_nonlocal_bind is set.
> 
> In a nutshell, the intention is for an LVS linux director to be able
> to send ICMP unreachable responses to end-users when real-servers are
> removed.
> 
> http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00106.html
> 
> I'm not really sure about the correctness of this approach,
> so I am sending it here to netdev for review
> 
> Cc: Janusz Krzysztofik <[EMAIL PROTECTED]>
> Signed-off-by: Simon Horman <[EMAIL PROTECTED]>

I'm not against this patch or the idea, I just want to
think about it some more to make sure there are not bad
unintended side effects to allowing this.

If someone else could provide some feedback or comments,
I'd very much appreciate that as well.


Re: [RFC] Arp announce (for Xen)

2007-03-05 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Thu, 1 Mar 2007 17:30:30 -0800

> What about implementing the unused arp_announce flag on the inetdevice?
> Something like the following.  Totally untested...
> 
> Looks like it either was there (and got removed) or was planned but
> never implemented.

This idea is fine.  But:

> + case NETDEV_CHANGEADDR:
> + /* Send gratuitous ARP in case of address change or new device */
> + if (IN_DEV_ARP_ANNOUNCE(in_dev))
> + arp_send(ARPOP_REQUEST, ETH_P_ARP,
> +  in_dev->ifa_list->ifa_address, dev,
> +  in_dev->ifa_list->ifa_address, NULL, 
> +  dev->dev_addr, NULL);

We'll need to make sure the appropriate 'arp_announce' address
selection is employed here.

One idea is to change arp_solicit() such that it can be invoked in
this context, or provide a new helper function which will do the
source address selection rules of 'arp_announce' and then invoke
arp_send() as appropriate for us.


[RFC PATCH]: Dynamically sized routing cache hash table.

2007-03-05 Thread David Miller

This is essentially a "port" of Nick Piggin's dcache hash table
patches to the routing cache.  It solves the locking issues
during table grow/shrink that I couldn't handle properly last
time I tried to code up a patch like this.

But one of the core issues of this kind of change still remains.
There is a conflict between the desire of routing cache garbage
collection to reach a state of equilibrium and the hash table
grow code's desire to match the table size to the current state
of affairs.

Actually, more accurately, the conflict exists in how this GC
logic is implemented.  The core issue is that hash table size
guides the GC processing, and hash table growth therefore
modifies those GC goals.  So with the patch below we'll just
keep growing the hash table instead of giving GC some time to
try to keep the working set in equilibrium before doing the
hash grow.
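
For reference, the resize rule that check_nr_rthash() applies in the patch
below can be modelled in user space like this. The 1.5x and 0.5x thresholds
and the shift limits 4 and 30 (the 64-bit case) come from the patch; the
clamping inside this function and the name next_shift() are assumptions of
the sketch.

```c
#define MIN_SHIFT 4  /* MIN_RTHASH_SHIFT in the patch */
#define MAX_SHIFT 30 /* MAX_RTHASH_SHIFT for BITS_PER_LONG == 64 */

/*
 * Given the current table shift and the number of cached entries, return
 * the shift the table would move to: grow when entries > 1.5 * buckets,
 * shrink when entries < 0.5 * buckets, otherwise stay put.
 */
static unsigned int next_shift(unsigned int shift, unsigned int nr)
{
    unsigned int sz = 1u << shift;

    if (nr > sz + (sz >> 1) && shift < MAX_SHIFT)
        return shift + 1;
    if (nr < (sz >> 1) && shift > MIN_SHIFT)
        return shift - 1;
    return shift;
}
```

For a 1024-bucket table this means growing above 1536 entries and shrinking
below 512, so any GC target expressed as a multiple of the table size moves
every time the table does.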

One idea is to put the hash grow check in the garbage collector,
and put the hash shrink check in rt_del().

In fact, it would be a good time to perhaps hack up some entirely
new passive GC logic for the routing cache.

BTW, another thing that plays into this is that Robert's TRASH work
could make this patch not necessary :-)

Finally, I know that (due to some of Nick's helpful comments the
other day) that I'm missing some rcu_assign_pointer()'s in here.
Fixes in this area are most welcome.
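
As a reminder of what the missing rcu_assign_pointer() calls are for, here is
a user-space analogue using C11 atomics. It is only a model: a release store
stands in for rcu_assign_pointer(), an acquire load stands in (conservatively)
for rcu_dereference(), and the structure and function names are hypothetical.
Reclaiming the old table after a grace period is deliberately left out.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct table {
    unsigned int mask;
    unsigned int log;
};

/* The published table pointer; NULL until the first publish. */
static _Atomic(struct table *) current_table;

/* Writer: fully initialise the new table, then publish it with release ordering. */
static int publish_table(unsigned int log)
{
    struct table *t = malloc(sizeof(*t));

    if (!t)
        return -1;
    t->log = log;
    t->mask = (1u << log) - 1u;
    /* Readers that observe 't' are guaranteed to observe its fields too. */
    atomic_store_explicit(&current_table, t, memory_order_release);
    return 0;
}

/* Reader: the acquire load pairs with the release store above. */
static unsigned int read_mask(void)
{
    struct table *t = atomic_load_explicit(&current_table,
                                           memory_order_acquire);

    return t ? t->mask : 0;
}
```

Without the ordering on the store, a reader on another CPU could see the new
pointer before the mask and log it points to have become visible.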

This patch passes basic testing on UP sparc64, but please handle
with care :)

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0b3d7bf..57e004a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -92,6 +92,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -242,28 +245,195 @@ static spinlock_t*rt_hash_locks;
 # define rt_hash_lock_init()
 #endif
 
-static struct rt_hash_bucket   *rt_hash_table;
-static unsigned                rt_hash_mask;
-static int                     rt_hash_log;
-static unsigned int            rt_hash_rnd;
+#define MIN_RTHASH_SHIFT 4
+#if BITS_PER_LONG == 32
+#define MAX_RTHASH_SHIFT 24
+#else
+#define MAX_RTHASH_SHIFT 30
+#endif
+
+struct rt_hash {
+   struct rt_hash_bucket   *table;
+   unsigned int            mask;
+   unsigned int            log;
+};
+
+struct rt_hash *rt_hash __read_mostly;
+struct rt_hash *old_rt_hash __read_mostly;
+static unsigned int rt_hash_rnd __read_mostly;
+static DEFINE_SEQLOCK(resize_transfer_lock);
+static DEFINE_MUTEX(resize_mutex);
 
 static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
 #define RT_CACHE_STAT_INC(field) \
(__raw_get_cpu_var(rt_cache_stat).field++)
 
-static int rt_intern_hash(unsigned hash, struct rtable *rth,
-   struct rtable **res);
+static void rt_hash_resize(unsigned int new_shift);
+static void check_nr_rthash(void)
+{
+   unsigned int sz = rt_hash->mask + 1;
+   unsigned int nr = atomic_read(&ipv4_dst_ops.entries);
+
+   if (unlikely(nr > (sz + (sz >> 1))))
+       rt_hash_resize(rt_hash->log + 1);
+   else if (unlikely(nr < (sz >> 1)))
+       rt_hash_resize(rt_hash->log - 1);
+}
 
-static unsigned int rt_hash_code(u32 daddr, u32 saddr)
+static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
+{
+   struct rt_hash_bucket *n;
+
+   if (sz <= PAGE_SIZE)
+       n = kmalloc(sz, GFP_KERNEL);
+   else if (hashdist)
+       n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
+   else
+       n = (struct rt_hash_bucket *)
+           __get_free_pages(GFP_KERNEL, get_order(sz));
+
+   if (n)
+       memset(n, 0, sz);
+
+   return n;
+}
+
+static void rthash_free(struct rt_hash_bucket *r, unsigned int sz)
+{
+   if (sz <= PAGE_SIZE)
+       kfree(r);
+   else if (hashdist)
+       vfree(r);
+   else
+       free_pages((unsigned long)r, get_order(sz));
+}
+
+static unsigned int rt_hash_code(struct rt_hash *hashtable,
+                                 u32 daddr, u32 saddr)
 {
    return (jhash_2words(daddr, saddr, rt_hash_rnd)
-       & rt_hash_mask);
+       & hashtable->mask);
 }
 
-#define rt_hash(daddr, saddr, idx) \
-   rt_hash_code((__force u32)(__be32)(daddr),\
+#define rt_hashfn(htab, daddr, saddr, idx) \
+   rt_hash_code(htab, (__force u32)(__be32)(daddr),\
 (__force u32)(__be32)(saddr) ^ ((idx) << 5))
 
+static unsigned int resize_new_shift;
+
+static void rt_hash_resize_work(struct work_struct *work)
+{
+   struct rt_hash *new_hash, *old_hash;
+   unsigned int new_size, old_size, transferred;
+   int i;
+
+   if (!mutex_trylock(&resize_mutex))
+       goto out;
+
+   new_hash = kmalloc(sizeof(struct rt_hash), GFP_KERNEL);
+   if (!new_hash)
+       goto out_unlock;
+
+   new_hash->log = resize_new_shift;
+   new_size = 1 << new_hash->log;
+   new_hash->mask = new_siz

Re: [PATCH] natsemi: netpoll fixes

2007-03-05 Thread Mark Huth

Mark Brown wrote:
> [Once more with CCs]
>
> On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:
>
> >  #ifdef CONFIG_NET_POLL_CONTROLLER
> >  static void natsemi_poll_controller(struct net_device *dev)
> >  {
> > + struct netdev_private *np = netdev_priv(dev);
> > +
> >   disable_irq(dev->irq);
> > - intr_handler(dev->irq, dev);
> > +
> > + /*
> > +  * A real interrupt might have already reached us at this point
> > +  * but NAPI might still haven't called us back.  As the interrupt
> > +  * status register is cleared by reading, we should prevent an
> > +  * interrupt loss in this case...
> > +  */
> > + if (!np->intr_status)
> > +         intr_handler(dev->irq, dev);
> > +
> >   enable_irq(dev->irq);
>
> Is it possible for this to run at the same time as the NAPI poll?  If so
> then it is possible for the netpoll poll to run between np->intr_status
> being cleared and netif_rx_complete() being called.  If the hardware
> asserts an interrupt at the wrong moment then this could cause the

Well, there is a whole task of analyzing the netpoll conditions under
SMP.  There appears to me to be a race between netpoll and NAPI on another
processor, given that netpoll can be called in virtually any system
state, such as on a debug breakpoint or at crash dump initiation.  I'm
spending some time looking into it, but don't have a smoking gun yet.
Regardless, if such a condition does exist, it is shared across many or
all of the potential netpolled devices.  Since that is exactly the
condition the suggested patch purports to solve, the patch is pointless if
the general NAPI/netpoll race exists.  Such a race would lead to various and
imaginative failures in the system.  So don't fix that problem in a
particular driver.  If it exists, fix it generally in the netpoll/NAPI
infrastructure.

> In any case, this is a problem independently of netpoll if the chip
> shares an interrupt with anything so the interrupt handler should be
> fixed to cope with this situation instead.
  
Yes, that would appear so.  If an interrupt line is shared with this 
device, then the interrupt handler can be called again, even though the 
device's interrupts are disabled on the interface.  So, in the actual 
interrupt handler, check the dev->state __LINK_STATE_SCHED flag - if 
it's set, leave immediately, it can't be our interrupt. If it's clear, 
read the irq enable hardware register.  If enabled, do the rest of the 
interrupt handler. Since the isr is disabled only by the interrupt 
handler, and enabled only by the poll routine, the race on the interrupt 
cause register is prevented.  And, as a byproduct, the netpoll race is 
also prevented.  You could just always read the isr enable hardware 
register, but that means you always do an operation to the chip, which 
can be painfully slow.  I guess the tradeoff depends on the probability 
of getting the isr called when NAPI is active for the device.
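
A rough user-space model of the check order described above, to make the
tradeoff concrete. Nothing here is natsemi code: struct fake_nic, its fields
and the function names are hypothetical, and the register access is reduced
to a counter so the cost of touching the chip is visible.

```c
#include <stdbool.h>

struct fake_nic {
    bool napi_scheduled;    /* models the __LINK_STATE_SCHED bit */
    bool hw_irq_enabled;    /* models the chip's interrupt-enable register */
    unsigned int reg_reads; /* counts (slow) accesses to the chip */
    unsigned int handled;   /* interrupts accepted as ours */
};

/* Models the slow read of the interrupt-enable register. */
static bool read_irq_enable(struct fake_nic *nic)
{
    nic->reg_reads++;
    return nic->hw_irq_enabled;
}

/* Returns true if the interrupt was treated as ours. */
static bool fake_intr_handler(struct fake_nic *nic)
{
    /* Cheap test first: if the poll routine owns the device, it is not ours. */
    if (nic->napi_scheduled)
        return false;

    /* Only now pay for a register access. */
    if (!read_irq_enable(nic))
        return false;

    /* A real handler would read the cause register here, then hand over to poll. */
    nic->hw_irq_enabled = false;
    nic->napi_scheduled = true;
    nic->handled++;
    return true;
}
```

A second call while the poll routine still owns the device returns at the
flag test without touching the chip, which is the saving compared with always
reading the enable register.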


If this results in netpoll not getting a packet right away, that's okay, 
since the netpoll users should try again.


Mark Huth



Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 09:39:47PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> > If so, can you disable the option and strace it to see what program is
> > trying to access what?  That will put the
> > HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> > quickly :)
> 
> Ok, I've got straces of both good and bad (>5M each). Filtered out
> random pointer values and the like, diffed, and filtered for /sys/,
> and the result's still 1.5M. What should I be looking for?

Failures when trying to read from /sys/class/net/

Or opening the directory and iterating over the subdirs in there.  Or
something like that.

But the /sys/class/net/ stuff should hopefully help narrow it down.

thanks,

greg k-h


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
> If so, can you disable the option and strace it to see what program is
> trying to access what?  That will put the
> HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> quickly :)

Ok, I've got straces of both good and bad (>5M each). Filtered out
random pointer values and the like, diffed, and filtered for /sys/,
and the result's still 1.5M. What should I be looking for?

-- 
Mathematics is the supreme nostalgia of our time.


Re: when having to acquire an SA, ipsec drops the packet

2007-03-05 Thread James Morris
On Mon, 5 Mar 2007, Joy Latten wrote:

> 5. Around the time the set of SAs for OUT direction are to be
>inserted into SAD, I see another ACQUIRE happening.
>
>I have not yet figured out where this second ACQUIRE comes from
>and why it happens. As long as the minimal SA or set of valid outgoing
>SAs exist in SAD, an ACQUIRE should not happen.

I saw something similar to this some time ago when testing various
failure modes, and discussed it with Herbert.

IIRC, there's a larval SA which is not torn down properly by Racoon once 
the full SA is established, and the larval SA keeps resending until it 
times out.



- James
-- 
James Morris
<[EMAIL PROTECTED]>


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
 
> Wait, have you confirmed that if you enable this config option,
> NetworkManager starts back up again and works properly?

Yep, probably should have mentioned that.

> If so, can you disable the option and strace it to see what program is
> trying to access what?  That will put the
> HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
> quickly :)

Did that a few hours ago, got a very large dump from both programs. No
smoking guns to my eye, but I'll send you the logs later.

-- 
Mathematics is the supreme nostalgia of our time.


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Matthew Garrett
On Mon, Mar 05, 2007 at 02:39:00PM -0800, Greg KH wrote:

> Ok, I only named HAL as that is what people have told me the problem is.
> I have been running this change on my boxes, without
> CONFIG_SYSFS_DEPRECATED since last July or so.
> 
> But I don't use NetworkManager here for the most part, but I have tried
> this in the OpenSuse10.3 alpha releases and it seems to work just fine
> with whatever version of NetworkManager it uses.

At a guess, you're carrying either a git snapshot or have backports from 
git. Several distributions do this, but until there's actually been a 
released version that works, it's a bit early to set a timescale.

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:30:21PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
> > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > > > 
> > > > Ok, how about the following patch.  Is it acceptable to everyone?
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > > 
> > > > ---
> > > >  init/Kconfig |   13 +++--
> > > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > > 
> > > > --- gregkh-2.6.orig/init/Kconfig
> > > > +++ gregkh-2.6/init/Kconfig
> > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> > > >   that belong to a class, back into the /sys/class heirachy, in
> > > >   order to support older versions of udev.
> > > >  
> > > > - If you are using a distro that was released in 2006 or later,
> > > > - it should be safe to say N here.
> > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > > > + release from 2007 or later, it should be safe to say N here.
> > > > +
> > > > + If you are using Debian or other distros that are slow to
> > > > + update HAL, please say Y here.
> > > >...
> > > 
> > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
> > > for all users, and schedule its removal for mid-2008 (or later).
> > > 
> > > 12 months after the first _release_ of a HAL that can live without seems 
> > > to be the first time when we can consider getting rid of it, since all 
> > > distributions with at least one release a year should ship it by then.
> > > 
> > > Currently, SYSFS_DEPRECATED is only a trap for users.
> > 
> > Huh?
> > 
> > No, again, I've been using this just fine for about 6 months now.
> > 
> > And what about all of the servers not using HAL/NetworkManager?
> > And what about all of the embedded systems not using either?
> > 
> > So to not allow this to be turned off by people who might want to (we
> > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
> > other distros released this year), is pretty heavy-handed.
> > 
> > It also will work in OpenSuSE 10.2 which is already released, and I
> > think Fedora 6, but I've only limited experience with these.
> > 
> > Oh, and Gentoo works just fine, and has been for the past 6 months.
> >
> > I would just prefer to come up with an acceptable set of wording that
> > will work to properly warn people.
> > 
> > I proposed one such wording which some people took as a slam against
> > Debian, which it really was not at all.
> > 
> > Does someone else want to propose some other wording instead?
> 
> Back up a bit. Let's review:
> 
> Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable

Wait, have you confirmed that if you enable this config option,
NetworkManager starts back up again and works properly?

If so, can you disable the option and strace it to see what program is
trying to access what?  That will put the
HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
quickly :)

thanks,

greg k-h


[PATCH] pcnet32: only allocate init_block dma consistent

2007-03-05 Thread Don Fry
The patch below moves the init_block out of the private struct and
only allocates init block with pci_alloc_consistent.

This has two effects:

1. Performance increase for non-cache-coherent machines, because the
   CPU-only data in the private struct is now cached

2. Locks now work on platforms that need to have locks
   in cached memory

Also use netdev_priv() instead of dev->priv

Signed-off-by: Thomas Bogendoerfer <[EMAIL PROTECTED]>
Acked-by: Don Fry <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
index 36f9d98..8498c3b 100644
--- a/drivers/net/pcnet32.c
+++ b/drivers/net/pcnet32.c
@@ -253,12 +253,12 @@ struct pcnet32_access {
  * so the structure should be allocated using pci_alloc_consistent().
  */
 struct pcnet32_private {
-   struct pcnet32_init_block init_block;
+   struct pcnet32_init_block *init_block;
    /* The Tx and Rx ring entries must be aligned on 16-byte boundaries in 32bit mode. */
    struct pcnet32_rx_head  *rx_ring;
    struct pcnet32_tx_head  *tx_ring;
-   dma_addr_t  dma_addr;/* DMA address of beginning of this
-  object, returned by pci_alloc_consistent */
+   dma_addr_t  init_dma_addr;/* DMA address of beginning of the init block,
+  returned by pci_alloc_consistent */
struct pci_dev  *pci_dev;
const char  *name;
/* The saved address of a sent-in-place packet/buffer, for skfree(). */
@@ -653,7 +653,7 @@ static void pcnet32_realloc_rx_ring(struct net_device *dev,
 
 static void pcnet32_purge_rx_ring(struct net_device *dev)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
int i;
 
/* free all allocated skbuffs */
@@ -681,7 +681,7 @@ static void pcnet32_poll_controller(struct net_device *dev)
 
 static int pcnet32_get_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
unsigned long flags;
int r = -EOPNOTSUPP;
 
@@ -696,7 +696,7 @@ static int pcnet32_get_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
 
 static int pcnet32_set_settings(struct net_device *dev, struct ethtool_cmd 
*cmd)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
unsigned long flags;
int r = -EOPNOTSUPP;
 
@@ -711,7 +711,7 @@ static int pcnet32_set_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
 static void pcnet32_get_drvinfo(struct net_device *dev,
struct ethtool_drvinfo *info)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
 
strcpy(info->driver, DRV_NAME);
strcpy(info->version, DRV_VERSION);
@@ -723,7 +723,7 @@ static void pcnet32_get_drvinfo(struct net_device *dev,
 
 static u32 pcnet32_get_link(struct net_device *dev)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
unsigned long flags;
int r;
 
@@ -743,19 +743,19 @@ static u32 pcnet32_get_link(struct net_device *dev)
 
 static u32 pcnet32_get_msglevel(struct net_device *dev)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
return lp->msg_enable;
 }
 
 static void pcnet32_set_msglevel(struct net_device *dev, u32 value)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
lp->msg_enable = value;
 }
 
 static int pcnet32_nway_reset(struct net_device *dev)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
unsigned long flags;
int r = -EOPNOTSUPP;
 
@@ -770,7 +770,7 @@ static int pcnet32_nway_reset(struct net_device *dev)
 static void pcnet32_get_ringparam(struct net_device *dev,
  struct ethtool_ringparam *ering)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
 
ering->tx_max_pending = TX_MAX_RING_SIZE;
ering->tx_pending = lp->tx_ring_size;
@@ -781,7 +781,7 @@ static void pcnet32_get_ringparam(struct net_device *dev,
 static int pcnet32_set_ringparam(struct net_device *dev,
 struct ethtool_ringparam *ering)
 {
-   struct pcnet32_private *lp = dev->priv;
+   struct pcnet32_private *lp = netdev_priv(dev);
unsigned long flags;
unsigned int size;
ulong ioaddr = dev->base_addr;
@@ -847,7 +847,7 @@ static int pcnet32_self_test_count(struct net_device *dev)
 static void pcnet32_ethtool_test(struct net_device *dev,
 struct ethtool_test *test, u64 * data)
 {
-   struct pcnet32_private *lp = dev->p

[PATCH ] pcnet32: Fix PCnet32 performance bug on non-coherent architectures

2007-03-05 Thread Don Fry
The PCnet32 driver always passed the size of the largest possible packet
to pci_dma_sync_single_for_cpu and pci_dma_sync_single_for_device.
This causes fairly large "collateral damage" in the caches and makes
the flush operation itself much slower.  On a system with a 40MHz CPU this
patch increases network bandwidth by about 12%.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>
Acked-by: Don Fry <[EMAIL PROTECTED]>

diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
index 36f9d98..4d94ba7 100644
--- a/drivers/net/pcnet32.c
+++ b/drivers/net/pcnet32.c
@@ -1234,14 +1234,14 @@ static void pcnet32_rx_entry(struct net_device *dev,
skb_put(skb, pkt_len);  /* Make room */
pci_dma_sync_single_for_cpu(lp->pci_dev,
lp->rx_dma_addr[entry],
-   PKT_BUF_SZ - 2,
+   pkt_len,
PCI_DMA_FROMDEVICE);
eth_copy_and_sum(skb,
 (unsigned char *)(lp->rx_skbuff[entry]->data),
 pkt_len, 0);
pci_dma_sync_single_for_device(lp->pci_dev,
   lp->rx_dma_addr[entry],
-  PKT_BUF_SZ - 2,
+  pkt_len,
   PCI_DMA_FROMDEVICE);
}
lp->stats.rx_bytes += skb->len;
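
To put a rough number on the saving described above, here is a small
userspace sketch that counts how many cache lines a sync of a given length
covers. The 32-byte line size and the 1544-byte PKT_BUF_SZ are assumptions
made for illustration; neither value is stated in the patch, and the helper
name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Both values are assumptions for illustration, not taken from the patch. */
#define TOY_CACHE_LINE_SZ 32u   /* hypothetical cache line size in bytes */
#define TOY_PKT_BUF_SZ    1544u /* assumed size of one receive buffer */

/*
 * Number of cache lines covered by a sync of `len` bytes, assuming the
 * buffer starts on a cache line boundary.
 */
static size_t toy_cache_lines_for_sync(size_t len)
{
	assert(TOY_CACHE_LINE_SZ != 0);
	return (len + TOY_CACHE_LINE_SZ - 1) / TOY_CACHE_LINE_SZ;
}
```

Under these assumptions a 64-byte frame touches 2 lines when synced by
pkt_len, against 49 lines when synced by TOY_PKT_BUF_SZ - 2.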



Re: when having to acquire an SA, ipsec drops the packet

2007-03-05 Thread Joy Latten
>From: Joy Latten <[EMAIL PROTECTED]>
>Date: Mon, 05 Feb 2007 14:53:39 -0600
>
>> I can run some tests with this patch and report any results... 
>
>Please check out the two most recent patches I posted:
>
>1) Updated core patch with ipv6 side added.
>2) Fix for thinko noticed by Venkat.

I have been testing this a lot in the lspp kernel.
Plan to test also in upstream kernel.
I am seeing a second ACQUIRE occur while establishing the SAs.

My scenario:
My policy states to use both the ESP and AH protocols (may not
make much sense but this was for testing purposes).  I get double 
SAs with only difference being SPI.

Here is what I see happening... 

1. Trigger first ACQUIRE via ping or netperf.

2. xfrm_lookup() calls xfrm_tmpl_resolve(), which calls xfrm_state_find().
   The first time around, we need to establish an SA, so a minimal SA
   gets allocated and put in the SAD, a timer is set for the minimal SA
   to be ACQUIRED, and km_query() gets called.
   
3. xfrm_tmpl_resolve() returns -EAGAIN, causing add_wait_queue(&km_waitq, &wait)
   and the code that follows it to be called, waiting for the SA to be established.
   As long as the minimal SA with XFRM_STATE_ACQUIRE is in SAD,
   we keep waiting...
   
4. First set of SAs (one for AH and ESP) for IN direction get inserted in SAD.
 
5. Around the time the set of SAs for OUT direction are to be
   inserted into SAD, I see another ACQUIRE happening.
   
   I have not yet figured out where this second ACQUIRE comes from
   and why it happens. As long as the minimal SA or set of valid outgoing
   SAs exist in SAD, an ACQUIRE should not happen.
   The minimal SA does not get removed from the SAD until the set 
   of SAs for OUT gets added and the xfrm_state_lock is
   released. And the lock pretty much guarantees no one else can step
   through the SAD until after the new SAs have been added...
   and if someone gets the lock to step through the SAD before the OUT SAs
   are added, the minimal SA is still there... 

6. Since this second ACQUIRE was able to happen, the result is two identical
   sets of SAs for the traffic stream. The SPIs are the only difference.

7. Noticed something while pasting the log info below.
   Perhaps when the outgoing AH SA is added, wake_up(&km_waitq) gets called,
   the lock is released, and the minimal SA is deleted (xfrm_state_add()).
   xfrm_tmpl_resolve() is then called and looks first for the outgoing
   ESP SA. Since it is not there yet and there is no minimal SA, km_query()
   results in an ACQUIRE just before the outgoing ESP SA gets added.

It would explain why I only see it when both ESP and AH are specified...
that is, if I am thinking correctly... 

Regards,
Joy Latten

From my log file:

Mar  5 19:10:02 racoon: INFO: initiate new phase 2 negotiation: 
9.3.192.210[500]<=>9.3.189.55[500]
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: AH/Transport 
9.3.189.55[0]->9.3.192.210[0] spi=137942922(0x838d78a)
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: ESP/Transport 
9.3.189.55[0]->9.3.192.210[0] spi=244321490(0xe900cd2)
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: AH/Transport 
9.3.192.210[0]->9.3.189.55[0] spi=38721750(0x24ed8d6)
Mar  5 19:10:03 racoon: INFO: initiate new phase 2 negotiation: 
9.3.192.210[500]<=>9.3.189.55[500]
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: ESP/Transport 
9.3.192.210[0]->9.3.189.55[0] spi=265079770(0xfcccbda)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: AH/Transport 
9.3.189.55[0]->9.3.192.210[0] spi=108627618(0x67986a2)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: ESP/Transport 
9.3.189.55[0]->9.3.192.210[0] spi=182973856(0xae7f5a0)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: AH/Transport 
9.3.192.210[0]->9.3.189.55[0] spi=58486297(0x37c6e19)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: ESP/Transport 
9.3.192.210[0]->9.3.189.55[0] spi=268295215(0xffddc2f)
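
The hypothesis in step 7 can be written down as a toy state model. This is
only a sketch of the suspected ordering, not the xfrm code: every name is
hypothetical, and it assumes the lookup wants the ESP SA first and that
adding the first outgoing SA removes the minimal (larval) SA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lookup order assumed for the model: ESP first, then AH. */
enum toy_proto { TOY_ESP, TOY_AH, TOY_NPROTO };

/* Toy view of the SAD for a single outgoing flow. */
struct toy_sad {
	bool larval;             /* minimal SA still present */
	bool out_sa[TOY_NPROTO]; /* real outgoing SA installed per protocol */
};

/*
 * Returns true if resolving the templates would trigger a new ACQUIRE:
 * the first missing SA is not covered by a minimal SA any more.
 */
static bool toy_lookup_needs_acquire(const struct toy_sad *sad)
{
	int p;

	assert(sad != NULL);
	for (p = 0; p < TOY_NPROTO; p++) {
		if (sad->out_sa[p])
			continue;
		return !sad->larval; /* larval present: just keep waiting */
	}
	return false; /* everything needed is installed */
}

/* Install one outgoing SA, optionally dropping the minimal SA with it. */
static void toy_add_out_sa(struct toy_sad *sad, enum toy_proto proto,
			   bool drop_larval)
{
	assert(sad != NULL);
	sad->out_sa[proto] = true;
	if (drop_larval)
		sad->larval = false;
}
```

In this model, adding the AH SA with drop_larval set leaves a window in
which a lookup asks for a second ACQUIRE; keeping the minimal SA until
both SAs are installed closes the window. Whether the kernel really
behaves this way is exactly the open question above.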


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:30:21PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
> > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > > > 
> > > > Ok, how about the following patch.  Is it acceptable to everyone?
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > > 
> > > > ---
> > > >  init/Kconfig |   13 +++--
> > > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > > 
> > > > --- gregkh-2.6.orig/init/Kconfig
> > > > +++ gregkh-2.6/init/Kconfig
> > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> > > >   that belong to a class, back into the /sys/class heirachy, in
> > > >   order to support older versions of udev.
> > > >  
> > > > - If you are using a distro that was released in 2006 or later,
> > > > - it should be safe to say N here.
> > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > > > + release from 2007 or later, it should be safe to say N here.
> > > > +
> > > > + If you are using Debian or other distros that are slow to
> > > > + update HAL, please say Y here.
> > > >...
> > > 
> > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
> > > for all users, and schedule its removal for mid-2008 (or later).
> > > 
> > > 12 months after the first _release_ of a HAL that can live without seems 
> > > to be the first time when we can consider getting rid of it, since all 
> > > distributions with at least one release a year should ship it by then.
> > > 
> > > Currently, SYSFS_DEPRECATED is only a trap for users.
> > 
> > Huh?
> > 
> > No, again, I've been using this just fine for about 6 months now.
> > 
> > And what about all of the servers not using HAL/NetworkManager?
> > And what about all of the embedded systems not using either?
> > 
> > So to not allow this to be turned off by people who might want to (we
> > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
> > other distros released this year), is pretty heavy-handed.
> > 
> > It also will work in OpenSuSE 10.2 which is already released, and I
> > think Fedora 6, but I've only limited experience with these.
> > 
> > Oh, and Gentoo works just fine, and has been for the past 6 months.
> >
> > I would just prefer to come up with an acceptable set of wording that
> > will work to properly warn people.
> > 
> > I proposed one such wording which some people took as a slam against
> > Debian, which it really was not at all.
> > 
> > Does someone else want to propose some other wording instead?
> 
> Back up a bit. Let's review:
> 
> Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable
> 
> Theory A: It broke because I'm not running an as-yet-unreleased HAL.
> 
>  Then we should revert the patch pronto because it's an unqualified
>  regression.
> 
> Theory B: It broke because I'm not running relatively recent HAL.
> 
>  By all accounts I'm running the latest and greatest HAL and Network
>  Manager, more than recent enough to work.
> 
> Theory C: It broke because I've got some goofy config.
> 
>  My setup passes no arguments to either. The HAL config file is
>  completely bare-bones and there's no sign of any configuration files
>  for Network Manager.
> 
> Theory D: It broke for some nebulous Debian-related reason.
> 
>  That's a bunch of unhelpful crap.
> 

> Can we come up with an actual theory for what's wrong with my setup, please?
> Like, perhaps:
> 
> Theory E: There's some undiagnosed new breakage that this introduces
> that no one else hit until it went into mainline.

Theory F:  It broke because you are using NetworkManager for your
network devices and the patches that fix this have not made it into a
real release?

I'm just guessing, but does anyone who is having this problem, NOT using
NetworkManager?

I'm running an old version of HAL just fine, but I'm not using
NetworkManager here.

I am using NetworkManager on a OpenSuSE 10.3 release, but suse's version
of NetworkManager is well known to not be anywhere near what is released
as a tarball :(

thanks,

greg k-h


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
> On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > > 
> > > Ok, how about the following patch.  Is it acceptable to everyone?
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > > 
> > > ---
> > >  init/Kconfig |   13 +++--
> > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > 
> > > --- gregkh-2.6.orig/init/Kconfig
> > > +++ gregkh-2.6/init/Kconfig
> > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> > > that belong to a class, back into the /sys/class heirachy, in
> > > order to support older versions of udev.
> > >  
> > > -   If you are using a distro that was released in 2006 or later,
> > > -   it should be safe to say N here.
> > > +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > > +   release from 2007 or later, it should be safe to say N here.
> > > +
> > > +   If you are using Debian or other distros that are slow to
> > > +   update HAL, please say Y here.
> > >...
> > 
> > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
> > for all users, and schedule its removal for mid-2008 (or later).
> > 
> > 12 months after the first _release_ of a HAL that can live without seems 
> > to be the first time when we can consider getting rid of it, since all 
> > distributions with at least one release a year should ship it by then.
> > 
> > Currently, SYSFS_DEPRECATED is only a trap for users.
> 
> Huh?
> 
> No, again, I've been using this just fine for about 6 months now.
> 
> And what about all of the servers not using HAL/NetworkManager?
> And what about all of the embedded systems not using either?
> 
> So to not allow this to be turned off by people who might want to (we
> want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
> other distros released this year), is pretty heavy-handed.
> 
> It also will work in OpenSuSE 10.2 which is already released, and I
> think Fedora 6, but I've only limited experience with these.
> 
> Oh, and Gentoo works just fine, and has been for the past 6 months.
>
> I would just prefer to come up with an acceptable set of wording that
> will work to properly warn people.
> 
> I proposed one such wording which some people took as a slam against
> Debian, which it really was not at all.
> 
> Does someone else want to propose some other wording instead?

Back up a bit. Let's review:

Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable

Theory A: It broke because I'm not running an as-yet-unreleased HAL.

 Then we should revert the patch pronto because it's an unqualified
 regression.

Theory B: It broke because I'm not running relatively recent HAL.

 By all accounts I'm running the latest and greatest HAL and Network
 Manager, more than recent enough to work.

Theory C: It broke because I've got some goofy config.

 My setup passes no arguments to either. The HAL config file is
 completely bare-bones and there's no sign of any configuration files
 for Network Manager.

Theory D: It broke for some nebulous Debian-related reason.

 That's a bunch of unhelpful crap.

Can we come up with an actual theory for what's wrong with my setup, please?
Like, perhaps:

Theory E: There's some undiagnosed new breakage that this introduces
that no one else hit until it went into mainline.

 Hmmm, this one sounds more promising.

-- 
Mathematics is the supreme nostalgia of our time.


[UDP]: Clean up UDP-Lite receive checksum

2007-03-05 Thread Herbert Xu
Hi Dave:

[UDP]: Clean up UDP-Lite receive checksum

This patch eliminates some duplicate code for the verification of
receive checksums between UDP-Lite and UDP.  It does this by
introducing __skb_checksum_complete_head which is identical to
__skb_checksum_complete apart from the fact that it takes
a length parameter rather than computing the first skb->len bytes.

As a result UDP-Lite will be able to use hardware checksum offload
for packets which do not use partial coverage checksums.  It also
means that UDP-Lite loopback no longer does unnecessary checksum
verification.

If any NICs start supporting UDP-Lite this would also start working
automatically.

This patch removes the assumption that msg_flags has MSG_TRUNC clear
upon entry in recvmsg.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
b830b85a68b42ce10139a7a9e405622e809b8de7
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4ff3940..658dfad 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1381,6 +1381,7 @@ static inline void skb_set_timestamp(struct sk_buff *skb, 
const struct timeval *
 
 extern void __net_timestamp(struct sk_buff *skb);
 
+extern __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len);
 extern __sum16 __skb_checksum_complete(struct sk_buff *skb);
 
 /**
diff --git a/include/net/udp.h b/include/net/udp.h
index 1b921fa..4a9699f 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -72,10 +72,7 @@ struct sk_buff;
  */
 static inline __sum16 __udp_lib_checksum_complete(struct sk_buff *skb)
 {
-   if (! UDP_SKB_CB(skb)->partial_cov)
-   return __skb_checksum_complete(skb);
-   return csum_fold(skb_checksum(skb, 0, UDP_SKB_CB(skb)->cscov,
- skb->csum));
+   return __skb_checksum_complete_head(skb, UDP_SKB_CB(skb)->cscov);
 }
 
 static inline int udp_lib_checksum_complete(struct sk_buff *skb)
diff --git a/include/net/udplite.h b/include/net/udplite.h
index 67ac514..89aa2bd 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -47,11 +47,10 @@ static inline int udplite_checksum_init(struct sk_buff 
*skb, struct udphdr *uh)
return 1;
}
 
-UDP_SKB_CB(skb)->partial_cov = 0;
cscov = ntohs(uh->len);
 
if (cscov == 0)  /* Indicates that full coverage is required. */
-   cscov = skb->len;
+   ;
else if (cscov < 8  || cscov > skb->len) {
/*
 * Coverage length violates RFC 3828: log and discard silently.
@@ -60,42 +59,16 @@ static inline int udplite_checksum_init(struct sk_buff 
*skb, struct udphdr *uh)
   cscov, skb->len);
return 1;
 
-   } else if (cscov < skb->len)
+   } else if (cscov < skb->len) {
UDP_SKB_CB(skb)->partial_cov = 1;
-
-UDP_SKB_CB(skb)->cscov = cscov;
-
-   /*
-* There is no known NIC manufacturer supporting UDP-Lite yet,
-* hence ip_summed is always (re-)set to CHECKSUM_NONE.
-*/
-   skb->ip_summed = CHECKSUM_NONE;
+   UDP_SKB_CB(skb)->cscov = cscov;
+   if (skb->ip_summed == CHECKSUM_COMPLETE)
+   skb->ip_summed = CHECKSUM_NONE;
+}
 
return 0;
 }
 
-static __inline__ int udplite4_csum_init(struct sk_buff *skb, struct udphdr 
*uh)
-{
-   int rc = udplite_checksum_init(skb, uh);
-
-   if (!rc)
-   skb->csum = csum_tcpudp_nofold(skb->nh.iph->saddr,
-  skb->nh.iph->daddr,
-  skb->len, IPPROTO_UDPLITE, 0);
-   return rc;
-}
-
-static __inline__ int udplite6_csum_init(struct sk_buff *skb, struct udphdr 
*uh)
-{
-   int rc = udplite_checksum_init(skb, uh);
-
-   if (!rc)
-   skb->csum = ~csum_unfold(csum_ipv6_magic(&skb->nh.ipv6h->saddr,
-&skb->nh.ipv6h->daddr,
-skb->len, IPPROTO_UDPLITE, 0));
-   return rc;
-}
-
 static inline int udplite_sender_cscov(struct udp_sock *up, struct udphdr *uh)
 {
int cscov = up->len;
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 186212b..cb056f4 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -411,11 +411,11 @@ fault:
return -EFAULT;
 }
 
-__sum16 __skb_checksum_complete(struct sk_buff *skb)
+__sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
 {
__sum16 sum;
 
-   sum = csum_fold(skb_checksum(skb, 0, skb->len, skb->csum));
+   sum = csum_fold(skb_checksum(skb, 0, len, skb->csum));
if (likely(!sum)) {
if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
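
For readers unfamiliar with the idea of checksumming only the first len
bytes, here is a userspace sketch of an Internet-style checksum taking a
length parameter. It is not the kernel's skb_checksum(); the function names
are hypothetical and the buffer is a plain byte array rather than an skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement checksum over the first `len` bytes of `buf`,
 * folded to 16 bits and inverted. Intended for packet-sized buffers,
 * so the 32-bit accumulator cannot overflow.
 */
static uint16_t toy_csum_head(const uint8_t *buf, size_t buf_len, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL && len <= buf_len);

	/* Add up big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

	/* An odd trailing byte is padded with a zero byte. */
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* Full coverage is the special case len == buf_len. */
static uint16_t toy_csum_full(const uint8_t *buf, size_t buf_len)
{
	return toy_csum_head(buf, buf_len, buf_len);
}
```

The full-coverage helper being a one-line wrapper is the same shape of
sharing the patch description aims for between UDP and UDP-Lite.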
 

Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Andrew Morton
On Mon, 5 Mar 2007 17:17:09 -0800
Greg KH <[EMAIL PROTECTED]> wrote:

> On Mon, Mar 05, 2007 at 05:08:49PM -0800, Andrew Morton wrote:
> > On Mon, 5 Mar 2007 19:56:25 -0500
> > Theodore Tso <[EMAIL PROTECTED]> wrote:
> > 
> > > So the question really is are we really done making changes to sysfs,
> > > or maybe what we should do is talk about major version numbers to
> > > sysfs.
> > 
> > Perhaps using a config option wasn't the right way to do this - a kernel
> > boot parameter might be better.
> 
> Ok, I have no problem with that if people really want it.  But give me
> the option to also make it a config option so I don't have to change our
> bootloaders too.

Sometimes we provide a config option which provides the default version of
the boot option.  So:

CONFIG_SYSFS_VERSION=1.2

and

if (user_provided_sysfs_version == NULL)
user_provided_sysfs_version = CONFIG_SYSFS_VERSION;


> Does that sound acceptable?

If we make CONFIG_SYSFS_DEPRECATED just a boolean boot option then that
fixes this problem (we hope) but won't help us next time we want to change
something.

It all depends on whether sysfs is finished yet ;)
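
A userspace sketch of the "config option provides the default for the boot
option" pattern described above. The parameter name sysfs_version= comes
from this thread; the parsing helper, the macro name and the "1.2" default
are hypothetical and only mirror the example values used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the build-time default (CONFIG_SYSFS_VERSION above). */
#define TOY_CONFIG_SYSFS_VERSION "1.2"

/*
 * Copy the value of "sysfs_version=" from a space-separated command
 * line into `out`. If the parameter is absent, fall back to the
 * build-time default. The value is truncated to fit `out`.
 */
static void toy_sysfs_version(const char *cmdline, char *out, size_t out_sz)
{
	static const char key[] = "sysfs_version=";
	const size_t key_len = sizeof(key) - 1;
	const char *p = cmdline;

	assert(out != NULL && out_sz > 0);

	while (p != NULL && *p != '\0') {
		while (*p == ' ')
			p++;
		if (strncmp(p, key, key_len) == 0) {
			size_t n = strcspn(p + key_len, " ");

			if (n >= out_sz)
				n = out_sz - 1;
			memcpy(out, p + key_len, n);
			out[n] = '\0';
			return;
		}
		p += strcspn(p, " "); /* skip to the end of this token */
	}

	/* Nothing supplied by the user: use the configured default. */
	snprintf(out, out_sz, "%s", TOY_CONFIG_SYSFS_VERSION);
}
```

With this shape a distro can keep its bootloader configuration untouched
and rely on the built-in default, while a user can still override it from
the command line.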


[UDP6]: Restore sk_filter optimisation

2007-03-05 Thread Herbert Xu
Hi Dave:

[UDP6]: Restore sk_filter optimisation

This reverts the changeset

[IPV6]: UDPv6 checksum.

We always need to check UDPv6 checksum because it is mandatory.

The sk_filter optimisation has nothing to do whether we verify the
checksum.  It simply postpones it to the point when the user calls
recv or poll.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0ad4719..4474480 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -279,8 +279,10 @@ int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff 
*skb)
}
}
 
-   if (udp_lib_checksum_complete(skb))
-   goto drop;
+   if (sk->sk_filter) {
+   if (udp_lib_checksum_complete(skb))
+   goto drop;
+   }
 
if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) {
/* Note that an ENOMEM error is charged twice */


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 05:08:49PM -0800, Andrew Morton wrote:
> On Mon, 5 Mar 2007 19:56:25 -0500
> Theodore Tso <[EMAIL PROTECTED]> wrote:
> 
> > So the question really is are we really done making changes to sysfs,
> > or maybe what we should do is talk about major version numbers to
> > sysfs.
> 
> Perhaps using a config option wasn't the right way to do this - a kernel
> boot parameter might be better.

Ok, I have no problem with that if people really want it.  But give me
the option to also make it a config option so I don't have to change our
bootloaders too.

Does that sound acceptable?

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:56:25PM -0500, Theodore Tso wrote:
> On Mon, Mar 05, 2007 at 04:37:15PM -0800, Greg KH wrote:
> > But I AM TRYING TO MAKE IT COMPATIBLE!!!
> > 
> > That's what that config option is there for.  If you happen to be
> > running a newer userspace, a different distro than what is in Debian
> > right now, or don't use HAL and Networkmanager, then disable that
> > option.  Then all of sysfs looks just like it used to, no user-visible
> > changes at all.  It doesn't get any more compatible than that.
> 
> This is great, but I think the real problem isn't the config option,
> but what is changing if the config option isn't enabled.  The claim
> which some, including Matt and Bron, seem to be making is that if you
> turn *off* CONFIG_SYSFS_DEPRECATED, you must be using at least hal
> 0.5.9-rc1, released ***yesterday***, or suffer breakages for at least
> some system configurations.

Ok, well that has been proven incorrect.  I originally thought it was
HAL that had the problem, but I think that is not true, as I am using
the older version of hal here (0.5.7.1) just fine.

> So the problem with putting a date in Kconfig.txt help file, or in
> Documentation/feature-removal-schedule.txt, is that if there are other
> incompatible changes which are added to sysfs in say, December 2007 or
> January 2008, but which are papered over with CONFIG_SYSFS_DEPRECATED,
> and then come June 2008, CONFIG_SYSFS_DEPRECATED is unceremoniously
> ripped out, then users will get screwed.  
> 
> So the question really is are we really done making changes to sysfs,
> or maybe what we should do is talk about major version numbers to
> sysfs.  Call what we have currently not CONFIG_SYSFS_DEPRECATED, but
> rather CONFIG_SYSFS_LAYOUT_1.  At the moment, CONFIG_SYSFS_LAYOUT_2 is
> undergoing changes, but at some point we need to lock down and state
> that Layout version 2 is never going to change, and then people who
> want changes can go work on CONFIG_SYSFS_LAYOUT_3.  
> 
> The problem with calling CONFIG_SYSFS_DEPRECATED is that people think
> that since it's deprecated, it should be turned off, but if we have
> staged major version numbers, with guarantees of absolute stability
> once a particular major version number is locked down, then it may
> make it a lot easier to talk about what version of hal and udev and
> Network Manager is really needed for different versions.  

This is what Documentation/ABI/ has tried to nail down; unfortunately it
has turned out to be very hard to track down all of the odd userspace
programs that use sysfs and see what they are relying on.  We are slowly
fixing things, as the OpenSuSE and Gentoo releases show.

And I'll be the first to admit that the ABI/ directory needs some
fleshing out...

And it isn't really a whole different layout, the only problem here is
that a directory has turned into a symlink, so programs that were not
written that well (and I'll be the first to admit that I made the same
mistake in udev many years ago) can't handle the change.
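
As an illustration of the directory-versus-symlink point, here is a
userspace sketch of two ways a program might test "is this path a
directory". Which check the affected programs actually use is not stated
in this thread, so treat the fragile variant as an assumed failure mode;
both function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Robust check: stat() follows symlinks, so this returns nonzero both
 * for a real directory and for a symlink that points at one.
 */
static int toy_is_dir_following_links(const char *path)
{
	struct stat st;

	assert(path != NULL);
	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Fragile check: lstat() reports on the link itself, so a directory
 * that has been replaced by a symlink stops counting as a directory.
 */
static int toy_is_dir_no_follow(const char *path)
{
	struct stat st;

	assert(path != NULL);
	return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

A program written around the second form sees the same sysfs path change
type when the directory becomes a symlink; one written around the first
form does not notice.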

So numerous programs "just work" fine, but for a limited few, they have
problems, hence the config option so that nothing will break.

And if you look in the ABI/ directory, it describes this usage of the
class devices in sysfs.  But again, no one is flushing out the users of
these features, or even reading the stuff that is there...

So, again, a better wording for the CONFIG help text anyone?  Or a
better name for the CONFIG value itself?

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Andrew Morton
On Mon, 5 Mar 2007 19:56:25 -0500
Theodore Tso <[EMAIL PROTECTED]> wrote:

> So the question really is are we really done making changes to sysfs,
> or maybe what we should do is talk about major version numbers to
> sysfs.

Perhaps using a config option wasn't the right way to do this - a kernel
boot parameter might be better.

In fact, one could envisage a kernel boot parameter "sysfs_version=N" which 
will allow distro people to select the sysfs-of-the-day which works with their
userspace.

Because it does appear that we need _something_ which will get us away from this
ongoing problem of needing to keep the kernel and userspace synchronised across
sysfs changes.


Re: [UDP]: Reread uh pointer after pskb_trim

2007-03-05 Thread David Miller
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Tue, 6 Mar 2007 12:00:20 +1100

> Hi Dave:
> 
> [UDP]: Reread uh pointer after pskb_trim
> 
> The header may have moved when trimming.
> 
> Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Good catch, I'll apply this and push to -stable, thanks
Herbert.


[UDP]: Reread uh pointer after pskb_trim

2007-03-05 Thread Herbert Xu
Hi Dave:

[UDP]: Reread uh pointer after pskb_trim

The header may have moved when trimming.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ce6c460..fc620a7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1215,6 +1215,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head 
udptable[],
 
if (ulen < sizeof(*uh) || pskb_trim_rcsum(skb, ulen))
goto short_packet;
+   uh = skb->h.uh;
 
udp4_csum_init(skb, uh);
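
The reason for rereading the pointer can be shown with a toy buffer whose
trim operation moves the data. This is a userspace sketch, not skb code:
the struct and helpers are hypothetical, and the trim here always
relocates, whereas the patch only says the header may have moved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: owns a heap allocation of `len` bytes. */
struct toy_buf {
	unsigned char *data;
	size_t len;
};

/* The "header" is simply the start of the data. */
static unsigned char *toy_hdr(const struct toy_buf *b)
{
	assert(b != NULL);
	return b->data;
}

/*
 * Trim the buffer to `new_len` bytes by copying into a fresh
 * allocation, so any pointer taken before the call is stale.
 * Returns 0 on success, -1 on allocation failure.
 */
static int toy_trim(struct toy_buf *b, size_t new_len)
{
	unsigned char *n;

	assert(b != NULL && new_len <= b->len);
	n = malloc(new_len ? new_len : 1);
	if (n == NULL)
		return -1;
	memcpy(n, b->data, new_len);
	free(b->data);
	b->data = n;
	b->len = new_len;
	return 0;
}
```

A caller that cached toy_hdr() before toy_trim() has to fetch it again
afterwards, which is the same rule the one-line patch applies to uh.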
 


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Theodore Tso
On Mon, Mar 05, 2007 at 04:37:15PM -0800, Greg KH wrote:
> But I AM TRYING TO MAKE IT COMPATIBLE!!!
> 
> That's what that config option is there for.  If you happen to be
> running a newer userspace, a different distro than what is in Debian
> right now, or don't use HAL and Networkmanager, then disable that
> option.  Then all of sysfs looks just like it used to, no user-visible
> changes at all.  It doesn't get any more compatible than that.

This is great, but I think the real problem isn't the config option,
but what is changing if the config option isn't enabled.  The claim
which some, including Matt and Bron, seem to be making is that if you
turn *off* CONFIG_SYSFS_DEPRECATED, you must be using at least hal
0.5.9-rc1, released ***yesterday***, or suffer breakages for at least
some system configurations.

So the problem with putting a date in Kconfig.txt help file, or in
Documentation/feature-removal-schedule.txt, is that if there are other
incompatible changes which are added to sysfs in say, December 2007 or
January 2008, but which are papered over with CONFIG_SYSFS_DEPRECATED,
and then come June 2008, CONFIG_SYSFS_DEPRECATED is unceremoniously
ripped out, then users will get screwed.  

So the real question is whether we are done making changes to sysfs,
or whether we should instead talk about major version numbers for
sysfs.  Call what we have currently not CONFIG_SYSFS_DEPRECATED, but
rather CONFIG_SYSFS_LAYOUT_1.  At the moment, CONFIG_SYSFS_LAYOUT_2 is
undergoing changes, but at some point we need to lock it down and state
that Layout version 2 is never going to change, and then people who
want changes can go work on CONFIG_SYSFS_LAYOUT_3.

The problem with calling it CONFIG_SYSFS_DEPRECATED is that people think
that since it's deprecated, it should be turned off.  If instead we have
staged major version numbers, with guarantees of absolute stability
once a particular major version number is locked down, then it may
be a lot easier to talk about which versions of hal, udev, and
NetworkManager are really needed for each layout version.

- Ted


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Bron Gondwana
On Mon, Mar 05, 2007 at 03:14:25PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
> > > That's not the point. The point is that Debian/unstable as of _this
> > > morning_ doesn't work. For reference, I'm running both the latest
> > > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
> > > there are people telling me I need a copy of HAL out of git that
> > > hasn't even been released for Debian to package. Debian isn't the
> > > problem here.
> > 
> >   hal 0.5.9-rc1 (released, not from git) should work. It will be
> > problably released soon and picked by sane distributions. Debian is very
> > irritating corner case.
> 
> Presumably the -rc1 stands for "release candidate". Which means "not
> yet released". And when did it show up? 04-Mar-2007 at 18:31. That's
> right, YESTERDAY. Almost a full month after Greg's commit.
> 
> For the last time, DEBIAN IS NOT THE PROBLEM.

Can I please second this (having been burned by the hell that was udev of
the 0.5ish era) - Greg, please try to make changes in a cross-compatible
way so that versions of userspace and kernel are not so closely
dependent on tracking each other.  The whole 2.6.8 -> 2.6.12 series of
kernels and associated udevs is fraught with race conditions where
upgrading one but not the other will leave your machine unbootable.

I read the "manifesto" for udev showing how crap devfs was, it was
broken, it could never be fixed etc - yet my experience was that devfs
systems "just worked"[tm] and udev was very dangerous.  My thinking is
going to be tarnished by that for a while and my mental image of udev
is "unreliable POS".  I'm hoping enough good experiences with udev might
make me feel less scared whenever I have to deal with it.

Similarly, I'm hoping I don't have to think "oh shit, will this break
boot" every time I upgrade either a kernel or hal version for the next
year, because it would really suck to do that all over again.  It
contributes to the meme that linux is unreliable and perpetually
unstable.

Regards,

Bron.


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Tue, Mar 06, 2007 at 01:35:41AM +0100, Adrian Bunk wrote:
> On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
> > On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> > > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > > > 
> > > > Ok, how about the following patch.  Is it acceptable to everyone?
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > > 
> > > > ---
> > > >  init/Kconfig |   13 +++--
> > > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > > 
> > > > --- gregkh-2.6.orig/init/Kconfig
> > > > +++ gregkh-2.6/init/Kconfig
> > > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> > > >   that belong to a class, back into the /sys/class heirachy, in
> > > >   order to support older versions of udev.
> > > >  
> > > > - If you are using a distro that was released in 2006 or later,
> > > > - it should be safe to say N here.
> > > > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > > > + release from 2007 or later, it should be safe to say N here.
> > > > +
> > > > + If you are using Debian or other distros that are slow to
> > > > + update HAL, please say Y here.
> > > >...
> > > 
> > > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
> > > for all users, and schedule it's removal for mid-2008 (or later).
> > > 
> > > 12 months after the first _release_ of a HAL that can live without seems 
> > > to be the first time when we can consider getting rid of it, since all 
> > > distributions with at least one release a year should ship it by then.
> > > 
> > > Currently, SYSFS_DEPRECATED is only a trap for users.
> > 
> > Huh?
> > 
> > No, again, I've been using this just fine for about 6 months now.
> > 
> > And what about all of the servers not using HAL/NetworkManager?
> 
> On a server, it shouldn't harm.

But if they wanted that option enabled?

> > And what about all of the embedded systems not using either?
> 
> If it was much code, I would have sent a patch that allowed disabling it 
> if EMBEDDED=y.

It's not a code size issue.  In fact, if the option is enabled, like you
have done, it builds more code into the kernel than before.

> > So to not allow this to be turned off by people who might want to (we
> > want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
> > other distros released this year), is pretty heavy-handed.
> > 
> > It also will work in OpenSuSE 10.2 which is already released, and I
> > think Fedora 6, but I've only limited experience with these.
> > 
> > Oh, and Gentoo works just fine, and has been for the past 6 months.
> 
> For most people, it simply doesn't matter whether SYSFS_DEPRECATED is 
> on or off.

Exactly.

> But accidentally disabling SYSFS_DEPRECATED has proven to be a trap 
> people sometimes fall into - and tracking them down to 
> SYSFS_DEPRECATED=n sometimes takes some time.

So how do I put up the warning flag any larger than I have?

I do not want this always enabled, that option is not acceptable to me,
or to the zillions of people who are running a distro that this option
works just fine on (see above list...)

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Tue, Mar 06, 2007 at 11:24:57AM +1100, Bron Gondwana wrote:
> On Mon, Mar 05, 2007 at 03:14:25PM -0600, Matt Mackall wrote:
> > On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
> > > > That's not the point. The point is that Debian/unstable as of _this
> > > > morning_ doesn't work. For reference, I'm running both the latest
> > > > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
> > > > there are people telling me I need a copy of HAL out of git that
> > > > hasn't even been released for Debian to package. Debian isn't the
> > > > problem here.
> > > 
> > >   hal 0.5.9-rc1 (released, not from git) should work. It will be
> > > problably released soon and picked by sane distributions. Debian is very
> > > irritating corner case.
> > 
> > Presumably the -rc1 stands for "release candidate". Which means "not
> > yet released". And when did it show up? 04-Mar-2007 at 18:31. That's
> > right, YESTERDAY. Almost a full month after Greg's commit.
> > 
> > For the last time, DEBIAN IS NOT THE PROBLEM.
> 
> Can I please second this (having been burned by hell that was udev of
> the 0.5ish era) - Greg, please try to make changes in a cross-compatible
> way so that versions of userspace and kernel are not so closely
> dependant on tracking each other.  The whole 2.6.8 -> 2.6.12 series of
> kernels and associated udevs are fraught with race conditions where
> upgrading one but not the other will leave your machine unbootable.

But I AM TRYING TO MAKE IT COMPATIBLE!!!

That's what that config option is there for.  If you happen to be
running a newer userspace, a different distro than what is in Debian
right now, or don't use HAL and NetworkManager, then disable that
option.  Then all of sysfs looks just like it used to, no user-visible
changes at all.  It doesn't get any more compatible than that.

Again, I've pointed out distros that work just fine many times in this
thread...

It's been there since 2.6.20 I think, and no one seems to have noticed it
then, for some odd reason...

And the default is enabled, you have to manually turn it off in order to
break your machine.

Again, how can I word this in a manner that would be sufficient to keep
this misunderstanding from happening again?

greg k-h


Re: [PATCH] xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa

2007-03-05 Thread James Morris
On Fri, 2 Mar 2007, Eric Paris wrote:

> Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if
> there was any permission/security failures in attempting to do the del
> operation (such as permission denied from security_xfrm_state_delete).
> This patch moves the audit hook to the exit path such that all failures
> (and successes) will actually get audited.
> 
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>


-- 
James Morris
<[EMAIL PROTECTED]>
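
The pattern described in the quoted changelog (audit on the exit path, so
that failures are recorded too) can be sketched roughly as follows.  This
is an illustration under assumptions, not the patch itself:
audit_delete(), security_check() and delete_state() are hypothetical
stand-ins, and the counters exist only to make the behaviour observable.

```c
#include <assert.h>
#include <errno.h>

static int audit_calls;		/* number of audit records written */
static int last_audit_result;	/* 1 = success, 0 = failure */

/* Hypothetical stand-in for an audit hook. */
static void audit_delete(int result)
{
	audit_calls++;
	last_audit_result = result;
}

/* Hypothetical stand-in for a security permission check. */
static int security_check(int allowed)
{
	return allowed ? 0 : -EPERM;
}

static int delete_state(int allowed)
{
	int err;

	err = security_check(allowed);
	if (err)
		goto out;	/* a bare "return err" here would skip the audit */

	/* ... the actual delete would happen here ... */
	err = 0;
out:
	assert(err <= 0);	/* kernel style: 0 or a negative errno */
	audit_delete(err ? 0 : 1);
	return err;
}
```

With a single exit label, every return from the function passes through
the audit call, whether the operation was denied or completed.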


Re: [PATCH] Add xfrm policy change auditing to pfkey_spdget

2007-03-05 Thread James Morris
On Fri, 2 Mar 2007, Eric Paris wrote:

> pfkey_spdget neither had an LSM security hook nor auditing for the
> removal of xfrm_policy structs.  The security hook was added when it was
> moved into xfrm_policy_byid instead of the callers to that function by
> my earlier patch and this patch adds the auditing hooks as well.
> 
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>


-- 
James Morris
<[EMAIL PROTECTED]>


Re: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread James Morris
On Fri, 2 Mar 2007, Eric Paris wrote:

> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>



-- 
James Morris
<[EMAIL PROTECTED]>


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Adrian Bunk
On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
> On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> > On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > > 
> > > Ok, how about the following patch.  Is it acceptable to everyone?
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > > 
> > > ---
> > >  init/Kconfig |   13 +++--
> > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > 
> > > --- gregkh-2.6.orig/init/Kconfig
> > > +++ gregkh-2.6/init/Kconfig
> > > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> > > that belong to a class, back into the /sys/class heirachy, in
> > > order to support older versions of udev.
> > >  
> > > -   If you are using a distro that was released in 2006 or later,
> > > -   it should be safe to say N here.
> > > +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > > +   release from 2007 or later, it should be safe to say N here.
> > > +
> > > +   If you are using Debian or other distros that are slow to
> > > +   update HAL, please say Y here.
> > >...
> > 
> > The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
> > for all users, and schedule it's removal for mid-2008 (or later).
> > 
> > 12 months after the first _release_ of a HAL that can live without seems 
> > to be the first time when we can consider getting rid of it, since all 
> > distributions with at least one release a year should ship it by then.
> > 
> > Currently, SYSFS_DEPRECATED is only a trap for users.
> 
> Huh?
> 
> No, again, I've been using this just fine for about 6 months now.
> 
> And what about all of the servers not using HAL/NetworkManager?

On a server, it shouldn't harm.

> And what about all of the embedded systems not using either?

If it was much code, I would have sent a patch that allowed disabling it 
if EMBEDDED=y.

> So to not allow this to be turned off by people who might want to (we
> want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
> other distros released this year), is pretty heavy-handed.
> 
> It also will work in OpenSuSE 10.2 which is already released, and I
> think Fedora 6, but I've only limited experience with these.
> 
> Oh, and Gentoo works just fine, and has been for the past 6 months.

For most people, it simply doesn't matter whether SYSFS_DEPRECATED is 
on or off.

But accidentally disabling SYSFS_DEPRECATED has proven to be a trap 
people sometimes fall into - and tracking them down to 
SYSFS_DEPRECATED=n sometimes takes some time.

> I would just prefer to come up with an acceptable set of wording that
> will work to properly warn people.
> 
> I proposed one such wording which some people took as a slam against
> Debian, which it really was not at all.
> 
> Does someone else want to propose some other wording instead?
> 
> thanks,
> 
> greg k-h

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Jim Chow
On Tue, 6 Mar 2007, Herbert Xu wrote:
> It's just too error-prone to rely on it to not have MSG_TRUNC set.

Agreed.

> I'm going to clean this up for UDP and improve the UDP-lite checksum
> handling while I'm at it.

Great.  It'll be good to get this years-old UDP bug fixed.

Thanks,
Jim


Re: [RFC] div64_64 support

2007-03-05 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Mon, 5 Mar 2007 15:57:14 -0800

> I tried the code from Hacker's Delight.
> It is cool, but performance is CPU (and data) dependent:
> 
> Average # of usecs per operation:

Interesting results.

The problem with these algorithms that trade off one or more
multiplies in order to avoid a divide is that they don't
gain anything and often lose when both multiplies and
divides are emulated in software.

This is particularly true in this cube-root case from Hacker's
Delight, because it's using 3 multiplies per iteration in place of one
divide per iteration.

Actually, sorry, there is only one real multiply in there since the
other two can be computed using addition and shifts.

Another thing is that the non-Hacker's Delight version iterates
differently for different input values, so the input value space is
very important to consider when comparing these two pieces of code.
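
For reference, a minimal sketch of the digit-by-digit integer cube root
in the style of the Hacker's Delight routine, written so that the
multiply count discussed above is visible.  This is the 32-bit form and
is an assumption about what was benchmarked: the code actually timed in
this thread is not shown, and a 64-bit variant would need extra care
that the shifted term does not overflow.  The name icbrt32 is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Integer cube root of a 32-bit value.  One result bit is produced per
 * iteration, so the loop always runs exactly 11 times regardless of x.
 */
static uint32_t icbrt32(uint32_t x)
{
	uint32_t y = 0;		/* root built so far */
	uint32_t b, t;
	int s;

	for (s = 30; s >= 0; s -= 3) {
		y <<= 1;			/* 2*y: a shift */
		t = y * (y + 1);		/* the one real multiply */
		b = ((t << 1) + t + 1) << s;	/* 3*t + 1 via shift and add */
		if (x >= b) {
			x -= b;
			y++;
		}
	}
	assert(y <= 1625);	/* largest cube root of a 32-bit value */
	return y;
}
```

The fixed iteration count is the contrast with a Newton-style loop,
whose number of steps depends on the input value.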


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
> On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > 
> > Ok, how about the following patch.  Is it acceptable to everyone?
> > 
> > thanks,
> > 
> > greg k-h
> > 
> > ---
> >  init/Kconfig |   13 +++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> > 
> > --- gregkh-2.6.orig/init/Kconfig
> > +++ gregkh-2.6/init/Kconfig
> > @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> >   that belong to a class, back into the /sys/class heirachy, in
> >   order to support older versions of udev.
> >  
> > - If you are using a distro that was released in 2006 or later,
> > - it should be safe to say N here.
> > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > + release from 2007 or later, it should be safe to say N here.
> > +
> > + If you are using Debian or other distros that are slow to
> > + update HAL, please say Y here.
> >...
> 
> The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
> for all users, and schedule it's removal for mid-2008 (or later).
> 
> 12 months after the first _release_ of a HAL that can live without seems 
> to be the first time when we can consider getting rid of it, since all 
> distributions with at least one release a year should ship it by then.
> 
> Currently, SYSFS_DEPRECATED is only a trap for users.

Huh?

No, again, I've been using this just fine for about 6 months now.

And what about all of the servers not using HAL/NetworkManager?
And what about all of the embedded systems not using either?

So to not allow this to be turned off by people who might want to (we
want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
other distros released this year), is pretty heavy-handed.

It also will work in OpenSuSE 10.2 which is already released, and I
think Fedora 6, but I've only limited experience with these.

Oh, and Gentoo works just fine, and has been for the past 6 months.

I would just prefer to come up with an acceptable set of wording that
will work to properly warn people.

I proposed one such wording which some people took as a slam against
Debian, which it really was not at all.

Does someone else want to propose some other wording instead?

thanks,

greg k-h


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Herbert Xu
On Tue, Mar 06, 2007 at 10:34:49AM +1100, Herbert Xu wrote:
> > 
> > That's not true.  Please see my post.
> > 
> > Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that 
> > udp_recvmsg() can randomly ignore whether the HW has computed a checksum 
> > and compute it in SW redundantly.
> 
> Sorry, you're right.  This bug has been there for years.

Actually I think we should fix UDP regardless of whether we initialise
msg_flags to zero here.  It's just too error-prone to rely on it to not
have MSG_TRUNC set.

I'm going to clean this up for UDP and improve the UDP-lite checksum
handling while I'm at it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
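
To illustrate why relying on the incoming msg_flags is fragile, here is a
small userspace sketch.  It is not the udp_recvmsg() code; struct
sketch_msg and both helper names are hypothetical, and only MSG_TRUNC
comes from the real <sys/socket.h>.  The fragile variant reads back a
flag word the caller may never have initialised; the robust variant
decides from the lengths alone and treats the flag purely as output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the message header passed by the caller. */
struct sketch_msg {
	int flags;	/* may contain garbage on entry */
};

/* Fragile: the answer depends on whatever was in m->flags on entry. */
static bool truncated_fragile(struct sketch_msg *m, size_t buf_len,
			      size_t pkt_len)
{
	assert(m != NULL);
	if (buf_len < pkt_len)
		m->flags |= MSG_TRUNC;
	return (m->flags & MSG_TRUNC) != 0;
}

/* Robust: decide from the lengths; the flag is only ever written. */
static bool truncated_robust(struct sketch_msg *m, size_t buf_len,
			     size_t pkt_len)
{
	bool truncated = buf_len < pkt_len;

	assert(m != NULL);
	if (truncated)
		m->flags |= MSG_TRUNC;
	return truncated;
}
```

With a stale MSG_TRUNC bit already set and a buffer that is large enough,
the fragile helper reports truncation and the robust one does not.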


Re: [RFC] div64_64 support

2007-03-05 Thread Stephen Hemminger
On 03 Mar 2007 03:31:52 +0100
Andi Kleen <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger <[EMAIL PROTECTED]> writes:
> 
> > Here is another way to handle the 64 bit divide case.
> > It allows full 64 bit divide by adding the support routine
> > GCC needs.
> 
> Not supplying that was intentional by Linus so that people
> think twice (or more often) before they using such expensive
> operations. A plain / looks too innocent.
> 
> Is it really needed by CUBIC anyways?  It uses it for getting
> the cubic root, but the algorithm recommended by Hacker's Delight
> (great book) doesn't use any divisions at all. Probably better 
> to use a better algorithm without divisions.
> 

I tried the code from Hacker's Delight.
It is cool, but performance is CPU (and data) dependent:

Average # of usecs per operation:

                Hacker          Newton
Pentium 3       68.6    <       90.4
T2050           98.6    >       92.0
U1400           450     >       415
Xeon            70      <       90
Xeon (newer)    71      <       78

EM64T           21.8    <       24.6
AMD64           23.4    <       32.0

It might be worth the change for code size reduction though.


-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote:
> On Friday 02 March 2007 05:28, NeilBrown wrote:
> > The sunrpc server code needs to know the source and destination address
> > for UDP packets so it can reply properly.
> > It currently copies code out of the network stack to pick the pieces out
> > of the skb.
> > This is ugly and causes compile problems with the IPv6 stuff.
> 
> ... and this IPv6 code could never have worked anyway:

:-(
It's hard to test the IPv6 server until we have an IPv6 client I
guess, so thanks for the code review, even though we aren't going to
end up using that code...

> 
> But I find using recvmsg just for getting at the addresses
> a little awkward too.

Do you?  It's surely a lot better than code duplication, and it is
exactly how you would get the information from user-space.

>   And I think to be on the safe side, you
> should check that you're really looking at a PKTINFO cmsg
> rather than something else.

Maybe.
But is there really a chance that it might not be PKTINFO?
And what do you do if it isn't?
Log an error and drop the packet I guess.

I'll see what I can do.

NeilBrown
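
For comparison, this is roughly how the check discussed above looks from
user-space: walk the ancillary data returned by recvmsg() and accept only
a control message whose level and type really are IP_PKTINFO.  It is a
sketch, not the knfsd code; find_pktinfo() is a hypothetical helper,
while IP_PKTINFO, struct in_pktinfo and the CMSG_* macros are the usual
Linux/glibc ones.

```c
#define _GNU_SOURCE		/* for struct in_pktinfo on glibc */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Copy the IP_PKTINFO payload out of a received message's control data.
 * Returns 0 on success, or -1 if no well-formed IP_PKTINFO cmsg is
 * present, in which case the caller would log an error and drop the
 * packet.
 */
static int find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
	struct cmsghdr *c;

	assert(msg != NULL && out != NULL);
	for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level != IPPROTO_IP || c->cmsg_type != IP_PKTINFO)
			continue;	/* something else, keep looking */
		if (c->cmsg_len < CMSG_LEN(sizeof(*out)))
			continue;	/* too short to be valid */
		memcpy(out, CMSG_DATA(c), sizeof(*out));
		return 0;
	}
	return -1;
}
```

A receiver would enable IP_PKTINFO with setsockopt() on the socket, call
recvmsg() with a control buffer, and then pass the msghdr to this helper.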


[2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Adrian Bunk
On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> 
> Ok, how about the following patch.  Is it acceptable to everyone?
> 
> thanks,
> 
> greg k-h
> 
> ---
>  init/Kconfig |   13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> --- gregkh-2.6.orig/init/Kconfig
> +++ gregkh-2.6/init/Kconfig
> @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
> that belong to a class, back into the /sys/class heirachy, in
> order to support older versions of udev.
>  
> -   If you are using a distro that was released in 2006 or later,
> -   it should be safe to say N here.
> +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> +   release from 2007 or later, it should be safe to say N here.
> +
> +   If you are using Debian or other distros that are slow to
> +   update HAL, please say Y here.
>...

The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally
for all users, and schedule its removal for mid-2008 (or later).

12 months after the first _release_ of a HAL that can live without it
seems to be the earliest point at which we can consider getting rid of
it, since all distributions with at least one release a year should
ship it by then.

Currently, SYSFS_DEPRECATED is only a trap for users.

Suggested patch below.

cu
Adrian


<--  snip  -->


unconditionally enable SYSFS_DEPRECATED

This patch unconditionally enables SYSFS_DEPRECATED and schedules its
removal for July 2008.

Currently, SYSFS_DEPRECATED is only a trap for users accidentally
disabling it.

In July 2008, all distributions with at least one release a year should
be able to run without SYSFS_DEPRECATED.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index c3b1430..b0bce93 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -316,3 +316,13 @@ Why:   The option/code is
 Who:   Johannes Berg <[EMAIL PROTECTED]>
 
 ---
+
+What:  deprecated sysfs files (CONFIG_SYSFS_DEPRECATED)
+When:  July 2008
+Why:   None of these features or values should be used any longer,
+   as they export driver core implementation details to userspace
+   or export properties which can't be kept stable across kernel
+   releases.
+Who:   Greg KH <[EMAIL PROTECTED]>
+
+---
diff --git a/init/Kconfig b/init/Kconfig
index f977086..f652b6f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -274,24 +274,9 @@ config CPUSETS
  Say N if unsure.
 
 config SYSFS_DEPRECATED
-   bool "Create deprecated sysfs files"
+   bool
default y
help
- This option creates deprecated symlinks such as the
- "device"-link, the :-link, and the
- "bus"-link. It may also add deprecated key in the
- uevent environment.
- None of these features or values should be used today, as
- they export driver core implementation details to userspace
- or export properties which can't be kept stable across kernel
- releases.
-
- If enabled, this option will also move any device structures
- that belong to a class, back into the /sys/class heirachy, in
- order to support older versions of udev.
-
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
 
 config RELAY
bool "Kernel->user space relay support (formerly relayfs)"


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Jeffrey Hundstad

Greg KH wrote:

On Mon, Mar 05, 2007 at 07:59:50AM -0500, Theodore Tso wrote:
  
Ok, how about the following patch.  Is it acceptable to everyone?


thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
+
+ If you have any problems with devices not being found properly
+ from userspace programs, and this option is disabled, say Y
+ here.
+
+ If you are unsure about this at all, say Y.
 
 config RELAY
bool "Kernel->user space relay support (formerly relayfs)"


Since it appears you're trying to offend people with this patch, it 
would seem appropriate to call someone's mother a "bad" name.  This may 
be in the style guide; perhaps I should submit a patch.


--
Jeffrey Hundstad
PS: Humor (really!) relax.



Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Herbert Xu
On Mon, Mar 05, 2007 at 01:01:16PM -0800, Jim Chow wrote:
> On Tue, 6 Mar 2007, Herbert Xu wrote:
> > msg_flags [...] its initial value is not used.
> 
> That's not true.  Please see my post.
> 
> Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that 
> udp_recvmsg() can randomly ignore whether the HW has computed a checksum 
> and compute it in SW redundantly.

Sorry, you're right.  This bug has been there for years.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [Bugme-new] [Bug 8132] New: pptp server lockup in ppp_asynctty_receive()

2007-03-05 Thread Andrew Morton
On Mon, 5 Mar 2007 14:26:30 -0800
[EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8132
> 
>Summary: pptp server lockup in ppp_asynctty_receive()
> Kernel Version:  2.6.20
> Status: NEW
>   Severity: high
>  Owner: [EMAIL PROTECTED]
>  Submitter: [EMAIL PROTECTED]
> CC: [EMAIL PROTECTED]
> 
> 
> For several kernel releases now I've experienced different lockups of a VPN
> (pptp) server.
> There are sometimes more than 200 ppp connections.
> With kernel debugging I was able to retrieve the following information:
> 
> First:
> Showing all locks held in the system:
> 1 lock held by agetty/4486:
>  #0:  (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b
> 1 lock held by agetty/4487:
>  #0:  (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b
> 1 lock held by agetty/4488:
>  #0:  (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b
> 2 locks held by pptpctrl/4500:
>  #0:  (&tty->atomic_write_lock){--..}, at: [] tty_write+0x83/0x1d0
>  #1:  (&ap->recv_lock){}, at: [] ppp_asynctty_receive+0x2e/0x710
> 
> =
> BUG: spinlock lockup on CPU#1, pppd/4504, df5048c4
>  [] _raw_spin_lock+0x100/0x134
>  [] ppp_async_ioctl+0xa7/0x1d0
>  [] ppp_ioctl+0xa5/0xbff
>  [] down_read+0x29/0x3a
>  [] ppp_async_ioctl+0x0/0x1d0
>  [] ppp_ioctl+0xce/0xbff
>  [] _spin_unlock+0x14/0x1c
>  [] do_wp_page+0x256/0x4ba
>  [] __handle_mm_fault+0x74e/0xa22
>  [] do_ioctl+0x64/0x6d
>  [] vfs_ioctl+0x50/0x273
>  [] sys_ioctl+0x34/0x50
>  [] sysenter_past_esp+0x5f/0x99
>  ===
> BUG: soft lockup detected on CPU#0!
>  [] softlockup_tick+0x8d/0xbc
>  [] update_process_times+0x28/0x5e
>  [] smp_apic_timer_interrupt+0x80/0x9c
>  [] apic_timer_interrupt+0x33/0x38
>  [] delay_tsc+0x9/0x13
>  [] __delay+0x6/0x7
>  [] _raw_spin_lock+0xa9/0x134
>  [] tty_write+0x83/0x1d0
>  [] tty_ldisc_try+0x2f/0x33
>  [] lock_kernel+0x19/0x24
>  [] tty_write+0x10b/0x1d0
>  [] write_chan+0x0/0x320
>  [] vfs_write+0x87/0xf0
>  [] tty_write+0x0/0x1d0
>  [] sys_write+0x41/0x6a
>  [] sysenter_past_esp+0x5f/0x99
>  ===
> 
> 
> Second)
> <0>BUG: spinlock lockup on CPU#0, pppd/5209, de3e2884
>  [] _raw_spin_lock+0x100/0x134
> BUG: spinlock lockup on CPU#1, ip-down/7524, c0353300
>  [] _raw_spin_lock+0x100/0x134
>  [] lock_kernel+0x19/0x24
>  [] chrdev_open+0x8a/0x16e
>  [] chrdev_open+0x0/0x16e
>  [] __dentry_open+0xaf/0x1a0
>  [] nameidata_to_filp+0x31/0x3a
>  [] do_filp_open+0x39/0x40
>  [] _spin_unlock+0x14/0x1c
>  [] get_unused_fd+0xaa/0xbb
>  [] do_sys_open+0x3a/0x6d
>  [] sys_open+0x1c/0x20
>  [] sysenter_past_esp+0x5f/0x99
>  ===
>  [] ppp_async_ioctl+0xa7/0x1d0
>  [] ppp_ioctl+0xa5/0xbff
>  [] down_read+0x29/0x3a
>  [] ppp_async_ioctl+0x0/0x1d0
>  [] ppp_ioctl+0xce/0xbff
>  [] _spin_unlock+0x14/0x1c
>  [] do_wp_page+0x256/0x4ba
>  [] __handle_mm_fault+0x74e/0xa22
>  [] do_ioctl+0x64/0x6d
>  [] vfs_ioctl+0x50/0x273
>  [] sys_ioctl+0x34/0x50
>  [] sysenter_past_esp+0x5f/0x99
>  ===
> 
> Third)
> BUG: soft lockup detected on CPU#0!
>  [] softlockup_tick+0x8d/0xbc
>  [] update_process_times+0x28/0x5e
>  [] smp_apic_timer_interrupt+0x80/0x9c
>  [] apic_timer_interrupt+0x33/0x38
>  [] delay_tsc+0x9/0x13
>  [] __delay+0x6/0x7
>  [] _raw_spin_lock+0xa9/0x134
>  [] tty_ldisc_try+0x2f/0x33
>  [] lock_kernel+0x19/0x24
>  [] tty_read+0x5a/0xbe
>  [] vfs_read+0x85/0xee
>  [] tty_read+0x0/0xbe
>  [] sys_read+0x41/0x6a
>  [] sysenter_past_esp+0x5f/0x99
>  ===
> BUG: soft lockup detected on CPU#0!
>  [] softlockup_tick+0x8d/0xbc
>  [] update_process_times+0x28/0x5e
>  [] smp_apic_timer_interrupt+0x80/0x9c
>  [] apic_timer_interrupt+0x33/0x38
>  [] prio_tree_insert+0xe8/0x23b
>  [] _raw_spin_lock+0xaf/0x134
>  [] tty_ldisc_try+0x2f/0x33
>  [] lock_kernel+0x19/0x24
>  [] tty_read+0x5a/0xbe
>  [] vfs_read+0x85/0xee
>  [] tty_read+0x0/0xbe
>  [] sys_read+0x41/0x6a
>  [] sysenter_past_esp+0x5f/0x99
> 
> 
> Next via SysRq:
> 
> Showing all locks held in the system:
> 1 lock held by agetty/5057:
>  #0:  (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b
> 1 lock held by agetty/5058:
>  #0:  (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b
> 1 lock held by agetty/5059:
>  #0:  (&tty->atomic_read_lock){--..}, at: [] read_chan+0x41a/0x60b
> 2 locks held by pptpctrl/5071:
>  #0:  (&tty->atomic_write_lock){--..}, at: [] tty_write+0x83/0x1d0
>  #1:  (&ap->recv_lock){}, at: [] ppp_asynctty_receive+0x2e/0x710
> 
> 
> ~#SysRq : Show Blocked State
> 
>  freesibling
>   task PCstack   pid father child younger older
> pptpctrl  D C02A18E0 0  5071   4646  50745094  5064 (L-TLB)
>df3a3bd0 0082 0029b837 c02a18e0 0246  dd4f131c 
> dd563cac
>def86030 c140864c   0009 def86030 2ccaa8e5 
> 017d
>   

ignore; Re: "skge 0000:01:0a.0: unsupported phy type 0x0"

2007-03-05 Thread Chris Stromsoe
Ignore this.  I rebooted into the wrong kernel and was testing with 2.6.16 
instead of 2.6.20.  It works fine with 2.6.20.


-Chris

On Mon, 5 Mar 2007, Chris Stromsoe wrote:

I have a bunch of dual-port SK 98xx cards that work with sk98lin but not 
with skge.  After loading skge, I get


ACPI: PCI Interrupt :01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> 
IRQ 10

skge :01:0a.0: unsupported phy type 0x0
ACPI: PCI interrupt for device :01:0a.0 disabled
skge: probe of :01:0a.0 failed with error -95


lspci -vv output for the card:

:01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx 
Gigabit Ethernet Server Adapter (rev 12)
   Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet 
Server Adapter (SK-NET GE-SX dual link)
   Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
   Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR- 
   Interrupt: pin A routed to IRQ 10
   Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K]
   Region 1: I/O ports at d800 [size=256]
   Expansion ROM at ff40 [disabled] [size=128K]
   Capabilities: [48] Power Management version 1
   Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)

   Status: D0 PME-Enable- DSel=0 DScale=1 PME-
   Capabilities: [50] Vital Product Data




-Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html




Re: [PATCH] natsemi: netpoll fixes

2007-03-05 Thread Mark Brown
[Once more with CCs]

On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:

>  #ifdef CONFIG_NET_POLL_CONTROLLER
>  static void natsemi_poll_controller(struct net_device *dev)
>  {
> + struct netdev_private *np = netdev_priv(dev);
> +
>   disable_irq(dev->irq);
> - intr_handler(dev->irq, dev);
> +
> + /*
> +  * A real interrupt might have already reached us at this point
> +  * but NAPI might still haven't called us back.  As the interrupt
> +  * status register is cleared by reading, we should prevent an
> +  * interrupt loss in this case...
> +  */
> + if (!np->intr_status)
> + intr_handler(dev->irq, dev);
> +
>   enable_irq(dev->irq);

Is it possible for this to run at the same time as the NAPI poll?  If so
then it is possible for the netpoll poll to run between np->intr_status
being cleared and netif_rx_complete() being called.  If the hardware
asserts an interrupt at the wrong moment then this could cause the

In any case, this is a problem independently of netpoll if the chip
shares an interrupt with anything so the interrupt handler should be
fixed to cope with this situation instead.
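
One way to picture the concern is a small userspace model of a read-to-clear
status register.  This is only a sketch under stated assumptions: fake_chip,
fake_priv, read_isr() and both handlers are hypothetical names, not natsemi
code, and accumulating with |= is just one possible way a handler could cope
with being entered a second time before the poll has run.

```c
#include <stdint.h>

/* Hypothetical model of a read-to-clear interrupt status register. */
struct fake_chip {
	uint32_t isr;          /* hardware status bits, cleared by reading */
};

struct fake_priv {
	uint32_t intr_status;  /* software copy consumed later by the poll */
};

/* Reading the register returns the pending bits and clears them. */
static uint32_t read_isr(struct fake_chip *chip)
{
	uint32_t v = chip->isr;

	chip->isr = 0;
	return v;
}

/* Overwriting handler: a second entry before the poll runs loses bits. */
static void handler_store(struct fake_chip *chip, struct fake_priv *np)
{
	np->intr_status = read_isr(chip);
}

/* Accumulating handler: bits survive until the poll consumes them. */
static void handler_accumulate(struct fake_chip *chip, struct fake_priv *np)
{
	np->intr_status |= read_isr(chip);
}
```

In this model, calling handler_store() twice in a row leaves intr_status at
zero even though an event was pending, while handler_accumulate() keeps the
bit.  A real driver would also need locking and interrupt masking, which the
model leaves out.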

--
"You grabbed my hand and we fell into it, like a daydream - or a fever."




RE: [PATCH] s2io: add PCI error recovery support

2007-03-05 Thread Ramkrishna Vepa
Comments on this patch -

1. device_close_flag is unused and is not required.
> +static pci_ers_result_t s2io_io_error_detected(struct pci_dev *pdev,
> +   pci_channel_state_t state)
> +{
...
> + do_s2io_card_down(sp, 0);
> + sp->device_close_flag = TRUE;   /* Device is shut down. */

2. s2io_reset can fail to reset the device. Ideally s2io_reset should
return a failure in this case (return is void now) and in this case
could s2io_io_slot_reset() be called again, maybe try thrice, in total,
before failing to reset the slot?
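
A bounded retry of that kind could look roughly like the sketch below.  It is
only an illustration: reset_fn, reset_with_retries() and flaky_reset() are
hypothetical names, and it assumes a reset routine that returns 0 on success
and nonzero on failure, which s2io_reset() does not do today.

```c
#include <errno.h>

/* Hypothetical reset callback: returns 0 on success, nonzero on failure. */
typedef int (*reset_fn)(void *dev);

/* Try the reset up to max_tries times; report -EIO if every attempt fails. */
static int reset_with_retries(reset_fn reset, void *dev, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if (reset(dev) == 0)
			return 0;
	}
	return -EIO;
}

/* Stand-in device for trying the loop: fails until its counter hits zero. */
static int flaky_reset(void *dev)
{
	int *failures_left = dev;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -1;
	}
	return 0;
}
```

With max_tries set to 3, a device that fails twice and then recovers is
reported as reset, and one that keeps failing yields -EIO so the slot reset
can be declared failed.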

Ram
> -Original Message-
> From: Linas Vepstas [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 15, 2007 3:09 PM
> To: Ramkrishna Vepa; Raghavendra Koushik; Ananda Raju
> Cc: Wen Xiong; linux-kernel@vger.kernel.org; linux-[EMAIL PROTECTED];
> netdev@vger.kernel.org; Jeff Garzik; Andrew Morton
> Subject: [PATCH] s2io: add PCI error recovery support
> 
> 
> Koushik, Raju,
> 
> Please review, comment, and if you find this acceptable,
> please forward upstream. This patch incorporates all of
> fixes resulting from the last set of discussions, circa
> November 2006.
> 
> --linas
> 
> This patch adds PCI error recovery support to the
> s2io 10-Gigabit ethernet device driver. Fourth revision,
> blocks interrupts and the watchdog. Adds a flag to
> s2io_down(), to avoid doing I/O when PCI bus is offline.
> 
> Tested, seems to work well.
> 
> Signed-off-by: Linas Vepstas <[EMAIL PROTECTED]>
> Acked-by: Ramkrishna Vepa <[EMAIL PROTECTED]>
> Cc: Raghavendra Koushik <[EMAIL PROTECTED]>
> Cc: Ananda Raju <[EMAIL PROTECTED]>
> Cc: Wen Xiong <[EMAIL PROTECTED]>
> 
> 
>  drivers/net/s2io.c |  116 ++---
>  drivers/net/s2io.h |5 ++
>  2 files changed, 116 insertions(+), 5 deletions(-)
> 
> Index: linux-2.6.20-git4/drivers/net/s2io.c
> ===
> --- linux-2.6.20-git4.orig/drivers/net/s2io.c 2007-02-15 15:39:35.0 -0600
> +++ linux-2.6.20-git4/drivers/net/s2io.c  2007-02-15 16:15:10.0 -0600
> @@ -435,11 +435,18 @@ static struct pci_device_id s2io_tbl[] _
> 
>  MODULE_DEVICE_TABLE(pci, s2io_tbl);
> 
> +static struct pci_error_handlers s2io_err_handler = {
> + .error_detected = s2io_io_error_detected,
> + .slot_reset = s2io_io_slot_reset,
> + .resume = s2io_io_resume,
> +};
> +
>  static struct pci_driver s2io_driver = {
>.name = "S2IO",
>.id_table = s2io_tbl,
>.probe = s2io_init_nic,
>.remove = __devexit_p(s2io_rem_nic),
> +  .err_handler = &s2io_err_handler,
>  };
> 
>  /* A simplifier macro used both by init and free shared_mem Fns(). */
> @@ -2577,6 +2584,9 @@ static void s2io_netpoll(struct net_devi
>   u64 val64 = 0xULL;
>   int i;
> 
> + if (pci_channel_offline(nic->pdev))
> + return;
> +
>   disable_irq(dev->irq);
> 
>   atomic_inc(&nic->isr_cnt);
> @@ -3079,6 +3089,8 @@ static void alarm_intr_handler(struct s2
>   int i;
>   if (atomic_read(&nic->card_state) == CARD_DOWN)
>   return;
> + if (pci_channel_offline(nic->pdev))
> + return;
>   nic->mac_control.stats_info->sw_stat.ring_full_cnt = 0;
>   /* Handling the XPAK counters update */
>   if(nic->mac_control.stats_info->xpak_stat.xpak_timer_count < 72000) {
> @@ -4117,6 +4129,10 @@ static irqreturn_t s2io_isr(int irq, voi
>   struct mac_info *mac_control;
>   struct config_param *config;
> 
> + /* Pretend we handled any irq's from a disconnected card */
> + if (pci_channel_offline(sp->pdev))
> + return IRQ_NONE;
> +
>   atomic_inc(&sp->isr_cnt);
>   mac_control = &sp->mac_control;
>   config = &sp->config;
> @@ -6188,7 +6204,7 @@ static void s2io_rem_isr(struct s2io_nic
>   } while(cnt < 5);
>  }
> 
> -static void s2io_card_down(struct s2io_nic * sp)
> +static void do_s2io_card_down(struct s2io_nic * sp, int do_io)
>  {
>   int cnt = 0;
>   struct XENA_dev_config __iomem *bar0 = sp->bar0;
> @@ -6203,7 +6219,8 @@ static void s2io_card_down(struct s2io_n
>   atomic_set(&sp->card_state, CARD_DOWN);
> 
>   /* disable Tx and Rx traffic on the NIC */
> - stop_nic(sp);
> + if (do_io)
> + stop_nic(sp);
> 
>   s2io_rem_isr(sp);
> 
> @@ -6211,7 +6228,7 @@ static void s2io_card_down(struct s2io_n
>   tasklet_kill(&sp->task);
> 
>   /* Check if the device is Quiescent and then Reset the NIC */
> - do {
> + while(do_io) {
>   /* As per the HW requirement we need to replenish the
>* receive buffer to avoid the ring bump. Since there is
>* no intention of processing the Rx frame at this point we are
> @@ -6236,8 +6253,9 @@ static void s2io_card_down(struct s2io_n
> (unsigned long long) val64);
>


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 01:55:30PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> > Ok, how about the following patch.  Is it acceptable to everyone?
> > 
> > - If you are using a distro that was released in 2006 or later,
> > - it should be safe to say N here.
> > + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> > + release from 2007 or later, it should be safe to say N here.
> > +
> > + If you are using Debian or other distros that are slow to
> > + update HAL, please say Y here.
> 
> What HAL version do you think Debian ought to have, pray tell? And
> what the hell version do those other distros have?
> 
> The last HAL release was 0.5.8 on 11-Sep-2006. It showed up in
> Debian/unstable on 2-Oct. There have been six Debian bugfix releases,
> the most recent on 12-Feb.
> 
> http://people.freedesktop.org/~david/dist/
> http://packages.debian.org/changelogs/pool/main/h/hal/hal_0.5.8.1-6.1/changelog

Ok, I only named HAL as that is what people have told me the problem is.
I have been running this change on my boxes, without
CONFIG_SYSFS_DEPRECATED since last July or so.

But I don't use NetworkManager here for the most part, but I have tried
this in the OpenSuse10.3 alpha releases and it seems to work just fine
with whatever version of NetworkManager it uses.

So perhaps it's some wrapper scripts somewhere?  I think SuSE had some
odd things hard coded somewhere that prevented 10.1 from working
properly with this change.

Ok, so I'll drop the HAL wording above, what should I say instead?

thanks,

greg k-h


Re: "skge 0000:01:0a.0: unsupported phy type 0x0"

2007-03-05 Thread Chris Stromsoe

On Mon, 5 Mar 2007, Stephen Hemminger wrote:


What kernel version? Type 0 is XMAC support, and that was added to a 
fairly recent kernel (2.6.19?)


It was an old kernel.  I booted into 2.6.16 instead of 2.6.20.  See my 
follow-up (and ignore the report).


-Chris


Re: "skge 0000:01:0a.0: unsupported phy type 0x0"

2007-03-05 Thread Stephen Hemminger
On Mon, 5 Mar 2007 13:48:29 -0800 (PST)
Chris Stromsoe <[EMAIL PROTECTED]> wrote:

> I have a bunch of dual-port SK 98xx cards that work with sk98lin but not 
> with skge.  After loading skge, I get
> 
> ACPI: PCI Interrupt :01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> 
> IRQ 10
> skge :01:0a.0: unsupported phy type 0x0
> ACPI: PCI interrupt for device :01:0a.0 disabled
> skge: probe of :01:0a.0 failed with error -95
> 
> 
> lspci -vv output for the card:
> 
> :01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx 
> Gigabit Ethernet Server Adapter (rev 12)
>  Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet 
> Server Adapter (SK-NET GE-SX dual link)
>  Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- 
> Stepping- SERR+ FastB2B-
>  Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
> SERR-   Interrupt: pin A routed to IRQ 10
>  Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K]
>  Region 1: I/O ports at d800 [size=256]
>  Expansion ROM at ff40 [disabled] [size=128K]
>  Capabilities: [48] Power Management version 1
>  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>  Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>  Capabilities: [50] Vital Product Data
> 

What kernel version? Type 0 is XMAC support, and that was added to a fairly
recent kernel (2.6.19?)

-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Jim Chow
On Tue, 6 Mar 2007, Herbert Xu wrote:
> msg_flags [...] its initial value is not used.

That's not true.  Please see my post.

Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that 
udp_recvmsg() can randomly ignore whether the HW has computed a checksum 
and compute it in SW redundantly.

Jim


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Joel Becker
On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
> On Mon, Mar 05, 2007 at 01:13:26AM -0600, Matt Mackall wrote:
> > That's not the point. The point is that Debian/unstable as of _this
> > morning_ doesn't work. For reference, I'm running both the latest
> > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
> > there are people telling me I need a copy of HAL out of git that
> > hasn't even been released for Debian to package. Debian isn't the
> > problem here.
> 
>   hal 0.5.9-rc1 (released, not from git) should work. It will
> probably be released soon and picked up by sane distributions. Debian is
> a very irritating corner case.

As of right now, Fedora Core 6 has hal-0.5.8.1-6.fc6.  This is also
too old.  Please, stop claiming that Debian unstable is some corner
case.  No one is talking about Debian stable here.  No one is talking
about the Enterprise versions of Red Hat or SuSE (you'd find them just
as irritating with modern kernels).  Debian unstable tracks released
code as fast or faster than Fedora and OpenSuSE.  They all keep up with
releases.
But the last release of hal is 0.5.8.1.  _Release_, not "release
candidate".  You can't break that.  You can't break it for a while, if
you want a sane deprecation schedule.  These are userspace interfaces.
Matt is absolutely correct that you shouldn't deprecate a
userspace<->kernel interface before you've even provided a release of
the tool that detects the change!

Joel

-- 

"When ideas fail, words come in very handy." 
 - Goethe

Joel Becker
Principal Software Developer
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Herbert Xu
Jim Chow <[EMAIL PROTECTED]> wrote:
> After inspection of some networking code, it seems there is a use of
> uninitialized data in udp_recvmsg(),
> linux-2.6.20.1/net/ipv4/udp.c:843, while testing msg->msg_flags (see
> the backtrace below).  It looks like sys_recvfrom() is not

msg_flags is set on return and its initial value is not used.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


"skge 0000:01:0a.0: unsupported phy type 0x0"

2007-03-05 Thread Chris Stromsoe
I have a bunch of dual-port SK 98xx cards that work with sk98lin but not 
with skge.  After loading skge, I get


ACPI: PCI Interrupt :01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> 
IRQ 10
skge :01:0a.0: unsupported phy type 0x0
ACPI: PCI interrupt for device :01:0a.0 disabled
skge: probe of :01:0a.0 failed with error -95


lspci -vv output for the card:

:01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx Gigabit 
Ethernet Server Adapter (rev 12)
Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet 
Server Adapter (SK-NET GE-SX dual link)
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-


Re: [PATCH] twcal_jiffie should be unsigned long, not int

2007-03-05 Thread David Miller
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Mon, 5 Mar 2007 16:09:21 +0100

> While browsing include/net/inet_timewait_sock.h, I found this buggy
> definition of twcal_jiffie.
> 
> int twcal_jiffie;
> 
> I wonder how inet_twdr_twcal_tick() can really work on x86_64.
> 
> This seems quite an old bug; it was there before the introduction of
> inet_timewait_death_row by Arnaldo Carvalho de Melo.
> 
> [PATCH] twcal_jiffie should be unsigned long, not int
> 
> Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

Grrr, good catch Eric.  I'll push this fix to -stable too.

Thanks a lot.
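
For readers following along, here is a minimal userspace sketch of why a
32-bit field is a problem once the tick counter is 64 bits wide.  The helpers
ticks_before() and low32() are hypothetical; they are modeled on the style of
the kernel's wrap-safe time comparisons and are not kernel code.  Masking to
the low 32 bits is an assumption used to stand in for storing the value in an
int-sized field.

```c
#include <stdint.h>

/*
 * Wrap-safe "a is before b" on a 64-bit tick counter: the difference is
 * interpreted as signed, so the test survives counter wrap-around.
 */
static int ticks_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Keep only the low 32 bits, as an int-sized field would. */
static uint64_t low32(uint64_t ticks)
{
	return ticks & 0xffffffffu;
}
```

With now = 2^32 + 100 and a deadline 50 ticks later, ticks_before(now,
deadline) is true, but ticks_before(now, low32(deadline)) is false: once the
upper bits are dropped, the deadline looks as if it passed long ago.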


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
> > That's not the point. The point is that Debian/unstable as of _this
> > morning_ doesn't work. For reference, I'm running both the latest
> > releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
> > there are people telling me I need a copy of HAL out of git that
> > hasn't even been released for Debian to package. Debian isn't the
> > problem here.
> 
>   hal 0.5.9-rc1 (released, not from git) should work. It will
> probably be released soon and picked up by sane distributions. Debian is
> a very irritating corner case.

Presumably the -rc1 stands for "release candidate". Which means "not
yet released". And when did it show up? 04-Mar-2007 at 18:31. That's
right, YESTERDAY. Almost a full month after Greg's commit.

For the last time, DEBIAN IS NOT THE PROBLEM.

-- 
Mathematics is the supreme nostalgia of our time.


[PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Jim Chow
After inspection of some networking code, it seems there is a use of
uninitialized data in udp_recvmsg(),
linux-2.6.20.1/net/ipv4/udp.c:843, while testing msg->msg_flags (see
the backtrace below).  It looks like sys_recvfrom() is not
initializing msg.msg_flags and, along the path given below, msg_flags
is tested (at #0) without (necessarily) being written to.

A simple fix for this particular problem is given below.

Alternatively, udp_recvmsg() could be changed to initialize msg_flags
for its caller, since udp_recvmsg() (always? [*]) uses msg_flags as an
output argument.

In any case, I wanted to verify the bug with the networking gurus to see 
if they agree.

#0 udp_recvmsg (linux-2.6.20.1/net/ipv4/udp.c:843)
#1 sock_common_recvmsg (linux-2.6.20.1/net/core/sock.c:1617)
#2 sock_recvmsg (linux-2.6.20.1/net/socket.c:630)
#3 sys_recvfrom (linux-2.6.20.1/net/socket.c:1608)
#4 sys_socketcall (linux-2.6.20.1/net/socket.c:2007)
#5 syscall_call (linux-2.6.20.1/arch/i386/kernel/entry.S:0)

Index: linux-2.6.20.1/net/socket.c
===
--- linux-2.6.20.1.orig/net/socket.c
+++ linux-2.6.20.1/net/socket.c
@@ -1601,6 +1601,7 @@
iov.iov_base = ubuf;
msg.msg_name = address;
msg.msg_namelen = MAX_SOCK_ADDR;
+   msg.msg_flags = 0;
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
err = sock_recvmsg(sock, &msg, size, flags);


--

[*] Although do_sock_read() linux-2.6.20.1/net/socket.c:704, for one,
seems to want to initialize msg_flags nonzero, so maybe not.
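
As an aside, the same guarantee can be had without a separate assignment by
zero-initializing the whole structure.  The sketch below uses the userspace
struct msghdr from <sys/socket.h>, not the kernel's definition, and
make_recv_msg() is a hypothetical helper written only for illustration.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Build a msghdr for a single-buffer receive.  The caller owns *iov and
 * must keep it alive for as long as the returned msghdr is in use.
 */
static struct msghdr make_recv_msg(struct iovec *iov, void *buf, size_t len)
{
	/*
	 * Designated initializer: every member that is not named is
	 * zero-initialized, so msg_flags starts out as 0.
	 */
	struct msghdr msg = {
		.msg_iov = iov,
		.msg_iovlen = 1,
	};

	iov->iov_base = buf;
	iov->iov_len = len;
	return msg;
}
```

The explicit msg.msg_flags = 0 in the patch above is the smaller change; the
initializer form mainly protects against members that are added or forgotten
later.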



RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Eric Paris
On Mon, 2007-03-05 at 11:39 -0500, James Morris wrote:
> On Mon, 5 Mar 2007, Venkat Yekkirala wrote:
> 
> > > 
> > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
> > Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> 
> 
> What about your previous comment:
> 
>  "I guess you meant to do this here?
> else if (err)
> return err; "

That also gets taken care of in the pfkey_spdget cleanup in a later
patch.  The return isn't in the place Venkat suggested; it instead
happens inside the new if (delete) block.  (err is only non-zero on
delete operations, so there is no need to check it otherwise.)

-Eric



Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote:
> 
> Hi Neil,
> 
> here's another minor comment:
> 
> On Friday 02 March 2007 05:28, NeilBrown wrote:
> > +static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
> > +   struct cmsghdr *cmh)
> >  {
> > switch (rqstp->rq_sock->sk_sk->sk_family) {
> > case AF_INET: {
> > +   struct in_pktinfo *pki = CMSG_DATA(cmh);
> > +   rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
> > break;
> > +   }
> ...
> 
> The daddr that is extracted here will only ever be used to build
> another PKTINFO cmsg when sending the reply. So it would be
> much easier to just store the raw control message in the svc_rqst,
> without looking at its contents, and send it out along with the reply,
> unchanged.

Yes, sounds tempting, doesn't it?
Unfortunately it isn't that simple as I found out when the sunrpc code
in glibc did exactly that.

You see sendmsg will use the interface-number as well as the source
address from the PKTINFO structure.

Suppose my server has two interfaces (A and B) on two subnets that
both are connected to some router which is connected to a third subnet
that my client is on.  Further, suppose my server has only one default
route, out interface A.
The client chooses the IP address of interface B and sends a request.
It arrives on interface B and is processed.
If the PKTINFO received is passed unchanged to sendmsg, the packet will
be sent out interface B.  But interface B doesn't have a route to
that client, so the packet is dropped.

This is exactly what was happening for me with mountd a few years ago.

So yes, we could just zero the interface field, but I think it is
clearer to extract the wanted data, then re-insert it.  They really
are different structures with different meanings (send versus receive)
which happen to have the same layout.
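
The extract-then-re-insert idea can be sketched in userspace as follows.
This is an illustration only: reply_pktinfo() is a hypothetical helper, not
the knfsd code, and it assumes the glibc definition of struct in_pktinfo
from <netinet/in.h>.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>

/*
 * Derive the PKTINFO to attach to a reply from the one that arrived with
 * the request.
 */
static struct in_pktinfo reply_pktinfo(const struct in_pktinfo *rx)
{
	struct in_pktinfo tx;

	memset(&tx, 0, sizeof(tx));
	/* Reply from the local address the request was addressed to. */
	tx.ipi_spec_dst = rx->ipi_spec_dst;
	/* ipi_ifindex stays 0 so normal routing picks the outgoing interface. */
	return tx;
}
```

Only the source address is carried over; the interface index from the
received message is deliberately not copied.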

Thanks,
NeilBrown



[PATCH] natsemi: netpoll fixes

2007-03-05 Thread Sergei Shtylyov
Fix two issues in this driver's netpoll path: one usual, with spin_unlock_irq()
enabling interrupts which nobody asks it to do (that has been fixed recently in
a number of drivers) and one unusual, with the poll_controller() method possibly
causing loss of interrupts due to the interrupt status register being cleared
by a simple read and the interrupt handler simply storing it, not accumulating.
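
The first issue can be pictured with a toy userspace model of the interrupt
flag.  Everything here is hypothetical (irqs_enabled and the model_*
functions are stand-ins, not the kernel primitives) and the model ignores the
lock itself; it only shows how the two unlock variants treat the saved state.

```c
#include <stdbool.h>

/* Stand-in for the CPU's interrupt-enable flag. */
static bool irqs_enabled = true;

/* Model of the _irq variants: disable on lock, always enable on unlock. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

static void model_unlock_irq(void)
{
	irqs_enabled = true;   /* unconditional, whatever the caller had */
}

/* Model of the _irqsave variants: remember the state and put it back. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}
```

If the caller enters with interrupts already disabled, as netpoll may, the
_irq pair leaves them enabled on return, while the save/restore pair leaves
them as they were.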

Signed-off-by: Sergei Shtylyov <[EMAIL PROTECTED]>

---
 drivers/net/natsemi.c |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c
+++ linux-2.6/drivers/net/natsemi.c
@@ -2024,6 +2024,7 @@ static int start_tx(struct sk_buff *skb,
struct netdev_private *np = netdev_priv(dev);
void __iomem * ioaddr = ns_ioaddr(dev);
unsigned entry;
+   unsigned long flags;
 
/* Note: Ordering is important here, set the field with the
   "ownership" bit last, and only then increment cur_tx. */
@@ -2037,7 +2038,7 @@ static int start_tx(struct sk_buff *skb,
 
np->tx_ring[entry].addr = cpu_to_le32(np->tx_dma[entry]);
 
-   spin_lock_irq(&np->lock);
+   spin_lock_irqsave(&np->lock, flags);
 
if (!np->hands_off) {
np->tx_ring[entry].cmd_status = cpu_to_le32(DescOwn | skb->len);
@@ -2056,7 +2057,7 @@ static int start_tx(struct sk_buff *skb,
dev_kfree_skb_irq(skb);
np->stats.tx_dropped++;
}
-   spin_unlock_irq(&np->lock);
+   spin_unlock_irqrestore(&np->lock, flags);
 
dev->trans_start = jiffies;
 
@@ -,6 +2223,8 @@ static void netdev_rx(struct net_device 
pkt_len = (desc_status & DescSizeMask) - 4;
if ((desc_status&(DescMore|DescPktOK|DescRxLong)) != DescPktOK){
if (desc_status & DescMore) {
+   unsigned long flags;
+
if (netif_msg_rx_err(np))
printk(KERN_WARNING
"%s: Oversized(?) Ethernet "
@@ -2236,12 +2239,12 @@ static void netdev_rx(struct net_device 
 * reset procedure documented in
 * AN-1287. */
 
-   spin_lock_irq(&np->lock);
+   spin_lock_irqsave(&np->lock, flags);
reset_rx(dev);
reinit_rx(dev);
writel(np->ring_dma, ioaddr + RxRingPtr);
check_link(dev);
-   spin_unlock_irq(&np->lock);
+   spin_unlock_irqrestore(&np->lock, flags);
 
/* We'll enable RX on exit from this
 * function. */
@@ -2396,8 +2399,19 @@ static struct net_device_stats *get_stat
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void natsemi_poll_controller(struct net_device *dev)
 {
+   struct netdev_private *np = netdev_priv(dev);
+
disable_irq(dev->irq);
-   intr_handler(dev->irq, dev);
+
+   /*
+* A real interrupt might have already reached us at this point
+* but NAPI might still haven't called us back.  As the interrupt
+* status register is cleared by reading, we should prevent an
+* interrupt loss in this case...
+*/
+   if (!np->intr_status)
+   intr_handler(dev->irq, dev);
+
enable_irq(dev->irq);
 }
 #endif



Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
> Ok, how about the following patch.  Is it acceptable to everyone?
> 
> -   If you are using a distro that was released in 2006 or later,
> -   it should be safe to say N here.
> +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
> +   release from 2007 or later, it should be safe to say N here.
> +
> +   If you are using Debian or other distros that are slow to
> +   update HAL, please say Y here.

What HAL version do you think Debian ought to have, pray tell? And
what the hell version do those other distros have?

The last HAL release was 0.5.8 on 11-Sep-2006. It showed up in
Debian/unstable on 2-Oct. There have been six Debian bugfix releases,
the most recent on 12-Feb.

http://people.freedesktop.org/~david/dist/
http://packages.debian.org/changelogs/pool/main/h/hal/hal_0.5.8.1-6.1/changelog

The last NetworkManager is 0.6.4 released 13-Jul-2006. It showed up in
Debian/unstable on 8-Aug. There have been five bugfix releases, the
most recent on 30-Nov.

http://ftp.gnome.org/pub/GNOME/sources/NetworkManager/0.6/
http://packages.debian.org/changelogs/pool/main/n/network-manager/network-manager_0.6.4-6/changelog

Debian is NOT the problem.

-- 
Mathematics is the supreme nostalgia of our time.


Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers

2007-03-05 Thread Patrick McHardy
Stephen Hemminger wrote:
> Don't bother changing netem. I have a version that uses hrtimer's
> and doesn't use PSCHED() clock source anymore.

Me too :) I'm going to send it with my other patches soon, if you
don't like it we can still drop it.


Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers

2007-03-05 Thread Stephen Hemminger
On Mon, 05 Mar 2007 18:42:26 +0100
Patrick McHardy <[EMAIL PROTECTED]> wrote:

> David Miller wrote:
> > Frankly, I think now that we have ktime and all of the proper generic
> > infrastructure to do this stuff properly, I think we should just use
> > ktime for the packet scheduler across the board and just delete all of
> > that old by-hand timekeeping selection crap from pkt_sched.h
> 
> Sounds good, I'm going to remove all other clock sources.
> Will resend in a couple of days after fixing a few more
> problems I noticed.
> 

Don't bother changing netem. I have a version that uses hrtimer's
and doesn't use PSCHED() clock source anymore.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Olaf Kirch

Hi Neil,

here's another minor comment:

On Friday 02 March 2007 05:28, NeilBrown wrote:
> +static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
> + struct cmsghdr *cmh)
>  {
>   switch (rqstp->rq_sock->sk_sk->sk_family) {
>   case AF_INET: {
> + struct in_pktinfo *pki = CMSG_DATA(cmh);
> + rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
>   break;
> + }
...

The daddr that is extracted here will only ever be used to build
another PKTINFO cmsg when sending the reply. So it would be
much easier to just store the raw control message in the svc_rqst,
without looking at its contents, and send it out along with the reply,
unchanged.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:59:50AM -0500, Theodore Tso wrote:
> On Sun, Mar 04, 2007 at 05:17:29PM -0800, Greg KH wrote:
> > I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is
> > enabled with that patch.  If that is enabled, and that patch still
> > causes problems, please let me know.
> 
> But we still need to update the help text for CONFIG_SYSFS_DEPRECATED to
> make it clear that its deprecation schedule still needs to be 2009 to
> 2011 (depending on whether we want to accommodate Debian's glacial
> release schedule).  Certainly the 2006 date which is currently there
> simply isn't accurate.

Ok, how about the following patch.  Is it acceptable to everyone?

thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
+
+ If you have any problems with devices not being found properly
+ from userspace programs, and this option is disabled, say Y
+ here.
+
+ If you are unsure about this at all, say Y.
 
 config RELAY
bool "Kernel->user space relay support (formerly relayfs)"


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Olaf Kirch
On Friday 02 March 2007 05:28, NeilBrown wrote:
> The sunrpc server code needs to know the source and destination address
> for UDP packets so it can reply properly.
> It currently copies code out of the network stack to pick the pieces out
> of the skb.
> This is ugly and causes compile problems with the IPv6 stuff.

... and this IPv6 code could never have worked anyway:


>   case AF_INET6: {
...
> - rqstp->rq_addrlen = sizeof(struct sockaddr_in);
... this should have been sizeof(struct sockaddr_in6)...

> - /* Remember which interface received this request */
> - ipv6_addr_copy(&rqstp->rq_daddr.addr6,
> - &skb->nh.ipv6h->saddr);
... and this should have copied from daddr, not saddr.

But I find using recvmsg just for getting at the addresses
a little awkward too. And I think to be on the safe side, you
should check that you're really looking at a PKTINFO cmsg
rather than something else.
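
A user-space sketch of that check, for illustration only: walk the control
messages and accept the payload only when level and type say it is an IPv4
PKTINFO. This is not the knfsd code. find_pktinfo() and demo_lookup() are
hypothetical names, and demo_lookup() builds a control buffer by hand so the
check can be tried without a socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Copy the first IPv4 PKTINFO payload into *out. Returns 1 if one was
 * found, 0 otherwise. Anything with another level/type is skipped. */
static int find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
    struct cmsghdr *cmh;

    for (cmh = CMSG_FIRSTHDR(msg); cmh != NULL; cmh = CMSG_NXTHDR(msg, cmh)) {
        if (cmh->cmsg_level != IPPROTO_IP || cmh->cmsg_type != IP_PKTINFO)
            continue;
        /* Too short to hold a struct in_pktinfo: do not trust it. */
        if (cmh->cmsg_len < CMSG_LEN(sizeof(*out)))
            continue;
        memcpy(out, CMSG_DATA(cmh), sizeof(*out));
        return 1;
    }
    return 0;
}

/* Demo only: wrap one in_pktinfo in a control message with the given
 * level/type, then report whether find_pktinfo() accepts it and hands
 * back the same ipi_spec_dst. */
static int demo_lookup(int level, int type, in_addr_t spec_dst)
{
    union {
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;   /* forces suitable alignment */
    } control;
    struct in_pktinfo pki, found;
    struct msghdr msg;
    struct cmsghdr *cmh;

    memset(&control, 0, sizeof(control));
    memset(&pki, 0, sizeof(pki));
    memset(&msg, 0, sizeof(msg));
    pki.ipi_spec_dst.s_addr = spec_dst;

    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmh = CMSG_FIRSTHDR(&msg);
    assert(cmh != NULL);
    cmh->cmsg_level = level;
    cmh->cmsg_type = type;
    cmh->cmsg_len = CMSG_LEN(sizeof(pki));
    memcpy(CMSG_DATA(cmh), &pki, sizeof(pki));

    if (!find_pktinfo(&msg, &found))
        return 0;
    return found.ipi_spec_dst.s_addr == spec_dst;
}
```

With this, demo_lookup(IPPROTO_IP, IP_PKTINFO, addr) is accepted, while the
same bytes labelled as SOL_SOCKET/SCM_RIGHTS are ignored instead of being
misread as an address.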

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers

2007-03-05 Thread Patrick McHardy
David Miller wrote:
> Frankly, I think now that we have ktime and all of the proper generic
> infrastructure to do this stuff properly, I think we should just use
> ktime for the packet scheduler across the board and just delete all of
> that old by-hand timekeeping selection crap from pkt_sched.h

Sounds good, I'm going to remove all other clock sources.
Will resend in a couple of days after fixing a few more
problems I noticed.



RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala
> > > 
> > > Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
> > Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> 
> 
> What about your previous comment:
> 
>  "I guess you meant to do this here?
> else if (err)
> return err; "

I saw that this was taken care of in patch-2 for the delete case, but
while err isn't currently applicable to the non-delete case, it would
be proper/complete for err to still be handled for the non-delete case.
Thanks for asking.


Re: SWS for rcvbuf < MTU

2007-03-05 Thread Alex Sidorenko
On March 3, 2007 06:40:12 pm John Heffner wrote:
> David Miller wrote:
> > From: John Heffner <[EMAIL PROTECTED]>
> > Date: Fri, 02 Mar 2007 16:16:39 -0500
> >
> >> Please don't apply the patch I sent.  I've been thinking about this a
> >> bit harder, and it may not fix this particular problem.  (Hard to say
> >> without knowing exactly what it is.)  As the comment above
> >> __tcp_select_window() states, we do not do full receive-side SWS
> >> avoidance because of header prediction.
> >>
> >> Alex, you're right I missed that special zero-window case.  I'm still
> >> not quite sure I'm completely happy with this patch.  I'd like to think
> >> about this a little bit harder...
> >
> > Ok
>
> Alright, I've thought about it a bit more, and I think the patch I sent
> should work.  Alex, any opinion?  Any way you can test this out?

Here are the values from the live kernel (obtained with 'crash') when the
host was in the SWS state:

full_space=708  full_space/2=354
free_space=393
window=76

In this case the test from my original fix, (window < full_space/2),
succeeds (76 < 354). But John's test

free_space > window + full_space/2
393        > 76 + 354 = 430

does not. So I suspect that the new fix will not always work. From tcpdump
traces we can see that the two hosts exchange 76-byte packets for a long
time. From the customer's application log we see that it continues to read
76-byte chunks per read() call, even though more than that is available
in the receive buffer. Technically it's OK for read() to return after
reading just one byte, so if sk->receive_queue contains multiple 76-byte
skbuffs we may return after processing just one skbuff (but we don't
understand the details of why this happens on the customer's system).

Are there any particular reasons why you want to postpone the window update
until free_space becomes > window + full_space/2, rather than updating as
soon as free_space > full_space/2? As the only real-life occurrence of SWS
shows free_space oscillating slightly above full_space/2, I created the fix
specifically to match this phenomenon as seen on the customer's host. We
reach the modified section only when (free_space > full_space/2), so it
should be OK to update the window at this point if mss==full_space.

So yes, we can test John's fix on the customer's host, but I doubt it will
work for the reasons mentioned above. In brief:

'window = free_space' instead of 'window=full_space/2' is OK,
but the test 'free_space > window + full_space/2' is not, for the specific
pattern the customer sees on his hosts.
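
For reference, the two conditions compared above can be written out as plain
predicates. This is only a sketch of the comparisons, not the code in
__tcp_select_window(); the function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition from the original fix: the advertised window has fallen
 * below half of the full receive space. */
static bool sws_test_window_below_half(int window, int full_space)
{
    assert(full_space > 0);
    return window < full_space / 2;
}

/* Condition from John's patch: free space exceeds the current window
 * by more than half of the full receive space. */
static bool sws_test_free_above_window_plus_half(int free_space, int window,
                                                 int full_space)
{
    assert(full_space > 0);
    return free_space > window + full_space / 2;
}
```

With the values from the crash dump (full_space=708, free_space=393,
window=76) the first predicate is true (76 < 354) and the second is false
(393 is not greater than 430).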

Thanks,
Alex


-- 
--
Alexandre Sidorenko email: [EMAIL PROTECTED]
Global Solutions Engineering:   Unix Networking
Hewlett-Packard (Canada)
--


RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread James Morris
On Mon, 5 Mar 2007, Venkat Yekkirala wrote:

> > 
> > Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
> Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> 

What about your previous comment:

 "I guess you meant to do this here?
else if (err)
return err; "




-- 
James Morris
<[EMAIL PROTECTED]>



Why should we teach students Linux??

2007-03-05 Thread Roel Bindels
Hello listers,

I'm a tutor at the Faculty ICT, department NID. This is a bachelor degree
and we are preparing our students to become something more than just
System Administrators (such as managers, consultants, etc). Since this
department is part of the Microsoft camp, the students are educated
mostly in this direction, which I think is not a bad thing. A better
thing would be if we could give our students the opportunity to meet
both systems on the same level; at least, that is my opinion.

To change the curriculum of a study, I need a solid case. So if somebody
knows a link/document about why we should educate our students in the
Linux OS, please send it. Articles about the usage of Linux in companies
are welcome too.

I hope you will all take some time to send me your best links/documents.

with best regards

Roel Bindels



RE: [PATCH] xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa

2007-03-05 Thread Venkat Yekkirala
> Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if
> there was any permission/security failures in attempting to do the del
> operation (such as permission denied from security_xfrm_state_delete).
> This patch moves the audit hook to the exit path such that 
> all failures
> (and successes) will actually get audited.

I'm not sure ALL failures are being audited this way elsewhere, but I guess
they will catch up in due course.

> 
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> 


RE: [PATCH] Add xfrm policy change auditing to pfkey_spdget

2007-03-05 Thread Venkat Yekkirala

> pfkey_spdget neither had an LSM security hook nor auditing for the
> removal of xfrm_policy structs.  The security hook was added 
> when it was
> moved into xfrm_policy_byid instead of the callers to that function by
> my earlier patch and this patch adds the auditing hooks as well.
> 
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]>  


RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala
> 
> Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
Acked-by: Venkat Yekkirala <[EMAIL PROTECTED]> 


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Tomasz Torcz
On Mon, Mar 05, 2007 at 01:13:26AM -0600, Matt Mackall wrote:
> On Sun, Mar 04, 2007 at 11:02:48PM -0800, Greg KH wrote:
> > On Mon, Mar 05, 2007 at 12:42:29AM -0600, Matt Mackall wrote:
> > > On Sun, Mar 04, 2007 at 05:16:25PM -0800, Greg KH wrote:
> > > > On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote:
> > > > > Recent kernels are having troubles with wireless for me. Two seemingly
> > > > > related problems:
> > > > > 
> > > > > a) NetworkManager seems oblivious to the existence of my IPW2200
> > > > > b) Manual iwconfig waits for 60s and then reports:
> > > > > 
> > > > > Error for wireless request "Set Encode" (8B2A) :
> > > > > SET failed on device eth1 ; Operation not supported.
> > > > 
> > > > Do you have CONFIG_SYSFS_DEPRECATED enabled?  If not, please do as that
> > > > will keep you from having to change any userspace code.
> > > 
> > > No, it's disabled. Will test once I'm done tracking down the iwconfig
> > > problem. From the help text for SYSFS_DEPRECATED:
> > > 
> > >   If you are using a distro that was released in 2006 or
> > > later, it should be safe to say N here.
> > > 
> > > If we need an as-yet-unreleased HAL without it, I would say the above
> > > should be changed to 2008 or so. If Debian actually cuts a release in
> > > the next few months, you might make that 2010.
> > 
> > Well, just because Debian has such a slow release cycle, should the rest
> > of the world be forced to follow suit?  :)
> > 
> > When I originally wrote that, I thought Debian would have already done
> > their release, my mistake...
> 
> That's not the point. The point is that Debian/unstable as of _this
> morning_ doesn't work. For reference, I'm running both the latest
> releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
> there are people telling me I need a copy of HAL out of git that
> hasn't even been released for Debian to package. Debian isn't the
> problem here.

  hal 0.5.9-rc1 (released, not from git) should work. It will
probably be released soon and picked up by sane distributions. Debian is a
very irritating corner case.

-- 
Tomasz Torcz               Only gods can safely risk perfection,
[EMAIL PROTECTED]          it's a dangerous thing for a man.  -- Alia



[PATCH/RFC 13/13] iptables tproxy match

2007-03-05 Thread KOVACS Krisztian
Implements an iptables match module that matches packets with the
tproxy flag set, that is, packets diverted in the tproxy table.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 net/netfilter/Kconfig |9 +
 net/netfilter/Makefile|1 +
 net/netfilter/xt_tproxy.c |   77 +
 3 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 253fce3..b22346e 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -603,6 +603,15 @@ config NETFILTER_XT_MATCH_QUOTA
  If you want to compile it as a module, say M here and read
  .  If unsure, say `N'.
 
+config NETFILTER_XT_MATCH_TPROXY
+   tristate '"tproxy" match support'
+   depends on NETFILTER_XTABLES
+   help
+ This option adds a `tproxy' match, which allows you to match
+ packets which have been diverted to local sockets by TProxy.
+
+ To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_REALM
tristate  '"realm" match support'
depends on NETFILTER_XTABLES
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index b2b5c75..83b2fd9 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -64,6 +64,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_MARK) += xt_mark.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_MULTIPORT) += xt_multiport.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_POLICY) += xt_policy.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_PKTTYPE) += xt_pkttype.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_TPROXY) += xt_tproxy.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_SCTP) += xt_sctp.o
diff --git a/net/netfilter/xt_tproxy.c b/net/netfilter/xt_tproxy.c
new file mode 100644
index 000..53f8bee
--- /dev/null
+++ b/net/netfilter/xt_tproxy.c
@@ -0,0 +1,77 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2007 BalaBit IT Ltd.
+ * Author: Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include 
+#include 
+
+#include 
+
+static int
+match(const struct sk_buff *skb,
+  const struct net_device *in,
+  const struct net_device *out,
+  const struct xt_match *match,
+  const void *matchinfo,
+  int offset,
+  unsigned int protoff,
+  int *hotdrop)
+{
+   return skb->ip_tproxy;
+}
+
+static int
+check(const char *tablename,
+  const void *entry,
+  const struct xt_match *match,
+  void *matchinfo,
+  unsigned int hook_mask)
+{
+   return 1;
+}
+
+static struct xt_match tproxy_matches[] = {
+   {
+   .name   = "tproxy",
+   .match  = match,
+   .matchsize  = 0,
+   .checkentry = check,
+   .family = AF_INET,
+   .me = THIS_MODULE,
+   },
+   {
+   .name   = "tproxy",
+   .match  = match,
+   .matchsize  = 0,
+   .checkentry = check,
+   .family = AF_INET6,
+   .me = THIS_MODULE,
+   },
+};
+
+static int __init xt_tproxy_init(void)
+{
+   return xt_register_matches(tproxy_matches, ARRAY_SIZE(tproxy_matches));
+}
+
+static void __exit xt_tproxy_fini(void)
+{
+   xt_unregister_matches(tproxy_matches, ARRAY_SIZE(tproxy_matches));
+}
+
+module_init(xt_tproxy_init);
+module_exit(xt_tproxy_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs <[EMAIL PROTECTED]>");
+MODULE_DESCRIPTION("iptables tproxy match module");
+MODULE_ALIAS("ipt_tproxy");
+MODULE_ALIAS("ip6t_tproxy");



[PATCH/RFC 12/13] iptables TPROXY target

2007-03-05 Thread KOVACS Krisztian
The TPROXY target implements redirection of non-local TCP/UDP traffic
to local sockets. It is simply a wrapper around functionality exported
from iptable_tproxy.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/linux/netfilter_ipv4/ipt_TPROXY.h |9 +++
 net/ipv4/netfilter/Kconfig|   11 +++
 net/ipv4/netfilter/Makefile   |1 
 net/ipv4/netfilter/ipt_TPROXY.c   |   92 +
 4 files changed, 113 insertions(+), 0 deletions(-)

diff --git a/include/linux/netfilter_ipv4/ipt_TPROXY.h b/include/linux/netfilter_ipv4/ipt_TPROXY.h
new file mode 100644
index 000..d05c956
--- /dev/null
+++ b/include/linux/netfilter_ipv4/ipt_TPROXY.h
@@ -0,0 +1,9 @@
+#ifndef _IPT_TPROXY_H_target
+#define _IPT_TPROXY_H_target
+
+struct ipt_tproxy_target_info {
+   u_int16_t lport;
+   u_int32_t laddr;
+};
+
+#endif
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 17c3ec8..ecd8da5 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -638,6 +638,17 @@ config IP_NF_TPROXY
 
  To compile it as a module, choose M here.  If unsure, say N.
 
+config IP_NF_TARGET_TPROXY
+   tristate "TPROXY target support"
+   depends on IP_NF_TPROXY
+   help
+ This option adds a `TPROXY' target, which is somewhat similar to
+ REDIRECT.  It can only be used in the tproxy table and is useful
+ to redirect traffic to a transparent proxy.  It does _not_ depend
+ on Netfilter connection tracking.
+
+ To compile it as a module, choose M here.  If unsure, say N.
+
 # ARP tables
 config IP_NF_ARPTABLES
tristate "ARP tables support"
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 21a29f4..a50a64e 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -106,6 +106,7 @@ obj-$(CONFIG_IP_NF_TARGET_LOG) += ipt_LOG.o
 obj-$(CONFIG_IP_NF_TARGET_ULOG) += ipt_ULOG.o
 obj-$(CONFIG_IP_NF_TARGET_CLUSTERIP) += ipt_CLUSTERIP.o
 obj-$(CONFIG_IP_NF_TARGET_TTL) += ipt_TTL.o
+obj-$(CONFIG_IP_NF_TARGET_TPROXY) += ipt_TPROXY.o
 
 # generic ARP tables
 obj-$(CONFIG_IP_NF_ARPTABLES) += arp_tables.o
diff --git a/net/ipv4/netfilter/ipt_TPROXY.c b/net/ipv4/netfilter/ipt_TPROXY.c
new file mode 100644
index 000..89a08b1
--- /dev/null
+++ b/net/ipv4/netfilter/ipt_TPROXY.c
@@ -0,0 +1,92 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2006-2007 BalaBit IT Ltd.
+ * Author: Balazs Scheidler, Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+static unsigned int
+target(struct sk_buff **pskb,
+   const struct net_device *in,
+   const struct net_device *out,
+   unsigned int hooknum,
+   const struct xt_target *target,
+   const void *targinfo)
+{
+   const struct iphdr *iph = (*pskb)->nh.iph;
+   const struct ipt_tproxy_target_info *tgi =
+   (const struct ipt_tproxy_target_info *) targinfo;
+   unsigned int verdict = NF_ACCEPT;
+   struct sk_buff *skb = *pskb;
+   struct udphdr _hdr, *hp;
+   struct sock *sk;
+   __be32 daddr;
+   __be16 dport;
+
+   /* TCP/UDP only */
+   if ((iph->protocol != IPPROTO_TCP) &&
+   (iph->protocol != IPPROTO_UDP))
+   return NF_ACCEPT;
+
+   hp = skb_header_pointer(*pskb, iph->ihl * 4, sizeof(_hdr), &_hdr);
+   if (hp == NULL)
+   return NF_DROP;
+
+   daddr = tgi->laddr ? : iph->daddr;
+   dport = tgi->lport ? : hp->dest;
+   sk = ip_tproxy_get_sock(iph->protocol,
+   iph->saddr, daddr,
+   hp->source, dport, in);
+   if (sk != NULL) {
+   if (ip_tproxy_do_divert(skb, sk, 0, in) < 0)
+   verdict = NF_DROP;
+
+   if ((iph->protocol == IPPROTO_TCP) && (sk->sk_state == TCP_TIME_WAIT))
+   inet_twsk_put(inet_twsk(sk));
+   else
+   sock_put(sk);
+   }
+
+   return verdict;
+}
+
+static struct xt_target ipt_tproxy_reg = {
+   .name   = "TPROXY",
+   .family = AF_INET,
+   .target = target,
+   .targetsize = sizeof(struct ipt_tproxy_target_info),
+   .table  = "tproxy",
+   .me = THIS_MODULE,
+};
+
+static int __init init(void)
+{
+   return xt_register_target(&ipt_tproxy_reg);
+}
+
+static void __exit fini(void)
+{
+   xt_unregister_target(&ipt_tproxy_reg);
+}
+
+module_init(init);
+module_exit(fini);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs <[EMAIL PROTECTED]>");
+MODULE_DESCRIPTION("Netfilter transparent proxy TPROXY target modu

[PATCH/RFC 11/13] iptables tproxy table

2007-03-05 Thread KOVACS Krisztian
The iptables tproxy table registers a new hook on PRE_ROUTING and, for
each incoming TCP/UDP packet, proceeds as follows:

1. Does IPv4 fragment reassembly. We need this to be able to do TCP/UDP
   header processing.

2. Does a TCP/UDP socket hash lookup to decide whether or not the packet
   is sent to a non-local bound socket. If a matching socket is found
   and the socket has the IP_TRANSPARENT socket option enabled the skb is
   diverted locally and the socket reference is stored in the skb.

3. If no matching socket was found, the PREROUTING chain of the
   iptables tproxy table is consulted. Matching rules with the TPROXY
   target can do transparent redirection here. (In this case it is not
   necessary to have the IP_TRANSPARENT socket option enabled for the
   target socket; redirection takes place even for "regular"
   sockets. This way no modification of the application is necessary.)
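
The decision in steps 2 and 3 can be summarised as a small user-space model.
This is a sketch for illustration, not the code in iptable_tproxy.c: all names
are hypothetical, step 1 (reassembly) is assumed to have happened already, and
the handling of a matching socket without IP_TRANSPARENT is an assumption
because the description does not spell that case out.

```c
#include <assert.h>

/* Possible outcomes for one incoming TCP/UDP packet in this toy model. */
enum toy_tproxy_result {
    TOY_PASS,           /* not diverted; normal processing continues */
    TOY_DIVERT_SOCKET,  /* step 2: matching IP_TRANSPARENT socket found */
    TOY_DIVERT_RULE     /* step 3: redirected by a TPROXY rule */
};

/*
 * have_socket:        the socket hash lookup found a matching socket
 * socket_transparent: that socket has IP_TRANSPARENT enabled
 * rule_matches:       a TPROXY rule in the PREROUTING chain matches and
 *                     its target socket exists
 */
static enum toy_tproxy_result
toy_tproxy_decide(int have_socket, int socket_transparent, int rule_matches)
{
    /* The transparent bit only means something if a socket was found. */
    assert(have_socket || !socket_transparent);

    if (have_socket) {
        /* Assumption: a matching socket without IP_TRANSPARENT is
         * left alone and the rules are not consulted. */
        return socket_transparent ? TOY_DIVERT_SOCKET : TOY_PASS;
    }

    /* No socket found: the tproxy table's PREROUTING chain decides.
     * The rule's target socket does not need IP_TRANSPARENT. */
    return rule_matches ? TOY_DIVERT_RULE : TOY_PASS;
}
```

The point of the split is that applications which set IP_TRANSPARENT get
their traffic without any rules, while unmodified applications can still be
served through an explicit TPROXY rule.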

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/linux/netfilter_ipv4.h   |1 
 include/linux/netfilter_ipv4/ip_tproxy.h |   20 ++
 include/net/ip.h |3 
 net/ipv4/netfilter/Kconfig   |   10 +
 net/ipv4/netfilter/Makefile  |1 
 net/ipv4/netfilter/iptable_tproxy.c  |  267 ++
 6 files changed, 301 insertions(+), 1 deletions(-)

diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h
index ceae87a..cc4d83b 100644
--- a/include/linux/netfilter_ipv4.h
+++ b/include/linux/netfilter_ipv4.h
@@ -58,6 +58,7 @@ enum nf_ip_hook_priorities {
NF_IP_PRI_SELINUX_FIRST = -225,
NF_IP_PRI_CONNTRACK = -200,
NF_IP_PRI_MANGLE = -150,
+   NF_IP_PRI_TPROXY = -125,
NF_IP_PRI_NAT_DST = -100,
NF_IP_PRI_FILTER = 0,
NF_IP_PRI_NAT_SRC = 100,
diff --git a/include/linux/netfilter_ipv4/ip_tproxy.h b/include/linux/netfilter_ipv4/ip_tproxy.h
new file mode 100644
index 000..ae890e3
--- /dev/null
+++ b/include/linux/netfilter_ipv4/ip_tproxy.h
@@ -0,0 +1,20 @@
+#ifndef _IP_TPROXY_H
+#define _IP_TPROXY_H
+
+#include 
+
+/* look up and get a reference to a matching socket */
+extern struct sock *
+ip_tproxy_get_sock(const u8 protocol,
+  const __be32 saddr, const __be32 daddr,
+  const __be16 sport, const __be16 dport,
+  const struct net_device *in);
+
+/* divert skb to a given socket */
+extern int
+ip_tproxy_do_divert(struct sk_buff *skb,
+   const struct sock *sk,
+   const int require_freebind,
+   const struct net_device *in);
+
+#endif
diff --git a/include/net/ip.h b/include/net/ip.h
index 8b71991..a589e6e 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -321,7 +321,8 @@ enum ip_defrag_users
IP_DEFRAG_CONNTRACK_OUT,
IP_DEFRAG_VS_IN,
IP_DEFRAG_VS_OUT,
-   IP_DEFRAG_VS_FWD
+   IP_DEFRAG_VS_FWD,
+   IP_DEFRAG_TP_IN,
 };
 
 struct sk_buff *ip_defrag(struct sk_buff *skb, u32 user);
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 601808c..17c3ec8 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -628,6 +628,16 @@ config IP_NF_RAW
  If you want to compile it as a module, say M here and read
  .  If unsure, say `N'.
 
+# tproxy table
+config IP_NF_TPROXY
+   tristate "Transparent proxying"
+   depends on IP_NF_IPTABLES
+   help
+ Transparent proxying. For more information see
+ http://www.balabit.com/downloads/tproxy.
+
+ To compile it as a module, choose M here.  If unsure, say N.
+
 # ARP tables
 config IP_NF_ARPTABLES
tristate "ARP tables support"
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 6625ec6..21a29f4 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -81,6 +81,7 @@ obj-$(CONFIG_IP_NF_MANGLE) += iptable_mangle.o
 obj-$(CONFIG_IP_NF_NAT) += iptable_nat.o
 obj-$(CONFIG_NF_NAT) += iptable_nat.o
 obj-$(CONFIG_IP_NF_RAW) += iptable_raw.o
+obj-$(CONFIG_IP_NF_TPROXY) += iptable_tproxy.o
 
 # matches
 obj-$(CONFIG_IP_NF_MATCH_IPRANGE) += ipt_iprange.o
diff --git a/net/ipv4/netfilter/iptable_tproxy.c b/net/ipv4/netfilter/iptable_tproxy.c
new file mode 100644
index 000..a241f11
--- /dev/null
+++ b/net/ipv4/netfilter/iptable_tproxy.c
@@ -0,0 +1,267 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2006-2007 BalaBit IT Ltd.
+ * Author: Balazs Scheidler, Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define TPROXY_VALID_HOOKS (1 << NF_IP_PRE_ROUTING)
+
+#if 1
+#define DEBUGP printk
+#else
+#define DE

[PATCH/RFC 09/13] Create a tproxy flag in struct sk_buff

2007-03-05 Thread KOVACS Krisztian
We would like to be able to match on whether or not a given packet has
been diverted by tproxy. To make this possible we need a flag in
sk_buff.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/linux/skbuff.h |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4ff3940..6d7f5c7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -284,7 +284,8 @@ struct sk_buff {
nfctinfo:3;
__u8pkt_type:3,
fclone:2,
-   ipvs_property:1;
+   ipvs_property:1,
+   ip_tproxy:1;
__be16  protocol;
 
void(*destructor)(struct sk_buff *skb);



[PATCH/RFC 10/13] Export UDP socket lookup function

2007-03-05 Thread KOVACS Krisztian
The iptables tproxy code has to be able to do UDP socket hash lookups,
so we have to provide an exported lookup function for this purpose.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/net/udp.h |4 
 net/ipv4/udp.c|8 
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index 1b921fa..ea5aa31 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -141,6 +141,10 @@ extern int udp_lib_setsockopt(struct sock *sk, int level, int optname,
   char __user *optval, int optlen,
   int (*push_pending_frames)(struct sock *));
 
+extern struct sock *udp4_lib_lookup(__be32 saddr, __be16 sport,
+   __be32 daddr, __be16 dport,
+   int dif);
+
 DECLARE_SNMP_STAT(struct udp_mib, udp_statistics);
 /*
  * SNMP statistics for UDP and UDP-Lite
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 1d15edc..52695a6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -285,6 +285,14 @@ static struct sock *__udp4_lib_lookup(__be32 saddr, __be16 sport,
return result;
 }
 
+struct sock *udp4_lib_lookup(__be32 saddr, __be16 sport,
+__be32 daddr, __be16 dport,
+int dif)
+{
+   return __udp4_lib_lookup(saddr, sport, daddr, dport, dif, udp_hash);
+}
+EXPORT_SYMBOL_GPL(udp4_lib_lookup);
+
 static inline struct sock *udp_v4_mcast_next(struct sock *sk,
 __be16 loc_port, __be32 loc_addr,
 __be16 rmt_port, __be32 rmt_addr,



[PATCH/RFC 08/13] Handle TCP SYN+ACK/ACK/RST transparency

2007-03-05 Thread KOVACS Krisztian
The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
incoming packets. The non-local source address check on output bites
us again, as replies for transparently redirected traffic won't have a
chance to leave the node.

This patch selectively sets the FLOWI_FLAG_TRANSPARENT flag when doing
the route lookup for those replies. Transparent replies are enabled if
the listening socket has the transparent socket flag set.
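
A compact sketch of the two selection rules used in the diff below: one for
route lookups made on behalf of a socket, one for stateless replies sent via
ip_send_reply(). The TOY_ names and the value of the transparent flag are
placeholders for illustration; only the value 1 for the NOSRCCHECK bit is
taken from the patch.

```c
#include <assert.h>

#define TOY_FLOWI_FLAG_TRANSPARENT  0x01 /* placeholder value, not from the patch */
#define TOY_IP_REPLY_ARG_NOSRCCHECK 1    /* same value as IP_REPLY_ARG_NOSRCCHECK */

/* Route lookup on behalf of a socket: follow the socket's transparent bit. */
static int toy_flow_flags_for_sock(int transparent)
{
    assert(transparent == 0 || transparent == 1);   /* it is a 1-bit flag */
    return transparent ? TOY_FLOWI_FLAG_TRANSPARENT : 0;
}

/* Stateless reply (RST/ACK): the caller passes the request through the
 * flags field of the reply argument instead. */
static int toy_flow_flags_for_reply(int reply_arg_flags)
{
    return (reply_arg_flags & TOY_IP_REPLY_ARG_NOSRCCHECK) ?
        TOY_FLOWI_FLAG_TRANSPARENT : 0;
}
```

In both cases the flag is set only when explicitly requested, so the
non-local source address check stays in force for ordinary sockets.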

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/net/ip.h|3 +++
 include/net/request_sock.h  |3 ++-
 net/ipv4/inet_connection_sock.c |2 ++
 net/ipv4/ip_output.c|6 +-
 net/ipv4/syncookies.c   |2 ++
 net/ipv4/tcp_ipv4.c |   16 ++--
 net/ipv4/tcp_minisocks.c|3 ++-
 7 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index e79c3e3..8b71991 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -133,8 +133,11 @@ static inline void ip_tr_mc_map(__be32 addr, char *buf)
buf[5]=0x00;
 }
 
+#define IP_REPLY_ARG_NOSRCCHECK 1
+
 struct ip_reply_arg {
struct kvec iov[1];   
+   int flags;
__wsum  csum;
int csumoffset; /* u16 offset of csum in iov[0].iov_base */
/* -1 if not needed */ 
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index 7aed02c..b9c8974 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -34,7 +34,8 @@ struct request_sock_ops {
   struct request_sock *req,
   struct dst_entry *dst);
void(*send_ack)(struct sk_buff *skb,
-   struct request_sock *req);
+   struct request_sock *req,
+   int reply_flags);
void(*send_reset)(struct sock *sk,
  struct sk_buff *skb);
void(*destructor)(struct request_sock *req);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 83ad972..90459a1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -323,6 +323,8 @@ struct dst_entry* inet_csk_route_req(struct sock *sk,
.saddr = ireq->loc_addr,
.tos = RT_CONN_FLAGS(sk) } },
.proto = sk->sk_protocol,
+   .flags = inet_sk(sk)->transparent ?
+   FLOWI_FLAG_TRANSPARENT : 0,
.uli_u = { .ports =
   { .sport = inet_sk(sk)->sport,
 .dport = ireq->rmt_port } } };
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d096332..7af25d4 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -312,6 +312,8 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
.saddr = inet->saddr,
.tos = 
RT_CONN_FLAGS(sk) } },
.proto = sk->sk_protocol,
+   .flags = inet->transparent ?
+FLOWI_FLAG_TRANSPARENT 
: 0,
.uli_u = { .ports =
   { .sport = inet->sport,
 .dport = inet->dport } 
} };
@@ -1357,7 +1359,9 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, 
struct ip_reply_arg *ar
.uli_u = { .ports =
   { .sport = skb->h.th->dest,
 .dport = skb->h.th->source } },
-   .proto = sk->sk_protocol };
+   .proto = sk->sk_protocol,
+   .flags = (arg->flags & 
IP_REPLY_ARG_NOSRCCHECK) ?
+   FLOWI_FLAG_TRANSPARENT : 0 };
security_skb_classify_flow(skb, &fl);
if (ip_route_output_key(&rt, &fl))
return;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 431c81d..08d8920 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -261,6 +261,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct 
sk_buff *skb,
.saddr = ireq->loc_addr,
.tos = RT_CONN_FLAGS(sk) } },
.proto = IPPROTO_TCP,
+   .flags = inet_sk(sk)->transparent ?
+   

[PATCH/RFC 07/13] Conditionally enable transparent flow flag when connecting

2007-03-05 Thread KOVACS Krisztian
Set FLOWI_FLAG_TRANSPARENT in flowi->flags if the socket has the
transparent socket option set. This way we selectively enable certain
connections with non-local source addresses to be routed.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/net/route.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index 13da592..4dff368 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -161,6 +161,10 @@ static inline int ip_route_connect(struct rtable **rp, 
__be32 dst,
 .dport = dport } } };
 
int err;
+
+   if (inet_sk(sk)->transparent)
+   fl.flags |= FLOWI_FLAG_TRANSPARENT;
+
if (!dst || !src) {
err = __ip_route_output_key(rp, &fl);
if (err)



[PATCH/RFC 06/13] Implement IP_TRANSPARENT socket option

2007-03-05 Thread KOVACS Krisztian
This patch introduces the IP_TRANSPARENT socket option: enabling that will make
the IPv4 routing omit the non-local source address check on output. Setting
IP_TRANSPARENT requires NET_ADMIN capability.
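
From user space, enabling the option could look roughly like the sketch below. It assumes the option number 19 proposed in the include/linux/in.h hunk of this patch; the helper name enable_transparent() is hypothetical and is not part of the patch set.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Value taken from the include/linux/in.h hunk of this patch; defined
 * here only if the system headers do not already provide it. */
#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19
#endif

/*
 * Hypothetical helper: returns 0 on success, -1 with errno set on
 * failure. With this patch, a caller lacking CAP_NET_ADMIN is expected
 * to see EPERM.
 */
int enable_transparent(int fd)
{
	int one = 1;

	return setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one));
}
```

The option would be set before bind() or connect(), so that the later route lookups see the flag.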

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/linux/in.h   |1 +
 include/net/inet_sock.h  |3 ++-
 include/net/inet_timewait_sock.h |3 ++-
 include/net/route.h  |1 +
 net/ipv4/inet_timewait_sock.c|1 +
 net/ipv4/ip_sockglue.c   |   12 +++-
 6 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/linux/in.h b/include/linux/in.h
index 1912e7c..66be615 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -75,6 +75,7 @@ struct in_addr {
 #define IP_IPSEC_POLICY16
 #define IP_XFRM_POLICY 17
 #define IP_PASSSEC 18
+#define IP_TRANSPARENT 19
 
 /* BSD compatibility */
 #define IP_RECVRETOPTS IP_RETOPTS
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 0bd167b..14b597d 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -128,7 +128,8 @@ struct inet_sock {
is_icsk:1,
freebind:1,
hdrincl:1,
-   mc_loop:1;
+   mc_loop:1,
+   transparent:1;
int mc_index;
__be32  mc_addr;
struct ip_mc_socklist   *mc_list;
diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index f7be1ac..e30dd61 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -126,7 +126,8 @@ struct inet_timewait_sock {
__be16  tw_dport;
__u16   tw_num;
/* And these are ours. */
-   __u8tw_ipv6only:1;
+   __u8tw_ipv6only:1,
+   tw_transparent:1;
/* 15 bits hole, try to pack */
__u16   tw_ipv6_offset;
int tw_timeout;
diff --git a/include/net/route.h b/include/net/route.h
index efaa6b2..13da592 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index a73cf93..f57f81a 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -108,6 +108,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct 
sock *sk, const int stat
tw->tw_reuse= sk->sk_reuse;
tw->tw_hash = sk->sk_hash;
tw->tw_ipv6only = 0;
+   tw->tw_transparent  = inet->transparent;
tw->tw_prot = sk->sk_prot_creator;
atomic_set(&tw->tw_refcnt, 1);
inet_twsk_dead_node_init(tw);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 23048d9..02e8d9f 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -414,7 +414,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
(1<= sizeof(int)) {
@@ -875,6 +875,16 @@ mc_msf_out:
err = xfrm_user_policy(sk, optname, optval, optlen);
break;
 
+   case IP_TRANSPARENT:
+   if (!capable(CAP_NET_ADMIN)) {
+   err = -EPERM;
+   break;
+   }
+   if (optlen < 1)
+   goto e_inval;
+   inet->transparent = !!val;
+   break;
+
default:
err = -ENOPROTOOPT;
break;



[PATCH/RFC 04/13] Don't do the UDP socket lookup if we already have one attached

2007-03-05 Thread KOVACS Krisztian
The UDP input code path looks up the UDP socket hash tables to find a
socket matching the incoming packet. However, as iptable_tproxy does
socket lookups early, the skb may already have the appropriate
reference attached. In that case we steal that reference instead of
doing the lookup.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 net/ipv4/udp.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ce6c460..1d15edc 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1226,8 +1226,15 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct 
hlist_head udptable[],
if(rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST))
return __udp4_lib_mcast_deliver(skb, uh, saddr, daddr, 
udptable);
 
-   sk = __udp4_lib_lookup(saddr, uh->source, daddr, uh->dest,
-  skb->dev->ifindex, udptable);
+   if (skb->sk) {
+   /* steal reference */
+   sk = skb->sk;
+   skb->destructor = NULL;
+   skb->sk = NULL;
+   } else {
+   sk = __udp4_lib_lookup(saddr, uh->source, daddr, uh->dest,
+  skb->dev->ifindex, udptable);
+   }
 
if (sk != NULL) {
int ret = udp_queue_rcv_skb(sk, skb);



[PATCH/RFC 05/13] Loosen source address check on IPv4 output

2007-03-05 Thread KOVACS Krisztian
ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.

This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.
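
The cache-key comparison this patch adds boils down to an XOR followed by a mask. The standalone sketch below uses the flag values from the include/net/flow.h hunk; the function name transparent_bit_matches() is hypothetical and only illustrates the idiom.

```c
#include <stdint.h>

#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
#define FLOWI_FLAG_TRANSPARENT       0x02

/*
 * Returns 1 when both flag bytes agree on the transparent bit.
 * XOR leaves a 1 in every bit position where a and b differ; the mask
 * then discards all differences except the transparent bit.
 */
int transparent_bit_matches(uint8_t a, uint8_t b)
{
	return ((a ^ b) & FLOWI_FLAG_TRANSPARENT) == 0;
}
```

Masking keeps unrelated flag bits from splitting otherwise identical routing cache entries, while transparent and non-transparent flows never share one.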

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/net/flow.h |1 +
 net/ipv4/route.c   |8 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index ce4b10d..9eb91f2 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -49,6 +49,7 @@ struct flowi {
__u8proto;
__u8flags;
 #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
+#define FLOWI_FLAG_TRANSPARENT 0x02
union {
struct {
__be16  sport;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index c526fb2..8091a96 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -572,7 +572,8 @@ static inline int compare_keys(struct flowi *fl1, struct 
flowi *fl2)
(*(u16 *)&fl1->nl_u.ip4_u.tos ^
 *(u16 *)&fl2->nl_u.ip4_u.tos) |
(fl1->oif ^ fl2->oif) |
-   (fl1->iif ^ fl2->iif)) == 0;
+   (fl1->iif ^ fl2->iif) |
+   ((fl1->flags ^ fl2->flags) & FLOWI_FLAG_TRANSPARENT)) == 0;
 }
 
 #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED
@@ -2338,6 +2339,7 @@ static inline int __mkroute_output(struct rtable **result,
rth->fl.fl4_src = oldflp->fl4_src;
rth->fl.oif = oldflp->oif;
rth->fl.mark= oldflp->mark;
+   rth->fl.flags   = oldflp->flags;
rth->rt_dst = fl->fl4_dst;
rth->rt_src = fl->fl4_src;
rth->rt_iif = oldflp->oif ? : dev_out->ifindex;
@@ -2482,6 +2484,7 @@ static int ip_route_output_slow(struct rtable **rp, const 
struct flowi *oldflp)
  RT_SCOPE_LINK :
  RT_SCOPE_UNIVERSE),
  } },
+   .flags = oldflp->flags,
.mark = oldflp->mark,
.iif = loopback_dev.ifindex,
.oif = oldflp->oif };
@@ -2506,7 +2509,7 @@ static int ip_route_output_slow(struct rtable **rp, const 
struct flowi *oldflp)
 
/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
dev_out = ip_dev_find(oldflp->fl4_src);
-   if (dev_out == NULL)
+   if (dev_out == NULL && !(oldflp->flags & 
FLOWI_FLAG_TRANSPARENT))
goto out;
 
/* I removed check for oif == dev_out->oif here.
@@ -2678,6 +2681,7 @@ int __ip_route_output_key(struct rtable **rp, const 
struct flowi *flp)
rth->fl.iif == 0 &&
rth->fl.oif == flp->oif &&
rth->fl.mark == flp->mark &&
+   !((rth->fl.flags ^ flp->flags) & FLOWI_FLAG_TRANSPARENT) &&
!((rth->fl.fl4_tos ^ flp->fl4_tos) &
(IPTOS_RT_MASK | RTO_ONLINK))) {
 



[PATCH/RFC 03/13] Don't do the TCP socket lookup if we already have one attached

2007-03-05 Thread KOVACS Krisztian
The TCP input code path looks up the TCP socket hash tables to find a
socket matching the incoming packet. However, as iptable_tproxy does
socket lookups early, the skb may already have the appropriate
reference attached. In that case we steal that reference instead of
doing the lookup.
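
The ownership transfer described above can be modelled outside the kernel as follows. This is only a sketch: struct mock_sock, struct mock_skb and both functions are hypothetical stand-ins, not kernel APIs.

```c
#include <stddef.h>

/* Simplified stand-ins for struct sock and struct sk_buff; only the
 * fields needed to show the ownership transfer are modelled. */
struct mock_sock {
	int refcnt;
};

struct mock_skb {
	struct mock_sock *sk;
	void (*destructor)(struct mock_skb *skb);
};

/* Destructor that would drop the skb's socket reference on free. */
void mock_skb_put_sock(struct mock_skb *skb)
{
	if (skb->sk)
		skb->sk->refcnt--;
	skb->sk = NULL;
}

/*
 * Take over the reference held by the skb. Returns NULL when no socket
 * is attached, in which case the caller does the normal hash lookup.
 */
struct mock_sock *steal_attached_sock(struct mock_skb *skb)
{
	struct mock_sock *sk = skb->sk;

	if (sk) {
		/* The reference now belongs to the caller, so the skb
		 * must neither point at the socket nor release it later. */
		skb->destructor = NULL;
		skb->sk = NULL;
	}
	return sk;
}
```

Clearing the destructor together with the pointer matters: otherwise freeing the skb would drop a reference the caller now owns.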

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 net/ipv4/tcp_ipv4.c |   13 ++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0ba74bb..536db7b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1647,9 +1647,16 @@ int tcp_v4_rcv(struct sk_buff *skb)
TCP_SKB_CB(skb)->flags   = skb->nh.iph->tos;
TCP_SKB_CB(skb)->sacked  = 0;
 
-   sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, th->source,
-  skb->nh.iph->daddr, th->dest,
-  inet_iif(skb));
+   if (unlikely(skb->sk)) {
+   /* steal reference */
+   sk = skb->sk;
+   skb->destructor = NULL;
+   skb->sk = NULL;
+   } else {
+   sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, 
th->source,
+  skb->nh.iph->daddr, th->dest,
+  inet_iif(skb));
+   }
 
if (!sk)
goto no_tcp_socket;



[PATCH/RFC 02/13] Port redirection support for TCP

2007-03-05 Thread KOVACS Krisztian
Current TCP code relies on the local port of the listening socket
being the same as the destination port of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port.

This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.
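
As a rough model of the idea (hypothetical mock types, byte order ignored), the request records the destination port of the incoming SYN and the reply uses it as its source port, regardless of the port the listener is actually bound to:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the TCP header and for
 * struct inet_request_sock. */
struct mock_tcphdr {
	uint16_t source;
	uint16_t dest;
};

struct mock_request {
	uint16_t loc_port;	/* destination port the client used */
	uint16_t rmt_port;	/* client's source port */
};

/* Record both ports of the incoming SYN in the request. */
void mock_openreq_init(struct mock_request *req,
		       const struct mock_tcphdr *syn)
{
	req->rmt_port = syn->source;
	req->loc_port = syn->dest;
}

/* Build the reply ports from the request, not from the listener. */
void mock_make_synack(struct mock_tcphdr *th,
		      const struct mock_request *req)
{
	th->source = req->loc_port;
	th->dest = req->rmt_port;
}
```

With a listener bound to, say, port 8080 and traffic for port 80 redirected to it, the reply still carries source port 80, which is what the client expects.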

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/net/inet_sock.h |1 +
 include/net/tcp.h   |1 +
 net/ipv4/inet_connection_sock.c |2 ++
 net/ipv4/syncookies.c   |1 +
 net/ipv4/tcp_output.c   |2 +-
 5 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index ce6da97..0bd167b 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -64,6 +64,7 @@ struct inet_request_sock {
 #endif
__be32  loc_addr;
__be32  rmt_addr;
+   __be16  loc_port;
__be16  rmt_port;
u16 snd_wscale : 4, 
rcv_wscale : 4, 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5c472f2..e1cb3d0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -982,6 +982,7 @@ static inline void tcp_openreq_init(struct request_sock 
*req,
ireq->acked = 0;
ireq->ecn_ok = 0;
ireq->rmt_port = skb->h.th->source;
+   ireq->loc_port = skb->h.th->dest;
 }
 
 extern void tcp_enter_memory_pressure(void);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 43fb160..83ad972 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -502,6 +502,8 @@ struct sock *inet_csk_clone(struct sock *sk, const struct 
request_sock *req,
newicsk->icsk_bind_hash = NULL;
 
inet_sk(newsk)->dport = inet_rsk(req)->rmt_port;
+   inet_sk(newsk)->num = ntohs(inet_rsk(req)->loc_port);
+   inet_sk(newsk)->sport = inet_rsk(req)->loc_port;
newsk->sk_write_space = sk_stream_write_space;
 
newicsk->icsk_retransmits = 0;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 33016cc..431c81d 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -223,6 +223,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct 
sk_buff *skb,
treq->rcv_isn   = ntohl(skb->h.th->seq) - 1;
treq->snt_isn   = cookie;
req->mss= mss;
+   ireq->loc_port  = skb->h.th->dest;
ireq->rmt_port  = skb->h.th->source;
ireq->loc_addr  = skb->nh.iph->daddr;
ireq->rmt_addr  = skb->nh.iph->saddr;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index dc15113..a3ea7a1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2135,7 +2135,7 @@ struct sk_buff * tcp_make_synack(struct sock *sk, struct 
dst_entry *dst,
th->syn = 1;
th->ack = 1;
TCP_ECN_make_synack(req, th);
-   th->source = inet_sk(sk)->sport;
+   th->source = ireq->loc_port;
th->dest = ireq->rmt_port;
TCP_SKB_CB(skb)->seq = tcp_rsk(req)->snt_isn;
TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(skb)->seq + 1;



[PATCH/RFC 01/13] Implement local diversion of IPv4 skbs

2007-03-05 Thread KOVACS Krisztian
The input path for non-local bound sockets requires diverting certain
packets locally, even if their destination IP address is not
considered local. We achieve this by assigning a specially crafted dst
entry to these skbs, and optionally also attaching a socket to the skb
so that the upper layer code does not need to redo the socket lookup.

We also have to be able to differentiate between these fake entries
and "real" entries in the cache: it is perfectly legal that the
diversion is done only for certain TCP or UDP packets and not for all
packets of the flow. Since these special dst entries are used only by
the iptables tproxy code, and that code uses exclusively these
entries, simply flagging these entries as DST_DIVERTED is OK. All
other cache lookup paths skip diverted entries, while our new
ip_divert_local() function uses exclusively diverted dst entries.

Signed-off-by: KOVACS Krisztian <[EMAIL PROTECTED]>

---

 include/net/dst.h   |1 
 include/net/route.h |2 +
 net/ipv4/route.c|  113 +++
 3 files changed, 115 insertions(+), 1 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index e12a8ce..4cd0745 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -48,6 +48,7 @@ struct dst_entry
 #define DST_NOPOLICY   4
 #define DST_NOHASH 8
 #define DST_BALANCED0x10
+#define DST_DIVERTED   0x20
unsigned long   expires;
 
unsigned short  header_len; /* more space at head required 
*/
diff --git a/include/net/route.h b/include/net/route.h
index 749e4df..efaa6b2 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -125,6 +125,8 @@ extern int  ip_rt_ioctl(unsigned int cmd, void 
__user *arg);
 extern voidip_rt_get_source(u8 *src, struct rtable *rt);
 extern int ip_rt_dump(struct sk_buff *skb,  struct 
netlink_callback *cb);
 
+extern int ip_divert_local(struct sk_buff *skb, const struct 
in_device *in, struct sock *sk);
+
 struct in_ifaddr;
 extern void fib_add_ifaddr(struct in_ifaddr *);
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 37e0d4d..c526fb2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -100,6 +100,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -941,9 +942,11 @@ restart:
while ((rth = *rthp) != NULL) {
 #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED
if (!(rth->u.dst.flags & DST_BALANCED) &&
+   !((rt->u.dst.flags ^ rth->u.dst.flags) & DST_DIVERTED) &&
compare_keys(&rth->fl, &rt->fl)) {
 #else
-   if (compare_keys(&rth->fl, &rt->fl)) {
+   if (!((rt->u.dst.flags ^ rth->u.dst.flags) & DST_DIVERTED) &&
+   compare_keys(&rth->fl, &rt->fl)) {
 #endif
/* Put it first */
*rthp = rth->u.dst.rt_next;
@@ -1165,6 +1168,7 @@ void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 
new_gw,
if (rth->fl.fl4_dst != daddr ||
rth->fl.fl4_src != skeys[i] ||
rth->fl.oif != ikeys[k] ||
+   (rth->u.dst.flags & DST_DIVERTED) ||
rth->fl.iif != 0) {
rthp = &rth->u.dst.rt_next;
continue;
@@ -1525,6 +1529,111 @@ static int ip_rt_bug(struct sk_buff *skb)
return 0;
 }
 
+static void ip_divert_free_sock(struct sk_buff *skb)
+{
+   struct sock *sk = skb->sk;
+
+   skb->sk = NULL;
+   skb->destructor = NULL;
+
+   if (sk) {
+   /* TIME_WAIT inet sockets have to be handled differently */
+   if (((sk->sk_protocol == IPPROTO_TCP) && (sk->sk_state == 
TCP_TIME_WAIT)) ||
+   ((sk->sk_protocol == IPPROTO_DCCP) && (sk->sk_state == 
DCCP_TIME_WAIT)))
+   inet_twsk_put(inet_twsk(sk));
+   else
+   sock_put(sk);
+   }
+}
+
+int ip_divert_local(struct sk_buff *skb, const struct in_device *in, struct 
sock *sk)
+{
+   struct iphdr *iph = skb->nh.iph;
+   struct rtable *rth, *rtres;
+   unsigned hash;
+   const int iif = in->dev->ifindex;
+   u_int8_t tos;
+   int err;
+
+   /* look up hash first */
+   tos = iph->tos & IPTOS_RT_MASK;
+   hash = rt_hash_code(iph->daddr, iph->saddr ^ (iif << 5));
+
+   rcu_read_lock();
+   for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
+rth = rcu_dereference(rth->u.dst.rt_next)) {
+   if (rth->fl.fl4_dst == iph->daddr &&
+   rth->fl.fl4_src == iph->saddr &&
+   rth->fl.iif == iif &&
+   rth->fl.oif == 0 &&
+   (rth->u.dst.flags & DST_DIVERTED)) {
+   rth->u.dst.lastuse = jiffies;
+

[PATCH/RFC 00/13] Transparent proxying patches, take two

2007-03-05 Thread KOVACS Krisztian
  Hi,

These patches are my second try at providing Linux 2.2-like transparent
proxying support for Linux 2.6.

Major changes since the first version:

- iptable_tproxy now does IPv4 fragment reassembly (necessary for
  processing TCP/UDP header)

- The removal of the source address check in ip_route_output() was
  incorrect.  Instead, I've implemented a separate setsockopt-settable
  per-socket flag (setting it requires CAP_NET_ADMIN) to selectively
  loosen that check in ip_route_output().

Besides these, I've tried to fix all the problems raised on netdev@ in
January.

Unfortunately, the newly introduced IP_TRANSPARENT socket option leads to
a quite intrusive set of patches touching core IPv4 routing and TCP
code. However, this was necessary as DaveM rejected our idea of using
IP_FREEBIND instead (and he's right, of course, as it would have caused
ABI breakage). The current approach works by adding a new bit to the
flag field in "struct flowi".

Furthermore, I haven't removed the IPv4 routing local diversion code
(caching socket lookups in the skb) yet. Patrick recommended throwing it
out altogether and using mark-based policy routing instead, but I still
think that would harm usability, as the user would need to
harmonize the configuration in order to have two completely independent
subsystems interoperate.

-- 
 Regards,
  Krisztian Kovacs


RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala

> Also, [Joy cc'd] deletions here needn't be audited?

OK, I see the next patch addressed this :)


RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala
> @@ -2552,7 +2550,7 @@ static int pfkey_spdget(struct sock 
> *sk, struct sk_buff *skb, struct sadb_msg *h
>   return -EINVAL;
>  
>   xp = xfrm_policy_byid(XFRM_POLICY_TYPE_MAIN, dir, 
> pol->sadb_x_policy_id,
> -   hdr->sadb_msg_type == SADB_X_SPDDELETE2);
> +   hdr->sadb_msg_type == 
> SADB_X_SPDDELETE2, &err);
>   if (xp == NULL)
>   return -ENOENT;
I guess you meant to do this here?
else if (err)
return err;

Also, [Joy cc'd] deletions here needn't be audited?


Re: [PATCH RFC 17/31] net: Factor out __dev_alloc_name from dev_alloc_name

2007-03-05 Thread Benjamin Thery

Hello Eric,

See comments about __dev_alloc_name() below.

Regards,
Benjamin

Eric W. Biederman wrote:

From: Eric W. Biederman <[EMAIL PROTECTED]> - unquoted

When forcibly changing the network namespace of a device
I need something that can generate a name for the device
in the new namespace without overwriting the old name.

__dev_alloc_name provides me that functionality.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 net/core/dev.c |   44 +---
 1 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 32fe905..fc0d2af 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -655,9 +655,10 @@ int dev_valid_name(const char *name)
 }
 
 /**

- * dev_alloc_name - allocate a name for a device
- * @dev: device
+ * __dev_alloc_name - allocate a name for a device
+ * @net: network namespace to allocate the device name in
  * @name: name format string
+ * @buf:  scratch buffer and result name string
  *
  * Passed a format string - eg "lt%d" it will try and find a suitable
  * id. It scans list of devices to build up a free map, then chooses
@@ -668,18 +669,13 @@ int dev_valid_name(const char *name)
  * Returns the number of the unit assigned or a negative errno code.
  */
 
-int dev_alloc_name(struct net_device *dev, const char *name)

+static int __dev_alloc_name(net_t net, const char *name, char buf[IFNAMSIZ])


IMHO the third parameter should be: char *buf
Using "char buf[IFNAMSIZ]" is misleading because later in the
routine sizeof(buf) is used (with an expected result of IFNAMSIZ).
Unfortunately this is no longer the case: sizeof(buf) is now only 4
(buf is a pointer parameter).
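
The effect is easy to reproduce in user space. In the sketch below (function names are hypothetical, IFNAMSIZ is set to 16 as in the kernel headers), the array-typed parameter is adjusted to a pointer, so sizeof reports the pointer size: 4 on a 32-bit build, 8 on a 64-bit one.

```c
#include <stddef.h>

#define IFNAMSIZ 16

/* GCC and Clang warn about this construct by default; the warning is
 * silenced here only so that the sketch builds cleanly. */
#pragma GCC diagnostic ignored "-Wsizeof-array-argument"

/* Despite the array syntax, buf is a plain 'char *' in here, so
 * sizeof(buf) yields the pointer size, not IFNAMSIZ. */
size_t size_seen_by_callee(char buf[IFNAMSIZ])
{
	return sizeof(buf);
}

/* In the function that owns the array, sizeof gives the full size. */
size_t size_seen_by_owner(void)
{
	char buf[IFNAMSIZ];

	return sizeof(buf);
}
```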


This corrupts the registration of network devices (now I understand 
why only one of my e1000 showed up after each reboot :).


Also sizeof(buf) should be replaced by IFNAMSIZ in this new routine.
(See below)


 {
int i = 0;
-   char buf[IFNAMSIZ];
const char *p;
const int max_netdevices = 8*PAGE_SIZE;
long *inuse;
struct net_device *d;
-   net_t net;
-
-   BUG_ON(null_net(dev->nd_net));
-   net = dev->nd_net;
 
 	p = strnchr(name, IFNAMSIZ-1, '%');

if (p) {
@@ -713,10 +709,8 @@ int dev_alloc_name(struct net_device *dev, const char 
*name)
}
 
 	snprintf(buf, sizeof(buf), name, i);


Replace with "snprintf(buf, IFNAMSIZ, name, i);" or i will never be
appended to name and all your ethernet devices will try to
register the name "eth".


There is another occurrence of "snprintf(buf, sizeof(buf), ...)" to
replace in the for loop above.
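
A small user-space sketch of the truncation (format_name() is a hypothetical wrapper, not kernel code): with a limit of 4, as sizeof(buf) gives on a 32-bit build, snprintf() has room for three characters plus the terminating NUL, so the unit number is dropped.

```c
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ 16

/*
 * Format a device name with an explicit size limit. Returns what
 * snprintf() returns: the length the full name would have had.
 */
int format_name(char *buf, size_t limit, const char *fmt, int unit)
{
	return snprintf(buf, limit, fmt, unit);
}
```

Calling format_name(buf, 4, "eth%d", 0) leaves "eth" in buf, while format_name(buf, IFNAMSIZ, "eth%d", 0) leaves "eth0". In both cases the return value is 4, so the return value alone only reveals the truncation if it is compared against the limit.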



-   if (!__dev_get_by_name(net, buf)) {
-   strlcpy(dev->name, buf, IFNAMSIZ);
+   if (!__dev_get_by_name(net, buf))
return i;
-   }
 
 	/* It is possible to run out of possible slots

 * when the name is long and there isn't enough space left
@@ -725,6 +719,34 @@ int dev_alloc_name(struct net_device *dev, const char 
*name)
return -ENFILE;
 }
 
+/**

+ * dev_alloc_name - allocate a name for a device
+ * @dev: device
+ * @name: name format string
+ *
+ * Passed a format string - eg "lt%d" it will try and find a suitable
+ * id. It scans list of devices to build up a free map, then chooses
+ * the first empty slot. The caller must hold the dev_base or rtnl lock
+ * while allocating the name and adding the device in order to avoid
+ * duplicates.
+ * Limited to bits_per_byte * page size devices (ie 32K on most platforms).
+ * Returns the number of the unit assigned or a negative errno code.
+ */
+
+int dev_alloc_name(struct net_device *dev, const char *name)
+{
+   char buf[IFNAMSIZ];
+   net_t net;
+   int ret;
+
+   BUG_ON(null_net(dev->nd_net));
+   net = dev->nd_net;
+   ret = __dev_alloc_name(net, name, buf);
+   if (ret >= 0)
+   strlcpy(dev->name, buf, IFNAMSIZ);
+   return ret;
+}
+
 
 /**

  * dev_change_name - change name of a device



--
B e n j a m i n   T h e r y  - BULL/DT/Open Software R&D

   http://www.bull.com


[PATCH] twcal_jiffie should be unsigned long, not int

2007-03-05 Thread Eric Dumazet
Hi David

While browsing include/net/inet_timewait_sock.h, I found this buggy definition 
of twcal_jiffie.

int twcal_jiffie;

I wonder how inet_twdr_twcal_tick() can really work on x86_64.

This seems to be quite an old bug; it was there before the introduction of
inet_timewait_death_row by Arnaldo Carvalho de Melo.
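
To illustrate the concern, here is a rough user-space model, not the kernel code: after64() mimics the wrap-safe comparison done by the kernel's time_after() family, and store_in_int() models keeping a 64-bit tick count in a 32-bit int field. Both names are hypothetical.

```c
#include <stdint.h>

/* Wrap-safe "a is later than b" on 64-bit tick counters. */
int after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Model of storing a 64-bit tick count in an int: only the low 32 bits
 * survive. (Results above INT32_MAX are implementation-defined in C;
 * the values used with this sketch stay below that.)
 */
int32_t store_in_int(uint64_t jiffies)
{
	return (int32_t)(uint32_t)(jiffies & 0xffffffffu);
}
```

Once the counter has passed 2^32, a deadline 10 ticks in the future that went through store_in_int() comes back as a small number, so after64(now, deadline) reports it as already expired. Kept at full width, the same deadline compares correctly.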

[PATCH] twcal_jiffie should be unsigned long, not int

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index f7be1ac..09a2532 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -66,7 +66,7 @@ #define INET_TWDR_TWKILL_QUOTA 100
 struct inet_timewait_death_row {
/* Short-time timewait calendar */
int twcal_hand;
-   int twcal_jiffie;
+   unsigned long   twcal_jiffie;
struct timer_list   twcal_timer;
struct hlist_head   twcal_row[INET_TWDR_RECYCLE_SLOTS];
 


Re: TCP 2MSL on loopback

2007-03-05 Thread Eric Dumazet
On Monday 05 March 2007 12:20, Howard Chu wrote:
> Why is the Maximum Segment Lifetime a global parameter? Surely the
> maximum possible lifetime of a particular TCP segment depends on the
> actual connection. At the very least, it would be useful to be able to
> set it on a per-interface basis. E.g., in the case of the loopback
> interface, it would be useful to be able to set it to a very small
> duration.

Hi Howard

I think you should address these questions on netdev instead of linux-kernel.

>
> As I note in this draft
> http://www.ietf.org/internet-drafts/draft-chu-ldap-ldapi-00.txt
> when doing a connection soak test of OpenLDAP using clients connected
> through localhost, the entire port range is exhausted in well under a
> second, at which point the test stalls until a port comes out of
> TIME_WAIT state so the next connection can be opened.
>
> These days it's not uncommon for an OpenLDAP slapd server to handle tens
> of thousands of connections per second in real use (e.g., at Google, or
> at various telcos). While the LDAP server is fast enough to saturate
> even 10gbit ethernet using contemporary CPUs, we have to resort to
> multiple virtual interfaces just to make sure we have enough port
> numbers available.
>

I don't understand... doesn't the slapd server listen for connections on a given
port, like http? Or is it making connections like an ftp server?

Of course, if you want to open more than 60,000 concurrent connections using the
127.0.0.1 address, you might have a problem...

> Ideally the 2MSL parameter would be dynamically adjusted based on the
> route to the destination and the weights associated with those routes.
> In the simplest case, connections between machines on the same subnet
> (i.e., no router hops involved) should have a much smaller default value
> than connections that traverse any routers. I'd settle for a two-level
> setting - with no router hops, use the small value; with any router hops
> use the large value.

Well, is it really an MSL problem?

I did a small test (linux-2.6.21-rc1) and was able to get 1,000,000
connections on localhost on my dual-processor machine in one minute, without an
error.



Re: [RFT] sky2 auto negotiation PHY errata

2007-03-05 Thread Rob Sims
On Tue, Feb 20, 2007 at 11:00:53AM -0800, Stephen Hemminger wrote:
> You need the flow control fix and the tx_timeout fix posted for 2.6.20 
> (stable)
> and current git tree. 

sky2 1.13 has been far better than 1.10; there have been no system hangs
or permanent sky2 failures.  However, the following two incidents were
in syslog:

Feb 27 07:08:21 btd kernel: Linux version 2.6.20.sky2.1.13-btd3 ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP PREEMPT Tue Feb 27 00:07:34 MST 2007
Feb 27 07:08:21 btd kernel: sky2 :04:00.0: v1.13 addr 0xfa9fc000 irq 17 Yukon-EC (0xb6) rev 2
Feb 27 07:08:21 btd kernel: sky2 eth0: addr 00:1a:92:23:52:4d
Feb 27 07:08:21 btd kernel: sky2 :03:00.0: v1.13 addr 0xfa8fc000 irq 16 Yukon-EC (0xb6) rev 2
Feb 27 07:08:21 btd kernel: sky2 eth1: addr 00:1a:92:23:4b:a6
Feb 27 07:08:21 btd kernel: sky2 eth0: enabling interface
Feb 27 07:08:21 btd kernel: sky2 eth0: ram buffer 48K
Feb 27 07:08:21 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
Feb 27 19:48:34 btd kernel: sky2 :04:00.0: v1.13 addr 0xfa9fc000 irq 17 Yukon-EC (0xb6) rev 2
Feb 27 19:48:34 btd kernel: sky2 eth0: addr 00:1a:92:23:52:4d
Feb 27 19:48:34 btd kernel: sky2 :03:00.0: v1.13 addr 0xfa8fc000 irq 16 Yukon-EC (0xb6) rev 2
Feb 27 19:48:34 btd kernel: sky2 eth1: addr 00:1a:92:23:4b:a6
Feb 27 19:48:34 btd kernel: sky2 eth0: enabling interface
Feb 27 19:48:34 btd kernel: sky2 eth0: ram buffer 48K
Feb 27 19:48:34 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both

Feb 28 19:06:57 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
Feb 28 19:06:57 btd kernel: sky2 eth0: tx timeout
Feb 28 19:06:57 btd kernel: sky2 eth0: transmit ring 133 .. 110 report=133 done=133
Feb 28 19:06:57 btd kernel: sky2 eth0: disabling interface
Feb 28 19:06:57 btd kernel: sky2 eth0: enabling interface
Feb 28 19:06:57 btd kernel: sky2 eth0: ram buffer 48K
Feb 28 19:07:00 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both

Mar  4 13:58:31 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar  4 13:58:31 btd kernel: sky2 eth0: tx timeout
Mar  4 13:58:31 btd kernel: sky2 eth0: transmit ring 353 .. 330 report=353 done=353
Mar  4 13:58:31 btd kernel: sky2 eth0: disabling interface
Mar  4 13:58:31 btd kernel: sky2 eth0: enabling interface
Mar  4 13:58:31 btd kernel: sky2 eth0: ram buffer 48K
Mar  4 13:58:34 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both

I only noticed the second of the two.
-- 
Rob




Re: [PATCH 3/3] NetXen: Make driver use multi PCI functions

2007-03-05 Thread Mithlesh Thukral
On Saturday 03 March 2007 06:35, Jeff Garzik wrote:
> Linsys Contractor Mithlesh Thukral wrote:
> > NetXen: Make driver use multi PCI functions.
> >
> > Signed-off by: Mithlesh Thukral <[EMAIL PROTECTED]>
> >
> > ---
> >
> >  netxen_nic.h  |  126 +---
> >  netxen_nic_ethtool.c  |   80 +++
> >  netxen_nic_hdr.h  |8
> >  netxen_nic_hw.c   |  213 +++-
> >  netxen_nic_hw.h   |   18 -
> >  netxen_nic_init.c |  115 +++---
> >  netxen_nic_isr.c  |   80 +++
> >  netxen_nic_main.c |  523 +-
> >  netxen_nic_niu.c  |   27 +-
> >  netxen_nic_phan_reg.h |  125 ---
> >  10 files changed, 631 insertions(+), 684 deletions(-)
>
> all three patches in this patchset contained nothing but one-line
> summaries of the changes included in them, and are overall very poorly
> and vaguely described.
>
> This patch is far too big, with far too little description and
> justification to go along with it.
>
> If you are not going to make the effort to write a paragraph or two
> describing such huge changes, then I'm not going to make the effort to
> review and apply it.  NAK.
My apologies for the insufficient explanation of the patch. I re-sent this
patch some time ago.

Regards,
Mithlesh Thukral
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Theodore Tso
On Sun, Mar 04, 2007 at 05:17:29PM -0800, Greg KH wrote:
> I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is
> enabled with that patch.  If that is enabled, and that patch still
> causes problems, please let me know.

But we still need to update the help text for CONFIG_SYSFS_DEPRECATED to
make it clear that its deprecation schedule still needs to be 2009 to
2011 (depending on whether we want to accommodate Debian's glacial
release schedule).  Certainly the 2006 date which is currently there
simply isn't accurate.

- Ted


[PATCH 3/3] NetXen: Fix ping failure of Jumbo frames on MEZ cards.

2007-03-05 Thread Linsys Contractor Mithlesh Thukral
NetXen: Fix ping failure of Jumbo frames on MEZ cards.

Signed-off-by: Mithlesh Thukral <[EMAIL PROTECTED]>

---

 drivers/net/netxen/netxen_nic_hw.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletion(-)
 
diff --git a/drivers/net/netxen/netxen_nic_hw.c b/drivers/net/netxen/netxen_nic_hw.c
index 693d01a..81ebc81 100644
--- a/drivers/net/netxen/netxen_nic_hw.c
+++ b/drivers/net/netxen/netxen_nic_hw.c
@@ -962,7 +962,12 @@ int netxen_nic_set_mtu_gb(struct netxen_
 int netxen_nic_set_mtu_xgb(struct netxen_adapter *adapter, int new_mtu)
 {
 	new_mtu += NETXEN_NIU_HDRSIZE + NETXEN_NIU_TLRSIZE;
-	netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE, new_mtu);
+	if (adapter->portnum == 0)
+		netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE,
+				    new_mtu);
+	else if (adapter->portnum == 1)
+		netxen_nic_write_w0(adapter, NETXEN_NIU_XG1_MAX_FRAME_SIZE,
+				    new_mtu);
 	return 0;
 }
 
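For readers following the patch above, here is a minimal user-space sketch of the same idea: write the frame-size limit to the register that belongs to the port being configured, instead of always writing port 0's register. Everything prefixed demo_ or DEMO_ is hypothetical and exists only for this illustration (the sizes, the array standing in for hardware registers, and the write helper); none of it is taken from the driver. Unlike the patch, the sketch returns an error for an unknown port number rather than silently returning 0, which is a design choice of the sketch and not something the patch does.

```c
#include <stdint.h>

/* Hypothetical values for illustration only; the real header and
 * trailer sizes are defined in the driver headers. */
#define DEMO_HDRSIZE   16
#define DEMO_TLRSIZE    8
#define DEMO_MAX_PORTS  2

/* Stand-in for the hardware: one "max frame size" slot per port. */
static uint32_t demo_max_frame_reg[DEMO_MAX_PORTS];

/* Stand-in for a register write such as netxen_nic_write_w0(). */
static void demo_write_reg(int port, uint32_t value)
{
	demo_max_frame_reg[port] = value;
}

/* Program the frame-size limit for the given port only.
 * Returns 0 on success, -1 if the port number is out of range. */
static int demo_set_mtu(int portnum, int new_mtu)
{
	if (portnum < 0 || portnum >= DEMO_MAX_PORTS)
		return -1;

	/* The limit covers the payload plus header and trailer. */
	demo_write_reg(portnum,
		       (uint32_t)(new_mtu + DEMO_HDRSIZE + DEMO_TLRSIZE));
	return 0;
}
```

Calling demo_set_mtu(1, 9000) updates only the second port's slot and leaves the first port's slot untouched, which is the behaviour the patch is after for the second XG port.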

