Re: link error : 2.6.21-rc6-mm1 for s390

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 22:11:01 -0700 (PDT) David Miller <[EMAIL PROTECTED]> wrote:

> diff --git a/arch/s390/lib/Makefile b/arch/s390/lib/Makefile
> index 7a44fed..59aea65 100644
> --- a/arch/s390/lib/Makefile
> +++ b/arch/s390/lib/Makefile
> @@ -5,6 +5,6 @@
>  EXTRA_AFLAGS := -traditional
>  
>  lib-y += delay.o string.o uaccess_std.o uaccess_pt.o qrnnd.o
> -lib-$(CONFIG_32BIT) += div64.o
> +obj-$(CONFIG_32BIT) += div64.o
>  lib-$(CONFIG_64BIT) += uaccess_mvcos.o
>  lib-$(CONFIG_SMP) += spinlock.o
> diff --git a/arch/s390/lib/div64.c b/arch/s390/lib/div64.c
> index 0481f34..a5f8300 100644
> --- a/arch/s390/lib/div64.c
> +++ b/arch/s390/lib/div64.c
> @@ -147,5 +147,3 @@ uint32_t __div64_32(uint64_t *n, uint32_t base)
>  }
>  
>  #endif /* MARCH_G5 */
> -
> -EXPORT_SYMBOL(__div64_32);
> diff --git a/lib/div64.c b/lib/div64.c
> index 74f0c8c..b71cf93 100644
> --- a/lib/div64.c
> +++ b/lib/div64.c
> @@ -23,7 +23,7 @@
>  /* Not needed on 64bit architectures */
>  #if BITS_PER_LONG == 32
>  
> -uint32_t __div64_32(uint64_t *n, uint32_t base)
> +uint32_t __attribute__((weak)) __div64_32(uint64_t *n, uint32_t base)
>  {
>   uint64_t rem = *n;
>   uint64_t b = base;

I think this means that if CONFIG_32BIT=y, s390 networking gets the whizzy
assembly version and if CONFIG_32BIT=n, it gets to use the generic version.

Possibly the whizzy version could be used if CONFIG_32BIT=n, too.  But
I'd let the s390 people worry about that ;)
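
For readers unfamiliar with the mechanism, here is a minimal userspace
sketch of a weak default, assuming GCC or Clang on an ELF target.  The names
generic_div64_32() and div_in_place() are hypothetical stand-ins, and the
body uses plain C division rather than the kernel's own loop.

```c
#include <stdint.h>

/*
 * Weak default: the linker keeps this definition only if no other object
 * file supplies a strong (non-weak) symbol with the same name.
 * Hypothetical stand-in for the generic __div64_32().
 */
__attribute__((weak)) uint32_t generic_div64_32(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;	/* quotient is written back through the pointer */
	return rem;	/* remainder is the return value */
}

/* Caller in the style of do_div(): divides in place, returns the remainder. */
static uint32_t div_in_place(uint64_t *n, uint32_t base)
{
	return generic_div64_32(n, base);
}
```

A strong definition that sits in a static archive is only pulled into the
link when something still needs it, and a weak definition already satisfies
the reference.  That is presumably why the quoted patch also moves div64.o
from lib-y to obj-y.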
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

2007-04-10 Thread Patrick McHardy
Waskiewicz Jr, Peter P wrote:
>>This leaks the device. You treat every single-queue device as 
>>having a single subqueue. If it doesn't get too ugly it would 
>>be nice to avoid this and only allocate the subqueue states 
>>for real multiqueue devices.
> 
> 
> We went back and forth on this.  The reason we allocate a queue in every
> case, even on single-queue devices, was to make the stack not have
> branching for multiqueue and non-multiqueue devices.  If we don't have
> at least one queue on a device, then we can't have
> netif_subqueue_stopped() in the hotpath unless we check if a device is
> multiqueue before.  The original patches I released had this branching,
> and I was asked to not do that.


OK, thanks for the explanation.

>>>+skb->queue_mapping =
>>>+ q->prio2band[q->band2queue[band&TC_PRIO_MAX]];
>>
>>
>>Does this need to be cleared again at some point? TC actions 
>>might redirect or mirror packets to other (multiqueue) devices.
> 
> 
> If an skb is redirected to another device, the skb should be filtered
> through that device's qdisc, yes?


Yes, but the device might not have a queue or use something different
than prio, so the value would stay the same. I think you need to clear
it before enqueueing a packet or alternatively when redirecting in the
mirred action.
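
A rough sketch of the second option, resetting the value when a packet is
redirected.  The toy_* structures and toy_redirect() are hypothetical
stand-ins, not the real sk_buff or mirred code.

```c
#include <stdint.h>

/* Toy stand-ins for the kernel structures; only the fields used here. */
struct toy_dev {
	unsigned int num_queues;
};

struct toy_skb {
	struct toy_dev *dev;
	uint16_t queue_mapping;
};

/*
 * Redirect a packet to another device. The old queue_mapping was chosen by
 * the previous device's qdisc and means nothing on the new one, so reset it
 * and let the target's qdisc (if any) pick a fresh value.
 */
static void toy_redirect(struct toy_skb *skb, struct toy_dev *target)
{
	skb->dev = target;
	skb->queue_mapping = 0;
}
```

With the reset in place, a target device that has no qdisc or a non-prio
qdisc sees queue 0 instead of a stale index.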



Re: [GIT PULL] bridge update for 2.6.22

2007-04-10 Thread Patrick McHardy
Stephen Hemminger wrote:
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index a260679..8a55276 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
>   if (unlikely(is_link_local(dest))) {
>   skb->pkt_type = PACKET_HOST;
> - return NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
> -NULL, br_handle_local_finish) != 0;
> +
> + return (NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
> + NULL, br_handle_local_finish) == 0) ? skb : NULL;
>   }


I just want to note that this is broken in multiple ways (not by this
patch, it was already broken before). When a packet is stolen or
queued, NF_HOOK will return 0, but the packet is not owned by the
caller anymore, so we have a potential use-after-free. Additionally
the okfn owns the skb and needs to make sure it continues its path,
which br_handle_local_finish doesn't do, resulting in leaks and
broken queueing. The fix looks quite ugly: br_handle_local_finish
would need to pass the skb back to netif_receive_skb just after the
handle_bridge call.

All this is not a problem for mainline currently since ebtables
doesn't support QUEUE yet, but since it's mentioned on the TODO
list and is in any case an incorrect use of NF_HOOK, it feels worth
mentioning.
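
To make the ownership point concrete, here is a toy model of the return
convention.  The toy_* names and the -1 error value are hypothetical; this
is not the real netfilter code.

```c
#include <stdbool.h>

enum toy_verdict { TOY_ACCEPT, TOY_DROP, TOY_STOLEN, TOY_QUEUE };

struct toy_result {
	int ret;		/* what the caller sees */
	bool caller_owns;	/* may the caller still touch the packet? */
};

/*
 * Toy model of the convention described above: a stolen or queued packet
 * yields 0, exactly like a packet whose okfn ran and returned 0, yet in
 * none of these cases does the caller own the packet afterwards.
 */
static struct toy_result toy_hook(enum toy_verdict v)
{
	struct toy_result r = { 0, false };

	switch (v) {
	case TOY_ACCEPT:	/* okfn runs and takes over the packet */
	case TOY_STOLEN:	/* the hook kept the packet */
	case TOY_QUEUE:		/* packet handed to a queue handler */
		break;
	case TOY_DROP:		/* packet freed, error reported */
		r.ret = -1;
		break;
	}
	return r;
}
```

Since a return value of 0 never implies that the caller owns the packet,
an expression of the form "ret == 0 ? skb : NULL" hands back a pointer the
caller may no longer use.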



Re: link error : 2.6.21-rc6-mm1 for s390

2007-04-10 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Tue, 10 Apr 2007 18:47:38 -0700

> attribute(weak) would give a nicer result?
> 
> We'd also need to remove s390's EXPORT_SYMBOL(__div64_32), so s390 ends up
> using lib/div64.c's EXPORT_SYMBOL().

Ok, here is the version of the fix I'll use for now:

commit c3abb3b8d41814ce4691cc4cc3998b0f5242c8d0
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Tue Apr 10 22:10:39 2007 -0700

[S390]: Fix build on 31-bit.

Allow s390 to properly override the generic
__div64_32() implementation by:

1) Using obj-y for div64.o in s390's makefile instead
   of lib-y

2) Adding the weak attribute to the generic implementation.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/arch/s390/lib/Makefile b/arch/s390/lib/Makefile
index 7a44fed..59aea65 100644
--- a/arch/s390/lib/Makefile
+++ b/arch/s390/lib/Makefile
@@ -5,6 +5,6 @@
 EXTRA_AFLAGS := -traditional
 
 lib-y += delay.o string.o uaccess_std.o uaccess_pt.o qrnnd.o
-lib-$(CONFIG_32BIT) += div64.o
+obj-$(CONFIG_32BIT) += div64.o
 lib-$(CONFIG_64BIT) += uaccess_mvcos.o
 lib-$(CONFIG_SMP) += spinlock.o
diff --git a/arch/s390/lib/div64.c b/arch/s390/lib/div64.c
index 0481f34..a5f8300 100644
--- a/arch/s390/lib/div64.c
+++ b/arch/s390/lib/div64.c
@@ -147,5 +147,3 @@ uint32_t __div64_32(uint64_t *n, uint32_t base)
 }
 
 #endif /* MARCH_G5 */
-
-EXPORT_SYMBOL(__div64_32);
diff --git a/lib/div64.c b/lib/div64.c
index 74f0c8c..b71cf93 100644
--- a/lib/div64.c
+++ b/lib/div64.c
@@ -23,7 +23,7 @@
 /* Not needed on 64bit architectures */
 #if BITS_PER_LONG == 32
 
-uint32_t __div64_32(uint64_t *n, uint32_t base)
+uint32_t __attribute__((weak)) __div64_32(uint64_t *n, uint32_t base)
 {
uint64_t rem = *n;
uint64_t b = base;


Re: [PATCH 17/30] Use menuconfig objects - IPVS

2007-04-10 Thread Simon Horman
On Tue, Apr 10, 2007 at 11:25:59PM +0200, Jan Engelhardt wrote:
> 
> Use menuconfigs instead of menus, so the whole menu can be disabled at
> once instead of going through all options.
> 
> Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

This seems to work fine to me.

Signed-off-by: Simon Horman <[EMAIL PROTECTED]>

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



accessing TCP port numbers of an skb in a qdisc

2007-04-10 Thread Ritesh Kumar

Hi,
   For some per-flow queue management work I need to access TCP port
numbers of an skb inside a qdisc (i.e. in qdisc enqueue and dequeue
functions). Can I assume that skb->data always points to the head of
the IP header of the packet? If that is the case will the following
statements do the trick?

if (skb->nh.iph->protocol == IPPROTO_TCP) {
   skb->h.raw = skb->data + (skb->nh.iph->ihl * 4);
   /* read the tcp port numbers in
    * skb->h.th->source and skb->h.th->dest
    */
}
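
The offset arithmetic can be checked in userspace with a sketch like the
following.  It assumes the buffer starts at the IPv4 header, which is the
very point in question for skb->data, and toy_tcp_ports() is a hypothetical
helper rather than a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_IPPROTO_TCP 6	/* IANA protocol number for TCP */

/*
 * Extract TCP source/destination ports from a buffer that starts at the
 * IPv4 header. Returns 0 on success, -1 if the packet is not IPv4/TCP or
 * is too short. Ports are returned in host byte order.
 */
static int toy_tcp_ports(const uint8_t *pkt, size_t len,
			 uint16_t *sport, uint16_t *dport)
{
	size_t ihl;

	if (len < 20 || (pkt[0] >> 4) != 4)
		return -1;
	ihl = (size_t)(pkt[0] & 0x0f) * 4;	/* header length in bytes */
	if (ihl < 20 || len < ihl + 4)
		return -1;
	if (pkt[9] != TOY_IPPROTO_TCP)		/* protocol field */
		return -1;

	/* Ports are the first two 16-bit big-endian fields of the TCP header. */
	*sport = (uint16_t)((pkt[ihl] << 8) | pkt[ihl + 1]);
	*dport = (uint16_t)((pkt[ihl + 2] << 8) | pkt[ihl + 3]);
	return 0;
}
```

A real classifier would also have to skip non-first IP fragments, which
carry no TCP header.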

Thanks a lot for your help!

Ritesh


Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Rusty Russell wrote:
> On Mon, 2007-04-09 at 16:38 +0300, Avi Kivity wrote:
>   
>> Moreover, some things just don't lend themselves to a userspace 
>> abstraction.  If we want to expose tso (tcp segmentation offload), we 
>> can easily do so with a kernel driver since the kernel interfaces are 
>> all tso aware.  Tacking on tso awareness to tun/tap is doable, but at 
>> the very least weird.
>> 
>
> It is kinda weird, yes, but it certainly makes sense.  All the arguments
> for tso apply in triplicate to userspace packet sends...
>
>   

Well, write() with a large buffer is a sort of tso device.  The problem
is tso breaks through several layers (like I'm advocating in the other
thread :), pushing tcp functionality into ethernet.  Well, we've seen worse.


>>> We're dealing with the tun/tap device here, not a socket.
>>>   
>> Hmm.  tun actually has aio_write implemented, but it seems synchronous.  
>> So does the read path.
>>
>> If these are made truly asynchronous, and the write path is made in 
>> addition copyless, then we might have something workable.  I still 
>> cringe at having a pagetable walk in order to deliver a 1500-byte packet.
>> 
>
> Right, now we're talking!
>
> However, it's not clear to me why creating an skb which references a kvm
> guest's memory doesn't need a pagetable walk, but a packet in (other)
> userspace memory does?
>   

Currently guest pages are stashed in a kernel array, as well as being
mmap()ed into user space.

That's not a very strong argument though, as I'd like to be able to map
userspace memory into the guest, or map address_spaces to the guest, or
something, so accessing guest physical memory will become more expensive
in time.

> My conviction which started this discussion is that if we can offer an
> efficient interface for kvm, we should be able to offer an efficient
> interface for any (other) userspace.
>   

Fully agreed.  It's mostly a question of who and when.  Designing and
implementing this interface is going to be difficult, require deep
knowledge of Linux networking, and consume a lot of time.

> As to async, I'm not *so* worried about that for the moment, although it
> would probably be nicer to fail than to block.  Otherwise we could
> simply set an skb destructor to wake us up.
>   

Nope.  Being async is critical for copyless networking:

- in the transmit path, you need to stop the sender (guest) from touching
the memory until it's on the wire.  This means 100% of packets sent will
be blocked.
- in the receive path, you could separate receive notification from the
single copy that must be done (like poll() + read()), but to make use of
dma engines you need to provide the end address beforehand.

> I think the first step is to see how much worse a decent userspace net
> driver is compared with the current in-kernel one.
>   

A userspace net interface needs to provide the following:

- true async operations
- multiple packets per operation (for interrupt mitigation) (like
lio_listio)
- scatter/gather packets (iovecs)
- configurable wakeup (by packet count/timeout) for queue management
- hacks (tso)

Most of these can be provided by a combination of the pending aio work,
the pending aio/fd integration, and the not-so-pending tap aio work.  As
the first two are available as patches and the third is limited to the
tap device, it is not unreasonable to try it out.  Maybe it will turn
out not to be as difficult as I predicted just a few lines above.
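
Of the items in the list above, the scatter/gather one already has a
userspace shape in writev().  A minimal sketch, assuming a POSIX system;
toy_send_gathered() is a hypothetical helper and the call is still fully
synchronous, so it says nothing about the async requirements.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send a header and a payload that live in separate buffers with a single
 * writev() call, so no staging copy into one contiguous buffer is needed.
 * Returns the number of bytes written, or -1 on error (see errno).
 */
static ssize_t toy_send_gathered(int fd, const void *hdr, size_t hlen,
				 const void *payload, size_t plen)
{
	struct iovec iov[2];

	iov[0].iov_base = (void *)hdr;		/* iovec is not const-aware */
	iov[0].iov_len = hlen;
	iov[1].iov_base = (void *)payload;
	iov[1].iov_len = plen;

	return writev(fd, iov, 2);
}
```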

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Rusty Russell
On Mon, 2007-04-09 at 16:38 +0300, Avi Kivity wrote:
> Moreover, some things just don't lend themselves to a userspace 
> abstraction.  If we want to expose tso (tcp segmentation offload), we 
> can easily do so with a kernel driver since the kernel interfaces are 
> all tso aware.  Tacking on tso awareness to tun/tap is doable, but at 
> the very least weird.

It is kinda weird, yes, but it certainly makes sense.  All the arguments
for tso apply in triplicate to userspace packet sends...

> > We're dealing with the tun/tap device here, not a socket.
> 
> Hmm.  tun actually has aio_write implemented, but it seems synchronous.  
> So does the read path.
> 
> If these are made truly asynchronous, and the write path is made in 
> addition copyless, then we might have something workable.  I still 
> cringe at having a pagetable walk in order to deliver a 1500-byte packet.

Right, now we're talking!

However, it's not clear to me why creating an skb which references a kvm
guest's memory doesn't need a pagetable walk, but a packet in (other)
userspace memory does?

My conviction which started this discussion is that if we can offer an
efficient interface for kvm, we should be able to offer an efficient
interface for any (other) userspace.

As to async, I'm not *so* worried about that for the moment, although it
would probably be nicer to fail than to block.  Otherwise we could
simply set an skb destructor to wake us up.

> > Again, sendfile is a *much* harder problem than sending a single packet
> > once, which is the question here.
> 
> sendfile() is a *different* problem.  It doesn't need completion because 
> the data is assumed not to change under it.

Well, let's not argue over that, it's irrelevant.  Hopefully we can do
that over a beer or equivalent sometime.

I think the first step is to see how much worse a decent userspace net
driver is compared with the current in-kernel one.

Rusty.



Re: link error : 2.6.21-rc6-mm1 for s390

2007-04-10 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Tue, 10 Apr 2007 18:47:38 -0700

> On Tue, 10 Apr 2007 18:36:29 -0700 (PDT)
> David Miller <[EMAIL PROTECTED]> wrote:
> 
> > From: Andrew Morton <[EMAIL PROTECTED]>
> > Date: Tue, 10 Apr 2007 18:29:37 -0700
> > 
> > > git-net.patch implements generic lib/div64.c, but s390 also has a
> > > private one.  Presumably the appropriate fix is to remove s390's
> > > private implementation within davem's tree.
> > 
> > The s390 version seems to be optimized in assembler for that
> > processor, therefore we should probably instead elide the
> > generic version on s390.
> 
> We're sure that it has the same API?

Yes, I read over it, I'm pretty sure it does.

> attribute(weak) would give a nicer result?

I'm not so sure.

> We'd also need to remove s390's EXPORT_SYMBOL(__div64_32), so s390 ends up
> using lib/div64.c's EXPORT_SYMBOL().

It shouldn't matter if we use s390's or the generic version's

Oh, I see, s390 uses lib-y for its div64.o object, that's a bug.
I'll fix that up, thanks Andrew.


Re: link error : 2.6.21-rc6-mm1 for s390

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 18:36:29 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Tue, 10 Apr 2007 18:29:37 -0700
> 
> > git-net.patch implements generic lib/div64.c, but s390 also has a
> > private one.  Presumably the appropriate fix is to remove s390's
> > private implementation within davem's tree.
> 
> The s390 version seems to be optimized in assembler for that
> processor, therefore we should probably instead elide the
> generic version on s390.

We're sure that it has the same API?

> How about something like this?
> 
> diff --git a/include/asm-s390/div64.h b/include/asm-s390/div64.h
> index 6cd978c..21aea15 100644
> --- a/include/asm-s390/div64.h
> +++ b/include/asm-s390/div64.h
> @@ -1 +1,2 @@
>  #include 
> +#define HAVE_ARCH_DIV64_32
> diff --git a/lib/div64.c b/lib/div64.c
> index 74f0c8c..5b480fa 100644
> --- a/lib/div64.c
> +++ b/lib/div64.c
> @@ -23,6 +23,8 @@
>  /* Not needed on 64bit architectures */
>  #if BITS_PER_LONG == 32
>  
> +#ifndef HAVE_ARCH_DIV64_32
> +
>  uint32_t __div64_32(uint64_t *n, uint32_t base)
>  {
>   uint64_t rem = *n;
> @@ -58,6 +60,8 @@ uint32_t __div64_32(uint64_t *n, uint32_t base)
>  
>  EXPORT_SYMBOL(__div64_32);
>  
> +#endif /* !(HAVE_ARCH_DIV64_32) */
> +
>  /* 64bit divisor, dividend and result. dynamic precision */
>  uint64_t div64_64(uint64_t dividend, uint64_t divisor)
>  {

attribute(weak) would give a nicer result?

We'd also need to remove s390's EXPORT_SYMBOL(__div64_32), so s390 ends up
using lib/div64.c's EXPORT_SYMBOL().



Re: link error : 2.6.21-rc6-mm1 for s390

2007-04-10 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Tue, 10 Apr 2007 18:29:37 -0700

> git-net.patch implements generic lib/div64.c, but s390 also has a
> private one.  Presumably the appropriate fix is to remove s390's
> private implementation within davem's tree.

The s390 version seems to be optimized in assembler for that
processor, therefore we should probably instead elide the
generic version on s390.

How about something like this?

diff --git a/include/asm-s390/div64.h b/include/asm-s390/div64.h
index 6cd978c..21aea15 100644
--- a/include/asm-s390/div64.h
+++ b/include/asm-s390/div64.h
@@ -1 +1,2 @@
 #include 
+#define HAVE_ARCH_DIV64_32
diff --git a/lib/div64.c b/lib/div64.c
index 74f0c8c..5b480fa 100644
--- a/lib/div64.c
+++ b/lib/div64.c
@@ -23,6 +23,8 @@
 /* Not needed on 64bit architectures */
 #if BITS_PER_LONG == 32
 
+#ifndef HAVE_ARCH_DIV64_32
+
 uint32_t __div64_32(uint64_t *n, uint32_t base)
 {
uint64_t rem = *n;
@@ -58,6 +60,8 @@ uint32_t __div64_32(uint64_t *n, uint32_t base)
 
 EXPORT_SYMBOL(__div64_32);
 
+#endif /* !(HAVE_ARCH_DIV64_32) */
+
 /* 64bit divisor, dividend and result. dynamic precision */
 uint64_t div64_64(uint64_t dividend, uint64_t divisor)
 {
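
Stripped of kernel context, the guard pattern in this proposal looks
roughly like the sketch below.  TOY_HAVE_ARCH_DIV64_32 and toy_div64_32()
are hypothetical names, and the body uses plain C division instead of the
real lib/div64.c loop.

```c
#include <stdint.h>

/*
 * An architecture header would define TOY_HAVE_ARCH_DIV64_32 and supply its
 * own toy_div64_32(); when the macro is absent, the generic version below
 * is compiled instead. Both names are hypothetical.
 */
#ifndef TOY_HAVE_ARCH_DIV64_32
static uint32_t toy_div64_32(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}
#endif /* !TOY_HAVE_ARCH_DIV64_32 */
```

The choice is made at compile time, so the generic body is not even built
on an architecture that opts out.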


Re: [SK_BUFF]: Fix missing offset adjustment in skb_copy_expand

2007-04-10 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Wed, 11 Apr 2007 02:42:59 +0200

> Hi Dave,
> 
> Patrick McHardy wrote:
> > [SK_BUFF]: Fix missing offset adjustment in skb_copy_expand
> > 
> > skb_copy_expand changes the headroom, so it needs to adjust the header
> > offsets by the difference between the old and the new value.
> 
> 
> it seems like you missed this one. Attached again for your convenience.

Thanks a lot Patrick, applied.

I know what happened, I applied the pskb_copy_expand() patch and then
deleted this one by accident since the subject lines looked so
similar and I figured I was just getting a second copy of the same
patch :-)

Thanks again!


Re: link error : 2.6.21-rc6-mm1 for s390

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 20:56:16 -0400
Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

> The last for today : link error of 2.6.21-rc6-mm1 for s390 :
> 
> 
>   
> /opt/crosstool/gcc-4.1.1-glibc-2.3.6/s390-unknown-linux-gnu/bin/s390-unknown-linux-gnu-ld
>  -m elf_s390 -e start -o .tmp_vmlinux1 -T arch/s390/kernel/vmlinux.lds 
> arch/s390/kernel/head.o arch/s390/kernel/init_task.o  init/built-in.o 
> --start-group  usr/built-in.o  arch/s390/mm/built-in.o  
> arch/s390/kernel/built-in.o  arch/s390/crypto/built-in.o  
> arch/s390/appldata/built-in.o  arch/s390/hypfs/built-in.o  kernel/built-in.o  
> mm/built-in.o  fs/built-in.o  ipc/built-in.o  security/built-in.o  
> crypto/built-in.o  block/built-in.o  ltt/built-in.o  lib/lib.a  
> arch/s390/lib/lib.a  lib/built-in.o  arch/s390/lib/built-in.o  
> drivers/built-in.o  sound/built-in.o  drivers/s390/built-in.o  
> arch/s390/math-emu/built-in.o  net/built-in.o --end-group
> lib/built-in.o: In function `__div64_32':
> : multiple definition of `__div64_32'
> arch/s390/lib/lib.a(div64.o):div64.c:(.text+0x0): first defined here
> /opt/crosstool/gcc-4.1.1-glibc-2.3.6/s390-unknown-linux-gnu/bin/s390-unknown-linux-gnu-ld:
>  Warning: size of symbol `__div64_32' changed from 218 in 
> arch/s390/lib/lib.a(div64.o) to 260 in lib/built-in.o

git-net.patch implements generic lib/div64.c, but s390 also has a private one.

Presumably the appropriate fix is to remove s390's private implementation within
davem's tree.


Re: phylib usage

2007-04-10 Thread Lennert Buytenhek
On Tue, Apr 10, 2007 at 05:20:52PM -0500, Kim Phillips wrote:

> (note I'm coming from an embedded world here.)

Please read this:

http://marc.info/?l=linux-netdev&m=116527863300952&w=2


Re: [PATCH] Add support for running the Marvell m88e1111 PHY in RGMII mode

2007-04-10 Thread Lennert Buytenhek
On Tue, Apr 10, 2007 at 04:57:23PM -0500, Kim Phillips wrote:

> also adds RX & TX delay bits to help boards with clock skew problems.

This doesn't make sense at all.

RGMII specifies that clock and data are generated simultaneously.  The
necessary 1.5-2ns of clock delay is either achieved by routing the RGMII
clock net to have that amount of skew, or by programming the RGMII
receiver to add that amount of clock delay internally.

So, boards that have 'clock skew problems' should have RX & TX delay
turned OFF.  Boards that route the clock nets without any skew should
have RX & TX delay turned on.

Your patch unconditionally enables clock delay in the 88e1111, which
only makes sense if the RGMII data and clock nets are routed to have
the approximate same amount of delay.  Which means that it is entirely
board-specific whether this should be enabled or not.  If you enable
the 88e1111 internal RX/TX clock delay feature on a board that also
routes the RGMII clock nets to have the appropriate amount of skew,
things are likely to break entirely.
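
A board-driven version could look roughly like this sketch.  The bit values
and the toy_rgmii_delay() helper are placeholders for illustration, not
taken from the data sheet or from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions; the real ones come from the PHY data sheet. */
#define TOY_RX_DELAY 0x0080
#define TOY_TX_DELAY 0x0002

/*
 * Compute the new value of the PHY's extended control register. The board
 * code says whether the PHY must add the clock delay internally (traces
 * routed without skew) or must not (skew already built into the traces).
 */
static uint16_t toy_rgmii_delay(uint16_t reg, bool phy_adds_rx_delay,
				bool phy_adds_tx_delay)
{
	reg &= (uint16_t)~(TOY_RX_DELAY | TOY_TX_DELAY);
	if (phy_adds_rx_delay)
		reg |= TOY_RX_DELAY;
	if (phy_adds_tx_delay)
		reg |= TOY_TX_DELAY;
	return reg;
}
```

The two booleans would come from platform data, so a board with skewed
clock traces passes false for both.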


> [...]
> +
> + temp |= (MII_M1111_RX_DELAY | MII_M1111_TX_DELAY);

Enabling this unconditionally is just wrong.

I.e. the patch is wrong, and the description is wrong too.


Re: [SK_BUFF]: Fix missing offset adjustment in skb_copy_expand

2007-04-10 Thread Patrick McHardy
Hi Dave,

Patrick McHardy wrote:
> [SK_BUFF]: Fix missing offset adjustment in skb_copy_expand
> 
> skb_copy_expand changes the headroom, so it needs to adjust the header
> offsets by the difference between the old and the new value.


it seems like you missed this one. Attached again for your convenience.


[SK_BUFF]: Fix missing offset adjustment in skb_copy_expand

skb_copy_expand changes the headroom, so it needs to adjust the header
offsets by the difference between the old and the new value.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 9cb76fae709a9303777286998baa457b0730a225
tree 2ea619c7daf9c5e6829dad6d502386eb9c922700
parent fb98b03719ad23840ca005edbba3c86ef1e3282c
author Patrick McHardy <[EMAIL PROTECTED]> Sun, 08 Apr 2007 03:36:49 +0200
committer Patrick McHardy <[EMAIL PROTECTED]> Sun, 08 Apr 2007 03:36:49 +0200

 net/core/skbuff.c |   11 ++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5c9ee94..f2cffd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -794,7 +794,9 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 	 */
 	struct sk_buff *n = alloc_skb(newheadroom + skb->len + newtailroom,
   gfp_mask);
+	int oldheadroom = skb_headroom(skb);
 	int head_copy_len, head_copy_off;
+	int off = 0;
 
 	if (!n)
 		return NULL;
@@ -804,7 +806,7 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 	/* Set the tail pointer and length */
 	skb_put(n, skb->len);
 
-	head_copy_len = skb_headroom(skb);
+	head_copy_len = oldheadroom;
 	head_copy_off = 0;
 	if (newheadroom <= head_copy_len)
 		head_copy_len = newheadroom;
@@ -818,6 +820,13 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 
 	copy_skb_header(n, skb);
 
+#ifdef NET_SKBUFF_DATA_USES_OFFSET
+	off  = newheadroom - oldheadroom;
+#endif
+	n->transport_header += off;
+	n->network_header   += off;
+	n->mac_header	+= off;
+
 	return n;
 }
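
The reason for the adjustment can be seen in a toy model where header
positions are stored as offsets from the start of the buffer.  struct
toy_buf and toy_copy_expand() are hypothetical and only mimic the
bookkeeping, not the data copy.

```c
#include <stddef.h>

/* Toy buffer: header positions are stored as offsets from 'head'. */
struct toy_buf {
	size_t headroom;		/* bytes between head and data */
	size_t transport_header;	/* offset from head */
	size_t network_header;		/* offset from head */
	size_t mac_header;		/* offset from head */
};

/*
 * Copy the header offsets into a buffer with a different headroom. The
 * packet bytes move by (newheadroom - oldheadroom) relative to head, so
 * every stored offset must move by the same amount.
 */
static struct toy_buf toy_copy_expand(const struct toy_buf *old,
				      size_t newheadroom)
{
	struct toy_buf n = *old;
	ptrdiff_t off = (ptrdiff_t)newheadroom - (ptrdiff_t)old->headroom;

	n.headroom = newheadroom;
	n.transport_header = (size_t)((ptrdiff_t)old->transport_header + off);
	n.network_header = (size_t)((ptrdiff_t)old->network_header + off);
	n.mac_header = (size_t)((ptrdiff_t)old->mac_header + off);
	return n;
}
```

Growing the headroom from 16 to 64 bytes, for example, moves every offset
up by 48.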
 



Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-10 Thread Robin Getz
On Tue 10 Apr 2007 08:55, David Howells pondered:
> Looking at alloc_pg_vec() in af_packet.c, I will place my bets on the
> latter case.  I don't know that this is a problem; it depends on how things
> work, and that I don't know offhand.  If someone can give me a simple test
> program, I would be able to evaluate it better.

Hmm - the only thing I have used in the past is tcpdump/libpcap from
http://public.lanl.gov/cpw/

Documentation/networking/packet_mmap.txt also seems to be a little dated, but 
does have some code snippets if you wanted to make something lightweight...

Does anyone else on netdev have a small test app?

-Robin


Re: phylib usage

2007-04-10 Thread Kim Phillips
On Tue, 10 Apr 2007 14:41:01 -0400
David Hollis <[EMAIL PROTECTED]> wrote:

> I've been keeping an eye on PHYLIB since its addition to the kernel
> some time ago and I'm a bit curious as to why it doesn't seem to have
> much up-take among other drivers.  A quick check of the kernel source
> shows only two users (AU1000 and gianfar) and both look to be embedded
> type devices.  Are there fundamental issues with PHYLIB that prevent it

I just posted patches for ucc_geth to use phylib.

> from being more widely used?  Is it an element of "the driver ain't
> broke, I'm not going to rework my PHY handling at this point"?  Is PHY

I think that could very well be it, at least until your nic gets a new phy
(like the ucc_geth just did :).

> handling something that is just too difficult to fully abstract with
> various specialty nics?
> 
> To semi-answer one of my own questions, I was hoping to use PHYLIB
> with asix.c (a USB Ethernet driver) since the newer devices now have
> external PHYs and it appears that there are quite a few different ones
> in use by various OEMs but I found that the locking in use wasn't
> amenable to USB operations, since USB calls can sleep.  This doesn't
> seem like an insurmountable problem however.
> 
> The main reason I even ask is that it seems rather odd to have so many
> drivers that seem to have the same PHYs and yet each driver has to
> deal with various errata and quirks of the PHY.  Having it all
> abstracted makes for one location to deal with things and everybody
> wins.

The main reason behind the phylib I thought was resource control
(no two nic drivers trying to access the same phy at the same time),
but yes, you simply cannot overstate the convenience it provides to the
PHY driver duplication/porting problem as a whole.

(note I'm coming from an embedded world here.)

Kim


Re: NET_POLL missing in the new ibm_emac driver?

2007-04-10 Thread Paul Smith
Hi all; any love here?  Maybe a pointer or two (e.g., what's the
equivalent, in the new emac driver, of these lines in the old driver:

emac_rxeob_dev((void *)ndev, 0);
emac_txeob_dev((void *)ndev, 0);

?)

---

I was having so many problems with the 2.6.14 (old) emac driver that I
backported the new one from 2.6.15.  It works great... until I tried to
use KGDBOE with it.  Then I got:

kgdboe: eth1 doesn't support polling, aborting.

Looking around I see that I need the driver to support the
poll_controller method in the net device, and the new driver doesn't
seem to do that.

I see the old implementation of this method, but of course it uses
interfaces that no longer exist so I can't use that directly.  It seems
like a small change but I'm not really familiar with the implementation
of the ibm_emac driver; does anyone have a patch for the new driver to
provide this feature?


Thanks!

-- 
-
 Paul D. Smith <[EMAIL PROTECTED]>   http://netezza.com
 "Please remain calm--I may be mad, but I am a professional."--Mad Scientist
-
  These are my opinions--Netezza takes no responsibility for them.


Re: Support for Asix AX88796

2007-04-10 Thread Francois Romieu
Thomas Langås <[EMAIL PROTECTED]> :
[...]
> I found a driver [1] that I managed to patch into the linux 2.6.9-kernel, 
> but upon booting the dreambox it doesn't detect the networkcard.  I suspect
> it's because of this (found inside the ne.c-file used by the dreambox):

From a (very) quick look at [1], it looks like a platform driver... without
the platform specific part (i.e. like drivers/net/gianfar.c if there was
no arch/ppc/syslib/mpc85xx_devices.c).

[...]
> CONFIG_DM56xx is enabled, so there seems to be a hardcoded base_addr
> used to probe for the networkchip (which is no surprise since this is for an
> embedded device).  But I'm a bit puzzled as to where I should put this in the
> driver below, because it didn't seem to start probing at all when I
> added printk's to try and debug.
> 
> Any pointers at all on where to start would be helpful. :-)

Create a platform_device named "ax88796" to describe the specifics of
the DM5xyz platform, pass it through platform_device_register and the
driver in [1] should notice that something is going on.
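
The name matching behind this advice can be modelled in userspace as
follows.  All toy_* names and the 0x1000 address used in testing are
hypothetical; the real registration goes through platform_device_register()
in board code.

```c
#include <stddef.h>
#include <string.h>

struct toy_platform_device {
	const char *name;
	unsigned long base_addr;	/* board-specific resource */
};

struct toy_platform_driver {
	const char *name;
	int (*probe)(const struct toy_platform_device *pdev);
	int probed;			/* number of successful probes */
};

/*
 * Toy version of bus matching: a driver's probe routine runs only for a
 * registered device whose name equals the driver's name. Without such a
 * device, probe never runs, which matches the symptom reported above.
 */
static int toy_register_device(struct toy_platform_driver *drv,
			       const struct toy_platform_device *pdev)
{
	if (strcmp(drv->name, pdev->name) != 0)
		return -1;		/* no match, probe is never called */
	if (drv->probe && drv->probe(pdev) != 0)
		return -1;
	drv->probed++;
	return 0;
}
```

In this model, printk's placed in the probe routine stay silent until a
device with the matching name shows up.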

> In advance, thanks!

Good luck.

-- 
Ueimor

Anybody got a battery for my Ultra 10 ?


[PATCH] add the ICPlus IP175C PHY driver

2007-04-10 Thread Kim Phillips
From: Michael Barkowski <[EMAIL PROTECTED]>

The ICPlus IP175C sports a 100Mbit/s 5-port switch in addition
to a dedicated 100Mbit/s WAN port.

Signed-off-by: Michael Barkowski <[EMAIL PROTECTED]>
Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
---
please consider for 2.6.22

 drivers/net/phy/Kconfig  |6 ++
 drivers/net/phy/Makefile |1 +
 drivers/net/phy/icplus.c |  121 ++
 3 files changed, 128 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/phy/icplus.c

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index f994f12..91cf33e 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -62,6 +62,12 @@ config BROADCOM_PHY
---help---
  Currently supports the BCM5411, BCM5421 and BCM5461 PHYs.
 
+config ICPLUS_PHY
+   tristate "Drivers for ICPlus PHYs"
+   depends on PHYLIB
+   ---help---
+ Currently supports the IP175C PHY.
+
 config FIXED_PHY
tristate "Drivers for PHY emulation on fixed speed/link"
depends on PHYLIB
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index bcd1efb..8885650 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -11,4 +11,5 @@ obj-$(CONFIG_QSEMI_PHY)   += qsemi.o
 obj-$(CONFIG_SMSC_PHY) += smsc.o
 obj-$(CONFIG_VITESSE_PHY)  += vitesse.o
 obj-$(CONFIG_BROADCOM_PHY) += broadcom.o
+obj-$(CONFIG_ICPLUS_PHY)   += icplus.o
 obj-$(CONFIG_FIXED_PHY)+= fixed.o
diff --git a/drivers/net/phy/icplus.c b/drivers/net/phy/icplus.c
new file mode 100644
index 0000000..4e2912c
--- /dev/null
+++ b/drivers/net/phy/icplus.c
@@ -0,0 +1,121 @@
+/*
+ * Driver for ICPlus PHYs
+ *
+ * Copyright (c) 2007 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+MODULE_DESCRIPTION("ICPlus IP175C PHY driver");
+MODULE_AUTHOR("Michael Barkowski");
+MODULE_LICENSE("GPL");
+
+static int ip175c_config_init(struct phy_device *phydev)
+{
+   int err, i;
+   static int full_reset_performed = 0;
+
+   if (full_reset_performed == 0) {
+
+   /* master reset */
+   err = phydev->bus->write(phydev->bus, 30, 0, 0x175c);
+   if (err < 0)
+   return err;
+
+   /* data sheet specifies reset period is 2 msec */
+   udelay(3000);
+
+   /* enable IP175C mode */
+   err = phydev->bus->write(phydev->bus, 29, 31, 0x175c);
+   if (err < 0)
+   return err;
+
+   for (i=0; i<5; i++) {
+   err = phydev->bus->write(phydev->bus, i, MII_BMCR, BMCR_RESET);
+   }
+   udelay(3000);
+
+   full_reset_performed = 1;
+   }
+
+   if (phydev->addr != 4) {
+   phydev->state = PHY_RUNNING;
+   phydev->speed = SPEED_100;
+   phydev->duplex = DUPLEX_FULL;
+   phydev->link = 1;
+   netif_carrier_on(phydev->attached_dev);
+   }
+
+   printk("PHY %d initialized\n",phydev->addr);
+   return 0;
+}
+
+static int ip175c_read_status(struct phy_device *phydev)
+{
+   if (phydev->addr == 4) { /* if WAN port */
+   genphy_read_status(phydev);
+   }
+
+   return 0;
+}
+
+
+static int ip175c_config_aneg(struct phy_device *phydev)
+{
+   if (phydev->addr == 4) { /* if WAN port */
+   genphy_config_aneg(phydev);
+   }
+
+   return 0;
+}
+
+
+static struct phy_driver ip175c_driver = {
+   .phy_id = 0x02430d80,
+   .name   = "ICPlus IP175C",
+   .phy_id_mask= 0x0ffffff0,
+   .features   = PHY_BASIC_FEATURES,
+   .config_init= &ip175c_config_init,
+   .config_aneg= &ip175c_config_aneg,
+   .read_status= &ip175c_read_status,
+   .driver = { .owner = THIS_MODULE,},
+};
+
+static int __init ip175c_init(void)
+{
+   return phy_driver_register(&ip175c_driver);
+}
+
+static void __exit ip175c_exit(void)
+{
+   phy_driver_unregister(&ip175c_driver);
+}
+
+module_init(ip175c_init);
+module_exit(ip175c_exit);
+
-- 
1.5.0.1



[PATCH] Add support for the Davicom DM9161A PHY

2007-04-10 Thread Kim Phillips
Distinguish between the Davicom DM9161A PHY and the DM9161E.

Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
---
please consider for 2.6.22

 drivers/net/phy/davicom.c |   34 ++
 1 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/davicom.c b/drivers/net/phy/davicom.c
index 519baa3..7ed632d 100644
--- a/drivers/net/phy/davicom.c
+++ b/drivers/net/phy/davicom.c
@@ -139,7 +139,7 @@ static int dm9161_ack_interrupt(struct phy_device *phydev)
return (err < 0) ? err : 0;
 }
 
-static struct phy_driver dm9161_driver = {
+static struct phy_driver dm9161e_driver = {
.phy_id = 0x0181b880,
.name   = "Davicom DM9161E",
.phy_id_mask= 0x0ffffff0,
@@ -147,7 +147,18 @@ static struct phy_driver dm9161_driver = {
.config_init= dm9161_config_init,
.config_aneg= dm9161_config_aneg,
.read_status= genphy_read_status,
-   .driver = { .owner = THIS_MODULE,},
+   .driver = { .owner = THIS_MODULE,},
+};
+
+static struct phy_driver dm9161a_driver = {
+   .phy_id = 0x0181b8a0,
+   .name   = "Davicom DM9161A",
+   .phy_id_mask= 0x0ffffff0,
+   .features   = PHY_BASIC_FEATURES,
+   .config_init= dm9161_config_init,
+   .config_aneg= dm9161_config_aneg,
+   .read_status= genphy_read_status,
+   .driver = { .owner = THIS_MODULE,},
 };
 
 static struct phy_driver dm9131_driver = {
@@ -160,31 +171,38 @@ static struct phy_driver dm9131_driver = {
.read_status= genphy_read_status,
.ack_interrupt  = dm9161_ack_interrupt,
.config_intr= dm9161_config_intr,
-   .driver = { .owner = THIS_MODULE,},
+   .driver = { .owner = THIS_MODULE,},
 };
 
 static int __init davicom_init(void)
 {
int ret;
 
-   ret = phy_driver_register(&dm9161_driver);
+   ret = phy_driver_register(&dm9161e_driver);
if (ret)
goto err1;
 
-   ret = phy_driver_register(&dm9131_driver);
+   ret = phy_driver_register(&dm9161a_driver);
if (ret)
goto err2;
+
+   ret = phy_driver_register(&dm9131_driver);
+   if (ret)
+   goto err3;
return 0;
 
- err2: 
-   phy_driver_unregister(&dm9161_driver);
+ err3:
+   phy_driver_unregister(&dm9161a_driver);
+ err2:
+   phy_driver_unregister(&dm9161e_driver);
  err1:
return ret;
 }
 
 static void __exit davicom_exit(void)
 {
-   phy_driver_unregister(&dm9161_driver);
+   phy_driver_unregister(&dm9161e_driver);
+   phy_driver_unregister(&dm9161a_driver);
phy_driver_unregister(&dm9131_driver);
 }
 
-- 
1.5.0.1
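
For readers following along: the two Davicom entries can coexist because
a PHY driver is selected by comparing the ID read from the hardware with
the driver's phy_id under phy_id_mask. The sketch below is a userspace
model of that comparison, not the phylib source. The function name
phy_id_matches is hypothetical, and the mask value 0x0ffffff0 is an
assumption taken from the mainline davicom.c (the low four bits, which
are masked off, normally carry the silicon revision).

```c
#include <stdbool.h>
#include <stdint.h>

/* IDs from the patch; the mask is assumed, see the note above. */
#define DM_PHY_ID_MASK 0x0ffffff0u
#define DM9161E_PHY_ID 0x0181b880u
#define DM9161A_PHY_ID 0x0181b8a0u

/*
 * Model of the ID comparison: bits outside the mask are ignored,
 * so every revision of a part matches the same driver entry.
 */
static bool phy_id_matches(uint32_t hw_id, uint32_t drv_id, uint32_t mask)
{
    return (hw_id & mask) == (drv_id & mask);
}
```

Under this model a chip reporting 0x0181b8a1 matches the DM9161A entry
and not the DM9161E entry, since the two IDs differ inside the masked
range.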



[PATCH] Add support for running the Marvell m88e1111 PHY in RGMII mode

2007-04-10 Thread Kim Phillips
Also adds RX and TX delay bits to help boards with clock skew problems.

Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
---
please consider for 2.6.22

 drivers/net/phy/marvell.c |   43 +++
 1 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 22aec5c..0b2d4db 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -54,6 +54,12 @@
 #define MII_M1111_PHY_LED_CONTROL  0x18
 #define MII_M1111_PHY_LED_DIRECT   0x4100
 #define MII_M1111_PHY_LED_COMBINE  0x411c
+#define MII_M1111_PHY_EXT_CR   0x14
+#define MII_M1111_RX_DELAY 0x80
+#define MII_M1111_TX_DELAY 0x2
+#define MII_M1111_PHY_EXT_SR   0x1b
+#define MII_M1111_HWCFG_MODE_MASK  0xf
+#define MII_M1111_HWCFG_MODE_RGMII 0xb
 
 MODULE_DESCRIPTION("Marvell PHY driver");
 MODULE_AUTHOR("Andy Fleming");
@@ -131,6 +137,42 @@ static int marvell_config_aneg(struct phy_device *phydev)
return err;
 }
 
+static int m88e1111_config_init(struct phy_device *phydev)
+{
+   int err;
+
+   if (phydev->interface == PHY_INTERFACE_MODE_RGMII) {
+   int temp;
+
+   temp = phy_read(phydev, MII_M1111_PHY_EXT_CR);
+   if (temp < 0)
+   return temp;
+
+   temp |= (MII_M1111_RX_DELAY | MII_M1111_TX_DELAY);
+
+   err = phy_write(phydev, MII_M1111_PHY_EXT_CR, temp);
+   if (err < 0)
+   return err;
+
+   temp = phy_read(phydev, MII_M1111_PHY_EXT_SR);
+   if (temp < 0)
+   return temp;
+
+   temp &= ~(MII_M1111_HWCFG_MODE_MASK);
+   temp |= MII_M1111_HWCFG_MODE_RGMII;
+
+   err = phy_write(phydev, MII_M1111_PHY_EXT_SR, temp);
+   if (err < 0)
+   return err;
+   }
+
+   err = phy_write(phydev, MII_BMCR, BMCR_RESET);
+   if (err < 0)
+   return err;
+
+   return 0;
+}
+
 static int m88e1145_config_init(struct phy_device *phydev)
 {
int err;
@@ -216,6 +258,7 @@ static struct phy_driver m88e1111s_driver = {
.read_status = &genphy_read_status,
.ack_interrupt = &marvell_ack_interrupt,
.config_intr = &marvell_config_intr,
+   .config_init = &m88e1111_config_init,
.driver = {.owner = THIS_MODULE,},
 };
 
-- 
1.5.0.1
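
The init routine in this patch does two read-modify-write steps: it sets
both delay bits in the extended control register, then replaces the
hardware configuration mode field in the extended status register with
the RGMII value. The following userspace sketch isolates just the bit
arithmetic, using the constants from the patch. The helper names
ext_cr_enable_delays and ext_sr_select_rgmii are hypothetical, and no
MDIO access is modelled.

```c
#include <stdint.h>

/* Bit values as defined in the patch above. */
#define RX_DELAY         0x80u
#define TX_DELAY         0x02u
#define HWCFG_MODE_MASK  0x0fu
#define HWCFG_MODE_RGMII 0x0bu

/* Set both delay bits; every other bit of the register is preserved. */
static uint16_t ext_cr_enable_delays(uint16_t cr)
{
    return (uint16_t)(cr | RX_DELAY | TX_DELAY);
}

/* Clear the 4-bit mode field first, then write the RGMII mode into it. */
static uint16_t ext_sr_select_rgmii(uint16_t sr)
{
    return (uint16_t)((sr & ~HWCFG_MODE_MASK) | HWCFG_MODE_RGMII);
}
```

Clearing the field before OR-ing in the new mode is what keeps stale mode
bits from surviving; a plain OR would leave 0xf in place if the field
happened to read as 0xf.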



[PATCH 2/2] ucc_geth: implement and increment version number

2007-04-10 Thread Kim Phillips
Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
---
please consider for 2.6.22

 drivers/net/ucc_geth.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 1b943e6..c9b1b28 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -46,8 +46,9 @@
 
 #undef DEBUG
 
-#define DRV_DESC "QE UCC Gigabit Ethernet Controller version:Sept 11, 2006"
+#define DRV_DESC "QE UCC Gigabit Ethernet Controller"
 #define DRV_NAME "ucc_geth"
+#define DRV_VERSION "1.1"
 
 #define ugeth_printk(level, format, arg...)  \
 printk(level format "\n", ## arg)
@@ -3975,4 +3976,5 @@ module_exit(ucc_geth_exit);
 
 MODULE_AUTHOR("Freescale Semiconductor, Inc");
 MODULE_DESCRIPTION(DRV_DESC);
+MODULE_VERSION(DRV_VERSION);
 MODULE_LICENSE("GPL");
-- 
1.5.0.1



[PATCH 23/30] Use menuconfig objects - netdev

2007-04-10 Thread Jan Engelhardt

Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/drivers/net/Kconfig
===
--- linux-2.6.21-rc5.orig/drivers/net/Kconfig
+++ linux-2.6.21-rc5/drivers/net/Kconfig
@@ -1897,8 +1897,12 @@ endmenu
 #  Gigabit Ethernet
 #
 
-menu "Ethernet (1000 Mbit)"
+menuconfig NETDEV_1000
+   bool "Ethernet (1000 Mbit)"
depends on !UML
+   default y
+
+if NETDEV_1000
 
 config ACENIC
tristate "Alteon AceNIC/3Com 3C985/NetGear GA620 Gigabit support"
@@ -2327,14 +2331,18 @@ config ATL1
  To compile this driver as a module, choose M here.  The module
  will be called atl1.
 
-endmenu
+endif # NETDEV_1000
 
 #
 #  10 Gigabit Ethernet
 #
 
-menu "Ethernet (10000 Mbit)"
+menuconfig NETDEV_10000
+   bool "Ethernet (10000 Mbit)"
depends on !UML
+   default y
+
+if NETDEV_10000
 
 config CHELSIO_T1
 tristate "Chelsio 10Gb Ethernet support"
@@ -2493,7 +2501,7 @@ config PASEMI_MAC
  This driver supports the on-chip 1/10Gbit Ethernet controller on
  PA Semi's PWRficient line of chips.
 
-endmenu
+endif # NETDEV_10000
 
 source "drivers/net/tokenring/Kconfig"
 
#


[PATCH 22/30] Use menuconfig objects - tokenring

2007-04-10 Thread Jan Engelhardt

Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/drivers/net/tokenring/Kconfig
===
--- linux-2.6.21-rc5.orig/drivers/net/tokenring/Kconfig
+++ linux-2.6.21-rc5/drivers/net/tokenring/Kconfig
@@ -2,12 +2,10 @@
 # Token Ring driver configuration
 #
 
-menu "Token Ring devices"
-   depends on NETDEVICES && !UML
-
 # So far, we only have PCI, ISA, and MCA token ring devices
-config TR
+menuconfig TR
bool "Token Ring driver support"
+   depends on NETDEVICES && !UML
depends on (PCI || ISA || MCA || CCW)
select LLC
help
@@ -20,9 +18,11 @@ config TR
  from . Most people can
  say N here.
 
+if TR
+
 config IBMTR
tristate "IBM Tropic chipset based adapter support"
-   depends on TR && (ISA || MCA)
+   depends on ISA || MCA
---help---
  This is support for all IBM Token Ring cards that don't use DMA. If
  you have such a beast, say Y and read the Token-Ring mini-HOWTO,
@@ -36,7 +36,7 @@ config IBMTR
 
 config IBMOL
tristate "IBM Olympic chipset PCI adapter support"
-   depends on TR && PCI
+   depends on PCI
---help---
  This is support for all non-Lanstreamer IBM PCI Token Ring Cards.
  Specifically this is all IBM PCI, PCI Wake On Lan, PCI II, PCI II
@@ -54,7 +54,7 @@ config IBMOL
 
 config IBMLS
tristate "IBM Lanstreamer chipset PCI adapter support"
-   depends on TR && PCI && !64BIT
+   depends on PCI && !64BIT
help
  This is support for IBM Lanstreamer PCI Token Ring Cards.
 
@@ -66,7 +66,7 @@ config IBMLS
 
 config 3C359
tristate "3Com 3C359 Token Link Velocity XL adapter support"
-   depends on TR && PCI
+   depends on PCI
---help---
  This is support for the 3Com PCI Velocity XL cards, specifically
  the 3Com 3C359, please note this is not for the 3C339 cards, you
@@ -84,7 +84,7 @@ config 3C359
 
 config TMS380TR
tristate "Generic TMS380 Token Ring ISA/PCI adapter support"
-   depends on TR && (PCI || ISA && ISA_DMA_API || MCA)
+   depends on PCI || ISA && ISA_DMA_API || MCA
select FW_LOADER
---help---
  This driver provides generic support for token ring adapters
@@ -108,7 +108,7 @@ config TMS380TR
 
 config TMSPCI
tristate "Generic TMS380 PCI support"
-   depends on TR && TMS380TR && PCI
+   depends on TMS380TR && PCI
---help---
  This tms380 module supports generic TMS380-based PCI cards.
 
@@ -123,7 +123,7 @@ config TMSPCI
 
 config SKISA
tristate "SysKonnect TR4/16 ISA support"
-   depends on TR && TMS380TR && ISA
+   depends on TMS380TR && ISA
help
  This tms380 module supports SysKonnect TR4/16 ISA cards.
 
@@ -135,7 +135,7 @@ config SKISA
 
 config PROTEON
tristate "Proteon ISA support"
-   depends on TR && TMS380TR && ISA
+   depends on TMS380TR && ISA
help
  This tms380 module supports Proteon ISA cards.
 
@@ -148,7 +148,7 @@ config PROTEON
 
 config ABYSS
tristate "Madge Smart 16/4 PCI Mk2 support"
-   depends on TR && TMS380TR && PCI
+   depends on TMS380TR && PCI
help
  This tms380 module supports the Madge Smart 16/4 PCI Mk2
  cards (51-02).
@@ -158,7 +158,7 @@ config ABYSS
 
 config MADGEMC
tristate "Madge Smart 16/4 Ringnode MicroChannel"
-   depends on TR && TMS380TR && MCA
+   depends on TMS380TR && MCA
help
  This tms380 module supports the Madge Smart 16/4 MC16 and MC32
  MicroChannel adapters.
@@ -168,7 +168,7 @@ config MADGEMC
 
 config SMCTR
tristate "SMC ISA/MCA adapter support"
-   depends on TR && (ISA || MCA_LEGACY) && (BROKEN || !64BIT)
+   depends on (ISA || MCA_LEGACY) && (BROKEN || !64BIT)
---help---
  This is support for the ISA and MCA SMC Token Ring cards,
  specifically SMC TokenCard Elite (8115T) and SMC TokenCard Elite/A
@@ -182,5 +182,4 @@ config SMCTR
  To compile this driver as a module, choose M here: the module will be
  called smctr.
 
-endmenu
-
+endif # TR
#


[PATCH 21/30] Use menuconfig objects - PHY

2007-04-10 Thread Jan Engelhardt

(No MAINTAINERS entry. MODULE_AUTHOR lines exist, but without addresses.)

Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/drivers/net/phy/Kconfig
===
--- linux-2.6.21-rc5.orig/drivers/net/phy/Kconfig
+++ linux-2.6.21-rc5/drivers/net/phy/Kconfig
@@ -2,9 +2,7 @@
 # PHY Layer Configuration
 #
 
-menu "PHY device support"
-
-config PHYLIB
+menuconfig PHYLIB
tristate "PHY Device support and infrastructure"
depends on NET_ETHERNET && (BROKEN || !S390)
help
@@ -12,59 +10,52 @@ config PHYLIB
  devices.  This option provides infrastructure for
  managing PHY devices.
 
+if PHYLIB
+
 comment "MII PHY device drivers"
-   depends on PHYLIB
 
 config MARVELL_PHY
tristate "Drivers for Marvell PHYs"
-   depends on PHYLIB
---help---
  Currently has a driver for the 88E1011S

 config DAVICOM_PHY
tristate "Drivers for Davicom PHYs"
-   depends on PHYLIB
---help---
  Currently supports dm9161e and dm9131
 
 config QSEMI_PHY
tristate "Drivers for Quality Semiconductor PHYs"
-   depends on PHYLIB
---help---
  Currently supports the qs6612
 
 config LXT_PHY
tristate "Drivers for the Intel LXT PHYs"
-   depends on PHYLIB
---help---
  Currently supports the lxt970, lxt971
 
 config CICADA_PHY
tristate "Drivers for the Cicada PHYs"
-   depends on PHYLIB
---help---
  Currently supports the cis8204
+
 config VITESSE_PHY
 tristate "Drivers for the Vitesse PHYs"
-depends on PHYLIB
 ---help---
   Currently supports the vsc8244
 
 config SMSC_PHY
tristate "Drivers for SMSC PHYs"
-   depends on PHYLIB
---help---
  Currently supports the LAN83C185 PHY
 
 config BROADCOM_PHY
tristate "Drivers for Broadcom PHYs"
-   depends on PHYLIB
---help---
  Currently supports the BCM5411, BCM5421 and BCM5461 PHYs.
 
 config FIXED_PHY
tristate "Drivers for PHY emulation on fixed speed/link"
-   depends on PHYLIB
---help---
  Adds the driver to PHY layer to cover the boards that do not have any PHY bound,
  but with the ability to manipulate the speed/link in software. The relevant MII
@@ -79,5 +70,4 @@ config FIXED_MII_100_FDX
bool "Emulation for 100M Fdx fixed PHY behavior"
depends on FIXED_PHY
 
-endmenu
-
+endif # PHYLIB
#


[PATCH 20/30] Use menuconfig objects - ARCNET

2007-04-10 Thread Jan Engelhardt

(Wow, not a single MODULE_AUTHOR line in drivers/net/arcnet/ ...)

Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/drivers/net/arcnet/Kconfig
===
--- linux-2.6.21-rc5.orig/drivers/net/arcnet/Kconfig
+++ linux-2.6.21-rc5/drivers/net/arcnet/Kconfig
@@ -2,10 +2,8 @@
 # Arcnet configuration
 #
 
-menu "ARCnet devices"
+menuconfig ARCNET
depends on NETDEVICES && (ISA || PCI)
-
-config ARCNET
tristate "ARCnet support"
---help---
  If you have a network card of this type, say Y and check out the
@@ -25,9 +23,10 @@ config ARCNET
  .  The module will
  be called arcnet.
 
+if ARCNET
+
 config ARCNET_1201
tristate "Enable standard ARCNet packet format (RFC 1201)"
-   depends on ARCNET
help
  This allows you to use RFC1201 with your ARCnet card via the virtual
  arc0 device.  You need to say Y here to communicate with
@@ -38,7 +37,6 @@ config ARCNET_1201
 
 config ARCNET_1051
tristate "Enable old ARCNet packet format (RFC 1051)"
-   depends on ARCNET
---help---
  This allows you to use RFC1051 with your ARCnet card via the virtual
  arc0s device. You only need arc0s if you want to talk to ARCnet
@@ -53,7 +51,6 @@ config ARCNET_1051
 
 config ARCNET_RAW
tristate "Enable raw mode packet interface"
-   depends on ARCNET
help
  ARCnet "raw mode" packet encapsulation, no soft headers.  Unlikely
  to work unless talking to a copy of the same Linux arcnet driver,
@@ -61,7 +58,6 @@ config ARCNET_RAW
 
 config ARCNET_CAP
tristate "Enable CAP mode packet interface"
-   depends on ARCNET
help
  ARCnet "cap mode" packet encapsulation. Used to get the hardware
   acknowledge back to userspace. After the initial protocol byte every
@@ -80,7 +76,6 @@ config ARCNET_CAP
 
 config ARCNET_COM90xx
tristate "ARCnet COM90xx (normal) chipset driver"
-   depends on ARCNET
help
  This is the chipset driver for the standard COM90xx cards. If you
  have always used the old ARCnet driver without knowing what type of
@@ -92,7 +87,6 @@ config ARCNET_COM90xx
 
 config ARCNET_COM90xxIO
tristate "ARCnet COM90xx (IO mapped) chipset driver"
-   depends on ARCNET
---help---
  This is the chipset driver for the COM90xx cards, using them in
  IO-mapped mode instead of memory-mapped mode. This is slower than
@@ -105,7 +99,6 @@ config ARCNET_COM90xxIO
 
 config ARCNET_RIM_I
tristate "ARCnet COM90xx (RIM I) chipset driver"
-   depends on ARCNET
---help---
  This is yet another chipset driver for the COM90xx cards, but this
  time only using memory-mapped mode, and no IO ports at all. This
@@ -118,7 +111,6 @@ config ARCNET_RIM_I
 
 config ARCNET_COM20020
tristate "ARCnet COM20020 chipset driver"
-   depends on ARCNET
help
  This is the driver for the new COM20020 chipset. It supports such
  things as promiscuous mode, so packet sniffing is possible, and
@@ -136,5 +128,4 @@ config ARCNET_COM20020_PCI
tristate "Support for COM20020 on PCI"
depends on ARCNET_COM20020 && PCI
 
-endmenu
-
+endif # ARCNET
#


[PATCH 17/30] Use menuconfig objects - IPVS

2007-04-10 Thread Jan Engelhardt

Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/net/ipv4/ipvs/Kconfig
===
--- linux-2.6.21-rc5.orig/net/ipv4/ipvs/Kconfig
+++ linux-2.6.21-rc5/net/ipv4/ipvs/Kconfig
@@ -1,10 +1,7 @@
 #
 # IP Virtual Server configuration
 #
-menu   "IP: Virtual Server Configuration"
-   depends on NETFILTER
-
-config IP_VS
+menuconfig IP_VS
tristate "IP virtual server support (EXPERIMENTAL)"
depends on NETFILTER
---help---
@@ -25,9 +22,10 @@ config   IP_VS
  If you want to compile it in kernel, say Y. To compile it as a
  module, choose M here. If unsure, say N.
 
+if IP_VS
+
 config IP_VS_DEBUG
bool "IP virtual server debugging"
-   depends on IP_VS
---help---
  Say Y here if you want to get additional messages useful in
  debugging the IP virtual server code. You can change the debug
@@ -35,7 +33,6 @@ configIP_VS_DEBUG
 
 config IP_VS_TAB_BITS
int "IPVS connection table size (the Nth power of 2)"
-   depends on IP_VS 
default "12" 
---help---
  The IPVS connection hash table uses the chaining scheme to handle
@@ -61,42 +58,35 @@ config  IP_VS_TAB_BITS
  needed for your box.
 
 comment "IPVS transport protocol load balancing support"
-depends on IP_VS
 
 config IP_VS_PROTO_TCP
bool "TCP load balancing support"
-   depends on IP_VS
---help---
  This option enables support for load balancing TCP transport
  protocol. Say Y if unsure.
 
 config IP_VS_PROTO_UDP
bool "UDP load balancing support"
-   depends on IP_VS
---help---
  This option enables support for load balancing UDP transport
  protocol. Say Y if unsure.
 
 config IP_VS_PROTO_ESP
bool "ESP load balancing support"
-   depends on IP_VS
---help---
  This option enables support for load balancing ESP (Encapsulation
  Security Payload) transport protocol. Say Y if unsure.
 
 config IP_VS_PROTO_AH
bool "AH load balancing support"
-   depends on IP_VS
---help---
  This option enables support for load balancing AH (Authentication
  Header) transport protocol. Say Y if unsure.
 
 comment "IPVS scheduler"
-depends on IP_VS
 
 config IP_VS_RR
tristate "round-robin scheduling"
-   depends on IP_VS
---help---
  The robin-robin scheduling algorithm simply directs network
  connections to different real servers in a round-robin manner.
@@ -106,7 +96,6 @@ config   IP_VS_RR
  
 config IP_VS_WRR
 tristate "weighted round-robin scheduling" 
-   depends on IP_VS
---help---
  The weighted robin-robin scheduling algorithm directs network
  connections to different real servers based on server weights
@@ -120,7 +109,6 @@ config  IP_VS_WRR
 
 config IP_VS_LC
 tristate "least-connection scheduling"
-depends on IP_VS
---help---
  The least-connection scheduling algorithm directs network
  connections to the server with the least number of active 
@@ -131,7 +119,6 @@ config  IP_VS_LC
 
 config IP_VS_WLC
 tristate "weighted least-connection scheduling"
-depends on IP_VS
---help---
  The weighted least-connection scheduling algorithm directs network
  connections to the server with the least active connections
@@ -142,7 +129,6 @@ config  IP_VS_WLC
 
 config IP_VS_LBLC
tristate "locality-based least-connection scheduling"
-depends on IP_VS
---help---
  The locality-based least-connection scheduling algorithm is for
  destination IP load balancing. It is usually used in cache cluster.
@@ -157,7 +143,6 @@ config  IP_VS_LBLC
 
 config  IP_VS_LBLCR
tristate "locality-based least-connection with replication scheduling"
-depends on IP_VS
---help---
  The locality-based least-connection with replication scheduling
  algorithm is also for destination IP load balancing. It is 
@@ -176,7 +161,6 @@ config  IP_VS_LBLCR
 
 config IP_VS_DH
tristate "destination hashing scheduling"
-depends on IP_VS
---help---
  The destination hashing scheduling algorithm assigns network
  connections to the servers through looking up a statically assigned
@@ -187,7 +171,6 @@ config  IP_VS_DH
 
 config IP_VS_SH
tristate "source hashing scheduling"
-depends on IP_VS
---help---
  The source hashing scheduling algorithm assigns network
  connections to the servers through looking up a statically assigned
@@ -198,7 +181,6 @@ config  IP_VS_SH
 
 config IP_VS_SED
tristate "shortest ex

[PATCH 0/2] [SCTP] a few bugfixes (now with patches)

2007-04-10 Thread Vlad Yasevich

Sorry, now with the patches that follow...

-vlad


[PATCH 2/2] [SCTP] Unmap v4mapped addresses during SCTP_BINDX_REM_ADDR operation.

2007-04-10 Thread Vlad Yasevich
From: Paolo Galtieri <[EMAIL PROTECTED]>

During the sctp_bindx() call to add additional addresses to the
endpoint, any v4mapped addresses are converted and stored as regular
v4 addresses.  However, when trying to remove these addresses, the
v4mapped addresses are not converted and the operation fails.  This
patch unmaps the addresses during the remove operation as well.

Signed-off-by: Paolo Galtieri <[EMAIL PROTECTED]>
Signed-off-by: Vlad Yasevich <[EMAIL PROTECTED]>
---
 net/sctp/socket.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 523e73e..a1d026f 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -627,6 +627,12 @@ int sctp_bindx_rem(struct sock *sk, struct sockaddr *addrs, int addrcnt)
retval = -EINVAL;
goto err_bindx_rem;
}
+
+   if (!af->addr_valid(sa_addr, sp, NULL)) {
+   retval = -EADDRNOTAVAIL;
+   goto err_bindx_rem;
+   }
+
if (sa_addr->v4.sin_port != htons(bp->port)) {
retval = -EINVAL;
goto err_bindx_rem;
-- 
1.5.0.3.438.gc49b2
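
To make "unmapping" concrete: a v4mapped address is an IPv6 address of
the form ::ffff:a.b.c.d, and unmapping it yields the plain IPv4 address
a.b.c.d with the same port. The sketch below shows that conversion in
userspace with the standard socket headers. It does not reproduce the
SCTP code path, and the function name unmap_v4mapped is hypothetical.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Convert an IPv4-mapped IPv6 socket address into the equivalent
 * IPv4 socket address. Returns false, leaving *out untouched,
 * when the input is not IPv4-mapped.
 */
static bool unmap_v4mapped(const struct sockaddr_in6 *in6,
                           struct sockaddr_in *out)
{
    if (!IN6_IS_ADDR_V4MAPPED(&in6->sin6_addr))
        return false;

    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = in6->sin6_port;   /* already in network byte order */
    /* The IPv4 address occupies the last four bytes of the IPv6 address. */
    memcpy(&out->sin_addr, &in6->sin6_addr.s6_addr[12], 4);
    return true;
}
```

If the add path stores the unmapped form, a remove path that looks up the
still-mapped form will not find a match, which is the failure described
in the changelog.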



[PATCH 1/2] [SCTP] Fix assertion (!atomic_read(&sk->sk_rmem_alloc)) failed message

2007-04-10 Thread Vlad Yasevich
From: Tsutomu Fujii <[EMAIL PROTECTED]>

In the current implementation, LKSCTP does receive buffer accounting for
data in sctp_receive_queue and pd_lobby. However, LKSCTP doesn't do
accounting for data in frag_list when data is fragmented. In addition,
LKSCTP doesn't do accounting for data in the reasm and lobby queues in
structure sctp_ulpq.
When there is data in these queues, an assertion-failed message is printed
in inet_sock_destruct because sk_rmem_alloc of oldsk does not become 0
when the socket is destroyed.

Signed-off-by: Tsutomu Fujii <[EMAIL PROTECTED]>
Signed-off-by: Vlad Yasevich <[EMAIL PROTECTED]>
---
 net/sctp/socket.c |   48 
 1 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 536298c..523e73e 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -5638,6 +5638,36 @@ void sctp_wait_for_close(struct sock *sk, long timeout)
finish_wait(sk->sk_sleep, &wait);
 }
 
+static void sctp_sock_rfree_frag(struct sk_buff *skb)
+{
+   struct sk_buff *frag;
+
+   if (!skb->data_len)
+   goto done;
+
+   /* Don't forget the fragments. */
+   for (frag = skb_shinfo(skb)->frag_list; frag; frag = frag->next)
+   sctp_sock_rfree_frag(frag);
+
+done:
+   sctp_sock_rfree(skb);
+}
+
+static void sctp_skb_set_owner_r_frag(struct sk_buff *skb, struct sock *sk)
+{
+   struct sk_buff *frag;
+
+   if (!skb->data_len)
+   goto done;
+
+   /* Don't forget the fragments. */
+   for (frag = skb_shinfo(skb)->frag_list; frag; frag = frag->next)
+   sctp_skb_set_owner_r_frag(frag, sk);
+
+done:
+   sctp_skb_set_owner_r(skb, sk);
+}
+
 /* Populate the fields of the newsk from the oldsk and migrate the assoc
  * and its messages to the newsk.
  */
@@ -5692,10 +5722,10 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
sctp_skb_for_each(skb, &oldsk->sk_receive_queue, tmp) {
event = sctp_skb2event(skb);
if (event->asoc == assoc) {
-   sctp_sock_rfree(skb);
+   sctp_sock_rfree_frag(skb);
__skb_unlink(skb, &oldsk->sk_receive_queue);
__skb_queue_tail(&newsk->sk_receive_queue, skb);
-   sctp_skb_set_owner_r(skb, newsk);
+   sctp_skb_set_owner_r_frag(skb, newsk);
}
}
 
@@ -5723,10 +5753,10 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
sctp_skb_for_each(skb, &oldsp->pd_lobby, tmp) {
event = sctp_skb2event(skb);
if (event->asoc == assoc) {
-   sctp_sock_rfree(skb);
+   sctp_sock_rfree_frag(skb);
__skb_unlink(skb, &oldsp->pd_lobby);
__skb_queue_tail(queue, skb);
-   sctp_skb_set_owner_r(skb, newsk);
+   sctp_skb_set_owner_r_frag(skb, newsk);
}
}
 
@@ -5738,6 +5768,16 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk,
 
}
 
+   sctp_skb_for_each(skb, &assoc->ulpq.reasm, tmp) {
+   sctp_sock_rfree_frag(skb);
+   sctp_skb_set_owner_r_frag(skb, newsk);
+   }
+
+   sctp_skb_for_each(skb, &assoc->ulpq.lobby, tmp) {
+   sctp_sock_rfree_frag(skb);
+   sctp_skb_set_owner_r_frag(skb, newsk);
+   }
+
/* Set the type of socket to indicate that it is peeled off from the
 * original UDP-style socket or created with the accept() call on a
 * TCP-style socket..
-- 
1.5.0.3.438.gc49b2
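
The accounting problem can be shown with a small userspace model. Each
buffer charges some bytes to its owning socket, and a buffer may carry a
list of fragments that are charged as well. Moving ownership must
therefore walk the fragment list, otherwise the old socket's counter
never returns to zero. All names here (model_sock, model_buf,
model_charge, model_uncharge, model_migrate) are hypothetical, and the
model leaves out the data_len shortcut that the patch uses to skip
buffers without fragments.

```c
#include <stddef.h>

struct model_sock {
    long rmem_alloc;             /* bytes currently charged to this socket */
};

struct model_buf {
    long charge;                 /* bytes this buffer accounts for */
    struct model_buf *frag_list; /* first fragment, or NULL */
    struct model_buf *next;      /* next sibling fragment, or NULL */
    struct model_sock *owner;    /* socket currently charged, or NULL */
};

/* Release the charge of a buffer and, recursively, of its fragments. */
static void model_uncharge(struct model_buf *buf)
{
    struct model_buf *frag;

    for (frag = buf->frag_list; frag; frag = frag->next)
        model_uncharge(frag);

    if (buf->owner) {
        buf->owner->rmem_alloc -= buf->charge;
        buf->owner = NULL;
    }
}

/* Charge a buffer and, recursively, its fragments to sk. */
static void model_charge(struct model_buf *buf, struct model_sock *sk)
{
    struct model_buf *frag;

    for (frag = buf->frag_list; frag; frag = frag->next)
        model_charge(frag, sk);

    buf->owner = sk;
    sk->rmem_alloc += buf->charge;
}

/* Move a buffer and all of its fragments from its owner to newsk. */
static void model_migrate(struct model_buf *buf, struct model_sock *newsk)
{
    model_uncharge(buf);
    model_charge(buf, newsk);
}
```

With a head buffer of 100 bytes carrying fragments of 50 and 25 bytes,
migrating only the head would leave 75 bytes charged to the old socket.
The recursive walk moves all 175 bytes, so the old socket's counter
reaches zero before it is destroyed.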



[PATCH 0/2] [SCTP] a few bugfixes

2007-04-10 Thread Vlad Yasevich

Hi Dave

Please consider applying the following two patches.  They resolve some
interesting bugs.

Thanks
-vlad


net-2.6.22 plans...

2007-04-10 Thread David Miller

As the merge window approaches I've made a decision about how to
handle the TCP RB-Tree work since it's getting really close.

First, I'm going to rebase the net-2.6.22 tree after making a quick
run through my patch backlog looking for low hanging fruit.  I will
include the TCP write queue abstraction bits but not any of the
TCP RB-Tree based work.

Then I'll clone a tcp-2.6 tree from the rebased net-2.6.22 and put all
of the TCP RB-Tree bits in there.

There is not enough time to do a proper performance analysis of the
RB-Tree stuff and I'm busy working on other things at the moment so I
won't be able to do it myself soon either.  So we'll try to take care
of things in time for 2.6.23.

Some minor conflicts might result from yanking out the RB-Tree bits
from net-2.6.22 so give me a day or two to finish this.

Thanks.


phylib usage

2007-04-10 Thread David Hollis
I've been keeping an eye on PHYLIB since its addition to the kernel
some time ago and I'm a bit curious as to why it doesn't seem to have
much up-take among other drivers.  A quick check of the kernel source
shows only two users (AU1000 and gianfar) and both look to be embedded
type devices.  Are there fundamental issues with PHYLIB that prevent it
from being more widely used?  Is it an element of "the driver ain't
broke, I'm not going to rework my PHY handling at this point"?  Is PHY
handling something that is just too difficult to fully abstract with
various specialty nics?

To semi-answer one of my own questions, I was hoping to use PHYLIB with
asix.c (a USB Ethernet driver) since the newer devices now have external
PHYs and it appears that there are quite a few different ones in use by
various OEMs but I found that the locking in use wasn't amenable to USB
operations, since USB calls can sleep.  This doesn't seem like an
insurmountable problem however.

The main reason I even ask is that it seems rather odd to have so many
drivers that seem to have the same PHYs and yet each driver has to deal
with various errata and quirks of the PHY.  Having it all abstracted
makes for one location to deal with things and everybody wins.

-- 
David Hollis <[EMAIL PROTECTED]>



Re: [PATCH] NET : loopback driver can use loopback_dev integrated net_device_stats

2007-04-10 Thread David Miller
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Tue, 10 Apr 2007 19:29:12 +0200

> Hi David
> 
> Please find this patch against net-2.6.22
> 
> Thank you
> 
> [PATCH] NET : loopback driver can use loopback_dev integrated net_device_stats
> 
> Rusty added a new 'stats' field to struct net_device.
> 
> loopback driver can use it instead of declaring another struct 
> net_device_stats
> This saves some memory.
> 
> Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

Applied, thanks Eric.


Re: [GIT PULL] bridge update for 2.6.22

2007-04-10 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Mon, 9 Apr 2007 13:55:52 -0700

> Here is an update to bridging code for net-2.6.22
> 
> The following changes since commit 532122caf3f7573760c5ec523bc3be14606bb8f2:
>   Ilpo Järvinen (1):
> [TCP]: Simplify LOST marker code
> 
> are found in the git repository at:
> 
>   master.kernel.org:/pub/scm/linux/kernel/git/shemminger/bridge-2.6.22.git
> 
> Akinobu Mita (1):
>   bridge: check kmem_cache_create() error
> 
> Stephen Hemminger (7):
>   bridge: eliminate call by reference
>   bridge: don't route packets while learning
>   bridge: simpler hash with salt
>   bridge: add sysfs hook to flush forwarding table
>   bridge: add support for user mode STP
>   bridge: change when netlink events go to STP
>   bridge: allow changing hardware address to any valid address

Pulled, thanks a lot Stephen.


[PATCH 2/3] myri10ge: more Intel chipsets providing aligned PCIe completions

2007-04-10 Thread Brice Goglin
Add the Intel 5000 southbridge (aka Intel 6310/6311/6321ESB) PCIe ports
and the Intel E30x0 chipsets to the whitelist of aligned PCIe completion.

Signed-off-by: Brice Goglin <[EMAIL PROTECTED]>
---
 drivers/net/myri10ge/myri10ge.c |   17 +
 1 file changed, 17 insertions(+)

Index: linux-rc/drivers/net/myri10ge/myri10ge.c
===
--- linux-rc.orig/drivers/net/myri10ge/myri10ge.c   2007-04-10 21:03:59.0 +0200
+++ linux-rc/drivers/net/myri10ge/myri10ge.c        2007-04-10 21:04:35.0 +0200
@@ -2487,6 +2487,10 @@
 
 #define PCI_DEVICE_ID_INTEL_E5000_PCIE23 0x25f7
 #define PCI_DEVICE_ID_INTEL_E5000_PCIE47 0x25fa
+#define PCI_DEVICE_ID_INTEL_6300ESB_PCIEE1 0x3510
+#define PCI_DEVICE_ID_INTEL_6300ESB_PCIEE4 0x351b
+#define PCI_DEVICE_ID_INTEL_E3000_PCIE 0x2779
+#define PCI_DEVICE_ID_INTEL_E3010_PCIE 0x277a
 #define PCI_DEVICE_ID_SERVERWORKS_HT2100_PCIE_FIRST 0x140
 #define PCI_DEVICE_ID_SERVERWORKS_HT2100_PCIE_LAST 0x142
 
@@ -2526,6 +2530,18 @@
PCI_DEVICE_ID_SERVERWORKS_HT2100_PCIE_FIRST
&& bridge->device <=
PCI_DEVICE_ID_SERVERWORKS_HT2100_PCIE_LAST)
+   /* All Intel E3000/E3010 PCIE ports */
+   || (bridge->vendor == PCI_VENDOR_ID_INTEL
+   && (bridge->device ==
+   PCI_DEVICE_ID_INTEL_E3000_PCIE
+   || bridge->device ==
+   PCI_DEVICE_ID_INTEL_E3010_PCIE))
+   /* All Intel 6310/6311/6321ESB PCIE ports */
+   || (bridge->vendor == PCI_VENDOR_ID_INTEL
+   && bridge->device >=
+   PCI_DEVICE_ID_INTEL_6300ESB_PCIEE1
+   && bridge->device <=
+   PCI_DEVICE_ID_INTEL_6300ESB_PCIEE4)
/* All Intel E5000 PCIE ports */
|| (bridge->vendor == PCI_VENDOR_ID_INTEL
&& bridge->device >=
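
For illustration only: the whitelist test in the hunk above is a chain of
vendor/device-ID comparisons.  The same check can be sketched as a small
table of inclusive ID ranges.  The table layout, the struct and the
function name below are hypothetical and are not how myri10ge.c is written;
the device IDs are the ones defined in the patch, and 0x8086 is the Intel
PCI vendor ID.

```c
#include <assert.h>
#include <stddef.h>

#define VENDOR_INTEL 0x8086u	/* Intel PCI vendor ID */

/* One inclusive range of PCI device IDs for a given vendor (hypothetical). */
struct id_range {
	unsigned int vendor;
	unsigned int first;
	unsigned int last;
};

/* Bridges assumed to provide aligned PCIe completions, per the patch. */
static const struct id_range aligned_bridges[] = {
	/* E3000 (0x2779) and E3010 (0x277a) happen to be consecutive IDs */
	{ VENDOR_INTEL, 0x2779, 0x277a },
	/* 6310/6311/6321ESB PCIe ports */
	{ VENDOR_INTEL, 0x3510, 0x351b },
	/* E5000 PCIe ports */
	{ VENDOR_INTEL, 0x25f7, 0x25fa },
};

/* Return 1 if the bridge matches any whitelisted range, 0 otherwise. */
static int bridge_is_whitelisted(unsigned int vendor, unsigned int device)
{
	size_t i;

	for (i = 0; i < sizeof(aligned_bridges) / sizeof(aligned_bridges[0]); i++) {
		const struct id_range *r = &aligned_bridges[i];

		assert(r->first <= r->last);
		if (vendor == r->vendor && device >= r->first && device <= r->last)
			return 1;
	}
	return 0;
}
```

A table like this keeps each chipset on one line, at the cost of departing
from the open-coded style the driver actually uses.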




[PATCH 3/3] myri10ge: update driver version to 1.3.0-1.233

2007-04-10 Thread Brice Goglin
Update the myri10ge driver version number to 1.3.0-1.233.

Signed-off-by: Brice Goglin <[EMAIL PROTECTED]>
---
 drivers/net/myri10ge/myri10ge.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-rc/drivers/net/myri10ge/myri10ge.c
===
--- linux-rc.orig/drivers/net/myri10ge/myri10ge.c   2007-04-10 21:11:08.0 +0200
+++ linux-rc/drivers/net/myri10ge/myri10ge.c        2007-04-10 21:11:11.0 +0200
@@ -71,7 +71,7 @@
 #include "myri10ge_mcp.h"
 #include "myri10ge_mcp_gen_header.h"
 
-#define MYRI10GE_VERSION_STR "1.3.0-1.227"
+#define MYRI10GE_VERSION_STR "1.3.0-1.233"
 
 MODULE_DESCRIPTION("Myricom 10G driver (10GbE)");
 MODULE_AUTHOR("Maintainer: [EMAIL PROTECTED]");




[PATCH 1/3] myri10ge: fix management of the firmware 4KB boundary crossing restriction

2007-04-10 Thread Brice Goglin
Simpler way of dealing with the firmware 4KB boundary crossing
restriction for rx buffers.  This fixes a variety of memory
corruption issues when using an "uncommon" MTU with a 16KB
page size.

Signed-off-by: Brice Goglin <[EMAIL PROTECTED]>
---
 drivers/net/myri10ge/myri10ge.c |   19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

Index: linux-rc/drivers/net/myri10ge/myri10ge.c
===
--- linux-rc.orig/drivers/net/myri10ge/myri10ge.c   2007-04-06 09:05:17.0 +0200
+++ linux-rc/drivers/net/myri10ge/myri10ge.c        2007-04-10 21:03:59.0 +0200
@@ -900,19 +900,9 @@
/* try to refill entire ring */
while (rx->fill_cnt != (rx->cnt + rx->mask + 1)) {
idx = rx->fill_cnt & rx->mask;
-
-   if ((bytes < MYRI10GE_ALLOC_SIZE / 2) &&
-   (rx->page_offset + bytes <= MYRI10GE_ALLOC_SIZE)) {
+   if (rx->page_offset + bytes <= MYRI10GE_ALLOC_SIZE) {
/* we can use part of previous page */
get_page(rx->page);
-#if MYRI10GE_ALLOC_SIZE > 4096
-   /* Firmware cannot cross 4K boundary.. */
-   if ((rx->page_offset >> 12) !=
-   ((rx->page_offset + bytes - 1) >> 12)) {
-   rx->page_offset =
-   (rx->page_offset + bytes) & ~4095;
-   }
-#endif
} else {
/* we need a new page */
page =
@@ -941,6 +931,13 @@
 
/* start next packet on a cacheline boundary */
rx->page_offset += SKB_DATA_ALIGN(bytes);
+
+#if MYRI10GE_ALLOC_SIZE > 4096
+   /* don't cross a 4KB boundary */
+   if ((rx->page_offset >> 12) !=
+   ((rx->page_offset + bytes - 1) >> 12))
+   rx->page_offset = (rx->page_offset + 4096) & ~4095;
+#endif
rx->fill_cnt++;
 
/* copy 8 descriptors to the firmware at a time */
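
As a worked illustration of the boundary arithmetic in the second hunk, here
is a minimal userspace sketch.  The function names and the FW_BOUNDARY macro
are hypothetical; only the shift-by-12 comparison and the round-up to the
next 4KB block are taken from the patch.  It assumes each buffer is at most
4096 bytes, since a larger buffer always crosses a boundary.

```c
#include <assert.h>
#include <stddef.h>

#define FW_BOUNDARY 4096u	/* the firmware cannot cross this boundary */

/* Return 1 if a buffer of `bytes` bytes starting at `offset` would have
 * its first and last byte in different 4KB blocks. */
static int crosses_4k(size_t offset, size_t bytes)
{
	assert(bytes > 0);
	return (offset >> 12) != ((offset + bytes - 1) >> 12);
}

/* Offset at which the next `bytes`-sized buffer should start: unchanged
 * if it fits in the current 4KB block, otherwise the start of the next
 * 4KB block. */
static size_t next_rx_offset(size_t offset, size_t bytes)
{
	if (crosses_4k(offset, bytes))
		offset = (offset + FW_BOUNDARY) & ~(size_t)(FW_BOUNDARY - 1);
	return offset;
}
```

For example, a 1536-byte buffer at offset 3072 would end at byte 4607, so
its start is moved to 4096; a 2048-byte buffer at offset 2048 ends at byte
4095 and stays where it is.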




[PATCH 0/3] last myri10ge updates for 2.6.21

2007-04-10 Thread Brice Goglin
Hi Jeff,

In case it is not too late for 2.6.21, here are 3 minor fixes for myri10ge:
1. fix management of the firmware 4KB boundary crossing restriction
2. more Intel chipsets providing aligned PCIe completions
3. update driver version to 1.3.0-1.233

Thanks,
Brice



Re: [irda-users] [BUG] 2.6.20.1-rt8 irnet + pppd recursive spinlock...

2007-04-10 Thread Samuel Ortiz
Hi Guennadi,

On Sat, Apr 07, 2007 at 03:59:26AM +0300, Samuel Ortiz wrote:
> IMHO, irnet_flow_indication() should be called asynchronously by
> irttp_run_tx_queue(), through some bottom-half mechanism. That would fix your
> locking issues, and that would reduce the time we spend in the IrDA code with
> this lock taken.
> 
> I will try to come up with some patches for you later this weekend.
The patch below schedules irnet_flow_indication() asynchronously. Could
you please give it a try (it builds, but I couldn't test it...) ? :

diff --git a/include/net/irda/irttp.h b/include/net/irda/irttp.h
index a899e58..941f0f1 100644
--- a/include/net/irda/irttp.h
+++ b/include/net/irda/irttp.h
@@ -128,6 +128,7 @@ struct tsap_cb {
 
struct net_device_stats stats;
struct timer_list todo_timer; 
+   struct work_struct irnet_flow_work;   /* irttp asynchronous flow restart */
 
__u32 max_seg_size; /* Max data that fit into an IrLAP frame */
__u8  max_header_size;
diff --git a/net/irda/irnet/irnet.h b/net/irda/irnet/irnet.h
diff --git a/net/irda/irttp.c b/net/irda/irttp.c
index 7069e4a..a0d0f26 100644
--- a/net/irda/irttp.c
+++ b/net/irda/irttp.c
@@ -367,6 +367,29 @@ static int irttp_param_max_sdu_size(void *instance, irda_param_t *param,
 /*** CLIENT CALLS ***/
 /** LMP CALLBACKS **/
 /* Everything is happily mixed up. Waiting for next clean up - Jean II */
+static void irttp_flow_restart(struct work_struct *work)
+{
+   struct tsap_cb * self =
+   container_of(work, struct tsap_cb, irnet_flow_work);
+
+   if (self == NULL)
+   return;
+
+   /* Check if we can accept more frames from client. */
+   if ((self->tx_sdu_busy) &&
+   (skb_queue_len(&self->tx_queue) < TTP_TX_LOW_THRESHOLD) &&
+   (!self->close_pend)) {
+   if (self->notify.flow_indication)
+   self->notify.flow_indication(self->notify.instance,
+self, FLOW_START);
+
+   /* self->tx_sdu_busy is the state of the client.
+* We don't really have a race here, but it's always safer
+* to update our state after the client - Jean II */
+   self->tx_sdu_busy = FALSE;
+   }
+}
+
 
 /*
  * Function irttp_open_tsap (stsap, notify)
@@ -402,6 +425,8 @@ struct tsap_cb *irttp_open_tsap(__u8 stsap_sel, int credit, notify_t *notify)
self->todo_timer.data = (unsigned long) self;
self->todo_timer.function = &irttp_todo_expired;
 
+   INIT_WORK(&self->irnet_flow_work, irttp_flow_restart);
+
/* Initialize callbacks for IrLMP to use */
irda_notify_init(&ttp_notify);
ttp_notify.connect_confirm = irttp_connect_confirm;
@@ -761,25 +786,10 @@ static void irttp_run_tx_queue(struct tsap_cb *self)
self->stats.tx_packets++;
}
 
-   /* Check if we can accept more frames from client.
-* We don't want to wait until the todo timer to do that, and we
-* can't use tasklets (grr...), so we are obliged to give control
-* to client. That's ok, this test will be true not too often
-* (max once per LAP window) and we are called from places
-* where we can spend a bit of time doing stuff. - Jean II */
if ((self->tx_sdu_busy) &&
(skb_queue_len(&self->tx_queue) < TTP_TX_LOW_THRESHOLD) &&
(!self->close_pend))
-   {
-   if (self->notify.flow_indication)
-   self->notify.flow_indication(self->notify.instance,
-self, FLOW_START);
-
-   /* self->tx_sdu_busy is the state of the client.
-* We don't really have a race here, but it's always safer
-* to update our state after the client - Jean II */
-   self->tx_sdu_busy = FALSE;
-   }
+   schedule_work(&self->irnet_flow_work);
 
/* Reset lock */
self->tx_queue_lock = 0;
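
To illustrate why deferring the flow indication helps, here is a toy,
single-threaded userspace model.  It is not kernel code and does not use the
workqueue API; every name in it (toy_lock, toy_work, toy_tsap and so on) is
hypothetical.  It assumes the client callback ends up needing a lock that
the transmit path already holds, which is the recursive-locking pattern
described in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* A non-recursive lock modelled as a flag (hypothetical). */
struct toy_lock { int held; };

typedef void (*work_fn)(void *ctx);

/* A one-slot "work queue": at most one pending callback (hypothetical). */
struct toy_work {
	work_fn fn;
	void *ctx;
	int pending;
};

struct toy_tsap {
	struct toy_lock lock;
	struct toy_work flow_work;
	int tx_sdu_busy;
	int recursion_detected;
	int flow_started;
};

/* Returns 0 on success, -1 if the lock is already held. */
static int toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return -1;
	l->held = 1;
	return 0;
}

static void toy_unlock(struct toy_lock *l) { l->held = 0; }

/* Client flow callback: it re-enters a path that needs the lock. */
static void flow_restart(void *ctx)
{
	struct toy_tsap *self = ctx;

	if (toy_trylock(&self->lock) != 0) {
		/* A real spinlock would deadlock here. */
		self->recursion_detected = 1;
		return;
	}
	self->flow_started = 1;
	self->tx_sdu_busy = 0;
	toy_unlock(&self->lock);
}

static void toy_schedule_work(struct toy_work *w) { w->pending = 1; }

/* Run the pending callback, if any, outside the transmit path. */
static void toy_run_pending(struct toy_work *w)
{
	if (w->pending) {
		w->pending = 0;
		w->fn(w->ctx);
	}
}

/* Transmit path: call the callback directly (defer == 0, the old
 * behaviour) or only schedule it (defer != 0, as in the patch). */
static void run_tx_queue(struct toy_tsap *self, int defer)
{
	int rc = toy_trylock(&self->lock);

	assert(rc == 0);
	(void)rc;
	if (self->tx_sdu_busy) {
		if (defer)
			toy_schedule_work(&self->flow_work);
		else
			flow_restart(self);
	}
	toy_unlock(&self->lock);
}
```

With defer set to 0 the callback runs while the lock is held and the model
flags the recursion; with defer set to 1 the callback runs later from
toy_run_pending(), after the lock has been released.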



Support for Asix AX88796

2007-04-10 Thread Thomas Langås

I've got a DVB-C box called the Dreambox DM500-C, and it uses the
AX88796 chip for networking. Dream Multimedia, the makers of the
Dreambox, have patched the NE2000 driver in the kernel (ne.c), but only
enough to get it working.  As a result it doesn't seem to manage more than
20-30 Mbps.

I found a driver [1] that I managed to patch into the Linux 2.6.9 kernel, but
upon booting the Dreambox it doesn't detect the network card.  I suspect
this is because of the following (found inside the ne.c file used by the
Dreambox):

static int __init do_ne_probe(struct net_device *dev)
{
#if defined(CONFIG_DM7020)
  unsigned int base_addr = dev->base_addr = (unsigned int)ioremap(0xf200,4096);
  __u8 *p = (__u8*)base_addr;
  if (*p == 0xFF)
  {
  shift_five=0;
  base_addr = dev->base_addr = (unsigned int)ioremap(0xf2000600,4096);
  }
#elif defined(CONFIG_DM56xx)
  unsigned int base_addr = dev->base_addr = (unsigned int)ioremap(0xf2000600,4096);
#else
  unsigned int base_addr = dev->base_addr;
#endif

#ifndef MODULE
  int orig_irq = dev->irq = 29;
#endif


CONFIG_DM56xx is enabled, so there seems to be a hardcoded base_addr
used to probe for the network chip (which is no surprise, since this is for an
embedded device).  But I'm a bit puzzled as to where I should put this in the
driver below, because it didn't seem to start probing at all when I
added printk's to try and debug.

Any pointers at all on where to start would be helpful. :-)

In advance, thanks!

Links:
[1] - http://lwn.net/Articles/159224/

--
Thomas


[PATCH] NET : loopback driver can use loopback_dev integrated net_device_stats

2007-04-10 Thread Eric Dumazet
Hi David

Please find this patch against net-2.6.22

Thank you

[PATCH] NET : loopback driver can use loopback_dev integrated net_device_stats

Rusty added a new 'stats' field to struct net_device.

loopback driver can use it instead of declaring another struct net_device_stats
This saves some memory.

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 6df673a..6ba6ed2 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -164,11 +164,9 @@ #endif
return 0;
 }
 
-static struct net_device_stats loopback_stats;
-
 static struct net_device_stats *get_stats(struct net_device *dev)
 {
-   struct net_device_stats *stats = &loopback_stats;
+   struct net_device_stats *stats = &dev->stats;
unsigned long bytes = 0;
unsigned long packets = 0;
int i;
@@ -208,7 +206,6 @@ static const struct ethtool_ops loopback
 struct net_device loopback_dev = {
.name   = "lo",
.get_stats  = &get_stats,
-   .priv   = &loopback_stats,
.mtu= (16 * 1024) + 20 + 20 + 12,
.hard_start_xmit= loopback_xmit,
.hard_header= eth_header,


Re: [PATCH] fix MCA when shutting down tulip quad-NIC

2007-04-10 Thread Olaf Hering
On Thu, Apr 05, Valerie Henson wrote:

> On Tue, Apr 03, 2007 at 11:19:16PM +0200, Olaf Hering wrote:
> > From: [EMAIL PROTECTED]
> > 
> >  https://bugzilla.novell.com/show_bug.cgi?id=SUSE39204
> 
> Wow, registering for Novell's bugzilla is painful.  And in the end I
> get "Access denied" on that bug.  Can you give us this information
> some other way?

I did not see an easy way to make the bug public other than moving it to 
the openSuSE category.

> > Shutting down the network causes an MCA because of an IO TLB error when
> > a DEC quad 10/100 card is in any slot.  This problem was originally seen
> > on an HP rx4640.
> 
> I'm not clear on why pci_disable_device() would fix this bug.  Do you
> have an explanation (or can copy one out of the bug report)?  I'm
> hesitant to make even obviously correct changes to the tulip driver
> without good evidence, given the incredible variety of buggy hardware
> out there.

The comments in the bug do not have a detailed analysis.
One of the comments is:

...
Comment #1 From Andrew Patterson 2004-04-20 19:47:32 MST [reply]

1. ifdown the interfaces; then ifup them
2. do this in a loop from a script and it generally MCA's within 2 minutes.
...

The first version for 2.6.5 contained the pci_disable_device()
and a version which was committed to mainline:

http://git.kernel.org/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=6379dd571265528f3911b9deafe2a29af2e71a2b

Later the patch contained just the pci_disable_device() call.

Andrew, does your testscript still fail in SLES10 or mainline?

> This looks to me like another iteration of the shutdown DMA/irq race
> at first glance.  Grant has a patch for it; I'm working on one I
> consider cleaner.

That's likely the same issue.
http://www.linuxarkivet.se/mlists/linux-net/0409/msg00173.html


RE: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

2007-04-10 Thread Waskiewicz Jr, Peter P
> Peter P Waskiewicz Jr wrote:
> > +   /* To retrieve statistics per subqueue - FOR FUTURE USE */
> > +   struct net_device_stats* (*get_subqueue_stats)(struct 
> net_device *dev,
> > +   int 
> queue_index);
> 
> 
> Please no future use stuff, just add it when you need it.

Gotcha.  I'll remove this.

> 
> > diff --git a/net/core/dev.c b/net/core/dev.c index 219a57f..c11c8fa 
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3326,12 +3328,23 @@ struct net_device *alloc_netdev(int 
> sizeof_priv, const char *name,
> > if (sizeof_priv)
> > dev->priv = netdev_priv(dev);
> >  
> > +   alloc_size = (sizeof(struct net_device_subqueue) * queue_count);
> > + 
> > +   p = kzalloc(alloc_size, GFP_KERNEL);
> > +   if (!p) {
> > +   printk(KERN_ERR "alloc_netdev: Unable to 
> allocate queues.\n");
> > +   return NULL;
> 
> 
> This leaks the device. You treat every single-queue device as 
> having a single subqueue. If it doesn't get too ugly it would 
> be nice to avoid this and only allocate the subqueue states 
> for real multiqueue devices.

We went back and forth on this.  The reason we allocate a queue in every
case, even on single-queue devices, was to make the stack not have
branching for multiqueue and non-multiqueue devices.  If we don't have
at least one queue on a device, then we can't have
netif_subqueue_stopped() in the hotpath unless we check if a device is
multiqueue before.  The original patches I released had this branching,
and I was asked to not do that.  I'd also like to see all queue-related
stuff be pulled from net_device and put into net_device_subqueue at some
point, even for single-queue devices.  Thoughts?
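
To make the two points above concrete (every device gets at least one
subqueue so the hot path needs no multiqueue branch, and the error path must
not leak the device), here is a minimal userspace sketch.  The toy_ types
and functions are hypothetical stand-ins, not the structures from the patch,
and treating a queue count of 0 as 1 is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the structures under discussion. */
struct toy_subqueue { int stopped; };

struct toy_netdev {
	struct toy_subqueue *egress_subqueue;
	unsigned int egress_subqueue_count;
};

/* Allocate a device that always has at least one subqueue. */
static struct toy_netdev *toy_alloc_netdev_mq(unsigned int queue_count)
{
	struct toy_netdev *dev;

	if (queue_count == 0)
		queue_count = 1;	/* assumption: single-queue devices get one */

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;

	dev->egress_subqueue = calloc(queue_count, sizeof(*dev->egress_subqueue));
	if (!dev->egress_subqueue) {
		free(dev);	/* do not leak the device on failure */
		return NULL;
	}
	dev->egress_subqueue_count = queue_count;
	return dev;
}

static void toy_free_netdev(struct toy_netdev *dev)
{
	if (!dev)
		return;
	free(dev->egress_subqueue);
	free(dev);
}

/* Hot-path check: valid for every device, no "is it multiqueue?" branch. */
static int toy_subqueue_stopped(const struct toy_netdev *dev, unsigned int idx)
{
	assert(idx < dev->egress_subqueue_count);
	return dev->egress_subqueue[idx].stopped;
}
```

The cost of this design is one small allocation per single-queue device;
the alternative raised in the review is to allocate subqueue state only for
real multiqueue devices and accept the extra branch.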

> 
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -133,7 +133,8 @@ static inline int qdisc_restart(struct 
> net_device *dev)
> > /* And release queue */
> > spin_unlock(&dev->queue_lock);
> >  
> > -   if (!netif_queue_stopped(dev)) {
> > +   if (!netif_queue_stopped(dev) &&
> > +   !netif_subqueue_stopped(dev, 
> skb->queue_mapping)) {
> > int ret;
> >  
> > ret = dev_hard_start_xmit(skb, 
> dev); @@ -149,7 +150,6 @@ static 
> > inline int qdisc_restart(struct net_device *dev)
> > goto collision;
> > }
> > }
> > -
> 
> 
> Unrelated whitespace change.

I'll fix that.

> 
> > /* NETDEV_TX_BUSY - we need to requeue */
> > /* Release the driver */
> > if (!nolock) {
> > diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index 
> > 5cfe60b..7365621 100644
> > --- a/net/sched/sch_prio.c
> > +++ b/net/sched/sch_prio.c
> > @@ -43,6 +43,7 @@ struct prio_sched_data
> > struct tcf_proto *filter_list;
> > u8  prio2band[TC_PRIO_MAX+1];
> > struct Qdisc *queues[TCQ_PRIO_BANDS];
> > +   u16 band2queue[TC_PRIO_MAX + 1];
> >  };
> >  
> >  
> > @@ -63,20 +64,26 @@ prio_classify(struct sk_buff *skb, 
> struct Qdisc *sch, int *qerr)
> > case TC_ACT_SHOT:
> > return NULL;
> > };
> > -
> 
> Same here

I'll fix that too.

> 
> > if (!q->filter_list ) {
> >  #else
> > if (!q->filter_list || tc_classify(skb, 
> q->filter_list, &res)) {  
> > #endif
> > if (TC_H_MAJ(band))
> > band = 0;
> > +   skb->queue_mapping =
> > + 
> q->prio2band[q->band2queue[band&TC_PRIO_MAX]];
> > +
> 
> 
> Does this needs to be cleared at some point again? TC actions 
> might redirect or mirror packets to other (multiqueue) devices.

If an skb is redirected to another device, the skb should be filtered
through that device's qdisc, yes?

> 
> > @@ -242,6 +259,30 @@ static int prio_tune(struct Qdisc 
> *sch, struct rtattr *opt)
> > }
> > }
> > }
> > +   /* setup queue to band mapping */
> > +   if (q->bands < sch->dev->egress_subqueue_count) {
> > +   qmapoffset = 1;
> > +   mod = 0;
> > +   } else {
> > +   mod = q->bands % sch->dev->egress_subqueue_count;
> > +   qmapoffset = q->bands / 
> sch->dev->egress_subqueue_count +
> > +   ((mod) ? 1 : 0);
> > +   }
> > +
> > +   queue = 0;
> > +   offset = 0;
> > +   for (i = 0; i < q->bands; i++) {
> > +   q->band2queue[i] = queue;
> > +   if ( ((i + 1) - offset) == qmapoffset) {
> > +   queue++;
> > +   offset += qmapoffset;
> > +   if (mod)
> > +   mod--;
> > +   qmapoffset = q->bands /
> > +   sch->dev->egress_subqueue_count +
> > +   ((mod) ? 1 : 0);
> > +  

Re: [PATCH 2/2] NET: Multiqueue network device support implementation.

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 08:41:49AM -0700, Waskiewicz Jr, Peter P ([EMAIL PROTECTED]) wrote:
> > On Mon, Apr 09, 2007 at 02:28:41PM -0700, Peter P Waskiewicz 
> > Jr ([EMAIL PROTECTED]) wrote:
> > > + alloc_size = (sizeof(struct net_device_subqueue) * queue_count);
> > > + 
> > > + p = kzalloc(alloc_size, GFP_KERNEL);
> > > + if (!p) {
> > > + printk(KERN_ERR "alloc_netdev: Unable to 
> > allocate queues.\n");
> > > + return NULL;
> > 
> > I think you either do not want to print it, or want 
> > additional details about device...
> 
> Ok.  This is essentially the same output printed if the netdev itself
> cannot be allocated.  Should I update both strings to have more
> device-specific information?

I wonder whether that failure is ever possible with GFP_KERNEL...

I think a separate patch would be OK.

-- 
Evgeniy Polyakov


RE: [PATCH 2/2] NET: Multiqueue network device support implementation.

2007-04-10 Thread Waskiewicz Jr, Peter P
> On Mon, Apr 09, 2007 at 02:28:41PM -0700, Peter P Waskiewicz 
> Jr ([EMAIL PROTECTED]) wrote:
> > +   alloc_size = (sizeof(struct net_device_subqueue) * queue_count);
> > + 
> > +   p = kzalloc(alloc_size, GFP_KERNEL);
> > +   if (!p) {
> > +   printk(KERN_ERR "alloc_netdev: Unable to 
> allocate queues.\n");
> > +   return NULL;
> 
> I think you either do not want to print it, or want 
> additional details about device...

Ok.  This is essentially the same output printed if the netdev itself
cannot be allocated.  Should I update both strings to have more
device-specific information?

> 
> > +   }
> > + 
> > +   dev->egress_subqueue = p;
> > +   dev->egress_subqueue_count = queue_count;
> > +
> > dev->get_stats = maybe_internal_stats;
> > setup(dev);
> > strcpy(dev->name, name);
> > return dev;
> >  }
> > -EXPORT_SYMBOL(alloc_netdev);
> > +EXPORT_SYMBOL(alloc_netdev_mq);
> >  
> >  /**
> >   * free_netdev - free network device
> > @@ -3345,6 +3358,7 @@ void free_netdev(struct net_device *dev)  {  
> > #ifdef CONFIG_SYSFS
> > /*  Compatibility with error handling in drivers */
> > +   kfree((char *)dev->egress_subqueue);
> > if (dev->reg_state == NETREG_UNINITIALIZED) {
> > kfree((char *)dev - dev->padded);
> > return;
> > @@ -3356,6 +3370,7 @@ void free_netdev(struct net_device *dev)
> > /* will free via device release */
> > put_device(&dev->dev);
> >  #else
> > +   kfree((char *)dev->egress_subqueue);
> 
> Still casting :)

The latest repost removes these casts.


Thanks for the feedback,

-PJ Waskiewicz


Re: two gateways with one NIC

2007-04-10 Thread Ben Greear

W Agtail wrote:
> On Mon, 2007-04-09 at 11:11 -0700, Ben Greear wrote:
> > W Agtail wrote:
> > > Nice one, but unfortunately still doesn't work.
> > > I'm now not seeing any marked messages in /var/log/messages and traffic
> > > still going via gw2 for port 8088.
> >
> > Maybe you could use something like my mac-vlan virtual device to make
> > your single NIC look like two NICs?  You can find links to the patch and
> > the macvlan-config tool on this page:
> >
> > http://www.candelatech.com/~greear/vlan.html
> >
> > Ben
>
> Thanks Ben, this looks quite an interesting idea.
> Is it possible to create /etc/sysconfig/network-scripts/* in the same
> way as ethN:N scripts I wonder?

No idea...I create these things using mvl_config tool.  At the least,
you could edit rc.local or similar.

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com





Re: [PATCH] net: Add etun driver

2007-04-10 Thread John W. Linville
On Tue, Apr 10, 2007 at 02:05:46PM +0200, Patrick McHardy wrote:

> So simply put: if I can implement support for
> "ip wireless add dev wlan0 mode managed essid ... key ..."
> in less than 100 lines and get a working connection afterwards, it
> seems worth it.

ACK

-- 
John W. Linville
[EMAIL PROTECTED]


Re: [PATCH] net: Add etun driver

2007-04-10 Thread Johannes Berg
On Tue, 2007-04-10 at 14:05 +0200, Patrick McHardy wrote:
> 
> I know too little about wireless to judge really this. My opinion is
> that if it is possible to add and configure an interface (even if
> only for simple cases) without knowledge about driver internals by
> setting a few parameters, it would probably make sense to use
> RTM_NEWLINK as well. If a userspace daemon or complex knowledge of
> driver internals is needed, it probably should stay seperated.

It still is though we're moving towards the userspace daemon thing.

> So simply put: if I can implement support for
> "ip wireless add dev wlan0 mode managed essid ... key ..."
> in less than 100 lines and get a working connection afterwards, it
> seems worth it.

For completely open networks (no "key ...") or just WEP (static key) it
could probably be done, but I don't see how right now.

johannes




Re: two gateways with one NIC

2007-04-10 Thread W Agtail
On Mon, 2007-04-09 at 11:11 -0700, Ben Greear wrote:
> W Agtail wrote:
> > Nice one, but unfortunately still doesn't work.
> > I'm now not seeing any marked messages in /var/log/messages and traffic
> > still going via gw2 for port 8088.
> 
> Maybe you could use something like my mac-vlan virtual device to make
> your single NIC look like two NICs?  You can find links to the patch and
> the macvlan-config tool on this page:
> 
> http://www.candelatech.com/~greear/vlan.html
> 
> Ben
> 
> 

Thanks Ben, this looks quite an interesting idea.
Is it possible to create /etc/sysconfig/network-scripts/* in the same
way as ethN:N scripts I wonder?



Re: [PATCH] CONFIG_PACKET_MMAP should depend on MMU

2007-04-10 Thread David Howells
Robin Getz <[EMAIL PROTECTED]> wrote:

> David - I know you have been reworking the noMMU vma handling - is there a 
> solution to vm_insert_page?

The reason vm_insert_page() is being called, I imagine, is because
packet_mmap() has to insert mappings to an already existing buffer.  All it
does is munge the PTEs in that virtual region to point to the buffer.

As long as the buffer is completely contiguous (which I don't know for
certain), then this function can be trivially reduced in NOMMU-mode to
something that just returns the address of the requested part of the buffer.
No remapping would be required.

However...  If the buffer is *not* completely contiguous, then you can still
perform mmaps of it - but only where the desired part _is_ contiguous.
Alternatively, you can arrange for the buffer to be completely contiguous
upfront.
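
The contiguity condition described above can be sketched in userspace as
follows.  This is an illustration only: the function names are hypothetical,
the block vector merely imitates the idea of a pg_vec-style array of
equally sized blocks, and nothing here is taken from af_packet.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if blocks first..last (inclusive) sit back to back in memory. */
static int blocks_contiguous(char *const *vec, size_t first, size_t last,
			     size_t block_size)
{
	size_t i;

	for (i = first + 1; i <= last; i++)
		if ((uintptr_t)vec[i] != (uintptr_t)vec[i - 1] + block_size)
			return 0;
	return 1;
}

/* "Map" the byte range [offset, offset + len) of the ring without any
 * remapping: return the address of the requested part if the blocks it
 * touches are contiguous, or NULL if they are not. */
static char *nommu_map_range(char *const *vec, size_t count, size_t block_size,
			     size_t offset, size_t len)
{
	size_t first, last;

	assert(len > 0 && block_size > 0);
	if (offset + len > count * block_size)
		return NULL;	/* range runs past the end of the ring */

	first = offset / block_size;
	last = (offset + len - 1) / block_size;
	if (!blocks_contiguous(vec, first, last, block_size))
		return NULL;

	return vec[first] + offset % block_size;
}
```

If the whole ring is allocated contiguously up front, the check always
succeeds and the function reduces to simple pointer arithmetic.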

Looking at alloc_pg_vec() in af_packet.c, I will place my bets on the latter
case.  I don't know that this is a problem; it depends on how things work, and
that I don't know offhand.  If someone can give me a simple test program, I
would be able to evaluate it better.

David


Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity

Evgeniy Polyakov wrote:
> On Tue, Apr 10, 2007 at 03:17:45PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> > > Check a link please in case we are talking about different ideas:
> > > http://marc.info/?l=linux-netdev&m=112262743505711&w=2
> >
> > I don't really understand what you're testing there.  in particular, how
> > can the copying time change so dramatically depending on whether you've
> > just rebooted or not?
>
> I tested page remapping time - i.e. time to replace a page in two
> different mappings - the same should be performed in host and guest
> kernels if such design is going to be used for communication.
>
> I can only explain after-reboot slow copy with empty caches - arbitrary
> kernel pages were copied into buffer (not the same data as in posted
> code).

Doing this in kvm would be significantly more complex, as we'd need to
use full reverse mapping to locate all guest mappings (we already
reverse map writable pages for other reasons), so the 25-50% difference
might be nullified or even turn into overhead.

Here are the Xen numbers for reference.  Xen probably has more overhead
than kvm for such things, though, as it needs to do hypercalls from dom0
which is in-kernel for kvm.

http://lists.xensource.com/archives/html/xen-devel/2007-03/msg01218.html

--
error compiling committee.c: too many arguments to function



Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 03:17:45PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> >Check a link please in case we are talking about different ideas:
> >http://marc.info/?l=linux-netdev&m=112262743505711&w=2
> >
> >  
> 
> I don't really understand what you're testing there.  in particular, how 
> can the copying time change so dramatically depending on whether you've 
> just rebooted or not?
 
I tested page remapping time - i.e. time to replace a page in two
different mappings - the same should be performed in host and guest
kernels if such design is going to be used for communication.

I can only explain after-reboot slow copy with empty caches - arbitrary
kernel pages were copied into buffer (not the same data as in posted
code).

-- 
Evgeniy Polyakov


Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity

Evgeniy Polyakov wrote:
>> This is what Xen does.  It is actually less performant than copying, IIRC.
>>
>> The problem with flipping pages around is that physical addresses are 
>> cached both in the kvm mmu and in the on-chip tlbs, necessitating 
>> expensive page table walks and tlb invalidation IPIs.
>
> Hmm, I'm not familiar with Xen driver, but similar technique was used
> with zero-copy network sniffer some time ago, substituting userspace
> pages with pages containing skb data was about 25-50% faster than
> copying 1500 bytes in general, and in order of 10 times faster in some
> cases.
>
> Check a link please in case we are talking about different ideas:
> http://marc.info/?l=linux-netdev&m=112262743505711&w=2

I don't really understand what you're testing there.  in particular, how 
can the copying time change so dramatically depending on whether you've 
just rebooted or not?

--
error compiling committee.c: too many arguments to function



Re: [PATCH] net: Add etun driver

2007-04-10 Thread Patrick McHardy
Johannes Berg wrote:
> Fair enough. Then the question however remains whether wireless should
> try to do all things it needs in one or try leveraging multiple things
> from other places. Thoughts?


I know too little about wireless to really judge this. My opinion is
that if it is possible to add and configure an interface (even if
only for simple cases) without knowledge about driver internals by
setting a few parameters, it would probably make sense to use
RTM_NEWLINK as well. If a userspace daemon or complex knowledge of
driver internals is needed, it should probably stay separate.

So simply put: if I can implement support for
"ip wireless add dev wlan0 mode managed essid ... key ..."
in less than 100 lines and get a working connection afterwards, it
seems worth it.



Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 02:21:24PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> >You want to implement zero-copy network device between host and guest, if
> >I understood this thread correctly?
> >So, for sending part, device allocates pages from receiver's memory (or
> >from shared memory), receiver gets an 'interrupt' and got pages from own
> >memory, which are attached to new skb and transferred up to the network
> >stack.
> >It can be extended to use shared ring of pages.
> >  
> 
> This is what Xen does.  It is actually less performant than copying, IIRC.
> 
> The problem with flipping pages around is that physical addresses are 
> cached both in the kvm mmu and in the on-chip tlbs, necessitating 
> expensive page table walks and tlb invalidation IPIs.

Hmm, I'm not familiar with Xen driver, but similar technique was used
with zero-copy network sniffer some time ago, substituting userspace
pages with pages containing skb data was about 25-50% faster than
copying 1500 bytes in general, and in order of 10 times faster in some
cases.

Check a link please in case we are talking about different ideas:
http://marc.info/?l=linux-netdev&m=112262743505711&w=2

-- 
Evgeniy Polyakov


Re: [PATCH] net: Add etun driver

2007-04-10 Thread Johannes Berg
On Tue, 2007-04-10 at 13:09 +0200, Patrick McHardy wrote:

> I know :) It was a few months ago when I noticed the new bonding
> sysfs interface when I first thought that we really need this.

:)

> > I don't think wireless can get away without a new tool. So much stuff
> > there. Look at
> > http://git.kernel.org/?p=linux/kernel/git/linville/wireless-dev.git;a=blob;f=include/linux/nl80211.h;hb=HEAD
> 
> 
> Maybe not wireless, but bonding, bridging, vlan, etun, possibly more.

Fair enough. Then the question however remains whether wireless should
try to do all things it needs in one or try leveraging multiple things
from other places. Thoughts?

johannes




Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity

Evgeniy Polyakov wrote:
>>> But it looks from this discussion, that it will not prevent from
>>> changing in-kernel driver - place a hook into skb allocation path and
>>> allocate data from opposing memory - get pages from another side and put
>>> them into fragments, then copy headers into skb->data.
>>
>> I don't understand this (opposing memory, another side?).  Can you
>> elaborate?
>
> You want to implement zero-copy network device between host and guest, if
> I understood this thread correctly?
> So, for sending part, device allocates pages from receiver's memory (or
> from shared memory), receiver gets an 'interrupt' and got pages from own
> memory, which are attached to new skb and transferred up to the network
> stack.
> It can be extended to use shared ring of pages.

This is what Xen does.  It is actually less performant than copying, IIRC.

The problem with flipping pages around is that physical addresses are 
cached both in the kvm mmu and in the on-chip tlbs, necessitating 
expensive page table walks and tlb invalidation IPIs.

Note that sending from the guest to an external host can be done 
copylessly, and for the receive side using a dma engine (like I/OAT) can 
reduce the cost of the copy.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] net: Add etun driver

2007-04-10 Thread Jeff Garzik

Patrick McHardy wrote:
> Maybe not wireless, but bonding, bridging, vlan, etun, possibly more.


[if I understand you correctly] I agree.  With ethtool, the idea is to 
have a single tool that supports multiple hardware platforms -- even to 
the point of introducing hardware-specific code into ethtool.


While I agree with wireless-dev that wireless /needs/ a separate tool, 
for things like bridging and vlan it would certainly be nice to mitigate 
the tool count explosion.


Jeff




Re: [PATCH] net: Add etun driver

2007-04-10 Thread Patrick McHardy
Johannes Berg wrote:
> On Tue, 2007-04-10 at 12:46 +0200, Patrick McHardy wrote:
> 
> 
>>The main advantage is that we don't get more weird sysfs/proc/ioctl based
>>interfaces
> 
> 
> Please don't put me into a corner I don't want to be in ;) The new
> wireless stuff was completely designed using netlink. The sysfs
> interface to these two specific things was a concession since it used to
> exist before and we don't really have a fully functional userspace tool
> yet.


I know :) It was a few months ago when I noticed the new bonding
sysfs interface when I first thought that we really need this.

>> and use the same interface that is used for all other network
>>configuration, which f.e. will allow to add support for all software
>>devices to iproute without much effort, so you don't need 30 different
>>tools for configuring the different software device types anymore.
>>Additionally we get atomic setup/dumps and extensibility.
> 
> 
> I don't think wireless can get away without a new tool. So much stuff
> there. Look at
> http://git.kernel.org/?p=linux/kernel/git/linville/wireless-dev.git;a=blob;f=include/linux/nl80211.h;hb=HEAD


Maybe not wireless, but bonding, bridging, vlan, etun, possibly more.


Re: [PATCH 2/2] NET: Multiqueue network device support implementation.

2007-04-10 Thread Herbert Xu
Waskiewicz Jr, Peter P <[EMAIL PROTECTED]> wrote:
>
>> >@@ -3356,6 +3370,7 @@ void free_netdev(struct net_device *dev)
>> > /* will free via device release */
>> > put_device(&dev->dev);
>> > #else
>> >+kfree((char *)dev->egress_subqueue);
>> > kfree((char *)dev - dev->padded);
>> > #endif
>> > }
>> 
>> Ahem. Explain that cast.
> 
>This can be removed if needed; however, I'm just copying what
> the other kfree()'s are doing in this function.  Any instance of a
> typecast that I introduced in these patches are just following what
> others have done in that section of the code.  So the cast is just for
> consistency in this particular area.  If you'd like me to remove it, I
> can do that.

The other cast is there for the subtraction, not the kfree...
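
To illustrate the point about the subtraction, here is a minimal userspace
sketch. The names demo_netdev, demo_alloc, demo_alloc_start and DEMO_ALIGN
are hypothetical stand-ins (not the kernel's code); the sketch assumes the
device structure sits at an aligned offset inside a larger allocation and
that "padded" records that offset in bytes.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_ALIGN 32   /* hypothetical alignment requirement */

struct demo_netdev {
	char name[16];
	unsigned short padded;  /* bytes between allocation start and dev */
};

/* Allocate a block and place the device at an aligned address inside it. */
static struct demo_netdev *demo_alloc(void **raw_out)
{
	void *raw = calloc(1, sizeof(struct demo_netdev) + DEMO_ALIGN - 1);
	uintptr_t addr;
	struct demo_netdev *dev;

	if (!raw)
		return NULL;
	addr = ((uintptr_t)raw + DEMO_ALIGN - 1) & ~(uintptr_t)(DEMO_ALIGN - 1);
	dev = (struct demo_netdev *)addr;
	dev->padded = (unsigned short)(addr - (uintptr_t)raw);
	if (raw_out)
		*raw_out = raw;
	return dev;
}

/*
 * The cast makes the subtraction count in bytes. Without it,
 * "dev - dev->padded" would step back padded * sizeof(*dev) bytes.
 */
static void *demo_alloc_start(struct demo_netdev *dev)
{
	return (char *)dev - dev->padded;
}

static void demo_free(struct demo_netdev *dev)
{
	/* free() takes void *, so a plain pointer argument needs no cast. */
	free(demo_alloc_start(dev));
}
```

A pointer that is handed to the free routine unchanged, as with the
subqueue array in the patch, converts to void * implicitly, so a cast
there adds nothing.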

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] net: Add etun driver

2007-04-10 Thread Johannes Berg
On Tue, 2007-04-10 at 12:46 +0200, Patrick McHardy wrote:

> Not totally different, so far I think we should use the same attributes
> as for RTM_SETLINK messages and include the device-specific stuff in
> IFLA_PROTINFO, which is symmetric to what the kernel sends in RTM_NEWLINK
> messages (see br_netlink.c for an example). The easiest case would be an
> empty IFLA_PROTINFO attribute, which would simply create a device
> without any configuration.

I'll have to look up these things.

> The main advantage is that we don't get more weird sysfs/proc/ioctl based
> interfaces

Please don't put me into a corner I don't want to be in ;) The new
wireless stuff was completely designed using netlink. The sysfs
interface to these two specific things was a concession since it used to
exist before and we don't really have a fully functional userspace tool
yet.

>  and use the same interface that is used for all other network
> configuration, which f.e. will allow to add support for all software
> devices to iproute without much effort, so you don't need 30 different
> tools for configuring the different software device types anymore.
> Additionally we get atomic setup/dumps and extensibility.

I don't think wireless can get away without a new tool. So much stuff
there. Look at
http://git.kernel.org/?p=linux/kernel/git/linville/wireless-dev.git;a=blob;f=include/linux/nl80211.h;hb=HEAD

johannes




Re: [PATCH] net: Add etun driver

2007-04-10 Thread Patrick McHardy
Johannes Berg wrote:
> On Tue, 2007-04-10 at 11:52 +0200, Patrick McHardy wrote:
> 
> 
>>Without having thought much about it yet, roughly like this:
>>
>>- driver receives RTM_NEWLINK message (under rtnl)
>>- driver allocates new device
>>- driver initializes device based on content of RTM_NEWLINK message
>>- driver returns
> 
> 
> Sounds good to me, but where's the advantage over something that isn't
> generic if RTM_NEWLINK contains totally different things depending on
> the subsystem like wireless where it'd have to contain the hardware
> identifier?


Not totally different, so far I think we should use the same attributes
as for RTM_SETLINK messages and include the device-specific stuff in
IFLA_PROTINFO, which is symmetric to what the kernel sends in RTM_NEWLINK
messages (see br_netlink.c for an example). The easiest case would be an
empty IFLA_PROTINFO attribute, which would simply create a device
without any configuration.
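
As a rough illustration of the message layout proposed above, the sketch
below fills a buffer with an RTM_NEWLINK request that carries an empty
IFLA_PROTINFO attribute. It only builds the bytes and sends nothing; the
helper name build_newlink_request is hypothetical, and whether any kernel
accepts such a request is exactly what is being discussed here, so treat
it as an assumption rather than a working interface.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Build "nlmsghdr + ifinfomsg + empty IFLA_PROTINFO" in buf.
 * buf must be suitably aligned for struct nlmsghdr.
 * Returns the message length, or 0 if buf is too small.
 */
static size_t build_newlink_request(void *buf, size_t buflen)
{
	struct nlmsghdr *nlh = buf;
	struct ifinfomsg *ifi;
	struct rtattr *rta;
	size_t len = NLMSG_LENGTH(sizeof(*ifi));         /* header + ifinfomsg */
	size_t total = NLMSG_ALIGN(len) + RTA_LENGTH(0); /* plus one empty attr */

	if (buflen < total)
		return 0;
	memset(buf, 0, total);

	nlh->nlmsg_type = RTM_NEWLINK;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;

	ifi = NLMSG_DATA(nlh);
	ifi->ifi_family = AF_UNSPEC;

	/* Attributes start at the aligned end of the ifinfomsg. */
	rta = (struct rtattr *)((char *)nlh + NLMSG_ALIGN(len));
	rta->rta_type = IFLA_PROTINFO;
	rta->rta_len = RTA_LENGTH(0);   /* attribute header only, no payload */

	nlh->nlmsg_len = (unsigned int)total;
	return total;
}
```

Device-specific settings would go into the payload of that attribute; the
empty case corresponds to creating a device without any configuration.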

The main advantage is that we don't get more weird sysfs/proc/ioctl based
interfaces and use the same interface that is used for all other network
configuration, which f.e. will allow to add support for all software
devices to iproute without much effort, so you don't need 30 different
tools for configuring the different software device types anymore.
Additionally we get atomic setup/dumps and extensibility.



Re: [PATCH] net: Add etun driver

2007-04-10 Thread Johannes Berg
On Tue, 2007-04-10 at 11:52 +0200, Patrick McHardy wrote:

> Without having thought much about it yet, roughly like this:
> 
> - driver receives RTM_NEWLINK message (under rtnl)
> - driver allocates new device
> - driver initializes device based on content of RTM_NEWLINK message
> - driver returns

Sounds good to me, but where's the advantage over something that isn't
generic if RTM_NEWLINK contains totally different things depending on
the subsystem like wireless where it'd have to contain the hardware
identifier?

johannes




Re: [PATCH] net: Add etun driver

2007-04-10 Thread Patrick McHardy
Johannes Berg wrote:
> On Tue, 2007-04-10 at 09:52 +0200, Patrick McHardy wrote:
> 
> 
>>Shouldn't be a problem either. Creating the device atomically also
>>prevents that anything else is setting them UP before they're fully
>>configured.
> 
> 
> How would you do it generically then? I'm absolutely not opposed to the
> idea but for now haven't seen how to do it.


Without having thought much about it yet, roughly like this:

- driver receives RTM_NEWLINK message (under rtnl)
- driver allocates new device
- driver initializes device based on content of RTM_NEWLINK message
- driver returns

Device creation won't be generic of course since only the driver knows
how to do that.
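
The steps above can be sketched in userspace roughly as follows. Everything
here is a hypothetical model: demo_dev, demo_newlink_msg, demo_link_ops and
demo_rtm_newlink are made-up names, and the "message" is a plain struct
rather than a real netlink message. The generic part only looks up the
driver by kind; allocation and initialization stay in the driver callback.

```c
#include <stdlib.h>
#include <string.h>

struct demo_dev {
	char name[16];
	int mtu;
};

/* Stand-in for the parsed contents of an RTM_NEWLINK message. */
struct demo_newlink_msg {
	const char *kind;   /* which driver should create the device */
	const char *name;
	int mtu;
};

struct demo_link_ops {
	const char *kind;
	struct demo_dev *(*newlink)(const struct demo_newlink_msg *msg);
};

/* Driver side: allocate, initialize from the message, return. */
static struct demo_dev *etun_newlink(const struct demo_newlink_msg *msg)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	strncpy(dev->name, msg->name, sizeof(dev->name) - 1);
	dev->mtu = msg->mtu;
	return dev;   /* fully configured before anyone else can see it */
}

static const struct demo_link_ops demo_ops[] = {
	{ "etun", etun_newlink },
};

/* Generic side: dispatch the message to the driver that owns the kind. */
static struct demo_dev *demo_rtm_newlink(const struct demo_newlink_msg *msg)
{
	size_t i;

	for (i = 0; i < sizeof(demo_ops) / sizeof(demo_ops[0]); i++)
		if (strcmp(demo_ops[i].kind, msg->kind) == 0)
			return demo_ops[i].newlink(msg);
	return NULL;   /* no driver registered for this kind */
}
```

Because the driver returns an already initialized device, nothing can set
it UP in a half-configured state, which is the atomicity argument made
earlier in the thread.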



Re: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

2007-04-10 Thread Patrick McHardy
Peter P Waskiewicz Jr wrote:
> + /* To retrieve statistics per subqueue - FOR FUTURE USE */
> + struct net_device_stats* (*get_subqueue_stats)(struct net_device *dev,
> + int queue_index);


Please no future use stuff, just add it when you need it.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 219a57f..c11c8fa 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3326,12 +3328,23 @@ struct net_device *alloc_netdev(int sizeof_priv, const char *name,
>   if (sizeof_priv)
>   dev->priv = netdev_priv(dev);
>  
> + alloc_size = (sizeof(struct net_device_subqueue) * queue_count);
> + 
> + p = kzalloc(alloc_size, GFP_KERNEL);
> + if (!p) {
> + printk(KERN_ERR "alloc_netdev: Unable to allocate queues.\n");
> + return NULL;


This leaks the device. You treat every single-queue device as having
a single subqueue. If it doesn't get too ugly it would be nice to avoid
this and only allocate the subqueue states for real multiqueue devices.
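
A userspace sketch of both points, assuming hypothetical names (mq_netdev,
mq_subqueue, mq_alloc_netdev, mq_free_netdev) and calloc/free in place of
the kernel allocator: the error path releases the device that was already
allocated, and the subqueue array is only allocated when more than one
queue is requested.

```c
#include <stdlib.h>

struct mq_subqueue {
	unsigned long state;
};

struct mq_netdev {
	struct mq_subqueue *egress_subqueue;   /* NULL for single-queue devices */
	unsigned int egress_subqueue_count;
};

static struct mq_netdev *mq_alloc_netdev(unsigned int queue_count)
{
	struct mq_netdev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;

	/* Only real multiqueue devices pay for the subqueue array. */
	if (queue_count > 1) {
		dev->egress_subqueue = calloc(queue_count,
					      sizeof(*dev->egress_subqueue));
		if (!dev->egress_subqueue) {
			free(dev);   /* do not leak the device on failure */
			return NULL;
		}
		dev->egress_subqueue_count = queue_count;
	}
	return dev;
}

static void mq_free_netdev(struct mq_netdev *dev)
{
	if (!dev)
		return;
	free(dev->egress_subqueue);   /* free(NULL) is a no-op */
	free(dev);
}
```

With this layout, code that checks per-queue state would have to treat a
NULL array as "no subqueues", which is the extra ugliness the review
comment asks to keep small.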

> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -133,7 +133,8 @@ static inline int qdisc_restart(struct net_device *dev)
>   /* And release queue */
>   spin_unlock(&dev->queue_lock);
>  
> - if (!netif_queue_stopped(dev)) {
> + if (!netif_queue_stopped(dev) &&
> + !netif_subqueue_stopped(dev, skb->queue_mapping)) {
>   int ret;
>  
>   ret = dev_hard_start_xmit(skb, dev);
> @@ -149,7 +150,6 @@ static inline int qdisc_restart(struct net_device *dev)
>   goto collision;
>   }
>   }
> -


Unrelated whitespace change.

>   /* NETDEV_TX_BUSY - we need to requeue */
>   /* Release the driver */
>   if (!nolock) {
> diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
> index 5cfe60b..7365621 100644
> --- a/net/sched/sch_prio.c
> +++ b/net/sched/sch_prio.c
> @@ -43,6 +43,7 @@ struct prio_sched_data
>   struct tcf_proto *filter_list;
>   u8  prio2band[TC_PRIO_MAX+1];
>   struct Qdisc *queues[TCQ_PRIO_BANDS];
> + u16 band2queue[TC_PRIO_MAX + 1];
>  };
>  
>  
> @@ -63,20 +64,26 @@ prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
>   case TC_ACT_SHOT:
>   return NULL;
>   };
> -

Same here

>   if (!q->filter_list ) {
>  #else
>   if (!q->filter_list || tc_classify(skb, q->filter_list, &res)) {
>  #endif
>   if (TC_H_MAJ(band))
>   band = 0;
> + skb->queue_mapping =
> +   q->prio2band[q->band2queue[band&TC_PRIO_MAX]];
> +


Does this need to be cleared at some point again? TC actions might
redirect or mirror packets to other (multiqueue) devices.

> @@ -242,6 +259,30 @@ static int prio_tune(struct Qdisc *sch, struct rtattr *opt)
>   }
>   }
>   }
> + /* setup queue to band mapping */
> + if (q->bands < sch->dev->egress_subqueue_count) {
> + qmapoffset = 1;
> + mod = 0;
> + } else {
> + mod = q->bands % sch->dev->egress_subqueue_count;
> + qmapoffset = q->bands / sch->dev->egress_subqueue_count +
> + ((mod) ? 1 : 0);
> + }
> +
> + queue = 0;
> + offset = 0;
> + for (i = 0; i < q->bands; i++) {
> + q->band2queue[i] = queue;
> + if ( ((i + 1) - offset) == qmapoffset) {
> + queue++;
> + offset += qmapoffset;
> + if (mod)
> + mod--;
> + qmapoffset = q->bands /
> + sch->dev->egress_subqueue_count +
> + ((mod) ? 1 : 0);
> + }
> + }


Besides being quite ugly, I don't think this does what you want.
For bands < queues we get band2queue[0] = 0, all others map to 1.
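
The behaviour described above can be reproduced in userspace. In the sketch
below, map_bands_quoted is a direct transcription of the quoted loop with
the device fields replaced by plain parameters, and map_bands_even is a
hypothetical simpler alternative (an assumption for illustration, not part
of the patch) that maps band i to queue i when there are at least as many
queues as bands and spreads bands proportionally otherwise.

```c
/* Transcription of the quoted mapping loop; bands and queues must be > 0. */
static void map_bands_quoted(unsigned short *band2queue, int bands, int queues)
{
	int mod, qmapoffset, queue = 0, offset = 0, i;

	if (bands < queues) {
		qmapoffset = 1;
		mod = 0;
	} else {
		mod = bands % queues;
		qmapoffset = bands / queues + (mod ? 1 : 0);
	}

	for (i = 0; i < bands; i++) {
		band2queue[i] = (unsigned short)queue;
		if ((i + 1) - offset == qmapoffset) {
			queue++;
			offset += qmapoffset;
			if (mod)
				mod--;
			/* For bands < queues this becomes 0 and never matches again. */
			qmapoffset = bands / queues + (mod ? 1 : 0);
		}
	}
}

/* Hypothetical alternative: one queue per band if possible, else scale. */
static void map_bands_even(unsigned short *band2queue, int bands, int queues)
{
	int i;

	for (i = 0; i < bands; i++)
		band2queue[i] = (unsigned short)
			(bands <= queues ? i : i * queues / bands);
}
```

With 3 bands and 4 queues the quoted loop yields 0, 1, 1 because qmapoffset
is recomputed as 3 / 4 = 0 after the first band, while the alternative
yields 0, 1, 2.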


Re: [PATCH] net: Add etun driver

2007-04-10 Thread Johannes Berg
On Tue, 2007-04-10 at 09:52 +0200, Patrick McHardy wrote:

> Shouldn't be a problem either. Creating the device atomically also
> prevents that anything else is setting them UP before they're fully
> configured.

How would you do it generically then? I'm absolutely not opposed to the
idea but for now haven't seen how to do it.

johannes




Re: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

2007-04-10 Thread Patrick McHardy
Waskiewicz Jr, Peter P wrote:
> Thanks Pat for the initial feedback.  I can post a set of patches to
> e1000 using the new API; I'll try to get them out asap (need to apply to
> this kernel tree).


Thanks.

> However, the PRIO qdisc still uses the priority in
> the bands for dequeueing priority, and will feed the queues on the NIC.
> The e1000, and any other multiqueue NIC, will schedule Tx based on how
> the PRIO qdisc feeds the queues.  So the only priority here is the
> dequeuing priority from the kernel.  The e1000 will use the new API for
> starting/stopping the individual queues based on the descriptors
> available, much like it does today for the global queue.


Packets will only be dequeued from a band if the associated subqueue
is active, which moves the decision from prio to the driver, no?
What policy does e1000 use for scheduling its internal queues?
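
To make the concern concrete, here is a toy model of a strict-priority
dequeue that skips bands whose subqueue is stopped. The names demo_band and
demo_prio_dequeue are hypothetical and the "packets" are just a counter;
the model only assumes what the patch description says, namely that a band
is not dequeued while its subqueue is stopped.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_band {
	int backlog;             /* packets waiting in this band */
	bool subqueue_stopped;   /* driver has stopped the mapped hardware queue */
};

/*
 * Walk bands from highest priority (index 0) downwards and dequeue one
 * packet from the first band that has backlog and an active subqueue.
 * Returns the band index used, or -1 if nothing could be dequeued.
 */
static int demo_prio_dequeue(struct demo_band *bands, size_t nbands)
{
	size_t i;

	for (i = 0; i < nbands; i++) {
		if (bands[i].subqueue_stopped)
			continue;   /* the driver, not prio, vetoes this band */
		if (bands[i].backlog > 0) {
			bands[i].backlog--;
			return (int)i;
		}
	}
	return -1;
}
```

If the driver keeps the highest-priority subqueue stopped, lower bands are
served first even though band 0 has backlog, so the effective ordering
depends on how the hardware drains its queues.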



Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 11:19:52AM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> I meant, network aio in the mainline kernel.  I am aware of the various
> out-of-tree implementations.

If potential users do not pay attention to the initial implementation, it is
quite hard for them to get into it. But actually it does not matter to this
discussion.

> > But it looks from this discussion, that it will not prevent from
> > changing in-kernel driver - place a hook into skb allocation path and
> > allocate data from opposing memory - get pages from another side and put
> > them into fragments, then copy headers into skb->data.
> >   
> 
> I don't understand this (opposing memory, another side?).  Can you
> elaborate?

You want to implement zero-copy network device between host and guest, if
I understood this thread correctly?
So, for sending part, device allocates pages from receiver's memory (or
from shared memory), receiver gets an 'interrupt' and got pages from own
memory, which are attached to new skb and transferred up to the network
stack.
It can be extended to use shared ring of pages.

-- 
Evgeniy Polyakov


Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Evgeniy Polyakov wrote:
> On Mon, Apr 09, 2007 at 04:38:18PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
>   
>>> But I don't get this "we can enhance the kernel but not userspace" vibe
>>> 8(
>>>  
>>>   
>> I've been waiting for network aio since ~2003.  If it arrives in the 
>> next few days, I'm all for it; much more than kvm can use it 
>> profitably.  But I'm not going to write that interface myself.
>> 
>
> Hmm, you missed at least two implementations of network aio in the 
> previous year, and now with syslets we can have third one.
>   

I meant, network aio in the mainline kernel.  I am aware of the various
out-of-tree implementations.

> But it looks from this discussion, that it will not prevent from
> changing in-kernel driver - place a hook into skb allocation path and
> allocate data from opposing memory - get pages from another side and put
> them into fragments, then copy headers into skb->data.
>   

I don't understand this (opposing memory, another side?).  Can you
elaborate?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH 2/2] NET: Multiqueue network device support implementation.

2007-04-10 Thread Evgeniy Polyakov
On Mon, Apr 09, 2007 at 02:28:41PM -0700, Peter P Waskiewicz Jr ([EMAIL PROTECTED]) wrote:
> + alloc_size = (sizeof(struct net_device_subqueue) * queue_count);
> + 
> + p = kzalloc(alloc_size, GFP_KERNEL);
> + if (!p) {
> + printk(KERN_ERR "alloc_netdev: Unable to allocate queues.\n");
> + return NULL;

I think you either do not want to print it, or want additional details
about device...

> + }
> + 
> + dev->egress_subqueue = p;
> + dev->egress_subqueue_count = queue_count;
> +
>   dev->get_stats = maybe_internal_stats;
>   setup(dev);
>   strcpy(dev->name, name);
>   return dev;
>  }
> -EXPORT_SYMBOL(alloc_netdev);
> +EXPORT_SYMBOL(alloc_netdev_mq);
>  
>  /**
>   *   free_netdev - free network device
> @@ -3345,6 +3358,7 @@ void free_netdev(struct net_device *dev)
>  {
>  #ifdef CONFIG_SYSFS
>   /*  Compatibility with error handling in drivers */
> + kfree((char *)dev->egress_subqueue);
>   if (dev->reg_state == NETREG_UNINITIALIZED) {
>   kfree((char *)dev - dev->padded);
>   return;
> @@ -3356,6 +3370,7 @@ void free_netdev(struct net_device *dev)
>   /* will free via device release */
>   put_device(&dev->dev);
>  #else
> + kfree((char *)dev->egress_subqueue);

Still casting :)


-- 
Evgeniy Polyakov


Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Mon, Apr 09, 2007 at 04:38:18PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> >But I don't get this "we can enhance the kernel but not userspace" vibe
> >8(
> >  
> 
> I've been waiting for network aio since ~2003.  If it arrives in the 
> next few days, I'm all for it; much more than kvm can use it 
> profitably.  But I'm not going to write that interface myself.

Hmm, you missed at least two implementations of network aio in the 
previous year, and now with syslets we can have third one.

But it looks from this discussion, that it will not prevent from
changing in-kernel driver - place a hook into skb allocation path and
allocate data from opposing memory - get pages from another side and put
them into fragments, then copy headers into skb->data.

-- 
Evgeniy Polyakov


Re: [PATCH] net: Add etun driver

2007-04-10 Thread Patrick McHardy
Johannes Berg wrote:
> Our virtual devices are always associated with a piece of hardware, and
> we really want them to be associated with that at all times, even when
> not UP. Everything else seems like a huge complication if only because
> then we can't have whoever will be responsible for the device allocate
> its private space area.


Shouldn't be a problem either. Creating the device atomically also
prevents that anything else is setting them UP before they're fully
configured.