Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-21 Thread David Miller
From: Dave Jones [EMAIL PROTECTED]
Date: Sat, 21 Oct 2006 01:00:16 -0400

 Practically no-one cared about it, so it bit-rotted really fast
 after we shipped RHEL4.  That, along with the focus shifting to
 making kdump work seemed to kill it off over the last 12 months.

Then we can truly kill off the -drop() callback as part
of Stephen's patches.

Stephen, I'll review your new set over the weekend.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Thu, 19 Oct 2006 10:15:43 -0700

 The original skb management for netpoll was a mess, it had two queue paths
 and a callback. This changes it to have a per-instance transmit queue
 and use a tasklet rather than a work queue for the congested case.
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

I think you mis-diffed this one:

-   WARN_ON(skb-protocol == 0);

That line doesn't exist in my copy of net/core/netpoll.c
even with your first patch applied.

Also, you forgot to remove the -drop callback pointer
from struct netpoll, which you should do if it really
isn't used any more.

I think you might run into problems there, as I believe the netdump
stuff does make non-trivial use of the -drop callback.  Indeed, it
uses the -dump callback for invoking a special
netpoll_start_netdump() function.  I'm pretty sure -dump was created
specifically to accomodate netdump.

So this is something else which will need to be worked out before we
can apply this patch.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 00:15:30 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Thu, 19 Oct 2006 10:15:43 -0700
 
  The original skb management for netpoll was a mess, it had two queue paths
  and a callback. This changes it to have a per-instance transmit queue
  and use a tasklet rather than a work queue for the congested case.
  
  Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
 
 I think you mis-diffed this one:
 
 - WARN_ON(skb-protocol == 0);
 
 That line doesn't exist in my copy of net/core/netpoll.c
 even with your first patch applied.
 
 Also, you forgot to remove the -drop callback pointer
 from struct netpoll, which you should do if it really
 isn't used any more.
 
 I think you might run into problems there, as I believe the netdump
 stuff does make non-trivial use of the -drop callback.  Indeed, it
 uses the -dump callback for invoking a special
 netpoll_start_netdump() function.  I'm pretty sure -dump was created
 specifically to accomodate netdump.
 

Netdump is not in the tree, so I can't fix it. Also netdump is pretty
much superseded by kdump.

 So this is something else which will need to be worked out before we
 can apply this patch.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 20 Oct 2006 08:18:57 -0700

 Netdump is not in the tree, so I can't fix it. Also netdump is pretty
 much superseded by kdump.

Unless kdump is %100 ready you can be sure vendors will ship netdump
for a little while longer.  I think gratuitously breaking netdump is
not the best idea.

It's not like netdump is some binary blob you can't get the source
to easily. :-)

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 12:24:27 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Fri, 20 Oct 2006 08:18:57 -0700
 
  Netdump is not in the tree, so I can't fix it. Also netdump is pretty
  much superseded by kdump.
 
 Unless kdump is %100 ready you can be sure vendors will ship netdump
 for a little while longer.  I think gratuitously breaking netdump is
 not the best idea.
 
 It's not like netdump is some binary blob you can't get the source
 to easily. :-)
 

Sorry, but why should we treat out-of-tree vendor code any differently
than out-of-tree other code.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 20 Oct 2006 08:40:15 -0700

 The only user of the drop hook was netconsole, and I fixed that path.
 This probably breaks netdump, but that is out of tree, so it needs
 to fix itself.

I believe that netdump needs to requeue things because dropping the
packet is simply not allowed, and the -drop callback gives the
netdump code a way to handle things without actually dropping the
packet.  If that's true, you can't just free the SKB on it.

Are you sure your new TX strategy can avoid such drops properly?

Please take a quick peek at the netdump code, it's available, and make
some reasonable effort to determine whether it can still work with
your new code.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 12:27:53 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Fri, 20 Oct 2006 08:40:15 -0700
 
  The only user of the drop hook was netconsole, and I fixed that path.
  This probably breaks netdump, but that is out of tree, so it needs
  to fix itself.
 
 I believe that netdump needs to requeue things because dropping the
 packet is simply not allowed, and the -drop callback gives the
 netdump code a way to handle things without actually dropping the
 packet.  If that's true, you can't just free the SKB on it.
 
 Are you sure your new TX strategy can avoid such drops properly?

Yes, it has a queue. if it can't send it waits and retries.

 
 Please take a quick peek at the netdump code, it's available, and make
 some reasonable effort to determine whether it can still work with
 your new code.

Where, I'm not digging in side some RHEL rpm patch pile to find it.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 20 Oct 2006 12:25:27 -0700

 Sorry, but why should we treat out-of-tree vendor code any
 differently than out-of-tree other code.

I think what netdump was trying to do, provide a way to
requeue instead of fully drop the SKB, is quite reasonable.
Don't you think?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 12:52:26 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Fri, 20 Oct 2006 12:25:27 -0700
 
  Sorry, but why should we treat out-of-tree vendor code any
  differently than out-of-tree other code.
 
 I think what netdump was trying to do, provide a way to
 requeue instead of fully drop the SKB, is quite reasonable.
 Don't you think?

Yes, but the queued vs non-queued stuff showed up out of order.
The queued messages go through the wrong Tx path. ie. they end up
going into to NIT etc, since the deferred send uses a work queue
it wouldn't work for last-gasp messages or netdump since getting
a work queue to run requires going back to scheduler and processes
running... and it should use skb_buff_head instead
of roll your own queueing.

The other alternative would be to make the send logic non-blocking
and fully push retry to the caller.

I'll make a fix to netdump, if I can find it.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 12:52:26 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Fri, 20 Oct 2006 12:25:27 -0700
 
  Sorry, but why should we treat out-of-tree vendor code any
  differently than out-of-tree other code.
 
 I think what netdump was trying to do, provide a way to
 requeue instead of fully drop the SKB, is quite reasonable.
 Don't you think?


Netdump doesn't even exist in the current Fedora source rpm.
I think Dave dropped it.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 20 Oct 2006 08:40:15 -0700

 -static void queue_process(void *p)
 +static void netpoll_run(unsigned long arg)
  {
 ...
 - spin_unlock_irqrestore(queue_lock, flags);
 + netif_tx_lock(dev);
 + if (netif_queue_stopped(dev) ||
 + dev-hard_start_xmit(skb, dev) != NETDEV_TX_OK) {
 + skb_queue_head(npinfo-tx_q, skb);
 + netif_tx_unlock(dev);
 + tasklet_schedule(npinfo-tx_task);
 + return;
 + }

We really can't handle TX stopped this way from the netpoll_send_skb()
path.  All that old retry logic in netpoll_send_skb() is really
necessary.

If we are in deep IRQ context, took an OOPS, and are trying to get a
netpoll packet out for the kernel log message, we have to try as hard
as possible to get the packet out then and there, even if that means
waiting some amount of time for netif_queue_stopped() to become false.

That is what the existing code is trying to do.

If you defer to a tasklet, the kernel state from the OOPS can be so
corrupted that the tasklet will never run and we'll never get the
netconsole message needed to debug the problem.

Also, if we tasklet schedule from the tasklet, we'll just keep looping
in the tasklet and never leave softirq context, which is also bad
behavior.  Even in the tasklet, we should spin and poll when possible
like the current netpoll_send_skb() code does.

So we really can't apply this patch.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 13:42:09 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Fri, 20 Oct 2006 08:40:15 -0700
 
  -static void queue_process(void *p)
  +static void netpoll_run(unsigned long arg)
   {
  ...
  -   spin_unlock_irqrestore(queue_lock, flags);
  +   netif_tx_lock(dev);
  +   if (netif_queue_stopped(dev) ||
  +   dev-hard_start_xmit(skb, dev) != NETDEV_TX_OK) {
  +   skb_queue_head(npinfo-tx_q, skb);
  +   netif_tx_unlock(dev);
  +   tasklet_schedule(npinfo-tx_task);
  +   return;
  +   }
 
 We really can't handle TX stopped this way from the netpoll_send_skb()
 path.  All that old retry logic in netpoll_send_skb() is really
 necessary.
 
 If we are in deep IRQ context, took an OOPS, and are trying to get a
 netpoll packet out for the kernel log message, we have to try as hard
 as possible to get the packet out then and there, even if that means
 waiting some amount of time for netif_queue_stopped() to become false.
 

But, it also violates the assumptions of the network devices.
It calls NAPI poll back with IRQ's disabled and potentially doesn't
obey the semantics about only running on the same CPU as the
received packet.

 That is what the existing code is trying to do.
 
 If you defer to a tasklet, the kernel state from the OOPS can be so
 corrupted that the tasklet will never run and we'll never get the
 netconsole message needed to debug the problem.

So we can try once and if that fails we have to defer to tasklet.
We can't call NAPI, we can't try and cleanup the device will need
an IRQ to get unblocked.  

Or add another device callback that just to handle that case.

 Also, if we tasklet schedule from the tasklet, we'll just keep looping
 in the tasklet and never leave softirq context, which is also bad
 behavior.  Even in the tasklet, we should spin and poll when possible
 like the current netpoll_send_skb() code does.
 
 So we really can't apply this patch.


-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Andi Kleen

 But, it also violates the assumptions of the network devices.
 It calls NAPI poll back with IRQ's disabled and potentially doesn't
 obey the semantics about only running on the same CPU as the
 received packet.

netpoll always played a little fast'n'lose with various locking rules.
Also often inside the drivers are a little sloppy in the poll path.

That's fine for getting an oops out, but risky for normal kernel
messages when the driver must be still working afterwards.

The standard console code makes this conditional on oops_in_progress -
it breaks locks when true and otherwise it follows the safe
approach.

Perhaps it would be better to use different paths in netconsole too 
depending  on whether oops_in_progress is true or not.  

e.g. if !oops_in_progress (use the standard output path)
else use current path.

That would avoid breaking the driver during normal operation
and also have the advantage that the normal netconsole messages
would go through the queueing disciplines etc.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Fri, 20 Oct 2006 13:48:26 -0700

 On Fri, 20 Oct 2006 13:42:09 -0700 (PDT)
 David Miller [EMAIL PROTECTED] wrote:
 
  We really can't handle TX stopped this way from the netpoll_send_skb()
  path.  All that old retry logic in netpoll_send_skb() is really
  necessary.
  
  If we are in deep IRQ context, took an OOPS, and are trying to get a
  netpoll packet out for the kernel log message, we have to try as hard
  as possible to get the packet out then and there, even if that means
  waiting some amount of time for netif_queue_stopped() to become false.
 
 But, it also violates the assumptions of the network devices.
 It calls NAPI poll back with IRQ's disabled and potentially doesn't
 obey the semantics about only running on the same CPU as the
 received packet.

Actually, all the locking here is fine, that's why it checks
poll_owner for current smp_processor_id().

Otherwise it is safe to sit and spin waiting for link up/down or
TX full conditions to pass.

 So we can try once and if that fails we have to defer to tasklet.

Not true, we should spin and retry for some reasonable amount of time.
That reasonable amount of time should be the maximum of the time
it takes to free up space from a TX full condition and the time it
takes to bring the link up.

That is what the current code is trying to do, and you've erroneously
deleted all of that logic.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread David Miller
From: Andi Kleen [EMAIL PROTECTED]
Date: Fri, 20 Oct 2006 23:01:29 +0200

 netpoll always played a little fast'n'lose with various locking rules.

The current code is fine, it never reenters -poll, because it
maintains a -poll_owner which it checks in netpoll_send_skb()
before trying to call back into -poll.

Every call to -poll first sets -poll_owner to the current cpu id.
netpoll_send_skb() aborts and does a drop if -poll_owner is set to
the current smp_processor_id().

I sometimes feel like I'm the only person actually reading the sources
and past threads on this topic before replying.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Andi Kleen
On Friday 20 October 2006 23:08, David Miller wrote:
 From: Andi Kleen [EMAIL PROTECTED]
 Date: Fri, 20 Oct 2006 23:01:29 +0200
 
  netpoll always played a little fast'n'lose with various locking rules.
 
 The current code is fine, it never reenters -poll, because it
 maintains a -poll_owner which it checks in netpoll_send_skb()
 before trying to call back into -poll.

I was more thinking of reentry of the interrupt handler in 
the driver etc. A lot of them also do printk, but that is fortunately
caught higher level.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Stephen Hemminger
On Fri, 20 Oct 2006 23:16:03 +0200
Andi Kleen [EMAIL PROTECTED] wrote:

 On Friday 20 October 2006 23:08, David Miller wrote:
  From: Andi Kleen [EMAIL PROTECTED]
  Date: Fri, 20 Oct 2006 23:01:29 +0200
  
   netpoll always played a little fast'n'lose with various locking rules.
  
  The current code is fine, it never reenters -poll, because it
  maintains a -poll_owner which it checks in netpoll_send_skb()
  before trying to call back into -poll.
 
 I was more thinking of reentry of the interrupt handler in 
 the driver etc. A lot of them also do printk, but that is fortunately
 caught higher level.
 
 -Andi

One problem is that with netconsole the NAPI poll routine can be
called with IRQ's disabled. This means that calls to dev_kfree_skb()
are incorrect in this context. The driver would have to use dev_kfree_skb_any()
instead.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Dave Jones
On Fri, Oct 20, 2006 at 01:25:32PM -0700, Stephen Hemminger wrote:
  On Fri, 20 Oct 2006 12:52:26 -0700 (PDT)
  David Miller [EMAIL PROTECTED] wrote:
  
   From: Stephen Hemminger [EMAIL PROTECTED]
   Date: Fri, 20 Oct 2006 12:25:27 -0700
   
Sorry, but why should we treat out-of-tree vendor code any
differently than out-of-tree other code.
   
   I think what netdump was trying to do, provide a way to
   requeue instead of fully drop the SKB, is quite reasonable.
   Don't you think?
  
  
  Netdump doesn't even exist in the current Fedora source rpm.
  I think Dave dropped it.

Indeed. Practically no-one cared about it, so it bit-rotted
really fast after we shipped RHEL4.  That, along with the focus
shifting to making kdump work seemed to kill it off over the last
12 months.

Dave

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html