Re: d80211: How does TX flow control work?

2007-01-10 Thread Jiri Benc
On Mon, 08 Jan 2007 21:18:48 +0100, Jan Kiszka wrote:
 The actual problem was meanwhile identified: shorewall happened to
 overwrite the queueing discipline of wmaster0 with pfifo_fast. I found
 the magic knob to tell shorewall to no longer do this (at least until I
 want to manage traffic control that way...), but I still wonder if it is
 an acceptable situation. Currently, the user can intentionally or
 accidentally screw up the stack this way.

Hm, we probably need a way to tell the kernel not to remove the 802.11
qdisc. Jouni, Simon, is that possible or do we need to patch the NET_SCHED
code?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: d80211: How does TX flow control work?

2007-01-10 Thread Simon Barber
Scratches head -- this is from memory when I was thinking about this
problem a long time ago... I think we can return an error in the qdisc
destructor function - making sure legitimate interface removal is not
the cause of the qdisc deletion first of course.

Simon 
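
The idea above (refuse to drop the qdisc unless the interface itself is going away) can be sketched in user space. This is only an illustration under stated assumptions: struct toy_dev, toy_qdisc_remove() and toy_qdisc_replace() are hypothetical names, not NET_SCHED interfaces, and whether the real qdisc teardown path can report an error at all is exactly the open question in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a network device and its root qdisc. */
struct toy_dev {
    const char *qdisc;     /* name of the currently attached qdisc      */
    bool qdisc_locked;     /* set by the stack that depends on the qdisc */
    bool unregistering;    /* legitimate interface removal in progress   */
};

/* Detach the current qdisc. Returns 0 on success, or -EPERM when the
 * qdisc is protected and the device is not being removed. */
static int toy_qdisc_remove(struct toy_dev *dev)
{
    assert(dev != NULL);
    if (dev->qdisc_locked && !dev->unregistering)
        return -EPERM;
    dev->qdisc = NULL;
    return 0;
}

/* Replace the root qdisc, as a traffic-control tool would try to do.
 * The old qdisc stays in place if its removal is refused. */
static int toy_qdisc_replace(struct toy_dev *dev, const char *new_qdisc)
{
    int err = toy_qdisc_remove(dev);

    if (err)
        return err;
    dev->qdisc = new_qdisc;
    return 0;
}
```

In this model an attempt to install pfifo_fast on a locked device fails and leaves the 802.11 qdisc attached, while removal during device teardown still succeeds.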

-----Original Message-----
From: Jiri Benc [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 10, 2007 6:20 PM
To: Jan Kiszka
Cc: netdev@vger.kernel.org; Ivo Van Doorn;
[EMAIL PROTECTED]; Jouni Malinen; Simon Barber
Subject: Re: d80211: How does TX flow control work?

On Mon, 08 Jan 2007 21:18:48 +0100, Jan Kiszka wrote:
 The actual problem was meanwhile identified: shorewall happened to
 overwrite the queueing discipline of wmaster0 with pfifo_fast. I found
 the magic knob to tell shorewall to no longer do this (at least until
 I want to manage traffic control that way...), but I still wonder if
 it is an acceptable situation. Currently, the user can intentionally
 or accidentally screw up the stack this way.

Hm, we probably need a way to tell the kernel not to remove the 802.11
qdisc. Jouni, Simon, is that possible or do we need to patch the NET_SCHED
code?

Thanks,

 Jiri

--
Jiri Benc
SUSE Labs


Re: d80211: How does TX flow control work?

2007-01-08 Thread Jan Kiszka
Jan Kiszka wrote:
 Jan Kiszka wrote:
 Jiri Benc wrote:
 On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
 BUG: warning at 
 /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
  cfa02245 ieee80211_master_start_xmit+0x105/0x430 [80211]  c024e35d 
 __ip_ct_refresh_acct+0x4d/0x60
  c024fd11 tcp_packet+0x941/0x970  c0217442 qdisc_restart+0x92/0x100
  c020d43d dev_queue_xmit+0xbd/0x1a0  cfa050d8 
 ieee80211_subif_start_xmit+0x468/0x480 [80211]
  c0207dca skb_clone+0x3a/0x1a0  c021d16d nf_hook_slow+0x4d/0xc0
  c020d495 dev_queue_xmit+0x115/0x1a0  c0226a63 ip_output+0x1c3/0x200
  c0225740 ip_finish_output+0x0/0x180  c022628b 
 ip_queue_xmit+0x36b/0x3b0
  c0224130 dst_output+0x0/0x10  ce9bae7d usb_hcd_giveback_urb+0x2d/0x60 
 [usbcore]
  c0237da2 tcp_v4_send_check+0x82/0xd0  c0237da2 
 tcp_v4_send_check+0x82/0xd0
  c0233244 tcp_transmit_skb+0x5e4/0x610  c0234b36 
 __tcp_push_pending_frames+0x676/0x740
  c0207f81 __alloc_skb+0x51/0x100  c022b817 tcp_sendmsg+0x897/0x980
  c0153fa9 core_sys_select+0x1b9/0x2b0  c0241f1d inet_sendmsg+0x3d/0x50
  c0202a8f do_sock_write+0x8f/0xa0  c020301f sock_aio_write+0x5f/0x70
  c01443d3 do_sync_write+0xc3/0x100  c01247f0 
 autoremove_wake_function+0x0/0x40
  c0144ca1 vfs_write+0xa1/0x140  c01451d3 sys_write+0x43/0x70
  c0102ae7 syscall_call+0x7/0xb

 Does it tell you anything already? Is there something I may instrument? 
 What
 could the driver do wrong to trigger such bug?
 Do you have CONFIG_NET_SCHED enabled?

 
 Sorry, this was most probably a false alarm for the official stack. The
 problem now appears to be related to a patch against d80211 that is only
 present in the rt2x00 CVS.

Well, I said most probably...

The actual problem was meanwhile identified: shorewall happened to
overwrite the queueing discipline of wmaster0 with pfifo_fast. I found
the magic knob to tell shorewall to no longer do this (at least until I
want to manage traffic control that way...), but I still wonder if it is
an acceptable situation. Currently, the user can intentionally or
accidentally screw up the stack this way.

Jan


PS: Tests performed on a 2.6.17 kernel, but I don't see a reason why
newer kernels should be immune.
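
Why swapping in pfifo_fast matters can be illustrated with a toy model. This is a user-space sketch, not d80211 code. It assumes, following Jiri's explanation elsewhere in this thread, that the 802.11 qdisc refuses to dequeue while a frame is parked for a stopped queue, and that a generic FIFO performs no such check. All names (toy_queue, toy_tx, aware_dequeue_ok, fifo_dequeue_ok, toy_run) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-queue model; every name here is hypothetical. */
struct toy_queue {
    bool stopped;      /* driver signalled "no more buffers"   */
    bool has_pending;  /* one frame is parked by the stack     */
    int  lost;         /* parked frames that were overwritten  */
};

/* Stack-side transmit: hand the frame to the driver if it has room,
 * otherwise park it in the single pending slot. */
static void toy_tx(struct toy_queue *q)
{
    if (!q->stopped)
        return;            /* frame accepted by the driver */
    if (q->has_pending)
        q->lost++;         /* slot already in use: earlier frame orphaned */
    q->has_pending = true;
}

/* Dequeue policy of a qdisc that knows about the pending slot. */
static bool aware_dequeue_ok(const struct toy_queue *q)
{
    return !q->stopped && !q->has_pending;
}

/* Dequeue policy of a generic FIFO: it does not know the slot exists. */
static bool fifo_dequeue_ok(const struct toy_queue *q)
{
    (void)q;
    return true;
}

/* Offer `frames` frames to a queue whose driver stays stopped and
 * return how many parked frames were overwritten. */
static int toy_run(int frames, bool (*dequeue_ok)(const struct toy_queue *))
{
    struct toy_queue q = { .stopped = true, .has_pending = false, .lost = 0 };

    assert(frames >= 0);
    for (int i = 0; i < frames; i++)
        if (dequeue_ok(&q))
            toy_tx(&q);
    return q.lost;
}
```

With a stopped driver, the FIFO policy keeps calling toy_tx() and overwrites the parked frame on every call after the first, while the aware policy never calls toy_tx() at all. That matches the symptoms reported here: warnings under load plus orphaned frames.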





Re: d80211: How does TX flow control work?

2007-01-06 Thread Jan Kiszka
Jan Kiszka wrote:
 Jiri Benc wrote:
 On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
 BUG: warning at 
 /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
  cfa02245 ieee80211_master_start_xmit+0x105/0x430 [80211]  c024e35d 
 __ip_ct_refresh_acct+0x4d/0x60
  c024fd11 tcp_packet+0x941/0x970  c0217442 qdisc_restart+0x92/0x100
  c020d43d dev_queue_xmit+0xbd/0x1a0  cfa050d8 
 ieee80211_subif_start_xmit+0x468/0x480 [80211]
  c0207dca skb_clone+0x3a/0x1a0  c021d16d nf_hook_slow+0x4d/0xc0
  c020d495 dev_queue_xmit+0x115/0x1a0  c0226a63 ip_output+0x1c3/0x200
  c0225740 ip_finish_output+0x0/0x180  c022628b ip_queue_xmit+0x36b/0x3b0
  c0224130 dst_output+0x0/0x10  ce9bae7d usb_hcd_giveback_urb+0x2d/0x60 
 [usbcore]
  c0237da2 tcp_v4_send_check+0x82/0xd0  c0237da2 
 tcp_v4_send_check+0x82/0xd0
  c0233244 tcp_transmit_skb+0x5e4/0x610  c0234b36 
 __tcp_push_pending_frames+0x676/0x740
  c0207f81 __alloc_skb+0x51/0x100  c022b817 tcp_sendmsg+0x897/0x980
  c0153fa9 core_sys_select+0x1b9/0x2b0  c0241f1d inet_sendmsg+0x3d/0x50
  c0202a8f do_sock_write+0x8f/0xa0  c020301f sock_aio_write+0x5f/0x70
  c01443d3 do_sync_write+0xc3/0x100  c01247f0 
 autoremove_wake_function+0x0/0x40
  c0144ca1 vfs_write+0xa1/0x140  c01451d3 sys_write+0x43/0x70
  c0102ae7 syscall_call+0x7/0xb

 Does it tell you anything already? Is there something I may instrument? What
 could the driver do wrong to trigger such bug?
 Do you have CONFIG_NET_SCHED enabled?


Sorry, this was most probably a false alarm for the official stack. The
problem now appears to be related to a patch against d80211 that is only
present in the rt2x00 CVS.

Jan





Re: d80211: How does TX flow control work?

2007-01-03 Thread Jiri Benc
On Tue, 02 Jan 2007 14:08:21 +0100, Jan Kiszka wrote:
 What I (think I) understand is that low-level drivers call
 ieee80211_stop_queue() if they run out of buffers. That flips a
 per-queue bit (IEEE80211_LINK_STATE_XOFF), prevents any further
 frame from being passed to the low-level TX routine,

Correct.

 and can cause up to
 *one* packet per queue to be stored in
 ieee80211_local::pending_packets[queue].

This is needed due to fragmented frames. After resume, passing of
fragments to the driver has to continue where it was stopped. Returning
the half-sent fragmented frame to the 802.11 qdisc wasn't possible
until recently (I think the conversion of master interface to native
802.11 type could allow that now - but it's probably not worth the
effort).
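
The reason for parking a half-sent fragmented frame can be shown with a small user-space sketch. It is not the d80211 implementation: struct toy_frag_frame, struct toy_driver and toy_send_frags() are hypothetical names, and the only behaviour modelled is "remember which fragment comes next, and continue from there after the driver has room again".

```c
#include <assert.h>
#include <stddef.h>

/* A frame split into fragments, plus how far transmission got. */
struct toy_frag_frame {
    int num_frags;   /* total number of fragments                  */
    int next_frag;   /* index of the first fragment not yet sent   */
};

/* Hypothetical driver with a limited number of free TX buffers. */
struct toy_driver {
    int free_slots;
};

/* Send as many remaining fragments as the driver accepts.
 * Returns 1 when the whole frame has been handed over, 0 when the
 * driver ran out of buffers and the frame must stay parked. */
static int toy_send_frags(struct toy_driver *d, struct toy_frag_frame *f)
{
    assert(d != NULL && f != NULL);
    assert(f->next_frag <= f->num_frags);

    while (f->next_frag < f->num_frags) {
        if (d->free_slots == 0)
            return 0;        /* stopped mid-frame: position is kept */
        d->free_slots--;
        f->next_frag++;
    }
    return 1;
}
```

Calling toy_send_frags() again after the driver frees buffers continues at next_frag rather than restarting the frame, which is why the parked frame cannot simply be thrown back into an ordinary queue in this model.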

 But it looks to me like nothing
 prevents ieee80211_tx() from being invoked even when there is already
 something in that single-packet storage.

The 802.11 qdisc (see wme_qdiscop_dequeue) takes care of that.
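
A rough user-space sketch of such a dequeue gate follows. It is an assumption-laden model, not the code of wme_qdiscop_dequeue: TOY_XOFF loosely mirrors the XOFF bit named above, TOY_PENDING is an invented flag meaning "the pending slot of this queue is occupied", and struct toy_local, toy_stop_queue(), toy_wake_queue() and toy_dequeue() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_QUEUES 4      /* hypothetical number of hardware queues */
#define TOY_XOFF       0x1u   /* driver stopped this queue              */
#define TOY_PENDING    0x2u   /* a frame is parked for this queue       */

struct toy_local {
    unsigned state[TOY_NUM_QUEUES];   /* TOY_* bits per queue          */
    int backlog[TOY_NUM_QUEUES];      /* frames waiting in the qdisc   */
};

/* What a low-level driver would call when it runs out of buffers. */
static void toy_stop_queue(struct toy_local *l, int q)
{
    assert(q >= 0 && q < TOY_NUM_QUEUES);
    l->state[q] |= TOY_XOFF;
}

/* What the driver would call once buffers are free again. */
static void toy_wake_queue(struct toy_local *l, int q)
{
    assert(q >= 0 && q < TOY_NUM_QUEUES);
    l->state[q] &= ~TOY_XOFF;
}

/* Pick the first queue that has backlog and is neither stopped nor
 * holding a parked frame. Returns the queue index, or -1 if nothing
 * may be sent right now. */
static int toy_dequeue(struct toy_local *l)
{
    for (int q = 0; q < TOY_NUM_QUEUES; q++) {
        if (l->state[q] & (TOY_XOFF | TOY_PENDING))
            continue;          /* leave this queue alone */
        if (l->backlog[q] > 0) {
            l->backlog[q]--;
            return q;
        }
    }
    return -1;
}
```

Because the gate sits in the dequeue step, the transmit routine is never entered for a queue whose pending slot is in use. The protection therefore depends on this particular qdisc staying attached.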

 That in turn triggers WARN_ONs in ieee80211_tx() under high load for me
 (with rt2500usb). And it should also cause orphaned skbs because the
 storage is overwritten in that case. Either I'm blind or something is
 fishy...

You are most likely hitting some bug. Could you post more information
please?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs


Re: d80211: How does TX flow control work?

2007-01-03 Thread Jan Kiszka
Jiri Benc wrote:

 On Tue, 02 Jan 2007 14:08:21 +0100, Jan Kiszka wrote:
   
 What I (think I) understand is that low-level drivers call
 ieee80211_stop_queue() if they run out of buffers. That flips a
 per-queue bit (IEEE80211_LINK_STATE_XOFF), prevents any further
 frame from being passed to the low-level TX routine,
 

 Correct.

   
 and can cause up to
 *one* packet per queue to be stored in
 ieee80211_local::pending_packets[queue].
 

 This is needed due to fragmented frames. After resume, passing of
 fragments to the driver has to continue where it was stopped. Returning
 the half-sent fragmented frame to the 802.11 qdisc wasn't possible
 until recently (I think the conversion of master interface to native
 802.11 type could allow that now - but it's probably not worth the
 effort).

   
 But it looks to me like nothing
 prevents ieee80211_tx() from being invoked even when there is already
 something in that single-packet storage.
 

 The 802.11 qdisc (see wme_qdiscop_dequeue) takes care of that.

   
Ahh, that is an interesting new piece in the puzzle.


 That in turn triggers WARN_ONs in ieee80211_tx() under high load for me
 (with rt2500usb). And it should also cause orphaned skbs because the
 storage is overwritten in that case. Either I'm blind or something is
 fishy...
 

 You are most likely hitting some bug. Could you post more information
 please?

   
Test scenario is rt2500usb from the rt2x00 CVS (+my currently half-pending
series), an ASUS WL167g USB stick, and hostapd driving that stick in master
mode. As soon as I trigger the AP to send out some longer TCP stream, I get
these warnings:

BUG: warning at /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
 cfa02245 ieee80211_master_start_xmit+0x105/0x430 [80211]  c024e35d __ip_ct_refresh_acct+0x4d/0x60
 c024fd11 tcp_packet+0x941/0x970  c0217442 qdisc_restart+0x92/0x100
 c020d43d dev_queue_xmit+0xbd/0x1a0  cfa050d8 ieee80211_subif_start_xmit+0x468/0x480 [80211]
 c0207dca skb_clone+0x3a/0x1a0  c021d16d nf_hook_slow+0x4d/0xc0
 c020d495 dev_queue_xmit+0x115/0x1a0  c0226a63 ip_output+0x1c3/0x200
 c0225740 ip_finish_output+0x0/0x180  c022628b ip_queue_xmit+0x36b/0x3b0
 c0224130 dst_output+0x0/0x10  ce9bae7d usb_hcd_giveback_urb+0x2d/0x60 [usbcore]
 c0237da2 tcp_v4_send_check+0x82/0xd0  c0237da2 tcp_v4_send_check+0x82/0xd0
 c0233244 tcp_transmit_skb+0x5e4/0x610  c0234b36 __tcp_push_pending_frames+0x676/0x740
 c0207f81 __alloc_skb+0x51/0x100  c022b817 tcp_sendmsg+0x897/0x980
 c0153fa9 core_sys_select+0x1b9/0x2b0  c0241f1d inet_sendmsg+0x3d/0x50
 c0202a8f do_sock_write+0x8f/0xa0  c020301f sock_aio_write+0x5f/0x70
 c01443d3 do_sync_write+0xc3/0x100  c01247f0 autoremove_wake_function+0x0/0x40
 c0144ca1 vfs_write+0xa1/0x140  c01451d3 sys_write+0x43/0x70
 c0102ae7 syscall_call+0x7/0xb

Does it tell you anything already? Is there something I could instrument? What
could the driver be doing wrong to trigger such a bug?

Jan






Re: d80211: How does TX flow control work?

2007-01-03 Thread Jiri Benc
On Wed, 03 Jan 2007 19:10:01 +0100, Jan Kiszka wrote:
 BUG: warning at 
 /usr/src/rt2x00/rt2x00/ieee80211/ieee80211.c:1256/ieee80211_tx()
  cfa02245 ieee80211_master_start_xmit+0x105/0x430 [80211]  c024e35d 
 __ip_ct_refresh_acct+0x4d/0x60
  c024fd11 tcp_packet+0x941/0x970  c0217442 qdisc_restart+0x92/0x100
  c020d43d dev_queue_xmit+0xbd/0x1a0  cfa050d8 
 ieee80211_subif_start_xmit+0x468/0x480 [80211]
  c0207dca skb_clone+0x3a/0x1a0  c021d16d nf_hook_slow+0x4d/0xc0
  c020d495 dev_queue_xmit+0x115/0x1a0  c0226a63 ip_output+0x1c3/0x200
  c0225740 ip_finish_output+0x0/0x180  c022628b ip_queue_xmit+0x36b/0x3b0
  c0224130 dst_output+0x0/0x10  ce9bae7d usb_hcd_giveback_urb+0x2d/0x60 
 [usbcore]
  c0237da2 tcp_v4_send_check+0x82/0xd0  c0237da2 
 tcp_v4_send_check+0x82/0xd0
  c0233244 tcp_transmit_skb+0x5e4/0x610  c0234b36 
 __tcp_push_pending_frames+0x676/0x740
  c0207f81 __alloc_skb+0x51/0x100  c022b817 tcp_sendmsg+0x897/0x980
  c0153fa9 core_sys_select+0x1b9/0x2b0  c0241f1d inet_sendmsg+0x3d/0x50
  c0202a8f do_sock_write+0x8f/0xa0  c020301f sock_aio_write+0x5f/0x70
  c01443d3 do_sync_write+0xc3/0x100  c01247f0 
 autoremove_wake_function+0x0/0x40
  c0144ca1 vfs_write+0xa1/0x140  c01451d3 sys_write+0x43/0x70
  c0102ae7 syscall_call+0x7/0xb
 
 Does it tell you anything already? Is there something I may instrument? What
 could the driver do wrong to trigger such bug?

Do you have CONFIG_NET_SCHED enabled?

Thanks,

 Jiri

-- 
Jiri Benc
SUSE Labs