Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-21 Thread Josip Rodin
On Sat, Aug 17, 2013 at 06:30:48PM +0200, Josip Rodin wrote:
>  LOCATION OFFSET COUNT
> net_tx_action 0 1
>
> qdisc tbf 20: parent 1:2 rate 2Kbit burst 20Kb lat 4295.0s
>  Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0

JFTR I worked around this problem by giving up on sch_tbf - I replaced it
with an equivalent simple sch_htb setup (htb qdisc, htb class, sfq qdisc).

-- 
 2. That which causes joy or happiness.


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130821182927.ga18...@entuzijast.net



Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Josip Rodin
Package: linux-image-3.2.0-4-amd64
Version: 3.2.46-1

Hi,

I have a gateway machine, with $iface_Internet == xenbr2 and $iface_intranet
== xenbr0, running these traffic control rules on the outside interface
which are supposed to be a trivial ToS match and a limit on 20 Mbps:

tc qdisc del dev $iface_Internet root || true
tc qdisc add dev $iface_Internet root handle 1: prio
tc qdisc add dev $iface_Internet parent 1:1 handle 10: sfq
tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
tc qdisc add dev $iface_Internet parent 1:3 handle 30: sfq
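
For reference, the tbf parameters above can be put into time units with a little arithmetic. This is only a sketch: drain_us is a hypothetical helper, not part of tc, and it assumes tc's usual meanings (buffer is the bucket size in bytes, limit is the number of bytes that may queue waiting for tokens, and "mbit" is 10^6 bits per second).

```shell
#!/bin/sh
# Hypothetical helper (not part of tc): microseconds needed to send
# <bytes> bytes at <rate_bps> bits per second, using integer arithmetic.
drain_us() {
    bytes=$1
    rate_bps=$2
    echo $(( bytes * 8 * 1000000 / rate_bps ))
}

# buffer 20480 bytes at the intended 20mbit
drain_us 20480 20000000   # prints 8192
# limit 16384 bytes at the same rate
drain_us 16384 20000000   # prints 6553
```

So at the intended rate the bucket holds about 8 ms worth of traffic and the queue about 6.5 ms.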

This worked just fine for about seven years now on a machine running
squeeze, and a fair few distro+kernel versions before that.
I changed the rate from 10 to 20 on 2012-10-12, and everything kept working
fine.

However, the upgrade to this new kernel appears to have killed it - the tbf
rule is causing outgoing HTTP connections to max out at around 8 Kbps.

When I remove tbf, everything is fine.

I think there's a software problem there - even if these rules were somehow
broken to begin with, this is a poor way of telling me that.

Please fix it. TIA.

-- 
 2. That which causes joy or happiness.





Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Ben Hutchings
Control: tag -1 moreinfo

On Sat, 2013-08-17 at 10:00 +0200, Josip Rodin wrote:
> Package: linux-image-3.2.0-4-amd64
> Version: 3.2.46-1
>
> Hi,
>
> I have a gateway machine, with $iface_Internet == xenbr2 and $iface_intranet
> == xenbr0, running these traffic control rules on the outside interface
> which are supposed to be a trivial ToS match and a limit on 20 Mbps:
>
> tc qdisc del dev $iface_Internet root || true
> tc qdisc add dev $iface_Internet root handle 1: prio
> tc qdisc add dev $iface_Internet parent 1:1 handle 10: sfq
> tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> tc qdisc add dev $iface_Internet parent 1:3 handle 30: sfq
>
> This worked just fine for about seven years now on a machine running
> squeeze, and a fair few distro+kernel versions before that.
> I changed the rate from 10 to 20 on 2012-10-12, and everything kept working
> fine.
>
> However, the upgrade to this new kernel appears to have killed it - the tbf
> rule is causing outgoing HTTP connections to max out at around 8 Kbps.
[...]

This might be the same as bug #708995.  Does turning off GRO on the
internal interface (not the bridge but the physical interface) work
around it?
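
A minimal sketch of how the offload state could be checked before and after such a change. offload_state is a hypothetical helper that only parses `ethtool -k` output; eth0 stands in for whatever the physical interface is called, and changing the setting needs root.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k <iface>` output on stdin and print
# the state of one named feature (e.g. "on", "off", or "off [fixed]").
offload_state() {
    awk -F': ' -v f="$1" '
        { name = $1; sub(/^[ \t]+/, "", name) }  # drop any leading indentation
        name == f { print $2 }
    '
}

# Typical use on the physical interface (interface name is an assumption):
#   ethtool -k eth0 | offload_state generic-receive-offload
#   ethtool -K eth0 gro off
#   ethtool -k eth0 | offload_state generic-receive-offload
```

The same helper works for other feature names that `ethtool -k` prints, such as large-receive-offload or tcp-segmentation-offload.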

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.




Processed: Re: Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Debian Bug Tracking System
Processing control commands:

 tag -1 moreinfo
Bug #719958 [linux-image-3.2.0-4-amd64] traffic control simple token bucket 
filter within prio broken in wheezy
Added tag(s) moreinfo.

-- 
719958: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=719958
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems





Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Josip Rodin
On Sat, Aug 17, 2013 at 12:33:02PM +0200, Josip Rodin wrote:
> On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> > >
> > > However, the upgrade to this new kernel appears to have killed it - the tbf
> > > rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> > [...]
> >
> > This might be the same as bug #708995.  Does turning off GRO on the
> > internal interface (not the bridge but the physical interface) work
> > around it?
>
> Yes, it looks like ifenslave -c bond0 eth0 && ethtool -K eth0 gro off makes
> TBF precise again, and vice versa.

That's on one machine. But on another wheezy machine with the same setup but
somewhat different hardware, turning off GRO didn't help.

How do I debug this further?

-- 
 2. That which causes joy or happiness.





Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Josip Rodin
On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> >
> > However, the upgrade to this new kernel appears to have killed it - the tbf
> > rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> [...]
>
> This might be the same as bug #708995.  Does turning off GRO on the
> internal interface (not the bridge but the physical interface) work
> around it?

Yes, it looks like ifenslave -c bond0 eth0 && ethtool -K eth0 gro off makes
TBF precise again, and vice versa.

-- 
 2. That which causes joy or happiness.





Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Ben Hutchings
On Sat, 2013-08-17 at 12:56 +0200, Josip Rodin wrote:
> On Sat, Aug 17, 2013 at 12:33:02PM +0200, Josip Rodin wrote:
> > On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > > > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> > > >
> > > > However, the upgrade to this new kernel appears to have killed it - the tbf
> > > > rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> > > [...]
> > >
> > > This might be the same as bug #708995.  Does turning off GRO on the
> > > internal interface (not the bridge but the physical interface) work
> > > around it?
> >
> > Yes, it looks like ifenslave -c bond0 eth0 && ethtool -K eth0 gro off makes
> > TBF precise again, and vice versa.
>
> That's on one machine. But on another wheezy machine with the same setup but
> somewhat different hardware, turning off GRO didn't help.
>
> How do I debug this further?

You could try using the perf dropmonitor script as I described on my bug
report.

The other machine might also have LRO enabled on the internal interface,
although this is supposed to be disabled for bridged interfaces.  If the
other machine is also passing traffic from another VM on the same
physical host, it might be necessary to disable TSO on the interface
within the other VM.

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.




Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Josip Rodin
On Sat, Aug 17, 2013 at 02:06:57PM +0200, Ben Hutchings wrote:
> On Sat, 2013-08-17 at 12:56 +0200, Josip Rodin wrote:
> > On Sat, Aug 17, 2013 at 12:33:02PM +0200, Josip Rodin wrote:
> > > On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > > > > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> > > > >
> > > > > However, the upgrade to this new kernel appears to have killed it - the tbf
> > > > > rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> > > > [...]
> > > >
> > > > This might be the same as bug #708995.  Does turning off GRO on the
> > > > internal interface (not the bridge but the physical interface) work
> > > > around it?
> > >
> > > Yes, it looks like ifenslave -c bond0 eth0 && ethtool -K eth0 gro off makes
> > > TBF precise again, and vice versa.
> >
> > That's on one machine. But on another wheezy machine with the same setup but
> > somewhat different hardware, turning off GRO didn't help.
> >
> > How do I debug this further?
>
> You could try using the perf dropmonitor script as I described on my bug
> report.

Didn't you say that was also broken? :)

> The other machine might also have LRO enabled on the internal interface,
> although this is supposed to be disabled for bridged interfaces.  If the
> other machine is also passing traffic from another VM on the same
> physical host, it might be necessary to disable TSO on the interface
> within the other VM.

There's no distinction here between physical interfaces; I receive traffic
on a bond0 through several VLANs.

On one machine there's eth0 and eth2 behind that bond0, and that's the
one where the workaround works. On the other one, there's only eth0 behind
that bond0 (by accident), and the workaround doesn't make tbf work, oddly
enough. I also tried removing other offload options, but didn't make a dent.

The machines have different hardware but identical netfilter and tc rules,
and I shift traffic between them by moving the IP addresses, using
keepalived.

-- 
 2. That which causes joy or happiness.





Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Ben Hutchings
On Sat, 2013-08-17 at 14:54 +0200, Josip Rodin wrote:
> On Sat, Aug 17, 2013 at 02:06:57PM +0200, Ben Hutchings wrote:
> > On Sat, 2013-08-17 at 12:56 +0200, Josip Rodin wrote:
> > > On Sat, Aug 17, 2013 at 12:33:02PM +0200, Josip Rodin wrote:
> > > > On Sat, Aug 17, 2013 at 12:23:21PM +0200, Ben Hutchings wrote:
> > > > > > tc qdisc add dev $iface_Internet parent 1:2 handle 20: tbf rate 20mbit buffer 20480 limit 16384
> > > > > >
> > > > > > However, the upgrade to this new kernel appears to have killed it - the tbf
> > > > > > rule is causing outgoing HTTP connections to max out at around 8 Kbps.
> > > > > [...]
> > > > >
> > > > > This might be the same as bug #708995.  Does turning off GRO on the
> > > > > internal interface (not the bridge but the physical interface) work
> > > > > around it?
> > > >
> > > > Yes, it looks like ifenslave -c bond0 eth0 && ethtool -K eth0 gro off makes
> > > > TBF precise again, and vice versa.
> > >
> > > That's on one machine. But on another wheezy machine with the same setup but
> > > somewhat different hardware, turning off GRO didn't help.
> > >
> > > How do I debug this further?
> >
> > You could try using the perf dropmonitor script as I described on my bug
> > report.
>
> Didn't you say that was also broken? :)
[...]

It's fixed now.

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.




Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Josip Rodin
On Sat, Aug 17, 2013 at 02:58:07PM +0200, Ben Hutchings wrote:
> > > > How do I debug this further?
> > >
> > > You could try using the perf dropmonitor script as I described on my bug
> > > report.
> >
> > Didn't you say that was also broken? :)
> [...]
>
> It's fixed now.

Hmm. Googling says it was fixed in May, so it doesn't sound like something
that's going to come close to entering 3.2...

So I took the new script and placed into
/usr/share/perf_3.2-core/scripts/python/net_dropmonitor.py

But I still can't seem to run it:

% perf script net_dropmonitor
invalid or unsupported event: 'skb:kfree_skb'

Help?

-- 
 2. That which causes joy or happiness.





Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Ben Hutchings
On Sat, 2013-08-17 at 15:33 +0200, Josip Rodin wrote:
> On Sat, Aug 17, 2013 at 02:58:07PM +0200, Ben Hutchings wrote:
> > > > > How do I debug this further?
> > > >
> > > > You could try using the perf dropmonitor script as I described on my bug
> > > > report.
> > >
> > > Didn't you say that was also broken? :)
> > [...]
> >
> > It's fixed now.
>
> Hmm. Googling says it was fixed in May, so it doesn't sound like something
> that's going to come close to entering 3.2...
>
> So I took the new script and placed into
> /usr/share/perf_3.2-core/scripts/python/net_dropmonitor.py
>
> But I still can't seem to run it:
>
> % perf script net_dropmonitor
> invalid or unsupported event: 'skb:kfree_skb'
>
> Help?

Try running it as root...

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.




Bug#719958: traffic control simple token bucket filter within prio broken in wheezy

2013-08-17 Thread Josip Rodin
On Sat, Aug 17, 2013 at 04:08:12PM +0200, Ben Hutchings wrote:
> On Sat, 2013-08-17 at 15:33 +0200, Josip Rodin wrote:
> > On Sat, Aug 17, 2013 at 02:58:07PM +0200, Ben Hutchings wrote:
> > > > > > How do I debug this further?
> > > > >
> > > > > You could try using the perf dropmonitor script as I described on my bug
> > > > > report.
> > > >
> > > > Didn't you say that was also broken? :)
> > > [...]
> > >
> > > It's fixed now.
> >
> > Hmm. Googling says it was fixed in May, so it doesn't sound like something
> > that's going to come close to entering 3.2...
> >
> > So I took the new script and placed into
> > /usr/share/perf_3.2-core/scripts/python/net_dropmonitor.py
> >
> > But I still can't seem to run it:
> >
> > % perf script net_dropmonitor
> > invalid or unsupported event: 'skb:kfree_skb'
> >
> > Help?
>
> Try running it as root...

Well, that was stupid. Anyway, my test file transfer that drags along like
this:

Length: 2586317 (2,5M) [application/octet-stream]
Saving to: /dev/null

 8% [==] 207.064 13,7K/s  eta 2m 18s  
^C

Results in this:

Starting trace (Ctrl-C to dump results)
^CGathering kallsyms data
 LOCATION OFFSET COUNT
net_tx_action 0 1

At the same time, the tc output changes from:

qdisc prio 1: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 948738 bytes 5358 pkt (dropped 117, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 10: parent 1:1 limit 127p quantum 1514b divisor 1024
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc tbf 20: parent 1:2 rate 2Kbit burst 20Kb lat 4295.0s
 Sent 948738 bytes 5358 pkt (dropped 117, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 30: parent 1:3 limit 127p quantum 1514b divisor 1024
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

to this:

qdisc prio 1: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 10: parent 1:1 limit 127p quantum 1514b divisor 1024
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc tbf 20: parent 1:2 rate 2Kbit burst 20Kb lat 4295.0s
 Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc sfq 30: parent 1:3 limit 127p quantum 1514b divisor 1024
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
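
To make the change between the two snapshots explicit, the tbf counters can be subtracted. This is a sketch only: sent_fields and sent_delta are hypothetical helpers that assume the "Sent ... bytes ... pkt (dropped ..." layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract "bytes pkts dropped" from a tc -s "Sent" line.
sent_fields() {
    echo "$1" | sed -n 's/^ *Sent \([0-9]*\) bytes \([0-9]*\) pkt (dropped \([0-9]*\),.*/\1 \2 \3/p'
}

# Print the byte, packet and drop deltas between two "Sent" lines.
sent_delta() {
    # After this, $1..$3 hold the "before" counters and $4..$6 the "after" ones.
    set -- $(sent_fields "$1") $(sent_fields "$2")
    echo "bytes=$(( $4 - $1 )) pkts=$(( $5 - $2 )) dropped=$(( $6 - $3 ))"
}

before=' Sent 948738 bytes 5358 pkt (dropped 117, overlimits 0 requeues 0)'
after=' Sent 1235809 bytes 6051 pkt (dropped 182, overlimits 0 requeues 0)'
sent_delta "$before" "$after"   # prints bytes=287071 pkts=693 dropped=65
```

That is 65 drops against 693 packets sent by the tbf qdisc during the test, roughly one drop for every eleven packets sent.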

-- 
 2. That which causes joy or happiness.

