Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
Hi Dave,

Dave Täht wrote on 2/3/17 13:10:
> On 3/2/17 11:51 AM, Stijn Segers wrote:
>> Thanks Sebastian, turned out to be a silly syntax error, I have it all
>> disabled now. Ethtool -k and ethtool -K printing/requiring different
>> stuff doesn't help of course :-)
>>
>> I re-enabled SQM, will see how that works out with the offloading disabled.
>
> Would be good to know. I lost a bit of sleep lately (given how badly we
> got bit by RCU on the ATF front, I worry about cake... but I can't see
> how that would break, there.)

With my 50 Mbps VDSL2 downlink, I see a 25% load on one core tops. So that looks good. Checked uptime this morning, all hunky-dory. So I'll keep offloading disabled for now.

Cheers

Stijn

___
Lede-dev mailing list
Lede-dev@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/lede-dev
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On 3/2/17 11:51 AM, Stijn Segers wrote:
> Thanks Sebastian, turned out to be a silly syntax error, I have it all
> disabled now. Ethtool -k and ethtool -K printing/requiring different
> stuff doesn't help of course :-)
>
> I re-enabled SQM, will see how that works out with the offloading disabled.

Would be good to know. I lost a bit of sleep lately (given how badly we got bit by RCU on the ATF front, I worry about cake... but I can't see how that would break, there.)

In terms of general "why does shaping use so much cpu"...

I am keen to stress that the core fq_codel algorithm is very lightweight and barely shows up on traces when used without software rate limiting and with BQL.

You CAN see a difference in forwarding performance at really high native rates if you use pfifo and compare it to fq_codel on some platforms - pfifo_fast is simpler overall. To experiment, you can re-enable pfifo_fast in scenarios if you want (tc qdisc add dev whatever root pfifo limit somethingsane, or bfifo somethingsane)... however, things like NAT and firewall rules tend to dominate the forwarding costs, fq_codel reduces latency a lot compared to pfifo, and the principal use of fq_codel is for sqm (and now wifi).

As for software rate shaping - this is very cpu intensive no matter how you do it. I wish we didn't have to do it - and with certain (mostly old DSL) modems that do flow control you don't. The only one I know that gets this right is the transverse geode that David Woodhouse has. One of my disappointments across the industry is not seeing BQL roll out universally on any DSL firmwares, starting, oh, 5 years ago.

If we had ethernet devices with a programmable timer (only interrupt me at a 40mbit rate) we could also completely eliminate software rate shaping.

Anyway, my benchmarks are showing that cake in its "besteffort" mode smokes HTB + fq_codel, affording over 40% more headroom in terms of cpu with bandwidth. (Independent confirmation across more cpu types is needed.)

In the default mode, with the new 3 tier classification, wash, nat and triple-isolate/dual-host/dual-src features - which we hope are going to help folk deal with torrent better in particular - it's a wash. cake is a LOT more cpu intense than fq_codel is, especially in its default modes, which it makes up for by being more unified. Mostly.

If you are running low on cpu and are trying to shape inbound on most of these low-end mips devices to speeds > 60Mbits, I'd highly recommend switching to using "besteffort" on that rather than the 3 QoS queue default. Most ISPs are not classifying traffic well, anyway, and FQ solves nearly everything, especially per-host FQ.

But none of what I just said applies if there's a bug somewhere else! GRO has given me fits for years now, and I'm scarred by that.

In terms of cpu costs in cake/fq_codel - dequeue, hashing, and timestamping show up most on a trace. The rate limiting effort where all that is happening shows up in softirq dominating the platform.

I have *always* worried that there exist devices (particularly multi-cores) without a first class high speed internal clock facility, but thus far haven't had an issue with it (unlike on BSD, which has internal timings good to only a ms).

As for speeding up hashing, I've been looking over various algorithms to do that for years now, and I'm open to suggestions. The fastest new ones tend to depend on co-processor support. The fastest I've seen relies on the CRC32 instruction, which is only in some Intel platforms.

Cake could certainly use a big round of profiling but it is generally my hope that we won big with it, in its present form. I welcome (especially flent) benchmarks of sqm on various architectures we've not explored fully - notably arm ones.

My hat is off to all that have worked so hard to make this subsystem - and also all of lede - work so well, in this release.
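A minimal sketch of the qdisc choices discussed above, assuming an egress shaper on a placeholder interface (eth0) at a placeholder rate (50mbit). The function names are hypothetical, and the script only prints the tc commands instead of running them. "besteffort" and "diffserv3" are cake keywords for the single-queue mode and the 3-tier mode respectively. Inbound shaping is not shown.

```shell
#!/bin/sh
# Dry run only: prints tc commands, does not touch any interface.
# Interface name and rate are placeholders, not values from the thread.

# Build the tc command for an egress cake shaper.
# $1 = interface, $2 = rate (e.g. 50mbit), $3 = tin mode (default: besteffort)
cake_cmd() {
    echo "tc qdisc replace dev $1 root cake bandwidth $2 ${3:-besteffort}"
}

# Build the tc command for plain fq_codel without a software shaper.
fq_codel_cmd() {
    echo "tc qdisc replace dev $1 root fq_codel"
}

cake_cmd eth0 50mbit            # single queue, lower cpu cost
cake_cmd eth0 50mbit diffserv3  # 3-tier classification
fq_codel_cmd eth0
```

Comparing cpu load between the first two variants on the same link is one way to check whether the classification tiers are what costs the most on a given device.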
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
Thanks Sebastian, turned out to be a silly syntax error, I have it all disabled now. Ethtool -k and ethtool -K printing/requiring different stuff doesn't help of course :-)

I re-enabled SQM, will see how that works out with the offloading disabled.

Cheers

Stijn
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Thu, Mar 2, 2017 at 2:32 AM, Martin Tippmann wrote:
> On Wed, Mar 1, 2017 at 11:40 PM, Weedy wrote:
>> On 28 February 2017 at 05:40, Martin Tippmann wrote:
>>> On Mon, Feb 27, 2017 at 9:17 PM, Stijn Segers wrote:
>>>> Okay, so I tracked it down to cake being the culprit. When I disable the
>>>> Cake SQM instance, no more of those traces, and no more sudden reboots.
>>>>
>>>> If I can help debug this, let me know - I enabled a Cake SQM instance on an
>>>> APU2 and so far that seems to run fine.
>>>
>>> cake: Maybe it's related - I'm seeing high cpu usage with cake on
>>> TP-Link 841N routers even with none, moderate traffic after a while. I
>>> don't see hanging tasks in the logs but the system feels sluggish even
>>> it's idle.
>>
>> I have a WR842ND v2, I'm also seeing high CPU suddenly with only
>> default fq-codel on 4.4.50.

Totally missed the 4.4.50 part, sorry - so it's not the ath9k fixes, I have no idea.
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Wed, Mar 1, 2017 at 11:40 PM, Weedy wrote:
> On 28 February 2017 at 05:40, Martin Tippmann wrote:
>> On Mon, Feb 27, 2017 at 9:17 PM, Stijn Segers wrote:
>>> Okay, so I tracked it down to cake being the culprit. When I disable the
>>> Cake SQM instance, no more of those traces, and no more sudden reboots.
>>>
>>> If I can help debug this, let me know - I enabled a Cake SQM instance on an
>>> APU2 and so far that seems to run fine.
>>
>> cake: Maybe it's related - I'm seeing high cpu usage with cake on
>> TP-Link 841N routers even with none, moderate traffic after a while. I
>> don't see hanging tasks in the logs but the system feels sluggish even
>> it's idle.
>
> I have a WR842ND v2, I'm also seeing high CPU suddenly with only
> default fq-codel on 4.4.50.

With fq_codel everything is fine for me (on WR841Nv7/8/11), and even cake is fine on a MAC1200R. We have a few WR842N in our community mesh network, but these are also remote and in use at the moment.

Just to be sure: is the build newer than January 27? We've seen high sys without these fixes:

https://git.lede-project.org/?p=source.git;a=commit;h=82d580e8b5c43f4dd228f2bb5927ca3e47752a34
https://git.lede-project.org/?p=source.git;a=commit;h=b94177e10fc72f9309eae7459c3570e5c080e960

> Are you able to git bisect? This particular device is installed in a
> remote location so I can't deal with it for a while.

Never done that but I guess it's possible. At the moment I'm not even sure how to reliably reproduce the cake issue. Filling /tmp might trigger it: I discovered it after routers got slow following the upload of a new sysupgrade image to /tmp.

I wanted to play with trace-cmd (ftrace) after discovering that you can use trace-cmd to record what's going on in the kernel remotely using the -N option (http://man7.org/linux/man-pages/man1/trace-cmd-record.1.html), but this is pretty much all new to me. The idea was to get a grasp of where that high sys is coming from. I've had no time to play with this, and I'm not sure if trace-cmd record is even the right tool for this or works on my 32 MB RAM / 4 MB flash device. http://www.brendangregg.com/linuxperf.html has a lot of stuff that looks useful.

I don't have much time until the weekend; if I manage to get the traces running I'll post them in the bug ticket.

regards
Martin
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On 1 March 2017 at 17:40, Weedy wrote:
> On 28 February 2017 at 05:40, Martin Tippmann wrote:
>> On Mon, Feb 27, 2017 at 9:17 PM, Stijn Segers wrote:
>>> Okay, so I tracked it down to cake being the culprit. When I disable the
>>> Cake SQM instance, no more of those traces, and no more sudden reboots.
>>>
>>> If I can help debug this, let me know - I enabled a Cake SQM instance on an
>>> APU2 and so far that seems to run fine.
>>
>> cake: Maybe it's related - I'm seeing high cpu usage with cake on
>> TP-Link 841N routers even with none, moderate traffic after a while. I
>> don't see hanging tasks in the logs but the system feels sluggish even
>> it's idle.
>
> I have a WR842ND v2, I'm also seeing high CPU suddenly with only
> default fq-codel on 4.4.50.
> Are you able to git bisect? This particular device is installed in a
> remote location so I can't deal with it for a while.
>
> >pppoe-wan│ 494.90KiB 351 pps │ 12.13KiB 192 pps
>
> Mem: 19620K used, 8308K free, 4K shrd, 864K buff, 1472K cached
> CPU:   1% usr  93% sys   0% nic   0% idle   0% io   0% irq   4% sirq
> Load average: 6.89 4.88 2.96 5/58 9906
>   PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
>  1555     1 root     R     1948   7%  16% /usr/sbin/hostapd -s -P /var/run/wifi
>    97     2 root     SW       0   0%  12% [kworker/0:1]
>   440     1 root     S     1180   4%   7% /sbin/ubusd
>  9165  8584 root     R     1196   4%   5% top -d3
>   840     1 root     S     1396   5%   5% /usr/sbin/odhcpd
>  9886  9883 root     R     1224   4%   4% /bin/sh /etc/netCheck.sh
>     3     2 root     SW       0   0%   4% [ksoftirqd/0]
>   104     2 root     SW       0   0%   2% [kswapd0]
>
> On the other hand, here is my WDR4300
>
> >pppoe-wan│ 420.29KiB 1.69K pps│ 78.09KiB 1.10K pps
>
> Mem: 21212K used, 6716K free, 12K shrd, 792K buff, 1584K cached
> CPU:   6% usr  88% sys   0% nic   0% idle   0% io   0% irq   5% sirq
> Load average: 3.38 2.13 1.16 1/56 9259
>   PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
>    97     2 root     RW       0   0%  18% [kworker/0:1]
>  1555     1 root     S     1948   7%  12% /usr/sbin/hostapd -s -P /var/run/wifi
>  2714  2713 root     R     2484   9%   4% /usr/sbin/bmon -p ppp*
>  9240  9239 root     S     1184   4%   4% /bin/ping -W 3 -c 6 -q -I pppoe-wan g
>   840     1 root     S     1396   5%   4% /usr/sbin/odhcpd
>     3     2 root     SW       0   0%   4% [ksoftirqd/0]
>  9241  9239 root     S     1204   4%   3% awk /packets received {print $4}
>  9165  8584 root     R     1188   4%   3% top -d3
>  2389     1 dnsmasq  S     1412   5%   2% /usr/sbin/dnsmasq -C /var/etc/dnsmasq

Crap wrong paste, both of those top excerpts are from my WR842ND at about 550pps. But choking on sys usage.

HERE is my WDR4300, moving over 2000pps with basically no sys usage.

Mem: 42960K used, 17456K free, 216K shrd, 3084K buff, 9716K cached
CPU:   5% usr   1% sys   0% nic  79% idle   0% io   0% irq  14% sirq
Load average: 0.14 0.13 0.13 1/59 7749
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 2365     1 nobody   R     5260   9%   4% /usr/sbin/darkstat --verbose -i br-la
 2297  2295 root     S     2408   4%   3% /usr/sbin/bmon -p ppp*
    3     2 root     RW       0   0%   1% [ksoftirqd/0]
 2548  2536 root     R     1188   2%   1% top -d3
 2229   995 root     S     1132   2%   0% /usr/sbin/dropbear -F -P /var/run/dro
20388     2 root     SW       0   0%   0% [kworker/0:0]

>pppoe-wan│ 420.29KiB 1.69K pps│ 78.09KiB 1.10K pps
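The difference between the two routers shows up in the "sys" and "sirq" fields of top's CPU line. The sketch below pulls those two fields out of such a line, so the values can be logged over time. The function name is hypothetical, and the sample lines are copied from the excerpts above. On a live router the input could presumably come from busybox top in batch mode (top -bn1), which is an assumption about the installed top applet.

```shell
#!/bin/sh
# Print the "sys" and "sirq" percentages from a busybox top "CPU:" summary line.
# Reads from stdin; fields are found by their label, not by position.
cpu_sys_sirq() {
    awk '/^CPU:/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "sys")  sys  = $(i - 1)
            if ($i == "sirq") sirq = $(i - 1)
        }
        print "sys=" sys " sirq=" sirq
    }'
}

# WR842ND excerpt (choking on sys), then the WDR4300 excerpt.
echo "CPU: 1% usr 93% sys 0% nic 0% idle 0% io 0% irq 4% sirq" | cpu_sys_sirq
echo "CPU: 5% usr 1% sys 0% nic 79% idle 0% io 0% irq 14% sirq" | cpu_sys_sirq
```

A shaper that is merely busy would be expected to show up mostly under sirq, which is why the 93% sys figure on the WR842ND points somewhere other than the qdisc itself.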
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On 28 February 2017 at 05:40, Martin Tippmann wrote:
> On Mon, Feb 27, 2017 at 9:17 PM, Stijn Segers wrote:
>> Okay, so I tracked it down to cake being the culprit. When I disable the
>> Cake SQM instance, no more of those traces, and no more sudden reboots.
>>
>> If I can help debug this, let me know - I enabled a Cake SQM instance on an
>> APU2 and so far that seems to run fine.
>
> cake: Maybe it's related - I'm seeing high cpu usage with cake on
> TP-Link 841N routers even with none, moderate traffic after a while. I
> don't see hanging tasks in the logs but the system feels sluggish even
> it's idle.

I have a WR842ND v2, I'm also seeing high CPU suddenly with only default fq-codel on 4.4.50.
Are you able to git bisect? This particular device is installed in a remote location so I can't deal with it for a while.

>pppoe-wan│ 494.90KiB 351 pps │ 12.13KiB 192 pps

Mem: 19620K used, 8308K free, 4K shrd, 864K buff, 1472K cached
CPU:   1% usr  93% sys   0% nic   0% idle   0% io   0% irq   4% sirq
Load average: 6.89 4.88 2.96 5/58 9906
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1555     1 root     R     1948   7%  16% /usr/sbin/hostapd -s -P /var/run/wifi
   97     2 root     SW       0   0%  12% [kworker/0:1]
  440     1 root     S     1180   4%   7% /sbin/ubusd
 9165  8584 root     R     1196   4%   5% top -d3
  840     1 root     S     1396   5%   5% /usr/sbin/odhcpd
 9886  9883 root     R     1224   4%   4% /bin/sh /etc/netCheck.sh
    3     2 root     SW       0   0%   4% [ksoftirqd/0]
  104     2 root     SW       0   0%   2% [kswapd0]

On the other hand, here is my WDR4300

>pppoe-wan│ 420.29KiB 1.69K pps│ 78.09KiB 1.10K pps

Mem: 21212K used, 6716K free, 12K shrd, 792K buff, 1584K cached
CPU:   6% usr  88% sys   0% nic   0% idle   0% io   0% irq   5% sirq
Load average: 3.38 2.13 1.16 1/56 9259
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
   97     2 root     RW       0   0%  18% [kworker/0:1]
 1555     1 root     S     1948   7%  12% /usr/sbin/hostapd -s -P /var/run/wifi
 2714  2713 root     R     2484   9%   4% /usr/sbin/bmon -p ppp*
 9240  9239 root     S     1184   4%   4% /bin/ping -W 3 -c 6 -q -I pppoe-wan g
  840     1 root     S     1396   5%   4% /usr/sbin/odhcpd
    3     2 root     SW       0   0%   4% [ksoftirqd/0]
 9241  9239 root     S     1204   4%   3% awk /packets received {print $4}
 9165  8584 root     R     1188   4%   3% top -d3
 2389     1 dnsmasq  S     1412   5%   2% /usr/sbin/dnsmasq -C /var/etc/dnsmasq
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
Hi Stijn,

> On Mar 1, 2017, at 21:28, Stijn Segers wrote:
>
> On Wed, 1 Mar 2017 at 10:59, Sebastian Moeller wrote:
>> Hi Stijn,
>>> On Mar 1, 2017, at 10:46, Stijn Segers wrote:
>>> Hi Baptiste,
>>> Thanks for your input. I found this thread [1] however which suggests
>>> offloading capabilities:
>>> "The GSW is found in all of the 1000mbit SoCs. it has 5 external ports,
>>> 1-2 cpu ports and 1 further port that the internal HW offloading engine
>>> connects to. The switch core used is a MT7530, which also exists as a
>>> standalone chip. [...]"
>>> Johh, is there a way to disable offloading on MT7530?
>> If you run “ethtool -k $IFACE” ethtool should show you which offloads the
>> device supports and which can be toggled. Here is an example from a
>> wndr3700v2 (eth1 being the dedicated WAN interface not connected to a switch)
>> root@router:~# ethtool -k eth1
>> Features for eth1:
>> rx-checksumming: off [fixed]
>> tx-checksumming: off
>> tx-checksum-ipv4: off [fixed]
>> tx-checksum-ip-generic: off [fixed]
>> tx-checksum-ipv6: off [fixed]
>> tx-checksum-fcoe-crc: off [fixed]
>> tx-checksum-sctp: off [fixed]
>> scatter-gather: off
>> tx-scatter-gather: off [fixed]
>> tx-scatter-gather-fraglist: off [fixed]
>> tcp-segmentation-offload: off
>> tx-tcp-segmentation: off [fixed]
>> tx-tcp-ecn-segmentation: off [fixed]
>> tx-tcp6-segmentation: off [fixed]
>> udp-fragmentation-offload: off [fixed]
>> generic-segmentation-offload: off [requested on]
>> generic-receive-offload: on
>> large-receive-offload: off [fixed]
>> rx-vlan-offload: off [fixed]
>> tx-vlan-offload: off [fixed]
>> ntuple-filters: off [fixed]
>> receive-hashing: off [fixed]
>> highdma: off [fixed]
>> rx-vlan-filter: off [fixed]
>> vlan-challenged: off [fixed]
>> tx-lockless: off [fixed]
>> netns-local: off [fixed]
>> tx-gso-robust: off [fixed]
>> tx-fcoe-segmentation: off [fixed]
>> tx-gre-segmentation: off [fixed]
>> tx-ipip-segmentation: off [fixed]
>> tx-sit-segmentation: off [fixed]
>> tx-udp_tnl-segmentation: off [fixed]
>> fcoe-mtu: off [fixed]
>> tx-nocache-copy: off
>> loopback: off [fixed]
>> rx-fcs: off [fixed]
>> rx-all: off [fixed]
>> tx-vlan-stag-hw-insert: off [fixed]
>> rx-vlan-stag-hw-parse: off [fixed]
>> rx-vlan-stag-filter: off [fixed]
>> l2-fwd-offload: off [fixed]
>> busy-poll: off [fixed]
>> Best Regards
>> Sebastian
>>> Thanks!
>>> Stijn
>>> [1] https://lkml.org/lkml/2016/2/26/527
>
> Hi Sebastian,
>
> That does indeed return a long list, but it looks like the things that Dave
> Taht suggested I disable are already disabled?
>
> ethtool -K tso off gso off gro off
>
> Of those, only gso shows as tx-gso-robust [off]: fixed...
>
> # ethtool -k eth0
> Features for eth0:
> rx-checksumming: on
> tx-checksumming: on
> tx-checksum-ipv4: on
> tx-checksum-ip-generic: off [fixed]
> tx-checksum-ipv6: on
> tx-checksum-fcoe-crc: off [fixed]
> tx-checksum-sctp: off [fixed]
> scatter-gather: on
> tx-scatter-gather: on
> tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: on
> tx-tcp-segmentation: on
> tx-tcp-ecn-segmentation: off [fixed]
> tx-tcp6-segmentation: on
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: on

This is GSO

> generic-receive-offload: on

And this is GRO, so both seem to be still on. I do seem to remember that Linux will keep giant packets intact once assembled, so for testing you might want to disable GRO for all interfaces, to make sure no large meta-packet ever reaches cake… I also recall being flummoxed by the fact that the output of ethtool -k is not identical to what you have to give to ethtool -K...

Best Regards

> large-receive-offload: off [fixed]
> rx-vlan-offload: off [fixed]
> tx-vlan-offload: on
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: off [fixed]
> rx-vlan-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: off [fixed]
> tx-ipip-segmentation: off [fixed]
> tx-sit-segmentation: off [fixed]
> tx-udp_tnl-segmentation: off [fixed]
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
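Since the long names printed by ethtool -k differ from the short names usually given to ethtool -K (tso, gso and gro correspond to tcp-segmentation-offload, generic-segmentation-offload and generic-receive-offload), it can help to filter the listing down to what is actually on and changeable. The following is a minimal sketch with a hypothetical function name. It reads saved ethtool -k output on stdin, and the sample input is an excerpt of the eth0 listing quoted above.

```shell
#!/bin/sh
# List features that are "on" and not marked [fixed] in `ethtool -k` output.
# Lines such as "off [fixed]" or "off [requested on]" do not match.
toggleable_on() {
    awk -F': ' '$2 == "on" { gsub(/^[ \t]+/, "", $1); print $1 }'
}

# Excerpt of the eth0 listing from the thread.
toggleable_on <<'EOF'
Features for eth0:
rx-checksumming: on
tx-checksum-ip-generic: off [fixed]
tcp-segmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
EOF
```

On a device this would be fed from the real command, for example ethtool -k eth0 | toggleable_on, once per interface.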
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Wed, 1 Mar 2017 at 10:59, Sebastian Moeller wrote:
> Hi Stijn,
>
>> On Mar 1, 2017, at 10:46, Stijn Segers wrote:
>>
>> Hi Baptiste,
>>
>> Thanks for your input. I found this thread [1] however which suggests
>> offloading capabilities:
>>
>> "The GSW is found in all of the 1000mbit SoCs. it has 5 external ports,
>> 1-2 cpu ports and 1 further port that the internal HW offloading engine
>> connects to. The switch core used is a MT7530, which also exists as a
>> standalone chip. [...]"
>>
>> Johh, is there a way to disable offloading on MT7530?
>
> If you run “ethtool -k $IFACE” ethtool should show you which offloads the
> device supports and which can be toggled. Here is an example from a
> wndr3700v2 (eth1 being the dedicated WAN interface not connected to a switch)
>
> root@router:~# ethtool -k eth1
> Features for eth1:
> rx-checksumming: off [fixed]
> tx-checksumming: off
> tx-checksum-ipv4: off [fixed]
> tx-checksum-ip-generic: off [fixed]
> tx-checksum-ipv6: off [fixed]
> tx-checksum-fcoe-crc: off [fixed]
> tx-checksum-sctp: off [fixed]
> scatter-gather: off
> tx-scatter-gather: off [fixed]
> tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: off
> tx-tcp-segmentation: off [fixed]
> tx-tcp-ecn-segmentation: off [fixed]
> tx-tcp6-segmentation: off [fixed]
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: off [requested on]
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: off [fixed]
> tx-vlan-offload: off [fixed]
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: off [fixed]
> rx-vlan-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: off [fixed]
> tx-ipip-segmentation: off [fixed]
> tx-sit-segmentation: off [fixed]
> tx-udp_tnl-segmentation: off [fixed]
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
>
> Best Regards
> Sebastian
>
>> Thanks!
>>
>> Stijn
>>
>> [1] https://lkml.org/lkml/2016/2/26/527

Hi Sebastian,

That does indeed return a long list, but it looks like the things that Dave Taht suggested I disable are already disabled?

ethtool -K tso off gso off gro off

Of those, only gso shows as tx-gso-robust [off]: fixed...

# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
> On Mar 1, 2017, at 10:59, Sebastian Moeller wrote:
>
> Hi Stijn,
>
>> On Mar 1, 2017, at 10:46, Stijn Segers wrote:
>>
>> Hi Baptiste,
>>
>> Thanks for your input. I found this thread [1] however which suggests
>> offloading capabilities:
>>
>> "The GSW is found in all of the 1000mbit SoCs. it has 5 external ports,
>> 1-2 cpu ports and 1 further port that the internal HW offloading engine
>> connects to. The switch core used is a MT7530, which also exists as a
>> standalone chip. [...]"
>>
>> Johh, is there a way to disable offloading on MT7530?
>
> If you run “ethtool -k $IFACE” ethtool should show you which offloads the
> device supports and which can be toggled. Here is an example from a
> wndr3700v2 (eth1 being the dedicated WAN interface not connected to a switch)
>
> root@router:~# ethtool -k eth1
> Features for eth1:
> rx-checksumming: off [fixed]
> tx-checksumming: off
> tx-checksum-ipv4: off [fixed]
> tx-checksum-ip-generic: off [fixed]
> tx-checksum-ipv6: off [fixed]
> tx-checksum-fcoe-crc: off [fixed]
> tx-checksum-sctp: off [fixed]
> scatter-gather: off
> tx-scatter-gather: off [fixed]
> tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: off
> tx-tcp-segmentation: off [fixed]
> tx-tcp-ecn-segmentation: off [fixed]
> tx-tcp6-segmentation: off [fixed]
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: off [requested on]
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: off [fixed]
> tx-vlan-offload: off [fixed]
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: off [fixed]
> rx-vlan-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: off [fixed]
> tx-ipip-segmentation: off [fixed]
> tx-sit-segmentation: off [fixed]
> tx-udp_tnl-segmentation: off [fixed]
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]

What I forgot to mention: if ethtool -k does not show anything, ethtool -K should also fail...

> Best Regards
> Sebastian
>
>> Thanks!
>>
>> Stijn
>>
>> [1] https://lkml.org/lkml/2016/2/26/527
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
Hi Stijn,

> On Mar 1, 2017, at 10:46, Stijn Segers wrote:
>
> Hi Baptiste,
>
> Thanks for your input. I found this thread [1] however which suggests
> offloading capabilities:
>
> "The GSW is found in all of the 1000mbit SoCs. it has 5 external ports,
> 1-2 cpu ports and 1 further port that the internal HW offloading engine
> connects to. The switch core used is a MT7530, which also exists as a
> standalone chip. [...]"
>
> Johh, is there a way to disable offloading on MT7530?

If you run “ethtool -k $IFACE” ethtool should show you which offloads the device supports and which can be toggled. Here is an example from a wndr3700v2 (eth1 being the dedicated WAN interface not connected to a switch)

root@router:~# ethtool -k eth1
Features for eth1:
rx-checksumming: off [fixed]
tx-checksumming: off
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off [fixed]
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [fixed]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]

Best Regards
Sebastian

> Thanks!
>
> Stijn
>
> [1] https://lkml.org/lkml/2016/2/26/527
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
Hi Baptiste,

Thanks for your input. However, I found this thread [1], which suggests offloading capabilities:

"The GSW is found in all of the 1000mbit SoCs. it has 5 external ports, 1-2 cpu ports and 1 further port that the internal HW offloading engine connects to. The switch core used is a MT7530, which also exists as a standalone chip. [...]"

John, is there a way to disable offloading on MT7530?

Thanks!

Stijn

[1] https://lkml.org/lkml/2016/2/26/527
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Wed, Mar 01, 2017 at 08:23:13AM +0100, Stijn Segers wrote:
> OK... So I tried to disable offloading on the WAN interface (eth0.2 for the
> DIR-860L), but that throws the following error:
>
> # ethtool -K tso off gso off gro off eth0
> Cannot get device feature names: No such device
>
> Same for any other devices I try... Is there a way to disable offloading
> without ethtool?

I'm not sure the NICs of most embedded devices actually support offloading...
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Tue, 28 Feb 2017 at 11:40, Martin Tippmann wrote:
> On Mon, Feb 27, 2017 at 9:17 PM, Stijn Segers wrote:
>> Okay, so I tracked it down to cake being the culprit. When I disable the
>> Cake SQM instance, no more of those traces, and no more sudden reboots.
>>
>> If I can help debug this, let me know - I enabled a Cake SQM instance on an
>> APU2 and so far that seems to run fine.
>
> cake: Maybe it's related - I'm seeing high cpu usage with cake on
> TP-Link 841N routers even with none, moderate traffic after a while. I
> don't see hanging tasks in the logs but the system feels sluggish even
> it's idle.
>
> filed a bug here:
> https://bugs.lede-project.org/index.php?do=details&task_id=563
> don't have much info through.
>
> The same version is running fine on a MAC1200R with 128MB memory through.
>
> regards
> Martin

OK... So I tried to disable offloading on the WAN interface (eth0.2 for the DIR-860L), but that throws the following error:

# ethtool -K tso off gso off gro off eth0
Cannot get device feature names: No such device

Same for any other devices I try... Is there a way to disable offloading without ethtool?

Cheers

Stijn
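The "No such device" error is consistent with ethtool taking the first argument after -K as the device name, so "tso" was looked up as an interface. A minimal sketch of the corrected argument order follows, written as a dry run that only prints the commands. The function name is hypothetical, and the interface names are examples to be replaced with the ones on the device.

```shell
#!/bin/sh
# Dry run: print the ethtool commands that would switch off TSO, GSO and GRO.
# The device name comes directly after -K, before the feature/value pairs.
offload_off_cmd() {
    echo "ethtool -K $1 tso off gso off gro off"
}

# Example interface names only.
for dev in eth0 eth0.2; do
    offload_off_cmd "$dev"
done
```

Features reported as [fixed] by ethtool -k cannot be changed this way, so a command like this may still report an error for some of them.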
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Tue, 28 Feb 2017 at 1:15, Eric Luehrsen wrote:
> On 02/27/2017 03:17 PM, Stijn Segers wrote:
>> Okay, so I tracked it down to cake being the culprit. When I disable the
>> Cake SQM instance, no more of those traces, and no more sudden reboots.
>>
>> If I can help debug this, let me know - I enabled a Cake SQM instance on an
>> APU2 and so far that seems to run fine.
>>
>> So, in short: .50/51 bump is okay.
>>
>> Cheers
>>
>> Stijn
>
> It wasn't long ago that HFSC was causing issues with 4.4. I am not sure
> it was all worked out. I withdrew HFSC based scripts from SQM despite
> being one of a few proponents otherwise. Maybe something else in the TC
> chain is the root cause. Maybe?
>
> Eric

Dave Taht suggested I disable offloading (with ethtool -K tso off gso off gro off) to see if that changes anything, so I'll be testing that now.

Cheers!

Stijn
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On Mon, Feb 27, 2017 at 9:17 PM, Stijn Segers wrote:
> Okay, so I tracked it down to cake being the culprit. When I disable the
> Cake SQM instance, no more of those traces, and no more sudden reboots.
>
> If I can help debug this, let me know - I enabled a Cake SQM instance on an
> APU2 and so far that seems to run fine.

cake: Maybe it's related - I'm seeing high cpu usage with cake on TP-Link 841N routers after a while, even with no or only moderate traffic. I don't see hanging tasks in the logs but the system feels sluggish even when it's idle.

I filed a bug here: https://bugs.lede-project.org/index.php?do=details&task_id=563
I don't have much info, though.

The same version is running fine on a MAC1200R with 128MB memory, though.

regards
Martin
Re: [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
On 02/27/2017 03:17 PM, Stijn Segers wrote:
> Okay, so I tracked it down to cake being the culprit. When I disable the
> Cake SQM instance, no more of those traces, and no more sudden reboots.
>
> If I can help debug this, let me know - I enabled a Cake SQM instance on
> an APU2 and so far that seems to run fine.
>
> So, in short: .50/51 bump is okay.
>
> Cheers
>
> Stijn

It wasn't long ago that HFSC was causing issues with 4.4. I am not sure it was all worked out. I withdrew HFSC based scripts from SQM despite being one of a few proponents otherwise. Maybe something else in the TC chain is the root cause. Maybe?

Eric
[LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51
Okay, so I tracked it down to cake being the culprit. When I disable the Cake SQM instance, no more of those traces, and no more sudden reboots.

If I can help debug this, let me know - I enabled a Cake SQM instance on an APU2 and so far that seems to run fine.

So, in short: .50/51 bump is okay.

Cheers

Stijn