Not really relevant to this thread, probably, was this very good article on scaling linux to many cores:
https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-cores/

I still like the idea of making single threaded cpus better, but only
the millcomputer even comes close to trying, effectively.

On Wed, Apr 27, 2016 at 12:45 PM, Dave Taht <[email protected]> wrote:
> On Wed, Apr 27, 2016 at 12:32 PM, Stephen Hemminger
> <[email protected]> wrote:
>> DPDK gets impressive performance on large systems (like 14M packets/sec per
>> core), but not convinced on smaller systems.
>
> My take on dpdk has been mostly that it's a great way to heat data
> centers. Still I would really like to see these advanced algorithms
> (cake, pie, fq_codel, htb) tested on it at these higher speeds.
>
> And I still have great hope for cheap, FPGA-assisted designs that
> could one day be turned into asics, but not as much as I did last year
> when I first started fiddling with the meshsr onenetswitch. I really
> wish I could find a few good EE's to tackle making something fq_codel
> like work on the netfpga project, the proof of concept verilog already
> exists for DRR and AQM technologies.
>
>> Performance depends on having good CPU cache. I get poor performance on Atom
>> etc.
>
> I had hoped that the rangeley class atoms would do better on dpdk, as
> they do I/O direct to cache. I am not sure which processors that is
> actually in, anymore.
>
>> Also driver support is limited (mostly 10G and above)
>
> Well, as we push end-user class devices to 1GigE, we are having issues
> with overuse of offloads to get there, and in terms
> of PPS, certainly pushing small packets is becoming a problem, on
> ethernet and wifi. I would like to see a 100 dollar router that could
> do full PPS at that speed, feeding fiber and going over 802.11ac, and
> we are quite far from there. I see, for example, that meraki is using
> click (I think) to push more processing into userspace.
>
> Also the time for a packet to transit linux from read to write is
> "interesting".
> Last I looked it was something like 42 function calls
> in the path to "get there", and some of my benchmarks on both the c2
> and apu2 are showing that that time is significant enough for fq_codel
> to start kicking in to compensate. (which is kind of cool to see the
> packet processing adapt to the cpu load, actually - and I still long
> for timestamping on rx directly to adapt ever better)
>
> I have also acquired a mild dislike for seeing stuff like this:
>
> where the tx and rx rings are cleaned up in the same thread and there
> is only one interrupt line for both.
>
>  51:         18      59244     253350     314273   PCI-MSI 1572865-edge      enp3s0-TxRx-0
>  52:          5     484274     141746     197260   PCI-MSI 1572866-edge      enp3s0-TxRx-1
>  53:          9     152225      29943     436749   PCI-MSI 1572867-edge      enp3s0-TxRx-2
>  54:         22      54327     299670     360356   PCI-MSI 1572868-edge      enp3s0-TxRx-3
>  56:     525343     513165    2355680     525593   PCI-MSI 2097152-edge      ath10k_pci
>
> and the ath10k only uses one interrupt. Maybe I'm wrong on my
> assumptions, I'd think in today's multi-core environment that
> processing tx and rx separately might be a win. (?)
>
> I keep hoping for on-board assist for routing table lookups on
> something - your classic cam - for example. I saw today that there has
> been some work on getting source specific routing into dpdk, which
> makes me happy -
>
> https://www.ietf.org/proceedings/95/slides/slides-95-hackathon-18.pdf
>
> which is, incidentally, where I found the reference to the vpp stuff.
>
> https://www.ietf.org/blog/author/jari/
>
>>
>> On Wed, Apr 27, 2016 at 12:28 PM, Aaron Wood <[email protected]> wrote:
>>>
>>> I'm looking at DPDK for a project, but I think I can make substantial
>>> gains with just AF_PACKET + FANOUT and SO_REUSEPORT. It's not clear to me
>>> yet how much DPDK is going to gain over those (and those can go a long way
>>> on higher-powered platforms).
>>>
>>> On lower-end systems, I'm more suspicious of the memory bus (and the cache
>>> in particular), than I am the raw CPU power.
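
For anyone wanting to poke at the SO_REUSEPORT half of that, here is a minimal sketch, assuming Linux and UDP on loopback. It only shows several sockets sharing one port so the kernel spreads flows across them; it does not cover AF_PACKET fanout (which needs raw-socket privileges), and the helper names (open_reuseport_sockets, drain, demo) are made up for illustration.

```python
import select
import socket
import time


def open_reuseport_sockets(n, host="127.0.0.1", port=0):
    """Open n UDP sockets all bound to the same (host, port) via SO_REUSEPORT."""
    socks = []
    for _ in range(n):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # The option has to be set on every socket before bind(), the first included.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        # With port=0 the first bind picks a free port; the rest reuse it.
        port = s.getsockname()[1]
        s.setblocking(False)
        socks.append(s)
    return socks, port


def drain(socks, expected, timeout=1.0):
    """Count datagrams received per socket until `expected` arrive or we time out."""
    counts = [0] * len(socks)
    deadline = time.monotonic() + timeout
    while sum(counts) < expected and time.monotonic() < deadline:
        readable, _, _ = select.select(socks, [], [], 0.05)
        for s in readable:
            try:
                while True:
                    s.recv(2048)
                    counts[socks.index(s)] += 1
            except BlockingIOError:
                pass  # this socket's queue is empty for now
    return counts


def demo(n_workers=4, n_flows=32):
    """Send one datagram per 'flow' (a fresh source port each) and tally receivers."""
    socks, port = open_reuseport_sockets(n_workers)
    for _ in range(n_flows):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
            tx.sendto(b"x", ("127.0.0.1", port))
    counts = drain(socks, n_flows)
    for s in socks:
        s.close()
    return port, counts


if __name__ == "__main__":
    port, counts = demo()
    print("port", port, "datagrams per socket:", counts)
```

In a real setup each socket would presumably live in its own thread or process pinned to a core. How evenly the flows land depends on the kernel's hash of the addresses and ports, so the per-socket counts will vary from run to run.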
>>>
>>> -Aaron
>>>
>>> On Wed, Apr 27, 2016 at 11:57 AM, Dave Taht <[email protected]> wrote:
>>>>
>>>> https://fd.io/technology seems to have come a long way.
>>>>
>>>> --
>>>> Dave Täht
>>>> Let's go make home routers and wifi faster! With better software!
>>>> http://blog.cerowrt.org
>>>> _______________________________________________
>>>> Bloat mailing list
>>>> [email protected]
>>>> https://lists.bufferbloat.net/listinfo/bloat
>>>
>>> _______________________________________________
>>> Cake mailing list
>>> [email protected]
>>> https://lists.bufferbloat.net/listinfo/cake
>>>
>>
>

--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
_______________________________________________
Bloat mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/bloat
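
P.S. On the "full PPS" point in the quoted text, a rough back-of-envelope sketch may help put numbers on it. It assumes standard Ethernet framing overhead (8 bytes of preamble/SFD plus a 12-byte inter-frame gap on top of each frame) and minimum-size 64-byte frames. Reading the "14M packets/sec" figure as roughly 10GbE line rate is an inference here, not something stated in the thread, and the function names are made up for illustration.

```python
PREAMBLE_SFD_BYTES = 8     # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # mandatory idle time between frames
MIN_FRAME_BYTES = 64       # smallest Ethernet frame, FCS included


def line_rate_pps(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Maximum frames per second a link can carry at a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


def per_packet_budget_ns(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Time available to process each frame if the link is kept full."""
    return 1e9 / line_rate_pps(link_bps, frame_bytes)


if __name__ == "__main__":
    for name, bps in (("1GigE", 1e9), ("10GigE", 10e9)):
        print(name, round(line_rate_pps(bps)), "pps,",
              round(per_packet_budget_ns(bps), 1), "ns per packet")
```

At 1GigE that works out to about 1.49M packets/sec, or 672 ns per minimum-size packet, which is the budget a cheap router would have to meet across the whole read-to-write path.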
