>
>> Analyzing the memory layout with gdb for large structures is time
>> consuming and not usually recommended.
>> I would suggest using Poke-a-hole(pahole) and that helps to understand
>> and fix the structures in no time.
>> With pahole it's going to be lot easier to work with large structures
>> especially.
>
>Thanks for the pointer. I'll have a look at pahole.
>It doesn't affect my reasoning against optimizing the compactification of
>struct dp_netdev_pmd_thread, though.
>
>> >Finally, even for x86 there is not even a performance improvement. I
>> >re-ran our standard L3VPN over VXLAN performance PVP test on master
>> >and with Ilya's revert patch:
>> >
>> >Flows      master    reverted
>> >8,         4.46      4.48
>> >100,       4.27      4.29
>> >1000,      4.07      4.07
>> >2000,      3.68      3.68
>> >5000,      3.03      3.03
>> >10000,     2.76      2.77
>> >20000,     2.64      2.65
>> >50000,     2.60      2.61
>> >100000,    2.60      2.61
>> >500000,    2.60      2.61
>>
>> What are the CFLAGS in this case, as they seem to make difference. I
>> have added my finding here for a different patch targeted at performance
>>
>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341270.html
>
>I'm compiling with "-O3 -msse4.2" to be in line with production deployments
>of OVS-DPDK that need to run on a wider family of Xeon generations.

Thanks for this. AFAIK, specifying '-msse4.2' alone does not let you use
builtin_popcnt(). One way to enable it is to add '-mpopcnt' to CFLAGS or
to build with '-march=native'.
(This is slightly off-topic for this thread and just FYI. Ignore it if
you only want to use intrinsics and not the builtin popcnt.)

>
>>
>> Patches to consider when testing your use case:
>>      Xzalloc_cachline:
>>      https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341231.html
>>      (If using output batching)
>>      https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341230.html
>
>I didn't use these. Tx batching is not relevant here. And I understand the
>xzalloc_cacheline patch alone does not guarantee that the allocated memory
>is indeed cache line-aligned.

At least with POSIX_MEMALIGN, the address will be aligned to 64 bytes and
start at a CACHE_LINE_SIZE boundary.
I have yet to check and test Ben's new patch.

- Bhanuprakash.

>
>Thx, Jan
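
For reference, the posix_memalign() alignment point discussed in this
thread can be sketched as follows. This is only an illustration and not
the code from the xzalloc_cacheline patch: the helper names
alloc_cacheline_aligned() and is_cacheline_aligned() are hypothetical,
and the 64-byte CACHE_LINE_SIZE is assumed from the figure quoted in the
thread.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign() under strict -std */

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 64-byte cache lines, as quoted in the thread. */
#define CACHE_LINE_SIZE 64

/* Hypothetical helper (not the OVS implementation): returns zeroed memory
 * whose start address is a multiple of CACHE_LINE_SIZE, or NULL on
 * failure.  The caller releases it with free(). */
void *
alloc_cacheline_aligned(size_t size)
{
    void *p = NULL;

    /* posix_memalign() requires a power-of-two alignment that is also a
     * multiple of sizeof(void *). */
    assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0);
    assert(CACHE_LINE_SIZE % sizeof(void *) == 0);

    /* Returns 0 on success; request at least one byte so that a zero
     * size still yields a usable pointer. */
    if (posix_memalign(&p, CACHE_LINE_SIZE, size ? size : 1) != 0) {
        return NULL;
    }
    memset(p, 0, size);
    return p;
}

/* Returns nonzero if 'p' starts on a cache line boundary. */
int
is_cacheline_aligned(const void *p)
{
    return ((uintptr_t) p % CACHE_LINE_SIZE) == 0;
}
```

A caller would allocate with alloc_cacheline_aligned(sizeof(struct foo)),
check the result with is_cacheline_aligned(), and release it with free().
Whether a fallback allocator gives the same guarantee on platforms
without posix_memalign() is exactly the open question raised above, and
this sketch does not address it.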
