> Analyzing the memory layout with gdb for large structures is time-consuming
> and not usually recommended. I would suggest using poke-a-hole (pahole),
> which helps to understand and fix the structures in no time. With pahole it
> is going to be a lot easier to work with large structures in particular.

Thanks for the pointer. I'll have a look at pahole. It doesn't affect my
reasoning against compacting struct dp_netdev_pmd_thread, though.
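For reference, here is a minimal sketch of the kind of report pahole
produces. The struct below is a made-up example, not the actual
dp_netdev_pmd_thread layout:

    /* toy_struct.c -- compile with -g so pahole can read the DWARF info */
    #include <stdint.h>

    struct toy {
        uint8_t  flag;     /* offset 0, size 1                     */
                           /* pahole flags a "7 bytes hole" here   */
        uint64_t counter;  /* offset 8, size 8                     */
        uint32_t id;       /* offset 16, size 4                    */
                           /* 4 bytes of trailing padding          */
    };                     /* size 24; reordering members gives 16 */

    struct toy t;          /* ensure the type lands in debug info  */

    /* Typical usage:
     *   gcc -g -c toy_struct.c
     *   pahole -C toy toy_struct.o
     * pahole prints each member with its offset and size and marks
     * holes and padding, which makes repacking decisions easy. */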
> > Finally, even for x86 there is not even a performance improvement. I
> > re-ran our standard L3VPN over VXLAN performance PVP test on master and
> > with Ilya's revert patch:
> >
> > Flows    master    reverted
> > 8        4.46      4.48
> > 100      4.27      4.29
> > 1000     4.07      4.07
> > 2000     3.68      3.68
> > 5000     3.03      3.03
> > 10000    2.76      2.77
> > 20000    2.64      2.65
> > 50000    2.60      2.61
> > 100000   2.60      2.61
> > 500000   2.60      2.61
>
> What are the CFLAGS in this case, as they seem to make a difference? I have
> added my findings here for a different patch targeted at performance:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341270.html

I'm compiling with "-O3 -msse4.2" to be in line with production deployments
of OVS-DPDK that need to run on a wider family of Xeon generations.

> Patches to consider when testing your use case:
>
> Xzalloc_cacheline:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341231.html
>
> (If using output batching)
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341230.html

I didn't use these. Tx batching is not relevant here. And I understand the
xzalloc_cacheline patch alone does not guarantee that the allocated memory
is indeed cache line-aligned.
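On the alignment point, a quick sketch of why a plain allocation is not
necessarily cache line-aligned, and how an aligned allocator guarantees it.
This is a generic illustration assuming 64-byte cache lines, not the actual
OVS code:

    #define _POSIX_C_SOURCE 200112L
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define CACHE_LINE_SIZE 64

    int main(void)
    {
        /* malloc() only guarantees alignment suitable for the largest
         * built-in type (typically 16 bytes), so landing on a 64-byte
         * boundary is a matter of luck. */
        void *p = malloc(128);
        printf("malloc:         %p -> 64-byte aligned: %s\n",
               p, (uintptr_t) p % CACHE_LINE_SIZE ? "no" : "yes");
        free(p);

        /* posix_memalign() guarantees the requested alignment. */
        void *q;
        if (posix_memalign(&q, CACHE_LINE_SIZE, 128) == 0) {
            printf("posix_memalign: %p -> 64-byte aligned: yes\n", q);
            free(q);
        }
        return 0;
    }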
Thx,
Jan