Hi,
I can test IPv4 forwarding, but in my current runs the plugin is already disabled and all packets are sent to the drop node without any processing, so I think the problem is not coming from the plugin. Also, in VTune I see the DTLB misses coming from a function inside librte_net_ice.so; I can do a RelWithDebInfo build to see exactly which function it is. In another scenario I tried adding one more worker, each worker with 8 queues and 8160 descriptors, and with that I do not see DTLB misses. At first I thought it must be a HW limit, but then again, why does DPDK testpmd have no problem?
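For reference, the per-interface settings I change between runs look roughly like this in my startup.conf (the PCI address and core numbers below are placeholders, not my exact values):

cpu {
  main-core 1
  corelist-workers 2            # one worker; a second core is added here for the two-worker run
}

dpdk {
  dev 0000:XX:00.0 {
    num-rx-queues 8             # varied per run: 1, 2, 4 or 8
    num-rx-desc 8160            # rx ring size per run: 4096 or 8160
  }
}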
Also, for the one-worker test in VTune, the result says I am 60% L1 bound, and in the sub-breakdown 100% of that is DTLB misses, so my understanding is that the L1 bound portion comes entirely from DTLB misses.

On Thu, Dec 18, 2025 at 8:46 PM Benoit Ganne (bganne) via lists.fd.io <[email protected]> wrote:
> Hi Mohammad,
>
> Thanks for the data. Looking at it I do not see anything wrong, apart from
> being ~50% slower than what we typically get with this kind of HW.
> Just to rule out any other issue, could you try to run your test without
> your plugin and just do simple IPv4 forwarding? That would allow a better
> comparison with the CSIT tests.
>
> best
> ben
> ________________________________________
> From: [email protected] <[email protected]> on behalf of mohammad
> mahdi nemati haravani via lists.fd.io <[email protected]>
> Sent: Wednesday, December 17, 2025 15:19
> To: [email protected]
> Subject: Re: [vpp-dev] Huge number of "dTLB-load-misses" when assigning
> more than 1 rx descriptor in VPP
>
> Hi,
> these are the outputs
>
> great
> mohammad
>
> On Wed, Dec 17, 2025 at 4:51 PM Benoit Ganne (bganne) via lists.fd.io
> <[email protected]> wrote:
> Hello,
>
> can you share the output of the following commands while traffic is
> running:
> - vppctl cle run && vppctl sh run
> - vppctl cle err && vppctl sh err
> - vppctl cle hard && vppctl sh hard
>
> And also the output of:
> - vppctl sh log
> - vppctl sh pci
> - vppctl sh thr
>
> Best
> ben
>
> ________________________________________
> From: [email protected] <[email protected]> on behalf of mohammad
> mahdi nemati haravani via lists.fd.io <[email protected]>
> Sent: Wednesday, December 17, 2025 11:20
> To: [email protected]
> Subject: [vpp-dev] Huge number of "dTLB-load-misses" when assigning more
> than 1 rx descriptor in VPP
>
> Hi,
> I was testing VPP on a system with an Intel® Xeon® Gold 6248R processor
> and an Intel E810-C 100G NIC (ice driver).
> When I try to assign more queues to the worker threads, I see DTLB misses
> rise in VTune and perf.
> I even set the system to drop all packets, but it still seems there is a
> problem with the VPP code or configuration.
> For reference, I also ran DPDK testpmd, and based on the testpmd results
> I have this:
>
> VPP
> 1 queue, 8160 rx descriptors -> rx misses & 100% DTLB misses
> 1 queue, 4096 rx descriptors -> no misses & no DTLB misses
> 2 queues, 4096 rx descriptors -> no misses & no DTLB misses (but VPP loops
> per second drop drastically)
> 4 queues, 4096 rx descriptors -> rx misses & 100% DTLB misses
>
> DPDK testpmd
> 1 queue, 8160 rx descriptors -> no misses & no DTLB misses
> 1 queue, 4096 rx descriptors -> no misses & no DTLB misses
> 2 queues, 4096 rx descriptors -> no misses & no DTLB misses
> 4 queues, 4096 rx descriptors -> no misses & no DTLB misses
> 4 queues, 8160 rx descriptors -> no misses & no DTLB misses
> 8 queues, 8160 rx descriptors -> no misses & no DTLB misses
>
> For both tests I use one thread.
> I want to know what causes this, since DPDK testpmd is fine but VPP is not.
> I also use 1G huge pages for both tests.
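For completeness, the testpmd run I compare against and the perf counters I look at are along these lines (core list and PCI address are placeholders, and the queue/descriptor numbers change per run):

# DPDK testpmd, 1 forwarding core, 4 queues, 8160-entry rx/tx rings
dpdk-testpmd -l 1,2 -n 4 -a 0000:XX:00.0 -- \
  --nb-cores=1 --rxq=4 --txq=4 --rxd=8160 --txd=8160 --forward-mode=io

# dTLB counters on the forwarding core while traffic is running
perf stat -e dTLB-loads,dTLB-load-misses -C 2 -- sleep 10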
