On Wed, Jun 24, 2026 at 10:59 AM Morten Brørup <[email protected]> 
wrote:
>
> +Pavan Nikhilesh, +Stephen Hemminger, +Wathsala Vithanage, +Bruce Richardson, 
> +Thomas Monjalon
>
> > From: saeed bishara [mailto:[email protected]]
> > Sent: Tuesday, 23 June 2026 16.11
> >
> > > > also, instead of adding a cache line for this profiling data, can we
> > > > share with line 1, which is used solely for xstats?
> > >
> > > This profiling data is 4 indexes * 2 values * 8-byte fields, so one
> > > cache line in itself.
> > Makes sense.
> > btw, the default value of RTE_GRAPH_BURST_SIZE is 256, I suspect that
> > real applications will enforce smaller burst when pulling from input
> > devices (e.g. 32). Do you expect such cases to change
> > RTE_GRAPH_BURST_SIZE?
>
> Excellent question! I don't know.
> They should. E.g. an application optimized for latency should certainly not 
> process bursts of 256 objects.
>
> IMO, the root problem is the lack of a unified burst size across DPDK, which 
> causes every library to be designed with its own optimal burst size.
> E.g. the Mbuf library uses 64 (for rte_pktmbuf_free_bulk()), and the Graph 
> library uses 256.
>
> There has been an attempt at introducing a unified burst size [1] for DPDK, 
> but it met a lot of resistance, so it still needs to be refined before we can 
> reach a conclusion.
> The drivers supposedly can report an "optimal" burst size at run-time, which 
> the application can then use. But the application is unable to configure its 
> internal burst sizes if one driver reports 64 and another reports 32.
> I'm strongly in favor of a build time constant, used across DPDK. The default 
> value should work reasonably well across drivers and libraries.
> And if an application wants to optimize for performance (either throughput or 
> latency), the developer should experiment to find the optimal value.
> Furthermore, designing for a build time constant max burst size throughout 
> DPDK might provide performance benefits in itself, as the compiler can 
> optimize for this.
>
> [1]: https://inbox.dpdk.org/dev/[email protected]/
>
> Now, back to your question...
> As a workaround, I can sample Graph node performance data for 32 objects, 
> instead of sampling for RTE_GRAPH_BURST_SIZE / 2.
I see, so there is no simple static parameter here. What about
tracking the max burst and reporting the calls/cycles for that case?
The user would then also see what that max burst was and how often it
occurred.
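
Roughly what I have in mind, as an untested sketch. The struct and
function names below are made up for illustration and are not part of
the rte_graph API. I assume the cycle count passed in is whatever delta
the node already measures around its process call (e.g. from
rte_rdtsc()).

```c
#include <stdint.h>

/* Hypothetical per-node record; not an existing rte_graph structure. */
struct node_max_burst_prof {
	uint16_t max_burst;   /* largest burst size seen so far */
	uint64_t calls;       /* calls that processed exactly max_burst objects */
	uint64_t cycles;      /* cycles spent in those calls */
	uint64_t total_calls; /* all calls, to derive how often max_burst occurs */
};

/* Hypothetical helper: account one node invocation of nb_objs objects. */
static inline void
node_max_burst_prof_update(struct node_max_burst_prof *p,
			   uint16_t nb_objs, uint64_t cycles)
{
	p->total_calls++;

	if (nb_objs == 0)
		return; /* empty calls never define the max burst */

	if (nb_objs > p->max_burst) {
		/* New maximum observed: restart the sample for it. */
		p->max_burst = nb_objs;
		p->calls = 0;
		p->cycles = 0;
	}

	if (nb_objs == p->max_burst) {
		p->calls++;
		p->cycles += cycles;
	}
}
```

Reporting would then show max_burst, cycles / calls for that burst
size, and calls / total_calls as the fraction of invocations that
reached it. One drawback of this sketch is that a single early outlier
burst resets the sample and may then rarely be hit again.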

saeed
