<snip> > >>>>>>>> > >>>>>>>>> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com] > >>>>>>>>> Sent: Monday, 27 June 2022 13.06 > >>>>>>>>> > >>>>>>>>> Hi. > >>>>>>>>> > >>>>>>>>> Is it safe to enable stats on MT safe services? > >>>>>>>>> > >>>>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122- > >> 313273af > >>>>>>>>> - > >>>> 4 > >>>>>>>>> 54445555731-6096fdb16385f88f&q=1&e=27b94605-d1e2-40b6- > >> af6d- > >>>> 9ebc54d > >>>>>>>>> > >>>> > >> > 5db18&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain% > >>>> 2Flib > >>>>>>>>> %2Feal%2Fcommon%2Frte_service.c%23 > >>>>>>>>> L3 > >>>>>>>>> 6 > >>>>>>>>> 6 > >>>>>>>>> > >>>>>>>>> It seems to me this would have to be an __atomic_add for this > >>>>>>>>> code to produce deterministic results. > >>>>>>>> > >>>>>>>> I agree. The same goes for the 'calls' field. > >>>>>>> The calling function does the locking. > >>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122- > 313273af > >>>>>>> - > >>>> 454 > >>>>>>> 445555731-5ce419f8bf9f9b76&q=1&e=27b94605-d1e2-40b6-af6d- > >>>> 9ebc54d5db1 > >>>>>>> > >>>> > >> > 8&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib > >>>> %2Feal > >>>>>>> %2Fcommon%2Frte_service.c%23L3 > >>>>>>> 98 > >>>>>>> > >>>>>>> For more information you can look at: > >>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122- > 313273af > >>>>>>> - > >>>> 454 > >>>>>>> 445555731-ba0d1416f08856f0&q=1&e=27b94605-d1e2-40b6- > af6d- > >>>> 9ebc54d5db1 > >>>>>>> > >>>> > >> > 8&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib > >>>> %2Feal > >>>>>>> %2Finclude%2Frte_service.h%23L > >>>>>>> 120 > >>>>>>> > >>>>>> > >>>>>> What about the > >>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122- > 313273af- > >>>> 4544 > >>>>>> 45555731-b64334addc78c264&q=1&e=27b94605-d1e2-40b6-af6d- > >>>> 9ebc54d5db18& > >>>>>> > >>>> > >> > u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib%2 > >>>> Feal%2F > >>>>>> common%2Frte_service.c%23L404 > >>>>>> call (for MT safe services)? > >>>>>> > >>>>>> There's no lock held there. > >>>>> Good point. > >>>>> This is the case where the service running in service cores is MT > >>>>> safe. However, > >>>> the stats are incremented outside of the MT Safety mechanism > >>>> employed by the service. So, yes, this and other updates in the > >>>> function 'service_runner_do_callback' need to be updated atomically. > >>>> > >>>> Maybe a better solution would be to move this to the core_state > >>>> struct (and eliminate the "calls" field since the same information > >>>> is already in the core_state struct). The contention on these cache > >>>> lines will be pretty crazy for services with short run time (say a > >>>> thousand cycles or less per call), assuming they are mapped to many > cores. > >>> That's one option, the structures are internal as well. With this > >>> option stats > >> need to be aggregated which will not give an accurate view. But, that > >> is the nature of the statistics. > >>> > >> > >> Per-core counters is a very common pattern. Used for Linux MIB > >> counters, for example. I'm not sure I think it's much less accurate. > >> I mean, you just load in quick succession what's globally visible for > >> the different per-lcore counters. If you do a relaxed store on an > >> ARM, how long time does it take until it's seen by someone doing a relaxed > load on a different core? Roughly. > > I think my explanation of the problem is not clear. > > > > If a service is running on more than one core and the stats are per core, > > when > we aggregate, the resulting statistics is not atomic. By making the stats per > core, > we will be taking out that feature which is present currently (even though it > is > implemented incorrectly). As we agree, the proposed change is a common > pattern and taking away the atomicity of the stats might not be a problem. > > > > Isn't it just a push model, versus a pull one? Both give just an > approximation, > albeit a very good one, of how many cycles are spent "now" for a particular > service. Isn't time a local phenomena in a SMP system, and there is no global > "now"? Maybe you can achieve such with a transaction or handshake of some > sort, but I don't see how the an __atomic_add would be enough. If we consider a global time line of events, using atomic operation will provide a single 'now' from the reader's perspective (of course there might be writers waiting to update). Without the atomic operations, there will not be a single 'now' from reader's perspective, there will be multiple read events on the timeline.
> > I was fortunate to get some data from a real-world application, and enabling > service core stats resulted in a 7% degradation of overall system capacity. > I'm > guessing atomic instructions would not make things better. Is the service running on multiple cores? > > >> > >>> I am also wondering if these stats are of any use other than for > >>> debugging. > >> Adding a capability to disable stats might help as well. > >>> > >> > >> They could be used as a crude tool to determine service core utilization. > >> Comparing utilization between different services running on the same core > >> should be straight-forward, but lcore utilization is harder in absolute > >> terms. If > >> you just look at "cycles", a completely idle core would look like it's > >> very busy > >> (basically rdtsc latency added for every loop). I assume you'd have to do > some > >> heuristic based on both "calls" and "cycles" to get an estimate. > >> > >> I think service core utilization would be very useful, although that would > require > >> some changes in the service function signature, so the service can report > back if > >> it did some useful work for a particular call. > >> > >> That would make for a DPDK 'top'. Just like 'top', it can't impose any > >> serious > >> performance degradation when used, to be really useful, I think. > >> > >> Sure, it should be possible to turn it on and off. I thought that was the > >> case > >> already? > > Thanks, yes, this exists already. Though the 'loops' counter is out of the > > stats > enable check, looks like it is considered as an attribute for some reason. > > > >> > >>>> > >>>> Idle service cores will basically do nothing else than stall waiting > >>>> for these lines, I suspect, hampering the progress of more busy cores. > >