On 2022-06-28 17:24, Honnappa Nagarahalli wrote: > <snip> > >>>>>>>> >>>>>>>>> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com] >>>>>>>>> Sent: Monday, 27 June 2022 13.06 >>>>>>>>> >>>>>>>>> Hi. >>>>>>>>> >>>>>>>>> Is it safe to enable stats on MT safe services? >>>>>>>>> >>>>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122- >> 313273af >>>>>>>>> - >>>> 4 >>>>>>>>> 54445555731-6096fdb16385f88f&q=1&e=27b94605-d1e2-40b6- >> af6d- >>>> 9ebc54d >>>>>>>>> >>>> >> 5db18&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain% >>>> 2Flib >>>>>>>>> %2Feal%2Fcommon%2Frte_service.c%23 >>>>>>>>> L3 >>>>>>>>> 6 >>>>>>>>> 6 >>>>>>>>> >>>>>>>>> It seems to me this would have to be an __atomic_add for this >>>>>>>>> code to produce deterministic results. >>>>>>>> >>>>>>>> I agree. The same goes for the 'calls' field. >>>>>>> The calling function does the locking. >>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af- >>>> 454 >>>>>>> 445555731-5ce419f8bf9f9b76&q=1&e=27b94605-d1e2-40b6-af6d- >>>> 9ebc54d5db1 >>>>>>> >>>> >> 8&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib >>>> %2Feal >>>>>>> %2Fcommon%2Frte_service.c%23L3 >>>>>>> 98 >>>>>>> >>>>>>> For more information you can look at: >>>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af- >>>> 454 >>>>>>> 445555731-ba0d1416f08856f0&q=1&e=27b94605-d1e2-40b6-af6d- >>>> 9ebc54d5db1 >>>>>>> >>>> >> 8&u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib >>>> %2Feal >>>>>>> %2Finclude%2Frte_service.h%23L >>>>>>> 120 >>>>>>> >>>>>> >>>>>> What about the >>>>>> https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af- >>>> 4544 >>>>>> 45555731-b64334addc78c264&q=1&e=27b94605-d1e2-40b6-af6d- >>>> 9ebc54d5db18& >>>>>> >>>> >> u=https%3A%2F%2Fgithub.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib%2 >>>> Feal%2F >>>>>> common%2Frte_service.c%23L404 >>>>>> call (for MT safe services)? >>>>>> >>>>>> There's no lock held there. >>>>> Good point. >>>>> This is the case where the service running in service cores is MT >>>>> safe. However, >>>> the stats are incremented outside of the MT Safety mechanism employed >>>> by the service. So, yes, this and other updates in the function >>>> 'service_runner_do_callback' need to be updated atomically. >>>> >>>> Maybe a better solution would be to move this to the core_state >>>> struct (and eliminate the "calls" field since the same information is >>>> already in the core_state struct). The contention on these cache >>>> lines will be pretty crazy for services with short run time (say a >>>> thousand cycles or less per call), assuming they are mapped to many cores. >>> That's one option, the structures are internal as well. With this option >>> stats >> need to be aggregated which will not give an accurate view. But, that is the >> nature of the statistics. >>> >> >> Per-core counters is a very common pattern. Used for Linux MIB counters, for >> example. I'm not sure I think it's much less accurate. I mean, you just load >> in >> quick succession what's globally visible for the different per-lcore >> counters. If >> you do a relaxed store on an ARM, how long time does it take until it's seen >> by >> someone doing a relaxed load on a different core? Roughly. > I think my explanation of the problem is not clear. > > If a service is running on more than one core and the stats are per core, > when we aggregate, the resulting statistics is not atomic. By making the > stats per core, we will be taking out that feature which is present currently > (even though it is implemented incorrectly). As we agree, the proposed change > is a common pattern and taking away the atomicity of the stats might not be a > problem. >
Isn't it just a push model, versus a pull one? Both give just an approximation, albeit a very good one, of how many cycles are spent "now" for a particular service. Isn't time a local phenomena in a SMP system, and there is no global "now"? Maybe you can achieve such with a transaction or handshake of some sort, but I don't see how the an __atomic_add would be enough. I was fortunate to get some data from a real-world application, and enabling service core stats resulted in a 7% degradation of overall system capacity. I'm guessing atomic instructions would not make things better. >> >>> I am also wondering if these stats are of any use other than for debugging. >> Adding a capability to disable stats might help as well. >>> >> >> They could be used as a crude tool to determine service core utilization. >> Comparing utilization between different services running on the same core >> should be straight-forward, but lcore utilization is harder in absolute >> terms. If >> you just look at "cycles", a completely idle core would look like it's very >> busy >> (basically rdtsc latency added for every loop). I assume you'd have to do >> some >> heuristic based on both "calls" and "cycles" to get an estimate. >> >> I think service core utilization would be very useful, although that would >> require >> some changes in the service function signature, so the service can report >> back if >> it did some useful work for a particular call. >> >> That would make for a DPDK 'top'. Just like 'top', it can't impose any >> serious >> performance degradation when used, to be really useful, I think. >> >> Sure, it should be possible to turn it on and off. I thought that was the >> case >> already? > Thanks, yes, this exists already. Though the 'loops' counter is out of the > stats enable check, looks like it is considered as an attribute for some > reason. > >> >>>> >>>> Idle service cores will basically do nothing else than stall waiting >>>> for these lines, I suspect, hampering the progress of more busy cores. >