<snip>
>>>>>>>>>>>> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
>>>>>>>>>>>> Sent: Monday, 27 June 2022 13.06
>>>>>>>>>>>>
>>>>>>>>>>>> Hi.
>>>>>>>>>>>>
>>>>>>>>>>>> Is it safe to enable stats on MT safe services?
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/DPDK/dpdk/blob/main/lib/eal/common/rte_service.c#L366
>>>>>>>>>>>>
>>>>>>>>>>>> It seems to me this would have to be an __atomic_add for this
>>>>>>>>>>>> code to produce deterministic results.
>>>>>>>>>>>
>>>>>>>>>>> I agree. The same goes for the 'calls' field.
>>>>>>>>>>
>>>>>>>>>> The calling function does the locking.
>>>>>>>>>> https://github.com/DPDK/dpdk/blob/main/lib/eal/common/rte_service.c#L398
>>>>>>>>>>
>>>>>>>>>> For more information you can look at:
>>>>>>>>>> https://github.com/DPDK/dpdk/blob/main/lib/eal/include/rte_service.h#L120
>>>>>>>>>
>>>>>>>>> What about the
>>>>>>>>> https://github.com/DPDK/dpdk/blob/main/lib/eal/common/rte_service.c#L404
>>>>>>>>> call (for MT safe services)? There's no lock held there.
>>>>>>>>
>>>>>>>> Good point. This is the case where the service running on the
>>>>>>>> service cores is MT safe. However, the stats are incremented
>>>>>>>> outside of the MT safety mechanism employed by the service. So,
>>>>>>>> yes, this and the other updates in the function
>>>>>>>> 'service_runner_do_callback' need to be done atomically.
>>>>>>>
>>>>>>> Maybe a better solution would be to move this to the core_state
>>>>>>> struct (and eliminate the "calls" field, since the same information
>>>>>>> is already in the core_state struct). The contention on these cache
>>>>>>> lines will be pretty crazy for services with a short run time (say
>>>>>>> a thousand cycles or less per call), assuming they are mapped to
>>>>>>> many cores.
>>>>>>
>>>>>> That's one option; the structures are internal as well. With this
>>>>>> option the stats need to be aggregated, which will not give an
>>>>>> accurate view. But that is the nature of statistics.
>>>>>
>>>>> Per-core counters are a very common pattern, used for the Linux MIB
>>>>> counters, for example. I'm not sure it's much less accurate. I mean,
>>>>> you just load, in quick succession, what's globally visible for the
>>>>> different per-lcore counters. If you do a relaxed store on an ARM,
>>>>> how long does it take until it's seen by someone doing a relaxed load
>>>>> on a different core? Roughly.
>>>>
>>>> I think my explanation of the problem is not clear.
>>>>
>>>> If a service is running on more than one core and the stats are per
>>>> core, the resulting statistics will not be atomic when we aggregate
>>>> them.
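[For reference, the per-lcore counter pattern being discussed could be sketched roughly as below, using the GCC __atomic built-ins the thread refers to. The struct layout and names here are illustrative only, not the actual rte_service internals: the owning lcore publishes its counters with relaxed stores, and the reader aggregates with relaxed loads.]

```c
#include <stdint.h>

#define MAX_LCORES 128

/* One statistics block per lcore, each padded to its own cache line
 * to avoid false sharing between service cores. */
struct lcore_stats {
	uint64_t calls;
	uint64_t cycles;
} __attribute__((aligned(64)));

static struct lcore_stats stats[MAX_LCORES];

/* Producer: only the owning lcore writes its slot, so a plain read
 * plus a relaxed atomic store is enough to make the new value
 * globally visible without any read-modify-write instruction. */
static inline void
stats_update(unsigned int lcore_id, uint64_t cycles)
{
	struct lcore_stats *s = &stats[lcore_id];

	__atomic_store_n(&s->calls, s->calls + 1, __ATOMIC_RELAXED);
	__atomic_store_n(&s->cycles, s->cycles + cycles, __ATOMIC_RELAXED);
}

/* Consumer: aggregate by loading each per-lcore value in quick
 * succession; the sum approximates, but is not, a single instant. */
static uint64_t
stats_total_calls(void)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < MAX_LCORES; i++)
		sum += __atomic_load_n(&stats[i].calls, __ATOMIC_RELAXED);
	return sum;
}
```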
>>>> By making the stats per core, we will be taking out a feature which
>>>> is present currently (even though it is implemented incorrectly). As
>>>> we agree, the proposed change is a common pattern, and taking away
>>>> the atomicity of the stats might not be a problem.
>>>
>>> Isn't it just a push model, versus a pull one? Both give just an
>>> approximation, albeit a very good one, of how many cycles are spent
>>> "now" for a particular service. Isn't time a local phenomenon in an
>>> SMP system, with no global "now"? Maybe you can achieve such a thing
>>> with a transaction or handshake of some sort, but I don't see how an
>>> __atomic_add would be enough.
>>
>> If we consider a global timeline of events, using atomic operations
>> will provide a single 'now' from the reader's perspective (of course,
>> there might be writers waiting to update). Without the atomic
>> operations, there will not be a single 'now' from the reader's
>> perspective; there will be multiple read events on the timeline.
>
> At the time of the read operation (in the global counter solution),
> there may well be cycles consumed or calls having been made, but not
> yet posted. The window between the call having been made and the global
> counter having been incremented (and thus made globally visible) is
> small, but non-zero.

Agree. The read value is the atomic state of the system at a given
instant (when the read was executed), though that instant happened a few
cycles back. (Just to be clear, I am fine with per-core counters.)
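[For contrast, a minimal sketch of the shared-counter variant under discussion. A single location updated with __atomic_fetch_add lets the reader observe one consistent value with a single load, i.e. one read event, one 'now', on the global timeline. Function names here are made up for illustration:]

```c
#include <stdint.h>

/* Shared-counter variant: every service core increments the same
 * location. __atomic_fetch_add makes each increment indivisible, at
 * the cost of all cores contending for this one cache line. */
static uint64_t total_calls;

static inline void
global_stats_update(void)
{
	__atomic_fetch_add(&total_calls, 1, __ATOMIC_RELAXED);
}

static inline uint64_t
global_stats_read(void)
{
	/* One load == one read event. With per-core counters the
	 * reader instead issues one load per lcore, i.e. several read
	 * events spread out in time. */
	return __atomic_load_n(&total_calls, __ATOMIC_RELAXED);
}
```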
> (The core-local counter solution also uses atomic operations, although
> not __atomic_add, but a store for the producer and a load for the
> consumer.)

Agree, we need atomic operations. I am not sure if __atomic_fetch_add or
__atomic_store_n would make a large difference; __atomic_fetch_add would
result in fewer instructions. I am fine with either.

>>> I was fortunate to get some data from a real-world application, and
>>> enabling service core stats resulted in a 7% degradation of overall
>>> system capacity. I'm guessing atomic instructions would not make
>>> things better.
>>
>> Is the service running on multiple cores?

> Yes. I think something like 6 cores were used in this case. The effect
> will grow with the core count, obviously. On a large system, I don't
> think you will do much else but fight for this cache line.
>
> If you want to post counter updates to some shared data structure, you
> need to batch the updates to achieve reasonable efficiency. That will,
> obviously, be at the cost of accuracy, since there will be a
> significant delay between the local counter increment and the post in
> the global data structure. The system will be much less able to answer
> how many cycles had been consumed at a particular point in time.
>
> For really large counter sets, the above batched-update approach may be
> required: you simply can't afford the memory needed to duplicate the
> counter struct across all cores in the system. In my experience, this
> can still be made to meet real-world counter accuracy requirements
> (accuracy in the time dimension).

>>>>>>
>>>>>> I am also wondering if these stats are of any use other than for
>>>>>> debugging. Adding a capability to disable stats might help as well.
>>>>>
>>>>> They could be used as a crude tool to determine service core
>>>>> utilization.
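[The batched-update approach mentioned above might look something like the sketch below. The flush threshold and names are invented for illustration; the point is that the contended atomic add is paid only once per batch, with up to one batch of accuracy pending locally at any time:]

```c
#include <stdint.h>

#define FLUSH_THRESHOLD 256	/* illustrative batch size, in cycles */

static uint64_t global_cycles;		/* shared, contended cache line */
static __thread uint64_t local_cycles;	/* per-thread, uncontended */

/* Accumulate locally, and post to the shared counter only once a
 * threshold's worth of cycles has built up. Cheap in the common case;
 * the cost is a delay between local increment and global visibility. */
static inline void
batched_cycles_add(uint64_t cycles)
{
	local_cycles += cycles;
	if (local_cycles >= FLUSH_THRESHOLD) {
		__atomic_fetch_add(&global_cycles, local_cycles,
				   __ATOMIC_RELAXED);
		local_cycles = 0;
	}
}
```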
>>>>> Comparing utilization between different services running on the
>>>>> same core should be straightforward, but lcore utilization is
>>>>> harder in absolute terms. If you just look at "cycles", a
>>>>> completely idle core would look like it's very busy (basically the
>>>>> rdtsc latency added for every loop). I assume you'd have to do some
>>>>> heuristic based on both "calls" and "cycles" to get an estimate.
>>>>>
>>>>> I think service core utilization would be very useful, although
>>>>> that would require some changes in the service function signature,
>>>>> so the service can report back whether it did some useful work for
>>>>> a particular call.
>>>>>
>>>>> That would make for a DPDK 'top'. Just like 'top', it can't impose
>>>>> any serious performance degradation when used, to be really useful,
>>>>> I think.
>>>>>
>>>>> Sure, it should be possible to turn it on and off. I thought that
>>>>> was the case already?
>>>>
>>>> Thanks, yes, this exists already. Though the 'loops' counter is
>>>> outside the stats enable check; it looks like it is considered an
>>>> attribute for some reason.
>>>>
>>>>>>>
>>>>>>> Idle service cores will basically do nothing else than stall
>>>>>>> waiting for these lines, I suspect, hampering the progress of the
>>>>>>> more busy cores.
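[The "calls"-plus-"cycles" heuristic suggested above could be sketched like this. The fixed per-call overhead is an assumed, calibration-dependent constant (covering rdtsc latency and loop bookkeeping), not something the service library provides:]

```c
#include <stdint.h>

/* Rough lcore utilization estimate from the "calls" and "cycles"
 * counters. An idle service core still accumulates some cycles per
 * call, so subtract an assumed fixed per-call overhead before
 * dividing by the wall-clock cycle budget of the measurement window. */
static double
estimate_utilization(uint64_t calls, uint64_t cycles,
		     uint64_t wall_cycles, uint64_t overhead_per_call)
{
	uint64_t overhead = calls * overhead_per_call;

	if (cycles <= overhead || wall_cycles == 0)
		return 0.0;	/* effectively idle */

	double busy = (double)(cycles - overhead);

	return busy < (double)wall_cycles ? busy / (double)wall_cycles : 1.0;
}
```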