On 11 Sep 2023, at 12:41, Robin Jarry wrote:

> Hey Kevin,
>
> Kevin Traynor, Sep 07, 2023 at 15:37:
>> This came up in conversation with other maintainers as I mentioned I was
>> reviewing and the question raised was - why add this? If you want these
>> values exposed, wouldn't it be better to add them to ovsdb?
>
> That's a good point. I had considered using ovsdb but it seemed to me
> less suitable for a few reasons:
>
> * I had understood that ovsdb is a configuration database, not a state
>   reporting database.
>
> * To have reliable and up-to-date numbers, ovs would need to push them
>   at a high rate to the database so that clients do not get outdated cpu
>   usage. The DPDK telemetry socket is real-time: the current numbers are
>   returned on every request.
>
> * I would need to define a custom schema / table to store structured
>   information in the db. The DPDK telemetry socket already has a schema
>   defined for this.
>
> * Accessing ovsdb requires a library, making it more complex to use for
>   telemetry scrapers. The DPDK telemetry socket can be accessed with
>   a standalone python script with no external dependencies[1].
>
> [1]: 
> https://github.com/rjarry/dpdk/blob/main/usertools/prometheus-dpdk-exporter.py#L135-L143
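As an illustration of the "no external dependencies" point, here is a minimal sketch of such a client, assuming the default DPDK v2 telemetry socket path (`/var/run/dpdk/rte/dpdk_telemetry.v2`; the runtime directory varies with the EAL file prefix) and the `/eal/lcore/usage` reply layout shown later in this thread:

```python
import json
import socket


def telemetry_query(cmd, sock_path="/var/run/dpdk/rte/dpdk_telemetry.v2"):
    """Send one command to the DPDK telemetry socket and return the reply."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(sock_path)
    # On connect, the server sends an initial banner that includes the
    # maximum reply size to expect.
    banner = json.loads(sock.recv(1024))
    buf_len = banner["max_output_len"]
    sock.send(cmd.encode())
    reply = json.loads(sock.recv(buf_len))
    sock.close()
    return reply


def parse_lcore_usage(reply):
    """Turn the parallel /eal/lcore/usage arrays into {lcore_id: (total, busy)}."""
    usage = reply["/eal/lcore/usage"]
    return {
        lcore: (total, busy)
        for lcore, total, busy in zip(
            usage["lcore_ids"], usage["total_cycles"], usage["busy_cycles"]
        )
    }
```

A scraper would call `telemetry_query("/eal/lcore/usage")` on each poll and feed the result to `parse_lcore_usage()`; both names here are hypothetical, not part of any existing tool.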
>
> Maybe my observations are wrong, please do correct me if they are.

I feel like if we do need another way of getting (real-time) statistics out of 
OVS, we should use the same communication channel as the other ovs-xxx 
utilities are using. But rather than returning text-based responses, we might 
be able to make it JSON (which is already used by the database). I know that 
Adrian is already investigating machine-readable output for some existing 
utilities; maybe it can be extended for the (pmd) statistics use case.

Using something like the DPDK telemetry socket might not work for other use 
cases where DPDK is not in play.

>> Are you looking for individual lcore usage with identification of that
>> pmd? or overall aggregate usage ?
>>
>> I ask because it will report lcore id's which would need to be mapped to
>> pmd core id's for anything regarding individual pmds.
>>
>> That can be found in ovs-vswitchd.log or checked locally with
>> 'ovs-appctl dpdk/lcore-list' but assuming if they were available, then
>> user would not be using dpdk telemetry anyway.
>
> I would assume that the important data is the aggregate usage for
> overall monitoring and resource planning. Individual pmd usage can be
> accessed for fine tuning and debugging via appctl.
>
>> These stats are cumulative so in the absence of 'ovs-appctl
>> dpif-netdev/pmd-stats-clear'  that would need to be taken care of with
>> some post-processing by whatever is pulling these stats - otherwise
>> you'll get cumulative stats for an unknown time period and unknown
>> traffic profile (e.g. it would be counting before any traffic started).
>>
>> These might also be reset with pmd-stats-clear independently, so that
>> would need to be accounted for too.
>
> The only important data point that we need is the ratio between
> busy/(busy + idle) over a specified delta which any scraper can do.
> I consider these numbers like any other counter that can eventually be
> reset.
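The scraper-side computation described above can be sketched as follows. This is a minimal example, not code from any existing scraper; the `busy_ratio` name and the two-tuple sample format are assumptions:

```python
def busy_ratio(prev, cur):
    """Busy fraction over the interval between two (total_cycles, busy_cycles)
    samples of the same lcore.

    Returns None when a counter went backwards (e.g. after an
    'ovs-appctl dpif-netdev/pmd-stats-clear'), so the scraper can drop
    this interval and resynchronize on the next poll.
    """
    d_total = cur[0] - prev[0]
    d_busy = cur[1] - prev[1]
    if d_total <= 0 or d_busy < 0:
        return None  # counters were reset; discard this data point
    return d_busy / d_total
```

Because only deltas are used, the absolute values being cumulative since an unknown start time does not matter, and a reset simply costs one sample interval.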
>
> See this reply from Morten Brørup on dpdk-dev for more context:
>
> https://lore.kernel.org/dpdk-dev/[email protected]/
>
>> Another thing I noticed is that without the pmd-sleep info the stats in
>> isolation can be misleading. Example below:
>>
>> With low rate traffic and clearing stats between 10 sec runs
>>
>> 2023-09-07T13:14:56Z|00158|dpif_netdev|INFO|PMD max sleep request is 0
>> usecs.
>> 2023-09-07T13:14:56Z|00159|dpif_netdev|INFO|PMD load based sleeps are
>> disabled.
>>
>> Time: 13:15:06.842
>> Measurement duration: 10.009 s
>>
>> pmd thread numa_id 0 core_id 8:
>>
>>    Iterations:             51712564  (0.19 us/it)
>>    - Used TSC cycles:   26021354654  (100.0 % of total cycles)
>>    - idle iterations:      51710963  ( 99.9 % of used cycles)
>>    - busy iterations:          1601  (  0.1 % of used cycles)
>>    - sleep iterations:            0  (  0.0 % of iterations)
>> ^^^ can see here that pmd does not sleep and is 0.1% busy
>>
>>    Sleep time (us):               0  (  0 us/iteration avg.)
>>    Rx packets:                37250  (4 Kpps, 866 cycles/pkt)
>>    Datapath passes:           37250  (1.00 passes/pkt)
>>    - PHWOL hits:                  0  (  0.0 %)
>>    - MFEX Opt hits:               0  (  0.0 %)
>>    - Simple Match hits:       37250  (100.0 %)
>>    - EMC hits:                    0  (  0.0 %)
>>    - SMC hits:                    0  (  0.0 %)
>>    - Megaflow hits:               0  (  0.0 %, 0.00 subtbl lookups/hit)
>>    - Upcalls:                     0  (  0.0 %, 0.0 us/upcall)
>>    - Lost upcalls:                0  (  0.0 %)
>>    Tx packets:                    0
>>
>> {
>>    "/eal/lcore/usage": {
>>      "lcore_ids": [
>>        1
>>      ],
>>      "total_cycles": [
>>        26127284389
>>      ],
>>      "busy_cycles": [
>>        32370313
>>      ]
>>    }
>> }
>>
>> ^^^ This in isolation implies pmd is 32370313/26127284389 0.12% busy
>> which is true
>>
>> 2023-09-07T13:15:06Z|00160|dpif_netdev|INFO|PMD max sleep request is 500
>> usecs.
>> 2023-09-07T13:15:06Z|00161|dpif_netdev|INFO|PMD load based sleeps are
>> enabled.
>>
>> Time: 13:15:16.908
>> Measurement duration: 10.008 s
>>
>> pmd thread numa_id 0 core_id 8:
>>
>>    Iterations:                75197  (133.09 us/it)
>>    - Used TSC cycles:     237910969  (  0.9 % of total cycles)
>>    - idle iterations:         73782  ( 74.4 % of used cycles)
>>    - busy iterations:          1415  ( 25.6 % of used cycles)
>>    - sleep iterations:        74033  ( 98.5 % of iterations)
>> ^^^ can see here that pmd spends most of the time sleeping and is 25%
>> busy when it is not sleeping
>>
>>    Sleep time (us):         9916314  (134 us/iteration avg.)
>>    Rx packets:                37249  (4 Kpps, 1637 cycles/pkt)
>>    Datapath passes:           37249  (1.00 passes/pkt)
>>    - PHWOL hits:                  0  (  0.0 %)
>>    - MFEX Opt hits:               0  (  0.0 %)
>>    - Simple Match hits:       37249  (100.0 %)
>>    - EMC hits:                    0  (  0.0 %)
>>    - SMC hits:                    0  (  0.0 %)
>>    - Megaflow hits:               0  (  0.0 %, 0.00 subtbl lookups/hit)
>>    - Upcalls:                     0  (  0.0 %, 0.0 us/upcall)
>>    - Lost upcalls:                0  (  0.0 %)
>>    Tx packets:                    0
>>
>> {
>>    "/eal/lcore/usage": {
>>      "lcore_ids": [
>>        1
>>      ],
>>      "total_cycles": [
>>        238786638
>>      ],
>>      "busy_cycles": [
>>        61268951
>>      ]
>>    }
>> }
>>
>> ^^^ this in isolation implies that pmd is 61268951/238786638 25% busy
>> but it's misleading because missing sleep info
>
> Hmm, I should add the sleep cycles to the total_cycles counter. I thought
> they were part of idle. Good catch.
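>
> To illustrate how much that correction matters, here is back-of-the-envelope
> arithmetic using the numbers from the second run above. The ~2.64 GHz TSC
> rate is not reported by the telemetry output; it is inferred here from
> "Used TSC cycles: 237910969 (0.9 % of total cycles)" over the 10.008 s
> window, so treat it as an estimate:
>
> ```python
> # Numbers taken from the second 10 s run above.
> busy_cycles = 61268951
> total_cycles = 238786638   # currently excludes time spent sleeping
> sleep_us = 9916314         # "Sleep time (us)" from pmd-perf-show
>
> # Estimated TSC rate (assumption, see above).
> tsc_hz = 2.64e9
> sleep_cycles = int(sleep_us * 1e-6 * tsc_hz)
>
> # Without sleep in the denominator the pmd looks ~25 % busy...
> naive = busy_cycles / total_cycles
> # ...but with sleep cycles folded into the total it is well under 1 %,
> # matching the 0.1 % seen in the first run with sleeps disabled.
> corrected = busy_cycles / (total_cycles + sleep_cycles)
> ```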
>
> Thanks for the review and testing!

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev