Re: [dpdk-users] Performance of rte_eth_stats_get

2021-07-14 Thread Alireza Sanaee

On 19/05/2021 17:06, Stephen Hemminger wrote:

> On Wed, 19 May 2021 15:14:38 +
> "Van Haaren, Harry"  wrote:
>
> > > -Original Message-
> > > From: users  On Behalf Of Filip Janiszewski
> > > Sent: Wednesday, May 19, 2021 2:10 PM
> > > To: users@dpdk.org
> > > Subject: [dpdk-users] Performance of rte_eth_stats_get
> > >
> > > Hi,
> > >
> > > Is it safe to call rte_eth_stats_get while capturing from the port?
> > >
> > > I'm mostly concerned about performance, if rte_eth_stats_get will in any
> > > way impact the port performance, in the application I plan to call the
> > > function from a thread that is not directly involved in the capture,
> > > there's another worker responsible for rx bursting, but I wonder if the
> > > NIC might get upset if I call it too frequently (say 10 times per
> > > second) and potentially cause some performance issues.
> > >
> > > The question is really Nic agnostic, but if the Nic vendor is actually
> > > relevant then I'm running Intel 700 series nic and Mellanox ConnectX-4/5.
> >
> > To understand what really goes on when getting stats, it might help to
> > list the steps involved in getting statistics from the NIC HW.
> >
> > 1) The CPU sends an MMIO (Memory Mapped I/O) read, sometimes referred to
> > as a "PCI read", to the NIC.
> > 2) The PCI bus has to handle extra TLPs (PCI transactions) to satisfy
> > the read.
> > 3) The NIC has to send a reply based on accessing its internal counters.
> > 4) The CPU gets a result from the PCI read.
> >
> > Notice how elegantly this whole process is abstracted from SW? In code,
> > reading a stat value is just dereferencing a pointer that is mapped to
> > the NIC HW address.
> > In practice, from a CPU performance point of view, doing an MMIO read is
> > one of the slowest things you can do. You say the stats reads are
> > occurring from a thread that is not handling rx/datapath, so perhaps the
> > CPU cycle cost itself isn't a concern.
> >
> > Do note, however, that when reading a full set of extended stats from
> > the NIC, there could be tens to hundreds of MMIO reads (depending on the
> > statistics requested and how the PMD itself is implemented to handle
> > stats updates).
> >
> > The PCI bus does become busier with reads to the NIC HW when doing lots
> > of statistics updates, so some more contention/activity is to be
> > expected there.
> > The PCM tool can be very useful to see MMIO traffic; you could measure
> > how many extra PCI transactions occur due to reading stats every X ms.
> > https://github.com/opcm/pcm
> >
> > I can recommend measuring pkt latency/jitter as a histogram, as outliers
> > in performance can then be identified. If you specifically want to
> > identify whether these are due to stats reads, compare with a "no stats
> > reads" latency/jitter histogram and see the impact graphically.
> > In the end, if it doesn't affect packet latency/jitter, then it has no
> > impact, right?
> >
> > Ultimately, I can't give a generic answer - best steps are to measure
> > carefully and find out!
> >
> > > Thanks
> >
> > Hope the above helps and doesn't add confusion :)  Regards, -Harry
>
> Many drivers require transactions with the firmware via mailbox.
> And that transaction needs a spin wait for the shared area.



Thank you for explaining the steps so clearly. I noticed this problem
too. Calling `rte_eth_stats_get` in the PMDport once per batch almost
halves the throughput in a 10G setup, IIRC; the cost is prohibitively
high. This, however, doesn't show up when DPDK connects to a
vhost-pmdport, since all of the port statistics are probably kept
somewhere in shared memory.
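
One way to avoid paying that cost on every batch is to gate the query on elapsed time, so the datapath loop only issues it once per interval. The following is a minimal sketch of that idea, not code from any driver or application: `stats_poller`, `poller_init` and `poller_due` are hypothetical names, and the tick source is left to the caller (in a DPDK application it could be a cycle counter, which is an assumption here).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decides when an expensive stats query is due.
 * It does not call into DPDK itself; the caller issues the query. */
struct stats_poller {
    uint64_t interval;   /* minimum ticks between two queries */
    uint64_t last_poll;  /* tick at which the last query was issued */
    uint64_t polls;      /* number of queries issued so far */
    int primed;          /* 0 until the first query has been issued */
};

static void poller_init(struct stats_poller *p, uint64_t interval)
{
    assert(interval > 0);
    p->interval = interval;
    p->last_poll = 0;
    p->polls = 0;
    p->primed = 0;
}

/* Call once per RX batch with the current tick.
 * Returns 1 only when a query is actually due, 0 otherwise. */
static int poller_due(struct stats_poller *p, uint64_t now)
{
    if (p->primed && now - p->last_poll < p->interval)
        return 0;
    p->primed = 1;
    p->last_poll = now;
    p->polls++;
    return 1;
}
```

In the batch loop, the stats call would then sit behind `if (poller_due(&p, now))`, so the per-batch overhead shrinks to one comparison.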


Re: [dpdk-users] Performance of rte_eth_stats_get

2021-05-19 Thread Stephen Hemminger
On Wed, 19 May 2021 15:14:38 +
"Van Haaren, Harry"  wrote:

> > -Original Message-
> > From: users  On Behalf Of Filip Janiszewski
> > Sent: Wednesday, May 19, 2021 2:10 PM
> > To: users@dpdk.org
> > Subject: [dpdk-users] Performance of rte_eth_stats_get
> > 
> > Hi,
> > 
> > Is it safe to call rte_eth_stats_get while capturing from the port?
> > 
> > I'm mostly concerned about performance, if rte_eth_stats_get will in any
> > way impact the port performance, in the application I plan to call the
> > function from a thread that is not directly involved in the capture,
> > there's another worker responsible for rx bursting, but I wonder if the
> > NIC might get upset if I call it too frequently (say 10 times per
> > second) and potentially cause some performance issues.
> > 
> > The question is really Nic agnostic, but if the Nic vendor is actually
> > relevant then I'm running Intel 700 series nic and Mellanox ConnectX-4/5.  
> 
> To understand what really goes on when getting stats, it might help to list
> the steps involved in getting statistics from the NIC HW.
>
> 1) The CPU sends an MMIO (Memory Mapped I/O) read, sometimes referred to as
> a "PCI read", to the NIC.
> 2) The PCI bus has to handle extra TLPs (PCI transactions) to satisfy the
> read.
> 3) The NIC has to send a reply based on accessing its internal counters.
> 4) The CPU gets a result from the PCI read.
>
> Notice how elegantly this whole process is abstracted from SW? In code,
> reading a stat value is just dereferencing a pointer that is mapped to the
> NIC HW address.
> In practice, from a CPU performance point of view, doing an MMIO read is one
> of the slowest things you can do. You say the stats reads are occurring from
> a thread that is not handling rx/datapath, so perhaps the CPU cycle cost
> itself isn't a concern.
>
> Do note, however, that when reading a full set of extended stats from the
> NIC, there could be tens to hundreds of MMIO reads (depending on the
> statistics requested and how the PMD itself is implemented to handle stats
> updates).
>
> The PCI bus does become busier with reads to the NIC HW when doing lots of
> statistics updates, so some more contention/activity is to be expected there.
> The PCM tool can be very useful to see MMIO traffic; you could measure how
> many extra PCI transactions occur due to reading stats every X ms.
> https://github.com/opcm/pcm
>
> I can recommend measuring pkt latency/jitter as a histogram, as outliers in
> performance can then be identified. If you specifically want to identify
> whether these are due to stats reads, compare with a "no stats reads"
> latency/jitter histogram and see the impact graphically.
> In the end, if it doesn't affect packet latency/jitter, then it has no
> impact, right?
>
> Ultimately, I can't give a generic answer - best steps are to measure
> carefully and find out!
> 
> > Thanks  
> 
> Hope the above helps and doesn't add confusion :)  Regards, -Harry

Many drivers require transactions with the firmware via mailbox.
And that transaction needs a spin wait for the shared area.


Re: [dpdk-users] Performance of rte_eth_stats_get

2021-05-19 Thread Van Haaren, Harry
> -Original Message-
> From: users  On Behalf Of Filip Janiszewski
> Sent: Wednesday, May 19, 2021 2:10 PM
> To: users@dpdk.org
> Subject: [dpdk-users] Performance of rte_eth_stats_get
> 
> Hi,
> 
> Is it safe to call rte_eth_stats_get while capturing from the port?
> 
> I'm mostly concerned about performance, if rte_eth_stats_get will in any
> way impact the port performance, in the application I plan to call the
> function from a thread that is not directly involved in the capture,
> there's another worker responsible for rx bursting, but I wonder if the
> NIC might get upset if I call it too frequently (say 10 times per
> second) and potentially cause some performance issues.
> 
> The question is really Nic agnostic, but if the Nic vendor is actually
> relevant then I'm running Intel 700 series nic and Mellanox ConnectX-4/5.

To understand what really goes on when getting stats, it might help to list the
steps involved in getting statistics from the NIC HW.

1) The CPU sends an MMIO (Memory Mapped I/O) read, sometimes referred to as a
"PCI read", to the NIC.
2) The PCI bus has to handle extra TLPs (PCI transactions) to satisfy the read.
3) The NIC has to send a reply based on accessing its internal counters.
4) The CPU gets a result from the PCI read.

Notice how elegantly this whole process is abstracted from SW? In code, reading
a stat value is just dereferencing a pointer that is mapped to the NIC HW
address.
In practice, from a CPU performance point of view, doing an MMIO read is one of
the slowest things you can do. You say the stats reads are occurring from a
thread that is not handling rx/datapath, so perhaps the CPU cycle cost itself
isn't a concern.
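
To make the "just dereferencing a pointer" point concrete, here is a minimal sketch of what such a register read tends to look like. The helper names are made up for illustration, and the counter layout (a 64-bit counter split into a low and a high 32-bit register) is an assumption, not the layout of any particular NIC. With a real device, `base` would point at the mapped BAR and each call would turn into a round trip over PCI; here it can be exercised against an ordinary array.

```c
#include <assert.h>
#include <stdint.h>

/* Read one 32-bit register at a byte offset from a mapped base address.
 * The volatile access forces a real load on every call, which is what
 * makes each read an individual MMIO transaction on real hardware. */
static inline uint32_t reg_read32(const volatile void *base, uint32_t offset)
{
    assert((offset & 3u) == 0); /* registers are 4-byte aligned here */
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + offset);
}

/* Assumed layout for this sketch: a 64-bit counter exposed as two
 * 32-bit registers, low word first. That is two MMIO reads per counter. */
static inline uint64_t counter_read64(const volatile void *base, uint32_t offset)
{
    uint64_t lo = reg_read32(base, offset);
    uint64_t hi = reg_read32(base, offset + 4);
    return (hi << 32) | lo;
}
```

The code is two plain loads, yet on hardware each of them stalls the calling core until the NIC replies.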

Do note, however, that when reading a full set of extended stats from the NIC,
there could be tens to hundreds of MMIO reads (depending on the statistics
requested and how the PMD itself is implemented to handle stats updates).

The PCI bus does become busier with reads to the NIC HW when doing lots of
statistics updates, so some more contention/activity is to be expected there.
The PCM tool can be very useful to see MMIO traffic; you could measure how many
extra PCI transactions occur due to reading stats every X ms.
https://github.com/opcm/pcm

I can recommend measuring pkt latency/jitter as a histogram, as outliers in
performance can then be identified. If you specifically want to identify
whether these are due to stats reads, compare with a "no stats reads"
latency/jitter histogram and see the impact graphically.
In the end, if it doesn't affect packet latency/jitter, then it has no impact,
right?
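
A rough sketch of such a histogram, assuming per-packet latency is already available in nanoseconds (how it is timestamped is outside this example). The bucket scheme (powers of two) and all names are illustrative choices, not part of DPDK. One histogram would be filled during a run with stats reads and one without, and the tails compared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LAT_BUCKETS 32

struct lat_hist {
    uint64_t count[LAT_BUCKETS]; /* samples per bucket */
    uint64_t total;              /* all samples recorded */
};

static void hist_init(struct lat_hist *h)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));
}

/* Bucket i holds samples in [2^i, 2^(i+1)) ns; bucket 0 also takes 0 ns,
 * and the last bucket takes everything larger. */
static unsigned hist_bucket(uint64_t ns)
{
    unsigned b = 0;
    while (ns > 1 && b < LAT_BUCKETS - 1) {
        ns >>= 1;
        b++;
    }
    return b;
}

static void hist_add(struct lat_hist *h, uint64_t ns)
{
    h->count[hist_bucket(ns)]++;
    h->total++;
}

/* Number of samples in or above the bucket that contains threshold_ns. */
static uint64_t hist_tail(const struct lat_hist *h, uint64_t threshold_ns)
{
    uint64_t n = 0;
    for (unsigned b = hist_bucket(threshold_ns); b < LAT_BUCKETS; b++)
        n += h->count[b];
    return n;
}
```

Comparing `hist_tail()` of the two runs at the same threshold gives a quick number for how many outliers the stats reads add, before plotting anything.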

Ultimately, I can't give a generic answer - best steps are to measure carefully
and find out!

> Thanks

Hope the above helps and doesn't add confusion :)  Regards, -Harry


[dpdk-users] Performance of rte_eth_stats_get

2021-05-19 Thread Filip Janiszewski
Hi,

Is it safe to call rte_eth_stats_get while capturing from the port?

I'm mostly concerned about performance, if rte_eth_stats_get will in any
way impact the port performance, in the application I plan to call the
function from a thread that is not directly involved in the capture,
there's another worker responsible for rx bursting, but I wonder if the
NIC might get upset if I call it too frequently (say 10 times per
second) and potentially cause some performance issues.

The question is really Nic agnostic, but if the Nic vendor is actually
relevant then I'm running Intel 700 series nic and Mellanox ConnectX-4/5.

Thanks

-- 
BR, Filip