Hi Daniel, Erez & Slava,

My turn to apologize: I missed this email when I came back from vacation.

On 8/18/23 14:04, Daniel Östman wrote:
Hi Maxime,

Sorry for the late reply, I've been on vacation.
Please see my answer below.

/ Daniel

-----Original Message-----
From: Maxime Coquelin <[email protected]>
Sent: Thursday, 22 June 2023 17:48
To: Daniel Östman <[email protected]>; Erez Ferber
<[email protected]>; Slava Ovsiienko <[email protected]>
Cc: [email protected]; Matan Azrad <[email protected]>;
[email protected]
Subject: Re: mlx5: imissed / out_of_buffer counter always 0

Hi,

On 6/21/23 22:22, Maxime Coquelin wrote:
Hi Daniel, all,

On 6/5/23 16:00, Daniel Östman wrote:
Hi Slava and Erez and thanks for your answers,

Regarding the firmware, I’ve also deployed in a different OpenShift
cluster where I see the exact same issue but with a different Mellanox
NIC:

Mellanox Technologies MT2892 Family - ConnectX-6 DX 2-port 100GbE
QSFP56 PCIe Adapter

driver: mlx5_core

version: 5.0-0
firmware-version: 22.36.1010 (DEL0000000027)

  From what I can see the firmware is relatively new on that one?

With below configuration:
- ConnectX-6 Dx MT2892
- Kernel: 6.4.0-rc6
- FW version: 22.35.1012 (MT_0000000528)

The out-of-buffer counter is fetched via
mlx5_devx_cmd_queue_counter_query():

[pid  2942] ioctl(17, RDMA_VERBS_IOCTL, 0x7ffcb15bcd10) = 0
[pid  2942] write(1, "\n  ######################## NIC "..., 80) = 80
[pid  2942] write(1, "  RX-packets: 630997736  RX-miss"..., 70) = 70
[pid  2942] write(1, "  RX-errors: 0\n", 15) = 15
[pid  2942] write(1, "  RX-nombuf:  0         \n", 25) = 25
[pid  2942] write(1, "  TX-packets: 0          TX-erro"..., 60) = 60
[pid  2942] write(1, "\n", 1)           = 1
[pid  2942] write(1, "  Throughput (since last show)\n", 31) = 31
[pid  2942] write(1, "  Rx-pps:            0 "..., 106) = 106
[pid  2942] write(1, "  ##############################"..., 79) = 79

It looks like we may be missing some mlx5 kernel patches needed to use
mlx5_devx_cmd_queue_counter_query() with RHEL?

Erez, Slava, any idea on the patches that could be missing?

The above test was on bare metal as root; I get the same "working"
behaviour on RHEL as root.

We managed to reproduce Daniel's issue by running the same test within a
container. With debug logs enabled, we get this warning:

mlx5_common: DevX create q counter set failed errno=121 status=0x2
syndrome=0x8975f1
mlx5_net: Port 0 queue counter object cannot be created by DevX - fall-back
to use the kernel driver global queue counter.
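
For anyone triaging similar logs, here is a minimal sketch that pulls the errno, status and syndrome fields out of the warning above. The function name and the regular expression are illustrative assumptions, not part of DPDK, and the symbolic errno name depends on the platform the script runs on.

```python
import errno
import re

# Matches the fields of the mlx5_common warning quoted above; \s+ tolerates
# the line having been wrapped by a mail client.
_DEVX_FAILURE = re.compile(
    r"errno=(?P<errno>\d+)\s+status=(?P<status>0x[0-9a-fA-F]+)\s+"
    r"syndrome=(?P<syndrome>0x[0-9a-fA-F]+)"
)


def parse_devx_failure(log_text):
    """Return the errno/status/syndrome fields of a DevX failure, or None."""
    match = _DEVX_FAILURE.search(log_text)
    if match is None:
        return None
    code = int(match.group("errno"))
    return {
        "errno": code,
        # Symbolic name as known to the local platform, if any.
        "errno_name": errno.errorcode.get(code, "unknown"),
        "status": int(match.group("status"), 16),
        "syndrome": int(match.group("syndrome"), 16),
    }
```

On Linux, errno 121 is EREMOTEIO, while the errno 22 from the original report is EINVAL.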

Running the container as privileged solves the issue, and so does
adding the SYS_RAWIO capability to the container.
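
To confirm from inside the container whether the process actually holds the capability, a minimal sketch along these lines could be used. It assumes a Linux /proc/<pid>/status layout with a CapEff line; the helper names are hypothetical, and bit 17 is the CAP_SYS_RAWIO value from <linux/capability.h>.

```python
CAP_SYS_RAWIO = 17  # bit number from <linux/capability.h>


def effective_caps(status_text):
    """Extract the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("CapEff line not found")


def has_sys_rawio(status_text):
    """True if CAP_SYS_RAWIO is in the effective capability set."""
    return bool(effective_caps(status_text) & (1 << CAP_SYS_RAWIO))
```

Inside the pod, passing the contents of /proc/self/status to has_sys_rawio() shows whether the DPDK process would run with the capability.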

Erez, Slava, is it expected that SYS_RAWIO is required just to get a stat counter?

Erez & Slava, could it be possible to get the stats counters via devx
without requiring SYS_RAWIO?


Daniel, could you try adding SYS_RAWIO to your pod to confirm you face the
same issue?

Yes, I can confirm what you are seeing when running in a cluster with OpenShift
4.12 (RHEL 8.6), either with SYS_RAWIO or as privileged.
But with a privileged container I also need to run with UID 0 for it to work. Is
that what you are doing as well?

I don't have an OCP setup at hand right now to test it, but IIRC yes we
ran it with UID 0.

In both these cases the counter can be successfully retrieved through the DevX 
interface.

Ok.

However, when running in a cluster with OpenShift 4.10 (RHEL 8.4) I cannot get
it to work with either of these two approaches.

I'm not sure this is kernel related, as I tested on both RHEL-8.4.0 and
the latest RHEL-8.4 and I can get the q counters via ioctl().

Maxime

Thanks in advance,
Maxime
Regards,
Maxime


I tried setting dv_flow_en=0 (and saw that it was propagated to
config->dv_flow_en) but it didn’t seem to help.

Erez, I’m not sure what you mean by shared or non-shared mode in this
case, but it seems it could be related to the fact that the
container is running in a separate network namespace: the
hw_counters directory is available on the host (cluster node), but not
in the pod container.

Best regards,

Daniel

*From:*Erez Ferber <[email protected]>
*Sent:* Monday, 5 June 2023 12:29
*To:* Slava Ovsiienko <[email protected]>
*Cc:* Daniel Östman <[email protected]>; [email protected];
Matan Azrad <[email protected]>; [email protected];
[email protected]
*Subject:* Re: mlx5: imissed / out_of_buffer counter always 0

Hi Daniel,

Is the container running in shared or non-shared mode?

For shared mode, I assume the kernel sysfs counters which DPDK relies
on for imissed/out_of_buffer are not exposed.

Best regards,

Erez

On Fri, 2 Jun 2023 at 18:07, Slava Ovsiienko <[email protected]> wrote:

     Hi, Daniel

     I would recommend the following actions:

     - Update the firmware; 16.33.xxxx looks a little outdated.
       Please try 16.35.1012 or later.
       mlx5_glue->devx_obj_create might succeed with the newer FW.

     - Try specifying the dv_flow_en=0 devarg. It forces the mlx5 PMD to
       use the rdma_core library for queue management, so the kernel
       driver will be aware of the Rx queues being created and attach
       them to the kernel counter set.

     With best regards,
     Slava

     *From:* Daniel Östman <[email protected]>
     *Sent:* Friday, June 2, 2023 3:59 PM
     *To:* [email protected]
     *Cc:* Matan Azrad <[email protected]>; Slava Ovsiienko
     <[email protected]>; [email protected]; [email protected]
     *Subject:* mlx5: imissed / out_of_buffer counter always 0

     Hi,

     I’m deploying a containerized DPDK application in an OpenShift
     Kubernetes environment using DPDK 21.11.3.

     The application uses a Mellanox ConnectX-5 100G NIC through VFs.

     The problem I have is that the ETH stats counter imissed (which
     seems to be mapped to “out_of_buffer” internally in the mlx5 PMD
     driver) is 0 when I don’t expect it to be, i.e. when the
     application doesn’t read the packets fast enough.

     Using GDB I can see that it tries to access the counter through
     /sys/class/infiniband/mlx5_99/ports/1/hw_counters/out_of_buffer,
     but the hw_counters directory is missing, so it will just return
     a zero value. I don’t know why it is missing.
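
A minimal sketch of the sysfs lookup described above, useful for checking from inside a pod whether the counter file is exposed at all. The function name and its sysfs_root parameter are hypothetical, and unlike the PMD behaviour described here it returns None rather than 0 when the file is missing, so that "not exposed" can be told apart from a genuine zero.

```python
from pathlib import Path


def read_hw_counter(ibdev, port, name, sysfs_root="/sys/class/infiniband"):
    """Read one hw_counters value; return None if it is not exposed."""
    path = Path(sysfs_root) / ibdev / "ports" / str(port) / "hw_counters" / name
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        # hw_counters directory or counter file is missing, as seen in the pod.
        return None
```

For the device in this report, the call would be read_hw_counter("mlx5_99", 1, "out_of_buffer").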

     When looking at mlx5_os_read_dev_stat() I can see that there is
     an alternative way of reading the counter, through
     mlx5_devx_cmd_queue_counter_query(), but only on the condition
     that priv->q_counters is set.

     It doesn’t get set in my case because mlx5_glue->devx_obj_create()
     fails (errno 22) in mlx5_devx_cmd_queue_counter_alloc().

     Have I missed something?

     NIC info:

     Mellanox Technologies MT27800 Family [ConnectX-5] - 100Gb 2-port
     QSFP28 MCX516A-CCHT
     driver: mlx5_core
     version: 5.0-0
     firmware-version: 16.33.1048 (MT_0000000417)

     Please let me know if I need to provide more information.

     Best regards,

     Daniel

