I had a similar issue; apparently it was not a Lustre issue for us. In addition 
to the entries you see below, we also saw "AMD-Vi: Event ... IO_PAGE_FAULT" in 
the logs. 

Setting iommu=pt on the kernel command line helped us.
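
In case it is useful for confirming that the option actually took effect after a reboot, here is a minimal sketch that reads the running kernel's command line and reports the iommu= value. The function name iommu_setting is just an illustrative helper, not part of any tool, and how you add the option to the boot loader depends on your distro.

```python
from pathlib import Path


def iommu_setting(cmdline):
    """Return the value of the last iommu= option on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("iommu="):
            # Keep the last occurrence if the option is given more than once.
            value = token.split("=", 1)[1]
    return value


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print("iommu option:", iommu_setting(proc_cmdline.read_text()))
```

If this prints None, the option is not on the command line of the running kernel.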

Hope that helps. 

Thank you,
Amit

-----Original Message-----
From: lustre-discuss <[email protected]> On Behalf Of 
Nehring, Shane R [LAS] via lustre-discuss
Sent: Thursday, May 18, 2023 10:06 AM
To: [email protected]
Subject: [lustre-discuss] mlx5 errors on oss

Hello all,

We recently added InfiniBand to our cluster and are in the process of testing 
it with Lustre. We're running the distro-provided drivers for the Mellanox 
cards with the latest firmware. Overnight we started seeing the following 
errors on a few OSSes:

infiniband mlx5_0: dump_cqe:272:(pid 40058): dump error cqe
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 00 00 88 13 08 00 00 a0 00 63 4d d2
infiniband mlx5_0: dump_cqe:272:(pid 40057): dump error cqe
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 00 00 88 13 08 00 00 a1 00 c2 8e d2
infiniband mlx5_0: dump_cqe:272:(pid 40057): dump error cqe
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 00 00 88 13 08 00 00 a2 00 1a 12 d2
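
For anyone comparing dumps like these across nodes, the following is a rough sketch that collects each 64-byte error CQE from log text and pulls out two bytes of interest. The byte offsets are an assumption based on a reading of struct mlx5_err_cqe in the kernel's include/linux/mlx5/device.h (vendor_err_synd at byte 54, syndrome at byte 55) and should be checked against your own kernel source. The helper names parse_cqe_dumps and summarize are hypothetical.

```python
import re

# ASSUMPTION: offsets taken from a reading of struct mlx5_err_cqe in
# include/linux/mlx5/device.h; verify against your kernel source.
VENDOR_ERR_SYND_OFFSET = 54
SYNDROME_OFFSET = 55

# One dump row: an 8-digit hex offset, a colon, then 16 hex bytes.
HEX_ROW = re.compile(r"^([0-9a-fA-F]{8}):((?: [0-9a-fA-F]{2}){16})")


def parse_cqe_dumps(log_text):
    """Collect each 64-byte CQE dump (four 16-byte rows) from kernel log text."""
    cqes = []
    current = bytearray()
    for line in log_text.splitlines():
        match = HEX_ROW.match(line.strip())
        if not match:
            continue
        if int(match.group(1), 16) == 0:
            current = bytearray()  # a row at offset 0 starts a new dump
        current += bytes.fromhex(match.group(2).replace(" ", ""))
        if len(current) == 64:
            cqes.append(bytes(current))
            current = bytearray()
    return cqes


def summarize(cqe):
    """Return the two bytes at the assumed syndrome offsets."""
    return {
        "vendor_err_synd": cqe[VENDOR_ERR_SYND_OFFSET],
        "syndrome": cqe[SYNDROME_OFFSET],
    }
```

Under those assumed offsets, all three dumps above give vendor_err_synd 0x88 and syndrome 0x13. If the enum in the same header is read correctly, 0x13 is MLX5_CQE_SYNDROME_REMOTE_ACCESS_ERR, but that mapping should be confirmed against the source before relying on it.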

I found a post suggesting this might be IOMMU related, but disabling the 
IOMMU doesn't seem to help.

We're running Lustre 2.15, more or less at the tip of b2_15
(b74560d74a9f890838dbf2f0719e3d27c1e5eaf8)

Has anyone seen this before, or does anyone have any pointers?

Thanks

Shane
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org