Hi Jennifer,

I beg your pardon, but without further help all I see and can confirm is
that indeed you have I/O errors. I can see your I/O errors:

 381 Mar 23 12:23:58 ICTM1605S01H4 kernel: [ 1141.462069] blk_update_request: 
I/O error, dev nvme0c0n43, sector 2537840 op 0x0:(READ) flags 0x4004000 
phys_seg 66 prio class 0
 391 Mar 23 12:23:58 ICTM1605S01H4 kernel: [ 1141.464827] Buffer I/O error on 
dev nvme0n43, logical block 0, async page read
 394 Mar 23 12:23:58 ICTM1605S01H4 kernel: [ 1141.465199] 
ldm_validate_partition_table(): Disk read failed.

And what maybe is the underlying issue in a problematic rdma connection.

 566 Mar 23 12:24:18 ICTM1605S01H4 kernel: [ 1161.461659] nvme nvme1: rdma 
connection establishment failed (-104)
 567 Mar 23 12:24:18 ICTM1605S01H4 kernel: [ 1161.461673] nvme nvme1: Failed 
reconnect attempt 1
 568 Mar 23 12:24:18 ICTM1605S01H4 kernel: [ 1161.461678] nvme nvme1: 
Reconnecting in 10 seconds...

But unfortunately the logs themselves do neither contain more than what you've 
already found.
Nor is your description detailed enough to "try the same in other environments"

You wrote
"
  On all four of my Ubuntu 20.04 hosts, an I/O error is detected almost
  immediately after my E-Series storage controller ***???***.
"

I've marked the missing spot in your text, what steps exactly did you do
to trigger this error. The sentence in the bug report ends a bit abrupt.

Furthermore I wanted to ask, is that smash tool a helper to exercise
tests/stress onto the disks? And if so from where did you get it as I
have not immediately found good hits for it?

And finally, if Linux sees just I/O errors and failing rdma, there
should be something on the other end as well - at least disconnects or
such. So is there anything else that might help to understand this on
the netapp side?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1920991

Title:
  Ubuntu 20.04 - NVMe/IB I/O error detected while manually resetting
  controller

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvme-cli/+bug/1920991/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to