I’m going to post this to the CloudStack list as well.

When I attempt to rsync a large file to the Ceph volume, the instance becomes
unresponsive at the network level. It eventually comes back, but it repeatedly
drops offline while the file copies. Dmesg shows this on the CloudStack
host machine:

[ 7144.888744] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH <80>
TDT <d0>
next_to_use <d0>
next_to_clean <7f>
buffer_info[next_to_clean]:
time_stamp <100686d46>
next_to_watch <80>
jiffies <100687140>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 7146.872563] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH <80>
TDT <d0>
next_to_use <d0>
next_to_clean <7f>
buffer_info[next_to_clean]:
time_stamp <100686d46>
next_to_watch <80>
jiffies <100687900>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 7148.856703] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH <80>
TDT <d0>
next_to_use <d0>
next_to_clean <7f>
buffer_info[next_to_clean]:
time_stamp <100686d46>
next_to_watch <80>
jiffies <1006880c0>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 7150.199756] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly

The host machine:

System Information
Manufacturer: Dell Inc.
Product Name: OptiPlex 990

Running CentOS 8.4.

I also see the same error on another host of a different hardware type:

Manufacturer: Hewlett-Packard
Product Name: HP Compaq 8200 Elite SFF PC

but both are using the e1000e driver.

I upgraded the kernel to 5.13.x and thought that fixed the issue, but now I am
seeing the error again.

After migrating the instance to a bigger server-class machine (an old Rackable
system, also using e1000e) where I have a bigger pipe via bonding, I don’t seem
to have the issue.

Just curious whether this could be a known bug with e1000e and whether there is
any kind of workaround.
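One mitigation I have seen suggested for e1000e "Detected Hardware Unit Hang"
reports is disabling the NIC's segmentation and scatter-gather offloads with
ethtool. I have not confirmed this fixes my case, and the interface name
(eno1) is just the one from my dmesg output, so treat this as a sketch:

```shell
# Untested mitigation often suggested for e1000e unit hangs:
# turn off TSO/GSO/GRO and scatter-gather on the affected NIC.
ethtool -K eno1 tso off gso off gro off sg off

# Show the resulting offload settings to confirm the change took effect.
ethtool -k eno1 | grep -E 'tcp-segmentation|generic-(segmentation|receive)|scatter-gather'
```

These settings do not survive a reboot on their own; on CentOS they would need
to be persisted (e.g. via the network configuration for the interface).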

Thanks
-jeremy

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
