Hi Roland,
this looks like a messenger error to me, so it's likely a transport/networking issue rather than a data-at-rest one.
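
To narrow this down, you could check the NIC error counters on the OSD host and its peers, verify the MTU along the cluster network path, and temporarily raise the messenger debug level on the affected OSD. A rough sketch (the interface name eno1 is just a placeholder for whatever NIC carries your cluster network; the peer address is taken from your log):

    # look for rx/tx errors, drops and CRC counters on the cluster-network NIC
    ip -s link show eno1
    ethtool -S eno1 | grep -Ei 'err|drop|crc'

    # if you run jumbo frames, check that they pass unfragmented to a peer OSD host
    ping -M do -s 8972 192.168.131.3

    # temporarily raise messenger logging on osd.40, reproduce, then revert
    ceph tell osd.40 config set debug_ms 1
    ceph tell osd.40 config set debug_ms 0

And if you want to rule out the drive itself, smartctl -a (or nvme smart-log) on the NVMe device reports its own media/CRC error counters; the "bad crc" messages below come from the network messenger, not from BlueStore reading the disk.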
Thanks,
Igor
On 8/27/2025 5:01 PM, Roland Giesler wrote:
I have a relatively new Samsung Enterprise NVMe in a node that is
generating the following errors:
2025-08-26T15:56:43.870+0200 7fe8ac968700 0 bad crc in data 3326000616 != exp 1246001655 from v1:192.168.131.4:0/1799093090
2025-08-26T16:03:54.757+0200 7fe8ad96a700 0 bad crc in data 3195468789 != exp 4291467912 from v1:192.168.131.3:0/315398791
2025-08-26T16:17:34.160+0200 7fe8ad96a700 0 bad crc in data 1471079732 != exp 1408597599 from v1:192.168.131.3:0/315398791
2025-08-26T16:33:34.035+0200 7fe8ad96a700 0 bad crc in data 724234454 != exp 3110238891 from v1:192.168.131.3:0/315398791
2025-08-26T16:36:34.265+0200 7fe8ad96a700 0 bad crc in data 96649884 != exp 3724606899 from v1:192.168.131.3:0/315398791
2025-08-26T16:40:34.395+0200 7fe8ad96a700 0 bad crc in data 1554359919 != exp 1420125995 from v1:192.168.131.3:0/315398791
2025-08-26T16:54:18.323+0200 7fe8ad169700 0 bad crc in data 362320144 != exp 1850249930 from v1:192.168.131.1:0/1316652062
This is ceph osd.40. More details from the log:
2025-08-26T17:00:15.013+0200 7fe8a06fe700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415016487, "job": 447787, "event": "table_file_deletion", "file_number": 389377}
2025-08-26T17:00:15.013+0200 7fe8a06fe700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415016839, "job": 447787, "event": "table_file_deletion", "file_number": 389359}
2025-08-26T17:00:15.013+0200 7fe8a06fe700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415017245, "job": 447787, "event": "table_file_deletion", "file_number": 389342}
2025-08-26T17:00:15.013+0200 7fe8a06fe700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415017633, "job": 447787, "event": "table_file_deletion", "file_number": 389276}
2025-08-26T17:00:15.041+0200 7fe8a06fe700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415047481, "job": 447787, "event": "table_file_deletion", "file_number": 389254}
2025-08-26T17:00:15.045+0200 7fe8a06fe700 4 rocksdb: (Original Log Time 2025/08/26-17:00:15.047592) [db/db_impl/db_impl_compaction_flush.cc:2818] Compaction nothing to do
2025-08-26T17:04:35.776+0200 7fe899ee2700 4 rocksdb: [db/db_impl/db_impl.cc:901] ------- DUMPING STATS -------
2025-08-26T17:04:35.776+0200 7fe899ee2700 4 rocksdb: [db/db_impl/db_impl.cc:903]
** DB Stats **
Uptime(secs): 24012063.5 total, 600.0 interval
Cumulative writes: 10G writes, 37G keys, 10G commit groups, 1.0 writes per commit group, ingest: 19073.99 GB, 0.81 MB/s
Cumulative WAL: 10G writes, 4860M syncs, 2.14 writes per sync, written: 19073.99 GB, 0.81 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 159K writes, 546K keys, 159K commit groups, 1.0 writes per commit group, ingest: 202.96 MB, 0.34 MB/s
Interval WAL: 159K writes, 76K syncs, 2.10 writes per sync, written: 0.20 MB, 0.34 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
** Compaction Stats [default] **
Level  Files  Size       Score  Read(GB)  Rn(GB)  Rnp1(GB)  Write(GB)  Wnew(GB)  Moved(GB)  W-Amp   Rd(MB/s)  Wr(MB/s)  Comp(sec)  CompMergeCPU(sec)  Comp(cnt)  Avg(sec)  KeyIn  KeyDrop
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0   0/0    0.00 KB    0.0    0.0       0.0     0.0       2.4        2.4       0.0        1.0     0.0       25.1      98.33      49.99              19074      0.005     0      0
  L1   2/0    137.69 MB  1.0    299.7     2.4     297.3     297.9      0.7       0.0        123.5   59.6      59.2      5152.28    4745.40            4769       1.080     8130M  47M
  L2   7/0    410.80 MB  0.2    1.3       0.2     1.1       1.2        0.1       0.3        5.8     79.4      72.1      17.00      15.59              4          4.250     32M    3822K
 Sum   9/0    548.48 MB  0.0    301.0     2.6     298.4     301.5      3.1       0.3        125.0   58.5      58.6      5267.61    4810.98            23847      0.221     8162M  51M
 Int   0/0    0.00 KB    0.0    0.1       0.0     0.1       0.1        0.0       0.0        3194.3  48.0      47.9      2.87       2.72               2          1.437     5633K  2173
** Compaction Stats [default] **
Priority  Files  Size     Score  Read(GB)  Rn(GB)  Rnp1(GB)  Write(GB)  Wnew(GB)  Moved(GB)  W-Amp  Rd(MB/s)  Wr(MB/s)  Comp(sec)  CompMergeCPU(sec)  Comp(cnt)  Avg(sec)  KeyIn  KeyDrop
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Low      0/0    0.00 KB  0.0    301.0     2.6     298.4     299.1      0.7       0.0        0.0    59.6      59.3      5169.28    4761.00            4773       1.083     8162M  51M
High      0/0    0.00 KB  0.0    0.0       0.0     0.0       2.4        2.4       0.0        0.0    0.0       25.1      98.33      49.99              19073      0.005     0      0
User      0/0    0.00 KB  0.0    0.0       0.0     0.0       0.0        0.0       0.0        0.0    0.0       32.8      0.00       0.00               1          0.001     0      0
This gets repeated many times until finally all seems well again.
Then, a few minutes later, the problem repeats itself.
Is a faulty SSD causing this?
Environment:
# pveversion: pve-manager/7.4-19/f98bf8d4 (running kernel: 5.15.131-2-pve)
# ceph version 17.2.7 (29dffbfe59476a6bb5363cf5cc629089b25654e3) quincy (stable)
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io