Hi Roland,

This looks like a messenger error to me. Hence it's a transport/networking issue rather than a data-at-rest one.
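
If you want to sanity-check that theory, a first step could be the NIC error counters on the hosts involved. A minimal sketch, assuming the cluster network runs over an interface named eth0 (substitute your actual NIC names):

# Kernel-level RX/TX error and drop counters for the interface
ip -s link show dev eth0

# Driver/hardware counters; CRC, frame or alignment errors here point
# at cabling, optics or switch ports rather than at the SSD
ethtool -S eth0 | grep -iE 'err|drop|crc'

Non-zero and growing error counters on the links between the OSD hosts would support the transport explanation.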

Thanks,

Igor

On 8/27/2025 5:01 PM, Roland Giesler wrote:
I have a relatively new Samsung enterprise NVMe drive in a node that is generating the following error:

2025-08-26T15:56:43.870+0200 7fe8ac968700  0 bad crc in data 3326000616 != exp 1246001655 from v1:192.168.131.4:0/1799093090
2025-08-26T16:03:54.757+0200 7fe8ad96a700  0 bad crc in data 3195468789 != exp 4291467912 from v1:192.168.131.3:0/315398791
2025-08-26T16:17:34.160+0200 7fe8ad96a700  0 bad crc in data 1471079732 != exp 1408597599 from v1:192.168.131.3:0/315398791
2025-08-26T16:33:34.035+0200 7fe8ad96a700  0 bad crc in data 724234454 != exp 3110238891 from v1:192.168.131.3:0/315398791
2025-08-26T16:36:34.265+0200 7fe8ad96a700  0 bad crc in data 96649884 != exp 3724606899 from v1:192.168.131.3:0/315398791
2025-08-26T16:40:34.395+0200 7fe8ad96a700  0 bad crc in data 1554359919 != exp 1420125995 from v1:192.168.131.3:0/315398791
2025-08-26T16:54:18.323+0200 7fe8ad169700  0 bad crc in data 362320144 != exp 1850249930 from v1:192.168.131.1:0/1316652062
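
Note the bad-crc reports come from several different peers (192.168.131.1, .3 and .4). To see whether they cluster around one host or are spread across the cluster, a quick tally from the OSD log could look like this (the log path assumes a default /var/log/ceph layout):

# Count bad-crc events per reporting peer address
grep 'bad crc in data' /var/log/ceph/ceph-osd.40.log \
  | grep -oE 'v1:[0-9.]+:[0-9]+' \
  | sort | uniq -c | sort -rn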


This is ceph osd.40.  More details from the log:

2025-08-26T17:00:15.013+0200 7fe8a06fe700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415016487, "job": 447787, "event": "table_file_deletion", "file_number": 389377}
2025-08-26T17:00:15.013+0200 7fe8a06fe700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415016839, "job": 447787, "event": "table_file_deletion", "file_number": 389359}
2025-08-26T17:00:15.013+0200 7fe8a06fe700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415017245, "job": 447787, "event": "table_file_deletion", "file_number": 389342}
2025-08-26T17:00:15.013+0200 7fe8a06fe700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415017633, "job": 447787, "event": "table_file_deletion", "file_number": 389276}
2025-08-26T17:00:15.041+0200 7fe8a06fe700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1756220415047481, "job": 447787, "event": "table_file_deletion", "file_number": 389254}
2025-08-26T17:00:15.045+0200 7fe8a06fe700  4 rocksdb: (Original Log Time 2025/08/26-17:00:15.047592) [db/db_impl/db_impl_compaction_flush.cc:2818] Compaction nothing to do
2025-08-26T17:04:35.776+0200 7fe899ee2700  4 rocksdb: [db/db_impl/db_impl.cc:901] ------- DUMPING STATS -------
2025-08-26T17:04:35.776+0200 7fe899ee2700  4 rocksdb: [db/db_impl/db_impl.cc:903]
** DB Stats **
Uptime(secs): 24012063.5 total, 600.0 interval
Cumulative writes: 10G writes, 37G keys, 10G commit groups, 1.0 writes per commit group, ingest: 19073.99 GB, 0.81 MB/s
Cumulative WAL: 10G writes, 4860M syncs, 2.14 writes per sync, written: 19073.99 GB, 0.81 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 159K writes, 546K keys, 159K commit groups, 1.0 writes per commit group, ingest: 202.96 MB, 0.34 MB/s
Interval WAL: 159K writes, 76K syncs, 2.10 writes per sync, written: 0.20 MB, 0.34 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       2.4      2.4       0.0    1.0      0.0     25.1     98.33             49.99     19074    0.005       0      0
  L1      2/0  137.69 MB   1.0    299.7     2.4    297.3     297.9      0.7       0.0  123.5     59.6     59.2   5152.28           4745.40      4769    1.080   8130M    47M
  L2      7/0  410.80 MB   0.2      1.3     0.2      1.1       1.2      0.1       0.3    5.8     79.4     72.1     17.00             15.59         4    4.250     32M  3822K
 Sum      9/0  548.48 MB   0.0    301.0     2.6    298.4     301.5      3.1       0.3  125.0     58.5     58.6   5267.61           4810.98     23847    0.221   8162M    51M
 Int      0/0    0.00 KB   0.0      0.1     0.0      0.1       0.1      0.0       0.0 3194.3     48.0     47.9      2.87              2.72         2    1.437   5633K   2173
** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Low      0/0    0.00 KB   0.0    301.0     2.6    298.4     299.1      0.7       0.0   0.0     59.6     59.3   5169.28           4761.00      4773    1.083   8162M    51M
High      0/0    0.00 KB   0.0      0.0     0.0      0.0       2.4      2.4       0.0   0.0      0.0     25.1     98.33             49.99     19073    0.005       0      0
User      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0     32.8      0.00              0.00         1    0.001       0      0

This gets repeated many times until finally all seems well again.

Then, a few minutes later, the problem repeats itself.

Is a faulty SSD causing this?
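
To check the drive-health side of that question, the controller's own error counters are worth a look. A minimal sketch, assuming the OSD sits on /dev/nvme0 (substitute your actual device):

# NVMe SMART/health log; media errors or a rising critical-warning
# count would implicate the drive itself
nvme smart-log /dev/nvme0

# Full SMART report, including the device error log
smartctl -a /dev/nvme0n1

If both come back clean, that again points at the network path rather than at data at rest on the SSD.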

Environment:
# pveversion: pve-manager/7.4-19/f98bf8d4 (running kernel: 5.15.131-2-pve)
# ceph version 17.2.7 (29dffbfe59476a6bb5363cf5cc629089b25654e3) quincy (stable)
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io