Hi Frank
1. We will disable the disk controller and disk-level caching to avoid future 
issues.
2. My pools are:
ceph osd lspools
    2 cephfs_metadata
    3 cephfs_data
    4 rbd
The inconsistent PG is 3.b. The prefix 3 is the pool ID, so it belongs to the 
cephfs_data pool.
The following also shows that PG 3.b belongs to cephfs_data:
ceph pg ls-by-pool cephfs_data | grep 3.b
3.b     6992        0         0       0  9649392528           0          0 3005 
active+clean+inconsistent   ...

3. The deep scrub shows only one object with an issue: soid 
3:d577e975:::1000023675e.00000000
This object seems to be lost.
rados -p cephfs_metadata ls | grep 1000023675e.00000000
rados -p cephfs_data ls | grep 1000023675e.00000000
rados -p rbd ls | grep 1000023675e.00000000

4. I tried to find which files are affected by this issue, but I get "No 
such file or directory" for the path. I have mounted ceph on home properly, as 
before.
cephfs-data-scan -c /etc/ceph/ceph.conf pg_files /home/sagara 3.b
2020-11-03T17:06:21.770+0800 7f3f213ab100 -1 pgeffects.hit_dir: Failed to 
open path: (2) No such file or directory
How can I see which files are affected by this issue?
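
One alternative that might work without pg_files, as a rough sketch: CephFS 
names data objects <inode number in hex>.<stripe index>, so the object name 
from the scrub report can be converted to a decimal inode number and searched 
for on the mounted file system. The helper name obj_to_inode is hypothetical, 
and this assumes the client reports CephFS inode numbers unchanged.

```shell
# Hypothetical helper: convert a CephFS data object name
# (<inode-hex>.<stripe-index-hex>) to a decimal inode number.
obj_to_inode() {
    hex=${1%%.*}              # drop the ".00000000" stripe suffix
    printf '%d\n' "0x$hex"    # hex -> decimal
}

# Possible usage on the mounted file system (can be slow on large trees):
#   find /home/sagara -inum "$(obj_to_inode 1000023675e.00000000)"
```

If find prints a path, that file is the one owning the damaged object.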

5. What should the course of action be now to bring the cluster back to 
"active+clean" and move forward? I don't mind rolling back the PG that has the 
issue; I have a file-level backup. If rolling back the PG is the way forward, 
how do I do it?

Thank you.
Best regards
Sagara

    On Monday, November 2, 2020, 11:29:55 PM GMT+8, Frank Schilder 
<[email protected]> wrote:  
 
 > But there can be an on-chip disk controller on the motherboard, I'm not sure.

There is always some kind of controller. Could be on-board. Usually, the cache 
settings are accessible when booting into the BIOS set-up.

> If your worry is fsync persistence

No, what I worry about is volatile write cache, which is usually enabled by 
default. This cache exists on the disk as well as on the controller. To avoid 
losing writes on power failure, the controller needs to be in write-through 
mode and the disk write cache disabled. The latter can be done with smartctl, 
the former in the BIOS setup.
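
A minimal sketch of the smartctl part, assuming a smartmontools version that 
supports the wcache setting. The helper name disable_wcache and the APPLY 
switch are hypothetical; by default it only prints the commands. On some 
drives the setting may not survive a power cycle, so it is worth re-checking 
after a reboot.

```shell
# Hypothetical helper: print the commands that disable the volatile write
# cache on the given disks, or run them when APPLY=1 is set.
disable_wcache() {
    for dev in "$@"; do
        if [ "${APPLY:-0}" = 1 ]; then
            smartctl -s wcache,off "$dev"
        else
            echo "smartctl -s wcache,off $dev"
        fi
    done
}

# Check the current state first:  smartctl -g wcache /dev/sdX
# Dry run:                        disable_wcache /dev/sda /dev/sdb
# Apply (as root):                APPLY=1 disable_wcache /dev/sda /dev/sdb
```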

Did you test power failure? If so, how often? On how many hosts simultaneously? 
Pulling network cables will not trigger cache-related problems. The problem 
with write cache is that you rely on a lot of bells and whistles, some of which 
usually fail. With ceph, this will lead to exactly the problem you are 
observing now.

Your pool configuration looks OK. You need to find out where exactly the scrub 
errors are situated. It looks like meta-data damage and you might lose some 
data. Be careful to do only read-only admin operations for now.
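
As a sketch of a read-only way to locate the scrub errors: rados can list the 
objects that the last deep scrub flagged in a given PG. The wrapper name 
show_inconsistent is hypothetical; the PG ID is passed as an argument.

```shell
# Hypothetical wrapper: list the objects flagged as inconsistent by the
# last deep scrub of one PG. This only reads scrub results.
show_inconsistent() {
    rados list-inconsistent-obj "$1" --format=json-pretty
}

# Possible usage:
#   show_inconsistent <pgid>
# A cluster-wide summary of damaged PGs is available with:
#   ceph health detail
```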

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


  
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
