Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Christian Balzer
Hello, I can only nod emphatically to what Robert said: don't issue repairs unless you a) don't care about the data or b) have verified that your primary OSD is good. See this for some details on how to establish which replica(s) are actually good or not:

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Christian Eichelmann
Hi Christian, Hi Robert, thank you for your replies! I was already expecting something like this. But I am seriously worried about that! Just assume that this is happening at night. Our shift does not necessarily have enough knowledge to perform all the steps in Sebastien's article. And if we always

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Dan van der Ster
On Tue, May 12, 2015 at 1:07 AM, Anthony D'Atri a...@dreamsnake.net wrote: Agree that 99+% of the inconsistent PG's I see correlate directly to disk flern. Check /var/log/kern.log*, /var/log/messages*, etc. and I'll bet you find errors correlating. More to this... In the case that an

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Anthony D'Atri
For me that's true about 1/3 the time, but often I do still have to repair the PG after removing the affected OSD. YMMV. Agree that 99+% of the inconsistent PG's I see correlate directly to disk flern. Check /var/log/kern.log*, /var/log/messages*, etc. and I'll bet you find errors

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Chris Hoy Poy
-users@lists.ceph.com Subject: [ceph-users] Scrub Error / How does ceph pg repair work? Hi all! We are experiencing approximately 1 scrub error / inconsistent pg every two days. As far as I know, to fix this you can issue a ceph pg repair, which works fine for us. I have a few questions regarding

[ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Christian Eichelmann
Hi all! We are experiencing approximately 1 scrub error / inconsistent pg every two days. As far as I know, to fix this you can issue a ceph pg repair, which works fine for us. I have a few questions regarding the behavior of the ceph cluster in such a case: 1. After ceph detects the scrub error,

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Anthony D'Atri
Agree that 99+% of the inconsistent PG's I see correlate directly to disk flern. Check /var/log/kern.log*, /var/log/messages*, etc. and I'll bet you find errors correlating.
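
The log check suggested here could be scripted roughly as follows. This is only a sketch: the error patterns are assumptions (kernel messages differ between drivers and kernel versions), and the function names are hypothetical, not part of Ceph or any existing tool.

```python
import glob
import gzip
import re

# ASSUMPTION: illustrative patterns only; adjust to what your kernel and
# disk drivers actually log.
DISK_ERROR_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"medium error", re.IGNORECASE),
    re.compile(r"unrecovered read error", re.IGNORECASE),
]


def find_disk_errors(lines):
    """Return the lines that match any of the assumed disk-error patterns."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in DISK_ERROR_PATTERNS)
    ]


def scan_logs(globs=("/var/log/kern.log*", "/var/log/messages*")):
    """Yield (path, line) for every matching line in the given log files."""
    for pattern in globs:
        for path in sorted(glob.glob(pattern)):
            # Rotated logs are often gzip-compressed.
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt", errors="replace") as handle:
                for line in find_disk_errors(handle):
                    yield path, line
```

Iterating over `scan_logs()` on an OSD host (with permission to read the logs) prints nothing useful by itself; the matches still have to be correlated by hand with the device backing the OSD named in the scrub error.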

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Robert LeBlanc
Personally I would not just run this command automatically because, as you stated, it only copies the primary PGs to the replicas, and if the primary is corrupt, you will corrupt your secondaries. I think the monitor log shows which OSD has the problem so if it is not your primary, then just issue
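
The check described in this message, whether the OSD named in the log is the PG's primary, can be written down as a tiny helper. This is a sketch under two assumptions: the acting set is given with the primary first (as is usual for replicated pools), and the flagged OSD id has already been read from the log by hand. The function name is hypothetical.

```python
def flagged_osd_is_primary(acting_set, flagged_osd):
    """Return True if the OSD reported as bad is the PG's primary.

    ASSUMPTION: acting_set lists OSD ids with the primary first.
    """
    if not acting_set:
        raise ValueError("acting set is empty")
    return acting_set[0] == flagged_osd
```

For example, `flagged_osd_is_primary([3, 7, 12], 3)` is true, which is the case where a blind repair would propagate the primary's copy to the replicas.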