[ceph-users] Re: PGs down

2020-12-22 Thread Jeremy Austin
Hi Igor, I had taken the OSDs out already, so bringing them up allowed a full rebalance to occur. I verified that they were not exhibiting ATA or SMART-reportable errors, wiped them, and re-added them. I will deep scrub. Thanks again! Jeremy On Mon, Dec 21, 2020 at 11:39 PM Igor Fedotov wrote: >
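For reference, a deep scrub can be kicked off manually once the rebuilt OSDs are back in service. A minimal sketch, with placeholder OSD and PG ids:

    # deep-scrub every PG for which a given OSD is primary (id is a placeholder)
    ceph osd deep-scrub osd.11
    # or deep-scrub one placement group at a time (pgid is a placeholder)
    ceph pg deep-scrub 2.1f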

[ceph-users] Re: PGs down

2020-12-22 Thread Igor Fedotov
Hi Jeremy, good to know you managed to bring your OSDs up. Have you been able to reweight them to 0 and migrate data out of these "broken" OSDs? If so, I suggest redeploying them - the corruption is still in the DB and it might pop up one day. If not, please do that first - you might still
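A rough sketch of the reweight-and-redeploy sequence described here, assuming Luminous-or-later tooling; the OSD ids and device path are placeholders:

    # push the remaining data off the suspect OSDs
    ceph osd crush reweight osd.11 0
    ceph osd crush reweight osd.12 0
    # once the cluster is back to HEALTH_OK, remove and recreate each OSD
    ceph osd purge 11 --yes-i-really-mean-it
    ceph-volume lvm zap /dev/sdX --destroy
    ceph-volume lvm create --data /dev/sdX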

[ceph-users] Re: PGs down

2020-12-21 Thread Jeremy Austin
Igor, You're a bloomin' genius, as they say. Disabling auto compaction allowed OSDs 11 and 12 to spin up/out. The 7 down PGs recovered; there were a few unfound items previously, which I went ahead and deleted, given that this is EC and revert is not an option. HEALTH OK :) I'm now intending to
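For context, deleting unfound objects (when revert is not possible, as on this EC pool) is normally done per PG; a sketch with a placeholder PG id:

    # identify PGs with unfound objects
    ceph health detail
    # give up on the unfound objects in one PG and delete them (pgid is a placeholder)
    ceph pg 2.1f mark_unfound_lost delete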

[ceph-users] Re: PGs down

2020-12-21 Thread Igor Fedotov
Hi Alexander, the option you provided controls bluefs log compaction, not rocksdb compaction. Hence it doesn't apply in Jeremy's case. Thanks, Igor On 12/21/2020 6:55 AM, Alexander E. Patrakov wrote: On Mon, Dec 21, 2020 at 4:57 AM Jeremy Austin wrote: On Sun, Dec 20, 2020 at 2:25 PM

[ceph-users] Re: PGs down

2020-12-21 Thread Igor Fedotov
Hi Jeremy, you might want to try RocksDB's disable_auto_compactions option for that. To adjust rocksdb options, one should edit/insert bluestore_rocksdb_options in ceph.conf. E.g. bluestore_rocksdb_options =
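The example is cut off in the preview. Purely as an illustration (not necessarily the exact string from the original mail), and assuming Nautilus-era defaults, it might look like the following; note that setting bluestore_rocksdb_options replaces the built-in default string, so the defaults for the release in use should be copied and disable_auto_compactions=true appended:

    [osd]
    # illustrative only: copy your release's default rocksdb option string,
    # then add disable_auto_compactions=true at the end
    bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=268435456,writable_file_max_buffer_size=0,compaction_readahead_size=2097152,max_background_compactions=2,disable_auto_compactions=true

As the rest of the thread shows, this is meant as a temporary measure to get the damaged OSDs up long enough to drain them, not as a permanent setting.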

[ceph-users] Re: PGs down

2020-12-21 Thread Jeremy Austin
On Sun, Dec 20, 2020 at 6:56 PM Alexander E. Patrakov wrote: > On Mon, Dec 21, 2020 at 4:57 AM Jeremy Austin wrote: > > > > On Sun, Dec 20, 2020 at 2:25 PM Jeremy Austin > wrote: > > > > > Will attempt to disable compaction and report. > > > > > > > Not sure I'm doing this right. In [osd]

[ceph-users] Re: PGs down

2020-12-20 Thread Alexander E. Patrakov
On Mon, Dec 21, 2020 at 4:57 AM Jeremy Austin wrote: > > On Sun, Dec 20, 2020 at 2:25 PM Jeremy Austin wrote: > > > Will attempt to disable compaction and report. > > > > Not sure I'm doing this right. In [osd] section of ceph.conf, I added > periodic_compaction_seconds=0 > > and attempted to

[ceph-users] Re: PGs down

2020-12-20 Thread Jeremy Austin
On Sun, Dec 20, 2020 at 2:25 PM Jeremy Austin wrote: > Will attempt to disable compaction and report. > Not sure I'm doing this right. In [osd] section of ceph.conf, I added periodic_compaction_seconds=0 and attempted to start the OSDs in question. Same error as before. Am I setting compaction

[ceph-users] Re: PGs down

2020-12-20 Thread Jeremy Austin
Sorry for the delay, Igor; answers inline. On Mon, Dec 14, 2020 at 2:09 AM Igor Fedotov wrote: > Hi Jeremy, > > I think you lost the data for OSD.11 & .12 I'm not aware of any reliable > enough way to recover RocksDB from this sort of errors. > > Theoretically you might want to disable auto

[ceph-users] Re: PGs down

2020-12-15 Thread Igor Fedotov
otov Sent: Monday, 14 December 2020 12:09 To: Jeremy Austin Cc: ceph-users@ceph.io Subject: [ceph-users] Re: PGs down Hi Jeremy, I think you lost the data for OSD.11 & .12 I'm not aware of any reliable enough way to recover RocksDB from this sort of errors. Theoretically you might want t

[ceph-users] Re: PGs down

2020-12-15 Thread Wout van Heeswijk
To: Jeremy Austin Cc: ceph-users@ceph.io Subject: [ceph-users] Re: PGs down Hi Jeremy, I think you lost the data for OSD.11 & .12 I'm not aware of any reliable enough way to recover RocksDB from this sort of errors. Theoretically you might want to disable auto compaction for Roc

[ceph-users] Re: PGs down

2020-12-14 Thread Igor Fedotov
Hi Jeremy, I think you lost the data for OSD.11 & .12. I'm not aware of any reliable enough way to recover RocksDB from this sort of error. Theoretically you might want to disable auto compaction for RocksDB for these daemons and try to bring them up and attempt to drain the data out of
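A minimal sketch of the drain step suggested here, assuming the OSDs start once auto compaction is disabled; the ids are placeholders:

    # mark the suspect OSDs out so their data is migrated to the rest of the cluster
    ceph osd out 11 12
    # watch recovery/backfill until the affected PGs are active+clean again
    ceph -s
    ceph pg stat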

[ceph-users] Re: PGs down

2020-12-13 Thread Jeremy Austin
OSD 12 looks much the same. I don't have logs back to the original date, but this looks very similar — db/sst corruption. The standard fsck approaches couldn't fix it. I believe it was a form of ATA failure — OSD 11 and 12, if I recall correctly, did not actually experience SMARTD-reportable
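For reference, the "standard fsck approaches" would typically be something along these lines (a sketch; the OSD must be stopped first and the data path is a placeholder):

    systemctl stop ceph-osd@11
    # consistency check of the BlueStore metadata and RocksDB
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-11
    # attempt an automatic repair (did not help with this checksum corruption)
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-11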

[ceph-users] Re: PGs down

2020-12-12 Thread Igor Fedotov
Hi Jeremy, wondering what the OSDs' logs showed when they crashed for the first time? And does OSD.12 report a similar problem now: 3> 2020-12-12 20:23:45.756 7f2d21404700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 3113305400, got 1242690251 in
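A sketch of how the original crash and the current corruption messages might be pulled from the OSD logs, assuming the default log locations:

    # current and rotated OSD log files, filtered for rocksdb corruption messages
    grep -i 'rocksdb.*corruption' /var/log/ceph/ceph-osd.12.log
    zgrep -i 'rocksdb.*corruption' /var/log/ceph/ceph-osd.12.log.*.gz
    # or via the journal on systemd-based hosts
    journalctl -u ceph-osd@12 | grep -i corruption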