Hi Igor,
I had taken the OSDs out already, so bringing them up allowed a full
rebalance to occur.
I verified that they were not exhibiting ATA- or SMART-reportable errors,
wiped them, and re-added them.
I will deep scrub.
Thanks again!
Jeremy
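[Editor's note: a minimal sketch of how that deep scrub can be kicked off with the standard ceph CLI; the OSD ids are the ones from this thread, and <pgid> is a placeholder, so check the exact syntax against your Ceph release:

    # ask the listed OSDs to deep-scrub the PGs they host
    ceph osd deep-scrub 11
    ceph osd deep-scrub 12
    # or target a single placement group instead
    ceph pg deep-scrub <pgid>
]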
On Mon, Dec 21, 2020 at 11:39 PM Igor Fedotov wrote:
Hi Jeremy,
good to know you managed to bring your OSDs up.
Have you been able to reweight them to 0 and migrate data out of these
"broken" OSDs?
If so, I suggest redeploying them - the corruption is still in the DB and
it might pop up one day.
If not, please do that first - you might still
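[Editor's note: a hedged sketch of the drain-then-redeploy sequence Igor describes, using the standard ceph CLI; the OSD ids are from this thread, and the flags should be verified against your release:

    # push data off the suspect OSDs by zeroing their CRUSH weight
    ceph osd crush reweight osd.11 0
    ceph osd crush reweight osd.12 0
    # once the cluster is clean, take them out and remove them for redeployment
    ceph osd out 11
    ceph osd purge 11 --yes-i-really-mean-it
]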
Igor,
You're a bloomin' genius, as they say.
Disabling auto compaction allowed OSDs 11 and 12 to spin up/out. The 7 down
PGs recovered; there were a few unfound objects previously, which I went
ahead and deleted, given that this is EC and revert is not an option.
HEALTH_OK :)
I'm now intending to
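[Editor's note: a sketch of the unfound-object deletion described above, with the standard ceph CLI; <pgid> is a placeholder:

    # list unfound objects for a PG, then delete them
    # ("revert" is only available on replicated pools, not EC)
    ceph pg <pgid> list_unfound
    ceph pg <pgid> mark_unfound_lost delete
]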
Hi Alexander,
the option you provided controls BlueFS log compaction, not RocksDB's.
Hence it doesn't apply in Jeremy's case.
Thanks,
Igor
On 12/21/2020 6:55 AM, Alexander E. Patrakov wrote:
On Mon, Dec 21, 2020 at 4:57 AM Jeremy Austin wrote:
On Sun, Dec 20, 2020 at 2:25 PM
Hi Jeremy,
you might want to try RocksDB's disable_auto_compactions option for that.
To adjust rocksdb's options one should edit/insert
bluestore_rocksdb_options in ceph.conf.
E.g.
bluestore_rocksdb_options =
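[Editor's note: the example above is cut off in the archive. A hedged guess at the intended shape, assuming the disable_auto_compactions option named earlier; note that setting bluestore_rocksdb_options replaces Ceph's default RocksDB option string rather than merging with it, so check your defaults first:

    [osd]
    # assumed example; merge with your existing bluestore_rocksdb_options value
    bluestore_rocksdb_options = disable_auto_compactions=true
]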
On Sun, Dec 20, 2020 at 2:25 PM Jeremy Austin wrote:
> Will attempt to disable compaction and report.
>
Not sure I'm doing this right. In the [osd] section of ceph.conf, I added
periodic_compaction_seconds=0
and attempted to start the OSDs in question. Same error as before. Am I
setting compaction
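[Editor's note: the thread later answers this. A bare periodic_compaction_seconds=0 key in [osd] is not a name ceph-osd itself recognizes; RocksDB tunables have to travel inside the bluestore_rocksdb_options string. A hedged sketch of that placement (the exact option string is an assumption, not from the thread):

    [osd]
    # RocksDB tunables must be packed into this one option string;
    # a bare RocksDB key placed directly in [osd] does not reach RocksDB
    bluestore_rocksdb_options = disable_auto_compactions=true,periodic_compaction_seconds=0
]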
Sorry for the delay, Igor; answers inline.
On Mon, Dec 14, 2020 at 2:09 AM Igor Fedotov wrote:
> Hi Jeremy,
>
> I think you lost the data for OSD.11 & .12. I'm not aware of any reliable
> enough way to recover RocksDB from this sort of error.
>
> Theoretically you might want to disable auto
From: Igor Fedotov
Sent: Monday, 14 December 2020 12:09
To: Jeremy Austin
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: PGs down
Hi Jeremy,
I think you lost the data for OSD.11 & .12. I'm not aware of any
reliable enough way to recover RocksDB from this sort of error.
Theoretically you might want to disable auto compaction for RocksDB for
these daemons, try to bring them up, and attempt to drain the data out
of
OSD 12 looks much the same. I don't have logs back to the original date,
but this looks very similar: db/sst corruption. The standard fsck
approaches couldn't fix it. I believe it was a form of ATA failure; OSDs
11 and 12, if I recall correctly, did not actually experience SMARTD-reportable
Hi Jeremy,
wondering what were the OSDs' logs when they crashed for the first time?
And does OSD.12 report a similar problem now:
3> 2020-12-12 20:23:45.756 7f2d21404700 -1 rocksdb: submit_common error:
Corruption: block checksum mismatch: expected 3113305400, got 1242690251
in