Fri, 7 Oct 2022 at 19:50, Frank Schilder :
> For the interested future reader, we have subdivided 400G high-performance
> SSDs into 4x100G OSDs for our FS meta data pool. The increased concurrency
> improves performance a lot. But yes, we are on the edge. OMAP+META is almost
> 50%.
Please be
From: Szabo, Istvan (Agoda)
Sent: 07 October 2022 14:28
To: Frank Schilder
Cc: Igor Fedotov; ceph-users@ceph.io
Subject: RE: [ceph-users] Re: OSD crashes during upgrade mimic->octopus
Finally how is your pg distribution? How many pg/disk?
Istvan Szabo
Schilder
Sent: Friday, October 7, 2022 6:50 PM
To: Igor Fedotov ; ceph-users@ceph.io
Subject: [ceph-users] Re: OSD crashes during upgrade mimic->octopus
Hi all,
try
crashes during upgrade mimic->octopus
Hi Igor,
I added a sample of OSDs on identical disks. The usage is quite well balanced,
so the numbers I included are representative. I don't believe that we had one
such extreme outlier. Maybe it ran full during conversion. Most of the data is
OMAP after
Just FYI:
standalone ceph-bluestore-tool's quick-fix behaves pretty similar to the
action performed on start-up with bluestore_fsck_quick_fix_on_mount = true
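Put as an offline sequence, the standalone variant could look like this (a sketch assuming systemd-managed OSDs; osd.16 is a placeholder ID):

```shell
# Placeholder OSD id; the daemon must be stopped before any offline work.
OSD_ID=16
systemctl stop "ceph-osd@${OSD_ID}"

# Same format update the on-mount fsck would perform, but offline:
ceph-bluestore-tool quick-fix --path "/var/lib/ceph/osd/ceph-${OSD_ID}"

systemctl start "ceph-osd@${OSD_ID}"
```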
On 10/7/2022 10:18 AM, Frank Schilder wrote:
Hi Stefan,
super thanks!
I found a quick-fix command in the help output:
#
For format updates one can use quick-fix command instead of repair, it
might work a bit faster..
On 10/7/2022 10:07 AM, Stefan Kooman wrote:
On 10/7/22 09:03, Frank Schilder wrote:
Hi Igor and Stefan,
thanks a lot for your help! Our cluster is almost finished with
recovery and I would like
Hi Frank,
one more thing I realized during the night :)
When performing the conversion, the DB gets a significant bunch of new data
(approx. on par with the original OMAP volume) without the old one being
immediately removed. Hence one should expect the DB size to grow dramatically
at this point. Which should
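As a rough rule of thumb from the above: budget at least one extra OMAP-sized chunk of free DB space before starting the conversion. A trivial arithmetic sketch (the numbers are hypothetical):

```shell
# Hypothetical numbers for one OSD, in GiB: current DB footprint and the
# OMAP portion of it.
db_gib=45
omap_gib=40

# During conversion the rewritten OMAP data coexists with the old copy,
# so peak DB usage is roughly the old footprint plus one OMAP volume.
peak_gib=$((db_gib + omap_gib))
echo "expected peak DB usage: ~${peak_gib} GiB"
```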
Hi Stefan,
super thanks!
I found a quick-fix command in the help output:
# ceph-bluestore-tool -h
[...]
Positional options:
--command arg fsck, repair, quick-fix, bluefs-export,
bluefs-bdev-sizes, bluefs-bdev-expand,
Hi Igor and Stefan,
thanks a lot for your help! Our cluster is almost finished with recovery and I
would like to switch to off-line conversion of the SSD OSDs. In one of Stefan's
I could find the command for manual compaction:
ceph-kvstore-tool bluestore-kv "/var/lib/ceph/osd/ceph-${OSD_ID}" compact
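Wrapped into a full offline sequence, this could look as follows (a sketch assuming systemd-managed OSDs and osd.16 as a placeholder; the daemon must be stopped first):

```shell
OSD_ID=16
systemctl stop "ceph-osd@${OSD_ID}"

# Manual RocksDB compaction on the stopped OSD's store:
ceph-kvstore-tool bluestore-kv "/var/lib/ceph/osd/ceph-${OSD_ID}" compact

systemctl start "ceph-osd@${OSD_ID}"
```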
From: Frank Schilder
Sent: 07 October 2022 01:53:20
To: Igor Fedotov; ceph-users@ceph.io
Subject: [ceph-users] Re: OSD crashes during upgrade mimic->octopus
Hi Igor,
I added a sample of OSDs on identical disks. The usage is quite well balanced,
so the numbers I inclu
well, I've just realized that you're apparently unable to collect these
high-level stats for broken OSDs, aren't you?
But if that's the case you shouldn't make any assumption about faulty
OSDs utilization from healthy ones - it's definitely a very doubtful
approach ;)
On 10/7/2022 2:19
The log I inspected was for osd.16 so please share that OSD's
utilization... And honestly I trust the allocator's stats more, so it's
rather the CLI stats that are incorrect, if any. Anyway, a free dump should
provide additional proof..
And once again - do other non-starting OSDs show the same ENOSPC error?
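The free dump mentioned above can be taken from a stopped OSD with ceph-bluestore-tool (a sketch; osd.16 as in the inspected log):

```shell
# Dump the allocator's view of free space, to compare against the
# utilization the CLI reports for the same OSD.
ceph-bluestore-tool free-dump --path /var/lib/ceph/osd/ceph-16 \
    > /tmp/osd.16-free-dump.json
```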
Hi Igor,
I suspect there is something wrong with the data reported. These OSDs are only
50-60% used. For example:
ID CLASS  WEIGHT  REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
29   ssd 0.09099       1.0   93
Hi Frank,
the abort message "bluefs enospc" indicates lack of free space for
additional bluefs space allocations which prevents osd from startup.
From the following log line one can see that bluefs needs ~1M more
space while the total available is approx. 622M. The problem is that
bluefs
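One possible way out of such a bluefs-out-of-space deadlock (a hedged sketch, not confirmed in this thread): attach a new, separate DB device so bluefs gains room to start. `/dev/sdX1` is a placeholder for a spare partition:

```shell
# Migrate bluefs onto an additional dedicated DB volume so the OSD can
# allocate the space it needs to start.
ceph-bluestore-tool bluefs-bdev-new-db \
    --path /var/lib/ceph/osd/ceph-16 \
    --dev-target /dev/sdX1
```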
Hi Igor,
the problematic disk holds OSDs 16,17,18 and 19. OSD 16 is the one crashing the
show. I collected its startup log here: https://pastebin.com/25D3piS6 . The
line sticking out is line 603:
Hi Stefan and anyone else reading this, we are probably misunderstanding each
other here:
> There is a strict MDS maintenance dance you have to perform [1].
> ...
> [1]: https://docs.ceph.com/en/octopus/cephfs/upgrading/
Our ceph fs shut-down was *after* completing the upgrade to octopus, *not
Hi Stefan,
to answer your question as well:
> ... conversion from octopus to
> pacific, and the resharding as well). We would save half the time by
> compacting them before hand. It would take, in our case, many hours to
> do a conversion, so it would pay off immensely. ...
With experiments on
Hi Igor.
> But could you please share full OSD startup log for any one which is
> unable to restart after host reboot?
Will do. I also would like to know what happened here and if it is possible to
recover these OSDs. The rebuild takes ages with the current throttled recovery
settings.
Sorry - no clue about CephFS related questions...
But could you please share full OSD startup log for any one which is
unable to restart after host reboot?
On 10/6/2022 5:12 PM, Frank Schilder wrote:
Hi Igor and Stefan.
Not sure why you're talking about replicated(!) 4(2) pool.
Its
On 10/6/22 16:12, Frank Schilder wrote:
Hi Igor and Stefan.
Not sure why you're talking about replicated(!) 4(2) pool.
It's because in the production cluster it's the 4(2) pool that has that problem. On the
test cluster it was an EC pool. Seems to affect all sorts of pools.
I have to
Hi Igor and Stefan.
> > Not sure why you're talking about replicated(!) 4(2) pool.
>
> It's because in the production cluster it's the 4(2) pool that has that
> problem. On the test cluster it was an EC pool. Seems to affect all sorts
> of pools.
I have to take this one back. It is indeed
On 10/6/2022 3:16 PM, Stefan Kooman wrote:
On 10/6/22 13:41, Frank Schilder wrote:
Hi Stefan,
thanks for looking at this. The conversion has happened on 1 host
only. Status is:
- all daemons on all hosts upgraded
- all OSDs on 1 OSD-host were restarted with
Are the crashing OSDs still bound to two hosts?
If not - does any dead OSD unconditionally mean its underlying disk is
unavailable any more?
On 10/6/2022 3:35 PM, Frank Schilder wrote:
Hi Igor.
Not sure why you're talking about replicated(!) 4(2) pool.
Its because in the production cluster
Hi Igor.
> Not sure why you're talking about replicated(!) 4(2) pool.
It's because in the production cluster it's the 4(2) pool that has that problem.
On the test cluster it was an EC pool. Seems to affect all sorts of pools.
I just lost another disk, we have PGs down now. I really hope the
On 10/6/2022 2:55 PM, Frank Schilder wrote:
Hi Igor,
it has the SSD OSDs down, the HDD OSDs are running just fine. I don't want to
make a bad situation worse for now and wait for recovery to finish. The
inactive PGs are activating very slowly.
Got it.
By the way, there are 2 out of 4 OSDs
On 10/6/22 13:41, Frank Schilder wrote:
Hi Stefan,
thanks for looking at this. The conversion has happened on 1 host only. Status
is:
- all daemons on all hosts upgraded
- all OSDs on 1 OSD-host were restarted with bluestore_fsck_quick_fix_on_mount
= true in its local ceph.conf, these OSDs
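The switch referred to above goes into the OSD host's local ceph.conf (a sketch; typically removed again once the conversion has completed):

```ini
[osd]
bluestore_fsck_quick_fix_on_mount = true
```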
Hi Igor,
it has the SSD OSDs down, the HDD OSDs are running just fine. I don't want to
make a bad situation worse for now and wait for recovery to finish. The
inactive PGs are activating very slowly.
By the way, there are 2 out of 4 OSDs up in the replicated 4(2) pool. Why are
PGs even
From your response to Stefan I'm getting that one of the two damaged hosts
has all OSDs down and unable to start. Is that correct? If so you can
reboot it with no problem and proceed with manual compaction [and other
experiments] quite "safely" for the rest of the cluster.
On 10/6/2022 2:35 PM,
Hi Stefan,
thanks for looking at this. The conversion has happened on 1 host only. Status
is:
- all daemons on all hosts upgraded
- all OSDs on 1 OSD-host were restarted with bluestore_fsck_quick_fix_on_mount
= true in its local ceph.conf, these OSDs completed conversion and rebooted, I
would
Hi Igor,
I can't access these drives. They have an OSD- or LVM process hanging in
D-state. Any attempt to do something with these gets stuck as well.
I somehow need to wait for recovery to finish and protect the still running
OSDs from crashing similarly badly.
After we have full redundancy
IIUC the OSDs that expose "had timed out after 15" are failing to start
up. Is that correct or did I miss something? I meant trying compaction
for them...
On 10/6/2022 2:27 PM, Frank Schilder wrote:
Hi Igor,
thanks for your response.
And what's the target Octopus release?
ceph version
On 10/6/22 13:06, Frank Schilder wrote:
Hi all,
we are stuck with a really unpleasant situation and we would appreciate help.
Yesterday we completed the ceph daemon upgrade from mimic to octopus all the way
through with bluestore_fsck_quick_fix_on_mount = false and started the OSD OMAP
Hi Igor,
thanks for your response.
> And what's the target Octopus release?
ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
I'm afraid I don't have the luxury right now to take OSDs down or add extra
load with an on-line compaction. I would really appreciate a
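For later, once load permits, on-line compaction can be triggered per OSD through the admin socket (a sketch; osd.16 is a placeholder and the command runs on the host holding that OSD):

```shell
# Ask the running OSD to compact its RocksDB; adds I/O load while active.
ceph daemon osd.16 compact
```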
And what's the target Octopus release?
On 10/6/2022 2:06 PM, Frank Schilder wrote:
Hi all,
we are stuck with a really unpleasant situation and we would appreciate help.
Yesterday we completed the ceph daemon upgrade from mimic to octopus all the way
through with