.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Szabo, Istvan (Agoda)
Sent: 11 January 2023 09:06:51
To: Ceph Users
Subject: [ceph-users] Snap trimming best practice
Hi,
I wonder whether you have ever faced this issue:
https://www.spinics.net/lists/ceph-users/msg73231.html
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dongdong Tao
Sent: 11 January 2023 04:30:14
To: Frank Schilder
Cc: Igor Fedotov; ceph-users@ceph.io; cobanser..
-incidence.
Are there any specific conditions for this problem to be present or amplified
that could have to do with hardware?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users
consider restarting an OSD? What values of the above variables are critical and
what are tolerable? Of course a proper fix would be better, but I doubt that
everyone is willing to apply a patch. Therefore, some guidance on how to
mitigate this problem to acceptable levels might be useful. I'm thin
You need to stop all daemons, remove the mon stores and wipe the OSDs with
ceph-volume. Find out which OSDs were running on which host (ceph-volume
inventory DEVICE) and use
ceph-volume lvm zap --destroy --osd-id ID
on these hosts.
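The per-host wipe step could be scripted roughly as below. The OSD ids (3, 7) are placeholders for whatever ceph-volume inventory reported on the host, and the command is only echoed as a dry run — drop the echo only once you are certain these OSDs must be destroyed:

```shell
# Dry-run sketch of the wipe step above. OSD ids 3 and 7 are placeholders
# for the ids found via "ceph-volume inventory DEVICE" on this host.
# "echo" only prints the command instead of running it.
for id in 3 7; do
  zap_cmd="ceph-volume lvm zap --destroy --osd-id ${id}"
  echo "$zap_cmd"
done
```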
Best regards,
=
Frank Schilder
AIT Risø
a user. What exactly is "header navigation"
expected to do if it contains nothing else? Unless I'm looking at the wrong
thing (I can't see the attached image), this header can be removed. The "edit
on GitHub" link can be added to the end of a page.
Best regards,
a way beyond bumping osd_max_scrubs to
increase the number of scheduled and executed deep scrubs.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dan van der Ster
Sent: 05 January 2023 15:36
To: Frank Schilder
Cc
. It
would also be nice to have a command like "ceph mon repair" or "ceph mon
resync" instead of having to do a complete manual daemon rebuild.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dan va
be an explanation.
Regards,
Eugen
Zitat von Frank Schilder :
> Hi all,
>
> we have these messages in our logs daily:
>
> 1/3/23 12:20:00 PM[INF]overall HEALTH_OK
> 1/3/23 12:19:46 PM[ERR] mon.2 ScrubResult(keys
> {auth=77,config=2,health=11,logm=10} crc
> {auth=688385498,
, Google wasn't much help. Is this scrub error something to
worry about?
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
would have the desired effect? Are there
other parameters to look at that allow gradual changes in the number of scrubs
going on?
Thanks a lot for your help!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi Eugen,
thanks! I think this explains our observation.
Thanks and merry Christmas!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: 21 December 2022 14:03:06
To: ceph-users@ceph.io
Subject: [ceph-users] Re
suspicious and we wonder if it
has anything to do with the ceph client/fs.
The cluster has been healthy the whole time.
Best regards and thanks for pointers!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
578 active+clean
6339 active+remapped+backfill_wait
142 active+remapped+backfilling
6 active+clean+snaptrim
io:
client: 32 MiB/s rd, 247 MiB/s wr, 1.10k op/s rd, 1.57k op/s wr
recovery: 4.2 GiB/s, 1.56k objects/s
=====
Frank Sc
Hi Martin,
I can't find the output of
ceph osd df tree
ceph status
anywhere. I thought you posted it, but well. Could you please post the output
of these commands?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
onsider the disk as "available".
Shame that the deployment tools are so inconsistent. It would be much easier to
repair things if there was an easy way to query what is possible, how much
space on a drive could be used and for what, etc.
Best regards,
=
Frank Schil
splitting will stop if recovery IO is going on (some
objects are degraded).
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Martin Buss
Sent: 14 December 2022 19:32
To: ceph-users@ceph.io
Subject: [ceph-users] Re: New
t versions.
So, back to Eugen's answer: go through this list and try solutions of earlier
cases.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Monish Selvaraj
Sent: 12 December 2022 11:32:26
To: Eugen Bloc
the outliers only. What I would not recommend is to go all balanced
and 95% OSD utilisation. You will see serious performance loss after some OSDs
reach 80%, and if you lose an OSD or host you will have to combat the fallout
of deleted upmaps.
Best regards,
=
Frank Schilder
AIT Risø
the pro of being fairly
stable under OSD failures/additions at the expense of a few % less capacity.
Maybe someone else can help here?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Matt Larson
Sent: 04 December
to this confusion.
Both dual-uses are legacy and very hard to clean up in the docs.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Rainer Krienke
Sent: 02 December 2022 12:44:26
To: ceph-users@ceph.io
Sub
or
more physically separated possibilities for network routing that will never go
down simultaneously. If just the network link between OSDs on both sides goes
down, access will be down.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
tience and explanations!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Venky Shankar
Sent: 30 November 2022 07:45
To: Frank Schilder
Cc: Reed Dier; ceph-users; Patrick Donnelly
Subject: Re: [ceph-users] Re: MDS stuck ops
Hi Fran
-ram-growth/
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Gregory Farnum
Sent: 29 November 2022 22:25:54
To: Joshua Timmer
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Implications of pglog_hardlimit
On Tue, Nov
ubtree partitioning policies".
OK, I will try this out, I can restore manual pins without problems.
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Patrick Donnelly
Sent: 29 November 2022 18:08:56
?
Thanks a lot for your time again!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Venky Shankar
Sent: 29 November 2022 15:54:12
To: Frank Schilder
Cc: Reed Dier; ceph-users
Subject: Re: [ceph-users] Re: MDS stuck ops
Hi Frank,
On Tue, Nov
s the implementation not
match the documentation?
Thanks for any insight!
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Venky Shankar
Sent: 29 November 2022 10:09:21
To: Frank Schilder
Cc: Reed Dier; ceph-user
he DCs?
Without stretch mode you need 3 DCs and a geo-replicated 3(2) pool.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Wolfpaw - Dale Corse
Sent: 29 November 2022 07:20:20
To: 'ceph-users'
Subject: [ceph-user
g by hand and it solved all
sorts of problems.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
. If I fail an MDS, it's only 1/8th of users noticing (except maybe rank
0). The fail-over is usually fast enough that I don't get complaints. We have
ca. 1700 kernel clients; it takes a few minutes for the new MDS to become active.
Best regards,
=
Frank Schilder
AIT Risø Campus
oosted. We are also on octopus.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Reed Dier
Sent: 28 November 2022 19:14:55
To: Venky Shankar
Cc: ceph-users
Subject: [ceph-users] Re: MDS stuck ops
Hi Venk
Thanks, also for finding the related tracker issue! It looks like a fix has
already been approved. Hope it shows up in the next release.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: 28
when adding a new host. That's a stable situation
from an operations point of view.
Hope that helps.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Matt Larson
Sent: 26 November 2022 21:07:41
To: ceph-users
t, I think it is
worth a ticket. Since I can't test on versions higher than octopus yet, could
you then open the ticket?
Thanks!
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: 23 November 2022 09:27:22
To: ceph-use
-level sub-dir distributed over 10K
sub-trees, which really didn't help performance at all.
If anyone has the dynamic balancer in action, intentionally or not, it might be
worth trying to pin everything up to a depth of 2-3 in the FS tree.
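Such pinning could be sketched as below, assuming a CephFS mounted at /mnt/cephfs with two active MDS ranks; the mount point and directory names are placeholders, and the setfattr commands are only echoed as a dry run:

```shell
# Pin each depth-1 directory to an MDS rank, round-robin over 2 ranks,
# so the dynamic balancer stops moving those subtrees around.
# /mnt/cephfs and the directory names are placeholders; echo = dry run.
rank=0
for d in home groups scratch apps; do
  echo "setfattr -n ceph.dir.pin -v $rank /mnt/cephfs/$d"
  rank=$(( (rank + 1) % 2 ))
done
```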
Best regards,
=====
Frank Schilder
AIT R
ere is an unofficial recovery
procedure somewhere).
I would prefer that ceph-volume lvm zap employs the same strict sanity checks
as other ceph-commands to avoid accidents. In my case it was a typo, one wrong
letter.
Best regards,
=====
Frank Schilder
AIT Ri
to wpq or look at high-client IO profiles.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: martin.kon...@konsec.com on behalf of Konold,
Martin
Sent: 19 November 2022 18:06:54
To: ceph-users@ceph.io
Subject: [ceph
of the documentation can I trust?
If it is implemented, I would like to get it working - if this is possible at
all. Would you still take a look at the data?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Patrick
cf884bd39c17e3236e0632ac146dc4)
octopus (stable)": 1070
}
}
I will collect the other output you ask for and send it to you privately.
Unless you state otherwise, I will attach a gz-file to an e-mail.
Thanks for your help!
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygni
gh, I also need to know what the
correct output should look like. I would be grateful if you could provide this
additional information.
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Patrick Donnelly
Sent:
.
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dan van der Ster
Sent: 18 November 2022 10:43:12
To: Frank Schilder
Cc: Igor Fedotov; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: LVM osds loose
ssible to reproduce a realistic ceph-osd IO pattern for
testing. Is there any tool available for this?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Frank Schilder
Sent: 14 November 2022 13:03:58
To: Igor Fedotov
be deeper in the directory
tree, which in turn should be pinned to a rank and not move. That's why I would
really like to know what directories are moved around.
Thanks and best regards!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From
dropcaches on client nodes after job completion, so there is potential for
reloading data)?
Thanks a lot!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Patrick Donnelly
Sent: 16 November 2022 22:50:22
To: Frank Schilder
Cc: ceph
misunderstanding the warning? What is happening here and why are these ops
there? Does this point to a config problem?
Thanks for any explanations!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
, then
effective-weight = crush-weight * reweight,
but it is clearly not implemented this way. Please take a look at the specific
re-mapping examples on a test cluster I posted with effective-weights=0.5*1 and
1*0.5.
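For reference, this is the reading of the documentation being tested here; the weight values are made up for illustration:

```shell
# If effective-weight = crush-weight * reweight held, a 3.64 OSD
# reweighted to 0.5 and a 1.82 OSD at reweight 1.0 should map identically.
crush_weight=3.64
reweight=0.50
effective=$(awk -v c="$crush_weight" -v r="$reweight" \
    'BEGIN { printf "%.2f", c * r }')
echo "effective weight: $effective"
```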
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109
when using reweight. And this should
not happen, it smells like a really bad bug.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Etienne Menguy
Sent: 15 November 2022 10:45:19
To: Frank Schilder; ceph-users@ceph.io
S
to the documentation, I would expect identical mappings in all 3
cases. Can someone help me out here?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Frank Schilder
Sent: 15 November 2022 10:09:10
To: ceph-users@ceph.io
Subject
mappings change if the relative weight of all OSDs to each other stays the same
(the probabilities of picking an OSD are unchanged over all OSDs)?
Thanks for any hints.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
for how many log-entries are created per second with these settings
for tuning log_max_recent?
Thanks for your help!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Frank Schilder
Sent: 11 November 2022 10:25:17
To: Igor Fedotov
2 are doing?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Igor Fedotov
Sent: 10 November 2022 15:48:23
To: Frank Schilder; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: LVM osds loose connection to disk
Hi
196 MiB 0 25 TiB
I do not even believe that stored is correct everywhere, the numbers are very
different in the other form of report. This is really irritating. I think you
should file a bug report.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
t, I would
like to avoid hunting ghosts.
Many thanks and best regards!
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Frank Schilder
Sent: 10 October 2022 23:33:32
To: Igor Fedotov; ceph-users@ceph.io
Subject: [ceph-users] Re: LVM
Hi Eugen,
I created https://tracker.ceph.com/issues/58002
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: 03 November 2022 11:41
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph
,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
to increase the PG count on some pools.
Apart from that, you should always use the full PG capacity that your cluster
can afford: it will not only speed up many things, it will also improve
resiliency and all-to-all recovery.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning
st regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Rainer Krienke
Sent: 07 November 2022 09:20:44
To: ceph-users@ceph.io
Subject: [ceph-users] How to manuall take down an osd
Hi,
today morning I had osd.77 in my ceph nautil
Yes, it will. The PG never had the last copy, which needs to be built for the
first time. Just wait for it to finish.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Nicola Mori
Sent: 03 November 2022 13:37:30
Hi Szabo,
it's a switch-local network shared with an HPC cluster with spine-leaf topology.
The storage nodes sit on leafs and the leafs all connect to the same spine.
Everything with duplicated hardware and LACP bonding.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109
Ah, no. Just set it to 250 as well. I think choose_total_tries is the overall
max; using set_choose_tries higher than choose_total_tries has no effect. In my
case, the bad mapping was already resolved with both=51, but your case looks a
bit more serious.
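A sketch of the usual procedure for raising the tunable, along the lines of the "CRUSH gives up too soon" troubleshooting docs; rule id 1, num-rep 4 and the x-range are placeholders for your pool's crush rule and replica/chunk count:

```shell
# Extract the crushmap, raise choose_total_tries, verify, then inject.
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --set-choose-total-tries 250 -o crushmap.new
# check that no bad mappings remain before injecting the new map
crushtool -i crushmap.new --test --show-bad-mappings \
    --rule 1 --num-rep 4 --min-x 1 --max-x 1024
ceph osd setcrushmap -i crushmap.new
```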
Best regards,
=
Frank
ect that it was internal
communication going bonkers.
Since the impact is quite high it would be nice to have a pointer as to what
might have happened.
Thanks and best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
/issues/57348 contains
examples of what the output looks like.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Nicola Mori
Sent: 03 November 2022 10:57:03
To: ceph-users
Subject: [ceph-users] Re: Missing OSD in up set
The default for choose_total_tries was 50 in my case and way too small. It will
get better once you have more host buckets to choose OSDs from.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Nicola Mori
Sent: 03
Hi Nicola,
might be
https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon
or https://tracker.ceph.com/issues/57348.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From
influence the average much.
I was always wondering how users ended up with more than 1000 PGs per OSD by
accident during recovery. It now makes more sense. If there is no per-OSD
warning, this can easily happen.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
uch as possible out of the "does this really work in all corner cases"
equation and rather rely on "I did this 100 times in the past without a
problem" situations.
That users may have to repeat a task is not a problem. Damaging the file system
itself, on the other hand, is.
Thanks an
18399 pgs
objects: 1.41G objects, 2.5 PiB
usage: 3.2 PiB used, 8.3 PiB / 12 PiB avail
pgs: 18378 active+clean
20 active+clean+scrubbing+deep
1 active+clean+scrubbing
Any idea what the problem could be?
Thanks and best regards.
=====
Fran
number of parameters to ensure that
the remaining sub-cluster continues to operate as normally as possible, for
example, handles OSD fails in the usual way despite 90% of OSDs being down
already.
Thanks for your input and best regards,
=====
Frank Schilder
AIT Risø Campus
who can point
me to an installation procedure?
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Zach Heise (SSCC)
Sent: 19 October 2022 21:25:14
To: Neeraj Pratap Singh; ceph-users@ceph.io
Subject: [ceph
o long. I need a fast (unclean yet
recoverable) procedure. Maybe data in flight gets lost, but the FS itself must
come up healthy again.
Any hints on how to do this? Also for the MON store log size problem?
Thanks and best regards,
=====
Frank Schilder
AIT Ri
Thanks for any hints/corrections/confirmations!
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
. It would be great if this message could be
improved in this way.
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
A disk may be failing without smartctl or other tools showing anything. Does it
have remapped sectors? I would just throw the disk out and get a new one.
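Checking for remapped sectors could look like this; /dev/sda is a placeholder for the suspect drive and the command needs root. As noted above, zero counters still do not prove the disk is healthy:

```shell
# Show the SMART attributes that indicate remapped or pending sectors.
smartctl -A /dev/sda | grep -Ei 'Reallocated_Sector_Ct|Current_Pending_Sector'
```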
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Michel
"OSD crashes during upgrade mimic->octopus").
The 300G OSDs on our test cluster worked fine.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Tyler Stachecki
Sent: 27 September 2022 02:00
To: Marc
Cc: F
wer objects misplaced after replacement.
It's more work, but also faster recovery.
If you continue to replace hosts and give them new host names, you should
remove the old ones. At some point these buckets might interfere with mappings
in unexpected ways.
Best regards,
=====
Frank Schil
are they there in the first place? Are
you planning to add hosts or are these replaced ones?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Matthew Darwin
Sent: 14 October 2022 18:57:37
To: c...@elchaka.de; ceph-users@ceph.io
Subject
tive+clean+scrubbing
1 active+clean+scrubbing+deep+inconsistent+repair
io:
client: 444 MiB/s rd, 446 MiB/s wr, 2.19k op/s rd, 2.34k op/s wr
recovery: 0 B/s, 223 objects/s
Yay!
Thanks and best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 10
RUB_ERRORS)
2022-10-11T19:26:24.246215+0200 mon.ceph-01 (mon.0) 633487 : cluster [ERR]
Health check failed: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
Thanks and best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
but, well, it might locate something.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Stefan Kooman
Sent: 13 October 2022 13:56:45
To: Yoann Moulin; Patrick Donnelly
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: MDS
ng
disables this warning. Recovery is the operation where exceeding a PG limit
without knowing will hurt most.
Thanks for the heads up. Probably need to watch my * a bit more with certain
things.
Best regards,
=====
Frank Schilder
AIT Risø Ca
problems with it.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Josh Baergen
Sent: 07 October 2022 17:16:49
To: Nicola Mori
Cc: ceph-users
Subject: [ceph-users] Re: Iinfinite backfill loop + number of pgp groups
such activity any more. The
issue tracker seems to have turned into a black hole. Do you know what the
reason might be?
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dan van der Ster
Sent: 11 October 2022 19
https://tracker.ceph.com/issues/45253
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Michael Thomas
Sent: 08 October 2022 16:40:37
To: ceph-users@ceph.io
Subject: [ceph-users] Invalid crush class
In 15.2.7, how can I remove
re pool. I'm done with getting the cluster
up again and these disks are now almost empty. The problem seems to be that
100G OSDs are just a bit too small for octopus.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Fro
it of disk life time. If I really need to
reduce the impact of recovery IO I can set recovery_sleep.
My personal opinion to the user group.
Thanks for your help and have a nice evening!
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
ope this makes some sense when interpreting the logs.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Igor Fedotov
Sent: 09 October 2022 22:07:16
To: Frank Schilder; ceph-users@ceph.io
Subject: Re: [ceph-users] LVM osds loose
, we disabled autoscaler on all pools and also globally. Still,
it interferes with admin commands in an unsolicited way. I would like the PG
merge to happen on the fly as the data moves to the new OSDs.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
/home/x/y/z.
Thanks and good Sunday.
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Milind Changire
Sent: 09 October 2022 09:24:20
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] How to check which directory has
ecognised as down. Any hints on what to check if it happens again are also
welcome.
Many thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
What is the right way to confirm it's working?
Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
will do a deep-scrub and report back.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dan van der Ster
Sent: 08 October 2022 11:18:37
To: Frank Schilder
Cc: Ceph Users
Subject: Re: [ceph-users] recurring stat
,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Dan van der Ster
Sent: 08 October 2022 11:03:05
To: Frank Schilder
Cc: Ceph Users
Subject: Re: [ceph-users] recurring stat mismatch on PG
Hi,
Is that 15.2.17? It reminds me of this bug -
https
log_channel(cluster) log [ERR] :
19.1fff deep-scrub 1 errors
This exact same mismatch was found before and I executed a pg-repair that fixed
it. Now it's back. Does anyone have an idea why this might be happening and how
to deal with it?
Thanks!
=
Frank Schilder
AIT Risø Campus
Bygning
-bluestore-tool/. I guess I will
stick with the tested command "repair". Nothing I found mentions what exactly
is executed on start-up with bluestore_fsck_quick_fix_on_mount = true.
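Sticking with the tested "repair" command could look like the sketch below; osd id 12 is a placeholder, and the OSD must be stopped while the tool runs:

```shell
# Run the full (slower, but tested) repair against a stopped OSD instead
# of relying on bluestore_fsck_quick_fix_on_mount.
systemctl stop ceph-osd@12
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-12
systemctl start ceph-osd@12
```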
Thanks for your quick answer!
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 10
n off-line conversion is not mentioned. I know it has been
posted before, but I seem unable to find it on this list. If someone could send
me the command, I would be most grateful.
Thanks and best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 10
Hi Igor,
sorry for the extra e-mail. I forgot to ask: I'm interested in a tool to
de-fragment the OSD. It doesn't look like the fsck command does that. Is there
any such tool?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
help at this late hour!
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Igor Fedotov
Sent: 07 October 2022 00:37:34
To: Frank Schilder; ceph-users@ceph.io
Cc: Stefan Kooman
Subject: Re: [ceph-users] OSD crashes during
d to lose more. The
rebuild simply takes too long in the current situation.
Thanks for your help and best regards,
=====
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Igor Fedotov
Sent: 06 October 2022 17:03:53
To: Frank Schilder;
s includes FS maintenance,
shut down and startup. Ceph fs clients should not crash on "ceph fs set XYZ
down true", they should freeze. Etc.
It's just the omap conversion that was postponed to post-upgrade as explained in
[1], nothing else.
Best regards,
=
Frank Schilder
AI
worked great ... until the unconverted
OSDs started crashing. Things have been stable for about an hour now. I really hope
nothing more crashes. Recovery will likely take more than 24 hours. A long way
to go in such a fragile situation.
Best regards,
=====
Frank Schilder
AIT Risø Campus
Bygn