) in the default
subvolume group.
So, in your case the actual path to the subvolume would be
/mnt/volumes/_nogroup/subvol2/
On Tue, Aug 22, 2023 at 4:50 PM Eugen Block wrote:
Hi,
while writing a response to [1] I tried to convert an existing
directory within a single cephfs into a subvolume
will recognize it (no extended attribute needed); if you use a subvolumegroup
name other than "_nogroup", you must provide it in all subvolume
commands [--group_name <group_name>]
regards,
Anh Phan
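As a rough illustration (the fs name "cephfs", the group "mygroup" and the
subvolume "subvol2" are just placeholders), the commands would look like:

ceph fs subvolume ls cephfs --group_name mygroup
ceph fs subvolume getpath cephfs subvol2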
On Wed, Aug 23, 2023 at 6:51 PM Eugen Block wrote:
Hi,
I started a new thread [2] to not hijack yours.
to a subvolume, but it also didn't
appear in the list of set subvolumes. Perhaps it's no longer
supported?
Michal
On 8/22/23 12:56, Eugen Block wrote:
Hi,
I don't know if there's a way to change the path (I assume not
except creating a new path and copy the data), but you could set up
debugging revealed that something was off.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: Wednesday, August 23, 2023 8:55 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Client failing
Hi,
pointing you to your own thread [1] ;-)
[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/HFILR5NMUCEZH7TJSGSACPI4P23XTULI/
Quoting Frank Schilder:
Hi all,
I have this warning the whole day already (octopus latest cluster):
HEALTH_WARN 4 clients failing to
Hi,
while writing a response to [1] I tried to convert an existing
directory within a single cephfs into a subvolume. According to [2]
that should be possible, I'm just wondering how to confirm that it
actually worked. Because setting the xattr works fine, the directory
just doesn't show
Hi,
I don't know if there's a way to change the path (I assume not except
creating a new path and copy the data), but you could set up a
directory the "old school" way (mount the root filesystem, create your
subdirectory tree) and then convert the directory into a subvolume by
setting
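As a minimal sketch of that last step (assuming the CephFS root is mounted at
/mnt/cephfs and the directory was created under /mnt/cephfs/volumes/_nogroup/):

setfattr -n ceph.dir.subvolume -v 1 /mnt/cephfs/volumes/_nogroup/subvol2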
the compute nodes.
On Tue, Aug 22, 2023 at 09:17, Eugen Block wrote:
You'll need to update the mon_host line as well. Not sure if it makes
sense to have both old and new network in there, but I'd try on one
host first and see if it works.
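Just to illustrate what I mean (the addresses are made up, old and new
networks side by side in ceph.conf):

mon_host = 192.168.1.11,192.168.1.12,192.168.1.13,10.0.0.11,10.0.0.12,10.0.0.13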
Zitat von Boris Behrens :
> We're work
Hi,
can you add 'ceph -s' output? Has the recovery finished and if not, do
you see progress? Has the upgrade finished? You could try a 'ceph mgr
fail'.
Quoting Alfredo Daniel Rezinovsky:
I had a lot of movement in my cluster: broken node, replacement, rebalancing.
Now I'm stuck in
basic osd_mclock_max_capacity_iops_ssd
15333.697366
On Mon, Aug 21, 2023 at 14:20, Eugen Block wrote:
Hi,
> I don't have those configs. The cluster is not maintained via cephadm /
> orchestrator.
I just assumed that with Quincy it already would be managed by
cephadm. S
Eugen Block :
Hi,
there have been a couple of threads wrt network change, simply
restarting OSDs is not sufficient. I still haven't had to do it
myself, but did you 'ceph orch reconfig osd' after adding the second
public network, then restart them? I'm not sure if the orchestrator
works
Hi,
there have been a couple of threads wrt network change, simply
restarting OSDs is not sufficient. I still haven't had to do it
myself, but did you 'ceph orch reconfig osd' after adding the second
public network, then restart them? I'm not sure if the orchestrator
works as expected
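For illustration only (the subnets are made up), the sequence would be
something like:

ceph config set global public_network "192.168.1.0/24,10.0.0.0/24"
ceph orch reconfig osd
ceph orch restart osd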
Hi,
I tried to find an older thread that explained this quite well, maybe
my google-fu left me...
Anyway, the docs [1] explain the "degraded" state of a PG:
When a client writes an object to the primary OSD, the primary OSD
is responsible for writing the replicas to the replica OSDs.
Yeah, that's basically it, also taking into account Anthony's
response, of course.
Quoting Nicola Mori:
Thanks Eugen for the explanation. To summarize what I understood:
- delete from GUI simply does a drain+destroy;
- destroy will preserve the OSD id so that it will be used by the
next
Hi,
your subject is "...two monitors per host" but I guess you're
asking for MDS daemons per host. ;-) What's the output of 'ceph orch
ls mds --export'? You're using 3 active MDS daemons, maybe you set
"count_per_host: 2" to have enough standby daemons? I don't think an
upgrade would
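A hedged sketch of what such an export could look like (service_id and label
are assumptions):

service_type: mds
service_id: cephfs
placement:
  label: mds
  count_per_host: 2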
!
Eugen
Quoting Robert Sander:
On 8/16/23 12:10, Eugen Block wrote:
I don't really have a good idea right now, but there was a thread [1]
about ssh sessions that are not removed, maybe that could have such an
impact? And if you crank up the debug level to 30, do you see anything
else
I don't really have a good idea right now, but there was a thread [1]
about ssh sessions that are not removed, maybe that could have such an
impact? And if you crank up the debug level to 30, do you see anything
else?
ceph config set mgr debug_mgr 30
[1]
That would have been my suggestion as well, set your own container
image and override the default. Just one comment, the config option is
"container_image" and not "container", that one fails:
$ ceph config set global container my-registry:5000/ceph/ceph:16.2.9
Error EINVAL: unrecognized
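whereas this variant should be accepted:
$ ceph config set global container_image my-registry:5000/ceph/ceph:16.2.9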
Hi,
literally minutes before your email popped up in my inbox I had
announced that I would upgrade our cluster from 16.2.10 to 16.2.13
tomorrow. Now I'm hesitating. ;-)
I guess I would start looking on the nodes where it failed to upgrade
OSDs and check out the cephadm.log as well as
Hi,
after you deployed the RGW service, have all the pools been created
(automatically)? Can you share the output of:
ceph -s
ceph osd pool ls
overlays: unrecognized mount option "volatile" or missing value
I don't think that's the issue here.
Quoting nguyenvand...@baoviet.com.vn:
Hi,
just a thought: Maybe that message is just telling you that the
previous session has been blocklisted during the client reboot. MDS
clients are frequently requested to free up their caps etc.; if they
don't do that within the defined interval (I don't know it by heart), the
client session
Hi,
I can't seem to find the threads I was looking for, this has been
discussed before. Anyway, IIRC it could be a MGR issue which fails to
update the stats. Maybe a MGR failover clears things up? If that
doesn't help I would try a compaction on one OSD and see if the stats
are corrected
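For example (osd.12 is just a placeholder):

ceph mgr fail
ceph tell osd.12 compact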
Hi,
after you added the labels to the MONs, did the orchestrator
(re)deploy MONs on the dedicated MON hosts? Are there now 5 MONs
running? If the orchestrator didn't clean that up by itself (it can
take up to 15 minutes, I believe) you can help it by removing a daemon
manually [1]:
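Something along these lines (the daemon name is an assumption, adjust to what
'ceph orch ps' shows):

ceph orch daemon rm mon.host6 --force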
Hi,
if you deploy OSDs from scratch you don't have to create LVs manually,
that is handled entirely by ceph-volume (for example on cephadm based
clusters you only provide a drivegroup definition). I'm not sure if
automating db/wal migration has been considered, it might be (too)
subtree pinning. So we want to know if there is any config we can tune
for dynamic subtree pinning. Thanks again!
Thanks,
xz
On Aug 9, 2023, at 17:40, Eugen Block wrote:
Hi,
you could benefit from directory pinning [1] or dynamic subtree
pinning [2]. We had great results with manual pinning in an older
Hi,
I'll try to summarize as far as I understand the process, please
correct me if I'm wrong.
- delete: drain and then delete (optionally keep OSD ID)
- destroy: mark as destroyed (to re-use OSD ID)
- purge: remove everything
I would call the "delete" option in the dashboard as a "safe
Hi,
you could benefit from directory pinning [1] or dynamic subtree
pinning [2]. We had great results with manual pinning in an older
Nautilus cluster, didn't have a chance to test the dynamic subtree
pinning yet though. It's difficult to tell in advance which option
would suit best your
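For reference, manual pinning and distributed ephemeral pinning are both set
via xattrs on a directory (the paths are just examples):

setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projects          # pin this subtree to MDS rank 0
setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home  # distribute immediate children across ranks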
Hi,
just last week there was a thread [1] about a large omap warning for a
single user with 400k buckets. There's no resharding for that (but
with 64k you would stay under the default 200k threshold), so that's
the downside, I guess. I can't tell what other impacts that may have.
I'm no programmer but if I understand [1] correctly it's an unsigned
long long:
int ImageCtx::snap_set(uint64_t in_snap_id) {
which means the max snap_id should be this:
2^64 = 18446744073709551616
Not sure if you can get your cluster to reach that limit, but I also
don't know what
Turn off the autoscaler and increase pg_num to 512 or so (power of 2).
The recommendation is to have between 100 and 150 PGs per OSD (incl.
replicas). And then let the balancer handle the rest. What is the
current balancer status (ceph balancer status)?
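For example (the pool name is a placeholder):

ceph osd pool set <pool> pg_autoscale_mode off
ceph osd pool set <pool> pg_num 512
ceph balancer status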
Quoting Spiros Papageorgiou:
Hi
Check out the ownership of the newly created DB device, according to
your output it belongs to the root user. In the osd.log you probably
should see something related to "permission denied". If you change it
to ceph:ceph the OSD might start properly.
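Roughly like this, assuming a non-cephadm OSD with id 5 (adjust id and paths):

chown -h ceph:ceph /var/lib/ceph/osd/ceph-5/block.db
chown ceph:ceph $(readlink -f /var/lib/ceph/osd/ceph-5/block.db)
systemctl restart ceph-osd@5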
Quoting Roland Giesler:
Ouch, I
Can you query those config options yourself?
storage01:~ # ceph config get mgr mgr/dashboard/standby_behaviour
storage01:~ # ceph config get mgr mgr/dashboard/AUDIT_API_ENABLED
I'm not sure if those are responsible for the crash though.
Zitat von "Adiga, Anantha" :
Hi,
Mgr service crash
It's all covered in the docs [1], one of the points I already
mentioned (require-osd-release), you should have bluestore OSDs and
converted them to ceph-volume before you can adopt them with cephadm
(if you deployed your cluster pre-nautilus).
[1]
:
spec:
  data_devices:
    paths:
    - /dev/sdh
    - /dev/sdi
    - /dev/sdj
    - /dev/sdk
    - /dev/sdl
  db_devices:
    paths:
    - /dev/sdf
  filter_logic: AND
  objectstore: bluestore
From: Eugen Block
Sent: Wednesday, August 2, 2023 08:13
Do you really need device paths in your configuration? You could use
other criteria like disk sizes, vendors, rotational flag etc. If you
really want device paths you'll probably need to ensure they're
persistent across reboots via udev rules.
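A sketch of a spec based on criteria instead of device paths (service_id,
placement and the rotational assumption are made up for this example):

service_type: osd
service_id: hdd_with_flash_db
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  filter_logic: AND
  objectstore: bluestore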
Quoting Kilian Ries:
Hi,
it seems that
c. But, I
think one of our guys mentioned that the cleanup might not be getting
rid of buckets, only the files in them. So, I may have to get our
dev guys to revisit this and see if we can clean up a crapload of
empty buckets.
On Tue, 2023-08-01 at 08:37 +0000, Eugen Block wrote:
Th
Hi,
from Ceph perspective it's supported to upgrade from N to P, you can
safely skip O. We have done that on several clusters without any
issues. You just need to make sure that your upgrade to N was
complete. Just a few days ago someone tried to upgrade from O to Q
with
active+clean  13h  8092'56868  8093:4813791  [26,30,13]p26  [26,30,13]p26  2023-07-31T17:50:40.349450+  2023-07-31T17:50:40.349450+
311 periodic scrub scheduled @ 2023-08-02T04:39:41.913504+
On Tue, 2023-08-01 at 06:14 +, Eugen Block wrote:
Yeah, regarding data
You could add (debug) logs for starters ;-)
There was a thread [1] describing something quite similar, pointing to
this bug report [2]. In recent versions it's supposed to be fixed
although I don't see the tracker or PR number in the release notes of
both pacific and quincy. Can you verify
hdd  7.27739  1.0  7.3 TiB  1.1 TiB  1.1 TiB  1.1 GiB  8.4 GiB  6.2 TiB  14.99  0.94  19  up
TOTAL  291 TiB  47 TiB  46 TiB  51 GiB  359 GiB  244 TiB  16.02
MIN/MAX VAR: 0.52/1.77  STDDEV: 4.56
On Mon, 2023-07-31 at 09:22 +, Eugen Block wrote:
Hi,
can you
Hi,
can you share some more details like 'ceph df' and 'ceph osd df'? I
don't have too much advice yet, but to see all entries in your meta
pool you need add the --all flag because those objects are stored in
namespaces:
rados -p default.rgw.meta ls --all
That pool contains user and
omments.
Thanks,
Eugen
Quoting Josh Baergen:
Out of curiosity, what is your require_osd_release set to? (ceph osd
dump | grep require_osd_release)
Josh
On Tue, Jul 11, 2023 at 5:11 AM Eugen Block wrote:
I'm not so sure anymore if that could really help here. The dump-keys
output from
Can you paste 'ceph versions' output please? You state that you
upgraded from octopus --> quincy but your require-osd-release is
nautilus. Did you change that to octopus after the previous upgrade?
It's not supported to skip more than one release (N --> P, O --> Q, but
not N --> Q). Maybe it
I think I see something similar on a Pacific cluster, the alertmanager
doesn't seem to be aware of a mgr failover. One of the active alerts
is CephMgrPrometheusModuleInactive stating:
The mgr/prometheus module at storage04.fqdn:9283 is unreachable.
...
Which is true because the active mgr
Hi,
what exactly is your question? You seem to have made progress in
bringing OSDs back up and reducing inactive PGs. What is unexpected to
me is that one host failure would cause inactive PGs. Can you share
more details about your osd tree and crush rules of the affected
inactive PGs?
Hi,
apparently, my previous suggestions don't apply here (full OSDs or
max_pgs_per_osd limit). Did you also check the rgw client keyrings?
Did you also upgrade the operating system? Maybe some apparmor stuff?
Can you set debug to 30 to see if there're more to see? Anything in
the mon or
I can provide some more details, these were the recovery steps taken
so far, they started from here (I don't know the whole/exact story
though):
70/868386704 objects unfound (0.000%)
Reduced data availability: 8 pgs inactive, 8 pgs incomplete
Possible data damage: 1 pg recovery_unfound
the cluster status? Is there recovery or backfilling
going on?
No. Everything is good except this PG is not getting scrubbed.
Vlad
On 7/21/23 01:41, Eugen Block wrote:
Hi,
what's the cluster status? Is there recovery or backfilling going on?
Quoting Vladimir Brik:
I have a PG that hasn't
Hi,
what's the cluster status? Is there recovery or backfilling going on?
Quoting Vladimir Brik:
I have a PG that hasn't been scrubbed in over a month and not
deep-scrubbed in over two months.
I tried forcing with `ceph pg (deep-)scrub` but with no success.
Looking at the logs of that
Hi,
a couple of threads with similar error messages all lead back to some
sort of pool or osd issue. What is your current cluster status (ceph
-s)? Do you have some full OSDs? Those can cause this initialization
timeout as well as hit the max_pg_per_osd limit. So a few more cluster
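To check those two suspects, something like this should do (commands only,
nothing cluster-specific assumed):

ceph -s
ceph health detail
ceph osd df    # any OSDs near-full or full?
ceph config get mon mon_max_pg_per_osd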
Hi,
during cluster upgrades from L to N or later one had to rebuild OSDs
which were originally deployed by ceph-disk switching to ceph-volume.
We've done this on multiple clusters and redeployed one node by one.
We did not drain the nodes beforehand because the EC resiliency
configuration
? How would that work in a real cluster with multiple
MONs? If I stop the first, clean up the mon db, then start it again,
wouldn't it sync the keys from its peers? Not sure how that would
work...
Quoting Eugen Block:
It was installed with Octopus and hasn't been upgraded yet
It was installed with Octopus and hasn't been upgraded yet:
"require_osd_release": "octopus",
Quoting Josh Baergen:
Out of curiosity, what is your require_osd_release set to? (ceph osd
dump | grep require_osd_release)
Josh
On Tue, Jul 11, 2023 at 5:11 A
ect yet...
Quoting Dan van der Ster:
Oh yes, sounds like purging the rbd trash will be the real fix here!
Good luck!
__
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com
On Mon, Jul 10, 2023 at 6:10 AM Eugen Block wrote:
resharding operation on bucket index detected,
blocking
Quoting Eugen Block:
We had a quite small window yesterday to debug, I found the error
messages but we didn't collect the logs yet, I will ask them to do
that on Monday. I *think* the error was something like this:
reshardin
of snapshot
tombstones (rbd mirroring snapshots in the trash namespace), maybe
that will reduce the osd_snap keys in the mon db, which then would
increase the startup time. We'll see...
Quoting Eugen Block:
Thanks, Dan!
Yes that sounds familiar from the luminous and mimic days
dedicated WAL device, but I have only
/dev/nvme0n1 , so I cannot write a correct YAML file...
On Mon, Jul 10, 2023 at 09:12:29 CEST, Eugen Block wrote:
Yes, because you did *not* specify a dedicated WAL device. This is also
reflected in the OSD metadata:
$ ceph osd metadata 6 | grep dedicated
osdspec affinity osd_spec_default
type block
vdo 0
devices /dev/sdi
(part of listing...)
Sincerely
Jan Marek
On Mon, Jul 10, 2023 at 08:10:58 CEST, Eugen Block wrote:
Hi,
if you don't specify a different devi
Hi,
if you don't specify a different device for WAL it will be
automatically colocated on the same device as the DB. So you're good
with this configuration.
Regards,
Eugen
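For completeness, a sketch of a spec for that layout (12 HDDs plus one NVMe
for DB/WAL; service_id and placement are assumptions):

service_type: osd
service_id: hdd_with_nvme_db
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    paths:
    - /dev/nvme0n1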
Quoting Jan Marek:
Hello,
I've tried to add to CEPH cluster OSD node with a 12 rotational
disks and 1 NVMe. My
which error code was returned to the client there? it
should be a retryable error, and many http clients have retry logic to
prevent these errors from reaching the application
On Fri, Jul 7, 2023 at 6:35 AM Eugen Block wrote:
Hi *,
last week I successfully upgraded a customer cluster from
e Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---
On 2023. Jul 7., at 17:49, Eugen Block wrote:
Email received from the internet. If in doubt, don't click any link
nor ope
On 2023. Jul 7., at 17:35, Eugen Block wrote:
Email received from the internet. If in doubt, don't click any link
nor open any attachment !
Hi *,
last week I successfully upgraded a customer cluster from Nautilus to
Pacific, no real issues, their main use
Hi *,
last week I successfully upgraded a customer cluster from Nautilus to
Pacific, no real issues, their main use is RGW. A couple of hours
after most of the OSDs were upgraded (the RGWs were not yet) their
application software reported an error, it couldn't write to a bucket.
This
and reduce tens of percent of the total size.
This may be just another SST file creation, 1 GB by default, if I
remember it right.
Did you look at Grafana, at this HDD's utilization and IOPS?
k
Sent from my iPhone
On 7 Jul 2023, at 10:54, Eugen Block wrote:
Can you share some more details what
to the payload size or keys option, but a
timing option.
Quoting Eugen Block:
Thanks, Dan!
Yes that sounds familiar from the luminous and mimic days.
The workaround for zillions of snapshot keys at that time was to use:
ceph config set mon mon_sync_max_payload_size 4096
I actually did search
ble to understand what is taking so long, and tune
mon_sync_max_payload_size and mon_sync_max_payload_keys accordingly.
Good luck!
Dan
__
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com
On Thu, Jul 6, 2023 at 1:47 PM Eugen
Hi *,
I'm investigating an interesting issue on two customer clusters (used
for mirroring) I've not solved yet, but today we finally made some
progress. Maybe someone has an idea where to look next, I'd appreciate
any hints or comments.
These are two (latest) Octopus clusters, main usage
. Cheers,
Michel
On 19/06/2023 at 14:09, Eugen Block wrote:
Hi, I have a real hardware cluster for testing available now. I'm
not sure whether I'm completely misunderstanding how it's supposed
to work or if it's a bug in the LRC plugin.
This cluster has 18 HDD nodes available across 3 rooms
on were up to date.
I do not know why the osd config files did not get refreshed however
I guess something went wrong draining the nodes we removed from the
cluster.
Best regards,
Malte
On 21.06.23 at 22:11, Eugen Block wrote:
I still can’t really grasp what might have happened here
Hi,
without knowing the details I just assume that it’s just „translated“,
the syntax you set is the older way of setting rbd caps, since a
couple of years it’s sufficient to use „profile rbd“. Do you notice
client access issues (which I would not expect) or are you just
curious about the
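For comparison, the newer profile-based caps would be set roughly like this
(client name and pool are placeholders):

ceph auth caps client.myclient mon 'profile rbd' osd 'profile rbd pool=rbd'
ceph auth get client.myclient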
Hi,
have you tried restarting the primary OSD (currently 343)? It looks
like this PG is part of an EC pool, are there enough hosts available,
assuming your failure-domain is host? I assume that ceph isn't able to
recreate the shard on a different OSD. You could share your osd tree
and
uck undersized for 13h, current state
undersized+remapped+peered, last acting [236]
pg 10.c is stuck undersized for 13h, current state
active+undersized+remapped, last acting [237,236]
Best,
Malte
On 21.06.23 at 10:31, Eugen Block wrote:
Hi,
Yes, we drained the nodes. It needed two we
Hi,
Will that try to be smart and just restart a few at a time to keep things
up and available. Or will it just trigger a restart everywhere
simultaneously.
basically, that's what happens for example during an upgrade if
services are restarted. It's designed to be a rolling upgrade
ash[2323668]: debug
2023-06-21T08:11:04.174+ 7fabef5a1200 0 monclient(hunting):
authenticate timed out after 300
Same messages on all OSDs.
We still have some nodes running and did not restart those OSDs.
Best,
Malte
On 21.06.23 at 09:50, Eugen Block wrote:
Hi,
can you share more deta
Hi,
can you share more details what exactly you did? How did you remove
the nodes? Hopefully, you waited for the draining to finish? But if
the remaining OSDs wait for removed OSDs it sounds like the draining
was not finished.
Quoting Malte Stroem:
Hello,
we removed some nodes from
You should report this in the openstack-discuss
mailing list or create a bug report on launchpad. If you want I can do
that as well.
I will do some more testing to have more details.
Thanks,
Eugen
Quoting Eugen Block:
Hi,
I don't quite understand the issue yet, maybe you can clarify.
If
this (very high volume) list... Or may somebody pass the
email thread to one of them?
Help would be really appreciated. Cheers,
Michel
On 19/06/2023 at 14:09, Eugen Block wrote:
Hi, I have a real hardware cluster for testing available now. I'm
not sure whether I'm completely misunderstanding how
Hi,
so grafana is starting successfully now? What did you change?
Regarding the container images, yes there are defaults in cephadm
which can be overridden with ceph config. Can you share this output?
ceph config dump | grep container_image
I tend to always use a specific image as
that help me to understand the
problem I remain interested. I propose to keep this thread for that.
Zitat, I shared my crush map in the email you answered if the
attachment was not suppressed by mailman.
Cheers,
Michel
Sent from my mobile
On May 18, 2023 at 11:19:35, Eugen Block wrote:
H
Hi,
I don't quite understand the issue yet, maybe you can clarify.
If I perform a "change volume type" from OpenStack on volumes
attached to the VMs the system successfully migrates the volume from
the source pool to the destination pool and at the end of the
process the volume is visible
Hi,
I don't think this is going to work. Each OSD belongs to a specific
host and you can't have multiple buckets (e.g. bucket type "host")
with the same name in the crush tree. But if I understand your
requirement correctly, there should be no need to do it this way. If
you structure your
Hi,
did you check the MON logs? They should contain some information about
the reason why the OSD is marked down and out. You could also just try
to mark it in yourself, does it change anything?
$ ceph osd in 34
I would also take another look into the OSD logs:
cephadm logs --name osd.34
Hi,
can you check for snapshots in the trash namespace?
# rbd snap ls --all <pool>/<image>
Instead of removing the feature try to remove the snapshot from trash
(if there are any).
Zitat von Adam Boyhan :
I have a small cluster on Pacific with roughly 600 RBD images. Out
of those 600 images I
Sure: https://docs.ceph.com/en/latest/rados/operations/balancer/#throttling
Quoting Louis Koo:
ok, I will try it. Could you show me the archive doc?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to
Hi,
I wonder if a redeploy of the crash service would fix that, did you try that?
Quoting Zakhar Kirpichenko:
I've opened a bug report https://tracker.ceph.com/issues/61589, which
unfortunately received no attention.
I fixed the issue by manually setting directory ownership
for
daemons?
I can definitely try. However, I tried to lower the max number of mds.
Unfortunately, one of the MDSs seem to be stuck in "stopping" state for
more than 12 hours now.
Best,
Emmanuel
On Wed, May 24, 2023 at 4:34 PM Eugen Block wrote:
Hi,
using standby-replay daemons is somethi
Hi,
can you paste the following output?
# ceph config-key list | grep grafana
Do you have a mgr/cephadm/grafana_key set? I would check the contents
of crt and key and see if they match. A workaround to test the
certificate and key pair would be to use a per-host config [1]. Maybe
it's
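One way to compare them (assuming the cert/key are stored under the usual
mgr/cephadm keys) is to dump them and check that the moduli match:

ceph config-key get mgr/cephadm/grafana_crt > /tmp/grafana.crt
ceph config-key get mgr/cephadm/grafana_key > /tmp/grafana.key
openssl x509 -noout -modulus -in /tmp/grafana.crt | openssl md5
openssl rsa -noout -modulus -in /tmp/grafana.key | openssl md5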
I suspect the target_max_misplaced_ratio (default 0.05). You could try
setting it to 1 and see if it helps. This has been discussed multiple
times on this list, check out the archives for more details.
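For example, to relax it temporarily and revert afterwards:

ceph config set mgr target_max_misplaced_ratio 1
# later, back to the default:
ceph config rm mgr target_max_misplaced_ratio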
Quoting Louis Koo:
Thanks for your responses, I want to know why it spends so much time to
Hi,
it's not really useful to create multiple threads for the same
question. I wrote up some examples [1] which worked for me to
integrate keystone and radosgw.
From the debug logs below, it appears that radosgw is still trying
to authenticate with Swift instead of Keystone.
Any pointers
Hi,
the short answer is yes, but without knowing anything about the
cluster or what happened exactly it's a wild guess.
In general, you can use the ceph-objectstore-tool [1] to export a PG
(one replica or chunk) from an OSD and import it to a different OSD. I
have to add, I never had to do
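A rough sketch of the export/import (OSD ids and pgid are made up; both OSDs
must be stopped while running the tool):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --pgid 2.1a --op export --file /tmp/pg.2.1a
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-23 --op import --file /tmp/pg.2.1a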
Try on the mentioned host if there is a daemon with:
cephadm ls | grep apcepfpspsp0111
If there is one you can remove it with cephadm rm-daemon …
Sometimes a MGR failover clears up that message:
ceph mgr fail
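For example (the daemon name and fsid are placeholders, check 'cephadm ls' for
the real ones):

cephadm rm-daemon --name mgr.apcepfpspsp0111.xyzabc --fsid <fsid> --force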
Quoting farhad kh:
hi everyone
i have a warning ` 1 stray daemon(s) not
Hi,
using standby-replay daemons is something to test as it can have a
negative impact, it really depends on the actual workload. We stopped
using standby-replay in all clusters we (help) maintain, in one
specific case with many active MDSs and a high load the failover time
decreased and
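If you want to try that, disabling standby-replay is a single command (the fs
name is a placeholder):

ceph fs set cephfs allow_standby_replay false
ceph fs get cephfs | grep -E 'max_mds|standby'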
Hi,
there was a thread [1] just a few weeks ago. Which mgr modules are
enabled in your case? Also the mgr caps seem to be relevant here.
[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/BKP6EVZZHJMYG54ZW64YABYV6RLPZNQO/
Quoting Tobias Hachmer:
Hello list,
we have
Hi,
there was a change introduced [1] for cephadm to use dashes for
container names instead of dots. That still seems to be an issue
somehow, in your case cephadm is complaining about the missing
directory:
Hi,
OSDs don't just communicate with each other but especially with MONs,
too. They also check the OSD status (for example OSDs are marked out
after 10 minutes if the MONs haven't heard from the OSDs during the
mon_osd_down_out_interval), so your /etc/hosts should definitely
contain the
Hi,
the config options you mention should work, but not in the ceph.conf.
You should set it via ‚ceph config set …‘ and then restart the daemons
(ceph orch daemon restart osd).
Quoting Renata Callado Borges:
Dear all,
How are you?
I have a Pacific 3 nodes cluster, and the machines
inline).
If somebody on the list has some clue on the LRC plugin, I'm still
interested by understand what I'm doing wrong!
Cheers,
Michel
On 04/05/2023 at 15:07, Eugen Block wrote:
> Hi,
>
> I don't think you've shared your osd tree yet, could you do that?
> Apparently nobody else but us
Hi,
I would recommend adding the --image option to the bootstrap command so
it will only try to pull it from the local registry. If you also
provide the --skip-monitoring-stack option it will ignore Prometheus
etc. for the initial bootstrap. After your cluster has been deployed
you can set
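A hedged example of such a bootstrap call (registry, tag and MON IP are made
up):

cephadm --image my-registry:5000/ceph/ceph:17.2.6 bootstrap --mon-ip 192.168.1.10 --skip-monitoring-stack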
channel I also got a response; there's a theory that trash snapshots
appeared during mon re-election and vanished after upgrading to Quincy.
I'll recommend deleting the trash snapshots manually, then maybe
increasing the snaptrim config.
Quoting Stefan Kooman:
On 5/16/23 09:47, Eugen
Good morning,
I would be grateful if anybody could shed some light on this, I can't
reproduce it in my lab clusters so I was hoping for the community.
A customer has 2 clusters with rbd mirroring (snapshots) enabled, it
seems to work fine, they have regular checks and the images on the