.@ibm.com | lflo...@redhat.com
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
o MDS is active for it.
[1]:
https://docs.ceph.com/en/reef/cephfs/disaster-recovery-experts/#using-an-alternate-metadata-pool-for-recovery
Best regards,
Odair M. Ditkun Jr
Does Quincy automatically switch existing things to 4k, or do you need to do a
new OSD to get the 4k size?
Thanks,
Kevin
From: Igor Fedotov
Sent: Wednesday, June 21, 2023 5:56 AM
To: Carsten Grommel; ceph-users@ceph.io
Subject: [ceph-users] Re: Ceph
-30T18:35:22.826+ 7fe190013700 0
bluestore(/var/lib/ceph/osd/ceph-183) probe -20: 0, 0, 0
Thanks,
Kevin
From: Fox, Kevin M
Sent: Thursday, May 25, 2023 9:36 AM
To: Igor Fedotov; Hector Martin; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: BlueStore
if that is the right query, then I'll gather the metrics, restart and gather
some more after and let you know.
Thanks,
Kevin
From: Igor Fedotov
Sent: Thursday, May 25, 2023 9:29 AM
To: Fox, Kevin M; Hector Martin; ceph-users@ceph.io
Subject: Re: [ceph-
If you can give me instructions on what you want me to gather before the
restart and after restart I can do it. I have some running away right now.
Thanks,
Kevin
From: Igor Fedotov
Sent: Thursday, May 25, 2023 9:17 AM
To: Fox, Kevin M; Hector Martin
Is this related to https://tracker.ceph.com/issues/58022 ?
We still see runaway OSDs at times, somewhat randomly, which causes runaway
fragmentation issues.
Thanks,
Kevin
From: Igor Fedotov
Sent: Thursday, May 25, 2023 8:29 AM
To: Hector Martin;
frozen.
--
Odair M. Ditkun Jr
Master's in Networks and Distributed Systems — State University of Paraná
— Brazil
> have you tried a mgr failover?
>
> Zitat von Duncan M Tooke :
>
> > Hi,
> >
> > Our Ceph cluster is in an error state with the message:
> >
> > # ceph status
> > cluster:
> > id: 58140ed2-4ed4-11ed-b4db-5c6f69756a60
> >
Hi,
Our Ceph cluster is in an error state with the message:
# ceph status
cluster:
id: 58140ed2-4ed4-11ed-b4db-5c6f69756a60
health: HEALTH_ERR
Module 'cephadm' has failed: invalid literal for int() with base
10: '352.broken'
This happened after trying to re-add an OSD
I've seen this twice in production, on two separate occasions, as well. One OSD
gets stuck, and a bunch of PGs go into a laggy state.
ceph pg dump | grep laggy
shows that all the laggy PGs share the same OSD.
Restarting the affected OSD restored full service.
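A quick way to tally that, as a sketch: if you save the `ceph pg dump | grep laggy` lines, you can count how often each OSD appears in the acting sets. The sample lines and field positions below are illustrative, not real dump output:

```python
from collections import Counter

# Illustrative lines in the rough shape of `ceph pg dump | grep laggy`
# output: pgid, state, acting set (real dumps have more columns).
sample = [
    "2.1a  active+clean+laggy  [4,12,7]",
    "2.3f  active+clean+laggy  [4,9,15]",
    "3.07  active+clean+laggy  [4,11,2]",
]

def laggy_osd_counts(lines):
    """Count how often each OSD id appears in the acting sets of laggy PGs."""
    counts = Counter()
    for line in lines:
        acting = line.split()[-1].strip("[]")
        counts.update(int(osd) for osd in acting.split(","))
    return counts

suspect, hits = laggy_osd_counts(sample).most_common(1)[0]
print(f"osd.{suspect} appears in {hits} of {len(sample)} laggy PGs")
```

If one OSD shows up in every laggy PG, that's the one to restart first.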
Will either the file store or the POSIX/GPFS filter support the underlying
files changing underneath, so you can access the files either through S3 or by
other out-of-band means (SMB, NFS, etc.)?
Thanks,
Kevin
From: Matt Benjamin
Sent: Monday, March 20,
+1. If I know radosgw on top of cephfs is a thing, I may change some plans. Is
that the planned route?
Thanks,
Kevin
From: Daniel Gryniewicz
Sent: Monday, March 6, 2023 6:21 AM
To: Kai Stian Olstad
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: s3
MinIO no longer lets you read/write from the POSIX side, only through MinIO
itself. :(
Haven't found a replacement yet. If you do, please let me know.
Thanks,
Kevin
From: Robert Sander
Sent: Tuesday, February 28, 2023 9:37 AM
To: ceph-users@ceph.io
Hello to everyone
When I use this command to see bucket usage
radosgw-admin bucket stats --bucket=
It works only when the owner of the bucket is active.
How can I see the usage even when the owner is suspended?
Here are two examples, one with the owner active and the other with the owner
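Assuming you can still dump stats for all buckets (radosgw-admin bucket stats with no --bucket argument), here is a sketch of summing one user's usage from that JSON whether or not the user is suspended. The field names mirror radosgw-admin output, but the numbers are invented:

```python
import json

# Made-up stats in the shape of `radosgw-admin bucket stats` output
# (one entry per bucket; only the fields used below are shown).
stats_json = """
[
  {"bucket": "photos", "owner": "alice",
   "usage": {"rgw.main": {"size_actual": 1048576, "num_objects": 12}}},
  {"bucket": "backups", "owner": "alice",
   "usage": {"rgw.main": {"size_actual": 2097152, "num_objects": 3}}},
  {"bucket": "logs", "owner": "bob",
   "usage": {"rgw.main": {"size_actual": 512, "num_objects": 1}}}
]
"""

def usage_for_owner(stats, owner):
    """Sum size and object counts per owner, suspended or not."""
    size = objects = 0
    for b in stats:
        if b["owner"] == owner:
            main = b["usage"].get("rgw.main", {})
            size += main.get("size_actual", 0)
            objects += main.get("num_objects", 0)
    return size, objects

print(usage_for_owner(json.loads(stats_json), "alice"))  # (3145728, 15)
```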
We successfully did ceph-deploy+octopus+centos7 -> (ceph-deploy
unsupported)+octopus+centos8stream (using leap) -> (ceph-deploy
unsupported)+pacific+centos8stream -> cephadm+pacific+centos8stream
Everything is in place. Leap was tested repeatedly until the procedure and its
side effects were very well
If you have prometheus enabled, the metrics should be in there I think?
Thanks,
Kevin
From: Peter van Heusden
Sent: Thursday, January 12, 2023 6:12 AM
To: ceph-users@ceph.io
Subject: [ceph-users] BlueFS spillover warning gone after upgrade to Quincy
What else is going on? (ceph -s). If there is a lot of data being shuffled
around, it may just be because it's waiting for some other actions to complete
first.
Thanks,
Kevin
From: Torkil Svensgaard
Sent: Tuesday, January 10, 2023 2:36 AM
To:
Is there any problem removing the radosgw and all backing pools from a cephadm
managed cluster? Ceph won't become unhappy about it? We have one cluster with a
really old, historical radosgw we think would be better to remove and someday
later, recreate fresh.
Thanks,
Kevin
We went on a couple clusters from ceph-deploy+centos7+nautilus to
cephadm+rocky8+pacific using ELevate as one of the steps. Went through octopus
as well. ELevate wasn't perfect for us either, but was able to get the job
done. Had to test it carefully on the test clusters multiple times to get
If it's this:
http://www.acmemicro.com/Product/17848/Kioxia-KCD6XLUL15T3---15-36TB-SSD-NVMe-2-5-inch-15mm-CD6-R-Series-SIE-PCIe-4-0-5500-MB-sec-Read-BiCS-FLASH-TLC-1-DWPD
it's listed as 1 DWPD with a 5-year warranty, so it should be OK.
Thanks,
Kevin
From:
When we switched (we were using the compat balancer previously), I:
1. turned off the balancer
2. forced the client minimum (new CentOS 7 clients are OK being forced to
luminous even though they report as jewel; there's an email thread elsewhere
describing it)
3. slowly reweighted the crush compat
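The gradual part of step 3 can be sketched as emitting small reweight steps rather than one jump. The command string, OSD id, and step size below are illustrative only (and the crush-compat case actually adjusts weight-sets rather than plain crush weights):

```python
def reweight_steps(osd_id, current, target, step=0.05):
    """Emit `ceph osd crush reweight` commands moving weight in small steps,
    so the change does not trigger one large data migration all at once."""
    cmds = []
    w = current
    while abs(target - w) > 1e-9:
        # Clamp each step so we never overshoot the target weight.
        delta = max(-step, min(step, target - w))
        w = round(w + delta, 4)
        cmds.append(f"ceph osd crush reweight osd.{osd_id} {w}")
    return cmds

for cmd in reweight_steps(12, 1.0, 0.9):
    print(cmd)
```

Run each emitted command, wait for the cluster to settle, then run the next.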
I think you can do it like:
```
service_type: rgw
service_id: main
service_name: rgw.main
placement:
  label: rgwmain
spec:
  config:
    rgw_keystone_admin_user: swift
```
?
From: Thilo-Alexander Ginkel
Sent: Thursday, November 17, 2022 10:21 AM
To:
There should be prom metrics for each.
Thanks,
Kevin
From: Christophe BAILLON
Sent: Monday, November 14, 2022 10:08 AM
To: ceph-users
Subject: [ceph-users] How to monitor growing of db/wal partitions ?
Check twice before you click! This email originated
If it's the same issue, I'd check the fragmentation score on the entire cluster
ASAP. You may have other OSDs close to the limit, and it's harder to fix when all
your OSDs cross the line at once. If you drain this one, it may push the other
ones into the red zone if you're too close, making the
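A sketch of triaging that, assuming you have already collected each OSD's fragmentation score (e.g. from `ceph daemon osd.N bluestore allocator score block`) into a dict. The scores and the 0.9/0.05 cutoffs below are made-up illustrations:

```python
# Hypothetical scores collected from each OSD's
# `bluestore allocator score block` output (0 = none, 1 = fully fragmented).
scores = {"osd.0": 0.31, "osd.7": 0.88, "osd.12": 0.97, "osd.21": 0.55}

def at_risk(scores, threshold=0.9, margin=0.05):
    """Split OSDs into those already over the threshold and those close to it,
    so draining one OSD does not push its neighbours over the line."""
    over = sorted(o for o, s in scores.items() if s >= threshold)
    near = sorted(o for o, s in scores.items()
                  if threshold - margin <= s < threshold)
    return over, near

print(at_risk(scores))  # (['osd.12'], ['osd.7'])
```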
I haven't done it, but I had to read through the documentation a couple of
months ago, and what I gathered was:
1. If you have a DB device specified but no WAL device, it will put the WAL on
the same volume as the DB.
2. The recommendation seems to be not to have a separate volume for DB and WAL
if
Would it cause problems to mix the smartctl exporter along with ceph's built in
monitoring stuff?
Thanks,
Kevin
From: Wyll Ingersoll
Sent: Friday, October 14, 2022 10:48 AM
To: Konstantin Shalygin; John Petrini
Cc: Marc; Paul Mezzanini; ceph-users
Rank | State          | MDS                   | Activity     | Dentries | Inodes  | Dirs    | Caps
0    | active         | ceph-g-ssd-4-2.mxwjvd | Reqs: 130 /s | 10.2 M   | 10.1 M  | 356.8 k | 707.6 k
0-s  | standby-replay | ceph-g-ssd-4-1.ixqewp | Evts: 0 /s   | 156.5 k  | 127.7 k | 47.4 k  | 0
It is working really well
I plan
09:45, Arnaud M wrote:
>
> > A ZFS file system can store up to *256 quadrillion zettabytes* (ZB).
>
> How would a storage system look like in reality that could hold such an
> amount of data?
>
> Regards
> --
> Robert Sander
> Heinlein Consulting GmbH
> Sch
which I
> know/worked with more than 50.000 HDDs without problems.
>
> On Mon, Jun 20, 2022 at 10:46 AM Arnaud M
> wrote:
> >
> > Hello to everyone
> >
> > I have looked on the internet but couldn't find an answer.
> > Do you know the maximum size of a ceph f
Hello to everyone
I have looked on the internet but couldn't find an answer.
Do you know the maximum size of a Ceph filesystem? Not the max size of a
single file, but the limit of the whole filesystem?
For example, a quick search on ZFS on Google outputs:
A ZFS file system can store up to *256
Hello
I will speak about CephFS because it is what I am working on.
Of course you can do some kind of rsync or rclone between two CephFS
clusters, but at petabyte scale it will be really slow and cost a lot!
There is another approach that we tested successfully (only in test, not in
prod).
We
pool configuration (EC k+m or replicated X setup), and do you
> use the same pool for indexes and data? I'm assuming this is RGW usage via
> the S3 API, let us know if this is not correct.
>
> On Tue, Mar 29, 2022 at 4:13 PM Alex Closs wrote:
>
> > Hey folks,
> >
> > W
Hello Linkriver
I might have an issue close to yours.
Can you tell us if your stray dirs are full?
What does this command output for you?
ceph tell mds.0 perf dump | grep strays
Does the value change over time?
All the best
Arnaud
On Wed, Mar 16, 2022 at 15:35, Linkriver Technology <
2022 at 3:57 AM Arnaud M
> wrote:
>
>> Hello to everyone :)
>>
>> Just some question about filesystem scrubbing
>>
>> In this documentation it is said that scrub will help admin check
>> consistency of filesystem:
>>
>> http
Mar 2022 at 23:26, Arnaud M wrote:
> Hello to everyone :)
>
> Just some question about filesystem scrubbing
>
> In this documentation it is said that scrub will help admin check
> consistency of filesystem:
>
> https://docs.ceph.com/en/latest/cephfs/sc
h.io/hyperkitty/list/ceph-users@ceph.io/thread/2NT55RUMD33KLGQCDZ74WINPPQ6WN6CW/
>
> And about the crash, it could be related to
> https://tracker.ceph.com/issues/51824
>
> Cheers, dan
>
>
> On Tue, Mar 1, 2022 at 11:30 AM Arnaud M
> wrote:
> >
> > Hello Dan
Hello to everyone :)
Just some questions about filesystem scrubbing.
In this documentation it is said that scrub will help admins check the
consistency of the filesystem:
https://docs.ceph.com/en/latest/cephfs/scrub/
So my questions are:
Is filesystem scrubbing mandatory?
How often should I scrub the
of deleted files.
> You need to delete the snapshots, or "reintegrate" the hardlinks by
> recursively listing the relevant files.
>
> BTW, in pacific there isn't a big problem with accumulating lots of
> stray files. (Before pacific there was a default limit of 1M strays,
I am using ceph pacific (16.2.5)
Does anyone have an idea about my issues?
Thanks again to everyone
All the best
Arnaud
On Tue, Mar 1, 2022 at 01:04, Arnaud M wrote:
> Hello to everyone
>
> Our ceph cluster is healthy and everything seems to go well but we have a
> lot o
Hello to everyone
Our ceph cluster is healthy and everything seems to go well but we have a
lot of num_strays
ceph tell mds.0 perf dump | grep stray
"num_strays": 1990574,
"num_strays_delayed": 0,
"num_strays_enqueuing": 0,
"strays_created": 3,
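To see whether the count is still growing, one can diff two perf-dump samples taken some time apart. This sketch assumes the counters sit under "mds_cache" in the JSON (true on recent releases); the first sample matches the output above, the second sample's numbers are invented:

```python
import json

# Two `ceph tell mds.0 perf dump` samples taken some time apart.
earlier = json.loads(
    '{"mds_cache": {"num_strays": 1990574, "num_strays_delayed": 0}}')
later = json.loads(
    '{"mds_cache": {"num_strays": 1990580, "num_strays_delayed": 0}}')

def strays_delta(a, b):
    """How much num_strays changed between two perf-dump samples."""
    return b["mds_cache"]["num_strays"] - a["mds_cache"]["num_strays"]

print(strays_delta(earlier, later))  # 6
```

A steadily positive delta means strays are accumulating faster than they are purged or reintegrated.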
We launch a local registry for cases like these and mirror the relevant
containers there. This keeps copies of the images closer to the target cluster
and reduces load on the public registries. It's not that much different from
mirroring a yum/apt repo locally to speed up access. For large
eph /dev/sdk2
>
>
> > -Original Message-
> > From: M Ranga Swami Reddy
> > Sent: Thursday, 8 July 2021 11:49
> > To: ceph-devel ; ceph-users > us...@ceph.com>
> > Subject: [ceph-users] Fwd: ceph upgrade from luminous to nautilus
> >
-- Forwarded message -
From: M Ranga Swami Reddy
Date: Thu, Jul 8, 2021 at 2:30 PM
Subject: ceph upgrade from luminous to nautilus
To: ceph-devel
Dear All,
I am using Ceph Luminous with 2000+ OSDs.
We are planning to upgrade from Luminous to Nautilus.
Currently
I'm not aware of any directly, but I know rook-ceph is used on Kubernetes, and
Kubernetes is sometimes deployed with BGP based SDN layers. So there may be a
few deployments that do it that way.
From: Martin Verges
Sent: Monday, July 5, 2021 11:23 PM
To:
https://docs.ceph.com/en/latest/rbd/rbd-openstack/
From: Szabo, Istvan (Agoda)
Sent: Wednesday, June 30, 2021 9:50 AM
To: Ceph Users
Subject: [ceph-users] Ceph connect to openstack
Orchestration is hard, especially with every permutation. The devs have
implemented what they feel is the right solution for their own needs from the
sound of it. The orchestration was made modular to support non containerized
deployment. It just takes someone to step up and implement the
I bumped into this recently:
https://samuel.karp.dev/blog/2021/05/running-freebsd-jails-with-containerd-1-5/
:)
Kevin
From: Sage Weil
Sent: Thursday, June 24, 2021 2:06 PM
To: Stefan Kooman
Cc: Nico Schottelius; Kai Börnert; Marc; ceph-users
Subject:
I've actually had rook-ceph not proceed with something that I would have
continued on with. Turns out I was wrong and it was right. Its checking was
more thorough than mine. Thought that was pretty cool. It eventually cleared
itself and finished up.
For a large ceph cluster, the orchestration
Ultimately, that's what a container image is. From the outside, it's a
statically linked binary. From the inside, it can be assembled using modular
techniques. The best thing about it, is you can use container scanners and
other techniques to gain a lot of the benefits of that modularity still.
While there are many reasons containerization helps, I'll just touch on one
real quick that is relevant to the conversation.
Orchestration.
Implementing orchestration of an entire clustered software with many different:
* package managers
* dependency chains
* init systems
* distro specific
Debating containers vs packages is like debating systemd vs initrd. There are
lots of reasons why containers (and container orchestration) are good for
deploying things, including ceph. Repeating them in each project every time it
comes up is not really productive. I'd recommend looking at why
The quick answer, is they are optimized for different use cases.
Things like relational databases (mysql, postgresql) benefit from the
performance that a dedicated filesystem can provide (rbd). Shared filesystems
are usually contraindicated with such software.
Shared filesystems like cephfs
There are a lot of benefits to containerization that are hard to get without it.
Finer-grained ability to allocate resources to services (this process gets 2 GB
of RAM and 1 CPU).
Security is better when only minimal software is available within the
container, so on service compromise it's harder to
+1
From: Marc
Sent: Thursday, February 11, 2021 12:09 PM
To: ceph-users
Subject: [ceph-users] ceph osd df results
Should the ceph osd df results not have this result for every
Ping
From: Fox, Kevin M
Sent: Tuesday, December 29, 2020 3:17 PM
To: ceph-users@ceph.io
Subject: [ceph-users] radosgw bucket index issue
We have a fairly old cluster that has over time been upgraded to nautilus. We
were digging through some things
We have a fairly old cluster that has over time been upgraded to nautilus. We
were digging through some things and found 3 bucket indexes without a
corresponding bucket. They should have been deleted but somehow were left
behind. When we try and delete the bucket index, it will not allow it as
Hi
With the help of Dan van der Ster I managed to confirm my suspicion
that the problem with the osdmap not trimming correctly was caused by PGs
that somehow weren't marked as created by the MONs.
For example, from log:
2020-11-16 12:57:00.514 7f131496f700 10 mon.monb01@0(probing).osd e72792
Hi
I removed related options excluding "mon_debug_block_osdmap_trim
false"
Logs below. I'm not sure how to extract the required information, so I just
used grep. If it's not enough, please let me know. I can also upload the
entire log somewhere if required.
root@monb01:~# grep trim
Hi
Thanks for the reply. Yeah, I restarted all of the mon servers in
sequence, and yesterday just the leader alone, without any success.
Reports:
root@monb01:~# ceph report | grep committed
report 4002437698
"monmap_first_committed": 1,
"monmap_last_committed": 6,
Hi
We have a ceph cluster running on Nautilus, recently upgraded from Mimic.
While on Mimic we noticed an issue with the osdmap not trimming, which caused
part of our cluster to crash due to osdmap cache misses. We solved it by
adding "osd_map_cache_size = 5000" to our ceph.conf.
Because we had at that
On 2020-11-04 01:18, m.sliwin...@lh.pl wrote:
Just in case - result of ceph report is here:
http://paste.ubuntu.com/p/D7yfr3pzr4/
Hi
We have a weird issue with our ceph cluster - almost all PGs assigned
to one specific pool became stuck, locking out all operations without
reporting
"last_interval_clean": 67745,
"last_epoch_split": 0,
"last_epoch_marked_full": 0,
"same_up_since": 69411,
"same_interval_since": 69411,
"same_primary_since": 69411,
"last_scrub": "68062'199623",
"last_scrub_stamp"
Cool. Cool cool cool.
Looks like the issue I was experiencing was resolved by
https://github.com/ceph/ceph/pull/34745. Didn't know encrypted OSDs weren't
supported at all. v15.2.0 did seem to handle them fine; looks like 15.2.1 and
15.2.2 have some regression there. Under 15.2.1, OSDs are
Just following up. Is hanging, during the provisioning of an encrypted OSD,
expected behavior with the current tooling?
I noticed the LUKS volumes were open, even though luksOpen hung. I killed
cryptsetup (once per disk) and ceph-volume continued and eventually created the
OSDs for the host (yes, this node will be slated for another reinstall when
cephadm is stabilized).
Is there a way to remove an osd service
Hi, trying to migrate a second ceph cluster to cephadm. All the hosts
successfully migrated from "legacy" except one of the OSD hosts (cephadm kept
duplicating OSD ids, e.g. two "osd.5"; still not sure why). To make things
easier, we re-provisioned the node (reinstalled from netinstall, applied
Disable PSK in the Zabbix server and rerun "ceph zabbix send".
Hi
we deploy Ceph object storage and want to secure RGW. Is there any solution or
user experience with it?
Is it common to use a WAF?
Thanks
P.s. The Vagrant file I'm using:
https://gist.github.com/markmaas/5b39e3356a240a9b0fc2063d6d16ffab
That might help someone as well ;-)
On Sun, Mar 15, 2020 at 4:01 PM Mark M wrote:
>
> Hi List,
>
> Trying to learn Ceph, so I'm using a Vagrant setup of six nodes with
> Docker
Hi List,
Trying to learn Ceph, so I'm using a Vagrant setup of six nodes with
Docker and LVM on them.
But as soon as I reach this step
https://docs.ceph.com/docs/master/cephadm/#bootstrap-a-new-cluster
I get stuck with this error:
```
INFO:cephadm:Mgr epoch 5 not available, waiting (6/10)...
I am using Luminous 12.2.11 with Prometheus.
On Sun, Mar 8, 2020 at 12:28 PM XuYun wrote:
> You can enable prometheus module of mgr if you are running Nautilus.
>
> > On Mar 8, 2020 at 2:15 AM, M Ranga Swami Reddy wrote:
> >
> > On Fri, Mar 6, 2020 at 1:06 AM M Ranga
On Fri, Mar 6, 2020 at 1:06 AM M Ranga Swami Reddy
wrote:
> Hello,
> Can we get the IOPs of any rbd image/volume?
>
> For ex: I have created volumes via OpenStack Cinder. Want to know
> the IOPs of these volumes.
>
> In general, we can get pool stats, but I have not seen the per v
Thanks! I should have tried that, upgrading the clusters to 14.2.8, concurrent
to that endpoint being down, made the issue hard to track.
I'll make sure to re-enable telemetry when that PR merges into the next release.
Is there another way to disable telemetry than using:
> ceph telemetry off
> Error EIO: Module 'telemetry' has experienced an error and cannot handle
> commands: cannot concatenate 'str' and 'UUID' objects
I'm attempting to get all my clusters out of a constant HEALTH_ERR state caused
by
Hello all,
I'm maintaining a small Nautilus 12-OSD cluster (36 TB raw). My mon nodes have
the mgr/mds collocated/stacked with the mon. Each is allocated 10 GB of RAM.
During a recent single-disk failure and corresponding recovery, I noticed my
mgr/mons were starting to get OOM-killed/restarted
Hello - Recently we upgraded to Luminous 12.2.11. After that, we see
scrub errors on the object storage pool on a daily basis. After
repair they are cleared, but they come back the next day after scrub
runs on the PG.
Any known issues with scrub errors in 12.2.11?