On 1/20/20 4:17 PM, Anton Aleksandrov wrote:
> Hello community,
>
> We have very small ceph cluster of just 12 OSDs (1 per small server), 3
> MDS (one active) and 1 cephFS client.
>
Which version of Ceph?
$ ceph versions
> CephFS client is running Centos7, kernel 3.10.0-957.27.2.el7.x86_64.
On 1/10/20 5:32 PM, Stefan Priebe - Profihost AG wrote:
> Hi,
>
> we're currently in the process of building a new Ceph cluster to back up RBD
> images from multiple Ceph clusters.
>
> We would like to start with just a single Ceph cluster to back up, which is
> about 50 TB. Compression ratio of
On 1/10/20 7:43 PM, Philip Brown wrote:
> Surprisingly, a Google search didn't seem to find the answer on this, so I guess
> I should ask here:
>
> what determines if an RBD is "100% busy"?
>
> I have some backend OSDs, and an iSCSI gateway, serving out some RBDs.
>
> iostat on the gateway says
On 1/13/20 6:37 PM, vita...@yourcmc.ru wrote:
>> Hi,
>>
>> we're playing around with ceph but are not quite happy with the IOs.
>> on average 5000 iops / write
>> on average 13000 iops / read
>>
>> We're expecting more. :( any ideas or is that all we can expect?
>
> With server SSD you can expe
On 1/9/20 2:27 PM, Stefan Priebe - Profihost AG wrote:
> Hi Wido,
> On 09.01.20 at 14:18, Wido den Hollander wrote:
>>
>>
>> On 1/9/20 2:07 PM, Daniel Aberger - Profihost AG wrote:
>>>
>>> On 09.01.20 at 13:39, Janne Johansson wrote:
>>>>
On 1/9/20 2:07 PM, Daniel Aberger - Profihost AG wrote:
>
> On 09.01.20 at 13:39, Janne Johansson wrote:
>>
>> I'm currently trying to work out a concept for a ceph cluster which can
>> be used as a target for backups which satisfies the following
>> requirements:
>>
>> - approx.
On 12/7/19 3:39 PM, Philippe D'Anjou wrote:
> @Wido Den Hollander
>
> First of all the docs say: "In most cases, this distribution is
> “perfect,” with an equal number of PGs on each OSD (+/-1 PG, since they
> might not divide evenly)."
> Either this is jus
On 12/7/19 1:42 PM, Philippe D'Anjou wrote:
> @Wido Den Hollander
>
> That doesn't explain why it's between 76 and 92 PGs; that's far from equal.
The balancer will balance the PGs so that all OSDs have almost equal
data usage. It doesn't balance so that all OSD
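For reference, the upmap balancer being discussed here is enabled and inspected with a few mgr commands; this is only a sketch, and the min-compat-client step assumes all clients are Luminous or newer:

```shell
# Check whether the balancer is active and which mode it uses
ceph balancer status

# upmap mode requires clients that understand pg-upmap entries
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on

# Inspect the resulting PG count and data usage per OSD
ceph osd df tree
```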
On 12/7/19 11:42 AM, Philippe D'Anjou wrote:
> Hi,
> the docs say the upmap mode is trying to achieve perfect distribution so as
> to have an equal number of PGs per OSD.
> This is what I got(v14.2.4):
>
> 0 ssd 3.49219 1.0 3.5 TiB 794 GiB 753 GiB 38 GiB 3.4 GiB 2.7
> TiB 22.20 0.32 82 up
>
On 12/3/19 3:07 PM, Aleksey Gutikov wrote:
That is true. When an OSD goes down it will take a few seconds for its
Placement Groups to re-peer with the other OSDs. During that period
writes to those PGs will stall for a couple of seconds.
I wouldn't say it's 40s, but it can take ~10s.
Hel
On 12/3/19 11:40 AM, John Hearns wrote:
I had a fat fingered moment yesterday
I typed ceph auth del osd.3
Where osd.3 is an otherwise healthy little osd
I have not set noout or down on osd.3 yet
This is a Nautilus cluster.
ceph health reports everything is OK
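If only the cephx key was deleted and the OSD itself was not purged, re-importing the key from the OSD's own keyring is usually enough; a hedged sketch, assuming the default data path for osd.3:

```shell
# Re-register osd.3's existing key and the standard OSD caps.
# The keyring path is the default for ceph-volume deployments.
ceph auth add osd.3 \
    mon 'allow profile osd' \
    mgr 'allow profile osd' \
    osd 'allow *' \
    -i /var/lib/ceph/osd/ceph-3/keyring

# Verify the key and caps are back
ceph auth get osd.3
```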
On 11/29/19 6:28 AM, jes...@krogh.cc wrote:
> Hi Nathan
>
> Is that true?
>
> The time it takes to reallocate the primary PG delivers “downtime” by
> design, right? Seen from a writing client's perspective.
>
That is true. When an OSD goes down it will take a few seconds for its
Placement Gr
On 11/28/19 12:56 PM, David Majchrzak, ODERLAND Webbhotell AB wrote:
> Hi!
>
> We've deployed a new flash only ceph cluster running Nautilus and I'm
> currently looking at any tunables we should set to get the most out of
> our NVMe SSDs.
>
> I've been looking a bit at the options from the blo
> Director – Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>
>
> -Original Message-
> From: Wido den Hollander [mailto:w...@42on.com]
> Sent: Friday, November 15, 2019 1:56 AM
> To: Dominic Hilsbos
On 11/15/19 4:25 PM, Paul Emmerich wrote:
> On Fri, Nov 15, 2019 at 4:02 PM Wido den Hollander wrote:
>>
>> I normally use LVM on top
>> of each device and create 2 LVs per OSD:
>>
>> - WAL: 1GB
>> - DB: xx GB
>
> Why? I've seen this a few ti
On 11/15/19 3:19 PM, Kristof Coucke wrote:
> Hi all,
>
>
>
> We’ve configured a Ceph cluster with 10 nodes, each having 13 large
> disks (14 TB) and 2 NVMe disks (1.6 TB).
>
> The idea was to use the NVMe as “fast device”…
>
> The recommendations I’ve read in the online documentation, state t
On 11/11/19 2:00 PM, Shawn Iverson wrote:
> Hello Cephers!
>
> I had a node over the weekend go nuts from what appears to have been
> failed/bad memory modules and/or motherboard.
>
> This resulted in several OSDs blocking IO for > 128s (indefinitely).
>
> I was not watching my alerts too clo
On 11/15/19 12:57 PM, Willi Schiegel wrote:
> Hello All,
>
> I'm starting to setup a Ceph cluster and am confused about the
> recommendations for the network setup.
>
> In the Mimic manual I can read
>
> "We recommend running a Ceph Storage Cluster with two networks: a public
> (front-side) ne
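The two-network layout recommended there maps onto a ceph.conf fragment like the following; the subnets are placeholders:

```ini
[global]
# Client-facing (front-side) traffic
public network = 192.168.1.0/24
# OSD replication and recovery (back-side) traffic
cluster network = 192.168.2.0/24
```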
On 11/15/19 11:24 AM, Simon Ironside wrote:
> Hi Florian,
>
> Any chance the key your compute nodes are using for the RBD pool is
> missing 'allow command "osd blacklist"' from its mon caps?
>
Added to this I recommend to use the 'profile rbd' for the mon caps.
As also stated in the OpenStack
Did you check /var/log/ceph/ceph.log on one of the Monitors to see which
pool and Object the large Object is in?
Wido
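A hedged example of what such caps could look like; the client name and pool are placeholders:

```shell
# The rbd profiles include the 'osd blacklist' mon permission that
# compute nodes need to break stale exclusive locks:
ceph auth caps client.compute \
    mon 'profile rbd' \
    osd 'profile rbd pool=vms'

# Verify the resulting caps
ceph auth get client.compute
```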
On 11/15/19 12:23 AM, dhils...@performair.com wrote:
> All;
>
> We had a warning about a large OMAP object pop up in one of our clusters
> overnight. The cluster is configured
On 10/30/19 3:04 AM, soumya tr wrote:
> Hi all,
>
> I have a 3-node Ceph cluster set up using Juju charms. ceph health shows
> inactive PGs.
>
> ---
> /# ceph status
> cluster:
> id: 0e36956e-ef64-11e9-b472-00163e6e01e8
> health: HEALTH_WARN
> Reduced
On 10/26/19 8:01 AM, Philippe D'Anjou wrote:
> V14.2.4
> So, this is not new, this happens every time there is a rebalance, now
> because of raising PGs. PG balancer is disabled because I thought it was
> the reason but apparently it's not, but it ain't helping either.
>
> Ceph is totally borged
On 10/25/19 5:27 AM, luckydog xf wrote:
> Hi, list,
>
> Currently my ceph nodes with 3 MON and 9 OSDs, everything is fine.
> Now I plan to add one more public network; the initial public network
> is 103.x/24, and the target network is 109.x/24. And 103 cannot reach
> 109, as I don't conf
On 10/1/19 4:38 PM, Stefan Kooman wrote:
> Quoting Wido den Hollander (w...@42on.com):
>> Hi,
>>
>> The Telemetry [0] module has been in Ceph since the Mimic release and
>> when enabled it sends an anonymized JSON document back to
>> https://telemetry.ceph.com/ ever
On 10/1/19 5:11 PM, Mattia Belluco wrote:
> Hi all,
>
> Same situation here:
>
> Ceph 13.2.6 on Ubuntu 16.04.
>
Thanks for the feedback both! I enabled it on a Ubuntu 18.04 with
Nautilus 14.2.4 system.
> Best
> Mattia
>
> On 10/1/19 4:38 PM, Stefan Koo
Hi,
The Telemetry [0] module has been in Ceph since the Mimic release and
when enabled it sends an anonymized JSON document back to
https://telemetry.ceph.com/ every 72 hours with information about the
cluster.
For example:
- Version(s)
- Number of MONs, OSDs, FS, RGW
- Operating System used
- CPUs u
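The module described above can be enabled and previewed with a few commands; a sketch, assuming a Mimic or newer cluster:

```shell
# Enable the telemetry mgr module
ceph mgr module enable telemetry

# Preview exactly what would be sent before opting in
ceph telemetry show

# Opt in; a report is then sent every 72 hours
ceph telemetry on
```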
On 9/17/19 11:01 PM, Oliver Freyermuth wrote:
> Dear Cephalopodians,
>
> I realized just now that:
> https://eu.ceph.com/rpm-nautilus/el7/x86_64/
> still holds only released up to 14.2.2, and nothing is to be seen of
> 14.2.3 or 14.2.4,
> while the main repository at:
> https://download.ceph
On 9/14/19 4:24 AM, Alfred wrote:
> Hi ceph users,
>
>
> If I understand correctly the "min_compat_client" option in the OSD map
> was replaced in Luminous with "require_min_compat_client".
>
> After upgrading a cluster to Luminous and setting
> set-require-min-compat-client to jewel, the min_com
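For illustration, the replacement option is set and then checked roughly like this (a sketch based on the Luminous CLI):

```shell
# Set the minimum client release the cluster will accept
ceph osd set-require-min-compat-client jewel

# Verify: the OSD map should now show require_min_compat_client
ceph osd dump | grep min_compat_client
```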
On 9/4/19 1:34 AM, Eric Choi wrote:
> Hi there,
>
> We are trying to upgrade a cluster in our test environment to Nautilus,
> and we ran into this hurdle after upgrading Mons and Mgrs. Restarting
> the first OSD, we noticed that ceph -s is reporting many PG's state as
> unknown:
>
> [root@ceph
On 8/26/19 6:46 PM, Thomas Schneider wrote:
> Hi,
>
> I'm running Debian 10 with btrfs-progs=5.2.1.
>
> Creating snapshots with snapper=0.8.2 works w/o errors.
>
> However, I run into an issue and need to restore various files.
>
> I thought that I could simply take the files from a snapshot
On 8/26/19 1:35 PM, Simon Oosthoek wrote:
> On 26-08-19 13:25, Simon Oosthoek wrote:
>> On 26-08-19 13:11, Wido den Hollander wrote:
>>
>>>
>>> The reweight might actually cause even more confusion for the balancer.
>>> The balancer uses upmap mode a
On 8/26/19 12:33 PM, Simon Oosthoek wrote:
> On 26-08-19 12:00, EDH - Manuel Rios Fernandez wrote:
>> Balancer just balance in Healthy mode.
>>
>> The problem is that data is distributed without being balanced on its
>> first write, which causes data to be improperly balanced across OSDs.
>
> I suppose
On 8/22/19 5:49 PM, Jason Dillaman wrote:
> On Thu, Aug 22, 2019 at 11:29 AM Wido den Hollander wrote:
>>
>>
>>
>> On 8/22/19 3:59 PM, Jason Dillaman wrote:
>>> On Thu, Aug 22, 2019 at 9:23 AM Wido den Hollander wrote:
>>>>
>>>> Hi,
On 8/22/19 3:59 PM, Jason Dillaman wrote:
> On Thu, Aug 22, 2019 at 9:23 AM Wido den Hollander wrote:
>>
>> Hi,
>>
>> In a couple of situations I have encountered that Virtual Machines
>> running on RBD had a high I/O-wait, nearly 100%, on their vdX (VirtIO
Hi,
In a couple of situations I have encountered that Virtual Machines
running on RBD had a high I/O-wait, nearly 100%, on their vdX (VirtIO)
or sdX (Virtio-SCSI) devices while they were performing CPU intensive tasks.
These servers would be running a very CPU intensive application while
*not* do
On 8/14/19 9:48 AM, Simon Oosthoek wrote:
> Hi all,
>
> Yesterday I marked out all the osds on one node in our new cluster to
> reconfigure them with WAL/DB on their NVMe devices, but it is taking
> ages to rebalance. The whole cluster (and thus the osds) is only ~1%
> full, therefore the full
> On 8/13/19 3:51 PM, Paul Emmerich wrote:
>
> > On Tue, Aug 13, 2019 at 10:04 PM Wido den Hollander <mailto:w...@42on.com>> wrote:
> >> I just checked an RGW-only setup. 6TB drive, 58% full, 11.2GB of
> DB in
> >> use. No slow
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Wido
> den Hollander
> Sent: Tuesday, August 13, 2019 12:51 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] WAL/DB size
>
>
>
> On 8/13/19 5:54 PM, Hemant
On 8/13/19 5:54 PM, Hemant Sonawane wrote:
> Hi All,
> I have 4 x 6 TB HDDs and 2 x 450 GB SSDs, and I am going to partition each
> SSD into 220 GB for RocksDB (rocks.db). So my question is: does it make
> sense to use a WAL for my configuration? If yes, then what could its size
> be? Help will be really apprecia
If min_size=2 you would need both B and C to fail within that window or
when they are recovering A.
That is an even smaller chance than a single disk failure.
Wido
> Wido den Hollander <mailto:w...@42on.com> wrote on Thu, Jul 25, 2019 at 3:39 PM:
>
>
>
> On 7/25/19 9:
that single
disk/OSD now dies while performing the recovery you have lost data.
The PG (or PGs) becomes inactive and you either need to perform data
recovery on the failed disk or revert back to the last state.
I can't take that risk in this situation.
Wido
> My 0.02
>
> Wido den Holland
On 7/25/19 8:55 AM, Janne Johansson wrote:
> On Wed, Jul 24, 2019 at 21:48, Wido den Hollander <mailto:w...@42on.com> wrote:
>
> Right now I'm just trying to find a clever solution to this. It's a 2k
> OSD cluster and the likelihood of a host or OSD cra
On 7/25/19 7:49 AM, Sangwhan Moon wrote:
> Hello,
>
> Original Message:
>>
>>
>> On 7/25/19 6:49 AM, Sangwhan Moon wrote:
>>> Hello,
>>>
>>> I've inherited a Ceph cluster from someone who has left zero documentation
>>> or any handover. A couple days ago it decided to show the entire company
On 7/25/19 6:49 AM, Sangwhan Moon wrote:
> Hello,
>
> I've inherited a Ceph cluster from someone who has left zero documentation or
> any handover. A couple days ago it decided to show the entire company what it
> is capable of..
>
> The health report looks like this:
>
> [root@host mnt]# c
On 7/24/19 9:38 PM, dhils...@performair.com wrote:
> All;
>
> There's been a lot of discussion of various kernel versions on this list
> lately, so I thought I'd seek some clarification.
>
> I prefer to run CentOS, and I prefer to keep the number of "extra"
> repositories to a minimum. Ceph
er solution to this. It's a 2k
OSD cluster and the likelihood of a host or OSD crashing is reasonable
while you are performing maintenance on a different host.
All kinds of things have crossed my mind where using size=4 is one of them.
Wido
> Mark Schouten
>
>> On Jul 24, 2019, at
Hi,
Is anybody using 4x (size=4, min_size=2) replication with Ceph?
The reason I'm asking is that a customer of mine asked me for a solution
to prevent a situation which occurred:
A cluster running with size=3 and replication over different racks was
being upgraded from 13.2.5 to 13.2.6.
During
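For context, switching an existing pool to 4x replication is just two pool settings; a sketch, with the pool name as a placeholder:

```shell
# Keep four copies, and keep serving I/O while at least two are available
ceph osd pool set rbd size 4
ceph osd pool set rbd min_size 2

# Confirm the new replication settings
ceph osd pool get rbd size
ceph osd pool get rbd min_size
```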
> - Sinan
>
>> On Jul 24, 2019, at 17:48, Wido den Hollander wrote the
>> following:
>>
>>
>>
>>> On 7/24/19 4:06 PM, Fabian Niepelt wrote:
>>> Hi, thanks for the reply.
>>>
>>> On Wednesday, 24.07.2019, at 15:26 +
On 7/24/19 7:15 PM, Kevin Hrpcek wrote:
> I often add 50+ OSDs at a time and my cluster is all NLSAS. Here is what
> I do, you can obviously change the weight increase steps to what you are
> comfortable with. This has worked well for me and my workloads. I've
> sometimes seen peering take longe
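The gradual weight-increase approach described above can be sketched as a small loop; the OSD ids, step values, and target weight below are hypothetical:

```shell
# Ramp new OSDs up in steps towards their full CRUSH weight,
# letting the cluster settle between steps.
for w in 0.5 1.0 1.5 2.0; do
    for id in 100 101 102; do          # hypothetical new OSD ids
        ceph osd crush reweight "osd.${id}" "${w}"
    done
    # Wait until peering/backfill calms down before the next step
    while ! ceph health | grep -q HEALTH_OK; do
        sleep 60
    done
done
```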
On 7/24/19 4:06 PM, Fabian Niepelt wrote:
> Hi, thanks for the reply.
>
> On Wednesday, 24.07.2019, at 15:26 +0200, Wido den Hollander wrote:
>>
>> On 7/24/19 1:37 PM, Fabian Niepelt wrote:
>>> Hello ceph-users,
>>>
>>> I am currently building
On 7/24/19 1:37 PM, Fabian Niepelt wrote:
> Hello ceph-users,
>
> I am currently building a Ceph cluster that will serve as a backend for
> Openstack and object storage using RGW. The cluster itself is finished and
> integrated with Openstack and virtual machines for testing are being deployed.
On 7/18/19 12:21 PM, Eugen Block wrote:
> Hi list,
>
> we're facing an unexpected recovery behavior of an upgraded cluster
> (Luminous -> Nautilus).
>
> We added new servers with Nautilus to the existing Luminous cluster, so
> we could first replace the MONs step by step. Then we moved the old
On 7/16/19 6:53 PM, M Ranga Swami Reddy wrote:
> Thanks for your reply..
> Here, new pool creations and PG autoscaling may cause a rebalance, which
> impacts the Ceph cluster's performance.
>
> Please share namespace details, like how to use it, etc.
>
Would it be RBD, Rados, CephFS? What would you be us
On 7/20/19 6:06 PM, Wei Zhao wrote:
> Hi ceph users:
> I was doing a write benchmark and found that some IO will be blocked for a
> very long time. The following log is one op; it seems to wait for
> replica to finish. My ceph version is 12.2.4, and the pool is 3+2 EC .
> Does anyone give me some ad
Hi,
We will be having Ceph Day London October 24th!
https://ceph.com/cephdays/ceph-day-london-2019/
The CFP is now open for you to get your Ceph-related content in front
of the Ceph community, covering all levels of expertise:
https://forms.zohopublic.com/thingee/form/CephDayLondon2019/formp
On 7/11/19 11:42 AM, Marc Roos wrote:
>
>
> Can I temporarily shut down all my monitors? This only affects new
> connections, no? Existing ones will still keep running?
>
You can, but it will completely shut down your whole Ceph cluster.
All I/O will pause until the MONs are back and have reached
On 7/10/19 9:59 AM, Lars Täuber wrote:
> Hi everybody!
>
> Is it possible to make snapshots in cephfs writable?
As far as I'm aware: No
You would need to remove the complete snapshot and create a new one.
> We need to remove files because of this General Data Protection Regulation
> also from
On 7/10/19 5:56 AM, Konstantin Shalygin wrote:
> On 5/28/19 5:16 PM, Marc Roos wrote:
>> I switched first of May, and did not notice too much difference in memory
>> usage. After the restart of the osd's on the node I see the memory
>> consumption gradually getting back to as before.
>> Can't say
On 6/11/19 9:48 PM, J. Eric Ivancich wrote:
> Hi Wido,
>
> Interleaving below
>
> On 6/11/19 3:10 AM, Wido den Hollander wrote:
>>
>> I thought it was resolved, but it isn't.
>>
>> I counted all the OMAP values for the GC objects and I got back:
On 6/4/19 8:00 PM, J. Eric Ivancich wrote:
> On 6/4/19 7:37 AM, Wido den Hollander wrote:
>> I've set up a temporary machine next to the 13.2.5 cluster with the
>> 13.2.6 packages from Shaman.
>>
>> On that machine I'm running:
>>
>> $ rad
On 6/5/19 8:44 AM, jes...@krogh.cc wrote:
> Hi.
>
> This is more an inquiry to figure out how our current setup compares
> to other setups. I have a 3 x replicated SSD pool with RBD images.
> When running fio on /tmp I'm interested in seeing how much IOPS a
> single thread can get - as Ceph sca
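A single-threaded fio run of the kind described might look like this; a sketch, where the file name, size, and runtime are arbitrary:

```shell
# One job at queue depth 1: this measures per-thread latency,
# not aggregate cluster throughput.
fio --name=singlethread \
    --filename=/tmp/fio.test --size=1G \
    --rw=randwrite --bs=4k \
    --ioengine=libaio --direct=1 \
    --iodepth=1 --numjobs=1 \
    --time_based --runtime=30
```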
On 5/30/19 2:45 PM, Wido den Hollander wrote:
>
>
> On 5/29/19 11:22 PM, J. Eric Ivancich wrote:
>> Hi Wido,
>>
>> When you run `radosgw-admin gc list`, I assume you are *not* using the
>> "--include-all" flag, right? If you're not using that f
t is in place for the 13.2.6
> release of mimic.
>
Thanks! I'll might grab some packages from Shaman to give GC a try.
Wido
> Eric
>
>
> On 5/29/19 3:19 AM, Wido den Hollander wrote:
>> Hi,
>>
>> I've got a Ceph cluster with this status:
>>
On 5/29/19 11:41 AM, Johan Thomsen wrote:
> Hi,
>
> It doesn't look like SIGHUP causes the OSDs to trigger a conf reload from
> files? Is there any other way I can do that, without restarting?
>
No, there isn't. I suggest you look into the new config store which is
in Ceph since the Mimic rele
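The centralized config store referred to here works roughly like this (a sketch; the option and value are just examples):

```shell
# Store a setting centrally in the MONs instead of in ceph.conf
ceph config set osd osd_max_backfills 2

# Options that support runtime changes take effect without a restart
ceph config get osd osd_max_backfills

# Show everything held in the config store
ceph config dump
```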
Hi,
I've got a Ceph cluster with this status:
health: HEALTH_WARN
3 large omap objects
After looking into it I see that the issue comes from objects in the
'.rgw.gc' pool.
Investigating it I found that the gc.* objects have a lot of OMAP keys:
for OBJ in $(rados -p .rgw.gc ls); do echo -n "$OBJ: "; rados -p .rgw.gc listomapkeys $OBJ | wc -l; done
>
Yes, that is correct. Keep in mind though that you will need to
Stop/Start the VMs or (Live) Migrate them to a different hypervisor for
the new packages to be loaded.
Wido
> Thank you very much!
>
> Kevin
>
> On Tue, May 28, 2019 at 09:46, Wido den Hollander wrote
>
On 5/28/19 7:52 AM, Kevin Olbrich wrote:
> Hi!
>
> How can I determine which client compatibility level (luminous, mimic,
> nautilus, etc.) is supported in Qemu/KVM?
> Does it depend on the version of ceph packages on the system? Or do I
> need a recent version Qemu/KVM?
This is mainly related
On 5/23/19 12:02 PM, Marc Roos wrote:
>
> Sorry for not waiting until it is published on the Ceph website, but has
> anyone attended this talk? Is it production ready?
>
Danny from Deutsche Telekom can answer this better, but no, it's not
production ready.
It seems it's more challenging to get
On 5/21/19 4:48 PM, Kevin Flöh wrote:
> Hi,
>
> we gave up on the incomplete pgs since we do not have enough complete
> shards to restore them. What is the procedure to get rid of these pgs?
>
You need to start with marking the OSDs as 'lost' and then you can
force_create_pg to get the PGs bac
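That procedure could look like the following. It is destructive, so the OSD id and PG id below are placeholders and the commands are only a sketch:

```shell
# Declare the dead OSD lost; any data only on it is given up
ceph osd lost 12 --yes-i-really-mean-it

# Recreate the incomplete PG as empty; its previous contents are gone
ceph osd force-create-pg 2.5 --yes-i-really-mean-it
```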
On 5/18/19 1:45 AM, Thore Krüss wrote:
> On Thu, May 16, 2019 at 08:41:10AM +0200, Wido den Hollander wrote:
>>
>>
>> On 5/12/19 4:21 PM, Thore Krüss wrote:
>>> Good evening,
>>> after upgrading our cluster yesterday to Nautilus (14.2.1) and pg-merging an
On 5/12/19 4:21 PM, Thore Krüss wrote:
> Good evening,
> after upgrading our cluster yesterday to Nautilus (14.2.1) and pg-merging an
> imbalanced pool, we noticed that the number of objects in the pool has doubled
> (rising synchronously with the merge progress).
>
> What happened there? Was this
On 5/2/19 4:08 PM, Daniel Gryniewicz wrote:
> Based on past experience with this issue in other projects, I would
> propose this:
>
> 1. By default (rgw frontends=beast), we should bind to both IPv4 and
> IPv6, if available.
>
> 2. Just specifying port (rgw frontends=beast port=8000) should app
On 4/16/19 2:27 PM, M Ranga Swami Reddy wrote:
> Its Smart Storage battery, which was disabled due to high ambient
> temperature.
> All OSD processes/daemons are working as-is... but those OSDs are not
> responding to other OSDs due to high CPU utilization.
> Don't observe the clock skew issue.
>
As the
to account
when setting:
[osd]
bluestore_allocator = bitmap
bluefs_allocator = bitmap
Writing this here for archival purposes so that users who have the same
question can find it easily.
Wido
>
> Thanks,
>
> Igor
>
>
> On 4/15/2019 3:39 PM, Wido den Hollander wrote:
Hi,
With the release of 12.2.12 the bitmap allocator for BlueStore is now
available under Mimic and Luminous.
[osd]
bluestore_allocator = bitmap
bluefs_allocator = bitmap
Before setting this in production: What might the implications be and
what should be thought of?
From what I've read the bi
On 4/15/19 1:13 PM, Alfredo Daniel Rezinovsky wrote:
>
> On 15/4/19 06:54, Jasper Spaans wrote:
>> On 14/04/2019 17:05, Alfredo Daniel Rezinovsky wrote:
>>> autoscale-status reports some of my PG_NUMs are way too big
>>>
>>> I have 256 and need 32
>>>
>>> POOL SIZE TARGET SIZE RA
Hi,
I recently upgraded a cluster to 13.2.5 and right now the RestFul API
won't start due to this stacktrace:
2019-04-15 11:32:18.632 7f8797cb6700 0 mgr[restful] Traceback (most
recent call last):
File "/usr/lib64/ceph/mgr/restful/module.py", line 254, in serve
self._serve()
File "/usr/l
On 4/10/19 9:25 AM, jes...@krogh.cc wrote:
>> On 4/10/19 9:07 AM, Charles Alva wrote:
>>> Hi Ceph Users,
>>>
>>> Is there a way around to minimize rocksdb compacting event so that it
>>> won't use all the spinning disk IO utilization and avoid it being marked
>>> as down due to fail to send hear
On 4/10/19 9:07 AM, Charles Alva wrote:
> Hi Ceph Users,
>
> Is there a way to minimize RocksDB compaction events so that they
> won't use all the spinning disk's IO and to avoid the OSD being marked
> as down due to failing to send heartbeats to others?
>
> Right now we have frequent high I
On 3/8/19 4:17 AM, Pardhiv Karri wrote:
> Hi,
>
> We have a Ceph cluster with rack as the failure domain, but the racks are so
> imbalanced that we are not able to utilize the maximum of the
> storage allocated, as some OSDs in small racks are filling up too fast
> and causing Ceph to go into war
Hi,
I'm trying to do a 'rados cppool' of a RGW index pool and I keep hitting
this error:
.rgw.buckets.index:.dir.default.20674.1 =>
.rgw.buckets.index.new:.dir.default.20674.1
error copying object: (0) Success
error copying pool .rgw.buckets.index => .rgw.buckets.index.new: (5)
Input/output error
n(vol, image, &info) < 0)
> goto cleanup;
> } else {
> vol->target.allocation = info.obj_size * info.num_objs;
> }
> --
>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Wido den Hollander
On 2/28/19 2:59 AM, Glen Baars wrote:
> Hello Ceph Users,
>
> Has anyone found a way to improve the speed of the rbd du command on large
> RBD images? I have object map and fast diff enabled; no invalid flags on the
> image or its snapshots.
>
> We recently upgraded our Ubuntu 16.04 KVM se
On 2/21/19 9:19 PM, Paul Emmerich wrote:
> On Thu, Feb 21, 2019 at 4:05 PM Wido den Hollander wrote:
>> This isn't available in 13.2.4, but should be in 13.2.5, so on Mimic you
>> will need to wait. But this might bite you at some point.
>
> Unfortunately it hasn&
On 2/24/19 4:34 PM, David Turner wrote:
> One thing that's worked for me to get more out of nvmes with Ceph is to
> create multiple partitions on the nvme with an osd on each partition.
> That way you get more osd processes and CPU per nvme device. I've heard
> of people using up to 4 partitions
Hi,
For the last few months I've been getting questions about people seeing
warnings about large OMAP objects after scrubs.
I've been digging for a few months (You'll also find multiple threads
about this) and it all seemed to trace back to RGW indexes.
Resharding didn't clean up old indexes prop
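Later Luminous and Mimic releases grew tooling for exactly this situation; a hedged sketch:

```shell
# List bucket index instances left behind by resharding
radosgw-admin reshard stale-instances list

# Remove them (only safe on non-multisite setups)
radosgw-admin reshard stale-instances rm
```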
On 2/20/19 1:03 PM, Andrés Rojas Guerrero wrote:
> Hi all, sorry, we are newbies in Ceph and we have a newbie question
> about it. We have a Ceph cluster with three mon's and two public networks:
>
> public network = 10.100.100.0/23,10.100.101.0/21
>
> We have seen that ceph-mon are listen in o
On 2/19/19 6:28 PM, Marc Roos wrote:
>
> >>
> >
> >I'm not saying CephFS snapshots are 100% stable, but for certain
> >use-cases they can be.
> >
> >Try to avoid:
> >
> >- Multiple CephFS in same cluster
> >- Snapshot the root (/)
> >- Having a lot of snapshots
>
> How many is a lot?
On 2/19/19 6:00 PM, Balazs Soltesz wrote:
> Hi all,
>
>
>
> I’m experimenting with CephFS as storage to a bitbucket cluster.
>
>
>
> One problem to tackle is replicating the filesystem contents between
> ceph clusters in different sites around the globe.
>
> I’ve read about pool replica
Hi,
Has anybody ever tried, or does anyone know, how safe it is to set
'rados_osd_op_timeout' in an RGW-only situation?
Right now, if one PG becomes inactive or OSDs are super slow the RGW
will start to block at some point since the RADOS operations will never
time out.
Using rados_osd_op_timeout you can
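As a sketch, the timeout would go in the RGW client's ceph.conf section; the client name and the 30-second value are only examples:

```ini
[client.rgw.gateway1]
# Fail RADOS ops after 30s instead of blocking forever on an inactive PG
rados osd op timeout = 30
```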
8GB per OSD
so it will max out at 80GB, leaving 16GB as spare.
As these OSDs were all restarted earlier this week I can't tell how it
will hold up over a longer period. Monitoring (Zabbix) shows the latency
is fine at the moment.
Wido
>
>
> - Mail original -
> De: &qu
On 2/15/19 2:31 PM, Alexandre DERUMIER wrote:
> Thanks Igor.
>
> I'll try to create multiple osds by nvme disk (6TB) to see if behaviour is
> different.
>
> I have other clusters (same ceph.conf), but with 1,6TB drives, and I don't
> see this latency problem.
>
>
Just wanted to chime in, I'v
On 2/14/19 2:08 PM, Dan van der Ster wrote:
> On Thu, Feb 14, 2019 at 12:07 PM Wido den Hollander wrote:
>>
>>
>>
>> On 2/14/19 11:26 AM, Dan van der Ster wrote:
>>> On Thu, Feb 14, 2019 at 11:13 AM Wido den Hollander wrote:
>>>>
On 2/14/19 11:26 AM, Dan van der Ster wrote:
> On Thu, Feb 14, 2019 at 11:13 AM Wido den Hollander wrote:
>>
>> On 2/14/19 10:20 AM, Dan van der Ster wrote:
>>> On Thu., Feb. 14, 2019, 6:17 a.m. Wido den Hollander >>>
>>>> Hi,
>>>>
>
On 2/14/19 10:20 AM, Dan van der Ster wrote:
> On Thu., Feb. 14, 2019, 6:17 a.m. Wido den Hollander >
>> Hi,
>>
>> On a cluster running RGW only I'm running into BlueStore 12.2.11 OSDs
>> being 100% busy sometimes.
>>
>> This cluster has 85k sta
Hi,
On a cluster running RGW only I'm running into BlueStore 12.2.11 OSDs
being 100% busy sometimes.
This cluster has 85k stale indexes (stale-instances list) and I've been
slowly trying to remove them.
I noticed that regularly OSDs read their HDD heavily and that device
then becomes 100% busy.
On 2/14/19 4:40 AM, John Petrini wrote:
> Okay that makes more sense, I didn't realize the WAL functioned in a
> similar manner to filestore journals (though now that I've had another
> read of Sage's blog post, New in Luminous: BlueStore, I notice he does
> cover this). Is this to say that writ
Hi,
I've got a situation where I need to split a Ceph cluster into two.
This cluster is currently running a mix of RBD and RGW and in this case
I am splitting it into two different clusters.
A difficult thing to do, but it's possible.
One problem that stays though is that after the split both C
On 2/8/19 8:38 AM, Ashley Merrick wrote:
> So I was adding a new host using ceph-deploy; for the first OSD I
> accidentally ran it against the hostname of the external IP and not the
> internal network.
>
> I stopped / deleted the OSD from the new host and then re-created the
> OSD using the in
On 2/8/19 8:13 AM, Ashley Merrick wrote:
> I have had issues on a mimic cluster (latest release) where the
> dashboard does not display any read or write ops under the pool's
> section on the main dashboard page.
>
> I have just noticed during restarting the mgr service the following
> shows un