Re: [ceph-users] OSD log being spammed with BlueStore stupidallocator dump

2018-10-15 Thread Wido den Hollander
", some chances that I missed something.. > > > Thanks, > > Igor > > > On 10/15/2018 10:12 PM, Wido den Hollander wrote: >> >> On 10/15/2018 08:23 PM, Gregory Farnum wrote: >>> I don't know anything about the BlueStore code, but given the snippets

Re: [ceph-users] OSD log being spammed with BlueStore stupidallocator dump

2018-10-15 Thread Wido den Hollander
On 10/16/2018 12:04 AM, Igor Fedotov wrote: > > On 10/15/2018 11:47 PM, Wido den Hollander wrote: >> Hi, >> >> On 10/15/2018 10:43 PM, Igor Fedotov wrote: >>> Hi Wido, >>> >>> once you apply the PR you'll probably see the initial er

Re: [ceph-users] OSD log being spammed with BlueStore stupidallocator dump

2018-10-16 Thread Wido den Hollander
On 10/16/18 11:32 AM, Igor Fedotov wrote: > > > On 10/16/2018 6:57 AM, Wido den Hollander wrote: >> >> On 10/16/2018 12:04 AM, Igor Fedotov wrote: >>> On 10/15/2018 11:47 PM, Wido den Hollander wrote: >>>> Hi, >>>> >>

Re: [ceph-users] why set pg_num do not update pgp_num

2018-10-19 Thread Wido den Hollander
On 10/19/18 7:51 AM, xiang@iluvatar.ai wrote: > Hi! > > I use ceph 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic > (stable), and find that: > > When expand whole cluster, i update pg_num, all succeed, but the status > is as below: >   cluster: >     id: 41ef913c-2351-4794-b9ac
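[Editorial note: on pre-Nautilus releases pgp_num is not raised automatically, so the usual follow-up is to set it explicitly to match; a minimal, hedged example with an illustrative pool name:
  ceph osd pool set mypool pg_num 256
  ceph osd pool set mypool pgp_num 256   # data only starts rebalancing once pgp_num follows]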

Re: [ceph-users] Luminous 12.2.3 Changelog ?

2018-02-21 Thread Wido den Hollander
They aren't there yet: http://docs.ceph.com/docs/master/release-notes/ And no Git commit yet: https://github.com/ceph/ceph/commits/master/doc/release-notes.rst I think the Release Manager is doing its best to release them asap. 12.2.3 packages were released this morning :) Wido On 02/21/201

Re: [ceph-users] Luminous 12.2.3 Changelog ?

2018-02-21 Thread Wido den Hollander
On 02/21/2018 01:39 PM, Konstantin Shalygin wrote: Is there any changelog for this release ? https://github.com/ceph/ceph/pull/20503 And this one: https://github.com/ceph/ceph/pull/20500 Wido k ___ ceph-users mailing list ceph-users@lists.

[ceph-users] PG mapped to OSDs on same host although 'chooseleaf type host'

2018-02-22 Thread Wido den Hollander
Hi, I have a situation with a cluster which was recently upgraded to Luminous and has a PG mapped to OSDs on the same host. root@man:~# ceph pg map 1.41 osdmap e21543 pg 1.41 (1.41) -> up [15,7,4] acting [15,7,4] root@man:~# root@man:~# ceph osd find 15|jq -r '.crush_location.host' n02 root@m

Re: [ceph-users] PG mapped to OSDs on same host although 'chooseleaf type host'

2018-02-23 Thread Wido den Hollander
41 I also disabled the balancer for now (will report an issue) and removed all other upmap entries: $ ceph osd dump|grep pg_upmap_items|awk '{print $2}'|xargs -n 1 ceph osd rm-pg-upmap-items Thanks for the hint! Wido mike On Thu, Feb 22, 2018 at 10:28 AM, Wido den Hollander <ma

Re: [ceph-users] /var/lib/ceph/osd/ceph-xxx/current/meta shows "Structure needs cleaning"

2018-03-08 Thread Wido den Hollander
On 03/08/2018 08:01 AM, 赵贺东 wrote: Hi All, Every time after we activate osd, we got “Structure needs cleaning” in /var/lib/ceph/osd/ceph-xxx/current/meta. /var/lib/ceph/osd/ceph-xxx/current/meta # ls -l ls: reading directory .: Structure needs cleaning total 0 Could Anyone say something ab

[ceph-users] 19th April 2018: Ceph/Apache CloudStack day in London

2018-03-08 Thread Wido den Hollander
Hello Ceph (and CloudStack ;-) ) people! Together with the Apache CloudStack [0] project we are organizing a Ceph Day in London on April 19th this year. As there are many users using Apache CloudStack with Ceph as the storage behind their Virtual Machines or using Ceph as a object store in a

Re: [ceph-users] IO rate-limiting with Ceph RBD (and libvirt)

2018-03-21 Thread Wido den Hollander
On 03/21/2018 06:48 PM, Andre Goree wrote: > I'm trying to determine the best way to go about configuring IO > rate-limiting for individual images within an RBD pool. > > Here [1], I've found that OpenStack appears to use Libvirt's "iotune" > parameter, however I seem to recall reading about bei

Re: [ceph-users] What is in the mon leveldb?

2018-03-26 Thread Wido den Hollander
On 03/27/2018 06:40 AM, Tracy Reed wrote: > Hello all, > > It seems I have underprovisioned storage space for my mons and my > /var/lib/ceph/mon filesystem is getting full. When I first started using > ceph this only took up tens of megabytes and I assumed it would stay > that way and 5G for thi
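[Editorial note: the monitors only trim old maps once the cluster is HEALTH_OK again; after that, compaction can reclaim the space. A hedged example (monitor name is illustrative):
  ceph tell mon.a compact
  # or compact automatically at every monitor start, via ceph.conf:
  [mon]
      mon compact on start = true]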

Re: [ceph-users] Requests blocked as cluster is unaware of dead OSDs for quite a long time

2018-03-27 Thread Wido den Hollander
On 03/27/2018 12:58 AM, Jared H wrote: > I have three datacenters with three storage hosts in each, which house > one OSD/MON per host. There are three replicas, one in each datacenter. > I want the cluster to be able to survive a nuke dropped on 1/3 > datacenters, scaling up to 2/5 datacenters.

Re: [ceph-users] What is in the mon leveldb?

2018-03-28 Thread Wido den Hollander
On 03/28/2018 01:34 AM, Tracy Reed wrote: >> health: HEALTH_WARN >> recovery 1230/13361271 objects misplaced (0.009%) >> >> and no recovery is happening. I'm not sure why. This hasn't happened >> before. But the mon db had been growing since long before this >> circumstance. > > Hmm.

Re: [ceph-users] Getting a public file from radosgw

2018-03-28 Thread Wido den Hollander
On 03/28/2018 11:59 AM, Marc Roos wrote: > > > > > > Do you have maybe some pointers, or example? ;) > When you upload using s3cmd try using the -P flag, that will set the public-read ACL. Wido > This XML file does not appear to have any style information associated > with it. The doc
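[Editorial note: a hedged example of both ways to end up with a public-read object via s3cmd (bucket and key are illustrative):
  s3cmd put -P ./file.html s3://test/file.html        # upload with the public-read ACL set
  s3cmd setacl --acl-public s3://test/file.html       # or change the ACL afterwards]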

Re: [ceph-users] Getting a public file from radosgw

2018-03-28 Thread Wido den Hollander
3://test/ > s3://test/ (bucket): >Location: us-east-1 >Payer: BucketOwner >Expiration Rule: none >policy:none >cors: none >ACL: *anon*: READ >ACL: Test1 User: FULL_CONTROL > URL: http://192.168.1.111:7480/test/ >

Re: [ceph-users] Use trimfs on already mounted RBD image

2018-04-04 Thread Wido den Hollander
On 04/04/2018 07:30 PM, Damian Dabrowski wrote: > Hello, > > I wonder if it is any way to run `trimfs` on rbd image which is > currently used by the KVM process? (when I don't have access to VM) > > I know that I can do this by qemu-guest-agent but not all VMs have it > installed. > > I can't

Re: [ceph-users] ceph-deploy: recommended?

2018-04-04 Thread Wido den Hollander
On 04/04/2018 08:58 PM, Robert Stanford wrote: > >  I read a couple of versions ago that ceph-deploy was not recommended > for production clusters.  Why was that?  Is this still the case?  We > have a lot of problems automating deployment without ceph-deploy. > > In the end it is just a Pytho

Re: [ceph-users] Admin socket on a pure client: is it possible?

2018-04-09 Thread Wido den Hollander
On 04/09/2018 04:01 PM, Fulvio Galeazzi wrote: > Hallo, > >   I am wondering whether I could have the admin socket functionality > enabled on a server which is a pure Ceph client (no MDS/MON/OSD/whatever > running on such server). Is this at all possible? How should ceph.conf > be configured? Do
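[Editorial note: a pure client can expose an admin socket too; a hedged ceph.conf sketch (the socket directory must be writable by the client process, paths are illustrative):
  [client]
      admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
      log file = /var/log/ceph/$cluster-$type.$id.log
  # then query the resulting socket, e.g.:
  ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.67890.asok perf dump]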

[ceph-users] ceph-fuse CPU and Memory usage vs CephFS kclient

2018-04-10 Thread Wido den Hollander
Hi, There have been numerous threads about this in the past, but I wanted to bring this up again in a new situation. Running with Luminous v12.2.4 I'm seeing some odd Memory and CPU usage when using the ceph-fuse client to mount a multi-MDS CephFS filesystem. health: HEALTH_OK services:

Re: [ceph-users] ceph-fuse CPU and Memory usage vs CephFS kclient

2018-04-10 Thread Wido den Hollander
On 04/10/2018 09:22 PM, Gregory Farnum wrote: > On Tue, Apr 10, 2018 at 6:32 AM Wido den Hollander <mailto:w...@42on.com>> wrote: > > Hi, > > There have been numerous threads about this in the past, but I wanted to > bring this up again in a new situa

Re: [ceph-users] ceph-fuse CPU and Memory usage vs CephFS kclient

2018-04-10 Thread Wido den Hollander
On 04/10/2018 09:45 PM, Gregory Farnum wrote: > On Tue, Apr 10, 2018 at 12:36 PM, Wido den Hollander wrote: >> >> >> On 04/10/2018 09:22 PM, Gregory Farnum wrote: >>> On Tue, Apr 10, 2018 at 6:32 AM Wido den Hollander >> <mailto:w...@42on.com>> wrote

Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-12 Thread Wido den Hollander
On 04/12/2018 04:36 AM, ? ?? wrote: > Hi,  > > For anybody who may be interested, here I share a process of locating the > reason for ceph cluster performance slow down in our environment. > > Internally, we have a cluster with capacity 1.1PB, used 800TB, and raw user > data is about 500TB. E

Re: [ceph-users] London Ceph day yesterday

2018-04-23 Thread Wido den Hollander
On 04/23/2018 12:09 PM, John Spray wrote: > On Fri, Apr 20, 2018 at 9:32 AM, Sean Purdy wrote: >> Just a quick note to say thanks for organising the London Ceph/OpenStack >> day. I got a lot out of it, and it was nice to see the community out in >> force. > > +1, thanks for Wido and the Shap

Re: [ceph-users] Questions regarding hardware design of an SSD only cluster

2018-04-24 Thread Wido den Hollander
On 04/24/2018 05:01 AM, Mohamad Gebai wrote: > > > On 04/23/2018 09:24 PM, Christian Balzer wrote: >> >>> If anyone has some ideas/thoughts/pointers, I would be glad to hear them. >>> >> RAM, you'll need a lot of it, even more with Bluestore given the current >> caching. >> I'd say 1GB per TB

[ceph-users] Collecting BlueStore per Object DB overhead

2018-04-26 Thread Wido den Hollander
Hi, I've been investigating the per object overhead for BlueStore as I've seen this has become a topic for a lot of people who want to store a lot of small objects in Ceph using BlueStore. I've written a piece of Python code which can be run on a server running OSDs and will print the overhead.
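[Editorial note: the script referenced above is not reproduced in this archive; a hedged sketch of the same idea, with counter names assumed from Luminous-era BlueStore/BlueFS perf dumps:
  #!/usr/bin/env python
  # Hedged sketch, not the script from this thread: print BlueFS/RocksDB usage
  # per OSD on this host. Divide db_used_bytes by the OSD's object count
  # (e.g. from 'ceph pg ls-by-osd') for a rough per-object DB overhead.
  import glob
  import json
  import subprocess

  for sock in sorted(glob.glob('/var/run/ceph/ceph-osd.*.asok')):
      osd_id = sock.split('.')[1]  # socket is named ceph-osd.<id>.asok
      perf = json.loads(subprocess.check_output(
          ['ceph', '--admin-daemon', sock, 'perf', 'dump']))
      bluefs = perf.get('bluefs', {})
      print('osd.%s db_used_bytes=%s slow_used_bytes=%s' %
            (osd_id, bluefs.get('db_used_bytes'), bluefs.get('slow_used_bytes')))]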

Re: [ceph-users] trimming the MON level db

2018-04-28 Thread Wido den Hollander
On 04/27/2018 08:31 PM, David Turner wrote: > I'm assuming that the "very bad move" means that you have some PGs not > in active+clean.  Any non-active+clean PG will prevent your mons from > being able to compact their db store.  This is by design so that if > something were to happen where the d

Re: [ceph-users] Collecting BlueStore per Object DB overhead

2018-04-30 Thread Wido den Hollander
On 04/30/2018 10:25 PM, Gregory Farnum wrote: > > > On Thu, Apr 26, 2018 at 11:36 AM Wido den Hollander <mailto:w...@42on.com>> wrote: > > Hi, > > I've been investigating the per object overhead for BlueStore as I've > seen this has bec

[ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-01 Thread Wido den Hollander
Hi, I've been trying to get the lowest latency possible out of the new Xeon Scalable CPUs and so far I got down to 1.3ms with the help of Nick. However, I can't seem to pin the CPUs to always run at their maximum frequency. If I disable power saving in the BIOS they stay at 2.1Ghz (Silver 4110),
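[Editorial note: a hedged sketch of the knobs discussed later in this thread (paths assume the intel_pstate driver; C-state limits are usually set on the kernel command line, e.g. intel_idle.max_cstate=1):
  cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver        # intel_pstate or acpi-cpufreq?
  cpupower frequency-set -g performance                          # performance governor on all cores
  echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct   # keep intel_pstate at maximum frequency]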

Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Wido den Hollander
On 05/07/2018 04:53 PM, Reed Dier wrote: > I’ll +1 on InfluxDB rather than Prometheus, though I think having a version > for each infrastructure path would be best. > I’m sure plenty here have existing InfluxDB infrastructure as their TSDB of > choice, and moving to Prometheus would be less adv

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-14 Thread Wido den Hollander
clocking down to 800MHz. I've set scaling_min_freq=scaling_max_freq in /sys, but that doesn't change a thing. The CPUs keep scaling down. Still not close to the 1ms latency with these CPUs :( Wido > > -Original Message- > From: ceph-users On Behalf Of Blair > Bethwaite >

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-15 Thread Wido den Hollander
having some issues with getting intel_pstate loaded, but with 4.16 it loaded without any problems, but still, CPUs keep clocking down. Wido > > > -Original Message- > From: ceph-users On Behalf Of Wido den > Hollander > Sent: 14 May 2018 14:14 > To: n...@fisk

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-15 Thread Wido den Hollander
On 05/15/2018 02:51 PM, Blair Bethwaite wrote: > Sorry, bit late to get back to this... > > On Wed., 2 May 2018, 06:19 Nick Fisk, > wrote: > > 4.16 required? > > > Looks like it - thanks for pointing that out. > > Wido, I don't think you are doing anything wrong

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-16 Thread Wido den Hollander
On 05/16/2018 01:22 PM, Blair Bethwaite wrote: > On 15 May 2018 at 08:45, Wido den Hollander <mailto:w...@42on.com>> wrote: > > > We've got some Skylake Ubuntu based hypervisors that we can look at to > > compare tomorrow... > > > &g

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-17 Thread Wido den Hollander
On 05/16/2018 03:34 PM, Wido den Hollander wrote: > > > On 05/16/2018 01:22 PM, Blair Bethwaite wrote: >> On 15 May 2018 at 08:45, Wido den Hollander > <mailto:w...@42on.com>> wrote: >> >> > We've got some Skylake Ubuntu based hypervisors t

Re: [ceph-users] A question about HEALTH_WARN and monitors holding onto cluster maps

2018-05-17 Thread Wido den Hollander
On 05/17/2018 04:37 PM, Thomas Byrne - UKRI STFC wrote: > Hi all, > >   > > As far as I understand, the monitor stores will grow while not HEALTH_OK > as they hold onto all cluster maps. Is this true for all HEALTH_WARN > reasons? Our cluster recently went into HEALTH_WARN due to a few weeks >

Re: [ceph-users] Data recovery after loosing all monitors

2018-05-22 Thread Wido den Hollander
On 05/22/2018 03:38 PM, George Shuklin wrote: > Good news, it's not an emergency, just a curiosity. > > Suppose I lost all monitors in a ceph cluster in my laboratory. I have > all OSDs intact. Is it possible to recover something from Ceph? Yes, there is. Using ceph-objectstore-tool you are abl
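[Editorial note: a hedged sketch of the documented monitor-store recovery from OSDs (paths and OSD id are illustrative; repeat for every OSD, then rebuild and install the store as described in the monitor troubleshooting docs):
  ms=/tmp/mon-store && mkdir -p $ms
  systemctl stop ceph-osd@0
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
      --op update-mon-db --mon-store-path $ms
  # afterwards: ceph-monstore-tool $ms rebuild -- --keyring <admin keyring>]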

Re: [ceph-users] Stop scrubbing

2018-06-06 Thread Wido den Hollander
On 06/06/2018 08:32 PM, Joe Comeau wrote: > When I am upgrading from filestore to bluestore > or any other server maintenance for a short time > (ie high I/O while rebuilding) >   > ceph osd set noout > ceph osd set noscrub > ceph osd set nodeep-scrub >   > when finished >   > ceph osd unset nos

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Wido den Hollander
On 06/07/2018 09:46 AM, Kevin Olbrich wrote: > Hi! > > When we installed our new luminous cluster, we had issues with the > cluster network (setup of mon's failed). > We moved on with a single network setup. > > Now I would like to set the cluster network again but the cluster is in > use (4 no

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Wido den Hollander
it. Keep it simple, one network to run the cluster on. Less components which can fail or complicate things. Wido > Thank you. > > Kevin > > 2018-06-07 10:44 GMT+02:00 Wido den Hollander <mailto:w...@42on.com>>: > > > > On 06/07/2018 09:46 AM, K

Re: [ceph-users] Adding cluster network to running cluster

2018-06-07 Thread Wido den Hollander
On 06/07/2018 01:39 PM, mj wrote: > Hi, > > Please allow me to ask one more question: > > We currently have a seperated network: cluster on 10.10.x.x and public > on 192.168.x.x. > > I would like to migrate all network to 192.168.x.x setup, which would > give us 2*10G. > > Is simply changing

Re: [ceph-users] Ceph bonding vs separate provate public network

2018-06-12 Thread Wido den Hollander
On 06/12/2018 02:00 PM, Steven Vacaroaia wrote: > Hi, > > I am designing a new ceph cluster and was wondering whether I should > bond the 10 GB adapters or use one for public one for private > > The advantage of bonding is simplicity and, maybe, performance  > The catch though is that I cannot

Re: [ceph-users] Journal flushed on osd clean shutdown?

2018-06-13 Thread Wido den Hollander
On 06/13/2018 11:39 AM, Chris Dunlop wrote: > Hi, > > Is the osd journal flushed completely on a clean shutdown? > > In this case, with Jewel, and FileStore osds, and a "clean shutdown" being: > It is, a Jewel OSD will flush its journal on a clean shutdown. The flush-journal is no longer ne

Re: [ceph-users] PM1633a

2018-06-16 Thread Wido den Hollander
On 06/15/2018 09:02 PM, Brian : wrote: > Hello List - anyone using these drives and have any good / bad things > to say about them? > Not really experience with them. I was about to order them in a SuperMicro chassis which supports SAS3 but then I found that the PM963a NVMe disks have the sam

Re: [ceph-users] Planning all flash cluster

2018-06-20 Thread Wido den Hollander
On 06/20/2018 02:00 PM, Robert Sander wrote: > On 20.06.2018 13:58, Nick A wrote: > >> We'll probably add another 2 OSD drives per month per node until full >> (24 SSD's per node), at which point, more nodes. > > I would add more nodes earlier to achieve better overall performance. Exactly. No

Re: [ceph-users] "ceph pg scrub" does not start

2018-06-21 Thread Wido den Hollander
On 06/21/2018 11:11 AM, Jake Grimmett wrote: > Dear All, > > A bad disk controller appears to have damaged our cluster... > > # ceph health > HEALTH_ERR 10 scrub errors; Possible data damage: 10 pgs inconsistent > > probing to find bad pg... > > # ceph health detail > HEALTH_ERR 10 scrub err
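[Editorial note: the usual inspection/repair sequence for inconsistent PGs, hedged (the PG id is a placeholder; inspect before repairing, since on replicated FileStore pools repair copies from the primary):
  ceph health detail | grep inconsistent
  rados list-inconsistent-obj 1.23 --format=json-pretty
  ceph pg repair 1.23]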

Re: [ceph-users] Designating an OSD as a spare

2018-06-21 Thread Wido den Hollander
On 06/21/2018 03:35 PM, Drew Weaver wrote: > Yes, > >   > > Eventually however you would probably want to replace that physical disk > that has died and sometimes with remote deployments it is nice to not > have to do that instantly which is how enterprise arrays and support > contracts have wo

Re: [ceph-users] bluestore upgrade 11.0.2 to 11.1.1 failed

2017-01-11 Thread Wido den Hollander
> Op 11 januari 2017 om 12:24 schreef Jayaram R : > > > Hello, > > > > We from Nokia are validating bluestore on 3 node cluster with EC 2+1 > > > > While upgrading our cluster from Kraken 11.0.2 to 11.1.1 with bluestore, > the cluster affected more than half of the OSDs went down. > Yes

Re: [ceph-users] Write back cache removal

2017-01-12 Thread Wido den Hollander
> Op 10 januari 2017 om 22:05 schreef Nick Fisk : > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Stuart Harland > Sent: 10 January 2017 11:58 > To: Wido den Hollander > Cc: ceph new ; n...@fisk.me.uk > Subject: Re: [ceph-user

Re: [ceph-users] PGs of EC pool stuck in peering state

2017-01-12 Thread Wido den Hollander
> Op 12 januari 2017 om 12:37 schreef george.vasilaka...@stfc.ac.uk: > > > Hi Ceph folks, > > I’ve just posted a bug report http://tracker.ceph.com/issues/18508 > So I debugged this a bit with George and after switching from async messenger back to simple messenger the problems are gone. S

Re: [ceph-users] HEALTH_OK when one server crashed?

2017-01-12 Thread Wido den Hollander
> Op 12 januari 2017 om 15:35 schreef Matthew Vernon : > > > Hi, > > One of our ceph servers froze this morning (no idea why, alas). Ceph > noticed, moved things around, and when I ran ceph -s, said: > > root@sto-1-1:~# ceph -s > cluster 049fc780-8998-45a8-be12-d3b8b6f30e69 > health H

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 18:18 schreef Mohammed Naser : > > > Hi everyone, > > We have a deployment with 90 OSDs at the moment which is all SSD that’s not > hitting quite the performance that it should be in my opinion, a `rados > bench` run gives something along these numbers: > > Maintainin

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 18:39 schreef Mohammed Naser : > > > > > On Jan 13, 2017, at 12:37 PM, Wido den Hollander wrote: > > > > > >> Op 13 januari 2017 om 18:18 schreef Mohammed Naser : > >> > >> > >> Hi everyone, >

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 18:50 schreef Mohammed Naser : > > > > > On Jan 13, 2017, at 12:41 PM, Wido den Hollander wrote: > > > > > >> Op 13 januari 2017 om 18:39 schreef Mohammed Naser : > >> > >> > >> > >>> On

Re: [ceph-users] rgw leaking data, orphan search loop

2017-01-13 Thread Wido den Hollander
> Op 24 december 2016 om 13:47 schreef Wido den Hollander : > > > > > Op 23 december 2016 om 16:05 schreef Wido den Hollander : > > > > > > > > > Op 22 december 2016 om 19:00 schreef Orit Wasserman : > > > > > > > &g

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Wido den Hollander
> Op 13 januari 2017 om 20:33 schreef Mohammed Naser : > > > > > On Jan 13, 2017, at 1:34 PM, Wido den Hollander wrote: > > > >> > >> Op 13 januari 2017 om 18:50 schreef Mohammed Naser : > >> > >> > >> > >>>

Re: [ceph-users] All SSD cluster performance

2017-01-14 Thread Wido den Hollander
e: > > > > > > Also, there are lot of discussion about SSDs not suitable for Ceph write > > > workload (with filestore) in community as those are not good for > > > odirect/odsync kind of writes. Hope your SSDs are tolerant of that. > > > > > &

Re: [ceph-users] Change Partition Schema on OSD Possible?

2017-01-14 Thread Wido den Hollander
> Op 14 januari 2017 om 11:05 schreef Hauke Homburg : > > > Hello, > > In our Ceph Cluster are our HDD in the OSD with 50% DATA in GPT > Partitions configured. Can we change this Schema to have more Data Storage? > How do you mean? > Our HDD are 5TB so i hope to have more Space when i change

Re: [ceph-users] rgw static website docs 404

2017-01-19 Thread Wido den Hollander
> Op 19 januari 2017 om 2:57 schreef Ben Hines : > > > Aha! Found some docs here in the RHCS site: > > https://access.redhat.com/documentation/en/red-hat-ceph-storage/2/paged/object-gateway-guide-for-red-hat-enterprise-linux/chapter-2-configuration > > Really, ceph.com should have all this too

Re: [ceph-users] rgw static website docs 404

2017-01-20 Thread Wido den Hollander
dev didn't want to write docs, he/she forgot or just didn't get to it yet. It would be very much appreciated if you would send a PR with the updated documentation :) Wido > -Ben > > On Thu, Jan 19, 2017 at 1:56 AM, Wido den Hollander wrote: > > > > > > Op 19

Re: [ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Wido den Hollander
> Op 20 januari 2017 om 17:17 schreef Kai Storbeck : > > > Hello ceph users, > > My graphs of several counters in our Ceph cluster are showing abnormal > behaviour after changing the pg_num and pgp_num respectively. What counters exactly? Like pg information? It could be that it needs a scrub

Re: [ceph-users] Replacing an mds server

2017-01-24 Thread Wido den Hollander
> Op 24 januari 2017 om 22:08 schreef Goncalo Borges > : > > > Hi Jorge > Indeed my advice is to configure your high memory mds as a standby mds. Once > you restart the service in the low memory mds, the standby one should take > over without downtime and the first one becomes the standby one

[ceph-users] systemd and ceph-mon autostart on Ubuntu 16.04

2017-01-25 Thread Wido den Hollander
Hi, I thought this issue was resolved a while ago, but while testing Kraken with BlueStore I ran into the problem again. My monitors are not being started on boot: Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-59-generic x86_64) * Documentation: https://help.ubuntu.com * Management: ht
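[Editorial note: the units that normally have to be enabled for a monitor to start at boot (the instance name is usually the short hostname):
  systemctl enable ceph.target ceph-mon.target
  systemctl enable ceph-mon@$(hostname -s)
  systemctl is-enabled ceph-mon@$(hostname -s)]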

Re: [ceph-users] systemd and ceph-mon autostart on Ubuntu 16.04

2017-01-25 Thread Wido den Hollander
> Op 25 januari 2017 om 20:25 schreef Patrick Donnelly : > > > On Wed, Jan 25, 2017 at 2:19 PM, Wido den Hollander wrote: > > Hi, > > > > I thought this issue was resolved a while ago, but while testing Kraken > > with BlueStore I ran into the problem agai

Re: [ceph-users] ceph rados gw, select objects by metadata

2017-01-30 Thread Wido den Hollander
> Op 30 januari 2017 om 10:29 schreef Johann Schwarzmeier > : > > > Hello, > I’m quite new to ceph and radosgw. With the python API, I found calls > for writing objects via boto API. It’s also possible to add metadata’s > to our objects. But now I have a question: is it possible to select or

Re: [ceph-users] mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail

2017-01-31 Thread Wido den Hollander
> Op 31 januari 2017 om 10:22 schreef Martin Palma : > > > Hi all, > > our cluster is currently performing a big expansion and is in recovery > mode (we doubled in size and osd# from 600 TB to 1,2 TB). > Yes, that is to be expected. When not all PGs are active+clean the MONs will not trim th

Re: [ceph-users] [Ceph-mirrors] rsync service download.ceph.com partially broken

2017-01-31 Thread Wido den Hollander
> Op 31 januari 2017 om 13:46 schreef Björn Lässig : > > > Hi cephers, > > since some time i get errors while rsyncing from the ceph download server: > > download.ceph.com: > > rsync: send_files failed to open "/debian-jewel/db/lockfile" (in ceph): > Permission denied (13) > "/debian-jewel/

Re: [ceph-users] CephFS read IO caching, where it is happining?

2017-02-02 Thread Wido den Hollander
> Op 2 februari 2017 om 15:35 schreef Ahmed Khuraidah : > > > Hi all, > > I am still confused about my CephFS sandbox. > > When I am performing simple FIO test into single file with size of 3G I > have too many IOps: > > cephnode:~ # fio payloadrandread64k3G > test: (g=0): rw=randread, bs=64K

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-03 Thread Wido den Hollander
> Op 3 februari 2017 om 8:39 schreef Christian Balzer : > > > > Hello, > > On Fri, 3 Feb 2017 10:30:28 +0300 Irek Fasikhov wrote: > > > Hi, Maxime. > > > > Linux SMR is only starting with version 4.9 kernel. > > > What Irek said. > > Also, SMR in general is probably a bad match for Ceph. >

Re: [ceph-users] RGW authentication fail with AWS S3 v4

2017-02-03 Thread Wido den Hollander
> Op 3 februari 2017 om 9:52 schreef Khang Nguyễn Nhật > : > > > Hi all, > I'm using Ceph Object Gateway with S3 API (ceph-radosgw-10.2.5-0.el7.x86_64 > on CentOS Linux release 7.3.1611) and I use generate_presigned_url method > of boto3 to create rgw url. This url working fine in period of 15

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-03 Thread Wido den Hollander
Intel DC series. All pools by default should go to those OSDs. Only the RGW buckets data pool should go to the big SMR drives. However, again, expect very, very low performance of those disks. Wido > Cheers, > Maxime > > On 03/02/17 09:40, "ceph-users on behalf of Wido den

Re: [ceph-users] CephFS read IO caching, where it is happining?

2017-02-03 Thread Wido den Hollander
, Feb 2, 2017 at 9:30 PM, Shinobu Kinjo wrote: > > > You may want to add this in your FIO recipe. > > > > * exec_prerun=echo 3 > /proc/sys/vm/drop_caches > > > > Regards, > > > > On Fri, Feb 3, 2017 at 12:36 AM, Wido den Hollander wrote: > >

Re: [ceph-users] ceph df : negative numbers

2017-02-06 Thread Wido den Hollander
> Op 6 februari 2017 om 11:10 schreef Florent B : > > > # ceph -v > ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367) > > (officiel Ceph packages for Jessie) > > > Yes I recently adjusted pg_num, but all objects were correctly rebalanced. > > Then a manually deleted some objects

Re: [ceph-users] would people mind a slow osd restart during luminous upgrade?

2017-02-08 Thread Wido den Hollander
> Op 9 februari 2017 om 4:09 schreef Sage Weil : > > > Hello, ceph operators... > > Several times in the past we've had to do some ondisk format conversion > during upgrade which mean that the first time the ceph-osd daemon started > after upgrade it had to spend a few minutes fixing up it's

Re: [ceph-users] Speeding Up "rbd ls -l " output

2017-02-09 Thread Wido den Hollander
> Op 9 februari 2017 om 9:13 schreef Özhan Rüzgar Karaman > : > > > Hi; > I am using Hammer 0.94.9 release on my Ceph Storage, today i noticed that > listing an rbd pool takes to much time then the old days. If i have more > rbd images on pool it takes much more time. > It is the -l flag that

Re: [ceph-users] Speeding Up "rbd ls -l " output

2017-02-09 Thread Wido den Hollander
33c1-9389-8bf226c887e8 > Pool 01b375db-d3f5-33c1-9389-8bf226c887e8 refreshed > > > real 0m22.504s > user 0m0.012s > sys 0m0.004s > > Thanks > Özhan > > > On Thu, Feb 9, 2017 at 11:30 AM, Wido den Hollander wrote: > > > > > > Op 9 februari 2017

Re: [ceph-users] Radosgw scaling recommendation?

2017-02-09 Thread Wido den Hollander
> Op 9 februari 2017 om 19:34 schreef Mark Nelson : > > > I'm not really an RGW expert, but I'd suggest increasing the > "rgw_thread_pool_size" option to something much higher than the default > 100 threads if you haven't already. RGW requires at least 1 thread per > client connection, so wi
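[Editorial note: a hedged ceph.conf sketch of the tuning mentioned here (section name and values are illustrative; with the civetweb frontend the frontend thread count matters as well):
  [client.rgw.gateway1]
      rgw thread pool size = 512
      rgw frontends = civetweb port=7480 num_threads=512]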

Re: [ceph-users] CephFS root squash?

2017-02-10 Thread Wido den Hollander
> Op 10 februari 2017 om 9:02 schreef Robert Sander > : > > > On 09.02.2017 20:11, Jim Kilborn wrote: > > > I am trying to figure out how to allow my users to have sudo on their > > workstation, but not have that root access to the ceph kernel mounted > > volume. > > I do not think that Cep

Re: [ceph-users] - permission denied on journal after reboot

2017-02-13 Thread Wido den Hollander
> Op 13 februari 2017 om 12:06 schreef Piotr Dzionek : > > > Hi, > > I am running ceph Jewel 10.2.5 with separate journals - ssd disks. It > runs pretty smooth, however I stumble upon an issue after system reboot. > Journal disks become owned by root and ceph failed to start. > > /starting o

Re: [ceph-users] OSDs cannot match up with fast OSD map changes (epochs) during recovery

2017-02-13 Thread Wido den Hollander
> Op 13 februari 2017 om 12:57 schreef Muthusamy Muthiah > : > > > Hi All, > > We also have same issue on one of our platforms which was upgraded from > 11.0.2 to 11.2.0 . The issue occurs on one node alone where CPU hits 100% > and OSDs of that node marked down. Issue not seen on cluster whic

[ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Wido den Hollander
Hi, I have an odd case with SMR disks in a Ceph cluster. Before I continue, yes, I am fully aware of SMR and Ceph not playing along well, but there is something happening which I'm not able to fully explain. On a 2x replica cluster with 8TB Seagate SMR disks I can write with about 30MB/sec to e

Re: [ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Wido den Hollander
't aware that SMR disks have that. SMR shouldn't be used in Ceph without proper support in BlueStore or an SMR-aware XFS. Wido > > On 02/13/17 15:49, Wido den Hollander wrote: > > Hi, > > > > I have an odd case with SMR disks in a Ceph cluster. Before I continue, yes,

Re: [ceph-users] 1 PG stuck unclean (active+remapped) after OSD replacement

2017-02-13 Thread Wido den Hollander
> Op 13 februari 2017 om 16:03 schreef Eugen Block : > > > Hi experts, > > I have a strange situation right now. We are re-organizing our 4 node > Hammer cluster from LVM-based OSDs to HDDs. When we did this on the > first node last week, everything went smoothly, I removed the OSDs > fro

Re: [ceph-users] - permission denied on journal after reboot

2017-02-13 Thread Wido den Hollander
GROUP="ceph", MODE="660" Wido > > > Udo > > Am 2017-02-13 16:13, schrieb Piotr Dzionek: > > I run it on CentOS Linux release 7.3.1611. After running "udevadm test > > /sys/block/sda/sda1" I don't see that this rule apply to this disk.

Re: [ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Wido den Hollander
any > problems as the XFS journal is rewritten often and SMR disks don't like > rewrites. > I think that is one reason why btrfs works smoother with those disks. > > Hope this helps > > Bernhard > > Wido den Hollander schrieb am Mo., 13. Feb. 2017 um > 16:

Re: [ceph-users] bcache vs flashcache vs cache tiering

2017-02-14 Thread Wido den Hollander
> Op 14 februari 2017 om 11:14 schreef Nick Fisk : > > > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > > Dongsheng Yang > > Sent: 14 February 2017 09:01 > > To: Sage Weil > > Cc: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com

Re: [ceph-users] kraken-bluestore 11.2.0 memory leak issue

2017-02-16 Thread Wido den Hollander
> Op 16 februari 2017 om 7:19 schreef Muthusamy Muthiah > : > > > Thanks IIya Letkowski for the information we will change this value > accordingly. > What I understand from yesterday's performance meeting is that this seems like a bug. Lowering this buffer reduces memory, but the root-cause

Re: [ceph-users] KVM/QEMU rbd read latency

2017-02-16 Thread Wido den Hollander
> Op 16 februari 2017 om 21:38 schreef Steve Taylor > : > > > You might try running fio directly on the host using the rbd ioengine (direct > librbd) and see how that compares. The major difference between that and the > krbd test will be the page cache readahead, which will be present in the
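[Editorial note: a hedged fio job file for the librbd engine mentioned above (pool, image and client names are placeholders):
  [rbd-randread]
  ioengine=rbd
  clientname=admin
  pool=rbd
  rbdname=testimage
  rw=randread
  bs=4k
  iodepth=32
  time_based=1
  runtime=60]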

Re: [ceph-users] PG stuck peering after host reboot

2017-02-16 Thread Wido den Hollander
> Op 16 februari 2017 om 14:55 schreef george.vasilaka...@stfc.ac.uk: > > > Hi folks, > > I have just made a tracker for this issue: > http://tracker.ceph.com/issues/18960 > I used ceph-post-file to upload some logs from the primary OSD for the > troubled PG. > > Any help would be appreciate

Re: [ceph-users] PG stuck peering after host reboot

2017-02-17 Thread Wido den Hollander
that osd.307 is on the same host as osd.595. > > We’ll have a look on osd.595 like you suggested. > If the PG still doesn't recover do the same on osd.307 as I think that 'ceph pg X query' still hangs? The info from ceph-objectstore-tool might shed some more light on

Re: [ceph-users] Disable debug logging: best practice or not?

2017-02-17 Thread Wido den Hollander
> Op 17 februari 2017 om 17:44 schreef Kostis Fardelas : > > > Hi, > I keep reading recommendations about disabling debug logging in Ceph > in order to improve performance. There are two things that are unclear > to me though: > > a. what do we lose if we decrease default debug logging and wher
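[Editorial note: a commonly cited, hedged set of overrides; the 0/0 form disables both the log level and the in-memory gather level, and those in-memory logs are what you give up for post-mortem debugging:
  [global]
      debug ms = 0/0
      debug osd = 0/0
      debug filestore = 0/0
      debug journal = 0/0
      debug monc = 0/0
  # or at runtime:
  ceph tell osd.* injectargs '--debug-ms 0/0 --debug-osd 0/0']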

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-19 Thread Wido den Hollander
> > now, I would strongly recommend against SMR. > > > > Go for normal SATA drives with only slightly higher price/capacity ratios. > > > > - mike > > > >> On 2/3/17 2:46 PM, Stillwell, Bryan J wrote: > >> On 2/3/17, 3:23 AM, "c

Re: [ceph-users] PG stuck peering after host reboot

2017-02-21 Thread Wido den Hollander
> Op 20 februari 2017 om 17:52 schreef george.vasilaka...@stfc.ac.uk: > > > Hi Wido, > > Just to make sure I have everything straight, > > > If the PG still doesn't recover do the same on osd.307 as I think that > > 'ceph pg X query' still hangs? > > > The info from ceph-objectstore-tool mig

Re: [ceph-users] PG stuck peering after host reboot

2017-02-22 Thread Wido den Hollander
> Op 21 februari 2017 om 15:35 schreef george.vasilaka...@stfc.ac.uk: > > > I have noticed something odd with the ceph-objectstore-tool command: > > It always reports PG X not found even on healthy OSDs/PGs. The 'list' op > works on both healthy and unhealthy PGs. > Are you sure you are supplying t
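[Editorial note: one common cause of the "PG not found" message on EC pools is omitting the shard suffix in --pgid; a hedged example with a placeholder PG id (the OSD must be stopped first):
  systemctl stop ceph-osd@307
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 \
      --pgid 1.323s0 --op info]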

Re: [ceph-users] PG stuck peering after host reboot

2017-02-22 Thread Wido den Hollander
empty is 1. The other OSDs are reporting last_epoch_started 16806 and empty 0. My EC PG knowledge is not sufficient here to exactly tell you what is going on, but that's the only thing I noticed so far. If you stop osd.307 and maybe mark it as out, does that help? Wido >

Re: [ceph-users] mgr active s01 reboot

2017-02-22 Thread Wido den Hollander
le by ceph-deploy. Check that first if the local user is allowed to write to that file. > Must we first umount the filesystem? No, not required. Wido > > Regards, Arnoud. > From: Wido den Hollander [w...@42on.com] > Sent: Wednesday, February 22, 2017 2:10 PM >

Re: [ceph-users] PG stuck peering after host reboot

2017-02-24 Thread Wido den Hollander
empty is 1. The other OSDs are reporting > last_epoch_started 16806 and empty 0. > > I noticed that too and was wondering why it never completed recovery and > joined > > > If you stop osd.307 and maybe mark it as out, does that help? > > No, I see the same

Re: [ceph-users] ceph-disk and mkfs.xfs are hanging on SAS SSD

2017-02-24 Thread Wido den Hollander
> Op 24 februari 2017 om 9:12 schreef Rajesh Kumar : > > > Hi, > > I am using Ceph Jewel on Ubuntu 16.04 Xenial, with SAS SSD and > driver=megaraid_sas > > > "/usr/bin/python /usr/sbin/ceph-disk prepare --osd-uuid --fs-type xfs > /dev/sda3" is hanging. This command is start "mkfs.xfs -f -i

Re: [ceph-users] Can Cloudstack really be HA when using CEPH?

2017-02-25 Thread Wido den Hollander
> Op 24 februari 2017 om 19:48 schreef Adam Carheden : > > > From the docs for each project: > > "When a primary storage outage occurs the hypervisor immediately stops > all VMs stored on that storage > device"http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.8/reliabili

Re: [ceph-users] Can Cloudstack really be HA when using CEPH?

2017-02-25 Thread Wido den Hollander
g to update CloudStack's configuration. Wido > > On Feb 25, 2017 6:56 AM, "Wido den Hollander" wrote: > > > > Op 24 februari 2017 om 19:48 schreef Adam Carheden < > adam.carhe...@gmail.com>: > > > > > > From the docs for each project
