Re: [ceph-users] ceph-deploy: recommended?

2018-04-06 Thread David Turner
I looked through the backlog for ceph-deploy. It has some pretty intense stuff, including bugs for random environments that aren't Ubuntu or RedHat/CentOS. Not really something I could manage in my off time. On Thu, Apr 5, 2018 at 2:15 PM wrote: > ... we use

[ceph-users] "unable to connect to cluster" after monitor IP change

2018-04-06 Thread Nathan Dehnel
gentooserver ~ # ceph-mon -i mon0 --extract-monmap /tmp/monmap
2018-04-06 15:38:10.863444 7f8aa2b72f80 -1 wrote monmap to /tmp/monmap
gentooserver ~ # monmaptool --print /tmp/monmap
monmaptool: monmap file /tmp/monmap
epoch 3
fsid a736559a-92d1-483e-9289-d2c7feed510f
last_changed 2018-04-06
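For reference, a minimal sketch of the usual monmap-edit procedure after a monitor IP change, assuming the mon is named mon0 as above; the new address is a placeholder, and the monitor must be stopped before injecting:

    systemctl stop ceph-mon@mon0.service
    monmaptool --rm mon0 /tmp/monmap
    monmaptool --add mon0 192.0.2.10:6789 /tmp/monmap   # placeholder new IP
    ceph-mon -i mon0 --inject-monmap /tmp/monmap
    systemctl start ceph-mon@mon0.service

The mon host / mon initial members entries in ceph.conf also need to point at the new address, otherwise clients will keep trying the old one.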

Re: [ceph-users] bluestore OSD did not start at system-boot

2018-04-06 Thread David Turner
`systemctl list-dependencies ceph.target` Do you have ceph-osd.target listed underneath it with all of your OSDs under that? My guess is that you just need to enable them via systemctl so systemd manages them. `systemctl enable ceph-osd@${osd}.service` where $osd is the osd number to be enabled. For
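A minimal sketch of those commands; the OSD number is a placeholder and should be repeated for each OSD on the host:

    systemctl list-dependencies ceph.target
    systemctl enable ceph-osd@3.service
    systemctl start ceph-osd@3.service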

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-06 Thread David Turner
First and foremost, have you checked your disk controller? Of most importance would be your cache battery. Any time I have a single node acting up, the controller is Suspect #1. On Thu, Apr 5, 2018 at 11:23 AM Steven Vacaroaia wrote: > Hi, > > I have a strange issue - OSDs from

Re: [ceph-users] Does jewel 10.2.10 support filestore_split_rand_factor?

2018-04-06 Thread David Turner
You could randomize your ceph.conf settings for filestore_merge_threshold and filestore_split_multiple. It's not pretty, but it would spread things out. You could even do this as granularly as you'd like down to the individual OSDs while only having a single ceph.conf file to maintain. I would
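A sketch of what such a ceph.conf could look like; the section names and values below are placeholders to illustrate per-OSD overrides, not recommendations:

    [osd]
    filestore_merge_threshold = 40
    filestore_split_multiple = 8

    # per-OSD overrides to stagger when splitting kicks in
    [osd.12]
    filestore_split_multiple = 10

    [osd.13]
    filestore_split_multiple = 12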

[ceph-users] jewel ceph has PG mapped always to the same OSD's

2018-04-06 Thread Konstantin Danilov
Hi all, we have a strange issue on one cluster. One PG is mapped to a particular set of OSDs, say X, Y and Z, no matter how we change the crush map. The whole picture is this: * This is ceph version 10.2.7; all monitors and OSDs are on the same version * One PG eventually gets into
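Some commands often useful for inspecting a stuck mapping like this; the PG id is a placeholder:

    ceph pg map 3.1f                 # shows the up and acting OSD sets
    ceph pg 3.1f query | less        # peering and recovery state details
    ceph osd dump | grep pg_temp     # check for a pg_temp entry pinning the PG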

Re: [ceph-users] EC related osd crashes (luminous 12.2.4)

2018-04-06 Thread Adam Tygart
I set this about 15 minutes ago, with the following:
ceph tell osd.* injectargs '--osd-recovery-max-single-start 1 --osd-recovery-max-active 1'
ceph osd unset noout
ceph osd unset norecover
I also set those settings in ceph.conf just in case the "not observed" response was true. Things have been
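For reference, a sketch of the equivalent ceph.conf entries, placed under [osd] and picked up on the next daemon restart:

    [osd]
    osd recovery max single start = 1
    osd recovery max active = 1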

Re: [ceph-users] jewel ceph has PG mapped always to the same OSD's

2018-04-06 Thread David Turner
What happens when you deep-scrub this PG? What do the OSD logs show for any lines involving the problem PGs? Was anything happening on your cluster just before this first started? On Fri, Apr 6, 2018 at 2:29 PM Konstantin Danilov wrote: > Hi all, we have a
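A sketch of the checks being asked for; the PG id, OSD number and log path are placeholders:

    ceph pg deep-scrub 3.1f
    grep '3\.1f' /var/log/ceph/ceph-osd.5.log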

Re: [ceph-users] ceph-deploy: recommended?

2018-04-06 Thread Anthony D'Atri
> "I read a couple of versions ago that ceph-deploy was not recommended > for production clusters." Inktank had sort of discouraged the use of ceph-deploy; in 2014 we used it only to deploy OSDs. Some time later the message changed.

Re: [ceph-users] bluestore OSD did not start at system-boot

2018-04-06 Thread Oliver Freyermuth
Hi all, this sounds a lot like my issue and quick solution here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-February/024858.html It seems http://tracker.ceph.com/issues/23067 is already under review, so maybe that will be in a future release, shortening the bash-script and

Re: [ceph-users] jewel ceph has PG mapped always to the same OSD's

2018-04-06 Thread Konstantin Danilov
David, > What happens when you deep-scrub this PG? We haven't tried to deep-scrub it yet; we will. > What do the OSD logs show for any lines involving the problem PGs? Nothing special was logged about this particular OSD, except that it's degraded. Yet the OSD consumes quite a large portion of its CPU time

Re: [ceph-users] how the files in /var/lib/ceph/osd/ceph-0 are generated

2018-04-06 Thread Jeffrey Zhang
Yes, I am using ceph-volume. And I found where the keyring comes from: bluestore saves all this information at the start of the disk (BDEV_LABEL_BLOCK_SIZE=4096). This area is used for saving labels, including the keyring, whoami etc. These can be read through ceph-bluestore-tool show-label $
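A minimal example of reading those labels; the device path is a placeholder for the OSD's block device or symlink:

    ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/block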

Re: [ceph-users] RGW multisite sync issues

2018-04-06 Thread Casey Bodley
On 04/06/2018 10:57 AM, Josef Zelenka wrote: Hi everyone, I'm currently setting up RGW multisite (one cluster is jewel (primary), the other is luminous - this is only for testing; on prod we will have the same version - jewel on both), but I can't get bucket synchronization to work. Data gets

Re: [ceph-users] Have an inconsistent PG, repair not working

2018-04-06 Thread Michael Sudnick
I've tried a few more things to get a deep-scrub going on my PG. I tried instructing the involved OSDs to scrub all their PGs and it looks like that didn't do it. Do you have any documentation on ceph-objectstore-tool? What I've found online talks about filestore and not bluestore. On 6 April
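A sketch of listing the objects of a PG on a bluestore OSD with ceph-objectstore-tool; the OSD must be stopped first, and the OSD id and PG id are placeholders:

    systemctl stop ceph-osd@5.service
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
        --type bluestore --op list --pgid 3.1f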

Re: [ceph-users] Have an inconsistent PG, repair not working

2018-04-06 Thread David Turner
I'm running into this exact same situation. I'm running 12.2.2 and I have an EC PG with a scrub error. It has the same output for [1] rados list-inconsistent-obj as mentioned before. This is the [2] full health detail. This is the [3] excerpt from the log from the deep-scrub that marked the PG
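For anyone following along, a sketch of the inconsistency checks referenced as [1] and [2]; the PG id is a placeholder:

    ceph health detail | grep -i inconsistent
    rados list-inconsistent-obj 3.1f --format=json-pretty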

Re: [ceph-users] how the files in /var/lib/ceph/osd/ceph-0 are generated

2018-04-06 Thread David Turner
Likely the differences you're seeing between /dev/sdb1 and tmpfs have to do with how ceph-disk vs ceph-volume manage the OSDs and what their defaults are. ceph-disk creates partitions on devices while ceph-volume configures LVM on the block device. Also, with bluestore you do not have a standard
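A few commands that make the ceph-disk vs ceph-volume difference visible; the OSD id is a placeholder:

    ceph-volume lvm list              # LVM-backed OSDs created by ceph-volume
    lsblk                             # GPT partitions created by ceph-disk
    df -h /var/lib/ceph/osd/ceph-0    # tmpfs mount for a ceph-volume bluestore OSD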

[ceph-users] RGW multisite sync issues

2018-04-06 Thread Josef Zelenka
Hi everyone, I'm currently setting up RGW multisite (one cluster is jewel (primary), the other is luminous - this is only for testing; on prod we will have the same version - jewel on both), but I can't get bucket synchronization to work. Data gets synchronized fine when I upload it, but when
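A sketch of the commands usually used to inspect multisite sync state, run against the secondary zone's gateway; the source zone name is a placeholder:

    radosgw-admin sync status
    radosgw-admin data sync status --source-zone=primary
    radosgw-admin metadata sync status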

Re: [ceph-users] Have an inconsistent PG, repair not working

2018-04-06 Thread David Turner
I'm using filestore. I think the root cause is something getting stuck in the code. As such, I went ahead and created a bug tracker issue for this [1]. Hopefully it gets some traction, as I'm not particularly looking forward to messing with deleting PGs with ceph-objectstore-tool in production. [1]

Re: [ceph-users] EC related osd crashes (luminous 12.2.4)

2018-04-06 Thread Josh Durgin
You should be able to avoid the crash by setting:
osd recovery max single start = 1
osd recovery max active = 1
With that, you can unset norecover to let recovery start again. A fix so you don't need those settings is here: https://github.com/ceph/ceph/pull/21273 If you see any other
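A quick way to confirm the values are in effect on a given OSD before clearing the flag; run on the host that owns the OSD, and the OSD id is a placeholder:

    ceph daemon osd.0 config get osd_recovery_max_single_start
    ceph daemon osd.0 config get osd_recovery_max_active
    ceph osd unset norecover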