Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-31 Thread Oliver Freyermuth
On 01.06.2018 at 02:59, Yan, Zheng wrote: > On Wed, May 30, 2018 at 5:17 PM, Oliver Freyermuth > wrote: >> On 30.05.2018 at 10:37, Yan, Zheng wrote: >>> On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth >>> wrote: Hi, in our case, there's only a single active MDS (+1

Re: [ceph-users] Ceph-fuse getting stuck with "currently failed to authpin local pins"

2018-05-31 Thread Yan, Zheng
On Wed, May 30, 2018 at 5:17 PM, Oliver Freyermuth wrote: > On 30.05.2018 at 10:37, Yan, Zheng wrote: >> On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth >> wrote: >>> Hi, >>> >>> in our case, there's only a single active MDS >>> (+1 standby-replay + 1 standby). >>> We also get the health

Re: [ceph-users] Luminous 12.2.4: CephFS kernel client (4.15/4.16) shows up as jewel

2018-05-31 Thread Linh Vu
I see, thanks a lot Ilya :) Will test that out. From: Ilya Dryomov Sent: Thursday, 31 May 2018 10:50:48 PM To: Heðin Ejdesgaard Møller Cc: Linh Vu; ceph-users Subject: Re: [ceph-users] Luminous 12.2.4: CephFS kernel client (4.15/4.16) shows up as jewel On Thu,

Re: [ceph-users] ceph-osd@ service keeps restarting after removing osd

2018-05-31 Thread Gregory Farnum
On Thu, May 24, 2018 at 9:15 AM Michael Burk wrote: > Hello, > > I'm trying to replace my OSDs with higher capacity drives. I went through > the steps to remove the OSD on the OSD node: > # ceph osd out osd.2 > # ceph osd down osd.2 > # ceph osd rm osd.2 > Error EBUSY: osd.2 is still up; must be
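The EBUSY error above is expected: ceph osd rm refuses to remove an OSD that is still marked up, and a running daemon re-marks itself up even after ceph osd down. A minimal sketch of the usual retirement order, assuming osd.2 on a systemd-managed Luminous-era node:

ceph osd out osd.2                            # stop mapping new data to it, let PGs drain
systemctl stop ceph-osd@2                     # on the OSD host; the OSD now stays down
ceph osd purge osd.2 --yes-i-really-mean-it   # Luminous shortcut for crush remove + auth del + osd rm

On pre-Luminous releases the last step is the three separate commands: ceph osd crush remove osd.2, ceph auth del osd.2, ceph osd rm osd.2.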

Re: [ceph-users] Why the change from ceph-disk to ceph-volume and lvm? (and just not stick with direct disk access)

2018-05-31 Thread David Turner
You are also making this entire conversation INCREDIBLY difficult to follow by creating so many new email threads instead of sticking with one. On Thu, May 31, 2018 at 5:48 PM David Turner wrote: > Your question assumes that ceph-disk was a good piece of software. It had > a bug list a mile

Re: [ceph-users] Why the change from ceph-disk to ceph-volume and lvm? (and just not stick with direct disk access)

2018-05-31 Thread David Turner
Your question assumes that ceph-disk was a good piece of software. It had a bug list a mile long and nobody working on it. A common example was how simple it was to mess up any part of the dozens of components that allowed an OSD to autostart on boot. One of the biggest problems was when

[ceph-users] Why the change from ceph-disk to ceph-volume and lvm? (and just not stick with direct disk access)

2018-05-31 Thread Marc Roos
What is the reasoning behind switching to lvm? Does it make sense to go through (yet) another layer to access the disk? Why create this dependency and added complexity? It is fine as it is, or not?
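For reference, the LVM-based replacement for ceph-disk is ceph-volume, which handles the extra layer itself. A minimal sketch, assuming a blank disk /dev/sdg on a Luminous 12.2.x node with the bootstrap-osd keyring in place:

ceph-volume lvm create --bluestore --data /dev/sdg   # creates the VG/LV, prepares and activates the OSD
ceph-volume lvm list                                  # shows which LV backs which OSD

An existing logical volume can be passed as --data vg/lv instead of a raw device.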

Re: [ceph-users] [URGENT] Rebuilding cluster data from remaining OSDs

2018-05-31 Thread Gregory Farnum
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds On Thu, May 31, 2018 at 1:49 PM Leônidas Villeneuve wrote: > I had a small Ceph cluster and had to take down one node. The data from > its OSDs was reallocated on the other OSDs and went fine. > >
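The linked procedure rebuilds the monitor store from the maps kept on the OSDs. Condensed into a sketch, with paths and the keyring location as assumptions; the OSDs must be stopped while their stores are read:

ms=/root/mon-store; mkdir -p $ms
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path $ms
done
ceph-monstore-tool $ms rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring

The rebuilt store is then copied over a monitor's store.db and that mon is started; in a multi-host cluster the $ms directory has to be carried to each OSD host in turn so every OSD contributes its maps.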

Re: [ceph-users] Sudden increase in "objects misplaced"

2018-05-31 Thread Gregory Farnum
l good, so I thought, but this morning (31st May): > > nodown flag(s) set; > 41264874/5190843723 objects misplaced (0.795%) > Degraded data redundancy: 11795/5190843723 objects degraded (0.000%), > 226 pgs degraded > > Of course I'm perplexed as to what might have caused this..

[ceph-users] [URGENT] Rebuilding cluster data from remaining OSDs

2018-05-31 Thread Leônidas Villeneuve
I had a small Ceph cluster and had to take down one node. The data from its OSDs was reallocated on the other OSDs and went fine. After the reallocation, I removed its mon.service as described by the official documentation. Then, everything went wrong. The other mons just collapsed and stopped

Re: [ceph-users] Ceph-disk --dmcrypt or manual

2018-05-31 Thread David Turner
Why are you digging into ceph-disk so deeply when it is being EOL'd and doesn't even exist in Mimic? On Thu, May 31, 2018 at 4:41 PM Marc Roos wrote: > > What is the advantage of using this ceph-disk dmcrypt option and lockbox > and bootstrap-osd vs just creating a luks partition and do

[ceph-users] Ceph-disk --dmcrypt or manual

2018-05-31 Thread Marc Roos
What is the advantage of using this ceph-disk dmcrypt option and lockbox and bootstrap-osd vs just creating a luks partition and doing something like ceph-disk prepare --bluestore --zap-disk /dev/mapper/crypt1 Are there advantages, e.g. for moving an osd easily to a different host?
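A minimal sketch of the manual variant being described, assuming /dev/sdg is the raw device and the LUKS passphrase is managed outside Ceph (i.e. no lockbox, so the mapping has to be reopened on every boot before the OSD can start):

cryptsetup luksFormat /dev/sdg          # prompts for the passphrase
cryptsetup luksOpen /dev/sdg crypt1     # creates /dev/mapper/crypt1
ceph-disk prepare --bluestore --zap-disk /dev/mapper/crypt1

The lockbox/--dmcrypt path exists mainly so the dm-crypt keys are kept in the cluster (via the mons) and the OSD can unlock and start itself automatically, which is also what makes moving the disk to another host straightforward.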

[ceph-users] What is osd-lockbox ceph-disk dmcrypt wipefs not working (of course)

2018-05-31 Thread Marc Roos
Why is ceph-disk trying to wipe a mounted filesystem? [@]# ceph-disk prepare --bluestore --zap-disk --dmcrypt /dev/sdg The operation has completed successfully. mke2fs 1.42.9 (28-Dec-2013) Filesystem label= OS type: Linux Block size=1024 (log=0) Fragment size=1024 (log=0) Stride=4 blocks,

[ceph-users] What is osd-lockbox? How does it work?

2018-05-31 Thread Marc Roos
[@-]# ceph-disk prepare --bluestore --zap-disk --dmcrypt /dev/sdg Why does it create these 3? drwxr-x--- 14 ceph ceph 200 May 31 19:14 .. drwxr-xr-x 3 root root 1024 May 31 22:24 93bce5fa-3443-4bb2-bb93-bf65ad40ebcd lrwxrwxrwx 1 root root 62 May 31 22:24 d29c5890-9b02-46da-b7b1-24fcb04793df

Re: [ceph-users] Testing with ceph-disk and dmcrypt

2018-05-31 Thread Marc Roos
I don’t use ansible or deploy. I had to create it, because it was also not in the auth list. ceph auth add client.bootstrap-osd mon 'allow profile bootstrap-osd' ceph auth get client.bootstrap-osd 2>/dev/null | head -2 | sudo -u ceph tee -a /var/lib/ceph/bootstrap-osd/ceph.keyring But since

Re: [ceph-users] Testing with ceph-disk and dmcrypt

2018-05-31 Thread Gregory Farnum
That key should have been created when the cluster was. The orchestration software (ceph-deploy, ceph-ansible, or whatever you're using) is responsible for putting it in place on the local machine if needed, but in your case you should just need to fetch it. ("get-or-create" rather than "new",
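A minimal sketch of fetching it, assuming the default bootstrap-osd keyring location and profile:

ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring
# or, if it might not exist yet:
ceph auth get-or-create client.bootstrap-osd mon 'allow profile bootstrap-osd' \
    -o /var/lib/ceph/bootstrap-osd/ceph.keyring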

Re: [ceph-users] SSD recommendation

2018-05-31 Thread Sean Redmond
Hi, I know the s4600 thread well as I had over 10 of those drives fail before I took them all out of production. Intel did say a firmware fix was on the way but I could not wait and opted for SM863A and never looked back... I will be sticking with SM863A for now on further orders. Thanks On

Re: [ceph-users] RGW unable to start gateway for 2nd realm

2018-05-31 Thread Brett Chancellor
On a whim, I looked at the period and it seems to be pointing to a zonegroup and zone that don't exist. Could that be the issue? And if so, how to fix? I'm paranoid about period updates because I don't want to interrupt operations to the working realm. ## List realms, all is as expected $ sudo
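A read-only way to see what the current period actually references before touching anything, with the realm name as a placeholder:

radosgw-admin realm list
radosgw-admin zonegroup list
radosgw-admin zone list
radosgw-admin period get                           # zonegroups/zones the active period knows about
radosgw-admin period get --rgw-realm=second-realm  # the other realm's current period

None of these modify state; only period update --commit does.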

[ceph-users] Testing with ceph-disk and dmcrypt

2018-05-31 Thread Marc Roos
I don’t have the /var/lib/ceph/bootstrap-osd/ceph.keyring, how should I create this file? [@ ~]# ceph-disk prepare --bluestore --zap-disk --dmcrypt /dev/sdg Creating new GPT entries. The operation has completed successfully. mke2fs 1.42.9 (28-Dec-2013) Filesystem label= OS type: Linux Block

[ceph-users] Slack bot for Ceph

2018-05-31 Thread David Turner
https://github.com/drakonstein/cephbot This is something that I've been using and working on for a while. My Python abilities are subpar at best, but this has been very useful for me in my environments. I use it for my home cluster and for multiple clusters at work. The biggest gain from this

[ceph-users] issue with OSD class path in RDMA mode

2018-05-31 Thread Raju Rangoju
Hello, I'm trying to run iscsi tgtd on ceph cluster. When do 'rbd list' I see below errors. [root@ceph1 ceph]# rbd list 2018-05-30 18:19:02.227 2ae7260a8140 -1 librbd::api::Image: list_images: error listing image in directory: (5) Input/output error 2018-05-30 18:19:02.227 2ae7260a8140 -1
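If the OSD class path is indeed what is broken (the subject suggests so, and rbd directory listings fail when the OSDs cannot load the rbd object class), one thing worth checking is the class directory setting and the presence of the class libraries on each OSD host; a hedged sketch, assuming a default RPM layout:

# ceph.conf, [osd] section (only needed if the libraries live in a non-default path)
osd class dir = /usr/lib64/rados-classes

ls /usr/lib64/rados-classes/libcls_rbd*   # confirm the rbd class object exists on the OSD hosts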

Re: [ceph-users] RGW unable to start gateway for 2nd realm

2018-05-31 Thread David Turner
This is the documentation that has always worked for me to set up multiple realms. http://docs.ceph.com/docs/luminous/radosgw/multisite/ It's for the multisite configuration, but if you aren't using multi-site, just stop after setting up the first site. I didn't read through all of your steps,
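Condensed from that guide, the first-site portion looks roughly like the following, with all names and endpoints as placeholders:

radosgw-admin realm create --rgw-realm=myrealm --default
radosgw-admin zonegroup create --rgw-zonegroup=us --endpoints=http://rgw1:80 --master --default
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east --endpoints=http://rgw1:80 --master --default
radosgw-admin period update --commit

The gateway then runs with rgw_zone (and rgw_realm/rgw_zonegroup for a second realm) set in its ceph.conf section.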

Re: [ceph-users] Cephfs no space on device error

2018-05-31 Thread David Turner
Run the command `ceph versions` to find which version of the software each daemon is actively running. If you start a daemon on 12.2.4 and upgrade the packages on the server to 12.2.5, then the daemon will continue to be running 12.2.4 until you restart it. By restarting the mds daemons you
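A minimal sketch of the check-and-restart sequence, assuming systemd-managed daemons and that each MDS id matches the host's short hostname (the ceph-deploy default):

ceph versions                                 # version each daemon is actually running
systemctl restart ceph-mds@$(hostname -s)     # on each MDS host, one host at a time
ceph versions                                 # mds entries should now all report 12.2.5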

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-31 Thread Lionel Bouton
On 31/05/2018 14:41, Simon Ironside wrote: > On 24/05/18 19:21, Lionel Bouton wrote: > >> Unfortunately I just learned that Supermicro found an incompatibility >> between this motherboard and SM863a SSDs (I don't have more information >> yet) and they proposed S4600 as an alternative. I

Re: [ceph-users] SSD recommendation

2018-05-31 Thread Fulvio Galeazzi
Hello Simon, I am also about to buy some new hardware and for SATA ~400GB I was considering Micron 5200 MAX, rated at 5 DWPD, for journaling/FS metadata. Is anyone using such drives, and to what degree of satisfaction? Thanks Fulvio Original Message

Re: [ceph-users] Cephfs no space on device error

2018-05-31 Thread Doug Bell
According to dpkg, all of the cluster is running 12.2.5, so it is strange that one would report as 12.2.4. I am certain that everything is on Luminous as it's a new environment. The problem seems to have been resolved by changing the trimming settings and doing a rolling restart of the mds daemons.

Re: [ceph-users] Luminous 12.2.4: CephFS kernel client (4.15/4.16) shows up as jewel

2018-05-31 Thread Ilya Dryomov
On Thu, May 31, 2018 at 2:39 PM, Heðin Ejdesgaard Møller wrote: > I have encountered the same issue and wrote to the mailing list about it, > with the subject: [ceph-users] krbd upmap support on kernel-4.16 ? > > The odd thing is that I can krbd map an image after setting min compat to >

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2018-05-31 Thread Simon Ironside
On 24/05/18 19:21, Lionel Bouton wrote: Unfortunately I just learned that Supermicro found an incompatibility between this motherboard and SM863a SSDs (I don't have more information yet) and they proposed S4600 as an alternative. I immediately remembered that there were problems and asked for a

Re: [ceph-users] Luminous 12.2.4: CephFS kernel client (4.15/4.16) shows up as jewel

2018-05-31 Thread Heðin Ejdesgaard Møller
I have encountered the same issue and wrote to the mailing list about it, with the subject: [ceph-users] krbd upmap support on kernel-4.16 ? The odd thing is that I can krbd map an image after setting min compat to luminous, without specifying --yes-i-really-mean-it . It's only necessary at

[ceph-users] Sudden increase in "objects misplaced"

2018-05-31 Thread Jake Grimmett
flag(s) set; 41264874/5190843723 objects misplaced (0.795%) Degraded data redundancy: 11795/5190843723 objects degraded (0.000%), 226 pgs degraded Of course I'm perplexed as to what might have caused this... Looking at /var/log/ceph.log-20180531.gz there is a sudden jump in objects misplaced at 2

Re: [ceph-users] Recovery priority

2018-05-31 Thread Stefan Kooman
Quoting Dennis Benndorf (dennis.bennd...@googlemail.com): > Hi, > > let's assume we have size=3 min_size=2 and lost some osds and now have some > placement groups with only one copy left. > > Is there a setting to tell ceph to start recovering those pgs first in order > to reach min_size and so

Re: [ceph-users] Luminous 12.2.4: CephFS kernel client (4.15/4.16) shows up as jewel

2018-05-31 Thread Ilya Dryomov
On Thu, May 31, 2018 at 4:16 AM, Linh Vu wrote: > Hi all, > > > On my test Luminous 12.2.4 cluster, with this set (initially so I could use > upmap in the mgr balancer module): > > > # ceph osd set-require-min-compat-client luminous > > # ceph osd dump | grep client > require_min_compat_client
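For reference, the relevant commands and a way to see what the connected clients actually advertise (available from Luminous on); a sketch:

ceph osd set-require-min-compat-client luminous
ceph osd dump | grep require_min_compat_client
ceph features        # groups connected clients/daemons by release and feature bits

The reported release is derived from feature bits, which is why recent kernel clients can show up as jewel while still supporting what upmap needs.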

[ceph-users] Fix incomplete PG

2018-05-31 Thread Monis Monther
Hi, We have an EC pool that is 3+1 (luminous 12.2.0 using filestore), 2 OSDs crashed and some PGs are now incomplete because we are left only with 2 out of 4. The failed OSDs have the two other chunks of the PGs intact but they are missing the omap directory and other files. We replaced the
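If the shard data on the failed OSDs' disks turns out to be readable after all, one possible (and risky) route is exporting the surviving PG shards with ceph-objectstore-tool and importing them into a working OSD; a hedged sketch with the OSD paths and pgid as placeholders, and only meaningful if the shards are actually intact, which the missing omap directories put in doubt:

# on the failed OSD's disk, daemon stopped (filestore may also need --journal-path)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
    --pgid 5.1as1 --op export --file /tmp/5.1as1.export
# on the target OSD, also stopped
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
    --op import --file /tmp/5.1as1.export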

[ceph-users] Recovery priority

2018-05-31 Thread Dennis Benndorf
Hi, let's assume we have size=3 min_size=2 and lost some osds and now have some placement groups with only one copy left. Is there a setting to tell ceph to start recovering those pgs first in order to reach min_size and so get the cluster online faster? Regards, Dennis
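Luminous added per-PG commands that come close to this; a sketch, with the pgid as a placeholder:

ceph pg force-recovery 2.3f      # push this PG to the front of the recovery queue
ceph pg force-backfill 2.3f      # same idea for backfill
# undo with: ceph pg cancel-force-recovery / cancel-force-backfill 2.3f

There is no automatic "recover below-min_size PGs first" switch beyond the prioritization Ceph already applies, so the force-* commands have to be pointed at the affected pgids explicitly.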