[ceph-users] Changing osd crush chooseleaf type at runtime

2018-02-02 Thread Flemming Frandsen
Hi, I'm just starting to play around with Ceph, so please excuse my complete lack of a clue if this question is covered somewhere, but I have been unable to find an answer. I have a single machine running Ceph which was set up with osd crush chooseleaf type = 0 in /etc/ceph/ceph.conf, now

Re: [ceph-users] RFC Bluestore-Cluster of SAMSUNG PM863a

2018-02-02 Thread Serkan Çoban
May I ask why you are using ELRepo with CentOS? AFAIK, Red Hat backports all Ceph features to its 3.10 kernels. Am I wrong? On Fri, Feb 2, 2018 at 2:44 PM, Richard Hesketh wrote: > On 02/02/18 08:33, Kevin Olbrich wrote: >> Hi! >> >> I am planning a new Flash-based

[ceph-users] ceph luminous performance - disks at 100%, low network utilization

2018-02-02 Thread Steven Vacaroaia
Hi, I have been struggling to get my test cluster to behave (from a performance perspective). Dell R620, 64 GB RAM, 1 CPU, numa=off, PERC H710, RAID0, enterprise 10K disks, no SSD - just plain HDD. Local tests (dd, hdparm) confirm my disks are capable of delivering 200 MB/s. Fio with 15 jobs
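As a point of reference for this kind of local benchmarking, a raw-device sequential-write test with fio might look like the sketch below; the device path, job count and block size are placeholders, not values taken from the thread:
# WARNING: writes directly to the raw device and destroys its contents
$ fio --name=seqwrite --filename=/dev/sdX --direct=1 --rw=write --bs=4M \
      --numjobs=15 --iodepth=16 --ioengine=libaio --runtime=60 --group_reporting
Comparing the aggregate bandwidth here against what the same disks deliver through Ceph helps separate raw-disk limits from cluster overhead.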

[ceph-users] restrict user access to certain rbd image

2018-02-02 Thread knawnd
Hello! I wonder if it's possible in Ceph Luminous to manage user access to rbd images on a per-image basis (rather than for the whole rbd pool)? I need to provide rbd images for my users but would like to disable their ability to list all images in a pool as well as to somehow access/use ones if a ceph

Re: [ceph-users] restrict user access to certain rbd image

2018-02-02 Thread Gregory Farnum
I don't think it's well-integrated with the tooling, but check out the cephx docs for the "prefix" level of access. It lets you grant access only to objects whose name matches a prefix, which for rbd would be the rbd volume ID (or name? Something easy to identify). -Greg On Fri, Feb 2, 2018 at

Re: [ceph-users] ceph luminous performance - disks at 100%, low network utilization

2018-02-02 Thread Steven Vacaroaia
Hi Mark, Thanks. My pools are using replication = 2. I'll re-enable numa and report back. Steven On 2 February 2018 at 10:48, Marc Roos wrote: > > Not sure if this info is of any help; please be aware I am also just in a > testing phase with Ceph. > > I don’t know how

Re: [ceph-users] restrict user access to certain rbd image

2018-02-02 Thread Frédéric Nass
Hi, We use this on our side:
$ rbd create rbd-image --size 1048576 --pool rbd --image-feature layering
$ rbd create rbd-other-image --size 1048576 --pool rbd --image-feature layering
$ rbd info rbd/rbd-image
rbd image 'rbd-image':
    size 1024 GB in 262144 objects
    order 22 (4096 kB

Re: [ceph-users] restrict user access to certain rbd image

2018-02-02 Thread Jason Dillaman
Concur that it's technically feasible by restricting access to the "rbd_id.<image name>", "rbd_header.<image id>", "rbd_object_map.<image id>", and "rbd_data.<image id>" objects using the prefix restriction in the OSD caps. However, this really won't scale beyond a small number of images per user since every IO will need to traverse the
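A minimal sketch of such caps, assuming a hypothetical image named rbd-image whose rbd info reports block_name_prefix rbd_data.1234 (the id 1234 and the client name are made up for illustration, not taken from the thread):
$ rbd info rbd/rbd-image | grep block_name_prefix   # e.g. block_name_prefix: rbd_data.1234
$ ceph auth get-or-create client.alice \
      mon 'allow r' \
      osd 'allow rx pool=rbd object_prefix rbd_id.rbd-image, allow rwx pool=rbd object_prefix rbd_header.1234, allow rwx pool=rbd object_prefix rbd_data.1234'
As Jason notes, each grant adds a clause every IO must check, so this pattern only stays practical for a handful of images per user.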

Re: [ceph-users] ceph luminous performance - disks at 100%, low network utilization

2018-02-02 Thread Marc Roos
Not sure if this info is of any help; please be aware I am also just in a testing phase with Ceph. I don’t know how numa=off is interpreted by the OS. If it just hides NUMA, you could still run into the 'known issues'. That is why I have numad running. Furthermore, I have put an osd 'out'
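For what it's worth, whether the kernel actually exposes one or several NUMA nodes after booting with numa=off can be checked directly on the host:
$ numactl --hardware    # lists the NUMA nodes the kernel exposes
$ lscpu | grep -i numa  # NUMA node count and CPU-to-node mapping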

Re: [ceph-users] Can't enable backfill because of "recover_replicas: object added to missing set for backfill, but is not in recovering, error!"

2018-02-02 Thread Gregory Farnum
On Wed, Jan 31, 2018 at 9:01 PM Philip Poten wrote: > 2018-01-31 19:20 GMT+01:00 Gregory Farnum : > >> On Wed, Jan 31, 2018 at 1:40 AM Philip Poten >> wrote: >> > Hello, >>> >>> i have this error message: >>> >>> 2018-01-25

Re: [ceph-users] Changing osd crush chooseleaf type at runtime

2018-02-02 Thread Gregory Farnum
Once you've created a crush map you need to edit it directly (either by dumping it from the cluster, editing with the crush tool, and importing; or via the ceph cli commands), rather than by updating config settings. I believe doing so is explained in the ceph docs. On Fri, Feb 2, 2018 at 4:47 AM
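The dump/edit/import cycle Greg refers to typically looks like this (a sketch; the file names are arbitrary and the edited rule is only an example):
$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt, e.g. change "step chooseleaf firstn 0 type osd"
#                         to     "step chooseleaf firstn 0 type host"
$ crushtool -c crushmap.txt -o crushmap.new
$ ceph osd setcrushmap -i crushmap.new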

[ceph-users] Erasure code ruleset for small cluster

2018-02-02 Thread Caspar Smit
Hi all, I'd like to set up a small cluster (5 nodes) using erasure coding. I would like to use k=5 and m=3. Normally you would need a minimum of 8 nodes (preferably 9 or more) for this. Then I found this blog: https://ceph.com/planet/erasure-code-on-small-clusters/ This sounded ideal to me so I
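The approach in that blog post boils down to a CRUSH rule that picks hosts first and then two OSDs per host, so the 8 chunks (k=5, m=3) can land on 5 hosts with at most two chunks per host. A rough, untested sketch of such a rule (the id and name are placeholders):
rule ec_k5m3 {
    id 2
    type erasure
    min_size 8
    max_size 8
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 0 type host
    step chooseleaf indep 2 type osd
    step emit
}
The trade-off is that losing a single host takes out two chunks at once, which is exactly the durability concern discussed in the replies.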

[ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread Frank Li
Hi, I ran the ceph osd force-create-pg command in Luminous 12.2.2 to recover a failed pg, and it instantly caused all of the monitors to crash. Is there any way to revert back to an earlier state of the cluster? Right now, the monitors refuse to come up; the error message is as follows: I’ve

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread Frank Li
Yes, I was dealing with an issue where OSDs are not peering, and I was trying to see if force-create-pg can help recover the peering. Data loss is an accepted possibility. I hope this is what you are looking for? -3> 2018-01-31 22:47:22.942394 7fc641d0b700 5 mon.dl1-kaf101@0(electing)

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread Sage Weil
On Fri, 2 Feb 2018, Frank Li wrote: > Yes, I was dealing with an issue where OSDs are not peering, and I was trying > to see if force-create-pg can help recover the peering. > Data loss is an accepted possibility. > > I hope this is what you are looking for? > > -3> 2018-01-31

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread Sage Weil
On Fri, 2 Feb 2018, Frank Li wrote: > Hi, I ran the ceph osd force-create-pg command in Luminous 12.2.2 to recover > a failed pg, and it > instantly caused all of the monitors to crash. Is there any way to revert back > to an earlier state of the cluster? > Right now, the monitors refuse to come

Re: [ceph-users] Erasure code ruleset for small cluster

2018-02-02 Thread Gregory Farnum
On Fri, Feb 2, 2018 at 8:13 AM, Caspar Smit wrote: > Hi all, > > I'd like to set up a small cluster (5 nodes) using erasure coding. I would > like to use k=5 and m=3. > Normally you would need a minimum of 8 nodes (preferably 9 or more) for > this. > > Then I found this

Re: [ceph-users] RFC Bluestore-Cluster of SAMSUNG PM863a

2018-02-02 Thread Kevin Olbrich
2018-02-02 12:44 GMT+01:00 Richard Hesketh : > On 02/02/18 08:33, Kevin Olbrich wrote: > > Hi! > > > > I am planning a new Flash-based cluster. In the past we used SAMSUNG > PM863a 480G as journal drives in our HDD cluster. > > After a lot of tests with luminous and

Re: [ceph-users] ceph luminous performance - disks at 100%, low network utilization

2018-02-02 Thread Steven Vacaroaia
Unfortunately, even after removing all my kernel configuration changes, the performance did not improve. Currently: GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet net.ifnames=0 biosdevname=0 ipv6.disable=1" Before: GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet net.ifnames=0 biosdevname=0

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread ceph . novice
https://shaman.ceph.com/repos/ceph/wip-22847-luminous/f04a4a36f01fdd5d9276fa5cfa1940f5cc11fb81/   Sent: Friday, 02 February 2018 at 21:27 From: "Frank Li" To: "Sage Weil" Cc: "ceph-users@lists.ceph.com" Subject:

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread Frank Li
Thanks, I’m downloading it right now -- Efficiency is Intelligent Laziness From: "ceph.nov...@habmalnefrage.de" Date: Friday, February 2, 2018 at 12:37 PM To: "ceph.nov...@habmalnefrage.de" Cc: Frank Li ,

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread Frank Li
Sure, please let me know where to get and run the binaries. Thanks for the fast response! -- Efficiency is Intelligent Laziness On 2/2/18, 10:31 AM, "Sage Weil" wrote: On Fri, 2 Feb 2018, Frank Li wrote: > Yes, I was dealing with an issue where OSDs are not

Re: [ceph-users] Help! How to recover from total monitor failure in Luminous

2018-02-02 Thread ceph . novice
There, pick your "DISTRO", click on the "ID", click "Repo URL"...   Sent: Friday, 02 February 2018 at 21:34 From: ceph.nov...@habmalnefrage.de To: "Frank Li" Cc: "ceph-users@lists.ceph.com" Subject: Re: [ceph-users] Help! How to

[ceph-users] OSD stuck in booting state while the monitor shows it as being up

2018-02-02 Thread Frank Li
Running Ceph 12.2.2 on CentOS 7.4. The cluster was in healthy condition until a command caused all the monitors to crash. Applied a private build to fix the issue (thanks!) https://tracker.ceph.com/issues/22847 The monitors are all started, and all the OSDs are reported as being up in
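When an OSD seems stuck like this, its own view of its state can be compared with what the monitors report; a sketch of the usual checks (osd.0 is a placeholder):
$ ceph daemon osd.0 status      # run on the OSD host; "state" should move from "booting" to "active"
$ ceph osd dump | grep osd.0    # what the monitors think: up/down, weight, map epoch
$ ceph -s                       # overall cluster health and PG states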

Re: [ceph-users] RFC Bluestore-Cluster of SAMSUNG PM863a

2018-02-02 Thread Richard Hesketh
On 02/02/18 08:33, Kevin Olbrich wrote: > Hi! > > I am planning a new Flash-based cluster. In the past we used SAMSUNG PM863a > 480G as journal drives in our HDD cluster. > After a lot of tests with luminous and bluestore on HDD clusters, we plan to > re-deploy our whole RBD pool (OpenNebula

[ceph-users] RFC Bluestore-Cluster of SAMSUNG PM863a

2018-02-02 Thread Kevin Olbrich
Hi! I am planning a new Flash-based cluster. In the past we used SAMSUNG PM863a 480G as journal drives in our HDD cluster. After a lot of tests with luminous and bluestore on HDD clusters, we plan to re-deploy our whole RBD pool (OpenNebula cloud) using these disks. As far as I understand, it
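For an all-flash Bluestore deployment like this, each OSD usually lives entirely on one device with no separate DB/WAL partition; with Luminous's ceph-volume that would look roughly like the line below (device path is a placeholder):
$ ceph-volume lvm create --bluestore --data /dev/sdX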

Re: [ceph-users] High apply latency

2018-02-02 Thread Jakub Jaszewski
Hi, so I have changed the merge & split settings to filestore_merge_threshold = 40 and filestore_split_multiple = 8 and restarted all OSDs, host by host. Let me ask a question: although the pool default.rgw.buckets.data that was affected prior to the above change has higher write bandwidth, it is very
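For context, those two settings are filestore options set in the [osd] section of ceph.conf, and they are normally picked up by an OSD on restart, which matches the host-by-host restart described above:
[osd]
filestore merge threshold = 40
filestore split multiple = 8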

Re: [ceph-users] High apply latency

2018-02-02 Thread Piotr Dałek
On 18-02-02 09:55 AM, Jakub Jaszewski wrote: Hi, so I have changed the merge & split settings to filestore_merge_threshold = 40 and filestore_split_multiple = 8 and restarted all OSDs, host by host. Let me ask a question: although the pool default.rgw.buckets.data that was affected prior to the above