Re: [ceph-users] Monitor as local VM on top of the server pool cluster?

2017-07-10 Thread Z Will
For a large cluster there will be a lot of change at any time, which means the pressure on the mon can be high at times, because every change goes through the leader. So for this, the local storage for the mon should be good enough; I think this may be a consideration. On Tue, Jul 11, 2017 at 11:29

Re: [ceph-users] Monitor as local VM on top of the server pool cluster?

2017-07-10 Thread Brad Hubbard
On Tue, Jul 11, 2017 at 3:44 AM, David Turner wrote: > Mons are a paxos quorum and as such want to be in odd numbers. 5 is > generally what people go with. I think I've heard of a few people use 7 > mons, but you do not want to have an even number of mons or an ever

Re: [ceph-users] ceph-mon leader election problem, should it be improved ?

2017-07-10 Thread Z Will
Hi Joao: > Basically, this would be something similar to heartbeats. If a monitor can't > reach all monitors in an existing quorum, then just don't do anything. Based on your solution, I made a small change: - send a probe to all monitors - if we get a quorum,

Re: [ceph-users] OSD Full Ratio Luminous - Unset

2017-07-10 Thread Brad Hubbard
On Tue, Jul 11, 2017 at 12:46 PM, Ashley Merrick wrote: > Hello, > > Perfect thanks that fixed my issue! > > Still seems to be a bug on the ceph pg dump unless it has been moved out of > the PG and directly into the OSD? I am looking into this issue and have been trying

Re: [ceph-users] OSD Full Ratio Luminous - Unset

2017-07-10 Thread Ashley Merrick
Hello, Perfect, thanks, that fixed my issue! Still seems to be a bug in the ceph pg dump unless it has been moved out of the PG and directly into the OSD? ,Ashley -Original Message- From: Edward R Huyer [mailto:erh...@rit.edu] Sent: Tuesday, 11 July 2017 7:53 AM To: Ashley Merrick

[ceph-users] OSD Full Ratio Luminous - Unset

2017-07-10 Thread Edward R Huyer
I just now ran into the same problem you did, though I managed to get it straightened out. It looks to me like the "ceph osd set-{full,nearfull,backfillfull}-ratio" commands *do* work, with two caveats. Caveat 1: "ceph pg dump" doesn't reflect the change for some reason. Caveat 2: There
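
A minimal sketch of the commands in question, assuming a Luminous cluster and purely illustrative ratio values; check the result in the OSD map rather than in "ceph pg dump":

    # set the cluster-wide ratios (values are only examples)
    ceph osd set-nearfull-ratio 0.85
    ceph osd set-backfillfull-ratio 0.90
    ceph osd set-full-ratio 0.95
    # the new values appear here even while "ceph pg dump" still shows the old ones
    ceph osd dump | grep -i ratio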

[ceph-users] admin_socket error

2017-07-10 Thread Oscar Segarra
Hi, My lab environment has just one node for testing purposes. As user ceph (with sudo privileges granted) I have executed the following commands in my environment: ceph-deploy install vdicnode01 ceph-deploy --cluster vdiccephmgmtcluster new vdicnode01 --cluster-network 192.168.100.0/24
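
One thing worth checking with a non-default cluster name (an assumption on my part, since the actual error is cut off above) is the admin socket path, which embeds the cluster name:

    # with --cluster vdiccephmgmtcluster the socket is not the default ceph-mon.<host>.asok
    ls /var/run/ceph/
    ceph --admin-daemon /var/run/ceph/vdiccephmgmtcluster-mon.vdicnode01.asok mon_status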

Re: [ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Gregory Farnum
On Mon, Jul 10, 2017 at 1:00 PM Sage Weil wrote: > On Mon, 10 Jul 2017, Ruben Kerkhof wrote: > > On Mon, Jul 10, 2017 at 7:44 PM, Sage Weil wrote: > > > On Mon, 10 Jul 2017, Gregory Farnum wrote: > > >> On Mon, Jul 10, 2017 at 12:57 AM Marc Roos

Re: [ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Sage Weil
On Mon, 10 Jul 2017, Ruben Kerkhof wrote: > On Mon, Jul 10, 2017 at 7:44 PM, Sage Weil wrote: > > On Mon, 10 Jul 2017, Gregory Farnum wrote: > >> On Mon, Jul 10, 2017 at 12:57 AM Marc Roos > >> wrote: > >> > >> I need a little help with fixing

Re: [ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Ruben Kerkhof
On Mon, Jul 10, 2017 at 7:44 PM, Sage Weil wrote: > On Mon, 10 Jul 2017, Gregory Farnum wrote: >> On Mon, Jul 10, 2017 at 12:57 AM Marc Roos wrote: >> >> I need a little help with fixing some errors I am having. >> >> After upgrading from

Re: [ceph-users] RBD journaling benchmarks

2017-07-10 Thread Maged Mokhtar
On 2017-07-10 20:06, Mohamad Gebai wrote: > On 07/10/2017 01:51 PM, Jason Dillaman wrote: On Mon, Jul 10, 2017 at 1:39 > PM, Maged Mokhtar wrote: These are significant > differences, to the point where it may not make sense > to use rbd journaling / mirroring unless there

Re: [ceph-users] RBD journaling benchmarks

2017-07-10 Thread Mohamad Gebai
On 07/10/2017 01:51 PM, Jason Dillaman wrote: > On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar wrote: >> These are significant differences, to the point where it may not make sense >> to use rbd journaling / mirroring unless there is only 1 active client. > I interpreted

Re: [ceph-users] RBD journaling benchmarks

2017-07-10 Thread Jason Dillaman
On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar wrote: > These are significant differences, to the point where it may not make sense > to use rbd journaling / mirroring unless there is only 1 active client. I interpreted the results as the same RBD image was being

Re: [ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Sage Weil
On Mon, 10 Jul 2017, Gregory Farnum wrote: > On Mon, Jul 10, 2017 at 12:57 AM Marc Roos wrote: > > I need a little help with fixing some errors I am having. > > After upgrading from Kraken I'm getting incorrect values reported > on > placement

Re: [ceph-users] Monitor as local VM on top of the server pool cluster?

2017-07-10 Thread David Turner
Mons are a paxos quorum and as such want to be in odd numbers. 5 is generally what people go with. I think I've heard of a few people use 7 mons, but you do not want to have an even number of mons or an ever growing number of mons. The reason you do not want mons running on the same hardware as

Re: [ceph-users] RBD journaling benchmarks

2017-07-10 Thread Maged Mokhtar
On 2017-07-10 18:14, Mohamad Gebai wrote: > Resending as my first try seems to have disappeared. > > Hi, > > We ran some benchmarks to assess the overhead caused by enabling > client-side RBD journaling in Luminous. The tests consist of: > - Create an image with journaling enabled

[ceph-users] Monitor as local VM on top of the server pool cluster?

2017-07-10 Thread Massimiliano Cuttini
Hi everybody, I would like to separate MON from OSD as recommended. In order to do so without new hardware, I'm planning to create all the monitors as virtual machines on top of my hypervisors (Xen). I'm testing a pool of 8 nodes of Xen. I'm thinking about creating 8 monitors and pinning one monitor

Re: [ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Gregory Farnum
On Mon, Jul 10, 2017 at 12:57 AM Marc Roos wrote: > > I need a little help with fixing some errors I am having. > > After upgrading from Kraken I'm getting incorrect values reported on > placement groups etc. At first I thought it was because I was changing > the public

[ceph-users] RBD journaling benchmarks

2017-07-10 Thread Mohamad Gebai
Resending as my first try seems to have disappeared. Hi, We ran some benchmarks to assess the overhead caused by enabling client-side RBD journaling in Luminous. The tests consist of: - Create an image with journaling enabled (--image-feature journaling) - Run randread, randwrite and randrw
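
A rough sketch of how such a test could be set up; the pool/image names and fio parameters below are my own assumptions, not necessarily what was used for these benchmarks:

    # image with journaling enabled (journaling requires exclusive-lock)
    rbd create bench/img1 --size 10G --image-feature layering,exclusive-lock,journaling
    # random-write workload through fio's rbd engine
    fio --name=randwrite --ioengine=rbd --clientname=admin --pool=bench --rbdname=img1 \
        --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based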

Re: [ceph-users] hammer -> jewel 10.2.8 upgrade and setting sortbitwise

2017-07-10 Thread Sage Weil
On Mon, 10 Jul 2017, Luis Periquito wrote: > Hi Dan, > > I've enabled it in a couple of big-ish clusters and had the same > experience - a few seconds disruption caused by a peering process > being triggered, like any other crushmap update does. Can't remember > if it triggered data movement, but

Re: [ceph-users] Adding storage to exiting clusters with minimal impact

2017-07-10 Thread bruno.canning
Hi All, Thanks for your ideas and recommendations. I've been experimenting with: https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-gentle-reweight by Dan van der Ster and it is producing good results. It does indeed seem that adjusting the crush weight up from zero is the way to
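
For reference, the manual equivalent of that gradual approach would look roughly like the loop below; the OSD id, step size and sleep are arbitrary assumptions, and the linked script automates this with proper health checks:

    # start the new OSD at crush weight 0, then ramp it up in small steps
    for w in 0.2 0.4 0.6 0.8 1.0; do
        ceph osd crush reweight osd.123 $w
        sleep 600   # in practice: wait for HEALTH_OK / active+clean before the next step
    done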

Re: [ceph-users] hammer -> jewel 10.2.8 upgrade and setting sortbitwise

2017-07-10 Thread Luis Periquito
Hi Dan, I've enabled it in a couple of big-ish clusters and had the same experience - a few seconds disruption caused by a peering process being triggered, like any other crushmap update does. Can't remember if it triggered data movement, but I have a feeling it did... On Mon, Jul 10, 2017 at

[ceph-users] hammer -> jewel 10.2.8 upgrade and setting sortbitwise

2017-07-10 Thread Dan van der Ster
Hi all, With 10.2.8, ceph will now warn if you didn't yet set sortbitwise. I just updated a test cluster, saw that warning, then did the necessary "ceph osd set sortbitwise". I noticed a short re-peering which took around 10s on this small cluster with very little data. Has anyone done this
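
For anyone following along, the flag is set like this on a fully upgraded cluster, and the short peering mentioned above is what to watch for afterwards:

    ceph osd set sortbitwise
    # watch the re-peering settle
    ceph -s
    ceph health detail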

Re: [ceph-users] Access rights of /var/lib/ceph with Jewel

2017-07-10 Thread Brady Deetz
From a least privilege standpoint, o=rx seems bad. Instead, if you need a user to have rx, why not set a default acl on each osd to allow Nagios to have rx? I think it's designed according to best practice. If a user wishes to accept additional risk, that's their risk. On Jul 10, 2017 8:10 AM, "Jens
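
A sketch of that ACL-based approach, assuming a local "nagios" user (adjust user and path as needed):

    # give nagios read/traverse on the restricted top-level directory
    setfacl -m u:nagios:rx /var/lib/ceph
    # optionally add a default ACL so newly created subdirectories inherit it
    setfacl -d -m u:nagios:rx /var/lib/ceph
    getfacl /var/lib/ceph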

Re: [ceph-users] Access rights of /var/lib/ceph with Jewel

2017-07-10 Thread Jens Rosenboom
2017-07-10 10:40 GMT+00:00 Christian Balzer : > On Mon, 10 Jul 2017 11:27:26 +0200 Marc Roos wrote: > >> Looks to me by design (from rpm install), and the settings of the >> directories below are probably the result of a user umask setting. > > I know it's deliberate, I'm asking why.

Re: [ceph-users] Degraded objects while OSD is being added/filled

2017-07-10 Thread Eino Tuominen
[replying to my post] In fact, I did just this: 1. On a HEALTH_OK cluster, run ceph osd in 245 2. Wait for the cluster to stabilise 3. Witness this: cluster 0a9f2d69-5905-4369-81ae-e36e4a791831 health HEALTH_WARN 385 pgs backfill_wait 1 pgs backfilling

Re: [ceph-users] Degraded objects while OSD is being added/filled

2017-07-10 Thread Eino Tuominen
Hi Greg, I was not clear enough. First I set the weight to 0 (ceph osd out), I waited until the cluster was stable and healthy (all pgs active+clean). Then I went and removed the now empty osds. That was when I saw degraded objects. I'm soon about to add some new disks to the cluster. I can
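
For context, the drain-and-remove sequence described above looks roughly like this (osd.245 and the systemd unit name are just examples):

    ceph osd out 245                 # stop placing data on it, wait for active+clean
    systemctl stop ceph-osd@245      # once the OSD is empty
    ceph osd crush remove osd.245    # changes crush weights again, which is where the degraded objects showed up
    ceph auth del osd.245
    ceph osd rm 245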

Re: [ceph-users] Access rights of /var/lib/ceph with Jewel

2017-07-10 Thread Christian Balzer
On Mon, 10 Jul 2017 11:27:26 +0200 Marc Roos wrote: > Looks to me by design (from rpm install), and the settings of the > directories below are probably the result of a user umask setting. I know it's deliberate, I'm asking why. > Anyway > I can imagine that it is not nagios business to

Re: [ceph-users] MDSs have different mdsmap epoch

2017-07-10 Thread John Spray
On Mon, Jul 10, 2017 at 7:44 AM, TYLin wrote: > Hi all, > > We have a cluster whose fsmap and mdsmap have different value. Also, each mds > has different mdsmap epoch. Active mds has epoch 52, and other two standby > mds have 53 and 55, respectively. Why are the mdsmap epoch

[ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Marc Roos
I need a little help with fixing some errors I am having. After upgrading from Kraken I'm getting incorrect values reported on placement groups etc. At first I thought it was because I was changing the public cluster IP address range and modifying the monmap directly. But after deleting and
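
For comparing the reported numbers, these are the usual views (nothing cluster-specific assumed):

    ceph -s
    ceph pg stat
    ceph osd df
    ceph df detail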

[ceph-users] Ceph MeetUp Berlin on July 17

2017-07-10 Thread Robert Sander
Hi, https://www.meetup.com/de-DE/Ceph-Berlin/events/240812906/ Come join us for an introduction into Ceph and DESY including a tour of their data center and photo injector test facility. Regards -- Robert Sander Heinlein Support GmbH Schwedter Str. 8/9b, 10119 Berlin

Re: [ceph-users] MON daemons fail after creating bluestore osd with block.db partition (luminous 12.1.0-1~bpo90+1 )

2017-07-10 Thread Thomas Gebhardt
Hello, Thomas Gebhardt wrote on 07.07.2017 at 17:21: > ( e.g., > ceph-deploy osd create --bluestore --block-db=/dev/nvme0bnp1 node1:/dev/sdi > ) just noticed that there was a typo in the block-db device name (/dev/nvme0bnp1 -> /dev/nvme0n1p1). After fixing that misspelling my cookbook worked
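
For the archive, the corrected invocation with the fixed device name:

    ceph-deploy osd create --bluestore --block-db=/dev/nvme0n1p1 node1:/dev/sdi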

[ceph-users] MDSs have different mdsmap epoch

2017-07-10 Thread TYLin
Hi all, We have a cluster whose fsmap and mdsmap have different values. Also, each mds has a different mdsmap epoch. The active mds has epoch 52, and the other two standby mds have 53 and 55, respectively. Why are the mdsmap epochs of each mds different? Our cluster: ceph 11.2.0, 3 nodes. Each node has a
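
Commands that may help compare what each daemon sees; the mds name below is a placeholder:

    ceph mds stat
    ceph fs dump                     # fsmap / per-filesystem mdsmap as the mons see it
    ceph daemon mds.<name> status    # run on the host of each mds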

[ceph-users] Access rights of /var/lib/ceph with Jewel

2017-07-10 Thread Christian Balzer
Hello, With Jewel /var/lib/ceph has these permissions: "drwxr-x---", while every directory below it still has the world-accessible bit set. This makes it impossible (by default) for nagios and other non-root bits to determine the disk usage, for example. Any rhyme or reason for this decision?
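
For illustration, the kind of check that fails with those permissions (paths assume a default OSD layout):

    stat -c '%A %U:%G %n' /var/lib/ceph
    # a non-root monitoring user cannot traverse the directory at all:
    sudo -u nagios du -sh /var/lib/ceph/osd/ceph-0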