Re: [ceph-users] Is it possible to suggest the active MDS to move to a datacenter ?

2018-03-29 Thread Nicolas Huillard
Thanks for your answer. On Thursday, March 29, 2018 at 13:51 -0700, Patrick Donnelly wrote: > On Thu, Mar 29, 2018 at 1:02 PM, Nicolas Huillard .fr> wrote: > > I manage my 2 datacenters with Pacemaker and Booth. One of them is > > the > > publicly-known one, thanks to Booth. > >

Re: [ceph-users] Bluestore and scrubbing/deep scrubbing

2018-03-29 Thread Alex Gorbachev
On Thu, Mar 29, 2018 at 5:09 PM, Ronny Aasen wrote: > On 29.03.2018 20:02, Alex Gorbachev wrote: >> >> On the new Luminous 12.2.4 cluster with Bluestore, I see a good deal >> of scrub and deep scrub operations. Tried to find a reference, but >> nothing obvious out there - was

[ceph-users] Bluestore caching, flawed by design?

2018-03-29 Thread Christian Balzer
Hello, my crappy test cluster was rendered inoperable by an IP renumbering that wasn't planned and was forced on me during a DC move, so I decided to start from scratch and explore the fascinating world of Luminous/bluestore and all the assorted bugs. ^_- (yes I could have recovered the cluster
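
For context, BlueStore's cache is sized per OSD through ceph.conf options; the sketch below only illustrates the knobs involved (option names are the Luminous ones, the values are examples, not recommendations):

  [osd]
  # total BlueStore cache per OSD; the _hdd/_ssd variants set the default per device type
  bluestore_cache_size_hdd = 1073741824   # 1 GiB on HDD-backed OSDs
  bluestore_cache_size_ssd = 3221225472   # 3 GiB on SSD-backed OSDs
  # fraction of that cache reserved for RocksDB key/value data
  bluestore_cache_kv_ratio = 0.5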

Re: [ceph-users] Can't get MDS running after a power outage

2018-03-29 Thread Yan, Zheng
On Thu, Mar 29, 2018 at 3:16 PM, Zhang Qiang wrote: > Hi, > > Ceph version 10.2.3. After a power outage, I tried to start the MDS > daemons, but they got stuck forever replaying journals, I had no idea why > they were taking that long, because this is just a small cluster for >

Re: [ceph-users] 1 mon unable to join the quorum

2018-03-29 Thread Brad Hubbard
2018-03-19 11:03:50.819493 7f842ed47640 0 mon.controller02 does not exist in monmap, will attempt to join an existing cluster 2018-03-19 11:03:50.820323 7f842ed47640 0 starting mon.controller02 rank -1 at 172.18.8.6:6789/0 mon_data /var/lib/ceph/mon/ceph-controller02 fsid
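
When a mon reports that it is not in the monmap, a hedged way to inspect and repair the map is sketched below (mon name and address are taken from the log line above; the stray mon must be stopped before injecting):

  # fetch the monmap from the running quorum and inspect it
  ceph mon getmap -o /tmp/monmap
  monmaptool --print /tmp/monmap
  # if mon.controller02 is really missing, add it and inject the map into the stopped daemon
  monmaptool --add controller02 172.18.8.6:6789 /tmp/monmap
  ceph-mon -i controller02 --inject-monmap /tmp/monmap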

Re: [ceph-users] Ceph recovery kill VM's even with the smallest priority

2018-03-29 Thread Damian Dabrowski
Greg, thanks for your reply! I think your idea makes sense; I did tests and it's quite hard for me to understand. I'll try to explain my situation in a few steps below. I think that Ceph is showing progress in recovery, but it can only resolve objects which didn't really change. It won't try to

Re: [ceph-users] Bluestore and scrubbing/deep scrubbing

2018-03-29 Thread Ronny Aasen
On 29.03.2018 20:02, Alex Gorbachev wrote: On the new Luminous 12.2.4 cluster with Bluestore, I see a good deal of scrub and deep scrub operations. Tried to find a reference, but nothing obvious out there - wasn't it supposed to no longer need scrubbing due to CRC checks? crc gives you checks as
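
A minimal sketch of how scrub load is usually paused or throttled on a Luminous cluster (the option names are standard; the values are only examples):

  # temporarily stop new scrubs cluster-wide
  ceph osd set noscrub
  ceph osd set nodeep-scrub
  # ...and re-enable them later
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub
  # or keep scrubbing but slow it down and confine it to off-peak hours
  ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_scrub_begin_hour 22 --osd_scrub_end_hour 6'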

Re: [ceph-users] Is it possible to suggest the active MDS to move to a datacenter ?

2018-03-29 Thread Patrick Donnelly
On Thu, Mar 29, 2018 at 1:02 PM, Nicolas Huillard wrote: > Hi, > > I manage my 2 datacenters with Pacemaker and Booth. One of them is the > publicly-known one, thanks to Booth. > Whatever the "public datacenter", Ceph is a single storage cluster. > Since most of the cephfs

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Alfredo Deza
On Thu, Mar 29, 2018 at 4:12 PM, Steven Vacaroaia wrote: > Thanks for pointing me to the root cause of the issue > > I did this > dd if=/dev/zero of=/dev/sdc bs=4096k count=100;sgdisk --zap-all --clear > --mbrtogpt -g -- /dev/sdc > but it was not enough > > fdisk -l /dev/sdc >

Re: [ceph-users] All pools full after one OSD got OSD_FULL state

2018-03-29 Thread Mike Lovell
On Thu, Mar 29, 2018 at 1:17 AM, Jakub Jaszewski wrote: > Many thanks Mike, that justifies the stopped IOs. I've just finished adding > new disks to the cluster and am now trying to evenly reweight OSDs by PG. > > May I ask you two more questions? > 1. As I was in a hurry I did not
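
"Reweight by PG" here presumably refers to the built-in helpers; a hedged sketch (110 is the overload threshold in percent and purely an example):

  # dry-run first: show what would change without applying anything
  ceph osd test-reweight-by-pg 110
  ceph osd test-reweight-by-utilization 110
  # apply once the dry-run output looks sane
  ceph osd reweight-by-pg 110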

Re: [ceph-users] PGs stuck activating after adding new OSDs

2018-03-29 Thread Jon Light
I let the 2 working OSDs backfill over the last couple of days and today I was able to add 7 more OSDs before getting PGs stuck activating. Below are the OSD and health outputs after adding an 8th OSD and getting activating PGs. ceph osd df tree ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Steven Vacaroaia
Thanks for pointing me to the root cause of the issue. I did this: dd if=/dev/zero of=/dev/sdc bs=4096k count=100;sgdisk --zap-all --clear --mbrtogpt -g -- /dev/sdc but it was not enough. fdisk -l /dev/sdc WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at

[ceph-users] Is it possible to suggest the active MDS to move to a datacenter ?

2018-03-29 Thread Nicolas Huillard
Hi, I manage my 2 datacenters with Pacemaker and Booth. One of them is the publicly-known one, thanks to Booth. Whatever the "public datacenter", Ceph is a single storage cluster. Since most of the cephfs traffic comes from this "public datacenter", I'd like to suggest or force the active MDS to
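
This release has no rank-affinity setting for MDS placement, so one hedged workaround is to fail the active MDS whenever it ends up in the wrong datacenter and let the standby in the preferred one take over (the MDS name below is a placeholder):

  # see which daemon is active and which are standby
  ceph fs status
  ceph mds stat
  # force a failover: the named daemon drops back to standby and an
  # existing standby (ideally the one in the preferred DC) becomes active
  ceph mds fail mds-dc2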

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Alfredo Deza
Seems like the partition table is still around even after calling sgdisk --zap-all On Thu, Mar 29, 2018 at 2:18 PM, Steven Vacaroaia wrote: > Thanks for your willingness to help > > No multipathing > > Here are the commands and their output > > [root@osd02 ~]# sgdisk --zap-all

Re: [ceph-users] split brain case

2018-03-29 Thread Ronny Aasen
On 29.03.2018 11:13, ST Wong (ITSC) wrote: Hi, Thanks. > of course the 4 OSDs left working now want to self-heal by recreating all objects stored on the 4 split-off OSDs and have a huge recovery job. and you may risk that the OSDs go into a too_full error, unless you have free space in

Re: [ceph-users] One object degraded cause all ceph requests hang - Jewel 10.2.6 (rbd + radosgw)

2018-03-29 Thread Rudenko Aleksandr
Thank you Vincent, it's very helpful for me! On 11 Jan 2018, at 14:24, Vincent Godin > wrote: As no response was given, I will explain what I found; maybe it could help other people. The .dirXXX object is an index marker with a 0 data

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Steven Vacaroaia
Thanks for your willingness to help No multipathing Here are the commands and their output [root@osd02 ~]# sgdisk --zap-all --clear --mbrtogpt -g -- /dev/sdc GPT data structures destroyed! You may now partition the disk using fdisk or other utilities. The operation has completed successfully.
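
When sgdisk alone is not enough, a more thorough wipe along these lines usually is (device name taken from the thread; this destroys everything on /dev/sdc):

  # remove filesystem/LVM/RAID signatures in addition to the GPT/MBR
  wipefs --all /dev/sdc
  # zero the start of the disk where partition tables and old metadata live
  dd if=/dev/zero of=/dev/sdc bs=1M count=100 oflag=direct
  # zap again and let ceph-volume clear any leftover LVM state
  sgdisk --zap-all /dev/sdc
  ceph-volume lvm zap /dev/sdc
  # make the kernel forget the old partition layout
  partprobe /dev/sdc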

[ceph-users] Bluestore and scrubbing/deep scrubbing

2018-03-29 Thread Alex Gorbachev
On the new Luminous 12.2.4 cluster with Bluestore, I see a good deal of scrub and deep scrub operations. Tried to find a reference, but nothing obvious out there - wasn't it supposed to no longer need scrubbing due to CRC checks? Thanks for any clarification. -- Alex Gorbachev Storcium

Re: [ceph-users] session lost, hunting for new mon / session established : every 30s until unmount/remount

2018-03-29 Thread Nicolas Huillard
On Wednesday, March 28, 2018 at 15:57 -0700, Jean-Charles Lopez wrote: > if I read you correctly you have 3 MONs on each data center. This > means that when the link goes down you will lose quorum making the > cluster unavailable. > > If my perception is correct, you'd have to start a 7th MON

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Alfredo Deza
On Thu, Mar 29, 2018 at 10:25 AM, Steven Vacaroaia wrote: > Hi, > > I am unable to create OSD because " Device /dev/sdc not found (or ignored by > filtering)." Is that device part of a multipath setup? Check if it isn't blacklisted in /etc/multipath.conf or in /etc/lvm/lvm.conf
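
A quick way to check the filtering mentioned above (stock file locations; adjust to your setup):

  # is the device hidden behind multipath?
  multipath -ll
  lsblk -o NAME,TYPE,SIZE,MOUNTPOINT /dev/sdc
  # does LVM's filter exclude it?
  grep -E '^[[:space:]]*(global_)?filter' /etc/lvm/lvm.conf
  # is it blacklisted for multipath?
  grep -A5 blacklist /etc/multipath.conf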

Re: [ceph-users] Ceph recovery kill VM's even with the smallest priority

2018-03-29 Thread Gregory Farnum
On Thu, Mar 29, 2018 at 7:27 AM Damian Dabrowski wrote: > Hello, > > A few days ago I had a very strange situation. > > I had to turn off a few OSDs for a while. So I've set the flags noout, > nobackfill, norecover and then turned off the selected OSDs. > All was ok, but when I started

[ceph-users] Ceph recovery kill VM's even with the smallest priority

2018-03-29 Thread Damian Dabrowski
Hello, A few days ago I had a very strange situation. I had to turn off a few OSDs for a while, so I set the flags noout, nobackfill, and norecover and then turned off the selected OSDs. All was OK, but when I started these OSDs again all VMs went down due to the recovery process (even when recovery priority was
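
The usual way to keep recovery from starving client I/O is to throttle it at runtime; a minimal sketch with example values (injectargs applies immediately, ceph.conf makes it persistent across restarts):

  # slow recovery/backfill down so client I/O keeps priority
  ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_sleep 0.1'
  # lower the relative priority of recovery ops vs client ops
  ceph tell osd.* injectargs '--osd_recovery_op_priority 1'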

[ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Steven Vacaroaia
Hi, I am unable to create an OSD because "Device /dev/sdc not found (or ignored by filtering)." I tried using ceph-volume (on the host) as well as ceph-deploy (on the admin node). The device is definitely there. Any suggestions will be greatly appreciated. Note: I created the block-db and

Re: [ceph-users] ceph mgr balancer bad distribution

2018-03-29 Thread shadow_lin
Hi Stefan, > On 28.02.2018 at 13:47, Stefan Priebe - Profihost AG wrote: >> Hello, >> >> with jewel we always used the python crush optimizer which gave us a >> pretty good distribution of the used space. >> You mentioned a python crush optimizer for jewel. Could you tell me where I can find it?
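
On Luminous the closest equivalent to that external optimizer is the mgr balancer module; a minimal sketch of turning it on (crush-compat mode keeps older clients compatible):

  ceph mgr module enable balancer
  ceph balancer mode crush-compat
  ceph balancer on
  ceph balancer status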

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Dan van der Ster
Guys, Ceph does not have a concept of "osd quorum" or "electing a primary PG". The mons are in a PAXOS quorum, and the mon leader decides which OSD is primary for each PG. No need to worry about a split OSD brain. -- dan On Thu, Mar 29, 2018 at 2:51 PM, Peter Linder
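
The quorum and the per-PG primary choice mentioned here can both be inspected directly (the PG id below is a placeholder):

  # which mons are in quorum and which one is the leader
  ceph quorum_status --format json-pretty
  # the up/acting set and primary OSD for a given PG
  ceph pg map 1.2f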

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Peter Linder
On 2018-03-29 at 14:26, David Rabel wrote: On 29.03.2018 13:50, Peter Linder wrote: On 2018-03-29 at 12:29, David Rabel wrote: On 29.03.2018 12:25, Janne Johansson wrote: 2018-03-29 11:50 GMT+02:00 David Rabel : You are right. But with my above example: If I have

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Peter Linder
On 2018-03-29 at 12:29, David Rabel wrote: On 29.03.2018 12:25, Janne Johansson wrote: 2018-03-29 11:50 GMT+02:00 David Rabel : You are right. But with my above example: If I have min_size 2 and size 4, and because of a network issue the 4 OSDs are split into 2 and 2,

Re: [ceph-users] where is it possible download CentOS 7.5

2018-03-29 Thread Marc Roos
My idea also -Original Message- From: Jason Dillaman [mailto:jdill...@redhat.com] Sent: Thursday, 29 March 2018 13:51 To: Max Cuttins Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] where is it possible download CentOS 7.5 On Wed, Mar 28, 2018 at 11:44 AM, Max Cuttins

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread David Rabel
On 29.03.2018 13:50, Peter Linder wrote: > On 2018-03-29 at 12:29, David Rabel wrote: >> On 29.03.2018 12:25, Janne Johansson wrote: >>> 2018-03-29 11:50 GMT+02:00 David Rabel : You are right. But with my above example: If I have min_size 2 and size 4, and because

Re: [ceph-users] Can't get MDS running after a power outage

2018-03-29 Thread Webert de Souza Lima
I'd also try to boot up only one MDS until it's fully up and running, not both of them. Sometimes they keep switching states between each other. Regards, Webert Lima DevOps Engineer at MAV Tecnologia *Belo Horizonte - Brasil* *IRC NICK - WebertRLZ* On Thu, Mar 29, 2018 at 7:32 AM, John Spray
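
A hedged sketch of "boot only one MDS" on systemd hosts (the unit name is a placeholder for the standby's id):

  # keep the second MDS down while the first one replays and goes active
  systemctl stop ceph-mds@mds02
  # watch the remaining MDS walk through replay -> reconnect -> rejoin -> active
  ceph mds stat
  ceph -s
  # once it is active, bring the standby back
  systemctl start ceph-mds@mds02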

Re: [ceph-users] where is it possible download CentOS 7.5

2018-03-29 Thread Jason Dillaman
On Wed, Mar 28, 2018 at 11:44 AM, Max Cuttins wrote: > Hi Jason, > > I really don't want to stress this more than I already have. > But I need to have a clear answer. At this point I'm not sure if you are just trolling the list since I believe all your questions and more have

Re: [ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Jakub Jaszewski
On Thu, Mar 29, 2018 at 12:25 PM, Janne Johansson wrote: > > > 2018-03-29 11:50 GMT+02:00 David Rabel : > >> On 29.03.2018 11:43, Janne Johansson wrote: >> > 2018-03-29 11:39 GMT+02:00 David Rabel : >> > >> >> For example a

Re: [ceph-users] cephfs performance issue

2018-03-29 Thread Ouyang Xu
Hi David: That works, thank you very much! Best regards, Steven On 2018-03-29 18:30, David C wrote: Pretty sure you're getting stung by: http://tracker.ceph.com/issues/17563 Consider using an elrepo kernel, 4.14 works well for me. On Thu, 29 Mar 2018, 09:46 Dan van der Ster,

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread David Rabel
On 29.03.2018 12:25, Janne Johansson wrote: > 2018-03-29 11:50 GMT+02:00 David Rabel : >> You are right. But with my above example: If I have min_size 2 and size >> 4, and because of a network issue the 4 OSDs are split into 2 and 2, is >> it possible that I have write

Re: [ceph-users] Can't get MDS running after a power outage

2018-03-29 Thread John Spray
On Thu, Mar 29, 2018 at 8:16 AM, Zhang Qiang wrote: > Hi, > > Ceph version 10.2.3. After a power outage, I tried to start the MDS > daemons, but they got stuck forever replaying journals, I had no idea why > they were taking that long, because this is just a small cluster for >

Re: [ceph-users] cephfs performance issue

2018-03-29 Thread David C
Pretty sure you're getting stung by: http://tracker.ceph.com/issues/17563 Consider using an elrepo kernel, 4.14 works well for me. On Thu, 29 Mar 2018, 09:46 Dan van der Ster, wrote: > On Thu, Mar 29, 2018 at 10:31 AM, Robert Sander >

Re: [ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Janne Johansson
2018-03-29 11:50 GMT+02:00 David Rabel : > On 29.03.2018 11:43, Janne Johansson wrote: > > 2018-03-29 11:39 GMT+02:00 David Rabel : > > > >> For example a replicated pool with size 4: Do I always have to set the > >> min_size to 3? Or is there a way to

Re: [ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread David Rabel
On 29.03.2018 11:43, Janne Johansson wrote: > 2018-03-29 11:39 GMT+02:00 David Rabel : > >> For example a replicated pool with size 4: Do I always have to set the >> min_size to 3? Or is there a way to use min_size 2 and use some other >> node as a decision maker in case of

Re: [ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Janne Johansson
2018-03-29 11:39 GMT+02:00 David Rabel : > Hi there. > > Are there possibilities to prevent OSD split-brain in a replicated pool > with an even size? Or do you always have to make min_size big enough to > cover this? > > For example a replicated pool with size 4: Do I always

[ceph-users] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread David Rabel
Hi there. Are there possibilities to prevent OSD split-brain in a replicated pool with an even size? Or do you always have to make min_size big enough to cover this? For example a replicated pool with size 4: Do I always have to set the min_size to 3? Or is there a way to use min_size 2 and use
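
size and min_size are per-pool settings; a minimal sketch of checking and changing them for the example above (the pool name is a placeholder):

  ceph osd pool get rbd size
  ceph osd pool get rbd min_size
  # with size 4, min_size 3 means any writable subset must hold a strict majority
  ceph osd pool set rbd min_size 3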

Re: [ceph-users] 1 mon unable to join the quorum

2018-03-29 Thread Julien Lavesque
Hi Brad, The results have been uploaded on the tracker (https://tracker.ceph.com/issues/23403) Julien On 29/03/2018 07:54, Brad Hubbard wrote: Can you update with the result of the following commands from all of the MONs? # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok

Re: [ceph-users] split brain case

2018-03-29 Thread ST Wong (ITSC)
Hi, Thanks. > of course the 4 OSDs left working now want to self-heal by recreating all > objects stored on the 4 split-off OSDs and have a huge recovery job. and you > may risk that the OSDs go into a too_full error, unless you have free space > in your OSDs to recreate all the data in the

Re: [ceph-users] split brain case

2018-03-29 Thread Richard Hesketh
On 29/03/18 09:25, ST Wong (ITSC) wrote: > Hi all, > > We put 8 (4+4) OSD and 5 (2+3) MON servers in server rooms in 2 buildings for > redundancy.  The buildings are connected through direct connection. > > While servers in each building have alternate uplinks.   What will happen in > case the

Re: [ceph-users] split brain case

2018-03-29 Thread Ronny Aasen
On 29.03.2018 10:25, ST Wong (ITSC) wrote: Hi all, We put 8 (4+4) OSD and 5 (2+3) MON servers in server rooms in 2 buildings for redundancy.  The buildings are connected through direct connection. While servers in each building have alternate uplinks.   What will happen in case the link

Re: [ceph-users] cephfs performance issue

2018-03-29 Thread Dan van der Ster
On Thu, Mar 29, 2018 at 10:31 AM, Robert Sander wrote: > On 29.03.2018 09:50, ouyangxu wrote: > >> I'm using Ceph 12.2.4 with CentOS 7.4, and trying to use cephfs for >> MariaDB deployment, > > Don't do this. > As the old saying goes: If it hurts, stop doing it. Why

Re: [ceph-users] split brain case

2018-03-29 Thread Robert Sander
On 29.03.2018 10:25, ST Wong (ITSC) wrote: > While servers in each building have alternate uplinks.   What will > happen in case the link between the buildings is broken (application > servers in each server room will continue to write to OSDs in the same > room) ? The side with the lesser

Re: [ceph-users] cephfs performance issue

2018-03-29 Thread Robert Sander
On 29.03.2018 09:50, ouyangxu wrote: > I'm using Ceph 12.2.4 with CentOS 7.4, and trying to use cephfs for > MariaDB deployment, Don't do this. As the old saying goes: If it hurts, stop doing it. Regards -- Robert Sander Heinlein Support GmbH Schwedter Str. 8/9b, 10119 Berlin

Re: [ceph-users] Upgrading ceph and mapped rbds

2018-03-29 Thread Robert Sander
On 28.03.2018 11:36, Götz Reinicke wrote: > My question is: How to proceed with the servers which map the rbds? Do you intend to upgrade the kernels on these RBD clients acting as NFS servers? If so you have to plan a reboot anyway. If not, nothing changes. Or are you using qemu+rbd in

[ceph-users] split brain case

2018-03-29 Thread ST Wong (ITSC)
Hi all, We put 8 (4+4) OSD and 5 (2+3) MON servers in server rooms in 2 buildings for redundancy. The buildings are connected through a direct link, while servers in each building have alternate uplinks. What will happen in case the link between the buildings is broken (application

Re: [ceph-users] [rgw] civetweb behind haproxy doesn't work with absolute URI

2018-03-29 Thread Sean Purdy
We had something similar recently. We had to disable "rgw dns name" in the end. Sean On Thu, 29 Mar 2018, Rudenko Aleksandr said: > > Hi friends. > > > I'm sorry, maybe it isn't a bug, but I don't know how to solve this problem. > > I know that absolute URIs are supported in civetweb and it

[ceph-users] cephfs performance issue

2018-03-29 Thread ouyangxu
Hi Ceph users: I'm using Ceph 12.2.4 with CentOS 7.4, and trying to use cephfs for a MariaDB deployment. The configuration is default, but I got very poor performance while creating tables; with the local file system there is no such issue. Here is the SQL script I used: [root@cmv01cn01]$ cat

[ceph-users] [rgw] civetweb behind haproxy doesn't work with absolute URI

2018-03-29 Thread Rudenko Aleksandr
Hi friends. I'm sorry, maybe it isn't a bug, but I don't know how to solve this problem. I know that absolute URIs are supported in civetweb and it works fine for me without haproxy in the middle. But if a client sends absolute URIs through a reverse proxy (haproxy) to civetweb, civetweb breaks
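
For reference, the setting the follow-up mentions lives in the RGW client section of ceph.conf; a hedged sketch (section name and hostname are placeholders, and the name must match the Host header haproxy forwards):

  [client.rgw.gateway1]
  rgw frontends = civetweb port=7480
  # either set this to the public name clients use via haproxy, or remove it
  # entirely (the workaround in this thread) so the Host header isn't matched
  rgw dns name = s3.example.com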

Re: [ceph-users] session lost, hunting for new mon / session established : every 30s until unmount/remount

2018-03-29 Thread Nicolas Huillard
I found this message in the archive: Forwarded message From: Ilya Dryomov To: Дмитрий Глушенок Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] How's cephfs going? Date: Fri, 21 Jul 2017 16:25:40 +0200 On

[ceph-users] Can't get MDS running after a power outage

2018-03-29 Thread Zhang Qiang
Hi, Ceph version 10.2.3. After a power outage, I tried to start the MDS daemons, but they got stuck forever replaying journals. I had no idea why they were taking that long, because this is just a small cluster for testing purposes with only hundreds of MB of data. I restarted them, and the error below was
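
A hedged sketch of the usual first diagnostics for an MDS stuck in replay on this release (the daemon name is a placeholder):

  # what state does the cluster think the MDS is in?
  ceph mds stat
  ceph -s
  # ask the daemon itself through its admin socket
  ceph daemon mds.mds01 status
  # sanity-check the journal being replayed (run on an MDS host)
  cephfs-journal-tool journal inspect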

Re: [ceph-users] session lost, hunting for new mon / session established : every 30s until unmount/remount

2018-03-29 Thread Nicolas Huillard
On Wednesday, March 28, 2018 at 15:57 -0700, Jean-Charles Lopez wrote: > if I read you correctly you have 3 MONs on each data center. This > means that when the link goes down you will lose quorum making the > cluster unavailable. Oh yes, sure. I'm planning to add this 7th MON. I'm not sure the

Re: [ceph-users] PGs stuck activating after adding new OSDs

2018-03-29 Thread Jakub Jaszewski
Hi Jon, can you reweight one OSD to the default value and share the outcome of "ceph osd df tree; ceph -s; ceph health detail"? Recently I was adding a new node, 12x 4TB, one disk at a time, and faced the activating+remapped state for a few hours. Not sure, but maybe that was caused by "osd_max_backfills"
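
On Luminous, PGs stuck activating right after adding OSDs are often the per-OSD PG limit (overdose protection) kicking in; a hedged sketch of checking and temporarily relaxing it (values are examples, and the hard-ratio change may need an OSD restart to take effect):

  # how many PGs each OSD currently holds
  ceph osd df tree
  # the relevant limits (Luminous defaults: 200 and 2.0)
  ceph daemon osd.0 config get mon_max_pg_per_osd
  ceph daemon osd.0 config get osd_max_pg_per_osd_hard_ratio
  # temporary relief while data rebalances; revert afterwards
  ceph tell osd.* injectargs '--osd_max_pg_per_osd_hard_ratio 4'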