Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Thu, Jul 19, 2018 at 12:47 PM, Troy Ablan wrote: > > > On 07/18/2018 06:37 PM, Brad Hubbard wrote: >> On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan wrote: >>> >>> >>> On 07/17/2018 11:14 PM, Brad Hubbard wrote: On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan wrote: > > I was on

Re: [ceph-users] Crush Rules with multiple Device Classes

2018-07-18 Thread Konstantin Shalygin
Now my first question is: 1) Is there a way to specify "take default class (ssd or nvme)"? Then we could just do this for the migration period, and at some point remove "ssd". If multi-device-class in a crush rule is not supported yet, the only workaround which comes to my mind right now
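For reference, a single crush rule cannot currently "take" two device classes at once, so the usual approach is one rule per class and switching the pool over when migrating; a minimal sketch, assuming a metadata pool named cephfs_metadata and the default root:

    # replicated rule restricted to the "nvme" device class, host failure domain
    ceph osd crush rule create-replicated metadata-nvme default host nvme
    # point the metadata pool at the new rule (this triggers data movement)
    ceph osd pool set cephfs_metadata crush_rule metadata-nvme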

Re: [ceph-users] Migrating EC pool to device-class crush rules

2018-07-18 Thread Konstantin Shalygin
So mostly I want to confirm that it is safe to change the crush rule for the EC pool. Changing crush rules for a replicated or EC pool is safe. One thing: when I migrated from multiroot to device classes, I recreated the EC pools and cloned the images with qemu-img for the ec_overwrites feature,
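A hedged sketch of that kind of migration, using example names (ec-hdd profile, rbd_ec_hdd data pool, image1); note that RBD images normally keep their header in a replicated pool and place data on the EC pool via --data-pool:

    # EC profile/pool limited to the hdd device class, with overwrites enabled
    ceph osd erasure-code-profile set ec-hdd k=4 m=2 crush-device-class=hdd
    ceph osd pool create rbd_ec_hdd 128 128 erasure ec-hdd
    ceph osd pool set rbd_ec_hdd allow_ec_overwrites true

    # pre-create the target image on the new data pool, then copy with qemu-img
    rbd create rbd/image1-new --size 100G --data-pool rbd_ec_hdd
    qemu-img convert -p -n -O raw rbd:rbd/image1 rbd:rbd/image1-new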

[ceph-users] Crush Rules with multiple Device Classes

2018-07-18 Thread Oliver Freyermuth
Dear Cephalopodians, we use an SSD-only pool to store the metadata of our CephFS. In the future, we will add a few NVMEs, and in the long-term view, replace the existing SSDs by NVMEs, too. Thinking this through, I came up with three questions which I do not find answered in the docs (yet).

Re: [ceph-users] RAID question for Ceph

2018-07-18 Thread Troy Ablan
On 07/18/2018 07:44 PM, Satish Patel wrote: > If I have 8 OSD drives in a server on a P410i RAID controller (HP), and I > want to make this server an OSD node, how should I > configure the RAID? > > 1. Put all drives in RAID-0? > 2. Put each individual HDD in RAID-0 and create 8 individual

Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Troy Ablan
On 07/18/2018 06:37 PM, Brad Hubbard wrote: > On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan wrote: >> >> >> On 07/17/2018 11:14 PM, Brad Hubbard wrote: >>> >>> On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan wrote: I was on 12.2.5 for a couple weeks and started randomly seeing

[ceph-users] RAID question for Ceph

2018-07-18 Thread Satish Patel
If I have 8 OSD drives in a server on a P410i RAID controller (HP), and I want to make this server an OSD node, how should I configure the RAID? 1. Put all drives in RAID-0? 2. Put each individual HDD in RAID-0, creating 8 individual RAID-0 volumes so the OS can see 8 separate HDD drives? What most people
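When the controller cannot expose raw disks (the P410i has no true HBA/JBOD mode), option 2 is the common workaround; a sketch assuming the hpssacli/ssacli tool, the controller in slot 0, and example drive addresses:

    # one single-drive RAID-0 logical drive per physical disk (repeat per drive)
    hpssacli ctrl slot=0 create type=ld drives=1I:1:1 raid=0
    hpssacli ctrl slot=0 create type=ld drives=1I:1:2 raid=0
    # ... one per remaining disk ...
    hpssacli ctrl slot=0 ld all show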

[ceph-users] ceph rdma + IB network error

2018-07-18 Thread Will Zhao
Hi all: By following the instructions: (https://community.mellanox.com/docs/DOC-2721) (https://community.mellanox.com/docs/DOC-2693) (http://hwchiu.com/2017-05-03-ceph-with-rdma.html) I'm trying to configure CEPH with RDMA feature on environments as follows: CentOS Linux release
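For context, the messenger settings those guides revolve around look roughly like this in ceph.conf (the device name is an environment-specific assumption, and the daemons generally also need an unlimited memlock limit in their systemd units):

    [global]
    ms_type = async+rdma
    ms_async_rdma_device_name = mlx5_0   # adjust to the local IB/RoCE device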

Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan wrote: > > > On 07/17/2018 11:14 PM, Brad Hubbard wrote: >> >> On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan wrote: >>> >>> I was on 12.2.5 for a couple weeks and started randomly seeing >>> corruption, moved to 12.2.6 via yum update on Sunday, and all

Re: [ceph-users] [Ceph-maintainers] v12.2.7 Luminous released

2018-07-18 Thread Linh Vu
Awesome, thank you Sage! With that explanation, it's actually a lot easier and less impactful than I thought. :) Cheers, Linh From: Sage Weil Sent: Thursday, 19 July 2018 9:35:33 AM To: Linh Vu Cc: Stefan Kooman; ceph-de...@vger.kernel.org;

Re: [ceph-users] [Ceph-maintainers] v12.2.7 Luminous released

2018-07-18 Thread Sage Weil
On Wed, 18 Jul 2018, Linh Vu wrote: > Thanks for all your hard work in putting out the fixes so quickly! :) > > We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, > not RGW. In the release notes, it says RGW is a risk especially the > garbage collection, and the

Re: [ceph-users] Fwd: MDS memory usage is very high

2018-07-18 Thread Daniel Carrasco
Thanks again. I was trying to use the fuse client instead of the Ubuntu 16.04 kernel module to see if it is maybe a client-side problem, but CPU usage of the fuse client is very high (100% and even more on a two-core machine), so I had to revert to the kernel client, which uses much less CPU. It is a web server, so maybe

Re: [ceph-users] Fwd: MDS memory usage is very high

2018-07-18 Thread Gregory Farnum
Wow, yep, apparently the MDS has another 9GB of allocated RAM outside of the cache! Hopefully one of the current FS users or devs has some idea. All I can suggest is looking to see if there are a bunch of stuck requests or something that are taking up memory which isn’t properly counted. On Wed,

Re: [ceph-users] Fwd: MDS memory usage is very high

2018-07-18 Thread Daniel Carrasco
Hello, thanks for your response. This is what I get: # ceph tell mds.kavehome-mgto-pro-fs01 heap stats 2018-07-19 00:43:46.142560 7f5a7a7fc700 0 client.1318388 ms_handle_reset on 10.22.0.168:6800/1129848128 2018-07-19 00:43:46.181133 7f5a7b7fe700 0 client.1318391 ms_handle_reset on

Re: [ceph-users] Fwd: MDS memory usage is very high

2018-07-18 Thread Gregory Farnum
The MDS thinks it's using 486MB of cache right now, and while that's not a complete accounting (I believe you should generally multiply the configured cache limit by 1.5 to get a realistic memory consumption model) it's obviously a long way from 12.5GB. You might try going in with the "ceph daemon"
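The asok commands being referred to look roughly like this, run on the MDS host (the mds name is taken from this thread):

    ceph daemon mds.kavehome-mgto-pro-fs01 cache status          # cache usage vs. configured limit
    ceph daemon mds.kavehome-mgto-pro-fs01 dump_mempools          # allocator view per subsystem
    ceph daemon mds.kavehome-mgto-pro-fs01 dump_ops_in_flight     # look for stuck requests
    ceph tell mds.kavehome-mgto-pro-fs01 heap stats               # tcmalloc heap statistics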

[ceph-users] Fwd: MDS memory usage is very high

2018-07-18 Thread Daniel Carrasco
Hello, I've created a 3 nodes cluster with MON, MGR, OSD and MDS on all (2 MDS actives), and I've noticed that MDS is using a lot of memory (just now is using 12.5GB of RAM): # ceph daemon mds.kavehome-mgto-pro-fs01 dump_mempools | jq -c '.mds_co'; ceph daemon mds.kavehome-mgto-pro-fs01 perf dump

[ceph-users] Migrating EC pool to device-class crush rules

2018-07-18 Thread Graham Allan
Like many, we have a typical double root crush map, for hdd vs ssd-based pools. We've been running lumious for some time, so in preparation for a migration to new storage hardware, I wanted to migrate our pools to use the new device-class based rules; this way I shouldn't need to perpetuate

Re: [ceph-users] Need advice on Ceph design

2018-07-18 Thread Satish Patel
Thanks Sebastien. Let me answer all of your questions which I missed out. This is my first cluster, so I have no idea what would be best or worst here. Also, you said we don't need an SSD journal for BlueStore, but I heard people saying the WAL/RocksDB requires an SSD; can you explain? If I have
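BlueStore has no separate journal, but its RocksDB metadata (block.db) and write-ahead log (block.wal) can be placed on a faster device, which is what people usually mean; a minimal sketch assuming ceph-volume and example device names:

    # HDD for data, NVMe/SSD partition for RocksDB (the WAL follows block.db if not given)
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
    # an explicit WAL device can be added with --block.wal /dev/nvme0n1p2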

[ceph-users] Ceph Community Manager

2018-07-18 Thread Sage Weil
Hi everyone, Leo Vaz has moved on from his community manager role. I'd like to take this opportunity to thank him for his efforts over the past year, and to wish him the best in his future ventures. We've accomplished a lot during his tenure (including our first Cephalocon!) and Leo's

Re: [ceph-users] Need advice on Ceph design

2018-07-18 Thread Sébastien VIGNERON
Hello, What is your expected workload? VMs, primary storage, backup, object storage, ...? How many disks do you plan to put in each OSD node? How many CPU cores? How much RAM per node? Ceph access protocol(s): CephFS, RBD or objects? How do you plan to give access to the storage to you

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Alexandre DERUMIER
Hi, qemu uses only 1 thread per disk; generally the performance limitation comes from the CPU (you can have 1 thread for each disk using iothreads). I'm not sure how it works with krbd, but with librbd and the qemu rbd driver it only uses 1 core per disk. So you need to have a fast CPU frequency,
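A hedged sketch of the qemu side of this (a dedicated iothread bound to a virtio disk; drive and pool names are examples):

    qemu-system-x86_64 ... \
        -object iothread,id=iothread1 \
        -drive file=rbd:nvme/vm-disk0,format=raw,if=none,id=drive0,cache=none \
        -device virtio-blk-pci,drive=drive0,iothread=iothread1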

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Jason Dillaman
On Wed, Jul 18, 2018 at 1:08 PM Nikola Ciprich wrote: > > Care to share your "bench-rbd" script (on pastebin or similar)? > sure, no problem.. it's so short I hope nobody will get offended if I > paste it right > here :) > > #!/bin/bash > > #export LD_PRELOAD="/usr/lib64/libtcmalloc.so.4" >

Re: [ceph-users] Exact scope of OSD heartbeating?

2018-07-18 Thread Anthony D'Atri
Thanks, Dan. I thought so but wanted to verify. I'll see if I can work up a doc PR to clarify this. >> The documentation here: >> >> http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/ >> >> says >> >> "Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD

[ceph-users] Need advice on Ceph design

2018-07-18 Thread Satish Patel
I have decided to set up a 5-node Ceph storage cluster and the following is my inventory; just tell me whether it is good enough to start a first cluster for an average load. 0. Ceph Bluestore 1. Journal SSD (Intel DC 3700) 2. OSD disk Samsung 850 Pro 500GB 3. OSD disk SATA 500GB (7.5k RPM) 4. 2x10G NIC (separate public/cluster

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Nikola Ciprich
> Care to share your "bench-rbd" script (on pastebin or similar)? sure, no problem.. it's so short I hope nobody will get offended if I paste it right here :) #!/bin/bash #export LD_PRELOAD="/usr/lib64/libtcmalloc.so.4" numjobs=8 pool=nvme vol=xxx time=30 opts="--randrepeat=1 --ioengine=rbd
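For readers who want to reproduce something similar without the original script (whose options line is truncated above), a stand-alone fio invocation using the rbd engine might look like this; pool, volume, and CephX user names are assumptions:

    fio --name=rbd-bench --ioengine=rbd --clientname=admin --pool=nvme --rbdname=xxx \
        --direct=1 --rw=randwrite --bs=4k --iodepth=32 --numjobs=8 \
        --runtime=30 --time_based --randrepeat=1 --group_reporting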

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Jason Dillaman
On Wed, Jul 18, 2018 at 12:58 PM Nikola Ciprich wrote: > > What's the output from "rbd info nvme/centos7"? > that was it! the parent had some unsupported features > enabled, therefore the child could not be mapped.. > > so the error message is a bit confusing, but now after disabling > the

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Nikola Ciprich
> What's the output from "rbd info nvme/centos7"? that was it! the parent had some unsupported features enabled, therefore the child could not be mapped.. so the error message is a bit confusing, but now after disabling the features on the parent it works for me, thanks! > Odd. The

Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Troy Ablan
On 07/17/2018 11:14 PM, Brad Hubbard wrote: On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan wrote: I was on 12.2.5 for a couple weeks and started randomly seeing corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke loose. I panicked and moved to Mimic, and when that didn't

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Jason Dillaman
On Wed, Jul 18, 2018 at 12:36 PM Nikola Ciprich wrote: > Hi Jason, > > > Just to clarify: modern / rebased krbd block drivers definitely support > > layering. The only missing features right now are object-map/fast-diff, > > deep-flatten, and journaling (for RBD mirroring). > > I thought it

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Nikola Ciprich
Hi Jason, > Just to clarify: modern / rebased krbd block drivers definitely support > layering. The only missing features right now are object-map/fast-diff, > deep-flatten, and journaling (for RBD mirroring). I thought so as well, but at least mapping a clone does not work for me even under

[ceph-users] is upgrade from 12.2.5 to 12.2.7 an emergency for EC users

2018-07-18 Thread Brady Deetz
I'm trying to determine if I need to perform an emergency update on my 2PB CephFS environment running on EC. What triggers the corruption bug? Is it only at the time of an OSD restart before data is quiesced? When do you know if corruption has occurred? deep-scrub?
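For context, this class of corruption generally shows up as inconsistent placement groups during (deep) scrub; a hedged sketch of how one would usually check, with an example PG id:

    ceph health detail                                      # lists any inconsistent PGs
    ceph pg deep-scrub 1.2f                                 # manually deep-scrub a suspect PG
    rados list-inconsistent-obj 1.2f --format=json-pretty   # inspect the mismatching objects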

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Jason Dillaman
On Wed, Jul 18, 2018 at 10:55 AM Nikola Ciprich wrote: > Hi, > > historically I've found many discussions about this topic in > last few years, but it seems to me to be still a bit unresolved > so I'd like to open the question again.. > > In all flash deployments, under 12.2.5 luminous and qemu

Re: [ceph-users] 10.2.6 upgrade

2018-07-18 Thread Glen Baars
Hello Sage, Thanks for the response. I am fairly new to Ceph. Are there any commands that would help confirm the issue? Kind regards, Glen Baars

[ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Nikola Ciprich
Hi, historically I've found many discussions about this topic in the last few years, but it seems to me to be still a bit unresolved, so I'd like to open the question again.. In all-flash deployments, under 12.2.5 luminous and qemu 12.2.0 using librbd, I'm getting much worse results regarding IOPS

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Nicolas Huillard
Hi all, This is just to report that I just upgraded smoothly from 12.2.6 to 12.2.7 (bluestore only, bitten by the "damaged mds" consequence of the bad checksum on mds journal 0x200). This was a really bad problem for CephFS. Fortunately, that cluster was not in production yet (that's why I didn't

Re: [ceph-users] 10.2.6 upgrade

2018-07-18 Thread Sage Weil
On Wed, 18 Jul 2018, Glen Baars wrote: > Hello Ceph Users, > > We installed 12.2.6 on a single node in the cluster (new node added, 80TB > moved) > Disabled scrub/deep-scrub once the issues with 12.2.6 were discovered. > > > Today we upgraded the one affected node to 12.2.7, set osd skip

[ceph-users] 10.2.6 upgrade

2018-07-18 Thread Glen Baars
Hello Ceph Users, We installed 12.2.6 on a single node in the cluster (new node added, 80TB moved) and disabled scrub/deep-scrub once the issues with 12.2.6 were discovered. Today we upgraded the one affected node to 12.2.7, set osd skip data digest = true, and re-enabled the scrubs. It's a
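For reference, a setting like that is usually applied both persistently and to the running OSDs, roughly like this (a sketch only, not the full 12.2.7 upgrade procedure):

    # ceph.conf on the OSD nodes
    [osd]
    osd skip data digest = true

    # push it to already-running OSDs without a restart
    ceph tell osd.* injectargs '--osd_skip_data_digest=true'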

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Oliver Freyermuth
Am 18.07.2018 um 16:20 schrieb Sage Weil: > On Wed, 18 Jul 2018, Oliver Freyermuth wrote: >> Am 18.07.2018 um 14:20 schrieb Sage Weil: >>> On Wed, 18 Jul 2018, Linh Vu wrote: Thanks for all your hard work in putting out the fixes so quickly! :) We have a cluster on 12.2.5 with

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Sage Weil
On Wed, 18 Jul 2018, Oliver Freyermuth wrote: > Am 18.07.2018 um 14:20 schrieb Sage Weil: > > On Wed, 18 Jul 2018, Linh Vu wrote: > >> Thanks for all your hard work in putting out the fixes so quickly! :) > >> > >> We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, > >> not

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Oliver Freyermuth
Am 18.07.2018 um 14:20 schrieb Sage Weil: > On Wed, 18 Jul 2018, Linh Vu wrote: >> Thanks for all your hard work in putting out the fixes so quickly! :) >> >> We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, >> not RGW. In the release notes, it says RGW is a risk especially

Re: [ceph-users] multisite and link speed

2018-07-18 Thread Casey Bodley
On Tue, Jul 17, 2018 at 10:16 AM, Robert Stanford wrote: > > I have ceph clusters in a zone configured as active/passive, or > primary/backup. If the network link between the two clusters is slower than > the speed of data coming in to the active cluster, what will eventually > happen? Will

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Sage Weil
On Wed, 18 Jul 2018, Linh Vu wrote: > Thanks for all your hard work in putting out the fixes so quickly! :) > > We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, > not RGW. In the release notes, it says RGW is a risk especially the > garbage collection, and the

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Oliver Freyermuth
Also many thanks from my side! Am 18.07.2018 um 03:04 schrieb Linh Vu: > Thanks for all your hard work in putting out the fixes so quickly! :) > > We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, not > RGW. In the release notes, it says RGW is a risk especially the

Re: [ceph-users] Read/write statistics per RBD image

2018-07-18 Thread Jason Dillaman
Yup, on the host running librbd, you just need to enable the "admin socket" in your ceph.conf and then use "ceph --admin-daemon /path/to/image/admin/socket.asok perf dump" (i.e. not "ceph perf dump"). See the example in this tip window [1] for how to configure for a "libvirt" CephX user. [1]
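A sketch of what that looks like, assuming a "libvirt" CephX user and an example socket filename (librbd generates one socket per client instance):

    # ceph.conf on the hypervisor / librbd client
    [client.libvirt]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/ceph/qemu-guest-$pid.log

    # then query the per-image librbd counters from the socket of a running guest
    ceph --admin-daemon /var/run/ceph/ceph-client.libvirt.12345.94027665735824.asok perf dump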

Re: [ceph-users] Jewel PG stuck inconsistent with 3 0-size objects

2018-07-18 Thread Matthew Vernon
Hi, On 17/07/18 01:29, Brad Hubbard wrote: > Your issue is different since not only do the omap digests of all > replicas not match the omap digest from the auth object info but they > are all different to each other. > > What is min_size of pool 67 and what can you tell us about the events >

Re: [ceph-users] Balancer: change from crush-compat to upmap

2018-07-18 Thread Caspar Smit
Hi Xavier, Not yet, I got a little anxious about changing anything major in the cluster after reading about the 12.2.5 regressions, since I'm also using bluestore and erasure coding. So after this cluster is upgraded to 12.2.7 I'm proceeding forward with this. Kind regards, Caspar 2018-07-16 8:34

Re: [ceph-users] Exact scope of OSD heartbeating?

2018-07-18 Thread Dan van der Ster
On Wed, Jul 18, 2018 at 3:20 AM Anthony D'Atri wrote: > > The documentation here: > > http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/ > > says > > "Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons every 6 > seconds" > > and > > " If a neighboring Ceph

Re: [ceph-users] Read/write statistics per RBD image

2018-07-18 Thread Mateusz Skala (UST, POL)
Thanks for the response. In ‘ceph perf dump’ there are no statistics for read/write operations on a specific RBD image, only for OSD and total client operations. I need to get statistics for one specific RBD image, to find the top-used images. Is that possible? Regards Mateusz From: Jason Dillaman

Re: [ceph-users] v12.2.7 Luminous released

2018-07-18 Thread Caspar Smit
2018-07-18 3:04 GMT+02:00 Linh Vu : > Thanks for all your hard work in putting out the fixes so quickly! :) > > We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS, not > RGW. In the release notes, it says RGW is a risk especially the garbage > collection, and the recommendation

[ceph-users] config ceph with rdma error

2018-07-18 Thread Will Zhao
Hi all: By following the instructions: (https://community.mellanox.com/docs/DOC-2721) (https://community.mellanox.com/docs/DOC-2693) (http://hwchiu.com/2017-05-03-ceph-with-rdma.html) I'm trying to configure CEPH with RDMA feature on environments as follows: CentOS Linux release

Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan wrote: > I was on 12.2.5 for a couple weeks and started randomly seeing > corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke > loose. I panicked and moved to Mimic, and when that didn't solve the > problem, only then did I start