Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Mike Cave
Losing a node is not a big deal for us (dual bonded 10G connection to each node). I’m thinking: 1. Drain node 2. Redeploy with Ceph Ansible. It would require much less hands-on time for our group. I know the churn on the cluster would be high, which was my only concern. Mike Senior
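
A minimal sketch of the drain step, assuming the node's OSD IDs are known (osd.0 through osd.11 are hypothetical here):

    # Mark every OSD on the node out so its data drains to the rest of the cluster
    for id in $(seq 0 11); do ceph osd out osd.$id; done
    # Wait for backfill to finish before tearing the node down
    ceph -s
    ceph osd df tree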

Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Mike Cave
Good points, thank you for the insight. Given that I’m hosting the journals (WAL/block.db) on SSDs, would I need to do all the OSDs hosted on each journal SSD at the same time? I’m fairly sure this would be the case. Senior Systems Administrator Research Computing Services Team University of
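
To see which OSDs share a given journal device, something like the following should help (device name hypothetical; ceph-volume lvm list only knows about LVM-managed OSDs, so on a ceph-disk cluster lsblk may be the more useful view):

    # OSD-to-device mapping for ceph-volume-managed OSDs
    ceph-volume lvm list
    # Or inspect the partition layout of the journal SSD directly
    lsblk /dev/sdx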

Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Martin Verges
I would consider doing it host-by-host wise, as you should always be able to handle the complete loss of a node. This would be much faster in the end as you save a lot of time not migrating data back and forth. However this can lead to problems if your cluster is not configured according to the
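
If going host-by-host, a common guard rail is to suppress rebalancing while a node is being rebuilt (a sketch, not a full procedure):

    ceph osd set noout      # don't start backfilling the moment the OSDs go down
    # ... redeploy the node's OSDs ...
    ceph osd unset noout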

Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Janne Johansson
On Fri 15 Nov 2019 at 19:40, Mike Cave wrote: > So would you recommend doing an entire node at the same time or per-osd? > You should be able to do it per-OSD (or per-disk in case you run more than one OSD per disk), to minimize data movement over the network, letting other OSDs on the same
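
One possible per-OSD pass (OSD ID and device are hypothetical); recreating under the same ID lets the new OSD backfill from replicas without a CRUSH reshuffle:

    systemctl stop ceph-osd@42
    ceph osd destroy 42 --yes-i-really-mean-it
    ceph-volume lvm zap /dev/sdx --destroy
    ceph-volume lvm create --osd-id 42 --data /dev/sdx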

Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Mike Cave
So would you recommend doing an entire node at the same time or per-osd? Senior Systems Administrator Research Computing Services Team University of Victoria O: 250.472.4997 On 2019-11-15, 10:28 AM, "Paul Emmerich" wrote: You'll have to tell LVM about multi-path, otherwise LVM gets

Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Paul Emmerich
You'll have to tell LVM about multi-path, otherwise LVM gets confused. But that should be the only thing. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, Nov
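
One way to do that, assuming a stock /etc/lvm/lvm.conf, is to make LVM ignore the individual component paths (recent LVM versions enable this by default):

    # /etc/lvm/lvm.conf, devices section
    devices {
        multipath_component_detection = 1
        # Alternatively, scan only the multipath devices:
        # filter = [ "a|^/dev/mapper/mpath.*|", "r|.*|" ]
    }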

Re: [ceph-users] Full FLash NVME Cluster recommendation

2019-11-15 Thread Nathan Fish
Bluestore will use about 4 cores, but in my experience, the maximum utilization I've seen has been something like: 100%, 100%, 50%, 50%. So those first 2 cores are the bottleneck for pure OSD IOPS. This sort of pattern isn't uncommon in multithreaded programs. This was on HDD OSDs with DB/WAL on

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread vitalif
Use 30 GB for all OSDs. Other values are pointless; see https://yourcmc.ru/wiki/Ceph_performance#About_block.db_sizing. You can use the rest of the free NVMe space for bcache - it's much better than just allocating it to block.db.
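
A rough sketch of the bcache setup being suggested (bcache-tools; device names hypothetical):

    # Cache device on the spare NVMe partition, backing device on the HDD
    make-bcache -C /dev/nvme0n1p2
    make-bcache -B /dev/sdb
    # Attach the backing device to the cache set (UUID via bcache-super-show)
    echo <cset-uuid> > /sys/block/bcache0/bcache/attach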

[ceph-users] Migrating from block to lvm

2019-11-15 Thread Mike Cave
Greetings all! I am looking at upgrading to Nautilus in the near future (currently on Mimic). We have a cluster built on 480 OSDs all using multipath and simple block devices. I see that the ceph-disk tool is now deprecated and the ceph-volume tool doesn’t do everything that ceph-disk did for
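
Worth noting: ceph-volume has a "simple" subcommand intended to take over existing ceph-disk OSDs without rebuilding them; a hedged sketch (recent releases can scan all running OSDs at once, older ones need each OSD's data path):

    # Capture metadata of the running ceph-disk OSDs into /etc/ceph/osd/*.json
    ceph-volume simple scan
    # Start them via ceph-volume instead of the ceph-disk udev machinery
    ceph-volume simple activate --all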

Re: [ceph-users] Large OMAP Object

2019-11-15 Thread DHilsbos
Wido; Ok, yes, I have tracked it down to the index for one of our buckets. I missed the ID in the ceph df output previously. Next time I'll wait to read replies until I've finished my morning coffee. How would I go about correcting this? The content for this bucket is basically just junk,
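
If the bucket contents really are disposable, one option is to delete the bucket outright; otherwise resharding spreads the index omap keys across more objects (bucket name and shard count are hypothetical):

    # Delete the bucket and all of its objects
    radosgw-admin bucket rm --bucket=junk-bucket --purge-objects
    # Or keep the bucket but reshard its index
    radosgw-admin reshard add --bucket=junk-bucket --num-shards=16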

Re: [ceph-users] Large OMAP Object

2019-11-15 Thread DHilsbos
Paul; I upgraded the cluster in question from 14.2.2 to 14.2.4 just before this came up, so that makes sense. Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message- From:

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Paul Emmerich
To clear up a few misconceptions here: * RBD keyrings should use the "profile rbd" permissions, everything else is *wrong* and should be fixed asap * Manually adding the blacklist permission might work but isn't future-proof, fix the keyring instead * The suggestion to mount them elsewhere to fix
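
The fix Paul describes looks something like this (client name and pool hypothetical):

    ceph auth caps client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes'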

[ceph-users] Mimic - cephfs scrub errors

2019-11-15 Thread Andras Pataki
Dear cephers, We've had a few (dozen or so) rather odd scrub errors in our Mimic (13.2.6) cephfs: 2019-11-15 07:52:52.614 7fffcc41f700  0 log_channel(cluster) log [DBG] : 2.b5b scrub starts 2019-11-15 07:52:55.190 7fffcc41f700 -1 log_channel(cluster) log [ERR] : 2.b5b shard 599 soid
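
A typical way to inspect and repair such a PG, using the PG ID from the log above:

    # Show which object/shard is inconsistent
    rados list-inconsistent-obj 2.b5b --format=json-pretty
    # Then ask the primary to repair it
    ceph pg repair 2.b5b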

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Joshua M. Boniface
Thanks Simon! I've implemented it, I guess I'll test it out next time my homelab's power dies :-) On 2019-11-15 10:54 a.m., Simon Ironside wrote: On 15/11/2019 15:44, Joshua M. Boniface wrote: Hey All: I've also quite frequently experienced this sort of issue with my Ceph RBD-backed

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Simon Ironside
On 15/11/2019 15:44, Joshua M. Boniface wrote: Hey All: I've also quite frequently experienced this sort of issue with my Ceph RBD-backed QEMU/KVM cluster (not OpenStack specifically). Should this workaround of allowing the 'osd blacklist' command in the caps help in that scenario as well, or

Re: [ceph-users] Large OMAP Object

2019-11-15 Thread Paul Emmerich
Note that the size limit changed from 2M keys to 200k keys recently (14.2.3 or 14.2.2 or something), so that object is probably older and that's just the first deep scrub with the reduced limit that triggered the warning. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact
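
For reference, the threshold Paul mentions is a config option, so it can be inspected or raised if the default doesn't suit a deployment (the value shown is only illustrative):

    ceph config get osd osd_deep_scrub_large_omap_object_key_threshold
    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 2000000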

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Paul Emmerich
On Fri, Nov 15, 2019 at 4:39 PM Wido den Hollander wrote: > > > > On 11/15/19 4:25 PM, Paul Emmerich wrote: > > On Fri, Nov 15, 2019 at 4:02 PM Wido den Hollander wrote: > >> > >> I normally use LVM on top > >> of each device and create 2 LVs per OSD: > >> > >> - WAL: 1GB > >> - DB: xx GB > > >

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Joshua M. Boniface
Hey All: I've also quite frequently experienced this sort of issue with my Ceph RBD-backed QEMU/KVM cluster (not OpenStack specifically). Should this workaround of allowing the 'osd blacklist' command in the caps help in that scenario as well, or is this an OpenStack-specific functionality?

Re: [ceph-users] Large OMAP Object

2019-11-15 Thread Wido den Hollander
On 11/15/19 4:35 PM, dhils...@performair.com wrote: > All; > > Thank you for your help so far. I have found the log entries from when the > object was found, but don't see a reference to the pool. > > Here the logs: > 2019-11-14 03:10:16.508601 osd.1 (osd.1) 21 : cluster [DBG] 56.7

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Wido den Hollander
On 11/15/19 4:25 PM, Paul Emmerich wrote: > On Fri, Nov 15, 2019 at 4:02 PM Wido den Hollander wrote: >> >> I normally use LVM on top >> of each device and create 2 LVs per OSD: >> >> - WAL: 1GB >> - DB: xx GB > > Why? I've seen this a few times and I can't figure out what the > advantage of

Re: [ceph-users] Large OMAP Object

2019-11-15 Thread DHilsbos
All; Thank you for your help so far. I have found the log entries from when the object was found, but don't see a reference to the pool. Here are the logs: 2019-11-14 03:10:16.508601 osd.1 (osd.1) 21 : cluster [DBG] 56.7 deep-scrub starts 2019-11-14 03:10:18.325881 osd.1 (osd.1) 22 : cluster

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Paul Emmerich
On Fri, Nov 15, 2019 at 4:02 PM Wido den Hollander wrote: > > I normally use LVM on top > of each device and create 2 LVs per OSD: > > - WAL: 1GB > - DB: xx GB Why? I've seen this a few times and I can't figure out what the advantage of doing this explicitly on the LVM level instead of relying
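
A sketch of the two-LV layout Wido describes (VG/LV names and the DB size are hypothetical):

    vgcreate ceph-db-0 /dev/nvme0n1
    lvcreate -n osd-0-wal -L 1G  ceph-db-0
    lvcreate -n osd-0-db  -L 60G ceph-db-0
    ceph-volume lvm create --data /dev/sda \
        --block.wal ceph-db-0/osd-0-wal --block.db ceph-db-0/osd-0-db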

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread EDH - Manuel Rios Fernandez
Hi, To solve the issue: rbd map pool/disk_id, then mount the / volume on a Linux machine (a Ceph node will be fine); this will flush the journal and discard the pending changes in the OpenStack nodes' cache. Then unmount and rbd unmap. Boot the instance from OpenStack again, and
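
In concrete terms the suggested workaround looks roughly like this (pool and image names hypothetical):

    rbd map volumes/instance-disk    # maps to e.g. /dev/rbd0
    mount /dev/rbd0 /mnt             # mounting replays the filesystem journal
    umount /mnt
    rbd unmap /dev/rbd0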

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Paul Emmerich
On Fri, Nov 15, 2019 at 4:04 PM Kristof Coucke wrote: > > Hi Paul, > > Thank you for the answer. > I hadn't thought of that approach... (Using the NVMe for the metadata pool > of RGW). > > From where do you get the limitation of 1.3TB? 13 OSDs/Server * 10 Servers * 30 GB/OSD usable DB space /

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Kristof Coucke
Hi Paul, Thank you for the answer. I hadn't thought of that approach... (Using the NVMe for the metadata pool of RGW). From where do you get the limitation of 1.3TB? I don't get that one... Br, Kristof On Fri 15 Nov 2019 at 15:26, Paul Emmerich wrote: > On Fri, Nov 15, 2019 at 3:16 PM

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Wido den Hollander
On 11/15/19 3:19 PM, Kristof Coucke wrote: > Hi all, > >   > > We’ve configured a Ceph cluster with 10 nodes, each having 13 large > disks (14TB) and 2 NVMe disks (1,6TB). > > The idea was to use the NVMe as “fast device”… > > The recommendations I’ve read in the online documentation, state

Re: [ceph-users] NVMe disk - size

2019-11-15 Thread Paul Emmerich
On Fri, Nov 15, 2019 at 3:16 PM Kristof Coucke wrote: > We’ve configured a Ceph cluster with 10 nodes, each having 13 large disks > (14TB) and 2 NVMe disks (1,6TB). > The recommendations I’ve read in the online documentation, state that the db > block device should be around 4%~5% of the slow

[ceph-users] NVMe disk - size

2019-11-15 Thread Kristof Coucke
Hi all, We’ve configured a Ceph cluster with 10 nodes, each having 13 large disks (14TB) and 2 NVMe disks (1.6TB). The idea was to use the NVMe as “fast device”… The recommendations I’ve read in the online documentation state that the db block device should be around 4%~5% of the slow device. So,
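
To make the mismatch concrete, the 4% guideline applied to these numbers needs far more NVMe than each node has:

    0.04 x 14 TB = 560 GB of block.db per OSD
    13 OSDs x 560 GB = ~7.3 TB of DB space per node
    available: 2 x 1.6 TB = 3.2 TB of NVMe per node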

[ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Florian Haas
On 15/11/2019 14:27, Simon Ironside wrote: > Hi Florian, > > On 15/11/2019 12:32, Florian Haas wrote: > >> I received this off-list but then subsequently saw this message pop up >> in the list archive, so I hope it's OK to reply on-list? > > Of course, I just clicked the wrong reply button the

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Simon Ironside
Hi Florian, On 15/11/2019 12:32, Florian Haas wrote: I received this off-list but then subsequently saw this message pop up in the list archive, so I hope it's OK to reply on-list? Of course, I just clicked the wrong reply button the first time. So that cap was indeed missing, thanks for

Re: [ceph-users] Beginner question network configuration best practice

2019-11-15 Thread Willi Schiegel
Thank you, your answer helps a lot! On 15.11.19 13:21, Wido den Hollander wrote: On 11/15/19 12:57 PM, Willi Schiegel wrote: Hello All, I'm starting to setup a Ceph cluster and am confused about the recommendations for the network setup. In the Mimic manual I can read "We recommend running

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Florian Haas
On 15/11/2019 11:23, Simon Ironside wrote: > Hi Florian, > > Any chance the key your compute nodes are using for the RBD pool is > missing 'allow command "osd blacklist"' from its mon caps? > > Simon Hi Simon, I received this off-list but then subsequently saw this message pop up in the list

Re: [ceph-users] Node failure -- corrupt memory

2019-11-15 Thread Wido den Hollander
On 11/11/19 2:00 PM, Shawn Iverson wrote: > Hello Cephers! > > I had a node over the weekend go nuts from what appears to have been > failed/bad memory modules and/or motherboard. > > This resulted in several OSDs blocking IO for > 128s (indefinitely). > > I was not watching my alerts too

Re: [ceph-users] Beginner question network configuration best practice

2019-11-15 Thread Wido den Hollander
On 11/15/19 12:57 PM, Willi Schiegel wrote: > Hello All, > > I'm starting to setup a Ceph cluster and am confused about the > recommendations for the network setup. > > In the Mimic manual I can read > > "We recommend running a Ceph Storage Cluster with two networks: a public > (front-side)

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Wido den Hollander
On 11/15/19 11:24 AM, Simon Ironside wrote: > Hi Florian, > > Any chance the key your compute nodes are using for the RBD pool is > missing 'allow command "osd blacklist"' from its mon caps? > Added to this, I recommend using the 'profile rbd' for the mon caps. As also stated in the

[ceph-users] Beginner question network configuration best practice

2019-11-15 Thread Willi Schiegel
Hello All, I'm starting to set up a Ceph cluster and am confused about the recommendations for the network setup. In the Mimic manual I can read "We recommend running a Ceph Storage Cluster with two networks: a public (front-side) network and a cluster (back-side) network." In the Nautilus
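
For reference, the two networks are declared in ceph.conf; a minimal sketch with example subnets:

    [global]
    public network  = 192.0.2.0/24      # client and monitor traffic
    cluster network = 198.51.100.0/24   # OSD replication/recovery traffic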

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Simon Ironside
Hi Florian, Any chance the key your compute nodes are using for the RBD pool is missing 'allow command "osd blacklist"' from its mon caps? Simon On 15/11/2019 08:19, Florian Haas wrote: Hi everyone, I'm trying to wrap my head around an issue we recently saw, as it relates to RBD locks,
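
Checking is straightforward (client name hypothetical):

    ceph auth get client.compute    # review the mon/osd caps in the output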

Re: [ceph-users] Strange CEPH_ARGS problems

2019-11-15 Thread Rainer Krienke
This is not my day :-) Yes, the flip between client.rz and client.user was not intended. It's another typo. When trying to run rbd I used the same client.rz user everywhere, with the same keyring /etc/ceph/ceph.client.user.keyring. Sorry Rainer On 15.11.19 at 11:02, Janne Johansson wrote: > Is

Re: [ceph-users] Strange CEPH_ARGS problems

2019-11-15 Thread Konstantin Shalygin
I found a typo in my post: Of course I tried export CEPH_ARGS="-n client.rz --keyring=" and not export CEPH_ARGS=="-n client.rz --keyring=" try `export CEPH_ARGS="--id rz --keyring=..."` k
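
That is, a working invocation might look like this (keyring path hypothetical):

    export CEPH_ARGS="--id rz --keyring=/etc/ceph/ceph.client.rz.keyring"
    rbd ls rbd    # now runs as client.rz without extra flags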

Re: [ceph-users] Strange CEPH_ARGS problems

2019-11-15 Thread Janne Johansson
Is the flip between the client name "rz" and "user" also a mistype? It's hard to divine whether it is intentional, since you mix them throughout. On Fri 15 Nov 2019 at 10:57, Rainer Krienke wrote: > I found a typo in my post: > > Of course I tried > > export CEPH_ARGS="-n client.rz

Re: [ceph-users] Strange CEPH_ARGS problems

2019-11-15 Thread Rainer Krienke
I found a typo in my post: Of course I tried export CEPH_ARGS="-n client.rz --keyring=" and not export CEPH_ARGS=="-n client.rz --keyring=" Thanks Rainer On 15.11.19 at 07:46, Rainer Krienke wrote: > Hello, > > I try to use CEPH_ARGS in order to use eg rbd with a non client.admin >

Re: [ceph-users] Large OMAP Object

2019-11-15 Thread Wido den Hollander
Did you check /var/log/ceph/ceph.log on one of the Monitors to see which pool and object the large object is in? Wido On 11/15/19 12:23 AM, dhils...@performair.com wrote: > All; > > We had a warning about a large OMAP object pop up in one of our clusters > overnight. The cluster is configured
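
Something like this should locate the relevant entries (path assumes the default log location):

    grep -i 'large omap object' /var/log/ceph/ceph.log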

[ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread Florian Haas
Hi everyone, I'm trying to wrap my head around an issue we recently saw, as it relates to RBD locks, Qemu/KVM, and libvirt. Our data center graced us with a sudden and complete dual-feed power failure that affected both a Ceph cluster (Luminous, 12.2.12), and OpenStack compute nodes that used