Re: [ceph-users] Bluestore osd_max_backfills

2017-11-07 Thread Wido den Hollander
> On 7 November 2017 at 22:54, Scottix wrote: > > > Hey, > I recently updated to luminous and started deploying bluestore osd nodes. I > normally set osd_max_backfills = 1 and then ramp up as time progresses. > > Although with bluestore it seems like I wasn't able to do

Re: [ceph-users] ceph-mgr bug - zabbix module division by zero

2017-11-07 Thread Wido den Hollander
> On 7 November 2017 at 18:27, Brady Deetz wrote: > > > I'm guessing this is not expected behavior > Indeed, it's not. Two PRs for this are out there: - https://github.com/ceph/ceph/pull/18734 - https://github.com/ceph/ceph/pull/18515 Where the last one is the one you

[ceph-users] bluestore assertion happening mostly on cachetier SSDs with external WAL/DB nvme

2017-11-07 Thread Eric Nelson
Hi all, This list has been such a great resource for me over the past few years of using ceph that first off I want to say thanks. This is the first time I've needed to post, but I have gained a ton of insight/experience reading responses from you helpful people. We've been running Luminous for about a

[ceph-users] Bluestore osd_max_backfills

2017-11-07 Thread Scottix
Hey, I recently updated to luminous and started deploying bluestore osd nodes. I normally set osd_max_backfills = 1 and then ramp up as time progresses. Although with bluestore it seems like I wasn't able to do this on the fly like I used to with XFS. ceph tell osd.* injectargs
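For reference, the runtime adjustment being discussed is normally done like this (a sketch using the standard ceph CLI; per this thread the injected value may not take effect on bluestore OSDs):

# throttle backfill on all OSDs at runtime
ceph tell osd.* injectargs '--osd_max_backfills 1'
# verify what a given OSD is actually using (run on that OSD's host)
ceph daemon osd.0 config get osd_max_backfills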

Re: [ceph-users] removing cluster name support

2017-11-07 Thread Erik McCormick
On Nov 8, 2017 7:33 AM, "Vasu Kulkarni" wrote: On Tue, Nov 7, 2017 at 11:38 AM, Sage Weil wrote: > On Tue, 7 Nov 2017, Alfredo Deza wrote: >> On Tue, Nov 7, 2017 at 7:09 AM, kefu chai wrote: >> > On Fri, Jun 9, 2017 at 3:37 AM, Sage

Re: [ceph-users] removing cluster name support

2017-11-07 Thread Vasu Kulkarni
On Tue, Nov 7, 2017 at 11:38 AM, Sage Weil wrote: > On Tue, 7 Nov 2017, Alfredo Deza wrote: >> On Tue, Nov 7, 2017 at 7:09 AM, kefu chai wrote: >> > On Fri, Jun 9, 2017 at 3:37 AM, Sage Weil wrote: >> >> At CDM yesterday we talked about

Re: [ceph-users] removing cluster name support

2017-11-07 Thread Sage Weil
On Tue, 7 Nov 2017, Alfredo Deza wrote: > On Tue, Nov 7, 2017 at 7:09 AM, kefu chai wrote: > > On Fri, Jun 9, 2017 at 3:37 AM, Sage Weil wrote: > >> At CDM yesterday we talked about removing the ability to name your ceph > >> clusters. There are a number of

[ceph-users] RGW Multisite replication

2017-11-07 Thread David Turner
Jewel 10.2.7. I have a realm that is not replicating data unless I restart the RGW daemons. It will catch up when I restart the daemon, but then not replicate new information until it's restarted again. This is the only realm with this problem, but all of the realms are configured identically.
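A hedged sketch of how replication lag in a multisite setup is usually inspected; the zone name is a placeholder and subcommand availability can vary between Jewel point releases:

# summary of metadata and data sync for the zone this gateway serves
radosgw-admin sync status
# data sync detail against a specific peer zone (placeholder name)
radosgw-admin data sync status --source-zone=us-west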

[ceph-users] ceph-mgr bug - zabbix module division by zero

2017-11-07 Thread Brady Deetz
I'm guessing this is not expected behavior $ ceph zabbix send Error EINVAL: Traceback (most recent call last): File "/usr/lib64/ceph/mgr/zabbix/module.py", line 234, in handle_command self.send() File "/usr/lib64/ceph/mgr/zabbix/module.py", line 206, in send data = self.get_data()
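For context, a minimal sketch of driving the Luminous zabbix mgr module; the hostname and identifier values are placeholders:

ceph mgr module enable zabbix
ceph zabbix config-set zabbix_host zabbix.example.com
ceph zabbix config-set identifier ceph-cluster
ceph zabbix config-show
ceph zabbix send    # the call that produces the traceback above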

[ceph-users] Why degraded objects count keeps increasing as more data is written into cluster?

2017-11-07 Thread shadow_lin
Hi all, I have a pool with 2 replicas (failure domain host) and I was testing it with fio writing to an rbd image (about 450MB/s) when one of my hosts crashed. I rebooted the crashed host and the mon said all osds and hosts were online, but there were some pgs in degraded status. I thought it would recover
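One common explanation: with size=2, writes that land on PGs which are still undersized/degraded create objects that exist on fewer than two OSDs, so the degraded count can keep rising while recovery and client I/O overlap. A few read-only commands to watch it (nothing cluster-specific assumed):

ceph -s                # cluster-wide degraded/misplaced summary
ceph pg ls degraded    # PGs currently reporting degraded objects (Luminous syntax)
ceph pg dump pgs_brief # brief per-PG state dump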

Re: [ceph-users] s3 bucket policys

2017-11-07 Thread Adam C. Emerson
On 07/11/2017, Simon Leinen wrote: > Simon Leinen writes: > > Adam C Emerson writes: > >> On 03/11/2017, Simon Leinen wrote: > >> [snip] > >>> Is this supported by the Luminous version of RadosGW? > > >> Yes! There's a few bugfixes in master that are making their way into > >> Luminous, but

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Tue, Nov 7, 2017 at 3:28 PM, Dan van der Ster wrote: > On Tue, Nov 7, 2017 at 4:15 PM, John Spray wrote: >> On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster wrote: >>> On Tue, Nov 7, 2017 at 12:57 PM, John Spray

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Jan Pekař - Imatic
I am using librbd. rbd map was only a test to see whether it is librbd related. Both librbd and rbd map gave the same frozen result. The node running the virtuals has a 4.9.0-3-amd64 kernel. The two tested virtuals have a 4.9.0-3-amd64 kernel and a 4.10.17-2-pve kernel respectively. JP On 7.11.2017 10:42, Wido

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread Dan van der Ster
On Tue, Nov 7, 2017 at 4:15 PM, John Spray wrote: > On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster wrote: >> On Tue, Nov 7, 2017 at 12:57 PM, John Spray wrote: >>> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz wrote:

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Jan Pekař - Imatic
I migrated the virtual to my second node, which is running qemu-kvm version 1:2.1+dfsg-12+deb8u6 (from debian oldstable) - same situation, frozen after approx 30-40 seconds, when "libceph: osd6 down" appeared in syslog (not before). Also, my other virtual on the first node froze at the same time.

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster wrote: > On Tue, Nov 7, 2017 at 12:57 PM, John Spray wrote: >> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz wrote: >>> My organization has a production cluster primarily used for cephfs

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread Dan van der Ster
On Tue, Nov 7, 2017 at 12:57 PM, John Spray wrote: > On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz wrote: >> My organization has a production cluster primarily used for cephfs upgraded >> from jewel to luminous. We would very much like to have snapshots on

Re: [ceph-users] ceph-backed VM drive became corrupted after unexpected VM termination

2017-11-07 Thread Дробышевский, Владимир
2017-11-07 19:06 GMT+05:00 Jason Dillaman : > On Tue, Nov 7, 2017 at 8:55 AM, Дробышевский, Владимир > wrote: > > > > Oh, sorry, I forgot to mention that all OSDs are with bluestore, so xfs > mount options don't have any influence. > > > > VMs have

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Tue, Nov 7, 2017 at 2:40 PM, Brady Deetz wrote: > Are there any existing fuzzing tools you'd recommend? I know about ceph osd > thrash, which could be tested against, but what about on the client side? I > could just use something pre-built for posix, but that wouldn't

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread Brady Deetz
Are there any existing fuzzing tools you'd recommend? I know about ceph osd thrash, which could be tested against, but what about on the client side? I could just use something pre-built for posix, but that wouldn't coordinate simulated failures on the storage side with actions against the fs. If

[ceph-users] "Metadata damage detected" on Jewel

2017-11-07 Thread Olivier Migeot
Greetings, ceph-lings, our disaster recovery is going well. All our PGs are clean now. But we still have a few instances of "metadata damage" detected on CephFS, of the "dirfrag" flavour. I listed the related inodes and ran `find` on my FS: I know what folders would be lost if I were to "delete" these
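For anyone following along, the MDS damage table can be listed and, once the affected directories have been dealt with, cleared; a sketch assuming rank 0:

# list recorded metadata damage (dirfrag, backtrace, ...) on rank 0
ceph tell mds.0 damage ls
# after repairing or writing off an entry, remove it by its damage id
ceph tell mds.0 damage rm <damage_id>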

Re: [ceph-users] ceph-backed VM drive became corrupted after unexpected VM termination

2017-11-07 Thread Jason Dillaman
On Tue, Nov 7, 2017 at 8:55 AM, Дробышевский, Владимир wrote: > > Oh, sorry, I forgot to mention that all OSDs are with bluestore, so xfs mount > options don't have any influence. > > VMs have cache="none" by default, then I've tried "writethrough". No > difference. > > And

Re: [ceph-users] Small cluster for VMs hosting

2017-11-07 Thread Gandalf Corvotempesta
2017-11-07 14:49 GMT+01:00 Richard Hesketh : > Read up on > http://docs.ceph.com/docs/master/rados/operations/monitoring-osd-pg/ and > http://docs.ceph.com/docs/master/rados/operations/pg-states/ - understanding > what the different states that PGs and OSDs can be
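As a concrete starting point alongside those docs, a few read-only commands for inspecting PG and OSD states (standard CLI, no assumptions about the cluster):

ceph -s             # overall health and PG state summary
ceph osd tree       # OSD up/down state and CRUSH placement
ceph pg stat        # one-line PG state counts
ceph pg dump_stuck  # PGs stuck inactive/unclean/stale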

Re: [ceph-users] ceph-backed VM drive became corrupted after unexpected VM termination

2017-11-07 Thread Дробышевский, Владимир
Oh, sorry, I forgot to mention that all OSDs use bluestore, so xfs mount options don't have any influence. VMs have cache="none" by default, then I've tried "writethrough". No difference. And aren't these rbd cache options enabled by default? 2017-11-07 18:45 GMT+05:00 Peter Maloney

Re: [ceph-users] Small cluster for VMs hosting

2017-11-07 Thread Richard Hesketh
On 07/11/17 13:16, Gandalf Corvotempesta wrote: > Hi to all > I've been away from ceph for a couple of years (CephFS was still unstable) > > I would like to test it again, some questions for a production cluster for > VMs hosting: > > 1. Is CephFS stable? Yes, CephFS is stable and safe (though

Re: [ceph-users] Small cluster for VMs hosting

2017-11-07 Thread Eneko Lacunza
Hi Gandalf, On 07/11/17 at 14:16, Gandalf Corvotempesta wrote: Hi to all I've been away from ceph for a couple of years (CephFS was still unstable) I would like to test it again, some questions for a production cluster for VMs hosting: 1. Is CephFS stable? Yes. 2. Can I spin up a 3

Re: [ceph-users] ceph-backed VM drive became corrupted after unexpected VM termination

2017-11-07 Thread Peter Maloney
I see nobarrier in there... Try without that. (unless that's just the bluestore xfs...then it probably won't change anything). And are the osds using bluestore? And what cache options did you set in the VM config? It's dangerous to set writeback without also this in the client side ceph.conf:
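The specific client-side options are cut off above; purely as an assumption, the librbd settings most often paired with cache=writeback look like the following (a sketch, not necessarily what Peter recommends):

# hypothetical additions to the [client] section of ceph.conf on the hypervisor
sudo tee -a /etc/ceph/ceph.conf >/dev/null <<'EOF'
[client]
rbd cache = true
rbd cache writethrough until flush = true
EOF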

Re: [ceph-users] s3 bucket policys

2017-11-07 Thread David Turner
Your examples aren't quite right, meaning you didn't quite understand what I was saying. Let's do 3 buckets, created by 3 different users, and the employees Tom and Bill want access to them... bucket_a: user_a owner, user_a:tom R, user_a:bill RW bucket_b: user_b owner, user_b:tom RW, user_b:bill
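On the bucket-policy side of this thread, a hedged sketch of how such a per-user grant is usually expressed on RGW (names reuse the example above and are hypothetical; s3cmd is just one way for the bucket owner to apply the policy):

cat > bucket_a_policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/tom"]},
    "Action": ["s3:ListBucket", "s3:GetObject"],
    "Resource": ["arn:aws:s3:::bucket_a", "arn:aws:s3:::bucket_a/*"]
  }]
}
EOF
s3cmd setpolicy bucket_a_policy.json s3://bucket_a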

Re: [ceph-users] a question about ceph raw space usage

2017-11-07 Thread Alwin Antreich
Hi Nitin, On Tue, Nov 07, 2017 at 12:03:15AM +, Kamble, Nitin A wrote: > Dear Cephers, > > As seen below, I notice that 12.7% of raw storage is consumed with zero pools > in the system. These are bluestore OSDs. > Is this expected or an anomaly? DB + WAL are consuming space already, if you
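The usual way to see where that 12.7% goes, using standard tooling only:

ceph df detail    # raw vs. per-pool usage
ceph osd df tree  # per-OSD used/available, which on bluestore includes DB/WAL overhead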

[ceph-users] ceph-backed VM drive became corrupted after unexpected VM termination

2017-11-07 Thread Дробышевский, Владимир
Hello! I've got a weird situation with rbd drive image reliability. I found that after a hard reset, a VM with a ceph rbd drive from my new cluster became corrupted. I accidentally found it during HA tests of my new cloud cluster: after a host reset the VM was not able to boot again because of the virtual

[ceph-users] Small cluster for VMs hosting

2017-11-07 Thread Gandalf Corvotempesta
Hi to all I've been away from ceph for a couple of years (CephFS was still unstable) I would like to test it again, some questions for a production cluster for VMs hosting: 1. Is CephFS stable? 2. Can I spin up a 3-node cluster with mons, MDS and osds on the same machine? 3. Hardware

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Jason Dillaman
If you are seeing this w/ librbd and krbd, I would suggest trying a different version of QEMU and/or different host OS since loss of a disk shouldn't hang it -- only potentially the guest OS. On Tue, Nov 7, 2017 at 5:17 AM, Jan Pekař - Imatic wrote: > I'm calling kill -STOP

[ceph-users] features required for live migration

2017-11-07 Thread Oscar Segarra
Hi, In my environment I'm working with a 3 node ceph cluster based on Centos 7 and KVM. My VM is a clone of a protected snapshot as is suggested in the following document: http://docs.ceph.com/docs/luminous/rbd/rbd-snapshot/#getting-started-with-layering I'd like to use the live migration
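For reference, the layering workflow from the linked document looks roughly like this (pool and image names are placeholders):

rbd snap create rbd/base-image@gold
rbd snap protect rbd/base-image@gold
rbd clone rbd/base-image@gold rbd/vm01-disk
rbd info rbd/vm01-disk   # shows which features (layering, exclusive-lock, ...) the clone has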

Re: [ceph-users] unable to remove rbd image

2017-11-07 Thread Jason Dillaman
Disable the journaling feature to delete the corrupt journal and you will be able to access the image again. On Tue, Nov 7, 2017 at 7:35 AM, Stefan Kooman wrote: > Dear list, > > Somehow, might have to do with live migrating the virtual machine, an > rbd image ends up being
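Spelled out, the suggested steps are (pool/image names are placeholders):

# disabling journaling discards the corrupt journal, after which removal works
rbd feature disable rbd/broken-image journaling
rbd rm rbd/broken-image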

[ceph-users] unable to remove rbd image

2017-11-07 Thread Stefan Kooman
Dear list, Somehow (it might have to do with live migrating the virtual machine) an rbd image has ended up being undeletable. Trying to remove the image results in *loads* of the same messages over and over again: 2017-11-07 11:30:58.431913 7f9ae2ffd700 -1 JournalPlayer: 0x7f9ae400a130 missing prior

Re: [ceph-users] removing cluster name support

2017-11-07 Thread kefu chai
On Fri, Jun 9, 2017 at 3:37 AM, Sage Weil wrote: > At CDM yesterday we talked about removing the ability to name your ceph > clusters. There are a number of hurdles that make it difficult to fully > get rid of this functionality, not the least of which is that some > (many?)
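For readers unfamiliar with the feature under discussion: a custom cluster name mostly means a differently named config file and keyring plus a --cluster flag on every command, e.g.:

# a cluster named "backup" reads /etc/ceph/backup.conf and /etc/ceph/backup.client.admin.keyring
ceph --cluster backup -s
rbd --cluster backup ls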

Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz wrote: > My organization has a production cluster primarily used for cephfs upgraded > from jewel to luminous. We would very much like to have snapshots on that > filesystem, but understand that there are risks. > > What kind of work

Re: [ceph-users] s3 bucket policys

2017-11-07 Thread Abhishek Lekshmanan
Simon Leinen writes: > Simon Leinen writes: >> Adam C Emerson writes: >>> On 03/11/2017, Simon Leinen wrote: >>> [snip] Is this supported by the Luminous version of RadosGW? > >>> Yes! There's a few bugfixes in master that are making their way into >>> Luminous, but

Re: [ceph-users] s3 bucket policys

2017-11-07 Thread Simon Leinen
Simon Leinen writes: > Adam C Emerson writes: >> On 03/11/2017, Simon Leinen wrote: >> [snip] >>> Is this supported by the Luminous version of RadosGW? >> Yes! There's a few bugfixes in master that are making their way into >> Luminous, but Luminous has all the features at present. > Does that

Re: [ceph-users] s3 bucket policys

2017-11-07 Thread nigel davies
Thanks David and all, I am trying out what you said now. When talking to my manager about permissions, the question came up: is it possible to set the sub-users' permissions bucket by bucket? As from your example I would be granting user_a read-only permissions on all buckets and user_b would have read-write

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Jan Pekař - Imatic
I'm calling kill -STOP to simulate behavior that occurred when one ceph node ran out of memory. Processes were not killed, but were somehow suspended/unresponsive (they couldn't create new threads etc), and that caused all virtuals (on other nodes) to hang. I decided to simulate it with

Re: [ceph-users] s3 bucket policys

2017-11-07 Thread Simon Leinen
Adam C Emerson writes: > On 03/11/2017, Simon Leinen wrote: > [snip] >> Is this supported by the Luminous version of RadosGW? > Yes! There's a few bugfixes in master that are making their way into > Luminous, but Luminous has all the features at present. Does that mean it should basically work

[ceph-users] Problem syncing monitor

2017-11-07 Thread Stuart Harland
Hi, We had a monitor drop out of quorum a few weeks back and we have been unable to bring it back into sync. When starting, it synchronises the OSD maps and then it just restarts from fresh every time. When turning the logging up to log level 20, we see this: 2017-11-07 09:31:57.230333
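A hedged sketch of how the stuck mon's state and debug level are usually inspected via its admin socket (the mon id is assumed to be the short hostname):

# on the host running the out-of-quorum monitor
ceph daemon mon.$(hostname -s) mon_status
# raise monitor debug logging at runtime
ceph daemon mon.$(hostname -s) config set debug_mon 20/20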

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Piotr Dałek
On 17-11-07 12:02 AM, Jan Pekař - Imatic wrote: Hi, I'm using debian stretch with ceph 12.2.1-1~bpo80+1 and qemu 1:2.8+dfsg-6+deb9u3 I'm running 3 nodes with 3 monitors and 8 osds, all on IPV6. When I tested the cluster, I detected a strange and severe problem. On the first node I'm

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Wido den Hollander
> On 7 November 2017 at 10:14, Jan Pekař - Imatic wrote: > > > Additional info - it is not librbd related, I mapped the disk through > rbd map and it was the same - virtuals were stuck/frozen. > It happened exactly when this appeared in my log: > Why aren't you using librbd? Is

Re: [ceph-users] RGW: ERROR: failed to distribute cache

2017-11-07 Thread Wido den Hollander
> On 6 November 2017 at 20:17, Yehuda Sadeh-Weinraub wrote: > > > On Mon, Nov 6, 2017 at 7:29 AM, Wido den Hollander wrote: > > Hi, > > > > On a Ceph Luminous (12.2.1) environment I'm seeing RGWs stall and about the > > same time I see these errors in the

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Jan Pekař - Imatic
Additional info - it is not librbd related; I mapped the disk through rbd map and it was the same - virtuals were stuck/frozen. It happened exactly when this appeared in my log: Nov 7 10:01:27 imatic-hydra01 kernel: [2266883.493688] libceph: osd6 down I can attach with strace to the qemu process and I can
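For completeness, the krbd test described above amounts to the following (the image spec and device are placeholders):

rbd map rbd/vm01-disk   # exposes the image as /dev/rbdX via the kernel client
rbd showmapped
rbd unmap /dev/rbd0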