Re: [ceph-users] How ceph client abort IO

2015-10-21 Thread Jason Dillaman
> On Tue, 20 Oct 2015, Jason Dillaman wrote: > > There is no such interface currently on the librados / OSD side to abort > > IO operations. Can you provide some background on your use-case for > > aborting in-flight IOs? > > The internal Objecter has a cancel interface, but it can't yank back >

[ceph-users] Increasing pg and pgs

2015-10-21 Thread Paras pradhan
Hi, When I check ceph health I see "HEALTH_WARN too few pgs per osd (11 < min 20)". I have 40 OSDs and tried to increase pg_num to 2000 with the following command. It says creating 1936, but I am not sure whether it is working or not. Is there a way to check the progress? More than 48 hours have passed and I
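A minimal way to watch PG creation progress, assuming the pool in question is the default "rbd" pool (adjust the name to your own pool):

  # overall cluster state; newly created PGs show up as "creating"
  ceph -s
  # per-PG summary including creating/peering counts
  ceph pg stat
  # confirm the new pg_num was actually applied to the pool
  ceph osd pool get rbd pg_num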

Re: [ceph-users] Increasing pg and pgs

2015-10-21 Thread Michael Hackett
Hello Paras, This is a limit that was added pre-firefly to prevent users from knocking IO off clusters for several seconds when PGs are being split in existing pools. This limit does not come into effect when creating new pools, though. If you try and limit the number to # ceph osd pool set
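For reference, the commands look roughly like this (pool name "rbd" is only an example); on existing pools the monitor will refuse an increase that would split too many PGs at once, so large jumps may need to be done in stages:

  # raise the PG count on an existing pool
  ceph osd pool set rbd pg_num 2048
  # pgp_num must follow before data is actually rebalanced
  ceph osd pool set rbd pgp_num 2048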

Re: [ceph-users] Increasing pg and pgs

2015-10-21 Thread Michael Hackett
Hello Paras, Your pgp-num should mirror your pg-num on a pool. pgp-num is what the cluster will use for actual object placement purposes. - Original Message - From: "Paras pradhan" To: "Michael Hackett" Cc: ceph-users@lists.ceph.com Sent:

Re: [ceph-users] Increasing pg and pgs

2015-10-21 Thread Michael Hackett
One thing I forgot to note Paras, If you are increasing the PG count on a pool by a large number you will want to increase the PGP value slowly and allow the cluster to rebalance the data instead of just setting the pgp-num to immediately reflect the pg-num. This will give you greater control
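A rough sketch of that staged approach (the step values are arbitrary examples; wait for the cluster to settle between steps):

  for n in 512 1024 1536 2000; do
      ceph osd pool set rbd pgp_num $n
      # crude wait-for-rebalance check; tune the interval to taste
      while ! ceph health | grep -q HEALTH_OK; do sleep 60; done
  done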

[ceph-users] librbd regression with Hammer v0.94.4 -- use caution!

2015-10-21 Thread Sage Weil
There is a regression in librbd in v0.94.4 that can cause VMs to crash. For now, please refrain from upgrading hypervisor nodes or other librbd users to v0.94.4. http://tracker.ceph.com/issues/13559 The problem does not affect server-side daemons (ceph-mon, ceph-osd, etc.). Jason's

Re: [ceph-users] Increasing pg and pgs

2015-10-21 Thread Paras pradhan
Thanks Michael for the clarification. I should set pg_num and pgp_num on all the pools, am I right? I am asking because setting pg_num on just one pool already set the status to HEALTH_OK. -Paras. On Wed, Oct 21, 2015 at 12:21 PM, Michael Hackett wrote: > Hello
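A quick way to see which pools still have a low PG count (a sketch):

  for pool in $(rados lspools); do
      echo -n "$pool: "
      ceph osd pool get "$pool" pg_num
  done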

Re: [ceph-users] [urgent] KVM issues after upgrade to 0.94.4

2015-10-21 Thread Jan Schermer
If I'm reading it correctly his cmdline says cache=none for the rbd device, so there should be no writeback caching:
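One way to verify what the guest was actually started with is to inspect the running qemu command line (a sketch):

  # show the cache mode(s) qemu was started with for its drives
  ps -ef | grep [q]emu | grep -o 'cache=[a-z]*'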

Re: [ceph-users] pg incomplete state

2015-10-21 Thread Gregory Farnum
On Tue, Oct 20, 2015 at 7:22 AM, John-Paul Robinson wrote: > Hi folks > > I've been rebuilding drives in my cluster to add space. This has gone > well so far. > > After the last batch of rebuilds, I'm left with one placement group in > an incomplete state. > > [sudo] password for

Re: [ceph-users] ceph and upgrading OS version

2015-10-21 Thread Warren Wang - ISD
Depending on how busy your cluster is, I’d nuke and pave node by node. You can slow the data movement off the old box, and also slow it on the way back in with weighting. My own personal preference, if you have performance overhead to spare. Warren From: Andrei Mikhailovsky
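The weighting Warren mentions is typically done with CRUSH reweighting, stepping the weight down (and later back up) in small increments; for example (OSD id and values are illustrative):

  # gradually drain osd.12 before rebuilding the node
  ceph osd crush reweight osd.12 0.8
  # ... repeat with lower values as recovery settles, then step back up later
  ceph osd crush reweight osd.12 1.0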

Re: [ceph-users] ceph-hammer and debian jessie - missing files on repository

2015-10-21 Thread Alfredo Deza
We did have some issues a few days ago where the Jessie packages didn't make it. This shouldn't be a problem now; would you mind trying again? I just managed to install on Debian Jessie without problems: Debian GNU/Linux 8.2 (jessie)

Re: [ceph-users] [urgent] KVM issues after upgrade to 0.94.4

2015-10-21 Thread Jason Dillaman
There is an edge case with cloned image writeback caching that occurs after an attempt to read a non-existent clone RADOS object, followed by a write to said object, followed by another read. This second read will cause the cached write to be flushed to the OSD while the appropriate locks are
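Until fixed packages are available, one possible mitigation (a sketch, not an official recommendation) is to disable RBD writeback caching on the librbd clients:

  # /etc/ceph/ceph.conf on the hypervisor / librbd client
  [client]
  rbd cache = false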

Re: [ceph-users] [urgent] KVM issues after upgrade to 0.94.4

2015-10-21 Thread Jason Dillaman
> If I'm reading it correctly his cmdline says cache=none for the rbd device, > so there should be no writeback caching: > >

Re: [ceph-users] Increasing pg and pgs

2015-10-21 Thread Paras pradhan
Michael, Yes, I did wait for the rebalance to complete. Thanks Paras. On Wed, Oct 21, 2015 at 1:02 PM, Michael Hackett wrote: > One thing I forgot to note Paras, If you are increasing the PG count on a > pool by a large number you will want to increase the PGP value slowly

Re: [ceph-users] pg incomplete state

2015-10-21 Thread John-Paul Robinson
Greg, Thanks for the insight. I suspect things are somewhat sane given that I did erase the primary (osd.30) and the secondary (osd.11) still contains pg data. If I may, could you clarify the process of backfill a little? I understand the min_size allows I/O on the object to resume while there

Re: [ceph-users] pg incomplete state

2015-10-21 Thread Gregory Farnum
I don't remember the exact timeline, but min_size is designed to prevent data loss from under-replicated objects (ie, if you only have 1 copy out of 3 and you lose that copy, you're in trouble, so maybe you don't want it to go active). Unfortunately it could also prevent the OSDs from
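The usual escape hatch in that situation is to lower min_size temporarily so the PG can go active with fewer copies, then raise it back once recovery completes (pool name and values are examples):

  ceph osd pool set rbd min_size 1
  # ... let recovery/backfill finish, then restore the original value
  ceph osd pool set rbd min_size 2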

Re: [ceph-users] pg incomplete state

2015-10-21 Thread John-Paul Robinson
Yes. That's the intention. I was fixing the osd size to ensure the cluster was in health ok for the upgrades (instead of multiple osds in near full). Thanks again for all the insight. Very helpful. ~jpr On 10/21/2015 03:01 PM, Gregory Farnum wrote: > (which it sounds like you're on —

Re: [ceph-users] ceph-fuse and its memory usage

2015-10-21 Thread Gregory Farnum
On Tue, Oct 13, 2015 at 10:09 PM, Goncalo Borges wrote: > Hi all... > > Thank you for the feedback, and I am sorry for my delay in replying. > > 1./ Just to recall the problem, I was testing cephfs using fio in two > ceph-fuse clients: > > - Client A is in the same

[ceph-users] Preparing Ceph for CBT, disk labels by-id

2015-10-21 Thread Artie Ziff
My inquiry may be a fundamental Linux thing and/or may require basic Ceph guidance. According to the CBT ReadMe -- https://github.com/ceph/cbt Currently CBT looks for specific partition labels in /dev/disk/by-partlabel for the Ceph OSD data and journal partitions. ...each OSD host
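Setting GPT partition names so that they appear under /dev/disk/by-partlabel can be done with sgdisk, roughly like this (device, partition numbers and label scheme are examples; check the CBT README for the exact names it expects):

  # label partition 1 of /dev/sdb as OSD data and partition 2 as its journal
  sgdisk --change-name=1:osd-device-0-data /dev/sdb
  sgdisk --change-name=2:osd-device-0-journal /dev/sdb
  # re-read the partition table so the by-partlabel symlinks show up
  partprobe /dev/sdb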

Re: [ceph-users] CephFS file to rados object mapping

2015-10-21 Thread Gregory Farnum
On Wed, Oct 14, 2015 at 7:20 PM, Francois Lafont wrote: > Hi, > > On 14/10/2015 06:45, Gregory Farnum wrote: > >>> Ok, however during my tests I had been careful to replace the correct >>> file by a bad file with *exactly* the same size (the content of the >>> file was just a

Re: [ceph-users] CephFS file to rados object mapping

2015-10-21 Thread David Zafman
See below On 10/21/15 2:44 PM, Gregory Farnum wrote: On Wed, Oct 14, 2015 at 7:20 PM, Francois Lafont wrote: Hi, On 14/10/2015 06:45, Gregory Farnum wrote: Ok, however during my tests I had been careful to replace the correct file by a bad file with *exactly* the same
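For reference, the file-to-object mapping can be checked by hand: object names in the CephFS data pool are the file's inode number in hex followed by a block index. A sketch (pool name and path are examples):

  ino_hex=$(printf '%x' $(stat -c %i /mnt/cephfs/path/to/file))
  rados -p cephfs_data ls | grep "^${ino_hex}\."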

Re: [ceph-users] cephfs best practice

2015-10-21 Thread Gregory Farnum
On Wed, Oct 21, 2015 at 3:12 PM, Erming Pei wrote: > Hi, > > I am just wondering which use case is better: (within one single file > system) set up one data pool for each project, or let projects share a big > pool? I don't think anybody has that kind of operational

[ceph-users] Fwd: Preparing Ceph for CBT, disk labels by-id

2015-10-21 Thread David Burley
My response got held up in moderation to the CBT list, so posting to ceph-users and sending a copy to you as well to ensure you get it. Artie, I'd just use ceph-disk unless you need a config that it doesn't support. It's a lot fewer commands, pre-tested, and it works. That said, I had to create

Re: [ceph-users] cephfs best practice

2015-10-21 Thread John Spray
On Wed, Oct 21, 2015 at 11:12 PM, Erming Pei wrote: > Hi, > > I am just wondering which use case is better: (within one single file > system) set up one data pool for each project, or let projects share a big > pool? In general you want to use a single data pool. Using
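If a project really does need its own pool, the layout can be pinned per directory instead of per filesystem; roughly (pool name and path are examples, and on Hammer the pool has to be registered as a CephFS data pool first):

  ceph osd pool create project-a 128
  ceph mds add_data_pool project-a
  setfattr -n ceph.dir.layout.pool -v project-a /mnt/cephfs/project-a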

Re: [ceph-users] ceph-fuse crush

2015-10-21 Thread Gregory Farnum
On Thu, Oct 15, 2015 at 10:41 PM, 黑铁柱 wrote: > > cluster info: >cluster b23b48bf-373a-489c-821a-31b60b5b5af0 > health HEALTH_OK > monmap e1: 3 mons at > {node1=192.168.0.207:6789/0,node2=192.168.0.208:6789/0,node3=192.168.0.209:6789/0}, > election epoch 24,

[ceph-users] cephfs best practice

2015-10-21 Thread Erming Pei
Hi, I am just wondering which use case is better: (within one single file system) set up one data pool for each project, or let projects share a big pool? Thanks, Erming

Re: [ceph-users] Increasing pg and pgs

2015-10-21 Thread Paras pradhan
Thanks! On Wed, Oct 21, 2015 at 12:52 PM, Michael Hackett wrote: > Hello Paras, > > You pgp-num should mirror your pg-num on a pool. pgp-num is what the > cluster will use for actual object placement purposes. > > - Original Message - > From: "Paras pradhan"

Re: [ceph-users] CephFS and page cache

2015-10-21 Thread Gregory Farnum
On Sun, Oct 18, 2015 at 8:27 PM, Yan, Zheng wrote: > On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke > wrote: >> Hi, >> >> I've noticed that CephFS (both ceph-fuse and kernel client in version 4.2.3) >> remove files from page

[ceph-users] Core dump when running OSD service

2015-10-21 Thread James O'Neill
I have an OSD that didn't come up after a reboot. I was getting the error shown below. It was running 0.94.3, so I reinstalled all packages. I then upgraded everything to 0.94.4 hoping that would fix it, but it hasn't. There are three OSDs; this is the only one having problems (it also contains

Re: [ceph-users] Ceph OSDs with bcache experience

2015-10-21 Thread Wido den Hollander
On 10/20/2015 07:44 PM, Mark Nelson wrote: > On 10/20/2015 09:00 AM, Wido den Hollander wrote: >> Hi, >> >> In the "newstore direction" thread on ceph-devel I wrote that I'm using >> bcache in production and Mark Nelson asked me to share some details. >> >> Bcache is running in two clusters now

[ceph-users] Help with Bug #12738: scrub bogus results when missing a clone

2015-10-21 Thread Chris Taylor
Is there some way to manually correct this error while this bug is still needing review? I have one PG that is stuck inconsistent with the same error. I already created a new RBD image and migrated the data to it. The original RBD image was "rb.0.ac3386.238e1f29". The new image is

Re: [ceph-users] Ceph OSDs with bcache experience

2015-10-21 Thread Wido den Hollander
On 10/20/2015 09:45 PM, Martin Millnert wrote: > The thing that worries me with your next-gen design (actually your current > design as well) is SSD wear. If you use Intel SSD at 10 DWPD, that's 12TB/day > per 64TB total. I guess use case dependent, and perhaps 1:4 write read > ratio is quite

Re: [ceph-users] planet.ceph.com

2015-10-21 Thread Patrick McGarry
Hey Luis, The planet was broken as a result of the new site (although will be rejuvenated in the ceph.com rebuild). The redirect to a dental site was a DNS problem that has since been fixed. Thanks! On Tue, Oct 20, 2015 at 4:21 AM, Luis Periquito wrote: > Hi, > > I was

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread Alexandre DERUMIER
Can you also send me your ceph.conf? Do you have a ceph.conf on the VM hosts too? - Original Message - From: hzwuli...@gmail.com To: "aderumier" Cc: "ceph-users" Sent: Wednesday 21 October 2015 10:31:56 Subject: Re: [ceph-users] [performance] rbd

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread hzwuli...@gmail.com
Hi, Yeah, I have the ceph.conf on the physical machine the VM is located on. A simple configuration :-) [global] fsid = *** mon_initial_members = *, *, * mon_host = *, *, * auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx filestore_xattr_use_omap = true I

Re: [ceph-users] v0.94.4 Hammer released

2015-10-21 Thread Christoph Adomeit
Hi there, I was hoping for the following changes in the 0.94.4 release: -Stable object maps for faster image handling (backups, diffs, du etc). -Linking against a better malloc implementation like jemalloc. Does 0.94.4 bring any improvement in these areas? Thanks Christoph On Mon, Oct 19, 2015 at

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread Alexandre DERUMIER
>> But, anyway, from my test, the configuration has less impact on the performance. A quick win: disable cephx and debug logging: [global] auth_cluster_required = none auth_service_required = none auth_client_required = none debug_lockdep = 0/0 debug_context = 0/0 debug_crush = 0/0 debug_buffer = 0/0

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread Alexandre DERUMIER
Here is a libvirt sample to enable iothreads: <iothreads>2</iothreads> in the domain definition, plus iothread='1' on the virtio disk's <driver> element. With this, you can scale with multiple disks. (but it should help a little bit with 1 disk too) - Original Message - From: hzwuli...@gmail.com To: "aderumier"

Re: [ceph-users] Ceph OSDs with bcache experience

2015-10-21 Thread Jan Schermer
> On 21 Oct 2015, at 09:11, Wido den Hollander wrote: > > On 10/20/2015 09:45 PM, Martin Millnert wrote: >> The thing that worries me with your next-gen design (actually your current >> design aswell) is SSD wear. If you use Intel SSD at 10 DWPD, that's 12TB/day >> per 64TB

[ceph-users] disable cephx signing

2015-10-21 Thread Corin Langosch
Hi, we have cephx authentication and signing enabled. For performance reasons we'd like to keep auth but disable signing. Is this possible without service interruption and without having to restart the qemu rbd clients? Just adapt the ceph.conf, restart mons and then osds? Thanks Corin
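For what it's worth, the change itself would look something like this; whether already-running librbd clients pick it up without a restart is exactly the open question here, so treat it as an untested sketch:

  # in ceph.conf on all nodes:
  #   [global]
  #   cephx sign messages = false
  # push the setting into the running OSDs (do the mons analogously)
  ceph tell osd.* injectargs '--cephx_sign_messages=false'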

[ceph-users] [urgent] KVM issues after upgrade to 0.94.4

2015-10-21 Thread Andrei Mikhailovsky
Hello guys, I've upgraded to the latest Hammer release and I've just noticed a massive issue after the upgrade ((( I am using ceph for virtual machine rbd storage over cloudstack. I am having issues with starting virtual routers. The libvirt error message is: cat r-1407-VM.log 2015-10-21

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread Alexandre DERUMIER
Damn, that's a huge difference. What is your host OS, guest OS, qemu version and VM config? As an extra boost, you could enable iothread on the virtio disk. (It's available in libvirt but not in OpenStack yet). If it's a test server, maybe you could test it with the proxmox 4.0 hypervisor

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread Lindsay Mathieson
On 21 October 2015 at 16:01, Alexandre DERUMIER wrote: > If it's a test server, maybe could you test it with proxmox 4.0 hypervisor > https://www.proxmox.com > > I have made a lot of patch inside it to optimize rbd (qemu+jemalloc, > iothreads,...) > Really gotta find time

Re: [ceph-users] Network performance

2015-10-21 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Jonas Björklund > Sent: 21 October 2015 09:23 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] Network performance > > Hello, > > In the configuration I have read about "cluster

Re: [ceph-users] Help with Bug #12738: scrub bogus results when missing a clone

2015-10-21 Thread Jan Schermer
We just had to look into a similar problem (missing clone objects, extraneous clone objects, wrong sizes on a few objects...). You should do something like this: 1) find all OSDs hosting the PG: ceph pg map 8.e82 2) find the directory with the object on those OSDs; it should be something like
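Fleshing that out a little (the PG id and image prefix are the ones from this thread; paths assume the default filestore layout):

  # 1) which OSDs host the PG
  ceph pg map 8.e82
  # 2) on each of those OSDs, locate the objects of the old image
  find /var/lib/ceph/osd/ceph-*/current/8.e82_head -name '*rb.0.ac3386.238e1f29*' -ls
  # 3) once the on-disk state looks consistent again, re-run scrub/repair
  ceph pg repair 8.e82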

Re: [ceph-users] v0.94.4 Hammer released

2015-10-21 Thread Iban Cabrillo
Hi, The same for us. Everything is working fine after upgrade to 0.94.4 (first the MONs and then the OSDs). Iban 2015-10-21 0:21 GMT+02:00 Lindsay Mathieson : > > On 21 October 2015 at 08:09, Andrei Mikhailovsky > wrote: > >> Same here, the

[ceph-users] Network performance

2015-10-21 Thread Jonas Björklund
Hello, In the configuration I have read about "cluster network" and "cluster addr". Is it possible to make the OSDs listen on multiple IP addresses? I want to use several network interfaces to increase performance. I have tried [global] cluster network = 172.16.3.0/24,172.16.4.0/24 [osd.0]

Re: [ceph-users] [performance] rbd kernel module versus qemu librbd

2015-10-21 Thread hzwuli...@gmail.com
Hi, let me post the version and configuration here first. host os: debian 7.8 kernel: 3.10.45 guest os: debian 7.8 kernel: 3.2.0-4 qemu version: ii ipxe-qemu 1.0.0+git-2013.c3d1e78-2.1~bpo70+1 all PXE boot firmware - ROM images for qemu ii

Re: [ceph-users] Network performance

2015-10-21 Thread Jonas Björklund
On Wed, 21 Oct 2015, Nick Fisk wrote: [global] cluster network = 172.16.3.0/24,172.16.4.0/24 [osd.0] public addr = 0.0.0.0 #public addr = 172.16.3.1 #public addr = 172.16.4.1 But I can't get them to listen on both 172.16.3.1 and 172.16.4.1 at the same time. Any ideas? I don't think this
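The conventional setup is one subnet per role (one public, one cluster), with bonding/LACP at the interface level if more bandwidth is needed; for example (subnets borrowed from the original mail, split one per role):

  [global]
  public network  = 172.16.3.0/24
  cluster network = 172.16.4.0/24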