[ceph-users] krbd exclusive-lock

2017-03-21 Thread Mikaël Cluseau
Hi, There's something I don't understand about the exclusive-lock feature. I created an image: $ ssh host-3 Container Linux by CoreOS stable (1298.6.0) Update Strategy: No Reboots host-3 ~ # uname -a Linux host-3 4.9.9-coreos-r1 #1 SMP Tue Mar 14 21:09:42 UTC 2017 x86_64 Intel(R) Xeon(R) CPU
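
For anyone hitting the same confusion, here is a minimal sketch of creating and mapping an image with exclusive-lock enabled; the pool and image names are made up, and krbd only understands the feature from roughly kernel 4.9 onwards:

   # size is in MB by default; enable only features the 4.9 krbd supports
   $ rbd create rbd/test --size 10240 --image-feature layering --image-feature exclusive-lock
   $ sudo rbd map rbd/test
   /dev/rbd0

As far as I understand it, the lock is cooperative: the kernel client acquires it on the first write and hands it over to another client that requests it, so by itself it does not stop a second host from mapping and writing to the image.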

Re: [ceph-users] More than 50% osds down, CPUs still busy; will the cluster recover without help?

2015-03-27 Thread Mikaël Cluseau
Hi, On 03/18/2015 03:01 PM, Gregory Farnum wrote: I think it tended to crash rather than hang like this so I'm a bit surprised, but if this op is touching a broken file or something that could explain it. FWIW, the last time I had the issue (on a 3.10.9 kernel), btrfs was freezing, waiting

Re: [ceph-users] Help with SSDs

2014-12-17 Thread Mikaël Cluseau
On 12/17/2014 02:58 AM, Bryson McCutcheon wrote: Is there a good workaround if our SSDs are not handling D_SYNC very well? We invested a ton of money into Samsung 840 EVOs and they are not playing well with D_SYNC. Would really appreciate the help! Just in case it's linked with the recent
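
For reference, the usual quick check for whether an SSD copes with the journal's synchronous writes is a small direct + sync fio run; a sketch, assuming the device under test is /dev/sdX (this writes to the raw device, so only run it on a disk you can wipe):

   $ fio --name=journal-test --filename=/dev/sdX \
         --direct=1 --sync=1 --rw=write --bs=4k \
         --numjobs=1 --iodepth=1 --runtime=60 --time_based

Good journal SSDs sustain thousands of IOPS here; consumer drives such as the 840 EVO reportedly drop to a few hundred or less, which matches the behaviour described in this thread.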

Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-23 Thread Mikaël Cluseau
On 09/22/2014 05:17 AM, Robin H. Johnson wrote: Can somebody else make comments about migrating S3 buckets with preserved mtime data (and all of the ACLs & CORS) then? I don't know how radosgw objects are stored, but have you considered a lower-level rados export/import? IMPORT AND EXPORT
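
To make the suggestion concrete, a rough sketch with the rados tool, assuming the bucket data lives in the default .rgw.buckets pool; the exact invocation differs between releases (older versions take the pool and a local directory as positional arguments, newer ones take -p and a dump file), so check the rados help output first:

   # on the source cluster: dump the pool to local storage
   $ rados export .rgw.buckets ./rgw-buckets-dump/
   # on the destination cluster: load it back
   $ rados import ./rgw-buckets-dump/ .rgw.buckets

Whether this carries radosgw-level metadata such as mtimes, ACLs and CORS across intact is exactly the open question of the thread.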

[ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2014-09-19 Thread Mikaël Cluseau
Hi all, I have weird behaviour on my firefly test + convenience storage cluster. It consists of 2 nodes with a light imbalance in available space:
# id    weight  type name           up/down  reweight
-1      14.58   root default
-2      8.19        host store-1
1       2.73            osd.1       up
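
A hedged sketch of how one can observe the behaviour described here, with a hypothetical cache pool named cache sitting in front of an EC pool named ecpool:

   # objects the removed image left behind in the base pool
   $ rados -p ecpool ls | grep rbd_data | wc -l
   # force the cache tier to flush and evict everything it holds
   $ rados -p cache cache-flush-evict-all

If I understand the tiering logic correctly, deletes land in the cache tier as whiteouts first and only reach the base pool when those whiteouts are flushed, which is why objects can appear to survive an rbd rm for quite a while.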

Re: [ceph-users] Introductions

2014-08-13 Thread Mikaël Cluseau
On 08/11/2014 01:14 PM, Zach Hill wrote: Thanks for the info! Great data points. We will still recommend a separated solution, but it's good to know that some have tried to unify compute and storage and have had some success. Yes, and using drives on compute nodes for backup is an appealing idea

Re: [ceph-users] Introductions

2014-08-09 Thread Mikaël Cluseau
Hi Zach, On 08/09/2014 11:33 AM, Zach Hill wrote: Generally, we recommend strongly against such a deployment in order to ensure performance and failure isolation between the compute and storage sides of the system. But, I'm curious if anyone is doing this in practice and if they've found

Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-11 Thread Mikaël Cluseau
On 06/11/2014 08:20 AM, Sebastien Han wrote: Thanks for your answers. I have been doing that for an apt-cache for more than a year now and have never had an issue. Of course, your question is not about having a krbd device backing an OSD of the same cluster ;-)

Re: [ceph-users] Kernel Panic / RBD Instability

2013-11-06 Thread Mikaël Cluseau
Hello, if you use kernel RBD, maybe your issue is linked to this one: http://tracker.ceph.com/issues/5760 Best regards, Mikael.

[ceph-users] Production locked: OSDs down

2013-10-14 Thread Mikaël Cluseau
Hi, I have a pretty big problem here... my OSDs are marked down (except one?!). I am running ceph version 0.61.8 (a6fdcca3bddbc9f177e4e2bf0d9cdd85006b028b). I recently had full monitors, so I had to remove them, but it seemed to work. # id  weight  type name  up/down  reweight  -1
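
For anyone triaging a similar "all OSDs marked down" situation, the usual first checks look something like this (nothing here is specific to this particular incident):

   $ ceph -s              # overall health and monitor quorum
   $ ceph osd tree        # which OSDs the monitors consider up/down
   # on each storage node: are the daemons actually running?
   $ ps aux | grep ceph-osd
   $ tail -n 100 /var/log/ceph/ceph-osd.0.log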

[ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
Hi, I am having trouble with ceph_init (after a test reboot): # ceph_init restart osd # ceph_init restart osd.0 /usr/lib/ceph/ceph_init.sh: osd.0 not found (/etc/ceph/ceph.conf defines mon.xxx, /var/lib/ceph defines mon.xxx) 1 # ceph-disk list [...] /dev/sdc : /dev/sdc1 ceph data, prepared, cluster

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:39 AM, Mikaël Cluseau wrote: # ceph-disk -v activate-all DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid Maybe /dev/disk/by-parttypeuuid is specific? # ls -l /dev/disk total 0 drwxr-xr-x 2 root root 1220 Aug 18 07:01 by-id drwxr-xr-x 2 root root 60 Aug 18 07

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:53 AM, Sage Weil wrote: Yep! What distro is this? I'm working on Gentoo packaging to get a full stack of ceph and openstack. Overlay here: git clone https://git.isi.nc/cloud/cloud-overlay.git And a small fork of ceph-deploy to add Gentoo support: git clone

Re: [ceph-users] v0.67 Dumpling released

2013-08-14 Thread Mikaël Cluseau
Hi lists, in this release I see that the ceph command is not compatible with Python 3. The changes were not all trivial so I gave up, but for those using Gentoo, I made my ceph git repository available here with an ebuild that forces the Python version to 2.6 or 2.7: git clone

Re: [ceph-users] 1 x raid0 or 2 x disk

2013-07-21 Thread Mikaël Cluseau
On 07/21/13 20:37, Wido den Hollander wrote: I'd say two disks and not raid0, since when you are doing parallel I/O both disks can be doing something completely different. Completely agree, Ceph is already doing the striping :)

Re: [ceph-users] Hadoop/Ceph and DFS IO tests

2013-07-20 Thread Mikaël Cluseau
Hi, On 07/11/13 12:23, ker can wrote: Unfortunately I currently do not have access to SSDs, so I had a separate disk for the journal for each data disk for now. You can try RAM as a journal (well... not in production, of course) if you want an idea of the performance on SSDs. I tried
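
A minimal sketch of the RAM-journal trick, strictly for benchmarking (a crash or power loss takes the journal, and therefore the OSD, with it); the mount point and sizes are made up:

   $ service ceph stop osd.0
   $ ceph-osd -i 0 --flush-journal        # drain the existing journal first
   $ mount -t tmpfs -o size=2G tmpfs /mnt/ram-journal
   # in ceph.conf, under [osd.0]:
   #   osd journal = /mnt/ram-journal/journal
   #   osd journal size = 1024
   $ ceph-osd -i 0 --mkjournal
   $ service ceph start osd.0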

Re: [ceph-users] optimizing recovery throughput

2013-07-20 Thread Mikaël Cluseau
Hi, On 07/21/13 09:05, Dan van der Ster wrote: This is with a 10Gb network -- and we can readily get 2-3GBytes/s in normal rados bench tests across many hosts in the cluster. I wasn't too concerned with the overall MBps throughput in my question, but rather the objects/s recovery rate -- they
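
For reference, the objects/s recovery rate is mostly bounded by a few per-OSD throttles that can be changed at runtime; a sketch with example values only (on older releases the spelling is "ceph osd tell \* injectargs '...'"):

   # gentle: protect client I/O while recovery runs
   $ ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
   # aggressive: let recovery use more of an otherwise idle cluster
   $ ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'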

Re: [ceph-users] optimizing recovery throughput

2013-07-19 Thread Mikaël Cluseau
Hi, On 07/19/13 07:16, Dan van der Ster wrote: and that gives me something like this: 2013-07-18 21:22:56.546094 mon.0 128.142.142.156:6789/0 27984 : [INF] pgmap v112308: 9464 pgs: 8129 active+clean, 398 active+remapped+wait_backfill, 3 active+recovery_wait, 933 active+remapped+backfilling, 1

[ceph-users] weird: -23/116426 degraded (-0.020%)

2013-07-17 Thread Mikaël Cluseau
Hi list, not a real problem, but a weird thing under cuttlefish: 2013-07-18 10:51:01.597390 mon.0 [INF] pgmap v266324: 216 pgs: 215 active+clean, 1 active+remapped+backfilling; 144 GB data, 305 GB used, 453 GB / 766 GB avail; 3921KB/s rd, 2048KB/s wr, 288op/s; 1/116426 degraded (0.001%);

Re: [ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
Hi Greg, thank you for your (fast) answer. Since we're going more in-depth, I must say: * we're running 2 Gentoo GNU/Linux servers doing both storage and virtualization (I know this is not recommended, but we mostly have a low load and virtually no writes outside of ceph) *
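
For readers looking for the knobs named in the subject line, they are ordinary config options; a hedged ceph.conf sketch that biases the op queue towards client I/O (the values are examples, not recommendations):

   [osd]
   # 63 is already the default and the maximum; shown for completeness
   osd client op priority = 63
   # lower than the default so recovery ops yield to client ops
   osd recovery op priority = 1
   # fewer concurrent recovery ops per OSD
   osd recovery max active = 1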

Re: [ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
On 09/07/2013 14:41, Gregory Farnum wrote: On Mon, Jul 8, 2013 at 8:08 PM, Mikaël Cluseau mclus...@isi.nc wrote: Hi Greg, thank you for your (fast) answer. [snip] Please keep all messages on the list. :) Oops, reply-to isn't set by default here ^^ I just realized you were talking about

Re: [ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
On 09/07/2013 14:57, Mikaël Cluseau wrote: I think I'll go for the second option because the problematic load spikes seem to have a period of 24h + epsilon... Seems good: the load drops below the 1.0 line, ceph starts to scrub, the scrub is fast, and the load goes back above 1.0, there's a pause