Re: [ceph-users] tgt and krbd

2015-03-06 Thread Mike Christie
On 03/06/2015 06:51 AM, Jake Young wrote: On Thursday, March 5, 2015, Nick Fisk n...@fisk.me.uk wrote: Hi All, Just a heads up after a day’s experimentation. I believe tgt with its default settings has a small

[ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
So I'm in the middle of trying to triage a problem with my ceph cluster running 0.80.5. I have 24 OSDs spread across 8 machines. The cluster has been running happily for about a year. This last weekend, something caused the box running the MDS to seize hard, and when we came in on Monday, several

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Sage Weil
It looks like you may be able to work around the issue for the moment with ceph osd set nodeep-scrub as it looks like it is scrub that is getting stuck? sage On Fri, 6 Mar 2015, Quentin Hartman wrote: Ceph health detail - http://pastebin.com/5URX9SsQ pg dump summary (with active+clean pgs
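
For reference, the workaround Sage describes uses the standard cluster-wide scrub flags; a minimal sketch with the stock ceph CLI:

    ceph osd set nodeep-scrub      # stop scheduling new deep scrubs
    ceph osd set noscrub           # optionally pause shallow scrubs as well
    ceph osd unset nodeep-scrub    # re-enable once the cluster is healthy
    ceph osd unset noscrub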

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
Alright, tried a few suggestions for repairing this state, but I don't seem to have any PG replicas that have good copies of the missing / zero-length shards. What do I do now? Telling the PGs to repair doesn't seem to help anything. I can deal with data loss if I can figure out which images
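
As one way to answer the "which images" question: each rbd_data.prefix object name carries the image's block_name_prefix, which rbd info reports. A rough sketch, assuming format-2 images in a pool named rbd (adjust pool name and image format to your cluster):

    # print each image next to its object prefix, then match the prefixes
    # against the rbd_data.* names shown in the scrub/repair errors
    for img in $(rbd ls -p rbd); do
        printf '%s %s\n' "$img" "$(rbd info rbd/$img | awk '/block_name_prefix/ {print $2}')"
    done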

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
Thanks for the response. Is this the post you are referring to? http://ceph.com/community/incomplete-pgs-oh-my/ For what it's worth, this cluster was running happily for the better part of a year until the event from this weekend that I described in my first post, so I doubt it's configuration

[ceph-users] Fwd: Prioritize Heartbeat packets

2015-03-06 Thread Robert LeBlanc
Hidden HTML ... trying again... -- Forwarded message -- From: Robert LeBlanc rob...@leblancnet.us Date: Fri, Mar 6, 2015 at 5:20 PM Subject: Re: [ceph-users] Prioritize Heartbeat packets To: ceph-users@lists.ceph.com ceph-users@lists.ceph.com, ceph-devel ceph-de...@vger.kernel.org

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
Ceph health detail - http://pastebin.com/5URX9SsQ pg dump summary (with active+clean pgs removed) - http://pastebin.com/Y5ATvWDZ an osd crash log (in github gist because it was too big for pastebin) - https://gist.github.com/qhartman/cb0e290df373d284cfb5 And now I've got four OSDs that are

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
Thanks for the suggestion, but that doesn't seem to have made a difference. I've shut the entire cluster down and brought it back up, and my config management system seems to have upgraded ceph to 0.80.8 during the reboot. Everything seems to have come back up, but I am still seeing the crash

Re: [ceph-users] Prioritize Heartbeat packets

2015-03-06 Thread Robert LeBlanc
I see that Jian Wen has done work on this for 0.94. I tried looking through the code to see if I can figure out how to configure this new option, but it all went over my head pretty quick. Can I get a brief summary on how to set the priority of heartbeat packets or where to look in the code to
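
As a hedged pointer only: the heartbeat-priority work appears to be controlled by an OSD config option, but the option name below is an assumption to verify against your build before relying on it.

    ceph daemon osd.0 config show | grep -i heartbeat   # list heartbeat-related options on a running OSD
    # if present in your build, set it in ceph.conf under [osd]:
    #   osd heartbeat use min delay socket = true       # assumed name; marks heartbeat sockets low-delay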

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
Finally found an error that seems to provide some direction: -1 2015-03-07 02:52:19.378808 7f175b1cf700 0 log [ERR] : scrub 3.18e e08a418e/rbd_data.3f7a2ae8944a.16c8/7//3 on disk size (0) does not match object info size (4120576) ajusted for ondisk to (4120576) I'm diving into
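
For the PG named in that error, the usual next step is a targeted scrub and repair (standard commands; whether the repair succeeds depends on a good replica existing):

    ceph pg deep-scrub 3.18e         # re-run the deep scrub on the affected PG
    ceph pg repair 3.18e             # ask the primary to repair the inconsistent object from replicas
    ceph health detail | grep 3.18e  # watch the PG state afterwards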

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Gregory Farnum
This might be related to the backtrace assert, but that's the problem you need to focus on. In particular, both of these errors are caused by the scrub code, which Sage suggested temporarily disabling — if you're still getting these messages, you clearly haven't done so successfully. That said,
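
A quick check that the scrub flag really took effect:

    ceph osd dump | grep flags      # should list nodeep-scrub (and noscrub, if set)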

Re: [ceph-users] Cascading Failure of OSDs

2015-03-06 Thread Quentin Hartman
Here's more information I have been able to glean: pg 3.5d3 is stuck inactive for 917.471444, current state incomplete, last acting [24] pg 3.690 is stuck inactive for 11991.281739, current state incomplete, last acting [24] pg 4.ca is stuck inactive for 15905.499058, current state incomplete,
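
For incomplete PGs like these, the per-PG query shows what peering is waiting on; a minimal sketch for the first one listed:

    ceph pg 3.5d3 query | less      # inspect recovery_state to see why peering is blocked,
                                    # e.g. which down OSDs it still wants to probe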

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Mark Nelson
Interesting. We've seen things like this on the librbd side in the past, but I don't think I've seen this kind of behavior in the kernel client. What does the latency histogram look like when going from 1-2? Mark On 03/06/2015 08:10 AM, Nick Fisk wrote: Just tried cfq, deadline and noop

Re: [ceph-users] rgw admin api - users

2015-03-06 Thread Joshua Weaver
Thanks!! On Mar 5, 2015, at 4:09 PM, Yehuda Sadeh-Weinraub yeh...@redhat.com wrote: The metadata api can do it: GET /admin/metadata/user Yehuda - Original Message - From: Joshua Weaver joshua.wea...@ctl.io To: ceph-us...@ceph.com Sent: Thursday, March 5, 2015 1:43:33 PM
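
The same metadata is reachable from the command line with radosgw-admin, which is handy for checking what the /admin/metadata endpoint returns (the uid below is a placeholder):

    radosgw-admin metadata list user          # enumerate user metadata keys
    radosgw-admin metadata get user:someuser  # dump a single user's metadata entry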

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jake Young Sent: 06 March 2015 12:52 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] tgt and krbd On Thursday, March 5, 2015, Nick Fisk n...@fisk.me.uk wrote: Hi All, Just a heads up after a

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Nick Fisk
Hi Jake, Good to see it’s not just me. I’m guessing that the fact you are doing 1MB writes means that the latency difference is having a less noticeable impact on the overall write bandwidth. What I have been discovering with Ceph + iSCSI is that due to all the extra hops (client-iscsi

Re: [ceph-users] S3 RadosGW - Create bucket OP

2015-03-06 Thread Steffen W Sørensen
On 06/03/2015, at 12.24, Steffen W Sørensen ste...@me.com wrote: 3. What are best practices for maintaining GW pools? Do I need to run something like GC / cleanup ops / log object pruning etc., and are there any pointers to docs on this? Is this all the maintenance one should consider on pools for a GW instance?
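
On the GC / log-pruning question, the usual radosgw-admin housekeeping commands look like the sketch below (the dates are placeholders):

    radosgw-admin gc list --include-all      # show pending garbage-collection entries
    radosgw-admin gc process                 # run GC now instead of waiting for the next cycle
    radosgw-admin usage trim --start-date=2015-01-01 --end-date=2015-02-01   # prune old usage log entries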

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
Just tried cfq, deadline and noop which more or less all show identical results -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER Sent: 06 March 2015 11:59 To: Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Strange krbd
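
For anyone reproducing the scheduler comparison, the elevator is switched per device at runtime; the path below assumes the krbd device is rbd0 and that it exposes a scheduler:

    cat /sys/block/rbd0/queue/scheduler              # available schedulers, current one in brackets
    echo deadline > /sys/block/rbd0/queue/scheduler  # switch (run as root)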

Re: [ceph-users] Clustering a few NAS into a Ceph cluster

2015-03-06 Thread Loic Dachary
On 05/03/2015 18:18, Thomas Lemarchand wrote: Hello Loïc, It does exist ... but maybe not at the scale you are looking for : http://www.fujitsu.com/global/products/computing/storage/eternus-cd/ It's slightly above my price range ;-) I read a paper about their hardware; it seems like

Re: [ceph-users] Clustering a few NAS into a Ceph cluster

2015-03-06 Thread Loic Dachary
Hi John, On 06/03/2015 11:38, John Spray wrote: On 04/03/2015 00:10, Loic Dachary wrote: Last week-end I discussed with a friend about a use case many of us thought about already: it would be cool to have a simple way to assemble Ceph aware NAS fresh from the store. I summarized the use case

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Ilya Dryomov
On Thu, Mar 5, 2015 at 8:17 PM, Nick Fisk n...@fisk.me.uk wrote: I’m seeing a strange queue depth behaviour with a kernel mapped RBD, librbd does not show this problem. Cluster is comprised of 4 nodes, 10GB networking, not including OSDs as test sample is small so fits in page cache. What

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Jake Young
My initiator is also VMware software iscsi. I had my tgt iscsi targets' write-cache setting off. I turned write and read cache on in the middle of creating a large eager zeroed disk (tgt has no VAAI support, so this is all regular synchronous IO) and it did give me a clear performance boost. Not
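
In tgt the cache behaviour is a per-LUN parameter; a sketch using tgtadm, where the tid/lun values are examples and the exact parameter name should be checked against your tgt version:

    tgtadm --lld iscsi --op update --mode logicalunit --tid 1 --lun 1 --params write-cache=off
    tgtadm --lld iscsi --op show --mode target     # confirm the LUN parameters took effect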

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Mark Nelson
Histogram is probably the wrong word. In the normal fio output, there should be a distribution of latencies shown for the test, so you can get a rough estimate of the skew. It might be interesting to know when you jump from iodepth=1 to iodepth=2 how that skew changes. Here's an example:
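
Mark's truncated example is not reproduced here, but the distribution he means is part of fio's default per-job output (the clat percentiles and the lat bucket table); a sketch of running the same job at both queue depths, with placeholder device and job names:

    for qd in 1 2; do
        fio --name=qd$qd --filename=/dev/rbd0 --rw=write --bs=4k --direct=1 \
            --ioengine=libaio --iodepth=$qd --runtime=60 --time_based --group_reporting
    done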

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
Hi Ilya, I meant that the OSD numbers and configuration are probably irrelevant as the sample size of 1G fits in the page cache. This is kernel 3.16 (from Ubuntu 14.04.2, Ceph v0.87.1) Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ilya

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
Hi Mark, Sorry if I am showing my ignorance here, but is there some sort of flag or tool that generates this from fio? Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 06 March 2015 15:06 To: ceph-users@lists.ceph.com

Re: [ceph-users] Multiple OSD's in a Each node with replica 2

2015-03-06 Thread Alexandre DERUMIER
Is it possible all replicas of an object to be saved in the same node? No (unless you wrongly modify the crushmap manually). Is it possible to lose any? With replica x2, if you lose 2 OSDs on 2 different nodes that hold the same object, you'll lose the object. Is there a mechanism
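
The "never on the same node" behaviour comes from the CRUSH rule choosing leaves per host; one way to confirm it on a given cluster, assuming the firefly-era default rule name (adjust if yours differs):

    ceph osd crush rule dump replicated_ruleset
    # look for a step like: "op": "chooseleaf_firstn", ..., "type": "host"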

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Jake Young
On Thursday, March 5, 2015, Nick Fisk n...@fisk.me.uk wrote: Hi All, Just a heads up after a day’s experimentation. I believe tgt with its default settings has a small write cache when exporting a kernel mapped RBD. Doing some write tests I saw 4 times the write throughput when using

Re: [ceph-users] client-ceph [can not connect from client][connect protocol feature mismatch]

2015-03-06 Thread Stéphane DUGRAVOT
Hi Sonal, You can refer to this doc to identify your problem. Your error code is 4204, so: 4000 = upgrade to kernel 3.9, 200 = CEPH_FEATURE_CRUSH_TUNABLES2, 4 = CEPH_FEATURE_CRUSH_TUNABLES
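
Besides upgrading the client kernel, the other common way out of a tunables feature mismatch is to relax the cluster's CRUSH tunables to a profile the old kernel understands (this can trigger data movement, so read the tunables docs first):

    ceph osd crush tunables bobtail    # or 'legacy' for very old clients
    ceph osd crush show-tunables       # confirm the active profile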

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Jake Young
On Fri, Mar 6, 2015 at 10:18 AM, Nick Fisk n...@fisk.me.uk wrote: On Fri, Mar 6, 2015 at 9:04 AM, Nick Fisk n...@fisk.me.uk wrote: From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jake Young Sent: 06 March 2015 12:52 To: Nick Fisk Cc: ceph-users@lists.ceph.com

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Somnath Roy
Nick, I think this is because the krbd you are using is using Nagle's algorithm, i.e. TCP_NODELAY = false by default. The latest krbd module should have TCP_NODELAY = true by default. You may want to try that. But, I think it is available in the latest kernel only. Librbd is running with
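
For librbd the equivalent knob is a ceph.conf messenger option, which is how the behaviour gets reproduced later in the thread; a sketch (client-side ceph.conf):

    [client]
        ms tcp nodelay = false   # default is true; false re-enables Nagle and mimics the older krbd behaviour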

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Ilya Dryomov
On Fri, Mar 6, 2015 at 7:27 PM, Nick Fisk n...@fisk.me.uk wrote: Hi Somnath, I think you hit the nail on the head, setting librbd to not use TCP_NODELAY shows the same behaviour as with krbd. That's why I asked about the kernel version. TCP_NODELAY is enabled by default since 4.0-rc1, so if

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Mark Nelson
On 03/06/2015 10:27 AM, Nick Fisk wrote: Hi Somnath, I think you hit the nail on the head, setting librbd to not use TCP_NODELAY shows the same behaviour as with krbd. Score (another) 1 for Somnath! :) Mark if you are still interested here are the two latency reports Queue Depth=1

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
Hi Somnath, I think you hit the nail on the head, setting librbd to not use TCP_NODELAY shows the same behaviour as with krbd. Mark if you are still interested here are the two latency reports Queue Depth=1 slat (usec): min=24, max=210, avg=39.40, stdev=11.54 clat (usec): min=310,

[ceph-users] S3 RadosGW - Create bucket OP

2015-03-06 Thread Steffen W Sørensen
Hi Check the S3 Bucket OPS at : http://ceph.com/docs/master/radosgw/s3/bucketops/ I've read that as well, but I'm having other issues getting an App to run against our Ceph S3 GW; maybe you have a few hints on this as well... Got the cluster working for rbd+cephFS and have initially verified the

Re: [ceph-users] RadosGW - Create bucket via admin API

2015-03-06 Thread Georgios Dimitrakakis
Hi Italo, Check the S3 Bucket OPS at : http://ceph.com/docs/master/radosgw/s3/bucketops/ or use any of the examples provided in Python (http://ceph.com/docs/master/radosgw/s3/python/) or PHP (http://ceph.com/docs/master/radosgw/s3/php/) or JAVA
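
If a quick command-line check is enough before wiring up an app, s3cmd pointed at the RGW endpoint exercises the same call as the linked examples (the endpoint name is a placeholder):

    s3cmd --configure              # enter access/secret key; set host_base and host_bucket to rgw.example.com
    s3cmd mb s3://my-new-bucket    # create a bucket
    s3cmd ls                       # list buckets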

Re: [ceph-users] Clustering a few NAS into a Ceph cluster

2015-03-06 Thread John Spray
On 04/03/2015 00:10, Loic Dachary wrote: Last week-end I discussed with a friend about a use case many of us thought about already: it would be cool to have a simple way to assemble Ceph aware NAS fresh from the store. I summarized the use case and interface we discussed here :

[ceph-users] Multiple OSD's in a Each node with replica 2

2015-03-06 Thread Azad Aliyar
I have a doubt. In a scenario (3 nodes x 4 OSDs each x 2 replicas) I tested with a node down, and as long as you have space available all objects were there. Is it possible all replicas of an object to be saved in the same node? Is it possible to lose any? Is there a mechanism that prevents

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Alexandre DERUMIER
Hi, does somebody know if Red Hat will backport new krbd features (discard, blk-mq, tcp_nodelay, ...) to the Red Hat 3.10 kernel? Alexandre - Original Message - From: Mark Nelson mnel...@redhat.com To: Nick Fisk n...@fisk.me.uk, Somnath Roy somnath@sandisk.com, aderumier

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Ilya Dryomov
On Fri, Mar 6, 2015 at 9:52 PM, Alexandre DERUMIER aderum...@odiso.com wrote: Hi, does somebody known if redhat will backport new krbd features (discard, blk-mq, tcp_nodelay,...) to the redhat 3.10 kernel ? Yes, all of those will be backported. discard is already there in rhel7.1 kernel.

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Steffen W Sørensen
On 06/03/2015, at 16.50, Jake Young jak3...@gmail.com wrote: After seeing your results, I've been considering experimenting with that. Currently, my iSCSI proxy nodes are VMs. I would like to build a few dedicated servers with fast SSDs or fusion-io devices. It depends on my budget,

[ceph-users] RadosGW - Bucket link and ACLs

2015-03-06 Thread Italo Santos
Hello, I’m building an object storage environment and I’m in trouble with some administration ops. To manage the entire environment I decided to create an admin user and use that to manage the client users which I’ll create further on. Using the admin (called “italux”) I created a new user (called
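
For reference, re-attaching a bucket to a user from the admin side is normally done with radosgw-admin; the uid and bucket names below are placeholders, and some versions also want --bucket-id:

    radosgw-admin bucket link --uid=client-user --bucket=client-bucket   # hand the bucket to that user
    radosgw-admin bucket list --uid=client-user                          # verify it now shows up for the user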

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Jake Young
On Friday, March 6, 2015, Steffen W Sørensen ste...@me.com wrote: On 06/03/2015, at 16.50, Jake Young jak3...@gmail.com wrote: After seeing your results, I've been considering experimenting with that. Currently, my iSCSI proxy nodes are VMs. I would like to build a few

[ceph-users] Fwd: RadosGW - Bucket link and ACLs

2015-03-06 Thread ghislain.chevalier
Original message From: CHEVALIER Ghislain IMT/OLPS ghislain.cheval...@orange.com Date: 06/03/2015 21:56 (GMT+01:00) To: Italo Santos okd...@gmail.com Cc: Subject: RE: [ceph-users] RadosGW - Bucket link and ACLs Hi We encountered this behavior when developing the rgw admin

Re: [ceph-users] Permanente Mount RBD blocs device RHEL7

2015-03-06 Thread Jesus Chavez (jeschave)
Still not working. Does anybody know how to auto-map and mount an RBD image on Red Hat? Regards Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1 5538883255 CCIE - 44433 On Mar
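
A rough sketch of the usual rbdmap-based approach; the pool, image, and mountpoint names are examples, and the exact fstab options depend on the distro's rbdmap script:

    # /etc/ceph/rbdmap -- one image per line
    rbd/myimage  id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

    # /etc/fstab
    /dev/rbd/rbd/myimage  /mnt/myimage  xfs  noauto,_netdev  0 0

    systemctl enable rbdmap    # map the images listed in /etc/ceph/rbdmap at boot
    systemctl start rbdmap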