Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Jake Young
Nick, Where did you read that having more than 1 LUN per target causes stability problems? I am running 4 LUNs per target. For HA I'm running two linux iscsi target servers that map the same 4 rbd images. The two targets have the same serial numbers, T10 address, etc. I copy the primary's

Re: [ceph-users] How to tell a VM to write more local ceph nodes than to the network.

2015-01-14 Thread Lionel Bouton
On 01/13/15 22:03, Roland Giesler wrote: I have a 4 node ceph cluster, but the disks are not equally distributed across all machines (they are substantially different from each other) One machine has 12 x 1TB SAS drives (h1), another has 8 x 300GB SAS (s3) and two machines have only two 1 TB
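For a cluster with drives this different in size, CRUSH weights are normally set roughly proportional to each disk's capacity in TB so placement follows disk size. A minimal sketch (the OSD ids and weights below are placeholders, not values from this thread):
  ceph osd crush reweight osd.0 1.0     # 1 TB drive
  ceph osd crush reweight osd.12 0.27   # 300 GB drive
  ceph osd tree                         # verify the resulting weights per host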

Re: [ceph-users] CRUSH question - failing to rebalance after failure test

2015-01-14 Thread Sage Weil
On Tue, 13 Jan 2015, Christopher Kunz wrote: Hi, Okay, it sounds like something is not quite right then. Can you attach the OSDMap once it is in the not-quite-repaired state? And/or try setting 'ceph osd crush tunables optimal' and see if that has any effect? Indeed it did - I
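Dumping the OSDMap Sage asks for and switching the tunables can be done along these lines (a sketch; the output filename is arbitrary, and 'tunables optimal' can trigger significant data movement):
  ceph osd getmap -o osdmap.bin    # binary OSDMap to attach to the thread
  ceph osd crush tunables optimal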

Re: [ceph-users] Spark/Mesos on top of Ceph/Btrfs

2015-01-14 Thread John Spray
On Tue, Jan 13, 2015 at 1:25 PM, James wirel...@tampabay.rr.com wrote: I was wondering if anyone has Mesos running on top of Ceph? I want to test/use Ceph in lieu of HDFS. You might be interested in http://ceph.com/docs/master/cephfs/hadoop/ It allows you to expose CephFS to applications that

Re: [ceph-users] Spark/Mesos on top of Ceph/Btrfs

2015-01-14 Thread James
Sebastien Han sebastien.han@... writes: What do you want to use from Ceph? RBD? CephFS? (I hope this post is not redundant; I still seem to be having some trouble posting to this group, and I'm using gmane.) Hello one and all, I am supposed to be able to post to this group via gmane, but

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Nick Fisk
Hi Jake, I can’t remember the exact details, but it was something to do with a potential problem when using the Pacemaker resource agents. I think it was a potential hanging issue where one LUN on a shared target failed and it then tried to kill all the other LUNs to fail the

[ceph-users] two mount points, two different data

2015-01-14 Thread Rafał Michalak
Hello, I have trouble with this situation: #node1 mount /dev/rbd/rbd/test /mnt cd /mnt touch test1 ls (I see test1, OK) #node2 mount /dev/rbd/rbd/test /mnt cd /mnt (I see test1, OK) touch test2 ls (I see test2, OK) #node1 ls (I see test1, BAD) touch test3 ls (I see test1, test3 BAD) #node2 ls (I
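The behaviour described here is expected if the RBD image is formatted with a non-cluster filesystem such as ext4 or xfs: each node caches metadata independently and never learns about the other node's writes, so such an image should only be mounted read-write on one node at a time. For shared access, CephFS or a cluster filesystem (e.g. OCFS2) on top of the RBD is the usual answer. A minimal CephFS mount sketch, where the monitor address and secret file are placeholders:
  mount -t ceph 192.168.0.1:6789:/ /mnt -o name=admin,secretfile=/etc/ceph/admin.secret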

[ceph-users] Better way to use osd's of different size

2015-01-14 Thread Межов Игорь Александрович
Hi! We have a small production Ceph cluster, based on the firefly release. It was built using hardware we already had on site, so it is not new and shiny, but it works quite well. It was started in 2014.09 as a proof of concept with 4 hosts with 3 x 1TB OSDs each: 1U dual socket Intel 54XX

[ceph-users] Placement groups stuck peering

2015-01-14 Thread Christian Eichelmann
Hi all, after our cluster problems with incomplete placement groups, we've decided to remove our pools and create new ones. This was going fine in the beginning. After adding an additional OSD server, we now have 2 PGs that are stuck in the peering state: HEALTH_WARN 2 pgs peering; 2 pgs stuck
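Output like the following usually shows what a stuck PG is waiting on (a sketch; the PG id 3.4f is a placeholder):
  ceph health detail          # lists the stuck PGs and their acting sets
  ceph pg 3.4f query          # the recovery_state section typically names the blocking OSDs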

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Stephan Seitz
Giuseppe, despite the fact I like SCST, I did a comparable setup with LIO (and the respective RBD LIO backend) in userspace. It spans over at least three bridge nodes without any problems. In contrast to usual (two controller, one backplane) iSCSI portals, I have to discover every single portal
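Discovering and logging in to each portal separately with open-iscsi looks roughly like this (a sketch; the portal addresses and IQN are placeholders):
  iscsiadm -m discovery -t sendtargets -p 10.0.0.11:3260
  iscsiadm -m discovery -t sendtargets -p 10.0.0.12:3260
  iscsiadm -m node -T iqn.2015-01.com.example:rbd-target -p 10.0.0.11:3260 --login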

Re: [ceph-users] How to tell a VM to write more local ceph nodes than to the network.

2015-01-14 Thread Gregory Farnum
On Tue, Jan 13, 2015 at 1:03 PM, Roland Giesler rol...@giesler.za.net wrote: I have a 4 node ceph cluster, but the disks are not equally distributed across all machines (they are substantially different from each other) One machine has 12 x 1TB SAS drives (h1), another has 8 x 300GB SAS (s3)

Re: [ceph-users] Spark/Mesos on top of Ceph/Btrfs

2015-01-14 Thread Gurvinder Singh
We are definitely interested in such a setup too. Not necessarily with btrfs, but Ceph combined with Spark and Mesos. Early last year the Ceph Hadoop plugin was not stable enough, and I haven't looked into it since. The Hadoop plugin relies on CephFS, which is still under development. So we are

Re: [ceph-users] problem deploying ceph on a 3 node test lab : active+degraded

2015-01-14 Thread Nicolas Zin
I found the problem :-) no weight on any OSD (in ceph osd tree). I just had to reweight the OSDs! - Original message - From: Nicolas Zin nicolas@savoirfairelinux.com To: ceph-users@lists.ceph.com Sent: Wednesday 14 January 2015 10:41:00 Subject: problem deploying ceph on a 3 node test lab :
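An OSD with weight 0 in the CRUSH map receives no data, which leaves PGs degraded. The fix is along these lines (a sketch; ids, weights and hostnames are placeholders):
  ceph osd tree                             # the WEIGHT column shows 0 for the affected OSDs
  ceph osd crush reweight osd.0 0.5         # weight roughly equal to the disk capacity in TB
  ceph osd crush add osd.0 0.5 host=node1   # if the OSD is missing from the CRUSH map entirely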

[ceph-users] problem deploying ceph on a 3 node test lab : active+degraded

2015-01-14 Thread Nicolas Zin
Hi, I have 3 VMs running Ubuntu 14.04. I mainly followed the http://ceph.com/docs/master/rados/deployment/ceph-deploy-install document to deploy on 3 VM nodes. First I thought that it didn't like having two NIC cards (one for public, one for storage), so I have now switched to a single-NIC platform. I

Re: [ceph-users] rbd directory listing performance issues

2015-01-14 Thread Shain Miley
Christian, Thanks again for providing this level of insight in trying to help us solve our issues. I am going to move ahead with the new hardware purchase just to make sure we eliminate hardware (or under-powered hardware) as the bottleneck. At this point the directory listings seem fast

Re: [ceph-users] ceph on peta scale

2015-01-14 Thread Zeeshan Ali Shah
So is there any other alternative for over-the-WAN deployment? I have a use case connecting two Swedish universities (a few hundred km apart). The target is that a user from univ A can write to the cluster at univ B and can read the data from other users. /Zee On Tue, Jan 13, 2015 at 7:41 AM, Robert

Re: [ceph-users] Problem with Rados gateway

2015-01-14 Thread Yehuda Sadeh
Try setting 'rgw print continue = false' in your ceph.conf. Yehuda On Thu, Jan 8, 2015 at 1:34 AM, Walter Valenti waltervale...@yahoo.it wrote: Scenario: Openstack Juno RDO on Centos7. Ceph version: Giant. On CentOS 7 the old fastcgi is no longer available, but there's mod_fcgid. The apache VH
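The setting belongs in the radosgw client section of ceph.conf on the gateway host; the section name below is the common default and may differ per deployment:
  [client.radosgw.gateway]
  rgw print continue = false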

[ceph-users] Spark/Mesos on top of Ceph/Btrfs

2015-01-14 Thread James
Hello, I was wondering if anyone has Mesos running on top of Ceph? I want to test/use Ceph in lieu of HDFS. I'm working on Gentoo, but any experiences with Mesos on Ceph are of keen interest to me as related to performance, stability and any difficulties experienced. James

Re: [ceph-users] ceph on peta scale

2015-01-14 Thread James
Gregory Farnum greg@... writes: Ceph isn't really suited for WAN-style distribution. Some users have high-enough and consistent-enough bandwidth (with low enough latency) to do it, but otherwise you probably want to use Ceph within the data centers and layer something else on top of it.

Re: [ceph-users] cephfs modification time

2015-01-14 Thread Gregory Farnum
Awesome, thanks for the bug report and the fix, guys. :) -Greg On Mon, Jan 12, 2015 at 11:18 PM, 严正 z...@redhat.com wrote: I tracked down the bug. Please try the attached patch Regards Yan, Zheng On 13 Jan 2015, at 07:40, Gregory Farnum g...@gregs42.com wrote: Zheng, this looks like a kernel

Re: [ceph-users] error adding OSD to crushmap

2015-01-14 Thread Martin B Nielsen
Hi Luis, I might remember wrong, but don't you need to actually create the osd first? (ceph osd create) Then you can assign it a position using the CLI crush rules. Like Jason said, can you send the ceph osd tree output? Cheers, Martin On Mon, Jan 12, 2015 at 1:45 PM, Luis Periquito
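The order Martin describes is roughly the following (a sketch; the returned id, weight and host are placeholders):
  ceph osd create                            # allocates and prints the new osd id, e.g. 12
  ceph osd crush add osd.12 1.0 host=node3   # place it in the CRUSH map with a weight
  ceph osd tree                              # confirm it now appears under its host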

Re: [ceph-users] got XmlParseFailure when libs3 client accessing radosgw object gateway

2015-01-14 Thread Ken Dreyer
On 01/06/2015 02:21 AM, Liu, Xuezhao wrote: But when I using libs3 (clone from http://github.com/wido/libs3.git ), the s3 commander does not work as expected: Hi Xuezhao, Wido's fork of libs3 is pretty old and not up to date [1]. It's best to use Bryan's repository instead:

Re: [ceph-users] Radosgw with SSL enabled

2015-01-14 Thread lakshmi k s
Hello All - Happy 2015. I have been successful in establishing communication using the --insecure option. I have two problems here. 1. swift calls without the --insecure option continue to fail. Not sure why? 2. The ceph gateway log shows the following errors. Any thoughts on why I am seeing this
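If the gateway certificate is self-signed, the usual alternative to --insecure is to point the swift client at the CA bundle (a sketch; the certificate path is a placeholder):
  swift --os-cacert /etc/ssl/certs/radosgw-ca.pem list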

Re: [ceph-users] NUMA zone_reclaim_mode

2015-01-14 Thread Gregory Farnum
On Mon, Jan 12, 2015 at 8:25 AM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: On 12 Jan 2015, at 17:08, Sage Weil s...@newdream.net wrote: On Mon, 12 Jan 2015, Dan Van Der Ster wrote: Moving forward, I think it would be good for Ceph to at least document this behaviour, but better would

Re: [ceph-users] NUMA zone_reclaim_mode

2015-01-14 Thread Sage Weil
On Mon, 12 Jan 2015, Dan Van Der Ster wrote: Sure, I'll try to prepare a patch which warns but isn't too annoying. MongoDB already solved the heuristic: https://github.com/mongodb/mongo/blob/master/src/mongo/db/startup_warnings_mongod.cpp It's licensed as AGPLv3 -- do you already know if
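The setting in question can be checked and disabled like this (writing a file under /etc/sysctl.d to persist it is an assumption about the setup):
  cat /proc/sys/vm/zone_reclaim_mode                                     # non-zero means NUMA reclaim is active
  sysctl -w vm.zone_reclaim_mode=0                                       # disable it immediately
  echo 'vm.zone_reclaim_mode = 0' > /etc/sysctl.d/99-zone-reclaim.conf   # persist across reboots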

[ceph-users] Recovering some data with 2 of 2240 pg in remapped+peering

2015-01-14 Thread Chris Murray
Hi all, I think I know the answer to this already after reading similar queries, but I'll ask in case times have changed. After an error on my part, I have a very small number of pgs in remapped+peering. They don't look like they'll get out of that state. Some IO is blocked too, as you

Re: [ceph-users] rbd directory listing performance issues

2015-01-14 Thread Christian Balzer
On Mon, 12 Jan 2015 13:49:28 + Shain Miley wrote: Hi, I am just wondering if anyone has any thoughts on the questions below...I would like to order some additional hardware ASAP...and the order that I place may change depending on the feedback that I receive. Thanks again, Shain

[ceph-users] Part 2: ssd osd fails often with FAILED assert(soid < scrubber.start || soid >= scrubber.end)

2015-01-14 Thread Udo Lembke
Hi again, sorry for not threading, but my last email didn't come back to the mailing list (I often miss some posts!). Just after sending the last mail, another SSD failed for the first time - in this case a cheap one, but with the same error: root@ceph-04:/var/log/ceph# more ceph-osd.62.log 2015-01-13

[ceph-users] Cache pool latency impact

2015-01-14 Thread Pavan Rallabhandi
This is regarding cache pools and the impact of flush/evict on client IO latencies. I am seeing a direct impact on client IO latencies (making them worse) when flush/evict is triggered on the cache pool. Under a constant ingress of IOs on the cache pool, the write performance is no
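How aggressively a cache pool flushes and evicts is controlled per pool; tuning the dirty/full targets and size limits can spread the flushing out rather than letting it burst (a sketch; the pool name and values are placeholders, not recommendations):
  ceph osd pool set cachepool cache_target_dirty_ratio 0.4
  ceph osd pool set cachepool cache_target_full_ratio 0.8
  ceph osd pool set cachepool target_max_bytes 500000000000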

[ceph-users] rgw single bucket performance question

2015-01-14 Thread baijia...@126.com
I know a single bucket has performance issues, per http://tracker.ceph.com/issues/8473. I attempted to modify the crush map to put the bucket index pool on SSD, but performance is not good and the SSDs are never fully utilized. This is the op description; can you give me some suggestions to improve

Re: [ceph-users] CRUSH question - failing to rebalance after failure test

2015-01-14 Thread Christopher Kunz
Hi, Okay, it sounds like something is not quite right then. Can you attach the OSDMap once it is in the not-quite-repaired state? And/or try setting 'ceph osd crush tunables optimal' and see if that has any effect? Indeed it did - I set ceph osd crush tunables optimal (80% degradation)

[ceph-users] How to tell a VM to write more local ceph nodes than to the network.

2015-01-14 Thread Roland Giesler
I have a 4 node ceph cluster, but the disks are not equally distributed across all machines (they are substantially different from each other) One machine has 12 x 1TB SAS drives (h1), another has 8 x 300GB SAS (s3) and two machines have only two 1 TB drives each (s2 s1). Now machine s3 has by

Re: [ceph-users] reset osd perf counters

2015-01-14 Thread Sebastien Han
It was added in 0.90 On 13 Jan 2015, at 00:11, Gregory Farnum g...@gregs42.com wrote: perf reset on the admin socket. I'm not sure what version it went in to; you can check the release logs if it doesn't work on whatever you have installed. :) -Greg On Mon, Jan 12, 2015 at 2:26 PM,
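On 0.90 or later the reset goes through the OSD's admin socket (a sketch; the osd id is a placeholder and the exact argument form may differ between releases):
  ceph daemon osd.0 perf reset all
  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf reset all   # same thing via the socket path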

[ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Giuseppe Civitella
Hi all, I'm working on a lab setup regarding Ceph serving rbd images as iSCSI datastores to VMware via a LIO box. Is there someone that already did something similar and wants to share some knowledge? Any production deployments? What about LIO's HA and LUN performance? Thanks Giuseppe

Re: [ceph-users] any workaround for FAILED assert(p != snapset.clones.end())

2015-01-14 Thread Luke Kao
Hi Sam and Greg, No, not using cache tier. Just for your information, backend filestore is btrfs with zlib compression Need I provide any more information? Thanks. BR, Luke From: Samuel Just [sam.j...@inktank.com] Sent: Wednesday, January 14, 2015 1:22

Re: [ceph-users] reset osd perf counters

2015-01-14 Thread Shain Miley
Ok we are on the latest firefly release...so I guess we will have to live with it until we upgrade to Hammer. Thanks, Shain Sent from my iPhone On Jan 13, 2015, at 5:30 AM, Sebastien Han sebastien@enovance.com wrote: It was added in 0.90 On 13 Jan 2015, at 00:11, Gregory Farnum

Re: [ceph-users] Recovering some data with 2 of 2240 pg in remapped+peering

2015-01-14 Thread Wido den Hollander
On 01/13/2015 07:33 PM, Chris Murray wrote: Hi all, I think I know the answer to this already after reading similar queries, but I'll ask in case times have changed. After an error on my part, I have a very small number of pgs in remapped+peering. They don't look like they'll

[ceph-users] Multiple OSDs crashing constantly

2015-01-14 Thread Scott Laird
I'm having a problem with 0.87 on Ubuntu. I created a cephfs filesystem on top of a 2,2 EC pool with a cache tier and copied a bunch of data (non-critical) onto it, and now 4 of my OSDs (on 3 physical servers) are crash-looping on startup. If I stop one of the crashing OSDs, then a different OSD

Re: [ceph-users] any workaround for FAILED assert(p != snapset.clones.end())

2015-01-14 Thread Samuel Just
Are you using a cache tier? -Sam On Mon, Jan 12, 2015 at 11:37 PM, Luke Kao luke@mycom-osi.com wrote: Hello community, We have a cluster using v0.80.5, and recently several OSDs goes down with error when removing a rbd snapshot: osd/ReplicatedPG.cc: 2352: FAILED assert(p !=

Re: [ceph-users] error adding OSD to crushmap

2015-01-14 Thread Luis Periquito
Hi Martin, you are correct, I have to call ceph osd create before calling ceph crush add, and after looking into the puppet code I now realise I didn't (my bug). thanks for your help. On Tue, Jan 13, 2015 at 4:05 PM, Martin B Nielsen mar...@unity3d.com wrote: Hi Luis, I might remember

Re: [ceph-users] Part 2: ssd osd fails often with FAILED assert(soid < scrubber.start || soid >= scrubber.end)

2015-01-14 Thread Loic Dachary
Hi, This is http://tracker.ceph.com/issues/8011 which is being backported. Cheers On 13/01/2015 22:00, Udo Lembke wrote: Hi again, sorry for not threading, but my last email didn't come back to the mailing list (I often miss some posts!). Just after sending the last mail, for the first time

[ceph-users] Object gateway install questions

2015-01-14 Thread Hoc Phan
1. Where do I install Apache/FastCGI based on http://docs.ceph.com/docs/master/install/install-ceph-gateway/? Is it on the admin node? 2. When will I need 100-continue? I am just learning this, so is it OK to do without 100-continue to keep things simple? Sorry for basic

[ceph-users] Caching

2015-01-14 Thread Samuel Terburg - Panther-IT BV
I have a couple of questions about caching: I have 5 VM hosts serving 20 VMs. I have 1 Ceph pool where the VM disks of those 20 VMs reside as RBD images. 1) Can I use multiple caching tiers on the same data pool? I would like to use a local SSD OSD on each VM host that can serve as
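As far as I know only one cache tier can be set as the overlay for a given base pool, so per-host local SSD tiers over the same pool are not possible; a standard single writeback tier is set up roughly like this (a sketch; pool names are placeholders):
  ceph osd tier add rbd ssd-cache
  ceph osd tier cache-mode ssd-cache writeback
  ceph osd tier set-overlay rbd ssd-cache
  ceph osd pool set ssd-cache hit_set_type bloom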

Re: [ceph-users] NUMA zone_reclaim_mode

2015-01-14 Thread Loic Dachary
On 13/01/2015 01:10, Gregory Farnum wrote: On Mon, Jan 12, 2015 at 8:25 AM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: On 12 Jan 2015, at 17:08, Sage Weil s...@newdream.net wrote: On Mon, 12 Jan 2015, Dan Van Der Ster wrote: Moving forward, I think it would be good for Ceph to a

Re: [ceph-users] Recovering some data with 2 of 2240 pg inremapped+peering

2015-01-14 Thread Chris Murray
No, I/O will block for those PGs as long as you don't mark them as lost. Isn't there any way to get those OSDs back? If you can, you can restore the PGs. Interesting, 'lost' is a term I'm not yet familiar with regarding Ceph. I'll read up on it. One of the OSDs was re-used straight away, and
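Marking an OSD lost is a last resort: it tells the cluster to give up on whatever data only that OSD held so peering can proceed (a sketch; osd.12 is a placeholder):
  ceph pg dump_stuck inactive               # identify the PGs still waiting on the missing OSD
  ceph osd lost 12 --yes-i-really-mean-it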

Re: [ceph-users] ceph on peta scale

2015-01-14 Thread Zeeshan Ali Shah
Thanks James, I will look into it Zeeshan On Tue, Jan 13, 2015 at 2:00 PM, James wirel...@tampabay.rr.com wrote: Gregory Farnum greg@... writes: Ceph isn't really suited for WAN-style distribution. Some users have high-enough and consistent-enough bandwidth (with low enough latency)

Re: [ceph-users] NUMA zone_reclaim_mode

2015-01-14 Thread Loic Dachary
Hi Dan, On 12/01/2015 17:25, Dan Van Der Ster wrote: On 12 Jan 2015, at 17:08, Sage Weil s...@newdream.net wrote: On Mon, 12 Jan 2015, Dan Van Der Ster wrote: Moving forward, I think it would be good for Ceph to at least document this behaviour, but better would

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Nick Fisk
Hi Giuseppe, I am working on something very similar at the moment. I currently have it working on some test hardware and it seems to be working reasonably well. I say reasonably as I have had a few instabilities, but these are on the HA side; the LIO and RBD side of things have been rock

Re: [ceph-users] ceph on peta scale

2015-01-14 Thread Robert van Leeuwen
So is there any other alternative for over-the-WAN deployment? I have a use case connecting two Swedish universities (a few hundred km apart). The target is that a user from univ A can write to the cluster at univ B and can read the data from other users. You could have a look at OpenStack Swift: it

Re: [ceph-users] Spark/Mesos on top of Ceph/Btrfs

2015-01-14 Thread Sebastien Han
Hey What do you want to use from Ceph? RBD? CephFS? It is not really clear; you mentioned ceph/btrfs, which makes me think either of using btrfs for the OSD store or btrfs on top of an RBD device. Later you mentioned HDFS, does that mean you want to use CephFS? I don’t know much about Mesos, but