[ceph-users] RGW buckets sync to AWS?

2015-03-31 Thread Henrik Korkuc
Hello, can anyone recommend a script/program to periodically synchronize RGW buckets with Amazon's S3? -- Sincerely Henrik
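A periodic two-hop sync with s3cmd is one minimal approach, e.g. from cron: pull the RGW bucket to a staging directory, then push it to AWS. The config file names, bucket name, and staging path below are hypothetical; each .s3cfg points at its respective endpoint (the RGW host in one, s3.amazonaws.com in the other):

    # pull from local RGW, then push to AWS S3 (placeholder names)
    s3cmd -c ~/.s3cfg-rgw sync s3://mybucket/ /var/spool/bucket-mirror/
    s3cmd -c ~/.s3cfg-aws sync /var/spool/bucket-mirror/ s3://mybucket/

Note this stages a full copy on local disk and does not propagate deletions unless s3cmd's --delete-removed is added.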

Re: [ceph-users] One host failure bring down the whole cluster

2015-03-31 Thread Kai KH Huang
1) But Ceph says "...You can run a cluster with 1 monitor." (http://ceph.com/docs/master/rados/operations/add-or-rm-mons/), so I assume it should work. And split brain is not my current concern. 2) I've written objects to Ceph; now I just want to get them back. Anyway, I tried to reduce the mon

Re: [ceph-users] Radosgw authorization failed

2015-03-31 Thread Neville
Date: Mon, 30 Mar 2015 12:17:48 -0400 From: yeh...@redhat.com To: neville.tay...@hotmail.co.uk CC: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Radosgw authorization failed - Original Message - From: Neville neville.tay...@hotmail.co.uk To: Yehuda Sadeh-Weinraub

[ceph-users] Radosgw multi-region user creation question

2015-03-31 Thread Abhishek L
Hi, I'm trying to set up a POC multi-region radosgw configuration (with different ceph clusters). Following the official docs[1], the part about creating the zone system users was not very clear. Going by an example configuration of 2 regions US (master zone us-dc1), EU (master zone eu-dc1)
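As far as I understand the federated setup (a sketch, not the docs' exact procedure), each zone needs a system user whose keys match what the zone configuration declares, so the sync agents can authenticate. The uid, display name, and keys below are placeholders, using the zone names from the example:

    # hypothetical: create the system user for the master zone us-dc1
    radosgw-admin user create --uid=us-dc1-system \
        --display-name="US DC1 system user" \
        --access-key={access} --secret={secret} --system

The same pattern would repeat for eu-dc1, and presumably the users must exist in both clusters with identical keys for cross-cluster sync to work.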

Re: [ceph-users] One host failure bring down the whole cluster

2015-03-31 Thread Henrik Korkuc
On 3/31/15 11:27, Kai KH Huang wrote: 1) But Ceph says "...You can run a cluster with 1 monitor." (http://ceph.com/docs/master/rados/operations/add-or-rm-mons/), so I assume it should work. And split brain is not my current concern. The point is that you must have a majority of monitors up. * In one
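Concretely: with 3 monitors, 2 must be up to form quorum; a single survivor out of 3 is not enough, and the cluster will not serve I/O. Quorum state can be checked with the commands below (the first hangs when there is no quorum, so in that case query a monitor's admin socket directly; <mon-id> is a placeholder for the monitor's name):

    ceph quorum_status --format json-pretty
    ceph --admin-daemon /var/run/ceph/ceph-mon.<mon-id>.asok mon_status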

Re: [ceph-users] Cannot add OSD node into crushmap or all writes fail

2015-03-31 Thread Henrik Korkuc
check firewall rules, network connectivity. Can all nodes and clients reach each other? Can you telnet to the OSD ports (note that multiple OSDs may listen on different ports)? On 3/31/15 8:44, Tyler Bishop wrote: I have this ceph node that will correctly recover into my ceph pool and
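A quick way to run that check (the address and port below are examples; OSDs bind ports in the 6800-7300 range, and one host can run several OSDs on different ports):

    # list each OSD's bound address/port from the osdmap
    ceph osd dump | grep '^osd\.'
    # then, from every other node and client, test reachability, e.g.:
    telnet 192.0.2.10 6800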

[ceph-users] One of three monitors can not be started

2015-03-31 Thread 张皓宇
Who can help me? One monitor in my ceph cluster cannot be started. Before that, I added '[mon] mon_compact_on_start = true' to /etc/ceph/ceph.conf on the three monitor hosts. Then I ran 'ceph tell mon.computer05 compact' on computer05, which has a monitor on it. When store.db of computer05
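For reference, the two compaction mechanisms described above, shown in context (both taken from this message):

    # compact a running monitor's store on demand
    ceph tell mon.computer05 compact

    # or compact the store at every monitor start, via ceph.conf
    [mon]
    mon compact on start = true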

Re: [ceph-users] SSD Hardware recommendation

2015-03-31 Thread f...@univ-lr.fr
Hi, in our quest to get the right SSD for OSD journals, I managed to benchmark two kinds of 10 DWPD SSDs: - Toshiba M2 PX02SMF020 - Samsung 845DC PRO I want to determine if a disk is appropriate considering its absolute performance, and the optimal number of ceph-osd processes using the
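One common screening test for journal candidates, since the journal's write pattern is roughly small synchronous direct writes, is a dd run with O_DIRECT and O_DSYNC. The device name is a placeholder, and note this overwrites whatever is on it:

    dd if=/dev/zero of=/dev/sdX bs=4k count=10000 oflag=direct,dsync

A drive that sustains good throughput here at small block sizes is usually a better journal choice than one with higher headline sequential numbers.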

Re: [ceph-users] Creating and deploying OSDs in parallel

2015-03-31 Thread Dan van der Ster
Hi Somnath, We have deployed many machines in parallel and it generally works. Keep in mind that if you deploy a great many (1000), this will create so many osdmap incrementals, so quickly, that the memory usage on the OSDs will increase substantially (until you reboot). Best Regards, Dan On

Re: [ceph-users] One of three monitors can not be started

2015-03-31 Thread 张皓宇
There is an asok file on computer06. I tried to start mon.computer06; maybe two hours later, mon.computer06 had still not started, but there are some unexpected processes on computer06 that I don't know how to handle: root 7812 1 0 11:39 pts/4 00:00:00 python /usr/sbin/ceph-create-keys

Re: [ceph-users] SSD Journaling

2015-03-31 Thread Garg, Pankaj
Hi Mark, Yes, my reads are consistently slower. I have tested both random and sequential, and various block sizes. Thanks Pankaj -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: Monday, March 30, 2015 1:07 PM To:

Re: [ceph-users] SSD Hardware recommendation

2015-03-31 Thread Adam Tygart
Speaking of SSD IOPS. Running the same tests on my SSDs (LiteOn ECT-480N9S 480GB SSDs): The lines at the bottom are a single 6TB spinning disk for comparison's sake. http://imgur.com/a/fD0Mh Based on these numbers, there is a minimum latency per operation, but multiple operations can be

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
Turns out jumbo frames were not enabled on all the switch ports. Once that was resolved, the cluster quickly became healthy. On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us wrote: I've been working at this peering problem all day. I've done a lot of testing at the network layer

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Gregory Farnum
On Tue, Mar 31, 2015 at 7:50 AM, Quentin Hartman qhart...@direwolfdigital.com wrote: I'm working on redeploying a 14-node cluster. I'm running giant 0.87.1. Last Friday I got everything deployed and all was working well, and I set noout and shut all the OSD nodes down over the weekend.

Re: [ceph-users] One of three monitors can not be started

2015-03-31 Thread Gregory Farnum
On Tue, Mar 31, 2015 at 2:50 AM, 张皓宇 zhanghaoyu1...@hotmail.com wrote: Who can help me? One monitor in my ceph cluster cannot be started. Before that, I added '[mon] mon_compact_on_start = true' to /etc/ceph/ceph.conf on three monitor hosts. Then I ran 'ceph tell mon.computer05 compact' on

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread koukou73gr
On 03/31/2015 09:23 PM, Sage Weil wrote: It's nothing specific to peering (or ceph). The symptom we've seen is just that bytes stop passing across a TCP connection, usually when there are some largish messages being sent. The ping/heartbeat messages get through because they are small and we

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Quentin Hartman
Thanks for the extra info, Gregory. I did not also set nodown. I expect that I will very rarely be shutting everything down in the normal course of things, but it has come up a couple of times when having to do some physical re-organizing of racks. Little irritants like this aren't a big deal if

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
At the L2 level, if the hosts and switches don't accept jumbo frames, they just drop them because they are too big. They are not fragmented because they don't go through a router. My problem is that OSDs were able to peer with other OSDs on the host, but my guess is that they never sent/received
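A quick way to catch this kind of mismatch is a don't-fragment ping at jumbo size between every pair of hosts (Linux ping syntax; 8972 = 9000 minus 20 bytes of IP header and 8 of ICMP):

    ping -M do -s 8972 <peer-host>

If a normal ping works but this one gets no replies or reports errors, something on the path is not passing jumbo frames.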

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Sage Weil
On Tue, 31 Mar 2015, Robert LeBlanc wrote: Turns out jumbo frames were not enabled on all the switch ports. Once that was resolved, the cluster quickly became healthy. I always hesitate to point the finger at the jumbo frames configuration, but almost every time that is the culprit! Thanks for the

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Somnath Roy
But, do we know why jumbo frames may have an impact on peering? In our setup so far, we haven't enabled jumbo frames for anything other than performance reasons (if at all). Thanks Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
I was desperate for anything after exhausting every other possibility I could think of. Maybe I should put a checklist of things to look for in the Ceph docs. Thanks, On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil s...@newdream.net wrote: On Tue, 31 Mar 2015, Robert LeBlanc wrote: Turns out jumbo

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Sage Weil
On Tue, 31 Mar 2015, Somnath Roy wrote: But, do we know why jumbo frames may have an impact on peering? In our setup so far, we haven't enabled jumbo frames for anything other than performance reasons (if at all). It's nothing specific to peering (or ceph). The symptom we've seen is just that bytes stop

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Quentin Hartman
On Tue, Mar 31, 2015 at 2:05 PM, Gregory Farnum g...@gregs42.com wrote: Github pull requests. :) Ah, well that's easy: https://github.com/ceph/ceph/pull/4237 QH

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Jeffrey Ollie
On Tue, Mar 31, 2015 at 3:05 PM, Gregory Farnum g...@gregs42.com wrote: On Tue, Mar 31, 2015 at 12:56 PM, Quentin Hartman wrote: My understanding is that the right method to take an entire cluster offline is to set noout and then shut everything down. Is there a better way? That's

[ceph-users] Weird cluster restart behavior

2015-03-31 Thread Quentin Hartman
I'm working on redeploying a 14-node cluster. I'm running giant 0.87.1. Last Friday I got everything deployed and all was working well, and I set noout and shut all the OSD nodes down over the weekend. Yesterday when I spun it back up, the OSDs were behaving very strangely, incorrectly marking
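For the record, the flag dance in question is standard Ceph CLI; nodown is the extra flag Greg suggests later in the thread:

    ceph osd set noout      # prevent down OSDs from being marked out (no rebalance)
    ceph osd set nodown     # optional: suppress mark-down churn during restart
    # ... shut down, do the work, power back up ...
    ceph osd unset nodown
    ceph osd unset noout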

Re: [ceph-users] Weird cluster restart behavior

2015-03-31 Thread Gregory Farnum
On Tue, Mar 31, 2015 at 12:56 PM, Quentin Hartman qhart...@direwolfdigital.com wrote: Thanks for the extra info, Gregory. I did not also set nodown. I expect that I will very rarely be shutting everything down in the normal course of things, but it has come up a couple of times when having to do

Re: [ceph-users] Cascading Failure of OSDs

2015-03-31 Thread Francois Lafont
Hi, Quentin Hartman wrote: Since I have been in ceph-land today, it reminded me that I needed to close the loop on this. I was finally able to isolate this problem down to a faulty NIC on the ceph cluster network. It worked, but it was accumulating a huge number of Rx errors. My best guess
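The counters that give this away are readable per interface; eth0 below is a placeholder:

    ip -s link show eth0             # the RX errors column
    ethtool -S eth0 | grep -i err    # per-driver error counters

A steadily climbing RX error count on an otherwise "working" NIC is exactly the pattern described here.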