Re: [ceph-users] Cannot create bucket via the S3 (s3cmd)

2016-02-17 Thread Arvydas Opulskis
Hi, Are you using the rgw_dns_name parameter in the config? Sometimes it's needed (when the S3 client sends the bucket name as a subdomain). Arvydas From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandr Porunov Sent: Wednesday, February 17, 2016 10:37 PM To: Василий Ангапов
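A minimal sketch of what that could look like in ceph.conf (the section name and hostname below are placeholders, not taken from this thread):

    [client.rgw.gateway]
    rgw dns name = s3.example.com

If s3cmd addresses buckets as subdomains, its host_base/host_bucket settings in ~/.s3cfg (plus a wildcard DNS record) would need to point at the same name.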

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread Christian Balzer
Hello, On Wed, 17 Feb 2016 09:19:39 - Nick Fisk wrote: > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf > > Of Christian Balzer > > Sent: 17 February 2016 02:41 > > To: ceph-users > > Subject: Re:

Re: [ceph-users] Performance Testing of CEPH on ARM MicroServer

2016-02-17 Thread Christian Balzer
Hello, On Wed, 17 Feb 2016 21:47:31 +0530 Swapnil Jain wrote: > Thanks Christian, > > > > > On 17-Feb-2016, at 7:25 AM, Christian Balzer wrote: > > > > > > Hello, > > > > On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote: > > > >> For most of you CEPH on ARMv7 might

Re: [ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-17 Thread Gregory Farnum
You probably don't want to try and replace the dead OSD with a new one until stuff is otherwise recovered. Just import the PG into any OSD in the cluster and it should serve the data up for proper recovery (and then delete it when done). I've never done this or worked on the tooling though so
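For anyone following along, a rough sketch of that export/import flow with ceph-objectstore-tool (the OSD ids, paths and PG id below are placeholders; both the source and target OSDs must be stopped, and the exact syntax should be checked against your release):

    # on the crashed OSD's host
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --journal-path /var/lib/ceph/osd/ceph-12/journal \
        --pgid 3.7f --op export --file /tmp/pg.3.7f.export
    # on any surviving OSD (stopped), then start it again
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
        --journal-path /var/lib/ceph/osd/ceph-5/journal \
        --op import --file /tmp/pg.3.7f.export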

Re: [ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-17 Thread Kostis Fardelas
Right now the PG is served by two other OSDs and fresh data is written to them. Is it safe to export the stale PG contents from the crashed OSD and try to just import them back into the cluster (the PG is not entirely lost, only some objects didn't make it)? What could be the right sequence

Re: [ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-17 Thread Gregory Farnum
On Wed, Feb 17, 2016 at 4:44 PM, Kostis Fardelas wrote: > Thanks Greg, > I gather from reading about ceph_objectstore_tool that it acts at the > level of the PG. The fact is that I do not want to wipe the whole PG, > only export certain objects (the unfound ones) and import

Re: [ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-17 Thread Kostis Fardelas
Thanks Greg, I gather from reading about ceph_objectstore_tool that it acts at the level of the PG. The fact is that I do not want to wipe the whole PG, only export certain objects (the unfound ones) and import them again into the cluster. To be precise, the PG with the unfound objects is mapped

Re: [ceph-users] How to properly deal with NEAR FULL OSD

2016-02-17 Thread Stillwell, Bryan
Vlad, First off, your cluster is rather full (80.31%). Hopefully you have hardware ordered for an expansion in the near future. Based on your 'ceph osd tree' output, it doesn't look like the reweight-by-utilization did anything for you. That last number for each OSD is set to 1, which means it

Re: [ceph-users] How to properly deal with NEAR FULL OSD

2016-02-17 Thread Jan Schermer
It would be helpful to see your crush map (there are also some tunables available that help with this issue, if you're not running an ancient version). However, distribution uniformity isn't really that great. It helps to increase the number of PGs, but beware that there's no turning back. Other
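As a hedged sketch of the commands being referred to (the pool name and PG counts are placeholders; increasing pg_num is permanent and triggers data movement, so plan it carefully):

    ceph osd crush show-tunables          # see which tunables profile is in effect
    ceph osd pool set rbd pg_num 512
    ceph osd pool set rbd pgp_num 512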

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Jan Schermer
Hmm, it's possible there aren't any safeguards against filling the whole drive when increasing PGs; actually, I think Ceph only cares about free space when backfilling, which is not what happened (at least directly) in your case. However, having a completely full OSD filesystem is not going to end

[ceph-users] How to properly deal with NEAR FULL OSD

2016-02-17 Thread Vlad Blando
Hi, This has been bugging me for some time now: the distribution of data across the OSDs is not balanced, so some OSDs are near full. I ran ceph osd reweight-by-utilization but it is not helping much. [root@controller-node ~]# ceph osd tree # id weight type name up/down reweight -1 98.28 root

Re: [ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-17 Thread Gregory Farnum
On Wed, Feb 17, 2016 at 3:05 PM, Kostis Fardelas wrote: > Hello cephers, > due to an unfortunate sequence of events (disk crashes, network > problems), we are currently in a situation with one PG that reports > unfound objects. There is also an OSD which cannot start-up and >

[ceph-users] Recover unfound objects from crashed OSD's underlying filesystem

2016-02-17 Thread Kostis Fardelas
Hello cephers, due to an unfortunate sequence of events (disk crashes, network problems), we are currently in a situation with one PG that reports unfound objects. There is also an OSD which cannot start up and crashes with the following: 2016-02-17 18:40:01.919546 7fecb0692700 -1

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Lukáš Kubín
You're right, the "full" osd was still up and in until I increased the pg values of one of the pools. The redistribution has not completed yet and perhaps that's what is still filling the drive. With this info - do you think I'm still safe to follow the steps suggested in previous post? Thanks!

Re: [ceph-users] pg repair behavior? (Was: Re: getting rid of misplaced objects)

2016-02-17 Thread George Mihaiescu
We have three replicas, so we just performed md5sum on all of them in order to find the correct ones, then we deleted the bad file and ran pg repair. On 15 Feb 2016 10:42 a.m., "Zoltan Arnold Nagy" wrote: > Hi Bryan, > > You were right: we’ve modified our PG weights a

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Jan Schermer
Something must be on those 2 OSDs that ate all that space - ceph by default doesn't allow an OSD to get completely full (filesystem-wise), and from what you've shown those filesystems are really, really full. OSDs don't usually go down when "full" (95%)... or do they? I don't think so... so the

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Lukáš Kubín
Ahoj Jan, thanks for the quick hint! Those 2 OSDs are currently full and down. How should I handle that? Is it OK to delete some PG directories again and start the OSD daemons on both drives in parallel, and then set the weights as recommended? What effect should I expect then - will the

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Somnath Roy
If you are not sure what weight to use, 'ceph osd reweight-by-utilization' should also do the job for you automatically. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Wednesday, February 17, 2016 12:48 PM To: Lukáš
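For completeness, the invocation is roughly as follows (the threshold argument is optional; 120 is only an example, meaning "only touch OSDs more than 20% above mean utilization"):

    ceph osd reweight-by-utilization 120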

[ceph-users] Idea for speedup RadosGW for buckets with many objects.

2016-02-17 Thread Krzysztof Księżyk
Hi, I'm experiencing a problem with poor performance of RadosGW while operating on a bucket with many objects. That's a known issue with LevelDB and can be partially resolved using sharding, but I have one more idea. As I see in the ceph osd logs, all slow requests occur while making calls to rgw.bucket_list:
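For reference, the sharding being referred to is (roughly) this ceph.conf option; note it only applies to buckets created after it is set, and the section name below is a placeholder:

    [client.rgw.gateway]
    rgw override bucket index max shards = 16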

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Jan Schermer
Ahoj ;-) You can reweight them temporarily; that shifts the data from the full drives. ceph osd reweight osd.XX YY (XX = the number of the full OSD, YY is the "weight", which defaults to 1). This is different from "crush reweight", which defaults to the drive size in TB. Beware that reweighting will (afaik)
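A concrete (made-up) example of the above, dropping a full OSD's weight by 15%:

    ceph osd reweight osd.11 0.85

In 'ceph osd tree' the rightmost (reweight) column for osd.11 then shows 0.85; raising it back towards 1 later moves data onto that OSD again.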

[ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Lukáš Kubín
Hi, I'm running a very small setup of 2 nodes with 6 OSDs each. There are 2 pools, each of size=2. Today, one of our OSDs got full and another 2 are near full, and the cluster turned into the ERR state. I have noticed uneven space distribution among the OSD drives, between 70 and 100 percent. I have realized there's a low

Re: [ceph-users] Cannot create bucket via the S3 (s3cmd)

2016-02-17 Thread Alexandr Porunov
Because I had created them manually and then installed the Rados Gateway. After that I realised that the Rados Gateway didn't work. I thought it was because I had created the pools manually, so I removed those buckets which I had created and reinstalled the Rados Gateway. But without success, of course

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread Nick Fisk
Ah, typo, I meant to say 10MHz per IO. So a 7.2k disk does around 80 IOPS = ~800MHz, which is close to the 1GHz figure. From: John Hogenmiller [mailto:j...@hogenmiller.net] Sent: 17 February 2016 13:15 To: Nick Fisk Cc: Василий Ангапов ;
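Spelled out, that is: 80 IOPS per 7.2k-rpm drive x ~10MHz per IO ≈ 800MHz of CPU per OSD, i.e. close to the ~1GHz-per-OSD rule of thumb discussed elsewhere in this thread.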

Re: [ceph-users] Cannot create bucket via the S3 (s3cmd)

2016-02-17 Thread Василий Ангапов
First, it seems to me you should not delete the pools .rgw.buckets and .rgw.buckets.index, because those are the pools where RGW actually stores buckets. But why did you do that? 2016-02-18 3:08 GMT+08:00 Alexandr Porunov : > When I try to create bucket: > s3cmd mb

[ceph-users] Cannot create bucket via the S3 (s3cmd)

2016-02-17 Thread Alexandr Porunov
When I try to create bucket: s3cmd mb s3://first-bucket I always get this error: ERROR: S3 error: 405 (MethodNotAllowed) /var/log/ceph/ceph-client.rgw.gateway.log : 2016-02-17 20:22:49.282715 7f86c50f3700 1 handle_sigterm 2016-02-17 20:22:49.282750 7f86c50f3700 1 handle_sigterm set alarm for

Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2016-02-17 Thread Scottix
Looks like the bug with the kernel using Ceph and XFS was fixed. I haven't tested it yet but just wanted to give an update. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1527062 On Tue, Dec 8, 2015 at 8:05 AM Scottix wrote: > I can confirm it seems to be kernels

Re: [ceph-users] Performance issues related to scrubbing

2016-02-17 Thread Cullen King
On Wed, Feb 17, 2016 at 12:13 AM, Christian Balzer wrote: > > Hello, > > On Tue, 16 Feb 2016 10:46:32 -0800 Cullen King wrote: > > > Thanks for the helpful commentary Christian. Cluster is performing much > > better with 50% more spindles (12 to 18 drives), along with setting

Re: [ceph-users] Performance Testing of CEPH on ARM MicroServer

2016-02-17 Thread Swapnil Jain
Thanks Christian, > On 17-Feb-2016, at 7:25 AM, Christian Balzer wrote: > > > Hello, > > On Mon, 15 Feb 2016 21:10:33 +0530 Swapnil Jain wrote: > >> For most of you CEPH on ARMv7 might not sound good. This is our setup >> and our FIO testing report. I am not able to

Re: [ceph-users] Adding multiple OSDs to existing cluster

2016-02-17 Thread Ed Rowley
On 17 February 2016 at 14:59, Christian Balzer wrote: > > Hello, > > On Wed, 17 Feb 2016 13:44:17 + Ed Rowley wrote: > >> On 17 February 2016 at 12:04, Christian Balzer wrote: >> > >> > Hello, >> > >> > On Wed, 17 Feb 2016 11:18:40 + Ed Rowley wrote: >> > >>

Re: [ceph-users] Adding multiple OSDs to existing cluster

2016-02-17 Thread Christian Balzer
Hello, On Wed, 17 Feb 2016 13:44:17 + Ed Rowley wrote: > On 17 February 2016 at 12:04, Christian Balzer wrote: > > > > Hello, > > > > On Wed, 17 Feb 2016 11:18:40 + Ed Rowley wrote: > > > >> Hi, > >> > >> We have been running Ceph in production for a few months and

Re: [ceph-users] Cannot change the gateway port (civetweb)

2016-02-17 Thread Jaroslaw Owsiewski
Probably this is the reason: https://www.w3.org/Daemon/User/Installation/PrivilegedPorts.html Regards, -- Jarosław Owsiewski 2016-02-17 15:28 GMT+01:00 Alexandr Porunov : > Hello, > > I have problem with port changes of rados gateway node. > I don't know why but I

Re: [ceph-users] Cannot change the gateway port (civetweb)

2016-02-17 Thread Karol Mroz
On Wed, Feb 17, 2016 at 04:28:38PM +0200, Alexandr Porunov wrote: [...] > set_ports_option: cannot bind to 80: 13 (Permission denied) Hi, The problem is that civetweb can't bind to privileged port 80 because it currently drops permissions _before_ the bind.
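Two common workarounds, as a hedged sketch (the config section name is a placeholder and the radosgw binary path may differ on your distro): either move civetweb to an unprivileged port, e.g.

    [client.rgw.gateway]
    rgw frontends = civetweb port=8080

or grant the binary the capability to bind low ports so port 80 keeps working:

    setcap cap_net_bind_service=+ep /usr/bin/radosgw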

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread Tyler Bishop
I'm using 2x replica on that pool for storing RBD volumes. Our workload is pretty heavy; I'd imagine objects on an EC pool would be light in comparison. Tyler Bishop Chief Technical Officer 513-299-7108 x10 tyler.bis...@beyondhosting.net If you are not the intended recipient of

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread John Hogenmiller
I hadn't come across this ratio before, but now that I've read the PDF you linked and narrowed my search in the mailing list, I think that the 0.5-1GHz per OSD ratio is pretty spot on. The 100MHz per IOP figure is also pretty interesting, and we do indeed use 7200 RPM drives. I'll look up a few

Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-02-17 Thread Mark Nelson
On 02/17/2016 06:36 AM, Christian Balzer wrote: Hello, On Wed, 17 Feb 2016 09:23:11 - Nick Fisk wrote: -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Christian Balzer Sent: 17 February 2016 04:22 To: ceph-users@lists.ceph.com Cc: Piotr

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread John Hogenmiller
Tyler, The E5-2660 v2 is a 10-core, 2.2GHz part, giving you roughly 44GHz, or 0.78GHz per OSD. That seems to fall in line with Nick's "golden rule" of 0.5-1GHz per OSD. Are you doing EC or replication? If EC, what profile? Could you also provide an average of CPU utilization? I'm still

Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-02-17 Thread Christian Balzer
Hello, On Wed, 17 Feb 2016 09:23:11 - Nick Fisk wrote: > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf > > Of Christian Balzer > > Sent: 17 February 2016 04:22 > > To: ceph-users@lists.ceph.com > > Cc: Piotr Wachowicz

Re: [ceph-users] Adding multiple OSDs to existing cluster

2016-02-17 Thread Christian Balzer
Hello, On Wed, 17 Feb 2016 11:18:40 + Ed Rowley wrote: > Hi, > > We have been running Ceph in production for a few months and looking > at our first big expansion. We are going to be adding 8 new OSDs > across 3 hosts to our current cluster of 13 OSD across 5 hosts. We > obviously want to

[ceph-users] Adding multiple OSDs to existing cluster

2016-02-17 Thread Ed Rowley
Hi, We have been running Ceph in production for a few months and are looking at our first big expansion. We are going to be adding 8 new OSDs across 3 hosts to our current cluster of 13 OSDs across 5 hosts. We obviously want to minimize the amount of disruption this is going to cause, but we are unsure

Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-02-17 Thread Christian Balzer
Hello, On Wed, 17 Feb 2016 10:04:11 +0100 Piotr Wachowicz wrote: > Thanks for your reply. > > > > > Let's consider both cases: > > > Journals on SSDs - for writes, the write operation returns right > > > after data lands on the Journal's SSDs, but before it's written to > > > the backing HDD.

Re: [ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-17 Thread Kenneth Waegeman
On 05/02/16 11:43, John Spray wrote: On Fri, Feb 5, 2016 at 9:36 AM, Kenneth Waegeman wrote: On 04/02/16 16:17, Gregory Farnum wrote: On Thu, Feb 4, 2016 at 1:42 AM, Kenneth Waegeman wrote: Hi, Hi, we are running ceph 9.2.0.

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Christian Balzer > Sent: 17 February 2016 02:41 > To: ceph-users > Subject: Re: [ceph-users] Recommendations for building 1PB RadosGW with > Erasure Code > > >

Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-02-17 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Christian Balzer > Sent: 17 February 2016 04:22 > To: ceph-users@lists.ceph.com > Cc: Piotr Wachowicz > Subject: Re: [ceph-users] SSDs for journals vs

Re: [ceph-users] Recommendations for building 1PB RadosGW with Erasure Code

2016-02-17 Thread Nick Fisk
Thanks for posting your experiences John, very interesting read. I think the golden rule of around 1GHz is still a realistic goal to aim for. It looks like you probably have around 16GHz for 60 OSDs, or 0.26GHz per OSD. Do you have any idea how much CPU you think you would need to just be
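To put the golden rule in concrete terms (arithmetic only; the CPU model is just an illustration): 60 OSDs x 1GHz = 60GHz per chassis, which would mean something like dual 14-core 2.2GHz sockets (2 x 14 x 2.2 ≈ 61.6GHz), versus the ~16GHz (0.26GHz/OSD) in the box described above.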

Re: [ceph-users] Hammer OSD crash during deep scrub

2016-02-17 Thread Maksym Krasilnikov
Hello! On Wed, Feb 17, 2016 at 07:38:15AM +, ceph.user wrote: > ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43) > 1: /usr/bin/ceph-osd() [0xbf03dc] > 2: (()+0xf0a0) [0x7f29e4c4d0a0] > 3: (gsignal()+0x35) [0x7f29e35b7165] > 4: (abort()+0x180) [0x7f29e35ba3e0] > 5:

Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-02-17 Thread Piotr Wachowicz
Thanks for your reply. > > Let's consider both cases: > > Journals on SSDs - for writes, the write operation returns right after > > data lands on the Journal's SSDs, but before it's written to the backing > > HDD. So, for writes, SSD journal approach should be comparable to having > > a SSD

[ceph-users] Infernalis sortbitwise flag

2016-02-17 Thread Markus Blank-Burian
Hi, I recently saw that a new osdmap is created with the sortbitwise flag. Can this safely be enabled on an existing cluster, and would there be any advantages in doing so? Markus
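If it does turn out to be safe for your release, the flag is toggled like any other OSD map flag (a sketch only; check the release notes for your exact version before flipping it on a live cluster):

    ceph osd set sortbitwise
    ceph osd unset sortbitwise   # to revert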

Re: [ceph-users] Performance issues related to scrubbing

2016-02-17 Thread Christian Balzer
Hello, On Tue, 16 Feb 2016 10:46:32 -0800 Cullen King wrote: > Thanks for the helpful commentary Christian. Cluster is performing much > better with 50% more spindles (12 to 18 drives), along with setting scrub > sleep to 0.1. Didn't see really any gain from moving from the Samsung 850 > Pro