Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
Hello Kris, On Wed, 18 May 2016 19:31:49 -0700 Kris Jurka wrote: > > > On 5/18/2016 7:15 PM, Christian Balzer wrote: > > >> We have hit the following issues: > >> > >> - Filestore merge splits occur at ~40 MObjects with default > >> settings. This is a really, really bad couple of days

Re: [ceph-users] mark out vs crush weight 0

2016-05-18 Thread Christian Balzer
Hello Sage, On Wed, 18 May 2016 17:23:00 -0400 (EDT) Sage Weil wrote: > Currently, after an OSD has been down for 5 minutes, we mark the OSD > "out", which redistributes the data to other OSDs in the cluster. If the > OSD comes back up, it marks the OSD back in (with the same reweight >

[ceph-users] dd testing from within the VM

2016-05-18 Thread Ken Peng
Hi, Our VM has been using ceph as block storage for both the system and data partitions. This is what dd shows, # dd if=/dev/zero of=test.file bs=4k count=1024k 1048576+0 records in 1048576+0 records out 4294967296 bytes (4.3 GB) copied, 16.7969 s, 256 MB/s When dd again with fdatasync
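
As a point of comparison, a minimal sketch of dd invocations that make the sync behaviour explicit (file name and sizes are illustrative); without fdatasync or O_DIRECT the number above mostly measures the guest page cache rather than the RBD backend:

  # buffered, as above: returns once data is in the guest page cache
  dd if=/dev/zero of=test.file bs=4k count=1024k
  # flush to the block device before reporting the rate
  dd if=/dev/zero of=test.file bs=4k count=1024k conv=fdatasync
  # bypass the page cache entirely
  dd if=/dev/zero of=test.file bs=4M count=1024 oflag=direct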

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Kris Jurka
On 5/18/2016 7:15 PM, Christian Balzer wrote: We have hit the following issues: - Filestore merge splits occur at ~40 MObjects with default settings. This is a really, really bad couple of days while things settle. Could you elaborate on that? As in which settings affect this and what
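
For anyone looking for the knobs in question, a sketch of the ceph.conf settings that control filestore directory splitting (values are illustrative, not recommendations); splitting starts at roughly filestore split multiple * filestore merge threshold * 16 objects per subdirectory, so raising them postpones the split but makes it more violent when it finally happens:

  [osd]
  filestore merge threshold = 40
  filestore split multiple = 8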

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
Hello, On Wed, 18 May 2016 08:14:51 -0500 Brian Felton wrote: > At my current gig, we are running five (soon to be six) pure object > storage clusters in production with the following specs: > > - 9 nodes > - 32 cores, 256 GB RAM per node > - 72 x 6 TB SAS spinners per node (648 total per

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
Hello, On Wed, 18 May 2016 12:32:25 -0400 Benjeman Meekhof wrote: > Hi Lionel, > > These are all very good points we should consider, thanks for the > analysis. Just a couple clarifications: > > - NVMe in this system are actually slotted in hot-plug front bays so a > failure can be swapped

Re: [ceph-users] OSD node memory sizing

2016-05-18 Thread Christian Balzer
Hello again, On Wed, 18 May 2016 15:32:50 +0200 Dietmar Rieder wrote: > Hello Christian, > > > Hello, > > > > On Wed, 18 May 2016 13:57:59 +0200 Dietmar Rieder wrote: > > > >> Dear Ceph users, > >> > >> I've a question regarding the memory recommendations for an OSD node. > >> > >> The

[ceph-users] ceph hang on pg list_unfound

2016-05-18 Thread Don Waterloo
I am running 10.2.0-0ubuntu0.16.04.1. I've run into a problem w/ the cephfs metadata pool. Specifically, I have a PG w/ an 'unfound' object. But I can't figure out which one, since when I run: ceph pg 12.94 list_unfound it hangs (as does ceph pg 12.94 query). I know it's in the cephfs metadata pool since
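
For context, a minimal sketch of the commands usually involved when chasing an unfound object (pg 12.94 taken from the post above; mark_unfound_lost is destructive and only a last resort):

  ceph health detail | grep unfound        # which PGs report unfound objects
  ceph pg 12.94 list_unfound               # hangs in this case
  ceph pg 12.94 query                      # also hangs in this case
  ceph pg 12.94 mark_unfound_lost revert   # last resort: revert (or delete) the object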

[ceph-users] CEPH/CEPHFS upgrade questions (9.2.0 ---> 10.2.1)

2016-05-18 Thread Goncalo Borges
Dear All... Our infrastructure is the following: - We use CEPH/CEPHFS (9.2.0) - We have 3 mons and 8 storage servers supporting 8 OSDs each. - We use SSDs for journals (2 SSDs per storage server, each serving 4 OSDs). - We have one main mds and one standby-replay mds. - We are
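
For reference, the usual order for an upgrade like this is monitors first, then OSDs, then MDS. A rough sketch, assuming systemd-managed daemons (unit names are illustrative):

  # on each mon, one at a time, after upgrading the packages
  systemctl restart ceph-mon@<mon-id>
  # then each storage server, one at a time
  ceph osd set noout
  systemctl restart ceph-osd.target
  ceph osd unset noout
  # finally the standby-replay MDS, then the active one
  systemctl restart ceph-mds@<mds-id>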

Re: [ceph-users] mark out vs crush weight 0

2016-05-18 Thread David Turner
>>On 16-05-18 14:23, Sage Weil wrote: >> Currently, after an OSD has been down for 5 minutes, we mark the OSD >> "out", which redistributes the data to other OSDs in the cluster. If the >> OSD comes back up, it marks the OSD back in (with the same reweight value, >> usually 1.0). >> >> The good

Re: [ceph-users] mark out vs crush weight 0

2016-05-18 Thread Henrik Korkuc
On 16-05-18 14:23, Sage Weil wrote: Currently, after an OSD has been down for 5 minutes, we mark the OSD "out", which redistributes the data to other OSDs in the cluster. If the OSD comes back up, it marks the OSD back in (with the same reweight value, usually 1.0). The good thing about marking

[ceph-users] mark out vs crush weight 0

2016-05-18 Thread Sage Weil
Currently, after an OSD has been down for 5 minutes, we mark the OSD "out", which redistributes the data to other OSDs in the cluster. If the OSD comes back up, it marks the OSD back in (with the same reweight value, usually 1.0). The good thing about marking OSDs out is that exactly the
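
For readers following along, a minimal sketch of the two operations being compared (osd id 12 is just an example):

  # mark out: reweight goes to 0, the CRUSH weight is untouched, so marking
  # it back in restores exactly the old mapping
  ceph osd out 12
  # crush weight 0: the CRUSH map itself changes, and data moves as if the
  # disk were gone for good (and moves again if the weight is restored)
  ceph osd crush reweight osd.12 0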

Re: [ceph-users] dense storage nodes

2016-05-18 Thread George Mihaiescu
Hi Blair, We use 36-OSD nodes with journals on HDD, in a cluster that is ~90% object storage. The servers have 128 GB RAM and 40 cores (HT) for the storage nodes with 4 TB SAS drives, and 256 GB and 48 cores for the storage nodes with 6 TB SAS drives. We use 2x10 Gb bonded for the client network,

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Benjeman Meekhof
Hi Lionel, These are all very good points we should consider, thanks for the analysis. Just a couple clarifications: - NVMe in this system are actually slotted in hot-plug front bays so a failure can be swapped online. However I do see your point about this otherwise being a non-optimal

Re: [ceph-users] OSD node memory sizing

2016-05-18 Thread Dietmar Rieder
Hello Christian, > Hello, > > On Wed, 18 May 2016 13:57:59 +0200 Dietmar Rieder wrote: > >> Dear Ceph users, >> >> I've a question regarding the memory recommendations for an OSD node. >> >> The official Ceph hardware recommendations say that an OSD node should >> have 1GB Ram / TB OSD [1] >>

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Brian Felton
At my current gig, we are running five (soon to be six) pure object storage clusters in production with the following specs: - 9 nodes - 32 cores, 256 GB RAM per node - 72 x 6 TB SAS spinners per node (648 total per cluster) - 7+2 erasure coded pool for RGW buckets - ZFS as the filesystem on
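
For readers unfamiliar with the notation, a sketch of how a comparable 7+2 erasure-coded RGW data pool could be created (profile name, pool name and PG counts are illustrative; the ruleset-* option assumes Jewel-era syntax):

  ceph osd erasure-code-profile set ec72 k=7 m=2 ruleset-failure-domain=host
  ceph osd pool create default.rgw.buckets.data 2048 2048 erasure ec72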

[ceph-users] Ceph OSD performance issue

2016-05-18 Thread Davie De Smet
Hi Ceph-users, I am having some trouble finding the bottleneck in my CephFS Infernalis setup. I am running 5 OSD servers which all have 6 OSDs each (so I have 30 OSDs in total). Each OSD is a physical disk (non SSD) and each OSD has its journal stored on the first partition of its own
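
A few commands commonly used to narrow down where the time goes in a setup like this (pool name is illustrative). Note that with the journal on a partition of the same spinner, every write hits the disk twice, which roughly halves the effective write throughput before any other bottleneck:

  ceph osd perf                        # per-OSD commit/apply latencies
  iostat -x 1                          # per-disk utilisation on the OSD hosts
  rados bench -p rbd 30 write -t 16    # raw cluster write throughput and latency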

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
On Wed, 18 May 2016 08:56:51 + Van Leeuwen, Robert wrote: > >We've hit issues (twice now) that seem (have not > >figured out exactly how to confirm this yet) to be related to kernel > >dentry slab cache exhaustion - symptoms were a major slow down in > >performance and slow requests all over

[ceph-users] OSD node memory sizing

2016-05-18 Thread Dietmar Rieder
Dear Ceph users, I've a question regarding the memory recommendations for an OSD node. The official Ceph hardware recommendations say that an OSD node should have 1 GB RAM / TB OSD [1]. The "Reference Architecture" whitepaper from Red Hat & Supermicro says that "typically" 2GB of memory per OSD on
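
As a back-of-the-envelope comparison of the two rules of thumb, assuming a purely hypothetical node with 24 x 4 TB OSDs:

  24 OSDs x 4 TB = 96 TB raw per node
  1 GB per TB of OSD data   ->  ~96 GB RAM
  2 GB per OSD daemon       ->  ~48 GB RAM, plus headroom for page cache,
                                recovery peaks and the OS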

Re: [ceph-users] The RGW creates a new bucket instance then deletes it on every create bucket OP

2016-05-18 Thread fangchen sun
Hello, the following code snippet from rgw_rados.cc shows the problem. RGWRados::create_bucket(...) { ... ... ret = put_linked_bucket_info(info, exclusive, ceph::real_time(), pep_objv, , true); if (ret == -EEXIST) { ... // if the bucket already exists, the new bucket instance

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Van Leeuwen, Robert
>We've hit issues (twice now) that seem (have not >figured out exactly how to confirm this yet) to be related to kernel >dentry slab cache exhaustion - symptoms were a major slow down in >performance and slow requests all over the place on writes, watching >OSD iostat would show a single drive
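
For anyone seeing the same symptom, a minimal sketch of how to watch the dentry/inode slabs on an OSD node (dropping caches is disruptive and only meant as a one-off test):

  slabtop -o -s c | head -20                   # largest slab caches by size
  grep -E 'dentry|xfs_inode' /proc/slabinfo    # raw object counts
  sysctl vm.vfs_cache_pressure                 # reclaim bias, default 100
  echo 2 > /proc/sys/vm/drop_caches            # free dentries and inodes once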

Re: [ceph-users] The RGW creates a new bucket instance then deletes it on every create bucket OP

2016-05-18 Thread Saverio Proto
Hello, I am not sure I understood the problem. Can you post the example steps to reproduce the problem? Also, what version of Ceph RGW are you running? Saverio 2016-05-18 10:24 GMT+02:00 fangchen sun: > Dear ALL, > > I found a problem where RGW creates a new bucket

[ceph-users] The RGW creates a new bucket instance then deletes it on every create bucket OP

2016-05-18 Thread fangchen sun
Dear ALL, I found a problem where RGW creates a new bucket instance and then deletes that instance on every create-bucket OP with the same name. http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html According to the error code "BucketAlreadyOwnedByYou" from the above link, shouldn't
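
A hypothetical set of reproduce steps along the lines Saverio asked for (bucket name and the use of s3cmd are illustrative, not taken from the original report):

  s3cmd mb s3://testbucket    # first create: a new bucket instance is written
  s3cmd mb s3://testbucket    # same owner, same name: expect BucketAlreadyOwnedByYou
  # watch the RGW log and the bucket instance objects during the second call to
  # see whether a fresh instance is created and then deleted again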

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Christian Balzer
Hello, On Wed, 18 May 2016 15:54:59 +1000 Blair Bethwaite wrote: > Hi all, > > What are the densest node configs out there, and what are your > experiences with them and tuning required to make them work? If we can > gather enough info here then I'll volunteer to propose some upstream > docs

Re: [ceph-users] dense storage nodes

2016-05-18 Thread Wido den Hollander
> On 18 May 2016 at 7:54, Blair Bethwaite wrote: > > > Hi all, > > What are the densest node configs out there, and what are your > experiences with them and tuning required to make them work? If we can > gather enough info here then I'll volunteer to propose some

Re: [ceph-users] Ceph Recovery

2016-05-18 Thread Gaurav Bafna
Is this a known issue, and is it expected? When an OSD is marked out, the reweight becomes 0 and the PGs should get remapped, right? I do see recovery after removing from the crush map. Thanks Gaurav On Wed, May 18, 2016 at 12:08 PM, Lazuardi Nasution wrote: > Hi Gaurav, >

Re: [ceph-users] Ceph Recovery

2016-05-18 Thread Lazuardi Nasution
Hi Gaurav, Not only marked out, you also need to remove it from the crush map to make sure the cluster does auto recovery. It seems that the marked-out OSD still appears in the crush map calculation, so it must be removed manually. You will see that there will be a recovery process after you remove the OSD from the crush map.
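
For completeness, a sketch of the usual removal sequence being described (the osd id is a placeholder):

  ceph osd out <id>                # reweight 0, data starts to remap
  ceph osd crush remove osd.<id>   # drop it from the CRUSH map
  ceph auth del osd.<id>           # remove its key
  ceph osd rm <id>                 # remove it from the OSD map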