Re: [ceph-users] Living with huge bucket sizes

2017-06-13 Thread Eric Choi
Hello all,

I work on the same team as Tyler, and I can provide more info here.

The cluster is indeed an RGW cluster with many small (100 KB) objects,
similar to your use case, Bryan.  But we have this particular bucket set up
as a blind bucket ("index_type": 1), as we wanted to avoid the index
bottleneck to begin with (we didn't need the listing feature).  Would bucket
index sharding still be a problem for blind buckets?
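
In case it helps, this is roughly how the indexless placement was configured
(a sketch from memory -- the placement target name and pool names below are
just examples, not our actual config):

$ radosgw-admin zone get > zone.json
# edit zone.json: under placement_pools, set "index_type": 1 on the
# placement target this bucket was created in, e.g.:
#   { "key": "indexless-placement",
#     "val": { "index_pool": ".rgw.buckets.index",
#              "data_pool": ".rgw.buckets",
#              "data_extra_pool": ".rgw.buckets.extra",
#              "index_type": 1 } }
$ radosgw-admin zone set --infile zone.json
# then restart the radosgw daemons so the change takes effect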

Mark, would setting the log level to 20 give any insight into what the
threads are doing?
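
For what it's worth, I was planning to raise it along these lines (the config
section name and admin socket path are just examples for our setup):

# in ceph.conf on the gateway host, then restart radosgw:
[client.radosgw.gateway]
    debug rgw = 20
    debug ms = 1

# or at runtime through the admin socket (path is deployment-specific):
$ ceph daemon /var/run/ceph/ceph-client.radosgw.gateway.asok config set debug_rgw 20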


Eric


Re: [ceph-users] Living with huge bucket sizes

2017-06-10 Thread Cullen King
Bryan, I just went through this myself, also on Hammer, as offline bucket
index resharding was backported. I had three buckets with > 10 million
objects each, one of them with 30 million. I was experiencing the typical
blocked-request issue during scrubs whenever the placement group containing
the bucket index got hit.

I solved it in two steps. First, I added an SSD-only pool and moved the
bucket index to this new SSD pool. This is an online operation.
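
Roughly, the commands were along these lines (a sketch from memory -- the
CRUSH root "ssd" has to exist in your CRUSH map already, the rule name is
arbitrary, and the index pool name is the Hammer default):

# create a replicated CRUSH rule that selects only the SSD hosts/root
$ ceph osd crush rule create-simple ssd-rule ssd host

# point the bucket index pool at the new rule; the data migrates online
$ ceph osd pool set .rgw.buckets.index crush_ruleset <rule_id>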

After that was complete I scheduled some downtime (we run a highly
available consumer facing website), and made a plan to reshard the bucket
indexes. I did some tests with buckets containing 100,000 test objects and
found the performance to be satisfactory. Once my maintenance window hit
and I stopped all access to RGW, I was able to reshard all my bucket
indexes in 20 minutes.
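
The window itself was basically: stop the gateways, reshard, restart.
Something like this (the service name depends on your distro, the bucket
names are made up, and the reshard syntax is the one from the release notes
quoted below):

# stop all radosgw instances so no writes hit the buckets being resharded
$ sudo service radosgw stop      # or: systemctl stop ceph-radosgw.target

$ radosgw-admin bucket reshard --bucket=big-bucket-1 --num_shards=100
$ radosgw-admin bucket reshard --bucket=big-bucket-2 --num_shards=100

# bring the gateways back up
$ sudo service radosgw start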

I can't remember exact numbers, but I believe I did a 20+ million object
bucket in about 5 minutes. It was extremely fast, but again, I had moved my
bucket indexes to an SSD-backed pool of fast enterprise SSDs (three hosts,
one SSD per host, Samsung 3.84 TB PM863a for what it's worth).

Once I finished this, all my Ceph performance issues disappeared. I'll
slowly upgrade my cluster with the end goal of moving to the more efficient
BlueStore, but I no longer feel rushed.

Last detail: I used 100 shards per bucket, which seems to be a good
compromise (for my 30 million object bucket that works out to roughly
300,000 index entries per shard).


Cullen


> Date: Fri, 9 Jun 2017 14:58:41 -0700
> From: Yehuda Sadeh-Weinraub <yeh...@redhat.com>
> To: Dan van der Ster <d...@vanderster.com>
> Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] Living with huge bucket sizes
>
> On Fri, Jun 9, 2017 at 2:21 AM, Dan van der Ster <d...@vanderster.com>
> wrote:
> > Hi Bryan,
> >
> > On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell <bstillw...@godaddy.com>
> > wrote:
> >> This has come up quite a few times before, but since I was only working
> >> with RBD before I didn't pay too close attention to the conversation.  I'm
> >> looking for the best way to handle existing clusters that have buckets
> >> with a large number of objects (>20 million) in them.  The cluster I'm
> >> doing tests on is currently running hammer (0.94.10), so if things got
> >> better in jewel I would love to hear about it!
> >> ...
> >> Has anyone found a good solution for this for existing large buckets?  I
> >> know sharding is the solution going forward, but afaik it can't be done
> >> on existing buckets yet (although the dynamic resharding work mentioned
> >> on today's performance call sounds promising).
> >
> > I haven't tried it myself, but 0.94.10 should have the (offline)
> > resharding feature. From the release notes:
> >
>
> Right. We did add automatic dynamic resharding to Luminous, but
> offline resharding should be enough.
>
>
> >> * In RADOS Gateway, it is now possible to reshard an existing bucket's
> >> index using an off-line tool.
> >>
> >> Usage:
> >>
> >> $ radosgw-admin bucket reshard --bucket=<bucket_name> --num_shards=<num_shards>
> >>
> >> This will create a new linked bucket instance that points to the newly
> >> created index objects. The old bucket instance still exists and currently
> >> it's up to the user to manually remove the old bucket index objects. (Note
> >> that bucket resharding currently requires that all IO (especially writes)
> >> to the specific bucket is quiesced.)
>
> Once resharding is done, use the radosgw-admin bi purge command to
> remove the old bucket indexes.
>
> Yehuda
>
> >
> > -- Dan


Re: [ceph-users] Living with huge bucket sizes

2017-06-09 Thread Yehuda Sadeh-Weinraub
On Fri, Jun 9, 2017 at 2:21 AM, Dan van der Ster  wrote:
> Hi Bryan,
>
> On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell  
> wrote:
>> This has come up quite a few times before, but since I was only working with
>> RBD before I didn't pay too close attention to the conversation.  I'm looking
>> for the best way to handle existing clusters that have buckets with a large
>> number of objects (>20 million) in them.  The cluster I'm doing tests on is
>> currently running hammer (0.94.10), so if things got better in jewel I would
>> love to hear about it!
>> ...
>> Has anyone found a good solution for this for existing large buckets?  I
>> know sharding is the solution going forward, but afaik it can't be done
>> on existing buckets yet (although the dynamic resharding work mentioned
>> on today's performance call sounds promising).
>
> I haven't tried it myself, but 0.94.10 should have the (offline)
> resharding feature. From the release notes:
>

Right. We did add automatic dynamic resharding to Luminous, but
offline resharding should be enough.
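
(Roughly, dynamic resharding in Luminous is driven by a couple of rgw
options; from memory, the relevant defaults look like this:)

# ceph.conf on the rgw hosts (Luminous defaults, from memory)
rgw dynamic resharding = true
rgw max objs per shard = 100000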


>> * In RADOS Gateway, it is now possible to reshard an existing bucket's index
>> using an off-line tool.
>>
>> Usage:
>>
>> $ radosgw-admin bucket reshard --bucket=<bucket_name> --num_shards=<num_shards>
>>
>> This will create a new linked bucket instance that points to the newly created
>> index objects. The old bucket instance still exists and currently it's up to
>> the user to manually remove the old bucket index objects. (Note that bucket
>> resharding currently requires that all IO (especially writes) to the specific
>> bucket is quiesced.)

Once resharding is done, use the radosgw-admin bi purge command to
remove the old bucket indexes.
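
For example, something like this (the bucket name is illustrative; record
the old instance id before resharding, e.g. from bucket stats):

# before resharding: note the current bucket instance id
$ radosgw-admin bucket stats --bucket=my-bucket | grep '"id"'

# after resharding: purge the old index objects
$ radosgw-admin bi purge --bucket=my-bucket --bucket-id=<old_bucket_instance_id>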

Yehuda

>
> -- Dan


Re: [ceph-users] Living with huge bucket sizes

2017-06-09 Thread Dan van der Ster
Hi Bryan,

On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell  wrote:
> This has come up quite a few times before, but since I was only working with
> RBD before I didn't pay too close attention to the conversation.  I'm looking
> for the best way to handle existing clusters that have buckets with a large
> number of objects (>20 million) in them.  The cluster I'm doing tests on is
> currently running hammer (0.94.10), so if things got better in jewel I would
> love to hear about it!
> ...
> Has anyone found a good solution for this for existing large buckets?  I
> know sharding is the solution going forward, but afaik it can't be done
> on existing buckets yet (although the dynamic resharding work mentioned
> on today's performance call sounds promising).

I haven't tried it myself, but 0.94.10 should have the (offline)
resharding feature. From the release notes:

> * In RADOS Gateway, it is now possible to reshard an existing bucket's index
> using an off-line tool.
>
> Usage:
>
> $ radosgw-admin bucket reshard --bucket=<bucket_name> --num_shards=<num_shards>
>
> This will create a new linked bucket instance that points to the newly created
> index objects. The old bucket instance still exists and currently it's up to
> the user to manually remove the old bucket index objects. (Note that bucket
> resharding currently requires that all IO (especially writes) to the specific
> bucket is quiesced.)
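
It's probably also worth checking the per-bucket object counts first when
picking a shard count; something like this should show them (field names
from memory):

$ radosgw-admin bucket stats --bucket=<bucket_name>
# look at "usage" -> "rgw.main" -> "num_objects"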

-- Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com