Re: [ceph-users] rgw S3 lifecycle cannot keep up

2019-10-03 Thread Christian Pedersen
Thank you Robin.

Looking at the video, it doesn't seem like a fix is anywhere near ready.
Am I correct in concluding that Ceph is not the right tool for my use case?

Cheers,
Christian

On Oct 3 2019, at 6:07 am, Robin H. Johnson  wrote:
> On Wed, Oct 02, 2019 at 01:48:40PM +0200, Christian Pedersen wrote:
> > Hi Martin,
> >
> > Even before adding cold storage on HDD, I ran the cluster on SSD only.
> > That also could not keep up with deleting the files.
> > I am nowhere near I/O exhaustion on the SSDs or even the HDDs.
>
> Please see my presentation from Cephalocon 2019 about RGW S3, where I
> touch on the slowness of lifecycle processing and deletion.
>
> The efficiency of the code is very low: it requires a full scan of
> the bucket index every single day. Depending on the traversal order
> (unordered listing helps), it can take a very long time just to find
> the items that are eligible for deletion, and even once it reaches
> them it is bound by the deletion time, which is also slow: the head
> of each object is deleted synchronously in many cases, while the
> tails are garbage-collected asynchronously.
>
> Fixing this isn't trivial: either you have to scan the entire bucket, or
> you have to maintain a secondary index in insertion-order for EACH
> prefix in a lifecycle policy.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
> E-Mail : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw S3 lifecycle cannot keep up

2019-10-02 Thread Robin H. Johnson
On Wed, Oct 02, 2019 at 01:48:40PM +0200, Christian Pedersen wrote:
> Hi Martin,
> 
> Even before adding cold storage on HDD, I ran the cluster on SSD only. That
> also could not keep up with deleting the files.
> I am nowhere near I/O exhaustion on the SSDs or even the HDDs.
Please see my presentation from Cephalocon 2019 about RGW S3, where I
touch on the slowness of lifecycle processing and deletion.

The efficiency of the code is very low: it requires a full scan of
the bucket index every single day. Depending on the traversal order
(unordered listing helps), it can take a very long time just to find
the items that are eligible for deletion, and even once it reaches
them it is bound by the deletion time, which is also slow: the head
of each object is deleted synchronously in many cases, while the
tails are garbage-collected asynchronously.
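
To put rough numbers on that (the ~4M/day and ~1M/day figures are taken from
your posts; the per-delete latency below is purely an assumed value for
illustration):

    # Back-of-the-envelope only; every figure not from the thread is an assumption.
    objects_per_day = 4_000_000                  # reported ingest rate
    seconds_per_day = 86_400

    needed = objects_per_day / seconds_per_day
    print(f"need ~{needed:.0f} deletions/s sustained just to keep up")      # ~46/s

    observed = 1_000_000 / seconds_per_day       # roughly what the graphs show
    print(f"seeing ~{observed:.0f} deletions/s actually processed")         # ~12/s

    # A lifecycle worker issuing synchronous head deletes one at a time is
    # capped at roughly 1/latency, before any bucket-index listing overhead.
    assumed_delete_latency = 0.02                # 20 ms per delete (assumption)
    print(f"serial worker ceiling: ~{1 / assumed_delete_latency:.0f} deletes/s")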

Fixing this isn't trivial: either you have to scan the entire bucket, or
you have to maintain a secondary index in insertion-order for EACH
prefix in a lifecycle policy.
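
In the meantime, one workaround is to do the expiry from the client side
instead of waiting for LC: list the prefix with RGW's allow-unordered
extension (so the listing isn't bound by ordered index traversal) and issue
bulk DeleteObjects calls, which you can then run in parallel per prefix. A
rough boto3 sketch; endpoint, bucket and prefix are placeholders, and the
event hook that appends the flag pokes at botocore internals, so treat it as
illustrative rather than production code:

    import boto3
    from datetime import datetime, timedelta, timezone

    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:7480")

    # RGW extension, not part of the AWS S3 API: request an unordered listing.
    # The hook appends the flag to the request URL before it is signed and sent.
    def allow_unordered(params, **kwargs):
        sep = "&" if "?" in params["url"] else "?"
        params["url"] += sep + "allow-unordered=true"

    s3.meta.events.register("before-call.s3.ListObjects", allow_unordered)

    cutoff = datetime.now(timezone.utc) - timedelta(days=2)
    batch = []

    for page in s3.get_paginator("list_objects").paginate(
            Bucket="mybucket", Prefix="2019-10-01/"):
        for obj in page.get("Contents", []):
            if obj["LastModified"] < cutoff:
                batch.append({"Key": obj["Key"]})
            if len(batch) == 1000:               # DeleteObjects limit per request
                s3.delete_objects(Bucket="mybucket",
                                  Delete={"Objects": batch, "Quiet": True})
                batch = []

    if batch:
        s3.delete_objects(Bucket="mybucket",
                          Delete={"Objects": batch, "Quiet": True})

If you go that route you would drop the Expiration part of the lifecycle rule
and keep only the transition (or handle both the moves and the deletes
yourself).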

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw S3 lifecycle cannot keep up

2019-10-02 Thread Christian Pedersen
Hi Martin,

Even before adding cold storage on HDD, I ran the cluster on SSD only. That
also could not keep up with deleting the files.
I am nowhere near I/O exhaustion on the SSDs or even the HDDs.

Cheers,
Christian

On Oct 2 2019, at 1:23 pm, Martin Verges  wrote:
> Hello Christian,
>
> the problem is that HDDs are not capable of providing the large number of
> IOPS required for "~4 million small files".
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
>
>
> On Wed, 2 Oct 2019 at 11:56, Christian Pedersen <chrip...@gmail.com> wrote:
> > Hi,
> >
> > Using the S3 gateway I store ~4 million small files in my cluster every
> > day. I have a lifecycle policy set up to move these files to cold storage
> > after a day and delete them after two days.
> > The default storage is SSD-based and the cold storage is HDD.
> > However, the rgw lifecycle process cannot keep up with this: in a 24-hour
> > period, a little less than a million files are moved
> > ( https://imgur.com/a/H52hD2h ). I have tried enabling only the delete part
> > of the lifecycle, but even though it then deletes directly from SSD storage,
> > the result is the same. The screenshots were taken while there were no
> > incoming files to the cluster.
> > I'm running 5 rgw servers, but that doesn't really change anything from when
> > I was running fewer. I've tried adjusting rgw_lc_max_objs, but again there
> > was no change in performance.
> > Any suggestions on how I can tune the lifecycle process?
> > Cheers,
> > Christian
> >

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw S3 lifecycle cannot keep up

2019-10-02 Thread Martin Verges
Hello Christian,

the problem is that HDDs are not capable of providing the large number of
IOPS required for "~4 million small files".

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Wed, 2 Oct 2019 at 11:56, Christian Pedersen <chrip...@gmail.com> wrote:

> Hi,
>
> Using the S3 gateway I store ~4 million small files in my cluster every
> day. I have a lifecycle policy set up to move these files to cold storage
> after a day and delete them after two days.
>
> The default storage is SSD-based and the cold storage is HDD.
>
> However, the rgw lifecycle process cannot keep up with this: in a 24-hour
> period, a little less than a million files are moved (
> https://imgur.com/a/H52hD2h ). I have tried enabling only the delete part
> of the lifecycle, but even though it then deletes directly from SSD storage,
> the result is the same. The screenshots were taken while there were no
> incoming files to the cluster.
>
> I'm running 5 rgw servers, but that doesn't really change anything from
> when I was running fewer. I've tried adjusting rgw_lc_max_objs, but again
> there was no change in performance.
>
> Any suggestions on how I can tune the lifecycle process?
>
> Cheers,
> Christian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rgw S3 lifecycle cannot keep up

2019-10-02 Thread Christian Pedersen
Hi,

Using the S3 gateway I store ~4 million small files in my cluster every
day. I have a lifecycle policy set up to move these files to cold storage
after a day and delete them after two days.

The default storage is SSD-based and the cold storage is HDD.
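
For context, the lifecycle configuration is roughly equivalent to this boto3
sketch (the endpoint, bucket name and the "COLD" storage-class name are
placeholders for my actual setup):

    import boto3

    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:7480")

    s3.put_bucket_lifecycle_configuration(
        Bucket="mybucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "tier-then-expire",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},                # whole bucket
                    "Transitions": [
                        {"Days": 1, "StorageClass": "COLD"}  # HDD placement target
                    ],
                    "Expiration": {"Days": 2},
                }
            ]
        },
    )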

However, the rgw lifecycle process cannot keep up with this: in a 24-hour
period, a little less than a million files are moved (
https://imgur.com/a/H52hD2h ). I have tried enabling only the delete part
of the lifecycle, but even though it then deletes directly from SSD storage,
the result is the same. The screenshots were taken while there were no
incoming files to the cluster.

I'm running 5 rgw servers, but that doesn't really change anything from
when I was running fewer. I've tried adjusting rgw_lc_max_objs, but again
there was no change in performance.
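
For reference, these are the lifecycle-related options I've been experimenting
with in ceph.conf (the section name and the values are just examples, not
recommendations; if I read the docs right, LC normally only runs inside
rgw_lifecycle_work_time, which defaults to 00:00-06:00):

    # section name is a placeholder for the actual rgw instances
    [client.rgw.myhost]
    # number of lifecycle shards; the default is 32
    rgw lc max objs = 128
    # time window in which lifecycle processing is allowed to run
    rgw lifecycle work time = 00:00-23:59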

Any suggestions on how I can tune the lifecycle process?

Cheers,
Christian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com