Re: Adding Data-At-Rest compression support to Ceph

2015-09-28 Thread Igor Fedotov
On 25.09.2015 17:14, Sage Weil wrote: On Fri, 25 Sep 2015, Igor Fedotov wrote: Another thing to note is that we don't have the whole object ready for compression. We just have some new data block written (appended) to the object. And we should either compress that block and save mentioned

Re: Adding Data-At-Rest compression support to Ceph

2015-09-25 Thread Igor Fedotov
Another thing to note is that we don't have the whole object ready for compression. We just have a new data block written (appended) to the object. So we should either compress that block and save the mentioned mapping data, or decompress the existing object data and do a full compression again.
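A minimal sketch of the two options named in this message, not taken from the thread or from Ceph code. It assumes zlib as the compressor, and the class name, method names, and the extent tuple layout are all hypothetical:

```python
import zlib


class AppendOnlyCompressedObject:
    """Hypothetical model of an object that only ever receives appends."""

    def __init__(self):
        self.blob = bytearray()   # concatenated compressed blocks
        # Mapping data, one entry per appended block:
        # (logical_offset, logical_length, compressed_offset, compressed_length)
        self.extents = []
        self.logical_size = 0

    def append_block(self, block):
        """Option 1: compress only the new block and extend the mapping."""
        comp = zlib.compress(block)
        self.extents.append(
            (self.logical_size, len(block), len(self.blob), len(comp))
        )
        self.blob += comp
        self.logical_size += len(block)

    def read_all(self):
        """Decompress every block in order and return the logical data."""
        return b"".join(
            zlib.decompress(bytes(self.blob[off:off + length]))
            for _, _, off, length in self.extents
        )

    def recompress_all(self):
        """Option 2: decompress the existing data and compress it again as a whole."""
        data = self.read_all()
        comp = zlib.compress(data)
        self.blob = bytearray(comp)
        self.extents = [(0, len(data), 0, len(comp))] if data else []
```

Option 1 keeps each append cheap but grows the mapping by one entry per write. Option 2 collapses the mapping to a single entry at the cost of reading and rewriting the whole object.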

Re: Adding Data-At-Rest compression support to Ceph

2015-09-25 Thread Igor Fedotov
On 24.09.2015 21:10, Gregory Farnum wrote: On Thu, Sep 24, 2015 at 8:13 AM, Igor Fedotov wrote: On 23.09.2015 21:03, Gregory Farnum wrote: Okay, that's acceptable, but that metadata then gets pretty large. You would need to store an offset, for each chunk in the PG,

Re: Adding Data-At-Rest compression support to Ceph

2015-09-25 Thread Sage Weil
On Fri, 25 Sep 2015, Igor Fedotov wrote:
> Another thing to note is that we don't have the whole object ready for compression. We just have some new data block written (appended) to the object.
> And we should either compress that block and save mentioned mapping data or decompress the existing

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Igor Fedotov
Samuel, I completely agree about the need to have a blueprint before the implementation. But I think we should first settle on which approach to use (when and how to perform the compression). I'll summarize the existing suggestions and their pros and cons shortly. Thus we'll be able to discuss them more

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Sage Weil
On Thu, 24 Sep 2015, Igor Fedotov wrote:
> On 23.09.2015 21:03, Gregory Farnum wrote:
> > On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote:
> > > > > The idea of making the primary responsible for object compression really concerns me. It means for instance

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Igor Fedotov
As for me, this is the first time I've heard of it. But if we introduce pluggable compression back-ends, that would be pretty easy to try. Thanks, Igor. On 24.09.2015 18:41, HEWLETT, Paul (Paul) wrote: Out of curiosity have you considered the Google compression algos:

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Igor Fedotov
On 23.09.2015 21:03, Gregory Farnum wrote: On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote: The idea of making the primary responsible for object compression really concerns me. It means for instance that a single random access will likely require access to multiple

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread HEWLETT, Paul (Paul)
Out of curiosity have you considered the Google compression algos: http://google-opensource.blogspot.co.uk/2015/09/introducing-brotli-new-compression.html Paul On 24/09/2015 16:34, "ceph-devel-ow...@vger.kernel.org on behalf of Sage Weil"

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Igor Fedotov
On 24.09.2015 18:34, Sage Weil wrote: I was also assuming each stripe unit would be independently compressed, but I didn't think about the efficiency. This approach implies that you'd want a relatively large stripe size (100s of KB or more). Hmm, a quick google search suggests the zlib

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Igor Fedotov
On 24.09.2015 19:03, Sage Weil wrote: On Thu, 24 Sep 2015, Igor Fedotov wrote: There is probably no need for strict alignment with the stripe size. We can dynamically use the block sizes that the client provides on write. If some client writes in stripes, then we compress that block. If others use

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Igor Fedotov
On 24.09.2015 19:03, Sage Weil wrote: On Thu, 24 Sep 2015, Igor Fedotov wrote: Dynamic stripe sizes are possible but it's a significant change from the way the EC pool currently works. I would make that a separate project (as it's useful in its own right) and not complicate the compression

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Gregory Farnum
On Thu, Sep 24, 2015 at 8:13 AM, Igor Fedotov wrote:
> On 23.09.2015 21:03, Gregory Farnum wrote:
>> On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote:
> > > The idea of making the primary responsible for object compression really

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Robert LeBlanc
I'm probably missing something, but since we are talking about data at rest, can't we just have the OSD compress the object as it goes to disk? Instead of rbd\udata.1ba49c10d9b00c.6859__head_2AD1002B__11 it would be

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Samuel Just
The catch is that currently accessing 4k in the middle of a 4MB object does not require reading the whole object, so you'd need some kind of logical offset -> compressed offset mapping. -Sam On Thu, Sep 24, 2015 at 10:36 AM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED
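A minimal sketch of what such a logical offset -> compressed offset mapping could look like. This is not from the thread or from Ceph; it assumes zlib, a fixed logical chunk size, and hypothetical names (`CHUNK_SIZE`, `compress_object`, `read_range`):

```python
import bisect
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical logical chunk size, chosen for illustration


def compress_object(data, chunk_size=CHUNK_SIZE):
    """Compress each chunk independently and record where it lands.

    Returns (blob, index), where index holds one tuple per chunk:
    (logical_offset, compressed_offset, compressed_length).
    """
    blob = bytearray()
    index = []
    for logical_off in range(0, len(data), chunk_size):
        comp = zlib.compress(data[logical_off:logical_off + chunk_size])
        index.append((logical_off, len(blob), len(comp)))
        blob += comp
    return bytes(blob), index


def read_range(blob, index, offset, length):
    """Read `length` logical bytes at `offset`, touching only overlapping chunks."""
    if not index:
        return b""
    starts = [entry[0] for entry in index]
    # Last chunk whose logical start is <= offset.
    i = max(bisect.bisect_right(starts, offset) - 1, 0)
    end = offset + length
    out = bytearray()
    while i < len(index) and index[i][0] < end:
        logical_off, comp_off, comp_len = index[i]
        chunk = zlib.decompress(blob[comp_off:comp_off + comp_len])
        lo = max(offset, logical_off) - logical_off
        hi = min(end, logical_off + len(chunk)) - logical_off
        out += chunk[lo:hi]
        i += 1
    return bytes(out)
```

With this layout, a 4 KB read in the middle of a 4 MB object decompresses one or two 64 KB chunks rather than the whole object. The price is one index entry per chunk that has to be stored somewhere alongside the data.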

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Igor Fedotov wrote:
> Hi Sage,
> thanks a lot for your feedback.
>
> Regarding issues with offset mapping and stripe size exposure.
> What about the idea to apply compression in two-tier (cache+backing storage) model only?

I'm not sure we win anything by making it a

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Igor Fedotov
Hi Sage, thanks a lot for your feedback. Regarding the issues with offset mapping and stripe size exposure: what about the idea to apply compression in the two-tier (cache + backing storage) model only? I doubt the single-tier one is widely used for EC pools since there is no random write support in such

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Igor Fedotov
On 23.09.2015 17:05, Gregory Farnum wrote: On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote: On Wed, 23 Sep 2015, Igor Fedotov wrote: Hi Sage, thanks a lot for your feedback. Regarding issues with offset mapping and stripe size exposure. What about the idea to apply

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote:
> On Wed, 23 Sep 2015, Igor Fedotov wrote:
>> Hi Sage,
>> thanks a lot for your feedback.
>>
>> Regarding issues with offset mapping and stripe size exposure.
>> What about the idea to apply compression in two-tier

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Igor Fedotov
Sage, so you are saying that radosgw tends to use EC pools directly without caching, right? I agree that we need offset mapping anyway. And the difference between cache writes and direct writes is mainly in block size granularity: 8 MB vs. 4 KB. In the latter case we have higher overhead

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Igor Fedotov wrote:
> Sage,
>
> so you are saying that radosgw tends to use EC pools directly without caching, right?
>
> I agree that we need offset mapping anyway.
>
> And the difference between cache writes and direct writes is mainly in block size granularity: 8 MB

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Samuel Just
I think before moving forward with any sort of implementation, the design would need to be pretty much completely mapped out -- particularly how the offset mapping will be handled and stored. The right thing to do would be to produce a blueprint and submit it to the list. I also would vastly

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 8:26 AM, Igor Fedotov wrote:
> On 23.09.2015 17:05, Gregory Farnum wrote:
>> On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote:
>>> On Wed, 23 Sep 2015, Igor Fedotov wrote:
>>>> Hi Sage, thanks a lot for your

Re: Adding Data-At-Rest compression support to Ceph

2015-09-22 Thread Sage Weil
On Tue, 22 Sep 2015, Igor Fedotov wrote:
> Hi guys,
>
> I can find some talks about adding compression support to Ceph. Let me share some thoughts and proposals on that too.
>
> First of all I'd like to consider several major implementation options separately. IMHO this makes sense since

Adding Data-At-Rest compression support to Ceph

2015-09-22 Thread Igor Fedotov
Hi guys, I can find some talks about adding compression support to Ceph. Let me share some thoughts and proposals on that too. First of all I’d like to consider several major implementation options separately. IMHO this makes sense since they have different applicability, value and