Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
FYI, this is the JIRA: https://issues.apache.org/jira/browse/CASSANDRA-11348

We can move the discussion to the JIRA if you want.

On Thu, Mar 17, 2016 at 11:46 AM, Dikang Gu  wrote:

> Hi Eric,
>
> Thanks for sharing the information!
>
> We also mainly want to use it for trimming data, either by time or by the
> number of columns in a row. We haven't started the work yet. Would you mind
> sharing some patches? We'd love to try it and test it in our environment.
>
> Thanks.
>
> On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens  wrote:
>
>> We have been working on filtering compaction for a month or so (we call
>> it deleting compaction, but it is implemented as a filtering compaction
>> strategy).  The feature is nearing completion, and we have used it
>> successfully in a limited production capacity against the DSE 4.8 series.
>>
>> Our use case is that our records are written anywhere from a month to
>> several years before they are scheduled for deletion.  Tombstones are
>> too expensive, as we have tables with hundreds of billions of rows.  In
>> addition, traditional TTLs don't work for us because our customers are
>> permitted to change their retention policy: if they increase their
>> retention after a record was written, that already-written record should
>> not be deleted on its original schedule (or vice versa).
>>
>> We can clean up data more cheaply and more quickly with filtered
>> compaction than with tombstones and traditional compaction.  Our
>> implementation is a wrapper compaction strategy for another underlying
>> strategy, so that you can have the characteristics of whichever strategy
>> makes sense in terms of managing your SSTables, while interceding and
>> removing records during compaction (including cleaning up secondary
>> indexes) that otherwise would have survived into the new SSTable.
>>
>> We are hoping to contribute it back to the community, so if you'd be
>> interested in helping test it out, I'd love to hear from you.
>>
>> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson 
>> wrote:
>>
>>> We don't have anything like that, do you have a specific use case in
>>> mind?
>>>
>>> Could you create a JIRA ticket and we can discuss there?
>>>
>>> /Marcus
>>>
>>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:
>>>
>>>> Hello there,
>>>>
>>>> RocksDB has a feature called "Compaction Filter" that allows the
>>>> application to modify or delete a key-value pair during background
>>>> compaction.
>>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>>>
>>>> I'm wondering whether there is a plan to add this to C* as well, or
>>>> whether it would be valuable. Or is there already something similar in C*?
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> Dikang

>>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: Compaction Filter in Cassandra

2016-03-19 Thread Clint Martin
I would definitely be interested in this.

Clint

On Mar 15, 2016 9:36 PM, "Eric Stevens" wrote:

> We have been working on filtering compaction for a month or so (we call
> it deleting compaction, but it is implemented as a filtering compaction
> strategy).  The feature is nearing completion, and we have used it
> successfully in a limited production capacity against the DSE 4.8 series.
>
> Our use case is that our records are written anywhere from a month to
> several years before they are scheduled for deletion.  Tombstones are
> too expensive, as we have tables with hundreds of billions of rows.  In
> addition, traditional TTLs don't work for us because our customers are
> permitted to change their retention policy: if they increase their
> retention after a record was written, that already-written record should
> not be deleted on its original schedule (or vice versa).
>
> We can clean up data more cheaply and more quickly with filtered
> compaction than with tombstones and traditional compaction.  Our
> implementation is a wrapper compaction strategy for another underlying
> strategy, so that you can have the characteristics of whichever strategy
> makes sense in terms of managing your SSTables, while interceding and
> removing records during compaction (including cleaning up secondary
> indexes) that otherwise would have survived into the new SSTable.
>
> We are hoping to contribute it back to the community, so if you'd be
> interested in helping test it out, I'd love to hear from you.
>
> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson  wrote:
>
>> We don't have anything like that, do you have a specific use case in mind?
>>
>> Could you create a JIRA ticket and we can discuss there?
>>
>> /Marcus
>>
>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:
>>
>>> Hello there,
>>>
>>> RocksDB has a feature called "Compaction Filter" that allows the
>>> application to modify or delete a key-value pair during background
>>> compaction.
>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>>
>>> I'm wondering whether there is a plan to add this to C* as well, or
>>> whether it would be valuable. Or is there already something similar in C*?
>>>
>>> Thanks
>>>
>>> --
>>> Dikang
>>>
>>>
>>


Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
Hi Eric,

Thanks for sharing the information!

We also mainly want to use it for trimming data, either by time or by the
number of columns in a row. We haven't started the work yet. Would you mind
sharing some patches? We'd love to try it and test it in our environment.

Thanks.
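
The two trimming rules mentioned above (by time, or by the number of columns
in a row) could be expressed as a per-row filter along the following lines.
This is only an illustrative sketch: the `Row` layout, the `trim_row` name, and
its `max_age` and `max_columns` parameters are assumptions made for the
example and are not part of Cassandra or of any patch discussed in this thread.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory row: column name -> (write timestamp in seconds, value)
Row = Dict[str, Tuple[int, str]]


def trim_row(row: Row, now: int,
             max_age: Optional[int] = None,
             max_columns: Optional[int] = None) -> Row:
    """Return a trimmed copy of the row, newest columns first."""
    # Sort columns by write timestamp, newest first.
    kept = sorted(row.items(), key=lambda item: item[1][0], reverse=True)
    if max_age is not None:
        # Trim by time: drop columns older than max_age seconds.
        kept = [(name, cell) for name, cell in kept if now - cell[0] <= max_age]
    if max_columns is not None:
        # Trim by count: keep only the newest max_columns columns.
        kept = kept[:max_columns]
    return dict(kept)


row = {"c1": (10, "x"), "c2": (50, "y"), "c3": (90, "z")}
by_age = trim_row(row, now=100, max_age=60)        # drops c1 (age 90)
by_count = trim_row(row, now=100, max_columns=1)   # keeps only c3
```

A compaction filter would apply such a function to each row as it is rewritten,
so the trimmed columns never reach the new SSTable.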

On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens  wrote:

> We have been working on filtering compaction for a month or so (we call
> it deleting compaction, but it is implemented as a filtering compaction
> strategy).  The feature is nearing completion, and we have used it
> successfully in a limited production capacity against the DSE 4.8 series.
>
> Our use case is that our records are written anywhere from a month to
> several years before they are scheduled for deletion.  Tombstones are
> too expensive, as we have tables with hundreds of billions of rows.  In
> addition, traditional TTLs don't work for us because our customers are
> permitted to change their retention policy: if they increase their
> retention after a record was written, that already-written record should
> not be deleted on its original schedule (or vice versa).
>
> We can clean up data more cheaply and more quickly with filtered
> compaction than with tombstones and traditional compaction.  Our
> implementation is a wrapper compaction strategy for another underlying
> strategy, so that you can have the characteristics of whichever strategy
> makes sense in terms of managing your SSTables, while interceding and
> removing records during compaction (including cleaning up secondary
> indexes) that otherwise would have survived into the new SSTable.
>
> We are hoping to contribute it back to the community, so if you'd be
> interested in helping test it out, I'd love to hear from you.
>
> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson  wrote:
>
>> We don't have anything like that, do you have a specific use case in mind?
>>
>> Could you create a JIRA ticket and we can discuss there?
>>
>> /Marcus
>>
>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:
>>
>>> Hello there,
>>>
>>> RocksDB has a feature called "Compaction Filter" that allows the
>>> application to modify or delete a key-value pair during background
>>> compaction.
>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>>
>>> I'm wondering whether there is a plan to add this to C* as well, or
>>> whether it would be valuable. Or is there already something similar in C*?
>>>
>>> Thanks
>>>
>>> --
>>> Dikang
>>>
>>>
>>


-- 
Dikang


Re: Compaction Filter in Cassandra

2016-03-15 Thread Eric Stevens
We have been working on filtering compaction for a month or so (we call it
deleting compaction, but it is implemented as a filtering compaction
strategy).  The feature is nearing completion, and we have used it
successfully in a limited production capacity against the DSE 4.8 series.

Our use case is that our records are written anywhere from a month to
several years before they are scheduled for deletion.  Tombstones are
too expensive, as we have tables with hundreds of billions of rows.  In
addition, traditional TTLs don't work for us because our customers are
permitted to change their retention policy: if they increase their
retention after a record was written, that already-written record should
not be deleted on its original schedule (or vice versa).

We can clean up data more cheaply and more quickly with filtered compaction
than with tombstones and traditional compaction.  Our implementation is a
wrapper compaction strategy for another underlying strategy, so that you
can have the characteristics of whichever strategy makes sense in terms of
managing your SSTables, while interceding and removing records during
compaction (including cleaning up secondary indexes) that otherwise would
have survived into the new SSTable.

We are hoping to contribute it back to the community, so if you'd be
interested in helping test it out, I'd love to hear from you.
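
For readers unfamiliar with the idea, the wrapper approach described above can
be sketched roughly as follows. This is a simplified illustration and not the
implementation discussed in this thread: `Record`, `merge_sstables`,
`filtering_compaction`, and the `retention_seconds` table are hypothetical
names, SSTables are modelled as key-sorted Python lists, and the sketch ignores
tombstones, secondary indexes, and older versions of a record living in
SSTables outside the compaction.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, NamedTuple


class Record(NamedTuple):
    key: str
    written_at: int   # write timestamp, seconds
    customer: str
    value: str


def merge_sstables(sstables: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Stand-in for the underlying strategy: merge key-sorted runs and
    keep only the most recently written version of each key."""
    latest = None
    for rec in heapq.merge(*sstables, key=lambda r: r.key):
        if latest is not None and rec.key != latest.key:
            yield latest
            latest = None
        if latest is None or rec.written_at > latest.written_at:
            latest = rec
    if latest is not None:
        yield latest


def filtering_compaction(sstables: Iterable[Iterable[Record]],
                         should_drop: Callable[[Record], bool]) -> List[Record]:
    """Wrapper: run the underlying merge, then drop rejected records so
    they never reach the new SSTable."""
    return [rec for rec in merge_sstables(sstables) if not should_drop(rec)]


# Retention is looked up at compaction time rather than stored with the
# record, so a policy change also applies to already-written records.
retention_seconds = {"acme": 100, "globex": 1000}
now = 1000


def expired(rec: Record) -> bool:
    return now - rec.written_at > retention_seconds[rec.customer]


old = [Record("a", 100, "acme", "v1"), Record("b", 100, "globex", "v1")]
new = [Record("a", 950, "acme", "v2"), Record("c", 800, "acme", "v1")]
survivors = filtering_compaction([old, new], expired)
```

Because the filter consults the current policy on every compaction, shortening
or lengthening a customer's retention changes which records survive the next
run, which is the property a fixed TTL written with the record cannot provide.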

On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson  wrote:

> We don't have anything like that, do you have a specific use case in mind?
>
> Could you create a JIRA ticket and we can discuss there?
>
> /Marcus
>
> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:
>
>> Hello there,
>>
>> RocksDB has a feature called "Compaction Filter" that allows the
>> application to modify or delete a key-value pair during background
>> compaction.
>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>
>> I'm wondering whether there is a plan to add this to C* as well, or
>> whether it would be valuable. Or is there already something similar in C*?
>>
>> Thanks
>>
>> --
>> Dikang
>>
>>
>


Re: Compaction Filter in Cassandra

2016-03-12 Thread Marcus Eriksson
We don't have anything like that, do you have a specific use case in mind?

Could you create a JIRA ticket and we can discuss there?

/Marcus

On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:

> Hello there,
>
> RocksDB has a feature called "Compaction Filter" that allows the
> application to modify or delete a key-value pair during background
> compaction.
> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>
> I'm wondering whether there is a plan to add this to C* as well, or
> whether it would be valuable. Or is there already something similar in C*?
>
> Thanks
>
> --
> Dikang
>
>