Re: scheduled work compaction strategy

2018-02-17 Thread Carl Mueller
I'm probably going to take a shot at doing it, basing it off of TWCS, but I
don't know the fundamentals of compaction strategies or that part of the
codebase very well. Fundamentally, as I understand it, memtables are flushed
out to sstables, and those sstables are then reprocessed by background
compaction threads; the bloom filters are built from the sstables, though that
may just be a Cassandra service/method call.
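
For my own reference, here is a rough skeleton of what a strategy class has to
provide, assuming the AbstractCompactionStrategy API roughly as it stands in
the 3.x line; the class name, the option handling, and the elided bucketing
logic are all mine, and the signatures are approximate:

    import java.util.Collection;
    import java.util.Map;

    import org.apache.cassandra.db.ColumnFamilyStore;
    import org.apache.cassandra.db.compaction.AbstractCompactionStrategy;
    import org.apache.cassandra.db.compaction.AbstractCompactionTask;
    import org.apache.cassandra.io.sstable.format.SSTableReader;

    // Hypothetical skeleton only: the real work is tracking which sstables belong
    // to which scheduled-time bucket and handing them back as background tasks.
    public class ScheduledWorkCompactionStrategy extends AbstractCompactionStrategy
    {
        public ScheduledWorkCompactionStrategy(ColumnFamilyStore cfs, Map<String, String> options)
        {
            super(cfs, options);
            // parse window/bucket sizing options from `options` here
        }

        // Called repeatedly by the compaction manager: return a task that merges
        // sstables from the same scheduled-time bucket, or null if there is nothing to do.
        @Override
        public AbstractCompactionTask getNextBackgroundTask(int gcBefore)
        {
            return null; // bucket selection logic goes here
        }

        @Override
        public Collection<AbstractCompactionTask> getMaximalTask(int gcBefore, boolean splitOutput)
        {
            return null; // "compact everything" path
        }

        @Override
        public AbstractCompactionTask getUserDefinedTask(Collection<SSTableReader> sstables, int gcBefore)
        {
            return null; // user-specified sstables (forced compaction)
        }

        @Override
        public int getEstimatedRemainingTasks() { return 0; }

        @Override
        public long getMaxSSTableBytes() { return Long.MAX_VALUE; }

        // Flushed/streamed sstables arrive here; assign them to buckets.
        @Override
        public void addSSTable(SSTableReader added) { }

        @Override
        public void removeSSTable(SSTableReader sstable) { }
    }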

For example, what are "aggressive tombstone subproperties"? Is that metadata
attached to sstables describing the tombstones they contain?


Re: scheduled work compaction strategy

2018-02-16 Thread Jeff Jirsa
There’s a company using TWCS in this config - I’m not going to out them, but I 
think they do it (or used to) with aggressive tombstone sub properties. They 
may have since extended/enhanced it somewhat.
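
For reference, "tombstone sub properties" here presumably means the standard
per-table compaction sub-options that govern single-sstable tombstone
compactions, not metadata stored in the sstables themselves. An "aggressive"
configuration would look roughly like the illustrative values below, shown as
the options map a compaction strategy receives (they are normally set in the
table's compaction settings alongside the strategy class); the values are
examples, not a recommendation:

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative values only (not a recommendation).
    public class AggressiveTombstoneOptions
    {
        public static Map<String, String> example()
        {
            Map<String, String> options = new HashMap<>();
            // run a single-sstable tombstone compaction once this estimated fraction
            // of the sstable is droppable tombstones (default 0.2)
            options.put("tombstone_threshold", "0.05");
            // minimum seconds between tombstone compactions of the same sstable
            // (default 86400, i.e. one day)
            options.put("tombstone_compaction_interval", "3600");
            // attempt the compaction even when the sstable overlaps others, rather
            // than first proving the tombstones are droppable (default false)
            options.put("unchecked_tombstone_compaction", "true");
            return options;
        }

        public static void main(String[] args)
        {
            example().forEach((k, v) -> System.out.println(k + " = " + v));
        }
    }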

-- 
Jeff Jirsa


Re: scheduled work compaction strategy

2018-02-16 Thread Carl Mueller
An even MORE complicated version could address the case where the TTLs are at
the column key rather than the row key. That would split a row's data across
sstables, in essence the opposite of what most compaction strategies try to
do, which is to eventually centralize the data for a rowkey in one sstable.
This strategy assumes the TTLs would clean up those row fragments quickly
enough that spreading the data across many, many sstables wouldn't pollute the
bloom filters too much.
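
To make that concrete, here is a toy sketch, completely outside the Cassandra
codebase: route each cell to a bucket by its own expiry, so one partition key
deliberately fragments across several bucket outputs. The Cell shape and the
12-hour bucket width are invented for illustration.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Toy model, nothing Cassandra-specific: route each cell to a time bucket by
    // its own expiry, so a single partition deliberately fragments across buckets
    // (the opposite of normal compaction, which tries to gather a partition into
    // one sstable).
    public class ColumnLevelBucketing
    {
        static final long BUCKET_SECONDS = 12 * 3600; // 12-hour buckets, invented width

        record Cell(String partitionKey, String clustering, long expiresAtEpochSecond) {}

        static long bucketOf(Cell c)
        {
            return c.expiresAtEpochSecond() / BUCKET_SECONDS;
        }

        // Each group would become (part of) a separate sstable in that bucket's window.
        static Map<Long, List<Cell>> routeByExpiry(List<Cell> cells)
        {
            Map<Long, List<Cell>> buckets = new HashMap<>();
            for (Cell c : cells)
                buckets.computeIfAbsent(bucketOf(c), b -> new ArrayList<>()).add(c);
            return buckets;
        }

        public static void main(String[] args)
        {
            long now = System.currentTimeMillis() / 1000;
            List<Cell> cells = List.of(
                new Cell("device-1", "task-a", now + 3600),          // near term
                new Cell("device-1", "task-b", now + 30L * 86400));  // same partition, a month out
            // the same partition key lands in two different buckets:
            routeByExpiry(cells).forEach((bucket, group) -> System.out.println(bucket + " -> " + group));
        }
    }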


Re: scheduled work compaction strategy

2018-02-16 Thread Carl Mueller
Oh, and as a further refinement outside of our use case:

If we could group/organize the sstables by the time value in the rowkey or by
the inherent TTL value, the naive version would be evenly distributed buckets
stretching into the future.

But many/most data patterns like this have "busy" data in the near term, while
far-out scheduled stuff is more sparse. In our case, 50% of the data is in the
first 12 hours, 50% of the remainder in the next day or two, 50% of what's
left in the next week, and so on.

So we could have a general "long term" bucket that takes data far in the
future. But here's the thing: if we could regularly reprocess the "long term"
sstables into two sets, the stuff that is still "long term" and sstables for
the "near term", that could solve many general cases. The "long term" bucket
could even be STCS by default, with the near term treated as a different
"level" as it comes into play.

Of course, all of this relies on being able to look at the time data in the
rowkey or the TTL associated with the row.
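
For illustration, here is roughly the bucket-boundary function that
distribution suggests, with invented window widths (12 hours, two days, a
week, a month, then a catch-all "long term" bucket); re-splitting the
long-term bucket would amount to re-running this over its rows as "now"
advances. Nothing here touches Cassandra internals.

    import java.time.Duration;

    // Invented, exponentially widening windows for the "busy near term, sparse far
    // future" shape: everything past the last boundary falls into a single
    // catch-all "long term" bucket, which gets re-split as "now" advances.
    public class ScheduledWorkBuckets
    {
        static final long[] BOUNDARIES_SECONDS = {
            Duration.ofHours(12).getSeconds(),
            Duration.ofDays(2).getSeconds(),
            Duration.ofDays(7).getSeconds(),
            Duration.ofDays(30).getSeconds(),
        };

        // Returns 0..BOUNDARIES_SECONDS.length; the last value is the long-term catch-all.
        static int bucketFor(long scheduledEpochSecond, long nowEpochSecond)
        {
            long delta = Math.max(0, scheduledEpochSecond - nowEpochSecond);
            for (int i = 0; i < BOUNDARIES_SECONDS.length; i++)
                if (delta < BOUNDARIES_SECONDS[i])
                    return i;
            return BOUNDARIES_SECONDS.length; // "long term"
        }

        public static void main(String[] args)
        {
            long now = System.currentTimeMillis() / 1000;
            System.out.println(bucketFor(now + 3600, now));          // 0: first 12 hours
            System.out.println(bucketFor(now + 3 * 86400L, now));    // 2: within a week
            System.out.println(bucketFor(now + 400 * 86400L, now));  // 4: long-term catch-all
        }
    }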


scheduled work compaction strategy

2018-02-16 Thread Carl Mueller
We have a scheduler app here at smartthings, where we track tasks scheduled
for execution at per-second granularity.

These are all TTL'd so they are destroyed once the second for which the event
was registered has passed.

If the scheduling window were sufficiently small, say one day, we could
probably use a time window compaction strategy for this. But the window is one
to two years' worth of ad hoc event registration, per the contract.

Thus, because events are registered at different times and TTL at different
times, the data intermingles, and the sstables are not written with data that
TTLs in the same rough time period. If they were, compaction would be a
relatively easy process, since entire sstables would tombstone at once.

We could roughly approximate this with sharded tables for the time periods,
rotating the shards for duty and truncating them as they are recycled.
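
For comparison, that sharded-table workaround looks something like this at the
application layer; the table-name scheme and shard count are invented, and the
shard count has to comfortably exceed the one-to-two-year scheduling window:

    import java.time.Instant;
    import java.time.YearMonth;
    import java.time.ZoneOffset;
    import java.time.temporal.ChronoField;

    // Invented scheme: a fixed pool of shard tables, one month of scheduled time
    // each, reused round-robin. A background job truncates a shard once its month
    // is entirely in the past, freeing it for a future month.
    public class ScheduleShards
    {
        // must comfortably exceed the 1-2 year scheduling window (invented value)
        static final int NUM_SHARDS = 30;

        static int shardIndexFor(Instant scheduledAt)
        {
            long monthsSinceEpoch =
                YearMonth.from(scheduledAt.atZone(ZoneOffset.UTC)).getLong(ChronoField.PROLEPTIC_MONTH);
            return (int) (monthsSinceEpoch % NUM_SHARDS);
        }

        static String shardTableFor(Instant scheduledAt)
        {
            return "scheduled_events_" + shardIndexFor(scheduledAt);
        }

        public static void main(String[] args)
        {
            // events a year apart land in different shards
            System.out.println(shardTableFor(Instant.parse("2018-06-15T00:00:00Z")));
            System.out.println(shardTableFor(Instant.parse("2019-06-15T00:00:00Z")));
        }
    }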

But a more elegant way would be a custom compaction strategy that "windows"
the data into time-clustered sstables, which could then be compacted with
other similarly time-bucketed sstables.

This would require visibility into the rowkey when it comes time to convert
the memtable data to sstables. Is that even possible within a compaction
strategy? We would make it a requirement that the time-based data be in the
row key, if it is a composite row key.
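
On the visibility question: I'm not sure whether flush exposes the rowkey to a
compaction strategy in a supported way; at the selection level a strategy
mostly sees whole sstables plus their stats metadata (max timestamp, max local
deletion time), which is roughly what TWCS and the fully-expired-sstable check
use for bucketing today. So the hedged sketch below just shows the key
convention the requirement implies, with an invented composite key layout of
"<id>:<scheduledEpochSecond>" and a TWCS-style window alignment:

    import java.util.concurrent.TimeUnit;

    // Invented key convention: "<id>:<scheduledEpochSecond>". The windowing code
    // only needs the time component, aligned down to a window boundary the same
    // way TWCS aligns sstables by their max write time.
    public class RowKeyWindowing
    {
        static final long WINDOW_SECONDS = TimeUnit.HOURS.toSeconds(12); // invented width

        static long scheduledSecondOf(String compositeRowKey)
        {
            String[] parts = compositeRowKey.split(":");
            return Long.parseLong(parts[1]);
        }

        // Lower bound (epoch seconds) of the window this key's data belongs to.
        static long windowLowerBound(String compositeRowKey)
        {
            long t = scheduledSecondOf(compositeRowKey);
            return t - (t % WINDOW_SECONDS);
        }

        public static void main(String[] args)
        {
            // prints 1519948800 (aligned down to a 12-hour boundary)
            System.out.println(windowLowerBound("device-42:1519950000"));
        }
    }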