[
https://issues.apache.org/jira/browse/CASSANDRA-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594635#comment-13594635
]
Sylvain Lebresne commented on CASSANDRA-3929:
---------------------------------------------
I note that if we're going to go the route of "we don't know how to do this
correctly, so we'll make it easy for people to implement their own incorrect,
but good enough for them, solution", then there is a maybe simpler solution
than improving the compaction strategy API.
Typically, we could take inspiration from Dave's idea of changing the indexer.
That is, we could provide a "SSTableWriteFilter" interface for which users could
provide a custom implementation. That interface could look something like:
{noformat}
public interface SSTableWriteFilter {
    public Column filter(ByteBuffer rowKey, Column column);
}
{noformat}
and the way it would work is that in SSTableWriter, each column would first go
through this filter (and then it would be indexed/written). Then a simple
filter to do the row size capping would just count columns for each
rowKey and start returning tombstones once the limit is reached (we may even
allow returning null from filter() to just mean "skip that column").
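To make the row-capping idea concrete, here is a rough, self-contained sketch of
such a filter. It is not taken from any patch: the Column class below is a
hypothetical stand-in for Cassandra's real one, RowCapFilter and asTombstone()
are made-up names, and the sketch assumes that all columns of a row reach the
filter consecutively.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for Cassandra's Column, reduced to what the sketch needs.
final class Column {
    final ByteBuffer name;
    final ByteBuffer value;
    final long timestamp;
    final boolean tombstone;

    Column(ByteBuffer name, ByteBuffer value, long timestamp, boolean tombstone) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
        this.tombstone = tombstone;
    }

    // Hypothetical helper: a deletion marker with the same name and timestamp.
    Column asTombstone() {
        return new Column(name, ByteBuffer.allocate(0), timestamp, true);
    }
}

// The interface as proposed in this comment.
interface SSTableWriteFilter {
    Column filter(ByteBuffer rowKey, Column column);
}

// Passes through the first maxColumns columns of each row and replaces the
// rest with tombstones. Assumes columns of one row arrive consecutively.
final class RowCapFilter implements SSTableWriteFilter {
    private final int maxColumns;
    private ByteBuffer currentKey; // key of the row currently being written
    private int seen;              // columns seen so far for currentKey

    RowCapFilter(int maxColumns) {
        this.maxColumns = maxColumns;
    }

    @Override
    public Column filter(ByteBuffer rowKey, Column column) {
        if (!rowKey.equals(currentKey)) {
            // A new row starts: remember its key and reset the counter.
            currentKey = rowKey.duplicate();
            seen = 0;
        }
        seen++;
        return seen <= maxColumns ? column : column.asTombstone();
    }
}
```

Note that this keeps the first N columns in the order they are written, which
is only the "most recent N" if the row's column ordering happens to match.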
I'm suggesting that because:
# It's not obvious to me how to generalize the compaction strategy API
to make the row capping easy without leaking too much implementation detail.
# I suspect there could be other uses for such a filter. You could have a (custom)
filter that just collects statistics (in fact, we may even be able to rewrite our
current statistics collector to use this interface). Or say you want to remove
all the TTLs (or extend them) from all your data for some reason (maybe your
client code messed up and inserted data with a TTL that was too short). Then you
could write a trivial filter, and call upgradesstables and _voilà_.
Just a suggestion.
> Support row size limits
> -----------------------
>
> Key: CASSANDRA-3929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3929
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Priority: Minor
> Labels: ponies
> Fix For: 2.0
>
> Attachments: 3929_b.txt, 3929_c.txt, 3929_d.txt, 3929_e.txt,
> 3929_f.txt, 3929_g_tests.txt, 3929_g.txt, 3929.txt
>
>
> We currently support expiring columns by time-to-live; we've also had
> requests for keeping the most recent N columns in a row.