Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread kurt greaves
> Also, I was wondering if the key cache maintains a count of how many local accesses a key undergoes. Such information might be very useful for compactions of sstables by splitting data by frequency of use so that those can be preferentially compacted.

No we don't currently have metrics

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread Carl Mueller
Also, I was wondering if the key cache maintains a count of how many local accesses a key undergoes. Such information might be very useful for compactions of sstables by splitting data by frequency of use so that those can be preferentially compacted. On Wed, Feb 21, 2018 at 5:08 PM, Carl Mueller

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread Carl Mueller
Looking through the 2.1.X code I see this: org.apache.cassandra.io.sstable.Component.java. In the enum for component types there is a CUSTOM value, which seems to be a catchall for providing metadata for sstables. Has this been exploited... ever? I noticed in some of the patches for

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread Carl Mueller
jon: I am planning on writing a custom compaction strategy. That's why the question is here; I figured the specifics of memtable -> sstable and Cassandra internals are not a user question. If that still isn't deep enough for the dev thread, I will move all those questions to user. On Wed, Feb 21,

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread Carl Mueller
Thank you all!

On Tue, Feb 20, 2018 at 7:35 PM, kurt greaves wrote:
> Probably a lot of work but it would be incredibly useful for vnodes if flushing was range aware (to be used with RangeAwareCompactionStrategy). The writers are already range aware for JBOD, but

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-20 Thread kurt greaves
Probably a lot of work but it would be incredibly useful for vnodes if flushing was range aware (to be used with RangeAwareCompactionStrategy). The writers are already range aware for JBOD, but that's not terribly valuable ATM. On 20 February 2018 at 21:57, Jeff Jirsa wrote: >

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-20 Thread Jeff Jirsa
There are some arguments to be made that the flush should consider compaction strategy - it would allow a big flush to respect LCS file sizes or break into smaller pieces to try to minimize range overlaps going from L0 into L1, for example. I have no idea how much work would be involved, but may
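
To make the idea of a size-aware flush concrete, here is a minimal Python sketch of cutting one large, key-sorted flush into pieces near a target file size. It is a toy model only, not Cassandra code: `split_flush`, `target_bytes`, and the `(key, size_bytes)` tuples are hypothetical names chosen for illustration.

```python
def split_flush(sorted_partitions, target_bytes):
    """Cut a key-sorted list of (key, size_bytes) into consecutive chunks.

    A new chunk is started whenever adding the next partition would push
    the current chunk past target_bytes. A single partition larger than
    target_bytes still gets a chunk of its own.
    """
    chunks = []
    current = []
    current_size = 0
    for key, size in sorted_partitions:
        # Close the current chunk before it would exceed the target.
        if current and current_size + size > target_bytes:
            chunks.append(current)
            current = []
            current_size = 0
        current.append((key, size))
        current_size += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    memtable = [("a", 60), ("b", 50), ("c", 30), ("d", 200)]
    for chunk in split_flush(memtable, target_bytes=100):
        print(chunk)
```

Because the input is sorted by key, each chunk covers a contiguous key range that does not overlap its neighbours, which is the property that would help limit overlap when moving from L0 into L1.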

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-20 Thread Jon Haddad
The file format is independent of compaction. A compaction strategy only selects sstables to be compacted; that’s its only job. It could have side effects, like generating other files, but any decent compaction strategy will account for the fact that those other files don’t exist. I
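
The point that a strategy only selects sstables can be illustrated with a small Python sketch. It is loosely modeled on size-based bucketing and is not Cassandra's actual algorithm or API; `select_for_compaction`, `min_threshold`, and `bucket_ratio` are hypothetical names and values used only for this example.

```python
def select_for_compaction(sstables, min_threshold=4, bucket_ratio=1.5):
    """Return the names of sstables to compact next, or [] if none qualify.

    sstables is a list of (name, size_bytes). The function only picks
    candidates; it never reads, writes, or reformats any file.
    """
    by_size = sorted(sstables, key=lambda s: s[1])
    bucket = []
    for name, size in by_size:
        # Start a new bucket once a table is much larger than the
        # smallest table in the current bucket.
        if bucket and size > bucket[0][1] * bucket_ratio:
            if len(bucket) >= min_threshold:
                break
            bucket = []
        bucket.append((name, size))
    if len(bucket) >= min_threshold:
        return [name for name, _ in bucket]
    return []


if __name__ == "__main__":
    tables = [("a", 10), ("b", 11), ("c", 12), ("d", 13), ("e", 100)]
    print(select_for_compaction(tables))
```

The function takes metadata in and hands a list of names back; how the selected files are merged and what format the output takes is left to whatever calls it.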