[
https://issues.apache.org/jira/browse/CASSANDRA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Kapilevich updated CASSANDRA-5409:
---------------------------------------
Description:
I'm questioning the design decision to use sstables for storing Hinted Handoffs.
sstables are optimized for reads. Bloom filters, indexes, and the like are
not necessary for Hinted Handoffs.
After turning off Hinted Handoffs, I'm still seeing Hinted Handoffs being
compacted a week later. The fact that they are compacted in the first place
doesn't seem right. The whole purpose of compaction is to optimize sstables for
reads, which doesn't apply here.
In our case, this is exacerbated by using Leveled Compaction. The overhead of
compactions is significantly larger with LCS. When compactions begin to back up
under heavy write load, Hinted Handoffs contribute to that.
Another thing that makes it worse (I think) is that Hinted Handoffs are stored
on Coordinator nodes after 1.0. That means that these are being compacted
across all key-ranges.
It seems that Hinted Handoffs should be persisted in a simple queue-like
data structure that's not sorted by keys. The only thing the data structure
needs to support is the ability to replay them in order.
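As a rough illustration of the queue-like structure described above, here is a
minimal Python sketch. It is not Cassandra code: the HintLog class, its
length-prefixed record layout, and the example addresses are all hypothetical.
It only shows that appending in arrival order and replaying in that same order
needs no key sorting, bloom filter, or index.

```python
import io
import struct

# Hypothetical record header: 2-byte target length, 4-byte mutation length.
HEADER = ">HI"
HEADER_SIZE = struct.calcsize(HEADER)


class HintLog:
    """Append-only hint log (illustrative only, not Cassandra's format)."""

    def __init__(self, stream=None):
        # Any seekable binary stream works; BytesIO keeps the sketch in memory.
        self._stream = stream if stream is not None else io.BytesIO()

    def append(self, target, mutation):
        """Write one hint at the end of the log, in arrival order."""
        target_bytes = target.encode("utf-8")
        self._stream.seek(0, io.SEEK_END)
        self._stream.write(struct.pack(HEADER, len(target_bytes), len(mutation)))
        self._stream.write(target_bytes)
        self._stream.write(mutation)

    def replay(self):
        """Yield (target, mutation) pairs in the order they were appended."""
        self._stream.seek(0)
        while True:
            header = self._stream.read(HEADER_SIZE)
            if len(header) < HEADER_SIZE:
                return  # end of log
            target_len, mutation_len = struct.unpack(HEADER, header)
            target = self._stream.read(target_len).decode("utf-8")
            mutation = self._stream.read(mutation_len)
            yield target, mutation
```

Replaying is a single sequential scan, so there is nothing for compaction to
reorganize. Deleting delivered hints is left out of the sketch.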
A simpler improvement would be to introduce a max_hint_window_size_in_mb
property, in addition to max_hint_window_in_ms. That would at least let you
cap how much hint data builds up.
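To make the proposed size cap concrete, here is a small Python sketch of the
bookkeeping it implies. max_hint_window_size_in_mb is the property proposed
above and does not exist in Cassandra; the HintBudget class and its method
names are hypothetical, and a real implementation would have to decide what
happens to a write whose hint is refused.

```python
class HintBudget:
    """Tracks stored hint bytes against a proposed size cap (illustrative)."""

    def __init__(self, max_hint_window_size_in_mb):
        # Hypothetical setting, converted to bytes once at construction.
        self._limit_bytes = max_hint_window_size_in_mb * 1024 * 1024
        self._stored_bytes = 0

    def try_reserve(self, hint_size_bytes):
        """Return True and count the hint if it fits under the cap."""
        if self._stored_bytes + hint_size_bytes > self._limit_bytes:
            return False  # over budget: the caller would skip storing this hint
        self._stored_bytes += hint_size_bytes
        return True

    def release(self, hint_size_bytes):
        """Give space back after a hint is delivered or expires."""
        self._stored_bytes = max(0, self._stored_bytes - hint_size_bytes)
```

A coordinator would call try_reserve before persisting a hint and release once
the hint has been replayed or has aged out.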
was:
I'm questioning the design decision to use sstables for storing Hinted Handoffs.
sstables are optimized for reads. Things like bloom-filters, indexes, and the
like - none of these are necessary for Hinted Handoffs.
After turning off Hinted Handoffs, I'm still seeing Hinted Handoffs be
compacted a week later. The fact that they are compacted in the first place
doesn't seem right. The whole purpose of compaction is to optimize these tables
for reads, which doesn't apply here.
In our case, this is exacerbated by using Leveled Compaction. The overhead of
compactions is significantly larger with LCS. When compactions begin to backup
under heavy write load, Hinted Handoffs contribute to that.
Another thing that makes it worse (I think) is that Hinted Handoffs are stored
on Coordinator nodes after 1.0. That means that these are being compacted
across all key-ranges.
It seems that Hinted Handoffs should be persisted in a simple queue-like
data-structure, that's not sorted by keys. The only thing the data-structure
needs to support is the ability to replay them in order.
A simpler improvement would be to introduce max_hint_window_size_in_mb
property, in addition to max_hint_window_in_ms. That would at least allow you
to control how much these build up.
> Hinted Handoffs shouldn't use sstables for persistence
> ------------------------------------------------------
>
> Key: CASSANDRA-5409
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5409
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Matt Kapilevich