[ 
https://issues.apache.org/jira/browse/CASSANDRA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kapilevich updated CASSANDRA-5409:
---------------------------------------

    Description: 
I'm questioning the design decision to use sstables for storing Hinted Handoffs.

sstables are optimized for reads. Bloom filters, indexes, and similar 
read-path structures are not necessary for Hinted Handoffs.

After turning off Hinted Handoffs, I'm still seeing Hinted Handoffs being 
compacted a week later. The fact that they are compacted in the first place 
doesn't seem right. The whole purpose of compaction is to optimize sstables for 
reads, which doesn't apply here.

In our case, this is exacerbated by using Leveled Compaction (LCS). The 
overhead of compactions is significantly larger with LCS. When compactions 
begin to back up under heavy write load, Hinted Handoffs contribute to that.

Another thing that makes it worse (I think) is that Hinted Handoffs are stored 
on coordinator nodes after 1.0. That means that these are being compacted 
across all key ranges.

It seems that Hinted Handoffs should be persisted in a simple queue-like 
data structure that's not sorted by keys. The only thing the data structure 
needs to support is the ability to replay them in order.
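
To make the idea concrete, here is a minimal sketch of such a queue-like 
store, written in Java. It is only an illustration under my own assumptions: 
the class name HintLog, its methods, and the length-prefixed record layout are 
hypothetical and are not part of Cassandra. Hints are appended to the end of a 
file and replayed in the order they were written, with no key ordering, no 
index, and no compaction.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical append-only hint log (illustration only, not a Cassandra class). */
public class HintLog {
    private final Path file;

    public HintLog(Path file) {
        this.file = file;
    }

    /** Appends one serialized hint as a length-prefixed record at the end of the file. */
    public void append(byte[] hint) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.APPEND))) {
            out.writeInt(hint.length);
            out.write(hint);
        }
    }

    /** Reads every hint back in the order it was appended. */
    public List<byte[]> replay() throws IOException {
        List<byte[]> hints = new ArrayList<>();
        if (!Files.exists(file)) {
            return hints; // nothing was ever stored
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfLog) {
                    break; // no further records
                }
                byte[] hint = new byte[length];
                in.readFully(hint);
                hints.add(hint);
            }
        }
        return hints;
    }

    /** Drops all stored hints by deleting the file; no compaction is involved. */
    public void clear() throws IOException {
        Files.deleteIfExists(file);
    }
}
```

A real implementation would also need per-target-node segments, fsync 
behaviour, and expiry of old hints, none of which this sketch attempts.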

A simpler improvement would be to introduce a max_hint_window_size_in_mb 
property, in addition to max_hint_window_in_ms. That would at least allow you 
to control how much these build up.
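
As a rough sketch of how such a cap could behave, the Java class below tracks 
the bytes of stored hints and refuses new ones once the limit is reached. The 
class HintSizeCap and its methods are hypothetical names chosen for 
illustration, and max_hint_window_size_in_mb is the property proposed above, 
not an existing setting.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical size cap for stored hints (illustration only, not a Cassandra class). */
public class HintSizeCap {
    private final long maxBytes;
    private final AtomicLong storedBytes = new AtomicLong();

    /** @param maxHintWindowSizeInMb value of the proposed max_hint_window_size_in_mb property */
    public HintSizeCap(long maxHintWindowSizeInMb) {
        this.maxBytes = maxHintWindowSizeInMb * 1024L * 1024L;
    }

    /**
     * Tries to reserve room for a hint of the given size.
     * Returns false when storing it would exceed the cap, in which case
     * the caller would skip writing the hint.
     */
    public boolean tryReserve(long hintBytes) {
        while (true) {
            long current = storedBytes.get();
            long updated = current + hintBytes;
            if (updated > maxBytes) {
                return false;
            }
            if (storedBytes.compareAndSet(current, updated)) {
                return true;
            }
        }
    }

    /** Gives space back after a hint has been delivered or has expired. */
    public void release(long hintBytes) {
        storedBytes.addAndGet(-hintBytes);
    }

    /** Bytes currently counted against the cap. */
    public long storedBytes() {
        return storedBytes.get();
    }
}
```

With a cap of 1 MB, for example, tryReserve succeeds until the stored total 
would pass 1,048,576 bytes and fails after that until release is called.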

    
> Hinted Handoffs shouldn't use sstables for persistence
> ------------------------------------------------------
>
>                 Key: CASSANDRA-5409
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5409
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Matt Kapilevich
