[ https://issues.apache.org/jira/browse/CASSANDRA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuki Morishita updated CASSANDRA-4316:
--------------------------------------
Attachment: 4316-1.2.txt
For more accurate throttling of both skinny and wide rows, I replaced
Cassandra's Throttle with Guava's RateLimiter.
The RateLimiter is held by CompactionManager so that it is shared globally. Its
permits are configured in bytes per second, and when SSTableScanner reads the
data of one row, it acquires permits from the RateLimiter based on the row size.
This way we no longer have to wait for 1000 rows to be read before compaction
reads are throttled.
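
To illustrate the idea of charging permits per row by size, here is a minimal
sketch of a bytes-per-second limiter. It is not the attached patch and not
Guava's implementation: the class ByteRateLimiterSketch and its methods are
hypothetical names, and the clock is passed in explicitly so that the behaviour
can be checked without sleeping. The sketch assumes that a read may start as
soon as the cost of all earlier reads has been paid, so the cost of a large row
delays the read that follows it.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified limiter for illustration only.
final class ByteRateLimiterSketch {
    private final double bytesPerSecond;
    private long nextFreeNanos; // earliest time at which the next read may start

    ByteRateLimiterSketch(double bytesPerSecond, long nowNanos) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.nextFreeNanos = nowNanos;
    }

    // Reserves `bytes` permits and returns how long the caller should wait,
    // in nanoseconds, before reading. The cost of this reservation is charged
    // to whichever caller reserves next.
    synchronized long reserve(long bytes, long nowNanos) {
        long start = Math.max(nowNanos, nextFreeNanos);
        long waitNanos = start - nowNanos;
        long costNanos = (long) (bytes / bytesPerSecond * 1_000_000_000L);
        nextFreeNanos = start + costNanos;
        return waitNanos;
    }

    // Blocking variant that uses the system clock.
    void acquire(long bytes) throws InterruptedException {
        long waitNanos = reserve(bytes, System.nanoTime());
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }
}
```

In this sketch a scanner would call acquire(rowSizeInBytes) before reading each
row, so a single 100 MB row is throttled on its own instead of being hidden
inside a batch of 1000 rows.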
> Compaction Throttle too bursty with large rows
> ----------------------------------------------
>
> Key: CASSANDRA-4316
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4316
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.8.0
> Reporter: Wayne Lewis
> Assignee: Yuki Morishita
> Fix For: 1.2.1
>
> Attachments: 4316-1.2.txt
>
>
> In org.apache.cassandra.db.compaction.CompactionIterable the check for
> compaction throttling occurs once every 1000 rows. In our workload this is
> much too large as we have many large rows (16 - 100 MB).
> With a 100 MB row, about 100 GB is read (and possibly written) before the
> compaction throttle sleeps. This causes bursts of essentially unthrottled
> compaction IO followed by a long sleep, which yields inconsistent performance
> and high error rates during the bursts.
> We applied a workaround that checks the throttle on every row, which solved
> our performance and error issues:
> line 116 in org.apache.cassandra.db.compaction.CompactionIterable:
> if ((row++ % 1000) == 0)
> replaced with
> if ((row++ % 1) == 0)
> I think a better solution is to calculate how often the throttle should be
> checked from the throttle rate, so that sleeps are applied more consistently.
> E.g. if 16MB/sec is the limit, then check for sleep after every 16MB read so
> that sleeps are spaced about one second apart.
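
The reporter's arithmetic and the proposed byte-based check can be sketched as
follows. This is only an illustration under stated assumptions: the class
ThrottleCheckSketch and its methods are hypothetical and do not appear in
Cassandra, and 1 MB is taken as 1024 * 1024 bytes.

```java
// Hypothetical helper for illustration only.
final class ThrottleCheckSketch {
    private final long bytesPerCheck; // e.g. one second's worth of the throttle rate
    private long bytesSinceCheck;

    ThrottleCheckSketch(long bytesPerCheck) {
        if (bytesPerCheck <= 0) {
            throw new IllegalArgumentException("bytesPerCheck must be positive");
        }
        this.bytesPerCheck = bytesPerCheck;
    }

    // Bytes read between two throttle checks when the check runs once every
    // `rowsPerCheck` rows and every row has the same size.
    static long bytesBetweenChecks(long rowsPerCheck, long rowSizeBytes) {
        return Math.multiplyExact(rowsPerCheck, rowSizeBytes);
    }

    // Records one row and returns true when the throttle should be checked,
    // i.e. once at least `bytesPerCheck` bytes have accumulated since the
    // last check. The counter is reset to zero after each check.
    boolean onRowRead(long rowSizeBytes) {
        bytesSinceCheck += rowSizeBytes;
        if (bytesSinceCheck >= bytesPerCheck) {
            bytesSinceCheck = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Row-count check: 1000 rows of 100 MB each between checks.
        System.out.println(bytesBetweenChecks(1000, 100 * mb) / mb + " MB between checks");
        // Byte-based check at 16 MB: every 100 MB row triggers a check.
        ThrottleCheckSketch check = new ThrottleCheckSketch(16 * mb);
        System.out.println("check after one 100 MB row: " + check.onRowRead(100 * mb));
    }
}
```

With a byte-based threshold, wide rows trigger a check on every row, while
skinny rows are checked only after enough of them have accumulated, so sleeps
stay roughly evenly spaced for both workloads.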