[
https://issues.apache.org/jira/browse/CASSANDRA-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189291#comment-13189291
]
Dominic Williams commented on CASSANDRA-3748:
---------------------------------------------
Hey, anyone got any ideas on this bug yet?
> Range ghosts don't disappear as expected and accumulate
> -------------------------------------------------------
>
> Key: CASSANDRA-3748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3748
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.0.3
> Environment: Cassandra on Debian
> Reporter: Dominic Williams
> Labels: compaction, ghost-row, range, remove
> Fix For: 1.0.8
>
> Original Estimate: 6h
> Remaining Estimate: 6h
>
> I have a problem where range ghosts are accumulating and cannot be removed by
> reducing GCGraceSeconds and compacting.
> In our system, we have some cfs that represent "markets" where each row
> represents an item. Once an item is sold, it is removed from the market by
> passing its key to remove().
> The problem, which was hidden for some time by caching, is appearing on read.
> Every few seconds our system collates a random sample from each cf/market by
> choosing a random starting point:
> String startKey = RNG.nextUUID();
> and then loading a page range of rows, specifying the key range as:
> KeyRange keyRange = new KeyRange(pageSize);
> keyRange.setStart_key(startKey);
> keyRange.setEnd_key(maxKey);
> The returned rows are iterated over, and ghosts ignored. If insufficient rows
> are obtained, the process is repeated using the key of the last row as the
> starting key (or wrapping if necessary etc).
> When performance was lagging, we did a test and found that constructing a
> random sample of 40 items (rows) involved iterating over hundreds of
> thousands of ghost rows.
> Our first attempt to deal with this was to halve our GCGraceSeconds and then
> perform major compactions. However, this had no effect on the number of ghost
> rows being returned. Furthermore, on examination it seems clear that the
> number of ghost rows created within the GCGraceSeconds window must be smaller
> than the number being returned. This looks like a bug.
> We are using Cassandra 1.0.3 with Sylvain's patch from CASSANDRA-3510.
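
For anyone trying to reproduce this, the paging loop described in the report can be sketched roughly as below. This is not the reporter's code and it does not talk to Cassandra: an in-memory sorted map stands in for the column family, a row with no columns stands in for a range ghost, and the names RangeGhostSampler, loadPage, sample and buildMarket are all hypothetical. It also assumes plain lexicographic key order and leaves out the wrap-around step. The point is only to show how the number of rows scanned grows with the ghost density while the sample size stays fixed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class RangeGhostSampler {
    /** Result of one sampling pass: the live keys found and how many rows were iterated. */
    static class Sample {
        final List<String> keys = new ArrayList<>();
        int scanned = 0;
    }

    /** Stand-in for one range query: up to pageSize rows with key >= startKey, in key order. */
    static List<Map.Entry<String, List<String>>> loadPage(
            NavigableMap<String, List<String>> cf, String startKey, int pageSize) {
        List<Map.Entry<String, List<String>>> page = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : cf.tailMap(startKey, true).entrySet()) {
            if (page.size() == pageSize) break;
            page.add(row);
        }
        return page;
    }

    /** Pages forward from startKey, ignoring ghosts, until sampleSize live rows are found. */
    static Sample sample(NavigableMap<String, List<String>> cf,
                         String startKey, int sampleSize, int pageSize) {
        if (pageSize < 2) {
            // Each later page repeats the previous page's last row, so 1 would never advance.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        Sample result = new Sample();
        String start = startKey;
        boolean firstPage = true;
        while (result.keys.size() < sampleSize) {
            List<Map.Entry<String, List<String>>> page = loadPage(cf, start, pageSize);
            for (Map.Entry<String, List<String>> row : page) {
                // The start key is inclusive, so skip the row already seen on the previous page.
                if (!firstPage && row.getKey().equals(start)) continue;
                result.scanned++;
                if (row.getValue().isEmpty()) continue; // ghost: key present, no columns
                result.keys.add(row.getKey());
                if (result.keys.size() == sampleSize) break;
            }
            if (page.size() < pageSize) break; // end of range reached; wrapping omitted
            start = page.get(page.size() - 1).getKey();
            firstPage = false;
        }
        return result;
    }

    /** Builds a fake market of `rows` keys where only every `liveEvery`-th row still has columns. */
    static NavigableMap<String, List<String>> buildMarket(int rows, int liveEvery) {
        NavigableMap<String, List<String>> cf = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            List<String> columns = (i % liveEvery == 0)
                    ? Collections.singletonList("item")
                    : Collections.<String>emptyList();
            cf.put(String.format("k%04d", i), columns);
        }
        return cf;
    }

    public static void main(String[] args) {
        Sample s = sample(buildMarket(1000, 100), "k0000", 5, 50);
        System.out.println("live keys: " + s.keys + ", rows scanned: " + s.scanned);
    }
}
```

With 99 ghosts for every live row, a sample of 5 already means iterating over several hundred rows, which is the same shape as the behaviour described above, only on a much smaller scale.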