[
https://issues.apache.org/jira/browse/CASSANDRA-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuki Morishita updated CASSANDRA-5099:
--------------------------------------
Attachment: 5099-1.1.txt
Fix for CASSANDRA-4833 assumes that only last page from get_slice has columns
less than requested, and if that happens, returns the column count collected.
But in case of ttl, it is possible to have less columns at the middle of the
pagenation.
Attached patch changes loop termination check so that it does not depend on
column count, which is what we had in 1.0.
> Since 1.1, get_count sometimes returns value smaller than actual column count
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-5099
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5099
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.1.7
> Reporter: Jason Harvey
> Assignee: Yuki Morishita
> Fix For: 1.1.9
>
> Attachments: 5099-1.1.txt
>
>
> We have a CF where rows have thousands of TTLd columns. The columns are
> continually added at a regular rate, and TTL out after 15 minutes. We
> continually run a 'get_count' on these keys to get a count of the number of
> live columns.
> Since we upgrade from 1.0 to 1.1.7, "get_count" regularly returns much
> smaller values than are possible. For example, with roughly 15,000 columns
> that have well-distributed TTLs, running a get_count 10 times will result in
> 1 or 2 results that are up to half the actual column count. Using a normal
> 'get' to count those columns always results in proper values.
> For example:
> (all of these counts were ran within a second or less of eachother)
> {code}
> [default@reddit] count AccountsActiveBySR['2qh0u'];
> 13665 columns
> [default@reddit] count AccountsActiveBySR['2qh0u'];
> 13665 columns
> [default@reddit] count AccountsActiveBySR['2qh0u'];
> 13666 columns
> [default@reddit] count AccountsActiveBySR['2qh0u'];
> 3069 columns
> [default@reddit] count AccountsActiveBySR['2qh0u'];
> 13660 columns
> [default@reddit] count AccountsActiveBySR['2qh0u'];
> 13661 columns
> {code}
> I should note that this issue happens much more frequently with larger (>10k
> columns) rows than smaller rows. It never seems to happen with rows having
> fewer than 1k columns.
> There are no supercolumns in use. The key names and column names are very
> short, and there are no column values. The CF is LCS, and due to the TTL only
> hovers around a few MB in size. GC grace is normally at zero, but the problem
> is consistent with non-zero gc grace times.
> It appears that there was an issue (CASSANDRA-4833) fixed in 1.1.7 regarding
> get_count. Some logic was added to prevent an infinite loop case. Could that
> change have resulted in this problem somehow? I can't find any other relevant
> 1.1 changes that might explain this issue.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira