[ https://issues.apache.org/jira/browse/CASSANDRA-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943586#comment-13943586 ]
Bill Mitchell commented on CASSANDRA-6825:
------------------------------------------
As it happens, I have that info handy as my JUnit testcase includes it in the
log4j output:
CREATE TABLE testdb_1395374703023.sr (
    siteid text,
    listid bigint,
    partition int,
    createdate timestamp,
    emailcrypt text,
    emailaddr text,
    properties text,
    removedate timestamp,
    PRIMARY KEY ((siteid, listid, partition), createdate, emailcrypt)
) WITH CLUSTERING ORDER BY (createdate DESC, emailcrypt ASC)
    AND read_repair_chance = 0.1
    AND dclocal_read_repair_chance = 0.0
    AND replicate_on_write = true
    AND gc_grace_seconds = 864000
    AND bloom_filter_fp_chance = 0.01
    AND caching = 'KEYS_ONLY'
    AND comment = ''
    AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
    AND compression = { 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor' };
(siteID was a BIGINT until recently when the schema was changed to TEXT to
match the use of siteID elsewhere in the product. I had not thought to
represent our Java String as a Cassandra UUID.)
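For anyone wanting to repeat the comparison from the description against this
schema, here is a rough sketch of the check, not the actual JUnit code. It
assumes a hypothetical run_count callable that executes one CQL statement and
returns the integer COUNT(*) result; the helper names count_query and
compare_counts are likewise made up for illustration. siteid is quoted as text
to match the current schema.

```python
# Sketch only: `run_count` is a hypothetical callable that executes one CQL
# statement and returns the integer COUNT(*) result; it is not a driver API.
from typing import Callable, Iterable, Tuple

PartitionKey = Tuple[str, int, int]  # (siteid, listid, partition)


def count_query(table: str, key: PartitionKey, limit: int = 1000000) -> str:
    """Build a COUNT(*) statement restricted to one partition of `table`."""
    siteid, listid, partition = key
    escaped = siteid.replace("'", "''")  # CQL escapes a quote by doubling it
    return (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE siteid = '{escaped}' AND listid = {listid} "
        f"AND partition = {partition} LIMIT {limit};"
    )


def compare_counts(
    table: str,
    keys: Iterable[PartitionKey],
    run_count: Callable[[str], int],
    limit: int = 1000000,
) -> Tuple[int, int]:
    """Return (full-table count, sum of the per-partition counts)."""
    total = run_count(f"SELECT COUNT(*) FROM {table} LIMIT {limit};")
    per_partition = sum(run_count(count_query(table, k, limit)) for k in keys)
    return total, per_partition
```

If the two returned numbers differ for a table whose rows all fall in the
listed partitions, that is the symptom reported in this issue.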
> COUNT(*) with WHERE not finding all the matching rows
> -----------------------------------------------------
>
> Key: CASSANDRA-6825
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6825
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: quad core Windows7 x64, single node cluster
> Cassandra 2.0.5
> Reporter: Bill Mitchell
> Assignee: Tyler Hobbs
> Attachments: cassandra.log, selectpartitions.zip,
> selectrowcounts.txt, testdb_1395372407904.zip, testdb_1395372407904.zip
>
>
> Investigating another problem, I needed to do COUNT(*) on the several
> partitions of a table immediately after a test case ran, and I discovered
> that count(*) on the full table and on each of the partitions returned
> different counts.
> In one particular case, SELECT COUNT(*) FROM sr LIMIT 1000000; returned the
> expected count from the test, 99999 rows. The composite primary key splits
> the logical row into six distinct partitions, and when I issue a query asking
> for the total across all six partitions, the returned result is only 83999.
> Drilling down, I find that SELECT * from sr WHERE s = 5 AND l = 11 AND
> partition = 0; returns 30,000 rows, but a SELECT COUNT(*) with the identical
> WHERE predicate reports only 14,000.
> This is failing immediately after running a single small test, such that
> there are only two SSTables, sr-jb-1 and sr-jb-2. Compaction never needed to
> run.
> In selectrowcounts.txt is a copy of the cqlsh output showing the incorrect
> count(*) results.