[
https://issues.apache.org/jira/browse/CASSANDRA-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516949#comment-14516949
]
Frens Jan Rumph commented on CASSANDRA-8940:
--------------------------------------------
Thanks for the update. I guess you are on to something. Again, if there's
anything I can help with, I'm happy to pitch in.
(A bit off topic): I wasn't aware that Cassandra performs the count on the
coordinator. I wonder why one couldn't push the count operator to the replicas
involved. I see that aggregate functions in Cassandra trunk are implemented in
a similar fashion. A pity if you ask me.
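To illustrate what I mean by pushing the count down, here is a toy sketch in plain Python. It is not Cassandra code: the node names, the row layout and both functions are made up for illustration, and it assumes replication_factor = 1 so that every row lives on exactly one replica.

```python
from typing import Dict, List, Tuple

# Toy model (assumption): each replica owns a disjoint list of rows,
# as would be the case with replication_factor = 1.
Replicas = Dict[str, List[int]]


def count_on_coordinator(replicas: Replicas) -> Tuple[int, int]:
    """Replicas ship every row; the coordinator counts them.

    Returns (count, number of rows sent over the wire).
    """
    shipped = [row for rows in replicas.values() for row in rows]
    return len(shipped), len(shipped)


def count_pushed_down(replicas: Replicas) -> Tuple[int, int]:
    """Each replica counts locally and ships a single number.

    Returns (count, number of values sent over the wire).
    """
    partial_counts = [len(rows) for rows in replicas.values()]
    return sum(partial_counts), len(partial_counts)


# Hypothetical three-node layout holding 35 rows in total.
replicas = {
    "node1": list(range(0, 10)),
    "node2": list(range(10, 25)),
    "node3": list(range(25, 35)),
}

print(count_on_coordinator(replicas))  # (35, 35)
print(count_pushed_down(replicas))     # (35, 3)
```

Both variants give the same count, but the pushed-down one transfers one value per replica instead of every row. With a replication factor above 1 the partial counts could presumably not simply be summed, since the same row would be counted on several replicas; the sketch ignores that.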
As I understand it, select count queries operate on top of normal select all
queries. Does this mean that this 'skipping' of rows might also be a problem in
other cases? Or is it only a problem because the result set is processed/paged
on a Cassandra node and not in a driver?
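To make the question concrete, here is a toy sketch in plain Python of a count built on top of a paged select. Everything in it is hypothetical: fetch_page, paged_count and the resume_skew parameter are made up, and the off-by-one it simulates is only an example of how rows could get skipped at page boundaries, not a claim about the actual cause of this issue.

```python
from typing import List, Optional, Tuple


def fetch_page(rows: List[int], start: int, page_size: int) -> Tuple[List[int], Optional[int]]:
    """Return one page plus the position to resume from (None when exhausted)."""
    page = rows[start:start + page_size]
    next_start = start + page_size
    return page, (next_start if next_start < len(rows) else None)


def paged_count(rows: List[int], page_size: int, resume_skew: int = 0) -> int:
    """Count rows page by page.

    resume_skew > 0 simulates a hypothetical resume bug that skips that
    many rows at every page boundary.
    """
    total = 0
    start: Optional[int] = 0
    while start is not None:
        page, next_start = fetch_page(rows, start, page_size)
        total += len(page)
        # Resume where the previous page ended, plus the simulated skew.
        start = None if next_start is None else next_start + resume_skew
    return total


rows = list(range(35))
print(paged_count(rows, page_size=10))                 # 35
print(paged_count(rows, page_size=10, resume_skew=1))  # 32
```

In this toy the loop does not care whether it runs on the coordinator or in a driver: if the skipping happens while resuming a page, a plain paged select would miss the same rows. If it happens only in the code that does the counting, normal selects would be unaffected. That is the distinction I'm asking about.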
> Inconsistent select count and select distinct
> ---------------------------------------------
>
> Key: CASSANDRA-8940
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8940
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: 2.1.2
> Reporter: Frens Jan Rumph
> Assignee: Benjamin Lerer
> Attachments: 7b74fb00-e935-11e4-b10c-317579db7eb4.csv,
> 8d5899d0-e935-11e4-847b-2d06da75a6cd.csv, Vagrantfile, install_cassandra.sh,
> setup_hosts.sh
>
>
> When performing {{select count( * ) from ...}} I expect the results to be
> consistent over multiple query executions if the table at hand is not written
> to / deleted from in the meantime. However, in my set-up it is not. The
> counts returned vary considerably (several percent). The same holds for
> {{select distinct partition-key-columns from ...}}.
> I have a table in a keyspace with replication_factor = 1 which is something
> like:
> {code}
> CREATE TABLE tbl (
> id frozen<id_type>,
> bucket bigint,
> offset int,
> value double,
> PRIMARY KEY ((id, bucket), offset)
> )
> {code}
> The frozen udt is:
> {code}
> CREATE TYPE id_type (
> tags map<text, text>
> );
> {code}
> The table contains around 35k rows (I'm not trying to be funny here ...). The
> consistency level for the queries was ONE.