[jira] [Commented] (CASSANDRA-8940) Inconsistent select count and select distinct

Frens Jan Rumph (JIRA) Wed, 22 Apr 2015 14:48:13 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507981#comment-14507981
 ]


Frens Jan Rumph commented on CASSANDRA-8940:
--------------------------------------------

Hi [~blerer],

Wow, thanks a lot for all the trouble you've gone to!

It's very odd that you aren't able to reproduce the issue. It might give a 
clue though. I assume you are running the scripts from your host machine? If so, 
it might be more of a client-related than a server-related issue. Could you by 
any chance run the script from one of the nodes if you haven't done so already?

If you place the script in {{test.py}} next to the {{Vagrantfile}} you should 
be able to do something like (as root / with sudo):
{code}
curl https://bootstrap.pypa.io/get-pip.py | python
pip install cassandra-driver
cd /vagrant
python test.py cas-1 cas-2 cas-3
{code}

I have attached two CSV dumps from {{system_traces.events}}:
7b74fb00-e935-11e4-b10c-317579db7eb4.csv which counted to 494453
8d5899d0-e935-11e4-847b-2d06da75a6cd.csv which counted to 494833

With tracing enabled, I wasn't able to get a count of the full 500000 rows that 
were in the table ... perhaps looking at the differences between the traces 
reveals something?

The traces were generated from the script running from one of the Vagrant nodes 
by the way.
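
In case it helps with comparing the two dumps, here is a minimal sketch that counts trace events per (source, activity) pair in each file and lists the pairs whose counts differ. It assumes the CSV files have no header row and use the column order given in {{COLUMNS}} below; that order is my assumption, so adjust it to the actual files. The function names are made up for illustration.

```python
import csv
import io
from collections import Counter

# Assumed column order of the dump (no header row); adjust to the real files.
COLUMNS = ["session_id", "event_id", "activity", "source", "source_elapsed", "thread"]


def activity_counts(csv_text, columns=COLUMNS):
    """Count trace events per (source, activity) pair in one CSV dump."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=columns)
    return Counter((row["source"], row["activity"]) for row in reader)


def diff_counts(first, second):
    """Return {key: (count_in_first, count_in_second)} for keys whose counts differ."""
    keys = set(first) | set(second)
    # A Counter returns 0 for missing keys, so one-sided events show up too.
    return {k: (first[k], second[k]) for k in sorted(keys) if first[k] != second[k]}
```

To use it, read each attachment with something like {{open("7b74fb00-e935-11e4-b10c-317579db7eb4.csv").read()}}, pass the text to {{activity_counts}}, and hand both results to {{diff_counts}}. If the activity text turns out to contain per-event numbers, grouping on {{source}} alone may give a more readable comparison.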

Cheers,
Frens Jan

> Inconsistent select count and select distinct
> ---------------------------------------------
>
>                 Key: CASSANDRA-8940
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8940
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 2.1.2
>            Reporter: Frens Jan Rumph
>            Assignee: Benjamin Lerer
>         Attachments: Vagrantfile, install_cassandra.sh, setup_hosts.sh
>
>
> When performing {{select count( * ) from ...}} I expect the results to be 
> consistent over multiple query executions if the table at hand is not written 
> to / deleted from in the meantime. However, in my set-up it is not. The 
> counts returned vary considerably (several percent). The same holds for 
> {{select distinct partition-key-columns from ...}}.
> I have a table in a keyspace with replication_factor = 1 which is something 
> like:
> {code}
> CREATE TABLE tbl (
>     id frozen<id_type>,
>     bucket bigint,
>     offset int,
>     value double,
>     PRIMARY KEY ((id, bucket), offset)
> )
> {code}
> The frozen udt is:
> {code}
> CREATE TYPE id_type (
>     tags map<text, text>
> );
> {code}
> The table contains around 35k rows (I'm not trying to be funny here ...). The 
> consistency level for the queries was ONE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
