You’re probably hitting https://issues.apache.org/jira/browse/CASSANDRA-8940 
("Inconsistent select count and select distinct").
It’s resolved (as I understand it, a non-thread-safe object was shared between 
threads), and the patch will be included in 2.1.6 and 2.0.16.

It’s a showstopper for me too: while developing I sometimes need to rebuild 
stuff based on the complete dataset (should become *very* rare in production, 
but still).
However, as long as this bug is around, I can never be sure all records are 
included.
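
Until a fixed version is available, one way to check whether an export is 
complete is to run it twice and compare the keys in the two files. The 
following is only a minimal sketch, not a tested tool: the function names and 
the file names `export_run1.csv` and `export_run2.csv` are hypothetical, and it 
assumes the exported column (for example `imprid`) is the first column of each 
CSV and identifies a record. Duplicate values collapse into one set entry, so 
the set sizes can be smaller than the row counts cqlsh reports.

```python
import csv


def read_keys(path, column=0):
    """Return the set of values found in one column of a CSV export."""
    with open(path, newline="") as handle:
        # Skip empty lines; keep only the chosen column of each row.
        return {row[column] for row in csv.reader(handle) if row}


def compare_exports(path_a, path_b, column=0):
    """Report keys that appear in only one of two exports of the same table."""
    keys_a = read_keys(path_a, column)
    keys_b = read_keys(path_b, column)
    return {
        "only_in_a": keys_a - keys_b,      # missing from the second export
        "only_in_b": keys_b - keys_a,      # missing from the first export
        "common": len(keys_a & keys_b),    # keys present in both exports
    }


# Example usage (hypothetical file names):
# result = compare_exports("export_run1.csv", "export_run2.csv")
# print(len(result["only_in_a"]), len(result["only_in_b"]), result["common"])
```

If both difference sets are empty across several runs, the exports at least 
agree with each other; that does not prove that no record is missing from all 
of them.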

Unfortunately, I don’t see any schedule for releasing either version…

Luc


From: Josef Lindman Hörnlund [mailto:jo...@appdata.biz]
Sent: Wednesday, 3 June 2015 12:16
To: user@cassandra.apache.org
Subject: Re: Different number of records from COPY command


I ran into that issue a while ago, and it was because I hit the tombstone limit 
on one of the nodes. Try running `nodetool compact adlog adclicklog20150528` 
and see if that helps.

Josef Lindman Hörnlund

On 02 Jun 2015, at 17:48, Saurabh Chandolia <s.chando...@gmail.com> wrote:

I am still getting an inconsistent number of records at consistency levels ALL 
and QUORUM. The output for both levels follows.

cqlsh:adlog> CONSISTENCY ALL;
Consistency level set to ALL.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 58000 rows; Write: 3065.60 rows/s
58463 rows exported in 21.353 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 63000 rows; Write: 3517.03 rows/s
63972 rows exported in 22.885 seconds.

cqlsh:adlog> CONSISTENCY QUORUM ;
Consistency level set to QUORUM.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 63000 rows; Write: 3443.37 rows/s
63440 rows exported in 21.987 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 65000 rows; Write: 3405.90 rows/s
65524 rows exported in 24.053 seconds.


- Saurabh

On Tue, Jun 2, 2015 at 9:09 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
I have never exported data myself, but can you try setting 'consistency ALL' 
in cqlsh before executing the command?

Thanks
Anuj Wadehra
________________________________
From: "Saurabh Chandolia" <s.chando...@gmail.com>
Date: Tue, 2 Jun, 2015 at 8:47 pm
Subject: Different number of records from COPY command
I am seeing a different number of records each time I export a particular 
table. There were no writes to or reads from this table while the data was 
being exported. I cannot understand why this is happening.
Am I missing something here?

Cassandra version: 2.1.4
Java driver version: 2.1.5
Cluster Size: 4 Nodes in same DC
Keyspace Replication factor: 2

The following commands were issued:
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 68000 rows; Write: 3025.93 rows/s
68682 rows exported in 27.737 seconds.

cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 65000 rows; Write: 2821.06 rows/s
65535 rows exported in 26.667 seconds.

cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 66000 rows; Write: 3285.07 rows/s
66055 rows exported in 26.269 seconds.


cfstats for adlog.adclicklog20150528:
-------------------------------------------
$ nodetool cfstats adlog.adclicklog20150528
Keyspace: adlog
Read Count: 217
Read Latency: 2.773073732718894 ms.
Write Count: 103191
Write Latency: 0.10233075558915021 ms.
Pending Flushes: 0
Table: adclicklog20150528
SSTable count: 11
Space used (live): 37981202
Space used (total): 37981202
Space used by snapshots (total): 13407843
Off heap memory used (total): 25580
SSTable Compression Ratio: 0.26684147550494164
Number of keys (estimate): 5627
Memtable cell count: 94620
Memtable data size: 13459445
Memtable off heap memory used: 0
Memtable switch count: 19
Local read count: 217
Local read latency: 2.774 ms
Local write count: 103191
Local write latency: 0.103 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 7192
Bloom filter off heap memory used: 7104
Index summary off heap memory used: 980
Compression metadata off heap memory used: 17496
Compacted partition minimum bytes: 1110
Compacted partition maximum bytes: 182785
Compacted partition mean bytes: 27808
Average live cells per slice (last five minutes): 44.663594470046085
Maximum live cells per slice (last five minutes): 86.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0

----------------

- Saurabh



