RE: CPU consumption of Cassandra

2014-09-23 Thread Leleu Eric
I tried to run cassandra-stress on some of my tables, as proposed by Jake Luciani. For a simple table, this tool is able to perform 8 read op/s with little CPU consumption if I query the table by the PK (name, tenantid). Ex : TABLE : CREATE TABLE IF NOT EXISTS buckets (tenantid varchar, name

Re: CPU consumption of Cassandra

2014-09-23 Thread Chris Lohfink
Well, first off, you shouldn't run the stress tool on the node you're testing. Give it its own box. With RF=N=2 you're essentially testing a single machine locally, which isn't the best indicator long term (optimizations are available when reading data that's local to the node). 80k/sec on a system is

Is there harm from having all the nodes in the seed list?

2014-09-23 Thread Donald Smith
Is there any harm from having all the nodes listed in the seeds list in cassandra.yaml? Donald A. Smith | Senior Software Engineer, AudienceScience

Cassandra sometimes times out on write queries and it spends majority amount of the CPU time on method org.apache.cassandra.db.marshal.AbstractCompositeType.compare()

2014-09-23 Thread Li, George
Hi, I am running some load tests in a 5-node Cassandra cluster (EC2, single region, each node has 15 GB RAM, Cassandra version 2.0.6, replication factor 3). My Java program uses Java driver version 2.0.6 and it does 2000 rounds of batch write queries, each with 8 inserts, 8 updates and 8 deletes.

RE : CPU consumption of Cassandra

2014-09-23 Thread Leleu Eric
First of all, thanks for your help! :) Here are some details: "With RF=N=2 you're essentially testing a single machine locally which isn't the best indicator long term" I will test with more nodes (4 with RF = 2), but for now I'm limited to 2 nodes for non-technical reasons ... "Well, first off

Re: Is there harm from having all the nodes in the seed list?

2014-09-23 Thread DuyHai Doan
Well, having all nodes in the seed list does not compromise the correctness of the gossip protocol. However, there will be extra network traffic when nodes are starting, because each one will ping all nodes for topology discovery, AFAIK. On Tue, Sep 23, 2014 at 7:31 PM, Donald Smith

Re: CPU consumption of Cassandra

2014-09-23 Thread Chris Lohfink
CPU consumption may be affected by the cassandra-stress tool in the 2nd example as well. Running it on a separate system eliminates it as a possible cause. There is a little extra work, but not anything that I think would be that obvious. Tracing (can enable with nodetool) or profiling (i.e. with

Re: CPU consumption of Cassandra

2014-09-23 Thread DuyHai Doan
I did some benchmarking in the past when we faced high CPU usage even though the data set was very small and sat entirely in memory. The report is here: https://github.com/doanduyhai/Cassandra_Data_Model_Bench Our *partial* conclusions were: 1) slice query fetching a page of 64kb of data and

Re: CPU consumption of Cassandra

2014-09-23 Thread Daniel Chia
If I had to guess, it might be due in part to inefficiencies in 2.0 with regard to CompositeType (which is used in CQL3 tables) -

Re: CPU consumption of Cassandra

2014-09-23 Thread DuyHai Doan
Nice catch, Daniel. The comment from Sylvain explains a lot! On Tue, Sep 23, 2014 at 11:33 PM, Daniel Chia danc...@coursera.org wrote: If I had to guess, it might be due in part to inefficiencies in 2.0 with regard to CompositeType (which is used in CQL3 tables) -

How to get data which has changed within x minutes using CQL?

2014-09-23 Thread Check Peck
I have a table structure like below - CREATE TABLE client_data ( client_id int, consumer_id text, last_modified_date timestamp, PRIMARY KEY (client_id, last_modified_date, consumer_id) ) I have a query pattern like this - Give me everything for what has changed

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread DuyHai Doan
It is possible to request a range of data according to last_modified_date, but you still need to provide client_id, the partition key, in any case. On Wed, Sep 24, 2014 at 12:23 AM, Check Peck comptechge...@gmail.com wrote: I have a table structure like below - CREATE TABLE

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread Check Peck
Yes, I can provide client_id in my where clause. So now my query pattern will be - Give me everything that has changed within the last 15 minutes or 5 minutes and whose client_id is equal to 1. What will my query look like then? On Tue, Sep 23, 2014 at 3:26 PM, DuyHai Doan doanduy...@gmail.com

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread DuyHai Doan
let previous15Min = now - 15 mins. SELECT * FROM client_data WHERE client_id = 1 and last_modified_date >= previous15Min. Same thing for the last 5 mins. On Wed, Sep 24, 2014 at 12:32 AM, Check Peck comptechge...@gmail.com wrote: Yes I can provide client_id in my where clause. So now my query pattern

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread Check Peck
On Tue, Sep 23, 2014 at 3:41 PM, DuyHai Doan doanduy...@gmail.com wrote: now - 15 mins Can I run it like this in CQL using cqlsh? SELECT * FROM client_data WHERE client_id = 1 and last_modified_date >= now - 15 mins When I ran the above query I got an error on my cql client - Bad Request: line

Re: How to get data which has changed within x minutes using CQL?

2014-09-23 Thread DuyHai Doan
No, you need to compute now - 15 mins yourself. CQL3 does not offer built-in functions to deal with dates right now. On 24 Sep 2014 00:47, Check Peck comptechge...@gmail.com wrote: On Tue, Sep 23, 2014 at 3:41 PM, DuyHai Doan doanduy...@gmail.com wrote: now - 15 mins Can I run like
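As a rough illustration of computing the cutoff on the client side, here is a minimal Python sketch. It is not code from the thread: the function name `changed_since_query` is hypothetical, it assumes the intended predicate is "at or after the cutoff", and it relies on CQL accepting a timestamp literal as integer milliseconds since the epoch.

```python
from datetime import datetime, timedelta, timezone


def changed_since_query(client_id, minutes, now=None):
    """Build a CQL string selecting rows modified in the last `minutes` minutes.

    `now` can be passed in explicitly (as an aware datetime) to make the
    result reproducible; by default the current UTC time is used.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=minutes)
    # CQL timestamps can be written as milliseconds since the Unix epoch.
    cutoff_ms = int(cutoff.timestamp() * 1000)
    return (
        "SELECT * FROM client_data "
        "WHERE client_id = %d AND last_modified_date >= %d"
        % (int(client_id), cutoff_ms)
    )


if __name__ == "__main__":
    # Prints a statement that can be pasted into cqlsh.
    print(changed_since_query(1, 15))
```

The partition key (client_id) is still required, as noted earlier in the thread; only the clustering column last_modified_date is range-restricted. In application code it is usually preferable to bind the cutoff as a statement parameter rather than format it into the string.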

Reading SSTables Potential File Descriptor Leak 1.2.18

2014-09-23 Thread Tim Heckman
Hello, I ran into a problem today where Cassandra 1.2.18 exhausted its number of permitted open file descriptors (65,535). This node has 256 tokens (vnodes) and runs in a test environment with relatively little traffic/data. As best I could tell, the majority of the file descriptors open were
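For readers who want to see where a process's open descriptors point, the following is a minimal Python sketch, not the method used in the thread. It assumes a Linux host with /proc, permission to read the target process's fd directory, and SSTable file names that end in a component such as `-Data.db` or `-Index.db`; the function names `fd_targets` and `tally_by_kind` are hypothetical.

```python
import os
from collections import Counter


def fd_targets(pid):
    """Return the link targets of every open descriptor of `pid` (Linux only)."""
    fd_dir = "/proc/%d/fd" % pid
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def tally_by_kind(targets):
    """Count targets per SSTable component (assumed '<...>-<Component>.db' naming)."""
    counts = Counter()
    for target in targets:
        base = os.path.basename(target)
        if base.endswith(".db") and "-" in base:
            # Text after the last hyphen, e.g. 'Data.db' or 'Index.db'.
            counts[base.rsplit("-", 1)[1]] += 1
        else:
            # Sockets, pipes, jars, logs and anything else.
            counts["other"] += 1
    return counts
```

Calling `tally_by_kind(fd_targets(pid)).most_common()` with the Cassandra process id would list the most frequent target kinds first, which can help narrow down which files are accumulating.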

RE: Reading SSTables Potential File Descriptor Leak 1.2.18

2014-09-23 Thread Job Thomas
Hi, It looks like the offset in the keycache is wrong. Refreshing the keycache may solve the issue. Thanks & Regards, Job M Thomas, Platform Technology From: Tim Heckman [mailto:t...@pagerduty.com] Sent: Wed 9/24/2014 6:17 AM To: user@cassandra.apache.org Subject: