RE: Creating 'Put' requests

2015-04-23 Thread Matthew Johnson
Hi Jim, I think I have found what I was looking for here: https://gist.github.com/yangzhe1991/10349122 I would end up with code that looks something like this: public void createSchema() { System.out.println("CREATING SCHEMA");

Re: What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Sebastian Estevez
Carlos is right: *Read Requests* - The number of read requests per second on the coordinator nodes, analogous to client reads. Monitoring the number of requests over a given time period reveals system read workload and usage patterns. *Avg* - The average of values recorded during a time

Data model suggestions

2015-04-23 Thread Ali Akhtar
Hey all, We are working on moving a mysql based application to Cassandra. The workflow in mysql is this: We have two tables: active and archive . Every hour, we pull in data from an external API. The records which are active, are kept in 'active' table. Once a record is no longer active, its

Re: Data model suggestions

2015-04-23 Thread Manoj Khangaonkar
Hi, How do you determine if the record is no longer active? Is it a periodic process that goes through every record and checks when the last update happened? regards On Thu, Apr 23, 2015 at 8:09 AM, Ali Akhtar ali.rac...@gmail.com wrote: Hey all, We are working on moving a mysql based

Re: timeout creating table

2015-04-23 Thread Sebastian Estevez
That is a problem, you should not have RF > N. Do an alter table to fix it. This will affect your reads and writes if you're doing anything CL > 1 -- timeouts. On Apr 23, 2015 4:35 AM, Jimmy Lin y2klyf+w...@gmail.com wrote: Also I am not sure it matters, but I just realized the keyspace created
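As a rough illustration of why a replication factor larger than the node count can lead to timeouts, the sketch below counts how many replicas a consistency level needs and compares that with how many can be alive. It is a simplified model, not Cassandra's implementation: it assumes a single datacenter, assumes every live node is a replica, and the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replication factor."""
    return replication_factor // 2 + 1


def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for the given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def can_satisfy(consistency_level: str, replication_factor: int, live_nodes: int) -> bool:
    """True if enough replicas can be alive to meet the consistency level.

    Simplifying assumption: every live node holds a replica, so at most
    min(replication_factor, live_nodes) replicas can answer.
    """
    live_replicas = min(replication_factor, live_nodes)
    return live_replicas >= replicas_required(consistency_level, replication_factor)


if __name__ == "__main__":
    # RF = 2 on a single node: ONE can be met, QUORUM (2 of 2) cannot.
    print(can_satisfy("ONE", 2, 1))
    print(can_satisfy("QUORUM", 2, 1))
```

Under this model, a single node with RF 2 can still serve requests at ONE, but anything that needs two replicas cannot complete.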

Re: Creating 'Put' requests

2015-04-23 Thread Alex Popescu
On Thu, Apr 23, 2015 at 8:50 AM, Matthew Johnson matt.john...@algomi.com wrote: Unfortunately it seems that I was misinformed on the “dynamically creating timeseries columns” feature, and that this WAS deprecated in CQL3 – in order to dynamically create columns I would have to issue an ‘ALTER

RE: timeout creating table

2015-04-23 Thread Matthew Johnson
Hi Jimmy, I have very limited experience with Cassandra so far, but from following some tutorials to create keyspaces, create tables, and insert data, it definitely seems to me like creating keyspaces and tables is way slower than inserting data. Perhaps a more experienced user can confirm if

Re: Adding New Node Issue

2015-04-23 Thread Jeff Ferland
Sounds to me like your stream throughput value is too high. `nodetool getstreamthroughput` and `nodetool setstreamthroughput` will update this value live. Limit it to something lower so that the system isn’t overloaded by streaming. The bottleneck that slows things down is most likely to be disk or

Re: timeout creating table

2015-04-23 Thread Jimmy Lin
Well, I am pretty sure our CL is ONE, and the long pause seems to happen somewhat randomly. But do create keyspace or table statements get different treatment in terms of CL, which might explain the long pause? Thanks On Thu, Apr 23, 2015 at 8:04 AM, Sebastian Estevez sebastian.este...@datastax.com

Re: Data model suggestions

2015-04-23 Thread Ali Akhtar
That's returned by the external API we're querying. We query them for active records, if a previous active record isn't included in the results, that means it's time to archive that record. On Thu, Apr 23, 2015 at 9:20 PM, Manoj Khangaonkar khangaon...@gmail.com wrote: Hi, How do you determine
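The archiving rule described here (a record that was active before but is missing from the latest API results should be archived) amounts to a set difference. A minimal sketch, assuming record ids are hashable values and that the previously active ids are available in memory; the function name is hypothetical:

```python
def split_records(previously_active: set, api_results: set) -> tuple:
    """Compare the last known active ids with the ids the API just returned.

    Returns (to_archive, newly_active):
      to_archive   - ids that were active before but are absent from the API results
      newly_active - ids the API returned that were not active before
    """
    to_archive = previously_active - api_results
    newly_active = api_results - previously_active
    return to_archive, newly_active


if __name__ == "__main__":
    before = {"a", "b", "c"}
    latest = {"b", "c", "d"}
    to_archive, newly_active = split_records(before, latest)
    print(sorted(to_archive), sorted(newly_active))
```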

Re: How much data is bootstrapping supposed to send?

2015-04-23 Thread Robert Coli
On Wed, Apr 22, 2015 at 11:57 PM, Dave Galbraith david92galbra...@gmail.com wrote: So I was expecting the load to drop to about 6.5 MB on my original node while the new node would pick up about 6.5 MB, so they'd be balanced, but instead the disk usage on my original node somehow increased by

RE: Creating 'Put' requests

2015-04-23 Thread Matthew Johnson
Hi Jim, This would still involve either having a fixed(ish) schema, with a handful of pre-written prepared statements that I fill the values into, or some rather horrific StringBuilder that generates the statement based on some logic. Prepared Statements work great, for example, for inserting
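For illustration only, the "generate the statement from column/value pairs" approach mentioned above could look roughly like the sketch below. It builds a parameterized CQL INSERT string and a matching list of bind values from a dictionary, similar in spirit to filling an HBase Put. The function name and the identifier check are assumptions made for this example; the columns must already exist in the table, and the resulting string would still need to be prepared and executed through a driver.

```python
import re

# Conservative check for unquoted identifiers (an assumption for this sketch;
# keyspace-qualified or quoted names are deliberately not handled).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_insert(table: str, values: dict) -> tuple:
    """Build a parameterized INSERT statement from column/value pairs.

    Returns (cql, params), where cql uses '?' placeholders and params lists
    the values in the same order as the columns.
    """
    if not values:
        raise ValueError("at least one column/value pair is required")
    for name in (table, *values):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsupported identifier: {!r}".format(name))
    columns = list(values)  # dicts preserve insertion order in Python 3.7+
    cql = "INSERT INTO {} ({}) VALUES ({})".format(
        table,
        ", ".join(columns),
        ", ".join("?" for _ in columns),
    )
    return cql, [values[column] for column in columns]


if __name__ == "__main__":
    cql, params = build_insert("events", {"id": 1, "name": "a"})
    print(cql)
    print(params)
```

Only values are passed as bind parameters; identifiers are validated rather than escaped, which keeps the sketch short but restrictive.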

Adding New Node Issue

2015-04-23 Thread Thomas Miller
Hello, Yesterday we ran into a serious issue while joining a new node to our existing 4 node Cassandra cluster (version 2.0.7). The average node data size is 152 GB with a replication factor of 3. The node was prepped just like the following document describes -

RE: Adding New Node Issue

2015-04-23 Thread Thomas Miller
Andrei, I did not see that bug report. Thanks for the heads up on that. I am thinking that that is still not the issue though since if this were the case then I should be seeing higher than 200Mbps on that interface. I am able to see that the two streaming nodes never get over 200Mbps via my

Re: Data model suggestions

2015-04-23 Thread Manoj Khangaonkar
Hi, If your external API returns active records, that means I am guessing you need to do a select * on the active table to figure out which records in the table are no longer active. You might be aware that range selects based on partition key will timeout in cassandra. They can however be made

Re: Adding New Node Issue

2015-04-23 Thread Ali Akhtar
What version are you running? On Fri, Apr 24, 2015 at 12:51 AM, Thomas Miller thomas.mil...@wda.com wrote: Jeff, Thanks for the response. I had come across that as a possible solution previously but there are discrepancies that would lead me to think that that is not the issue. It

RE: Adding New Node Issue

2015-04-23 Thread Thomas Miller
Ali, Our Cassandra version is 2.0.7. Thanks, Thomas Miller From: Ali Akhtar [mailto:ali.rac...@gmail.com] Sent: Thursday, April 23, 2015 4:22 PM To: user@cassandra.apache.org Subject: Re: Adding New Node Issue What version are you running? On Fri, Apr 24, 2015 at 12:51 AM, Thomas Miller

Re: Data model suggestions

2015-04-23 Thread Ali Akhtar
Good point about the range selects. I think they can be made to work with limits, though. Or, since the active records will never usually be 500k, the ids may just be cached in memory. Most of the time, during reads, the queries will just consist of select * where primaryKey = someValue . One

Re: Adding New Node Issue

2015-04-23 Thread Andrei Ivanov
Thomas, From our experience, C* is almost degrading quite a bit when we bootstrap new nodes - no idea why, was never able to get any help or hints. And we never reach anywhere close to 200Mbps. Though we also see higher CPU usage. Actually, there is another way of adding nodes, I guess. Like start

Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

2015-04-23 Thread Andrei Ivanov
Just in case it helps - we are running C* with sstable sizes of something like 2.5 TB and ~4TB/node. No evident problems except the time it takes to compact. Andrei. On Wed, Apr 22, 2015 at 5:36 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Thanks Robert!! The JIRA was very helpful in

Re: Data model suggestions

2015-04-23 Thread Narendra Sharma
I think one table, say 'record', should be good. The primary key is the record id. This will ensure good distribution. Just update the active attribute to true or false. For range queries on active vs archive records, maintain 2 indexes or try a secondary index. On Apr 23, 2015 1:32 PM, Ali Akhtar

RE: Adding New Node Issue

2015-04-23 Thread Thomas Miller
Jeff, Thanks for the response. I had come across that as a possible solution previously but there are discrepancies that would lead me to think that that is not the issue. It appears our stream throughput is currently set to 200Mbps but unless the Cassandra service shares that same throughput
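To put the 200Mbps figure in perspective, the back-of-the-envelope sketch below converts megabits per second to megabytes per second and estimates how long a given amount of data would take to stream at that rate. It assumes decimal units (1 GB = 1000 MB), a sustained rate with no other bottleneck, and uses the 152 GB average node size mentioned in this thread purely as an example input; a joining node does not necessarily receive exactly that amount.

```python
def mbps_to_megabytes_per_sec(megabits_per_sec: float) -> float:
    """Convert a rate in megabits per second to megabytes per second."""
    return megabits_per_sec / 8


def hours_to_stream(data_gb: float, throughput_mbps: float) -> float:
    """Estimate hours needed to move data_gb gigabytes at throughput_mbps.

    Assumes decimal units (1 GB = 1000 MB = 8000 megabits) and a rate that
    is sustained for the whole transfer.
    """
    megabits = data_gb * 1000 * 8
    seconds = megabits / throughput_mbps
    return seconds / 3600


if __name__ == "__main__":
    print(mbps_to_megabytes_per_sec(200))
    print(round(hours_to_stream(152, 200), 2))
```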

Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

2015-04-23 Thread Anuj Wadehra
Great !!! Thanks Andrei !!! That's the answer I was looking for :) Thanks Anuj Wadehra Sent from Yahoo Mail on Android From:Andrei Ivanov aiva...@iponweb.net Date:Thu, 23 Apr, 2015 at 11:57 pm Subject:Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists Just in

Re: What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Bongseo Jang
Thanks a lot Carlos, Sebastian :-) My test was with 1 node/1 replica settings, on which I assumed client request = read request on the graph. Because there seems no read_repair and already CL=ONE in my case, I need more explanation, don't I? Or can any other internals be still involved? Do you

How much data is bootstrapping supposed to send?

2015-04-23 Thread Dave Galbraith
I had a one-node Cassandra 2.1.3 cluster, where the output of nodetool status looked like this: Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN

RE: RE: Cassandra tombstones being created by updating rows with TTL's

2015-04-23 Thread Walsh, Stephen
Thanks Anij, You are correct in your understanding of our setup. However, when we set the gc to 10 seconds it manages our tombstone count; any higher than 10 seconds and we start getting tombstone warnings. I think you're right, when I set the gc_grace to 0, I don’t believe the compaction kicked in

What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Bongseo Jang
I have Cassandra 2.1 + OpsCenter 5.1.1 and am testing them. When I monitored the OpsCenter 'read requests' graph, the number on the graph was not what I expected, namely the number of client requests or responses. I recorded the actual number of client requests and compared it with the graph, then found

Re: timeout creating table

2015-04-23 Thread Jimmy Lin
Also I am not sure it matters, but I just realized the keyspace created has replication factor of 2 when my Cassandra is really just a single node. Is Cassandra smart enough to ignore the RF of 2 and work with only 1 single node? On Mon, Apr 20, 2015 at 8:23 PM, Jimmy Lin y2klyf+w...@gmail.com

Re: What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Carlos Rolo
Probably it takes into account the read repair, plus a read that has consistency != 1 will produce reads on other machines (which are taken into account). I don't know the internals of OpsCenter but I would assume that this is the case. If you want to test it further, disable read_repair, and make

Creating 'Put' requests

2015-04-23 Thread Matthew Johnson
Hi all, Currently looking at switching from HBase to Cassandra, and one big difference so far is that in HBase, we create a ‘Put’ object, add to it a set of column/value pairs, and send the Put to the server. So far in Cassandra 2.1.4 the tutorials seem to suggest using CQL3, which I really

Re: Creating 'Put' requests

2015-04-23 Thread Jim Witschey
Are prepared statements what you're looking for? http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/quick_start/qsSimpleClientBoundStatements_t.html Jim Witschey Software Engineer in Test | jim.witsc...@datastax.com On Thu, Apr 23, 2015 at 9:28 AM, Matthew Johnson