Re: Multi-DC Repairs and Token Questions

2014-06-02 Thread Nick Bailey
See https://issues.apache.org/jira/browse/CASSANDRA-7317 On Mon, Jun 2, 2014 at 8:57 PM, Matthew Allen wrote: > Hi Rameez, Chovatia, (sorry I initially replied to Dwight individually) > > SN_KEYSPACE and MY_KEYSPACE are just typos (was trying to mask out > identifiable information), they are the same

Re: Paxos table gets larger when using 'IF NOT EXISTS'

2014-06-02 Thread Frederick Haebin Na
Really sorry, all. I should have investigated thoroughly before posting to the thread. Okay, paxos rows are written with paxosTtl, which defaults to max(3 * 3600, GC_GRACE_SECONDS). So, with the default configuration, I have to wait 10 days to see the data removed. Thank you. Haebin 2014-06-03 12
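
The TTL arithmetic in the message above can be checked with a small sketch. The function name `paxos_ttl_seconds` and the constant name are made up for illustration; the formula is the one quoted in the message, and 864000 seconds is the stock `gc_grace_seconds` default for a table, which matches the "10 days" mentioned.

```python
# Sketch of the TTL formula quoted above: max(3 * 3600, GC_GRACE_SECONDS).
# paxos_ttl_seconds and DEFAULT_GC_GRACE_SECONDS are illustrative names,
# not identifiers from the Cassandra code base.

DEFAULT_GC_GRACE_SECONDS = 864000  # default gc_grace_seconds, i.e. 10 days


def paxos_ttl_seconds(gc_grace_seconds):
    """Return the TTL (in seconds) applied to paxos rows per the quoted formula."""
    return max(3 * 3600, gc_grace_seconds)


if __name__ == "__main__":
    ttl = paxos_ttl_seconds(DEFAULT_GC_GRACE_SECONDS)
    print(f"paxos TTL: {ttl} s = {ttl / 86400:.1f} days")
```

Under this formula, lowering `gc_grace_seconds` on the data table shortens the wait, but never below the 3-hour floor.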

ANN: All Cassandra Resources Searchable

2014-06-02 Thread Otis Gospodnetic
Hi, We did this a while back, but never notified Cassandra community - we're indexing all Apache Cassandra resources - user/dev lists, wiki, website, JIRA, source code, and javadoc over at http://search-hadoop.com/cassandra . Some Apache projects, like Hadoop - http://hadoop.apache.org/ and HBase

Re: Paxos table gets larger when using 'IF NOT EXISTS'

2014-06-02 Thread Frederick Haebin Na
Again, sorry for spamming. I found out that there is no delete query for system.paxos. (2.0.6 code base, I'm using 2.0.7, tho.) Holy cow. So, it does not get deleted? Is this correct? Thanks. Haebin 2014-06-03 12:07 GMT+09:00 Frederick Haebin Na : > Sorry for spamming, folks. > > Okay, I fou

INSERT ... IF NOT EXISTS with some nodes unavailable

2014-06-02 Thread Ackerman, Mitchell
Hi, I'm trying to get a query using INSERT ... IF NOT EXISTS working when not all of the nodes are available. As a test case I have 2 nodes, one in AWS us-west-1, another in AWS eu-west-1. The keyspace settings are described below. When I only have one of the nodes available,

Re: Paxos table gets larger when using 'IF NOT EXISTS'

2014-06-02 Thread Frederick Haebin Na
Sorry for spamming, folks. Okay, I found out why it is growing. system.paxos row count is actually 3% greater than the data table. It may be caused by re-insertion of some data range. My ultimate question now would be why the paxos rows are not getting removed after the transaction. There is hi

Re: Paxos table gets larger when using 'IF NOT EXISTS'

2014-06-02 Thread Frederick Haebin Na
Sorry, I found out what is in system.paxos. select * from system.paxos limit 1; | row_key | cf_id | in_progress_ballot | most_recent_commit | most_recent_commit_at | proposal | proposal_ballot | I will investigate more on why it is growing continuously. If anyone has an idea why it is like

Paxos table gets larger when using 'IF NOT EXISTS'

2014-06-02 Thread Frederick Haebin Na
Hello all, We are trying to migrate data using the 'INSERT IF NOT EXISTS' clause. The strange thing is that the system.paxos table grows to about 100GB, which is equal to the size of the data table. Does anyone know what is happening here? What does the system.paxos table store? Thank you. Haebin

Re: Multi-DC Repairs and Token Questions

2014-06-02 Thread Matthew Allen
Hi Rameez, Chovatia, (sorry I initially replied to Dwight individually) SN_KEYSPACE and MY_KEYSPACE are just typos (was trying to mask out identifiable information), they are the same keyspace. Keyspace: SN_KEYSPACE: Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy Durable

Re: Multi-DC Environment Question

2014-06-02 Thread Matthew Allen
Hi Vasilis, With regards to Question 2. * | How tokens are being assigned when adding a 2nd DC? Is the range -2^64 to 2^63 for each DC, or it is -2^64 to 2^63 for the entire cluster? (I think the latter is correct), * Have you been able to deduce an answer to this (assuming Murmur3 Part
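
For context on the token question, here is a minimal sketch that assumes Murmur3Partitioner and one token per node. Murmur3 tokens are signed 64-bit values, so the ring runs from -2^63 to 2^63 - 1, and it is a single ring shared by the whole cluster rather than one per DC. A common manual approach is to space each DC's tokens evenly over the full range and shift the second DC by a small offset so that no two nodes share a token. The helper name and the offset of 100 are illustrative assumptions, not values taken from this thread.

```python
# Sketch: evenly spaced initial tokens per data center on one shared
# Murmur3 ring. evenly_spaced_tokens and the offset value are hypothetical.

MIN_TOKEN = -2**63      # smallest Murmur3Partitioner token
MAX_TOKEN = 2**63 - 1   # largest Murmur3Partitioner token
RING_SIZE = 2**64       # number of distinct tokens on the ring


def evenly_spaced_tokens(node_count, offset=0):
    """Return node_count tokens spread evenly over the ring, shifted by offset."""
    return [
        MIN_TOKEN + (i * RING_SIZE) // node_count + offset
        for i in range(node_count)
    ]


if __name__ == "__main__":
    dc1 = evenly_spaced_tokens(3)              # first DC, no offset
    dc2 = evenly_spaced_tokens(3, offset=100)  # second DC, shifted to avoid clashes
    print("DC1:", dc1)
    print("DC2:", dc2)
```

Each DC still covers the whole ring on its own, which is what NetworkTopologyStrategy relies on when it places replicas per DC.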

Re: python cql driver - cassandra.ReadTimeout - “Operation timed out - received only 1 responses.”

2014-06-02 Thread Alex Popescu
If I'm reading this correctly, what you are seeing is the read_timeout on Cassandra side and not the client side timeout. Even if you set the client side timeouts, the C* read & write timeouts are still respected on that side. On Mon, Jun 2, 2014 at 10:55 AM, Marcelo Elias Del Valle < marc...@s1m

Re: Cassandra snapshot

2014-06-02 Thread Jack Krupansky
You might check the doc: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html -- Jack Krupansky From: ng Sent: Monday, June 2, 2014 3:18 PM To: user@cassandra.apache.org Subject: Cassandra snapshot I need to make sure that all the data is in sstables

Re: Cassandra snapshot

2014-06-02 Thread Jeremy Jongsma
I wouldn't recommend doing this before regular backups for the simple reason that for large data sets it will take a long time to run, and will require that your node backup schedule be properly staggered (you should never be running repair on all nodes at the same time.) Backups should be trea

Re: Cassandra snapshot

2014-06-02 Thread Robert Coli
On Mon, Jun 2, 2014 at 12:18 PM, ng wrote: > > I need to make sure that all the data is in sstables before taking the > snapshot. > > I am thinking of > nodetool cleanup > Cleanup does nothing but waste i/o if you have not recently added, removed, or replaced nodes. > nodetool repair > Repair can

Cassandra snapshot

2014-06-02 Thread ng
I need to make sure that all the data is in sstables before taking the snapshot. I am thinking of nodetool cleanup nodetool repair nodetool flush nodetool snapshot Am I missing anything else? Thanks in advance for the responses/suggestions. ng
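
A minimal sketch of the flush-then-snapshot part of that sequence, leaving out cleanup and repair as the replies in this thread suggest. The keyspace name, snapshot tag, and helper names are placeholders; `nodetool flush` and `nodetool snapshot -t` are the standard subcommands. The script only prints the commands unless `dry_run` is turned off.

```python
# Sketch: build and (optionally) run the per-node backup commands.
# snapshot_commands, run, "my_keyspace" and "backup_tag" are hypothetical names.
import subprocess


def snapshot_commands(keyspace, tag):
    """Return the nodetool invocations in the order they should run."""
    return [
        ["nodetool", "flush", keyspace],                # write memtables out to sstables
        ["nodetool", "snapshot", "-t", tag, keyspace],  # hard-link the current sstables
    ]


def run(commands, dry_run=True):
    """Print each command, and execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run(snapshot_commands("my_keyspace", "backup_tag"))
```

Both commands act on the local node only, so a cluster-wide backup would repeat this on every node.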

Re: migration to a new model

2014-06-02 Thread Marcelo Elias Del Valle
Hi Jens, Thanks for trying to help. Indeed, I know I can't do it using just CQL. But what would you use to migrate data manually? I tried to create a python program using auto paging, but I am getting timeouts. I also tried Hive, but no success. I only have two nodes and less than 200Gb in this c

Re: migration to a new model

2014-06-02 Thread Jens Rantil
Hi Marcelo, Looks like you can't do this without migrating your data manually: https://stackoverflow.com/questions/18421668/alter-cassandra-column-family-primary-key-using-cassandra-cli-or-cql Cheers, Jens On Mon, Jun 2, 2014 at 7:48 PM, Marcelo Elias Del Valle < marc...@s1mbi0se.com.br> wrote:

Problems using Hive + Cassandra community

2014-06-02 Thread Marcelo Elias Del Valle
Hi, Has anyone used HIVE + Cassandra Community successfully? I am having problems mapping the keyspace, but I started wondering if only DSE has support for it. I am trying to use HIVE 0.13 to access cassandra 2.0.8 column families created with CQL3. Here is how I created my column families: CR

migration to a new model

2014-06-02 Thread Marcelo Elias Del Valle
Hi, I have some cql CFs in a 2 node Cassandra 2.0.8 cluster. I realized I created my column family with the wrong partition. Instead of: CREATE TABLE IF NOT EXISTS entity_lookup ( name varchar, value varchar, entity_id uuid, PRIMARY KEY ((name, value), entity_id)) WITH caching=all;

Re: Tune cache MB settings per table.

2014-06-02 Thread Robert Coli
On Sun, Jun 1, 2014 at 12:49 PM, Kevin Burton wrote: > It's possible to set caching to: > > all, keys_only, rows_only, or none > > .. for a given table. > > But we have one table which is MASSIVE and we only need the most recent > 4-8 hours in memory. > > Anything older than that can go to disk a

[RELEASE CANDIDATE] Apache Cassandra 2.1.0-rc1 released

2014-06-02 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of the first release candidate for the future Apache Cassandra 2.1.0. Let me first stress that this is not yet the final release of 2.1.0 and as such is *not* ready for production use. We however encourage as much testing as possible of this r

Re: Performance migrating from MySQL to C*

2014-06-02 Thread Simon Chemouil
32 cores are nearing 100%. We're only using SSDs and I believe we're cpu bound. However on the same dataset, same hardware, we get MySQL to answer just as fast with 1 core dedicated to the query (100%) and another few going up and down, leaving room for other queries (though they are still slightly

Re: Performance migrating from MySQL to C*

2014-06-02 Thread DuyHai Doan
"I tried this already but my datamodel still took most of our available CPUs and let very little room for other concurrent queries" It depends on your volumetry and cluster hardware config. If you do a lot of queries and have a lot of data on a few nodes, it's normal that the cluster is overloaded.

Re: Performance migrating from MySQL to C*

2014-06-02 Thread Simon Chemouil
Thanks for your reply. About that bit: > "If you get to this situation, the solution is not to monitor how strangled Cassandra is - the solution is to come up with a data model that avoids the strangulation. CQL is a nice syntactic layer, but, at the end of the day, to avoid performance black hol

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
I've studied DataStax .Net driver and it doesn't satisfy all of my needs. I wish it met 50% of my needs, but it doesn't. In theory, Nectar could use the DataStax .Net driver for native protocol, but I'd have to do a lot more work to make it fit with Hector's API. Some use cases can stick with CQL f

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Colin Clark
Peter, There's very little reason today to write your own Cassandra driver for .net, java, or python. Those firms that do are now starting to wrap those drivers with any specific functionality they might require, like Netflix, for example. Have you looked at DataStax's .NET driver? -- Colin +1

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
thanks for the correction. Maybe it's just me, but I wish the implementation were also in apache's repo. It's not a big thing, but having multiple github forks to keep track of is a bit annoying. I'd rather spend time coding instead of screwing with git on windows. On Mon, Jun 2, 2014 at 8:29 AM,

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Benedict Elliott Smith
The native protocol specification has always been in the Apache Cassandra repository. The implementations are not. On 2 June 2014 13:25, Peter Lin wrote: > > There's nothing preventing support for native protocol going forward. It > was easier to go with thrift and I happen to like thrift. Nati

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
There's nothing preventing support for native protocol going forward. It was easier to go with thrift and I happen to like thrift. Native protocol is still relatively new, so I'm taking a wait and see approach. Is the native protocol specification and drivers still in DataStax's git? If it's going

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Colin Clark
Unless a cassandra driver is using the native protocol, it's going to have a very short life going forward. -- Colin +1 320 221 9531 On Mon, Jun 2, 2014 at 7:10 AM, Peter Lin wrote: > > it is using thrift. I've updated the project page to state that info. > > > On Mon, Jun 2, 2014 at 8:08 AM,

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
it is using thrift. I've updated the project page to state that info. On Mon, Jun 2, 2014 at 8:08 AM, Colin Clark wrote: > Is your version of Hector using native protocol or thrift? > > -- > Colin > +1 320 221 9531 > > > > On Mon, Jun 2, 2014 at 6:41 AM, Peter Lin wrote: > >> >> I'm happy to a

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Colin Clark
Is your version of Hector using native protocol or thrift? -- Colin +1 320 221 9531 On Mon, Jun 2, 2014 at 6:41 AM, Peter Lin wrote: > > I'm happy to announce Concord has decided to open source our port of > Hector to .Net. > > The project is hosted on google code > https://code.google.com/p/

RE: Performance migrating from MySQL to C*

2014-06-02 Thread moshe.kranc
From your email, I understand your use case a bit better - I now see that you want to query not just by dataName, but also by sensorId. Still, it seems like the major filter for the query is the dataName (you search for a few dozen at a time). Within that, you want to filter on some (potentiall

Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
I'm happy to announce Concord has decided to open source our port of Hector to .Net. The project is hosted on google code https://code.google.com/p/nectar-client/ I'm still adding code documentation and wiki pages. It has been tested against 1.1.x, 2.0.x thanks peter

Re: Performance migrating from MySQL to C*

2014-06-02 Thread Simon Chemouil
Hi Moshe, Thanks for your answer and the link on time series. We'd like to query on more than one dataName, but also on the time range and on an arbitrary number of sensorIds. Which we can't seem to do with CQL because we can't have multiple IN clauses or IN clauses on the primary key. (hopefully

Re: Performance migrating from MySQL to C*

2014-06-02 Thread Simon Chemouil
Hello DuyHai, Thanks for your answer. Unfortunately having one 'metricType' field or storing each metric in different partitions is roughly the same for us if we have to "merge" them on the timestamp client-side. Your solution can reduce the number of queries which is practical but we might get qu