Re: Using a node in separate cluster without decommissioning.

2012-07-13 Thread rohit bhatia
Hi, just wanted to say that it worked. I also made sure to modify the Thrift rpc_port and storage_port so that the two clusters don't interfere. Thanks for the suggestion. Thanks, Rohit. On Thu, Jul 12, 2012 at 10:01 AM, aaron morton aa...@thelastpickle.com wrote: Since replication factor is 2 in

Never ending manual repair after adding second DC

2012-07-13 Thread Bart Swedrowski
Hello everyone, I'm facing quite a weird problem with Cassandra since we added a secondary DC to our cluster, and we have totally run out of ideas; this email is a call for help/advice! The history looks like this: - we used to have 4 nodes in a single DC - running Cassandra 0.8.7 - RF:3 - around 50GB of data

Re: Cassandra and Tableau

2012-07-13 Thread Robin Verlangen
Thank you Aaron and Brian. We're currently investigating several options. Hadoop + Hive combo also seems a good choice as our input files are flat. I'll keep you up-to-date about our final decision. - Robin 2012/7/6 aaron morton aa...@thelastpickle.com Here are two links I've noticed in my

Re: Increased replication factor not evident in CLI

2012-07-13 Thread Dustin Wenz
It sounds plausible that that is what we are running into. All of our nodes report a replication factor of 2 (both using describe and show schema), even though the cluster reported that all schemas agree after I issued the change to 4. If this is related to the bug that you filed, it might also

Re: How to speed up data loading

2012-07-13 Thread Tupshin Harper
Any chance your server has been running for the last two weeks with the leap second bug? http://www.datastax.com/dev/blog/linux-cassandra-and-saturdays-leap-second-problem -Tupshin On Jul 12, 2012 1:43 PM, Leonid Ilyevsky lilyev...@mooncapital.com wrote: I am loading a large set of data into a

Cassandra Summit 2012

2012-07-13 Thread Jonathan Ellis
Hi all, The 2012 Cassandra Summit will be in San Jose on August 8. The 2011 Summit sold out with almost 500 attendees; this year we found a bigger venue to accommodate 700+. It's fantastic to see the Cassandra community grow like this! The 2012 Summit will have *four* talk tracks, plus the

2012 Cassandra MVP nominations

2012-07-13 Thread Jonathan Ellis
DataStax would like to recognize individuals who go above and beyond in their contributions to Apache Cassandra. To formalize this a little bit, we're creating an MVP program, the first of which will be announced at the Cassandra summit [1] in August. To make this program a success, we need your

Re: Increased replication factor not evident in CLI

2012-07-13 Thread Dustin Wenz
I was able to apply the patch in the cited bug report to the public source for version 1.1.2. It seemed pretty straightforward; six lines in MigrationManager.java were switched from System.currentTimeMillis() to FBUtilities.timestampMicros(). I then re-built the project by running 'ant
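A minimal sketch of why the timestamp unit can matter, assuming updates are reconciled by comparing numeric timestamps (the highest one wins). The helper names and the stored values below are hypothetical and only illustrate the unit mismatch; they are not taken from the patch or from MigrationManager.java.

```python
import time


def timestamp_millis() -> int:
    """Current time in milliseconds since the epoch."""
    return int(time.time() * 1000)


def timestamp_micros() -> int:
    """Current time in microseconds since the epoch."""
    return int(time.time() * 1_000_000)


def reconcile(a, b):
    """Keep whichever (timestamp, value) pair has the higher timestamp."""
    return a if a[0] >= b[0] else b


# Hypothetical values: an existing entry stamped in microseconds,
# then a later update that was stamped in milliseconds by mistake.
existing = (timestamp_micros(), "replication_factor=2")
update = (timestamp_millis(), "replication_factor=4")

# The millisecond stamp is roughly 1000x smaller, so the later update loses.
winner = reconcile(existing, update)
```

In this toy model the update is silently discarded even though it was written later, because both stamps are compared as plain numbers.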

Re: SSTable format

2012-07-13 Thread Dave Brosius
On 07/13/2012 08:00 PM, Michael Theroux wrote: Hello, I've been trying to understand in greater detail how SSTables are stored, and how information is transferred between Cassandra nodes, especially when a new node is joining a cluster. Specifically, is information stored to SSTables ordered

Re: SSTable format

2012-07-13 Thread Rob Coli
On Fri, Jul 13, 2012 at 5:18 PM, Dave Brosius dbros...@baybroadband.net wrote: It depends on what partitioner you use. You should be using the RandomPartitioner, and if so, the rows are sorted by the hash of the row key. There are partitioners that sort based on the raw key value but these

Re: SSTable format

2012-07-13 Thread Dave Brosius
While in memory, Cassandra calls it a Memtable, but yes, SSTables are write-once and are later combined with others into new ones through compaction. On 07/13/2012 09:54 PM, Michael Theroux wrote: Thanks for the information. So is the SSTable essentially kept in memory, then sorted and written to
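The write path described above can be sketched as a toy model: writes go to an in-memory table, a flush turns it into an immutable sorted table, and compaction merges several such tables into one. The class and method names are hypothetical, and this is not Cassandra's actual on-disk format or compaction strategy.

```python
from typing import Dict, List, Optional, Tuple

# An immutable table: (key, value) pairs sorted by key.
SSTable = Tuple[Tuple[str, str], ...]


class ToyStore:
    def __init__(self) -> None:
        self.memtable: Dict[str, str] = {}   # mutable, in memory
        self.sstables: List[SSTable] = []    # write-once, oldest first

    def write(self, key: str, value: str) -> None:
        self.memtable[key] = value

    def flush(self) -> None:
        """Sort the memtable and freeze it as a new write-once table."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def compact(self) -> None:
        """Merge all tables into one new table; newer values win."""
        merged: Dict[str, str] = {}
        for table in self.sstables:          # oldest first, later overwrite
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []

    def read(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest table first
            for k, v in table:
                if k == key:
                    return v
        return None


store = ToyStore()
store.write("a", "1")
store.flush()
store.write("a", "2")
store.write("b", "3")
store.flush()
store.compact()
```

Existing tables are never modified in place here: compaction builds a new table and replaces the old ones, which mirrors the write-once property mentioned in the reply.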

Re: SSTable format

2012-07-13 Thread prasenjit mukherjee
It depends on what partitioner you use. You should be using the RandomPartitioner, and if so, the rows are sorted by the hash of the row key. There are partitioners that sort based on the raw key value, but these partitioners shouldn't be used as they have problems due to uneven partitioning
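A rough illustration of "sorted by the hash of the row key", assuming an MD5-based token as RandomPartitioner uses. The function names and sample keys are hypothetical, and the token derivation is a simplification rather than Cassandra's exact code.

```python
import hashlib
from typing import Iterable, List


def md5_token(row_key: bytes) -> int:
    """Derive a non-negative integer token from the MD5 digest of the key."""
    digest = hashlib.md5(row_key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def storage_order(row_keys: Iterable[bytes]) -> List[bytes]:
    """Order keys by token rather than by their raw byte value."""
    return sorted(row_keys, key=md5_token)


keys = [b"user:1", b"user:2", b"user:3", b"user:4"]
ordered = storage_order(keys)
```

Because the hash scatters keys, adjacent raw keys generally do not end up next to each other, which spreads load evenly but makes range scans over raw key order impractical.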