Re: Fatal exception in thread Thread[RequestResponseStage......

2011-08-19 Thread Benjamin Coverston
You should run nodetool scrub on the affected nodes. You should probably seriously consider upgrading to the most recent 0.7 release. On Thu Aug 18 22:30:36 2011, Anand Somani wrote: Hi I am using 0.7.4 and am seeing this exception in my logs a few times a day, should I be worried? Or is this

Re: fixing unbalanced cluster !?

2011-06-09 Thread Benjamin Coverston
Because you were able to successfully run repair you can follow up with a nodetool cleanup which will get rid of some of the extraneous data on that (bigger) node. You're also assured after you run repair that entropy between the nodes is minimal. Assuming you're using the random ordered

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Benjamin Coverston
Hi AJ, Unfortunately, for storage capacity planning it's a bit of a guessing game. Until you run your load against it and profile the usage you just are not going to know for sure. I have seen cases where planning to have 50% excess capacity/node was plenty, and I have seen other extreme

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Benjamin Coverston
to everyone who responded thus far. On 6/7/2011 10:16 AM, Benjamin Coverston wrote: snip Not to say that there aren't workloads where having many TB/Node doesn't work, but if you're planning to read from the data you're writing you do want to ensure that your working set is stored in memory. Thank

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-06 Thread Benjamin Coverston
Hi AJ, inline: On 6/6/11 11:03 PM, AJ wrote: Hi, I am working on a backup strategy and am trying to understand what is going on in the data directory. I notice that after a write to a CF and then flush, a new set of data files are created with an index number incremented in their names,

Re: java.lang.RuntimeException: Cannot recover SSTable with version a (current version f).

2011-05-05 Thread Benjamin Coverston
Hi Jeremiah, Did you try following up by running scrub? Did it help? Ben On 5/5/11 1:42 PM, Jeremiah Jordan wrote: Running repair and I am getting this error: java.lang.RuntimeException: Cannot recover SSTable with version a (current version f). at

Re: java.lang.RuntimeException: Cannot recover SSTable with version a (current version f).

2011-05-05 Thread Benjamin Coverston
Also, a nodetool cleanup would rebuild the SSTable to the most current version. On 5/5/11 1:42 PM, Jeremiah Jordan wrote: Running repair and I am getting this error: java.lang.RuntimeException: Cannot recover SSTable with version a (current version f). at

Re: Native heap leaks?

2011-05-05 Thread Benjamin Coverston
How many column families do you have? On 5/4/11 12:50 PM, Hannes Schmidt wrote: Hi, We are using Cassandra 0.6.12 in a cluster of 9 nodes. Each node is 64-bit, has 4 cores and 4G of RAM and runs on Ubuntu Lucid with the stock 2.6.32-31-generic kernel. We use the Sun/Oracle JDK. Here's the

Re: access.properties

2011-03-24 Thread Benjamin Coverston
Hi Hayden, What you are describing certainly seems useful. I am not aware of anyone using the security features of the SimpleAuthenticator anywhere in production. If you have a real world use case and would like to see the authenticator improved please open a JIRA ticket. If you have

Re: access.properties

2011-03-24 Thread Benjamin Coverston
Looking at the code I think your assumption is correct. When you choose the simple authority you have to explicitly set permissions. On 3/24/11 3:50 PM, Hayden Andrews wrote: Thanks for that Ben, Just to clarify: The current behavior is that if a user is given access to create and destroy

Re: URGENT HELP PLEASE!

2011-03-24 Thread Benjamin Coverston
Hi Jared, Sounds like you have two nodes in the cluster. What is your replication factor set to? 1? 2? Have you ever run repair? What consistency level do you use for reads and writes? From the way you are speaking it sounds like you are sending all of your traffic to a single node

Re: Argh: Data Corruption (LOST DATA) (0.7.0)

2011-03-04 Thread Benjamin Coverston
Hi Terje, Can you attach the portion of your logs that shows the exceptions indicating corruption? Which version are you on right now? Ben On 3/4/11 10:42 AM, Terje Marthinussen wrote: We are seeing various other messages as well related to deserialization, so this seems to be some random

Re: Cluster not starting up

2011-03-04 Thread Benjamin Coverston
The EOF exception looks like CASSANDRA-1992, which, if that is the problem, will be resolved by the scrub tool in 0.7.3. That release is being voted on right now. HTH, Ben On 3/4/11 10:32 AM, Matt Kennedy wrote: I'm currently the proud owner of an 8-node cluster that won't start up.

Re: time to live rows

2011-02-08 Thread Benjamin Coverston
On 2/8/11 1:23 PM, Kallin Nagelberg wrote: I did read those articles, but I didn't know that deleting all the columns on a row was equivalent to deleting the row. Like I mentioned, I did delete all the columns from all my rows and then forced compaction before and after gc_grace had