mixed cluster 0.6.9 and 0.6.12

2011-03-09 Thread Daniel Doubleday
Hi all, we are still on 0.6.9 and plan to upgrade to 0.6.12, but are a little concerned about https://issues.apache.org/jira/browse/CASSANDRA-2170. I thought of upgrading only one node (of 5) to .12 and monitoring it for a couple of days. Is this a bad idea? Thanks, Daniel

Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Hi Everyone, Now that I'm past the problems of IP addresses changing ... I am onto the idea of storage. Initially I had thought that for each Cassandra instance, I should have an EBS volume to store all the Cassandra data / information. Now I'm starting to wonder if this is duplication and not nec

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I'm considering similar issues right now. The problem with ephemeral storage is I don't know an easy way to back it up, while on an EBS it's a simple snapshot API call. Otherwise, I believe the performance of the ephemeral (certainly in the case of large or greater, where you can RAID0 multiple d

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
well, this is what i'm getting at. why would you want to back it up if the cluster is working properly? backup is silly ; ) On Wed, Mar 9, 2011 at 4:54 PM, William Oberman wrote: > I'm considering similar issues right now. The problem with ephemeral > storage is I don't know an easy way to

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
For me, to transition production data into a development environment for real world testing. Also, backups are never a bad idea, though I agree most all risk is mitigated due to cassandra's design. will On Wed, Mar 9, 2011 at 10:57 AM, Sasha Dolgy wrote: > > well, this is what i'm getting at.

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Jeremy Hanna
I've seen both sides but Cassandra does handle replication and bringing data back is a matter of bootstrapping a node to replace the downed node. One thing to consider is availability zones and regions though. What happens if your entire cluster goes down in the case of a single datacenter go

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Could you not nodetool snapshot the data into a mounted ebs/s3 bucket and satisfy your development requirement? -sd On Wed, Mar 9, 2011 at 5:23 PM, William Oberman wrote: > For me, to transition production data into a development environment for > real world testing. Also, backups are never a b

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I thought nodetool snapshot writes the snapshot locally, requiring 2x the expensive storage allocation 24x7 (vs. the cheap storage allocation of an EBS snapshot). By that I mean EBS allocation costs one rate per GB allocated per month, whereas EBS snapshots are delta-compressed copies to S3. Can you poin

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Hi Will, http://wiki.apache.org/cassandra/Operations#Backing_up_data If the snapshot is written to the ephemeral storage ... there isn't a cost. (i need to confirm that) You can then move this to an S3 bucket with RDS if you want or fu

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
I haven't done backups yet, so I don't know where the data is written. Is it where the nodetool is run from? Or local to the instance running cassandra (and there, local to the data directory?). I assumed it was the latter (not finding docs on that yet), and that would require 2x storage allocat

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Dave Viner
Sasha, You might also check out http://coreyhulen.org/category/cassandra/ for speed tests done by Corey Hulan on different disk configurations (both inside ec2 and on real hw). If you write to the ephemeral storage on an EC2 instance, there is no additional cost for the data written. Mostly sim

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Frank LoVecchio
> > Now that I'm past the problems of IP addresses changing ... I am onto the > idea of storage. Initially I had thought that for each cassandra instance, I > should have an EBS volume to store all the cassandra data / information. > Now I'm starting to wonder if this is duplication and not necessa

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Sasha Dolgy
Hi will, Quickly did a snapshot: nodetool -h 10.0.0.2 -p 8080 snapshot 09032011 The snapshots end up in the data dir for cassandra. The default is /var/lib/cassandra/data//snapshots/ In this directory i have: 1299689801925-09032011 -sd On Wed, Mar 9, 2011 at 5:54 PM, William Oberman wrote:

removing a node

2011-03-09 Thread Sasha Dolgy
Hi there, Wanted to clarify with anyone ... re: http://wiki.apache.org/cassandra/Operations#Removing_nodes_entirely You can take a node out of the cluster with nodetool decommission to a live node, or nodetool removetoken (to any other machine) to remove a dead one. This will assign the ranges th

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Erik Onnen
I'd recommend not storing commit logs or data files on EBS volumes if your machines are under any decent amount of load. I say that for three reasons. First, both EBS volumes contend directly for network throughput with what appears to be a peer QoS policy to standard packets. In other words, if y

Re: Reducing memory footprint

2011-03-09 Thread Casey Deccio
On Sat, Mar 5, 2011 at 7:37 PM, aaron morton wrote: > There is some additional memory usage in the JVM beyond that Heap size, in > the permanent generation. 900mb sounds like too much for that, but you can > check by connecting with JConsole and looking at the memory tab. You can > also check the

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
This is excellent, specific feedback. Thanks! Given the relative costs, I was hoping L was the optimal tradeoff vs XL, but if that's the best option, that's the best option. will On Wed, Mar 9, 2011 at 12:04 PM, Erik Onnen wrote: > I'd recommend not storing commit logs or data files on EBS vo

Re: Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-09 Thread Jonathan Ellis
On Wed, Mar 9, 2011 at 1:56 AM, Aditya Narayan wrote: > so this means that in memtable only the most recent version of a > column will reside? The most recent version seen since the memtable was opened, which may not be the most recent version ever. > For this implementation, while writing "to >
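
A rough illustration of the answer above, as a minimal Python sketch rather than Cassandra's actual code: the class name `Memtable`, its methods, and the tie-breaking rule (an equal timestamp keeps the existing value) are all assumptions made for the example. It shows an in-memory table that replaces a column when a newer timestamp arrives instead of appending every write, so it only ever holds the most recent version it has seen since it was opened.

```python
class Memtable:
    """Hypothetical in-memory table: one entry per (row key, column name)."""

    def __init__(self):
        # (row_key, column) -> (timestamp, value)
        self._columns = {}

    def write(self, row_key, column, value, timestamp):
        key = (row_key, column)
        current = self._columns.get(key)
        # Replace in place only when the incoming write is newer.
        # Assumption: on equal timestamps the existing value is kept.
        if current is None or timestamp > current[0]:
            self._columns[key] = (timestamp, value)

    def read(self, row_key, column):
        entry = self._columns.get((row_key, column))
        return None if entry is None else entry[1]

    def __len__(self):
        return len(self._columns)


memtable = Memtable()
memtable.write("user1", "email", "old@example.com", timestamp=1)
memtable.write("user1", "email", "new@example.com", timestamp=2)
memtable.write("user1", "email", "stale@example.com", timestamp=1)  # older, ignored
```

After these three writes the table holds a single entry for the column. A version that was flushed to disk before this memtable was opened would not be visible here, which is the "may not be the most recent version ever" caveat.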

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread William Oberman
Based on Erik's email, it sounds like EBS is a no go from the start. But given your snapshot feedback, it seems like you have to plan on leaving slack on every disk, and the % of slack depends on the size of a snapshot relative to the data (given the snapshot shares the disk with the data, at leas

Re: mixed cluster 0.6.9 and 0.6.12

2011-03-09 Thread Jonathan Ellis
We haven't been able to reproduce 2170 and have several customers on .12. Mixing .9 and .12 should be fine. On Wed, Mar 9, 2011 at 4:29 AM, Daniel Doubleday wrote: > Hi all > > we are still on 0.6.9 and plan to upgrade to 0.6.12 but are a little > concerned about: > > https://issues.apache.org/

Re: Amazon EC2 & Cassandra to ebs or not..

2011-03-09 Thread Jonathan Ellis
Right, local snapshot is no-cost both from an EC2 pricing standpoint and from a disk usage standpoint (because it uses hard links). On Wed, Mar 9, 2011 at 10:48 AM, Sasha Dolgy wrote: > Hi Will, > http://wiki.apache.org/cassandra/Operations#Backing_up_data > If the snapshot is written to the ephe
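
To make the hard-link point concrete, here is a small Python sketch of the filesystem mechanism only. It is not how nodetool is implemented; the helper name `snapshot_via_hard_link` and the file name `example-Data.db` are made up for the illustration. A hard link is a second directory entry for the same inode, so the "snapshot" takes no additional data blocks while the original file exists.

```python
import os
import tempfile


def snapshot_via_hard_link(data_file, snapshot_dir):
    """Hypothetical helper: 'snapshot' a file by hard-linking it into snapshot_dir."""
    os.makedirs(snapshot_dir, exist_ok=True)
    link_path = os.path.join(snapshot_dir, os.path.basename(data_file))
    os.link(data_file, link_path)  # new directory entry, same underlying inode
    return link_path


def demo():
    with tempfile.TemporaryDirectory() as root:
        data_file = os.path.join(root, "example-Data.db")  # made-up file name
        with open(data_file, "wb") as f:
            f.write(b"x" * 4096)
        link = snapshot_via_hard_link(
            data_file, os.path.join(root, "snapshots", "09032011")
        )
        original, linked = os.stat(data_file), os.stat(link)
        return {
            "same_inode": original.st_ino == linked.st_ino,
            "link_count": original.st_nlink,
            "size": linked.st_size,
        }


result = demo()
```

Both paths report the same inode and a link count of 2. The space is shared only while both names exist: if the original file is later deleted, the snapshot link keeps the blocks allocated, so old snapshots can still end up holding disk space.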

Re: Reducing memory footprint

2011-03-09 Thread Jonathan Ellis
I edited Peter Schuller's reply last time this came up into a FAQ: http://wiki.apache.org/cassandra/FAQ#mmap On Wed, Mar 9, 2011 at 11:10 AM, Casey Deccio wrote: > On Sat, Mar 5, 2011 at 7:37 PM, aaron morton > wrote: >> >> There is some additional memory usage in the JVM beyond that Heap size,

Removing a node ...

2011-03-09 Thread Sasha Dolgy
mail servers keep catching up in spam filters. Something about being a loyal gmail user ...! Let's try this again: further to this ...using cassandra 0.7.0 nodetool -h 10.0.0.1 -p 8080 decommission INFO [RMI TCP Connection(2)-10.0.0.1] 2011-03-09 16:52:22,226 StorageService.java (line 399) Lea

RE: build.xml issue with 0.7.3?

2011-03-09 Thread Paul Choi
Eric, Thanks for your response. This is all I see in /build: [paulchoi@build02 build]$ find . . ./lib ./maven-ant-tasks-2.1.1.jar [paulchoi@build02 build]$ I just downloaded a new copy of 0.7.3-src and tried manually. I'm still running into the same problem. I tried doing this with 0.7.0, and a

cassandra and G1 gc

2011-03-09 Thread ruslan usifov
Hello, does anybody use G1 gc in production? What are your impressions?

RE: build.xml issue with 0.7.3?

2011-03-09 Thread Paul Choi
Ah, the plot thickens... I downloaded Ant 1.8.2, tried building Cassandra 0.7.3. That works fine. CentOS 5 comes with ant 1.6.5, and that does not work with Cassandra 0.7.3 I see in Cassandra's changelog that, in 0.7.1, ivy was replaced with maven-ant-tasks. Now, maven-ant-tasks requires ant 1.

Re: build.xml issue with 0.7.3?

2011-03-09 Thread Stephen Connolly
there is no ivy any more. drop me a mail with details of your _exact_ ANT version and JDK and i'll see if I can diagnose your issues -Stephen On 9 March 2011 18:51, Paul Choi wrote: > Eric, > Thanks for your response. > > This is all I see in /build: > [paulchoi@build02 build]$ find . > . > ./

Re: Removing a node ...

2011-03-09 Thread Jonathan Ellis
Did you check log for errors? (ASF mailing lists prefer plain text; html has a higher spam score.) On Wed, Mar 9, 2011 at 11:59 AM, Sasha Dolgy wrote: > mail servers keep catching up in spam filters.  Something about being a > loyal gmail user ...! > Let's try this again: > > further to this ...

Re: cassandra and G1 gc

2011-03-09 Thread Jonathan Ellis
Our testing indicates G1 is not significantly better than CMS. On Wed, Mar 9, 2011 at 1:00 PM, ruslan usifov wrote: > Hello > > Does anybody use G1 gc in production? What are your impressions? > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional

Re: cassandra and G1 gc

2011-03-09 Thread Peter Schuller
> Does anybody use G1 gc in production? What are your impressions? I don't know if anyone does, but I've tested it very briefly and at least seen it run well for a while ;) (JDK 1.7 trunk builds) I do have some comments about expected behavior though. But first, for those not familiar with G1, the ma

Re: cassandra and G1 gc

2011-03-09 Thread Peter Schuller
> I don't know whether this particular problem would in fact be an issue > for Cassandra. Extended long-term testing would probably be required > under real workloads of different kinds to determine whether G1 seems > suitable in its current condition. But honestly, for any case where you have a l

Re: Removing a node ...

2011-03-09 Thread Sasha Dolgy
Hi, Checked logs on node where decommission command was performed and on other nodes, and no error messages. Just info messages. Although the behaviour and circumstances are exact, I'm wondering if https://issues.apache.org/jira/browse/CASSANDRA-2072 has something to do with it. Again, this is

Re: problem with bootstrap

2011-03-09 Thread aaron morton
The definition of "down" is important here. Down refers to a node that has joined the ring, so the other nodes know of its existence and the range it is storing, but which is not responding to gossip messages. While it is down it is still considered an endpoint. The error you and Patrik saw refer

Re: removing a node

2011-03-09 Thread aaron morton
yes. dead normally means machine cannot be started. Aaron On 10/03/2011, at 6:01 AM, Sasha Dolgy wrote: > > Hi there, > > Wanted to clarify with anyone ... re: > http://wiki.apache.org/cassandra/Operations#Removing_nodes_entirely > > You can take a node out of the cluster with nodetool dec

Re: problem with bootstrap

2011-03-09 Thread mcasandra
Thanks! aaron morton wrote: > > > The issue I think you and Patrik are seeing occurs when you *remove* nodes > from the ring. The ring does not know if they are up or down. E.g. you > have a ring of 3 nodes, and add a keyspace with RF 3. Then for whatever > reason 2 nodes are removed from the ri

Re: Reducing memory footprint

2011-03-09 Thread aaron morton
Casey, It sounds like the JVM is behaving. Perhaps turn off mmapped disk_access to double check that the number you are seeing as resident does not include the mapped memory? Aaron On 10/03/2011, at 6:36 AM, Jonathan Ellis wrote: > I edited Peter Schuller's reply last time this came

Re: Removing a node ...

2011-03-09 Thread Jonathan Ellis
I think 2072 is something different but you should definitely upgrade before troubleshooting further. On Wed, Mar 9, 2011 at 2:00 PM, Sasha Dolgy wrote: > Hi, > > Checked logs on node where decommission command was performed and on > other nodes, and no error messages.  Just info messages.  Altho

Re: Removing a node ...

2011-03-09 Thread Sasha Dolgy
Ok ... I have been very hesitant to upgrade from 0.7.0 because I haven't really had many problems and the comments on 0.7.1 and 0.7.2 weren't that encouraging. So testing on 10.0.0.1 I set up 0.7.3 and have it auto-bootstrap: 10.0.0.3 Up Normal 229.26 KB 40.78% 240530881901956

Re: Removing a node ...

2011-03-09 Thread Jonathan Ellis
On Wed, Mar 9, 2011 at 3:11 PM, Sasha Dolgy wrote: > Ok ... I have been very hesitant to upgrade from 0.7.0 because I > haven't really had many problems and the comments on 0.7.1 and 0.7.2 > weren't that encouraging. 0.7.0 is worse. > So testing on 10.0.0.1 I set up 0.7.3 and have it aut

Re: Removing a node ...

2011-03-09 Thread Sasha Dolgy
nice .. thanks Jonathan for the quick feedback. suppose then, for the time being, until the fix for 2283 is applied i guess i'll live with no decommission and my dirty way of removing a node. -sd On Wed, Mar 9, 2011 at 10:24 PM, Jonathan Ellis wrote: > On Wed, Mar 9, 2011 at 3:11 PM, Sasha Dolg

understanding tombstones

2011-03-09 Thread Jeffrey Wang
Hey all, I was wondering if this is the expected behavior of deletes (0.7.0). Let's say I have a 1-node cluster with a single CF which has gc_grace_seconds = 0. The following sequence of operations happens (in the given order): insert row X with timestamp T delete row X with timestamp T+1 force

Understanding index builds

2011-03-09 Thread Matt Kennedy
I'm trying to gain some insight into what happens with a cluster when indexes are being built, or when CFs with indexed columns are being written to. Over the past couple of days we've been doing some loads into a CF with 29 indexed columns. Eventually, the nodes just got overwhelmed and the clie

Re: problem with bootstrap

2011-03-09 Thread aaron morton
Bootstrapping uses the same mechanisms as a repair to stream data from other nodes. This can be a heavyweight process and you may want to control when it starts. Joining the ring just tells the other nodes you exist and this is your token. A On 10/03/2011, at 9:27 AM, mcasandra wrote: >

Re: problem with bootstrap

2011-03-09 Thread mcasandra
aaron morton wrote: > > > The issue I think you and Patrik are seeing occurs when you *remove* nodes > from the ring. The ring does not know if they are up or down. E.g. you > have a ring of 3 nodes, and add a keyspace with RF 3. Then for whatever > reason 2 nodes are removed from the ring. When

Bug in fix for #2296?

2011-03-09 Thread Jason Harvey
I applied the #2296 patch and retried a scrub. Now getting thousands of the following: java.io.IOException: Keys must be written in ascending order. at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:111) at org.apache.cassandra.io.sstable.SSTableWri

Re: Understanding index builds

2011-03-09 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-2294 https://issues.apache.org/jira/browse/CASSANDRA-2295 On Wed, Mar 9, 2011 at 5:47 PM, Matt Kennedy wrote: > I'm trying to gain some insight into what happens with a cluster when > indexes are being built, or when CFs with indexed columns are bei

Modeling Multi-Valued Fields

2011-03-09 Thread Cameron Leach
Is there a best-practice for modeling multi-valued fields (fields that are repeated or collections of fields)? Our current data model allows for a User to store multiple email addresses: User { Integer id; //row key List emails; Email { String type; //home, work, gmail, hotmail, etc...

Re: understanding tombstones

2011-03-09 Thread Jonathan Ellis
On Wed, Mar 9, 2011 at 4:54 PM, Jeffrey Wang wrote: > insert row X with timestamp T > delete row X with timestamp T+1 > force flush + compaction > insert row X with timestamp T > > My understanding is that the tombstone created by the delete (and row X) > will disappear with the flush + compaction

RE: understanding tombstones

2011-03-09 Thread Jeffrey Wang
Yup. https://issues.apache.org/jira/browse/CASSANDRA-2305 -Jeffrey -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Wednesday, March 09, 2011 6:19 PM To: user@cassandra.apache.org Subject: Re: understanding tombstones On Wed, Mar 9, 2011 at 4:54 PM, Jeffrey Wang

Exception when running a clean up

2011-03-09 Thread Stu King
I am seeing this exception when I am trying to run a cleanup. I want to decommission the node after the cleanup. java.util.concurrent.ExecutionException: java.io.IOError: java.io.EOFException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.g

Re: Exception when running a clean up

2011-03-09 Thread aaron morton
What version of cassandra are you using and what is the upgrade history for the cluster? Aaron On 10/03/2011, at 8:24 PM, Stu King wrote: > I am seeing this exception when I am trying to run a cleanup. I want to > decommission the node after the cleanup. > > java.util.concurrent.ExecutionExcep