Issue with leveled compaction and data migration

2013-09-13 Thread Michael Theroux
Hello, We've been undergoing a migration on Cassandra 1.1.9 where we are combining two column families. We are incrementally moving data from one column family into another, where the columns in a row in the source column family are being appended to columns in a row in the target column

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-13 Thread Keith Freeman
Paul- Sorry to go off-list but I'm diving pretty far into details here. Ignore if you wish. Thanks a lot for the example, definitely very helpful. I'm surprised that the Cassandra experts aren't more interested-in/alarmed-by our results, it seems like we've proved that insert performance

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-13 Thread Nate McCall
https://github.com/Netflix/astyanax/issues/391 I've gotten in touch with a couple of netflix folks and they are going to try to roll a release shortly. You should be able to build against 1.2.2 and 'talking' to 1.2.9 instance should work. Just a PITA development wise to maintain a different

Re: Normal OS: Disk Throughput levels for EC2

2013-09-13 Thread David Ward
My apologies, information that should have been in my original email. m1.xlarges using a single raid0 ephemeral array for both data and the commit log. Latest burst write was ~150GB over 3 nodes ( rf 3 so 150GB per node ) with 8GB heap but no major spikes show up on the Opscenter graph for

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-13 Thread Nate McCall
Also, I was working on this a bit for a client so compiled my notes and approach into a blog post for posterity (and so it's easier to find for others): http://thelastpickle.com/blog/2013/09/13/CQL3-to-Astyanax-Compatibility.html Paul's method on this thread is cited at the bottom as well. On

is there any type of table existing on all nodes(slow to up date, fast to read in map/reduce)?

2013-09-13 Thread Hiller, Dean
I was just wondering if cassandra had any special CF that every row exists on every node for smaller tables that we would want to leverage in map/reduce. The table row count is less than 500k and we are ok with slow updates to the table, but this would make M/R blazingly fast since for every

Re: is there any type of table existing on all nodes(slow to up date, fast to read in map/reduce)?

2013-09-13 Thread Jon Haddad
It sounds like something that's only useful in a really limited use case. In an 11 node cluster, quorum reads / writes would need to come from 6 nodes. It would probably be much slower for both reads and writes. It sounds like what you want is a database with replication, not
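
The 6-of-11 figure above follows from the usual quorum rule, floor(RF / 2) + 1, applied to a table replicated to every node (RF equal to the node count). A minimal sketch of that arithmetic; the function name `quorum` is purely illustrative and is not part of any Cassandra API:

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write.

    Uses the standard rule floor(RF / 2) + 1. The helper itself is
    hypothetical and only illustrates the arithmetic.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


if __name__ == "__main__":
    # Assumption: RF is set to the node count, i.e. every row lives on every node.
    for node_count in (3, 11):
        print(f"RF={node_count}: quorum needs {quorum(node_count)} replicas")
```

With RF=11 every quorum operation waits on 6 replicas, which is the cost being pointed out for a table that lives on all nodes.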

Re: Normal OS: Disk Throughput levels for EC2

2013-09-13 Thread Nate McCall
This can vary pretty heavily by instance type and storage options. What size instances are these and how is the storage configured? On Fri, Sep 13, 2013 at 1:11 AM, David Ward da...@shareablee.com wrote: I noticed on EC2, the c* nodes according to OpsCenter have never gone above 1.6-2.2MBps.

Re: is there any type of table existing on all nodes(slow to up date, fast to read in map/reduce)?

2013-09-13 Thread Robert Coli
On Fri, Sep 13, 2013 at 10:47 AM, Hiller, Dean dean.hil...@nrel.gov wrote: I was just wondering if cassandra had any special CF that every row exists on every node for smaller tables that we would want to leverage in map/reduce. The table row count is less than 500k and we are ok with slow

Re: Normal OS: Disk Throughput levels for EC2

2013-09-13 Thread Alain RODRIGUEZ
You should give further information if you want an answer. What kind of instance is it? Instance store / EBS / Optimized EBS? Are you trying to read / write on this disk? How much? ... With m1.xlarge we reached 40 MBps and now with a hi1.4xlarge we haven't reached any limit yet, and we have 100+

Nodes separating from the ring

2013-09-13 Thread Dave Cowen
Hi, all - We've been running Cassandra 1.1.12 in production since February, and have experienced a vexing problem with an arbitrary node falling out of or separating from the ring on occasion. When a node falls out of the ring, running nodetool ring on the misbehaving node shows that the

Re: is there any type of table existing on all nodes(slow to up date, fast to read in map/reduce)?

2013-09-13 Thread Hiller, Dean
That's an interesting idea... so that would be an RF=1 in each data center... very interesting. Dean From: Jonathan Haddad j...@jonhaddad.com Reply-To: user@cassandra.apache.org

Re: is there any type of table existing on all nodes(slow to up date, fast to read in map/reduce)?

2013-09-13 Thread Robert Coli
On Fri, Sep 13, 2013 at 11:15 AM, Hiller, Dean dean.hil...@nrel.gov wrote: When I add nodes though, I would kind of be screwed there, right? Is there an RF=${nodecount}…that would be neat. Increasing replication factor is well understood, and in this case you could pre-load the entire