UnreachableNodes

2012-10-18 Thread Rene Kochen
I have a four-node EC2 cluster. Three machines show via nodetool ring that all machines are UP. One machine shows via nodetool ring that one machine is DOWN. If I take a closer look at the machine reporting the other machine as down, I see the following: - StorageService.UnreachableNodes =

Re: UnreachableNodes

2012-10-18 Thread aaron morton
You can double check that the node reporting 9.109 as down can telnet to port 7000 on 9.109. Then I would restart 9.109 with -Dcassandra.load_ring_state=false added as a JVM param in cassandra-env.sh. If it still shows as down, can you post the output from nodetool gossipinfo from 9.109 and the
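The telnet check suggested in this reply can also be scripted. What follows is a minimal sketch and not part of the original thread: `can_connect` is a hypothetical helper name, and the host is left as a parameter because the addresses in the message are abbreviated. Port 7000 is the port named in the message.

```python
import socket


def can_connect(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

Running this from the node that reports its peer as down, and again in the other direction, gives the same information as the two manual telnet attempts.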

Re: Cassandra nodes loaded unequally

2012-10-18 Thread aaron morton
At times of high load check the CPU % for the java service running C* to confirm C* is the source of load. If the load is generated by C*, check the logs (or use OpsCenter / other monitoring) to see if it correlates with compaction, garbage collection, repair, or high throughput. Cheers

Re: run repair on each node or every R nodes?

2012-10-18 Thread aaron morton
Without -pr the repair works on all token ranges the node is a replica for. With -pr it only repairs data in the token range it is assigned. In your case when you ran it on node 0 with RF the token range from node 0 was repaired on nodes 0, 1 and 2. The other token ranges on nodes 0, 1 and 2
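The difference between the two modes can be illustrated with a small model. This is only a sketch under stated assumptions, none of which come from the thread: a ring of 6 nodes, a replication factor of 3, and a simple placement where the range owned by node i is also stored on the next RF - 1 nodes around the ring. All function names are hypothetical, and real placement depends on the replication strategy in use.

```python
def replicas_for_range(owner, num_nodes, rf):
    """Nodes holding the range whose primary owner is `owner` (assumed placement)."""
    return [(owner + i) % num_nodes for i in range(rf)]


def ranges_repaired(node, num_nodes, rf, primary_only):
    """Ranges (identified by primary owner) that a repair started on `node` covers."""
    if primary_only:
        # Models -pr: only the range the node is assigned.
        return [node]
    # Models a plain repair: every range the node is a replica for.
    return sorted(
        owner
        for owner in range(num_nodes)
        if node in replicas_for_range(owner, num_nodes, rf)
    )


def times_each_range_is_repaired(num_nodes, rf, primary_only):
    """How often each range is repaired if repair runs once on every node."""
    counts = [0] * num_nodes
    for node in range(num_nodes):
        for owner in ranges_repaired(node, num_nodes, rf, primary_only):
            counts[owner] += 1
    return counts
```

In this model, running the -pr variant on every node covers each range exactly once, while running the plain variant on every node repairs each range RF times. That redundancy is why the question of repairing "each node or every R nodes" comes up.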

Re: Missing non composite column

2012-10-18 Thread aaron morton
Yes, I understand that. The reason why I am asking is, with this I need to split them to get the actual column name using : as a separator. The : is an artefact of the cassandra-cli, not something you will have to deal with via the Thrift API. Internally we do not store the values with :

Re: Astyanax empty column check

2012-10-18 Thread aaron morton
Very slim reason to link to my favourite Joe Celko (http://en.wikipedia.org/wiki/Joe_Celko) quote: 'LOL! My wife is an ordained Soto Zen priest. I would say after 30 years together, I'd go with her. She is the only person who understood NULLs immediately.

Re: RF update

2012-10-18 Thread aaron morton
Follow up question: Is it safe to abort the compactions happening after node repair? It is always safe to abort a compaction. The purpose of compaction is to replicate the current truth in a more compact format. It does not modify data, it just creates new files. The worst case would be

how to get column type?

2012-10-18 Thread Hagos, A.S.
Hi all, I am wondering if there is a way to know the column type of an already stored value in Cassandra. My specific case is to get a column value of a known column name but not type. greetings Ambes

Re: potential data loss in Cassandra 1.1.0 .. 1.1.4

2012-10-18 Thread Alain RODRIGUEZ
Hi Jonathan. We are currently running the datastax AMI on amazon. Cassandra is in version 1.1.2. I guess that the datastax repo (deb http://debian.datastax.com/community stable main) will be updated directly in 1.1.6 ? Replaying already-flushed data a second time is harmless -- except for

Re: how to get column type?

2012-10-18 Thread Hiller, Dean
This is specifically why Cassandra and even PlayOrm are going in the direction of partial schemas. Everything in Cassandra in raw form is just bytes. If you don't tell it the types, it doesn't know how to translate it. PlayOrm and other ORM layers are the same way though in these NoSQL ORMs you
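The point that a stored value is just bytes can be shown with a short sketch. It is illustrative only and does not use any Cassandra client: it assumes a value written as an 8-byte big-endian long, and the helper names `as_long` and `as_utf8` are hypothetical.

```python
import struct

# The same 8 bytes, as they might sit in a column value.
raw = struct.pack(">q", 42)


def as_long(value):
    """Interpret the bytes as a signed 8-byte big-endian integer."""
    return struct.unpack(">q", value)[0]


def as_utf8(value):
    """Interpret the same bytes as UTF-8 text."""
    return value.decode("utf-8")
```

Both interpretations succeed on `raw`, but only one of them yields the number that was written. Nothing in the bytes themselves says which is right, which is why the type has to come from a schema or from the application.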

replaced node keeps returning in gossip

2012-10-18 Thread Thomas van Neerijnen
Hi all, I'm running Cassandra 1.0.11 on Ubuntu 11.10. I've got a ghost node which keeps showing up on my ring. A node living on IP 10.16.128.210 and token 0 died and had to be replaced. I replaced it with a new node, IP 10.16.128.197 and again token 0 with a -Dcassandra.replace_token=0 at

Re: UnreachableNodes

2012-10-18 Thread Rene Kochen
Thanks Aaron, Telnet works (in both directions). After a normal (i.e. without discarding ring state) restart of the node reporting the other one as down, the ring shows up again. So a node restart fixes the incorrect state. I see this error occasionally. I will further investigate and post

Re: Why my Cassandra is compacting like mad

2012-10-18 Thread Bryan
I think I am seeing the same issue, but it doesn't seem to be related to the schema_columns. I understand that repair is supposed to be intensive, but this is bringing the associated machine to its knees, to the point that logging on the machine takes a very, very long time and requests are no

constant CMS GC using CPU time

2012-10-18 Thread Bryan Talbot
In a 4 node cluster running Cassandra 1.1.5 with Sun JVM 1.6.0_29-b11 (64-bit), the nodes are often getting stuck in a state where CMS collections of the old space are constantly running. The JVM configuration is using the standard settings in cassandra-env -- relevant settings are included below.

Hinted Handoff runs every ten minutes

2012-10-18 Thread Stephen Pierce
I installed Cassandra on three nodes. I then ran a test suite against them to generate load. The test suite is designed to generate the same type of load that we plan to have in production. As one of many tests, I reset one of the nodes to check the failure/recovery modes. Cassandra worked

Re: Hinted Handoff runs every ten minutes

2012-10-18 Thread David Daeschler
Hi Steve, Also confirming this. After having a node go down on Cassandra 1.0.8 there seems to be hinted handoff between two of our 4 nodes every 10 minutes. Our setup also shows 0 rows. It does not appear to have any effect on the operation of the ring, just fills up the log files. - David On

hadoop consistency level

2012-10-18 Thread Andrey Ilinykh
Hello, everybody! I'm thinking about running Hadoop jobs on top of the Cassandra cluster. My understanding is that Hadoop jobs read data from local nodes only. Does it mean the consistency level is always ONE? Thank you, Andrey

Re: hadoop consistency level

2012-10-18 Thread Jean-Nicolas Boulay Desjardins
Why don't you look into Brisk: http://www.datastax.com/docs/0.8/brisk/about_brisk On Thu, Oct 18, 2012 at 2:46 PM, Andrey Ilinykh ailin...@gmail.com wrote: Hello, everybody! I'm thinking about running hadoop jobs on the top of the cassandra cluster. My understanding is - hadoop jobs read data

Re: hadoop consistency level

2012-10-18 Thread William Oberman
A recent thread made it sound like Brisk was no longer a DataStax-supported thing (it's DataStax Enterprise, or DSE, now): http://www.mail-archive.com/user@cassandra.apache.org/msg24921.html In particular this response: http://www.mail-archive.com/user@cassandra.apache.org/msg25061.html On Thu,

Re: hadoop consistency level

2012-10-18 Thread Michael Kjellman
Unless you have Brisk (however as far as I know there was one fork that got it working on 1.0 but nothing for 1.1 and is not being actively maintained by Datastax) or go with CFS (which comes with DSE) you are not guaranteed all data is on that hadoop node. You can take a look at the forks if

Re: hadoop consistency level

2012-10-18 Thread Jean-Nicolas Boulay Desjardins
I am surprised that it was abandoned this way. So if I want to use Brisk on Cassandra 1.1 I have to use the DataStax Enterprise service... On Thu, Oct 18, 2012 at 3:00 PM, Michael Kjellman mkjell...@barracuda.com wrote: Unless you have Brisk (however as far as I know there was one fork that got it

Re: hadoop consistency level

2012-10-18 Thread Michael Kjellman
Honestly, I think what they did re Brisk development is fair. They left the code for any of us in the community to improve it and make it compatible with newer versions and they need to make money as a company as well. They already contribute so much to the Cassandra community in general and they

Re: hadoop consistency level

2012-10-18 Thread Andrey Ilinykh
On Thu, Oct 18, 2012 at 12:00 PM, Michael Kjellman mkjell...@barracuda.com wrote: Unless you have Brisk (however as far as I know there was one fork that got it working on 1.0 but nothing for 1.1 and is not being actively maintained by Datastax) or go with CFS (which comes with DSE) you are not

Re: hadoop consistency level

2012-10-18 Thread Michael Kjellman
Well there is *some* data locality, it's just not guaranteed. My understanding (and someone correct me if I'm wrong) is that ColumnFamilyInputFormat implements InputSplit and the getLocations() method. http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapre

Re: hadoop consistency level

2012-10-18 Thread Andrey Ilinykh
On Thu, Oct 18, 2012 at 1:24 PM, Michael Kjellman mkjell...@barracuda.com wrote: Well there is *some* data locality, it's just not guaranteed. My understanding (and someone correct me if I'm wrong) is that ColumnFamilyInputFormat implements InputSplit and the getLocations() method.

Re: hadoop consistency level

2012-10-18 Thread Michael Kjellman
Not sure I understand your question (if there is one..) You are more than welcome to do CL ONE and assuming you have hadoop nodes in the right places on your ring things could work out very nicely. If you need to guarantee that you have all the data in your job then you'll need to use QUORUM. If
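The trade-off between ONE and QUORUM mentioned here follows from the usual replica-overlap rule. The sketch below is an illustration rather than anything stated in the thread: the function names are hypothetical, and the replication factor of 3 in the usage note is an assumption.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum for a given replication factor."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, write_replicas, read_replicas):
    """True if every read set must share at least one replica with every write set."""
    return write_replicas + read_replicas > replication_factor
```

With a replication factor of 3, a quorum is 2 replicas. Writing and reading at QUORUM gives 2 + 2 > 3, so a read always touches at least one replica that acknowledged the write. Reading at ONE after a QUORUM write gives 2 + 1 = 3, which does not guarantee overlap, so a job reading at ONE may miss recent data.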

Re: hadoop consistency level

2012-10-18 Thread Andrey Ilinykh
On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman mkjell...@barracuda.com wrote: Not sure I understand your question (if there is one..) You are more than welcome to do CL ONE and assuming you have hadoop nodes in the right places on your ring things could work out very nicely. If you need to

Re: hadoop consistency level

2012-10-18 Thread Bryan Talbot
I believe that reading with CL.ONE will still cause read repair to be run (in the background) 'read_repair_chance' of the time. -Bryan On Thu, Oct 18, 2012 at 1:52 PM, Andrey Ilinykh ailin...@gmail.com wrote: On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman mkjell...@barracuda.com wrote:

Re: hadoop consistency level

2012-10-18 Thread Jeremy Hanna
On Oct 18, 2012, at 3:52 PM, Andrey Ilinykh ailin...@gmail.com wrote: On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman mkjell...@barracuda.com wrote: Not sure I understand your question (if there is one..) You are more than welcome to do CL ONE and assuming you have hadoop nodes in the

Re: Cassandra nodes loaded unequally

2012-10-18 Thread Ben Kaehne
After some time, I believe this is correct. The load seems to be correlated to compactions/number of files for keyspace/IO etc. Thanks all! Regards, On Thu, Oct 18, 2012 at 9:35 PM, aaron morton aa...@thelastpickle.com wrote: At times of high load check the CPU % for the java service running

Re: hadoop consistency level

2012-10-18 Thread Andrey Ilinykh
On Thu, Oct 18, 2012 at 2:31 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: On Oct 18, 2012, at 3:52 PM, Andrey Ilinykh ailin...@gmail.com wrote: On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman mkjell...@barracuda.com wrote: Not sure I understand your question (if there is one..) You

Re: replaced node keeps returning in gossip

2012-10-18 Thread aaron morton
I replaced it with a new node, IP 10.16.128.197 and again token 0 with a -Dcassandra.replace_token=0 at startup Good Good. How long ago did you bring the new node on? There is a fail-safe to remove 128.210 after 3 days if it does not gossip to other nodes. I *thought* that remove_token

Re: UnreachableNodes

2012-10-18 Thread aaron morton
Cool. If you get it again grab nodetool gossipinfo from a few machines. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/10/2012, at 3:32 AM, Rene Kochen rene.koc...@emea.schange.com wrote: Thanks Aaron, Telnet works (in both

Re: potential data loss in Cassandra 1.1.0 .. 1.1.4

2012-10-18 Thread Jonathan Ellis
On Thu, Oct 18, 2012 at 7:30 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi Jonathan. We are currently running the datastax AMI on amazon. Cassandra is in version 1.1.2. I guess that the datastax repo (deb http://debian.datastax.com/community stable main) will be updated directly in 1.1.6