Re: Error after 1.2.0 upgrade
I use the same process as Aaron, and I think that after disabling gossip and thrift and draining the node, nothing more is written to this node, which is considered down. But after stopping Cassandra, if you are using counters and while drain is still broken, I would consider emptying the commit logs too, to avoid double counting. I use an Ubuntu package install. Why doesn't `service cassandra stop` follow these best-practice shutdown steps? The init script could perform all the steps Aaron described automatically, couldn't it? Here is the ticket: https://issues.apache.org/jira/browse/CASSANDRA-5111

On 3 Jan 2013, 21:31, Edward Capriolo edlinuxg...@gmail.com wrote:

There is a danger here: disablethrift and disablegossip do not stop the fat client.

On Thu, Jan 3, 2013 at 3:07 PM, aaron morton aa...@thelastpickle.com wrote:

This is what I do to shut down. Disabling thrift and gossip will stop incoming requests, but it won't stop existing streams. However, these do not go through the commit log.

echo "Disabling thrift and gossip..."
nodetool -h localhost disablethrift
nodetool -h localhost disablegossip
echo "Sleeping for 10..."
sleep 10
echo "Drain..."
nodetool -h localhost drain
echo "Sleeping for 10..."
sleep 10
echo "Stopping..."
sudo monit stop cassandra

A
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/01/2013, at 9:02 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

The only true drain is: 1) turn on iptables to stop all incoming traffic, 2) flush, 3) wait, 4) delete files, 5) upgrade, 6) restart.

On Thu, Jan 3, 2013 at 2:59 PM, Michael Kjellman mkjell...@barracuda.com wrote:

That's why I didn't create a ticket, as I knew there was one. But I thought this had been fixed in 1.1.7?
From: Edward Capriolo edlinuxg...@gmail.com
Date: Thursday, January 3, 2013 11:57 AM
To: user@cassandra.apache.org
Subject: Re: Error after 1.2.0 upgrade

There is a bug on this; drain has been in a weird state for a long time. In 1.0 it did not work and was labeled as a known limitation. https://issues.apache.org/jira/browse/CASSANDRA-4446

On Thu, Jan 3, 2013 at 2:49 PM, Michael Kjellman mkjell...@barracuda.com wrote:

Another thing: for those that use counters, this might be a problem. I always do a nodetool drain before upgrading a node (as is good practice, btw). However, in every case, on every one of my nodes, the commit log was replayed and mutations were created. This could lead to double counting of counters... No bug filed for that yet.

Best, Michael

From: Michael Kjellman mkjell...@barracuda.com
Date: Thursday, January 3, 2013 11:42 AM
To: user@cassandra.apache.org
Subject: Re: Error after 1.2.0 upgrade

Tracking issues:
https://issues.apache.org/jira/browse/CASSANDRA-5101
https://issues.apache.org/jira/browse/CASSANDRA-5104 (which was created because of https://issues.apache.org/jira/browse/CASSANDRA-5103)
https://issues.apache.org/jira/browse/CASSANDRA-5102

Also, a friendly reminder to all that indexes created with CQL2 will not work with CQL3. You need to drop them and recreate them in CQL3; otherwise you'll see rpc_timeout issues. I'll update with more issues as I see them. The fun bugs never happen in your dev environment, do they? :)

From: aaron morton aa...@thelastpickle.com
Date: Thursday, January 3, 2013 11:38 AM
To: user@cassandra.apache.org
Subject: Re: Error after 1.2.0 upgrade

Michael, could you share some of your problems? It may be of help to others.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/01/2013, at 5:45 AM, Michael Kjellman mkjell...@barracuda.com wrote:

I'm having huge upgrade issues going from 1.1.7 to 1.2.0 at the moment, on a 12-node cluster which I am slowly massaging into a good state, but I haven't seen this in 15+ hours of operation... This looks related to JNA?

From: Alain RODRIGUEZ arodr...@gmail.com
Date: Thursday, January 3, 2013 8:42 AM
To: user@cassandra.apache.org
Subject: Error after 1.2.0 upgrade

In a dev environment, C* 1.1.7 -> 1.2.0, 1 node. I run Cassandra in an 8 GB memory environment. The upgrade went well, but I sometimes see the following error:

INFO 17:31:04,143 Node /192.168.100.201 state jump to normal
INFO 17:31:04,149 Enqueuing flush of Memtable-local@1654799672(32/32 serialized/live bytes, 2 ops)
INFO 17:31:04,149 Writing Memtable-local@1654799672(32/32 serialized/live bytes, 2 ops)
INFO 17:31:04,371 Completed flushing /home/stockage/cassandra/data/system/local/system-local-ia-12-Data.db (91
Re: AWS EMR - Cassandra
On all tasktrackers, I see:

java.io.IOException: PIG_OUTPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
    at org.apache.cassandra.hadoop.pig.CassandraStorage.setStoreLocation(CassandraStorage.java:821)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:170)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:112)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:86)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.init(PigOutputCommitter.java:67)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
    at org.apache.hadoop.mapred.Task.initialize(Task.java:515)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:358)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

On Thu, Jan 3, 2013 at 10:45 PM, aaron morton aa...@thelastpickle.com wrote:

Instead, I get an error from CassandraStorage that the initial address isn't set (on the slave; the master is ok).

Can you post the full error?

Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/01/2013, at 11:15 AM, William Oberman ober...@civicscience.com wrote:

Has anyone ever tried to read or write directly between EMR and Cassandra? I'm running various Cassandra resources in EC2, so the physical connection part is pretty easy using security groups. But I'm having some configuration issues.
I have managed to get Cassandra + Hadoop working in the past using a DIY Hadoop cluster, and looking at the configurations in the two environments (EMR vs DIY), I'm not sure what's different that is causing my failures... I should probably note I'm using the Pig integration of Cassandra. Versions: Hadoop 1.0.3, Pig 0.10, Cassandra 1.1.7.

I'm 99% sure I have classpaths working (because I didn't at first, and now EMR can find and instantiate CassandraStorage on master and slaves). What isn't working are the system variables. In my DIY cluster, all I needed to do was:

export PIG_INITIAL_ADDRESS=XXX
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner

And the task trackers somehow magically picked up the values (I never questioned how/why). But in EMR, they do not. Instead, I get an error from CassandraStorage that the initial address isn't set (on the slave; the master is ok). My DIY cluster used CDH3, which was Hadoop 0.20.something, so maybe the problem is a different version of Hadoop?

Looking at the CassandraStorage class, I realize I have no idea how it used to work, since it only seems to look at system variables. Those variables are set on the Job.getConfiguration object. I don't know how that part of Hadoop works, though... do variables that get set on the Job on the master get propagated to the task threads? I do know that on my DIY cluster I do NOT set those system variables on the slaves...

Thanks!
will
Re: property 'disk_access_mode' not found in cassandra.yaml
Is the default 'auto' the best possible option, so that no one has to worry about it anymore?

Yes, something like that, I guess. You can add the disk_access_mode property to your cassandra.yaml and set it to standard to disable memory-mapped access; it will work. It's the same for the auto_bootstrap parameter and some other properties like these two: they are no longer written in the shipped config, but are still read if present.

Alain

2013/1/4 DE VITO Dominique dominique.dev...@thalesgroup.com

Is the default 'auto' the best possible option so that no one has to worry about it anymore?
RE: property 'disk_access_mode' not found in cassandra.yaml
That's what I guessed too (as this property can be found in the source code). But... I am still a bit surprised it's not at least mentioned in the docs (in some dedicated section).

Dominique

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Friday, January 4, 2013 15:57
To: user@cassandra.apache.org
Subject: Re: property 'disk_access_mode' not found in cassandra.yaml
Re: num_tokens - virtual nodes
http://www.mail-archive.com/user@cassandra.apache.org/msg26528.html

From: Alain RODRIGUEZ arodr...@gmail.com
Date: Friday, January 4, 2013 6:00 AM
To: user@cassandra.apache.org
Subject: num_tokens - virtual nodes

Hi,

I just discovered the new vnodes feature described here: http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2

In the post above, Brandon says: "If you'd like to upgrade an installation to virtual nodes, that's possible too, but I'll save that for a later post." Does this post now exist somewhere?

From cassandra.yaml:

# If you already have a cluster with 1 token per node, and wish to migrate to
# multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
# num_tokens: 256

I see no reference to vnodes or num_tokens on that wiki. So, how do you switch from physical nodes to vnodes? This is useful only if your number of nodes is greater than the RF, right? Why 256 tokens by default? Where does this value come from? Is there any advantage or disadvantage to using vnodes other than improving internode data streaming by increasing the number of data sources?

Alain

Southfield Public School students safely access the tech tools they need on and off campus with the Barracuda Web Filter. Quick installation and easy to use; try the Barracuda Web Filter free for 30 days: http://on.fb.me/Vj6JBd
Specifying initial token in 1.2 fails
Hi,

Just started evaluating 1.2. Starting a clean Cassandra node, the usual practice is to specify the initial token, but when I attempt to start the node the following is observed:

INFO [main] 2013-01-03 14:08:57,774 DatabaseDescriptor.java (line 203) disk_failure_policy is stop
DEBUG [main] 2013-01-03 14:08:57,774 DatabaseDescriptor.java (line 205) page_cache_hinting is false
INFO [main] 2013-01-03 14:08:57,774 DatabaseDescriptor.java (line 266) Global memtable threshold is enabled at 339MB
DEBUG [main] 2013-01-03 14:08:58,008 DatabaseDescriptor.java (line 381) setting auto_bootstrap to true
ERROR [main] 2013-01-03 14:08:58,024 DatabaseDescriptor.java (line 495) Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: For input string: 85070591730234615865843651857942052863
    at org.apache.cassandra.dht.Murmur3Partitioner$1.validate(Murmur3Partitioner.java:180)
    at org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:433)
    at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:121)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:178)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:397)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:440)

This looks like a bug.

Thanks
Re: Specifying initial token in 1.2 fails
Murmur3 != MD5 (RandomPartitioner)
RE: Specifying initial token in 1.2 fails
Michael,

Yes indeed, my mistake. Thanks. I can specify RandomPartitioner, since I do not use indexing, yet. Just for informational purposes: with Murmur3, to achieve a balanced cluster, is the initial token method supported? If so, how should the tokens be generated? The token-generator seems to apply only to RandomPartitioner.

Thanks again

From: Michael Kjellman [mailto:mkjell...@barracuda.com]
Sent: Friday, January 04, 2013 8:39 AM
To: user@cassandra.apache.org
Subject: Re: Specifying initial token in 1.2 fails

Murmur3 != MD5 (RandomPartitioner)

--
Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f
Re: Specifying initial token in 1.2 fails
To be honest, I haven't run a cluster with Murmur3. You can still use indexing with RandomPartitioner (all us old folk are stuck on Random, btw). And there was a thread floating around yesterday where Edward did some benchmarks and found that Murmur3 was actually slower than RandomPartitioner: http://www.mail-archive.com/user@cassandra.apache.org/msg26789.html

I do know that with vnodes, token allocation is now 100% dynamic, so there is no need to manually assign tokens to nodes anymore.

Best,
Michael
Re: Specifying initial token in 1.2 fails
If you are planning on using Murmur3 without vnodes (specifying your own tokens), there is a quick Python script in the DataStax docs you can use to generate balanced tokens: http://www.datastax.com/docs/1.2/initialize/token_generation#calculating-tokens-for-the-murmur3partitioner
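The DataStax page linked above walks through this calculation. As a sketch of the idea (not the actual script from those docs), evenly spacing N initial tokens across the Murmur3 range [-2^63, 2^63 - 1] looks like this:

```python
# Sketch of balanced token generation for Murmur3Partitioner,
# mirroring the even-spacing calculation in the DataStax docs
# linked above. Not the official script; function name is ours.
def murmur3_tokens(node_count):
    """Evenly space `node_count` initial_token values across the
    Murmur3 token range [-2**63, 2**63 - 1]."""
    spacing = 2**64 // node_count
    return [i * spacing - 2**63 for i in range(node_count)]

# Example: four nodes. Each value goes in one node's initial_token.
for i, tok in enumerate(murmur3_tokens(4)):
    print(f"node {i}: initial_token = {tok}")
```

Each node then gets one value as its `initial_token` in cassandra.yaml. Contrast this with RandomPartitioner, whose range is [0, 2^127 - 1], which is why the old token-generator output (like the 85070591730234615865843651857942052863 in the error above) fails Murmur3 validation.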
Re: when are keyspace dirs removed?
Yeah, I turned that off because in dev I am creating/dropping keyspaces regularly for integration testing and was surprised to see all the directories left behind.

On Thu, Jan 3, 2013 at 7:47 PM, aaron morton aa...@thelastpickle.com wrote:

Are they never removed for fear of removing snapshots?

Aye. There should be snapshots in there: https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L402

Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/01/2013, at 2:04 PM, B. Todd Burruss bto...@gmail.com wrote:

After dropping a keyspace, the directory in /var/lib/cassandra is still there, but empty. Are they never removed for fear of removing snapshots?
Re: Specifying initial token in 1.2 fails
Yes. They were really just introduced, and if you hitch your wagon to every new feature, you put yourself at considerable risk. That goes for any piece of software, not just Cassandra.

On Fri, Jan 4, 2013 at 11:59 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

But I don't really get the point of starting a new cluster without vnodes... Is there some disadvantage to using vnodes?

Alain
Re: AWS EMR - Cassandra
So I've made it work, but I don't get it yet. I have no idea why my DIY cluster works when I set the environment variables on the machine that kicks off Pig (the master), while in EMR it doesn't. I recompiled ConfigHelper and CassandraStorage with tons of debugging, and in EMR I can see the Hadoop Configuration object get the proper values on the master node, and I can see it does NOT propagate to the task threads.

The other part that was driving me nuts could be made more user friendly. The issue is this: I started to try to set cassandra.thrift.address, cassandra.thrift.port, and cassandra.partitioner.class in mapred-site.xml, and it didn't work. After even more painful debugging, I noticed that the only time Cassandra sets the input/output versions of those settings (and these input/output-specific versions are the only versions really used!) is when Cassandra maps the system environment variables. So, having cassandra.thrift.address in mapred-site.xml does NOTHING; I needed to have cassandra.output.thrift.address set. It would be much nicer if the get{Input/Output}XYZ methods fell back to getXYZ when the input/output-specific setting is empty/null. E.g. in getOutputThriftAddress(), if that setting is null, it would have been nice if the method returned getThriftAddress().

My problem went away when I put the full cross product in the XML, e.g. cassandra.input.thrift.address and cassandra.output.thrift.address (and port, and partitioner). I still want to know why the old easy way (setting the 3 system variables on the Pig starter box, and having the config flow into the task trackers) doesn't work!
will On Fri, Jan 4, 2013 at 9:04 AM, William Oberman ober...@civicscience.comwrote: On all tasktrackers, I see: java.io.IOException: PIG_OUTPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set at org.apache.cassandra.hadoop.pig.CassandraStorage.setStoreLocation(CassandraStorage.java:821) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:170) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:112) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:86) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.init(PigOutputCommitter.java:67) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279) at org.apache.hadoop.mapred.Task.initialize(Task.java:515) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:358) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132) at org.apache.hadoop.mapred.Child.main(Child.java:249) On Thu, Jan 3, 2013 at 10:45 PM, aaron morton aa...@thelastpickle.comwrote: Instead, I get an error from CassandraStorage that the initial address isn't set (on the slave, the master is ok). Can you post the full error ? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 4/01/2013, at 11:15 AM, William Oberman ober...@civicscience.com wrote: Anyone ever try to read or write directly between EMR - Cassandra? I'm running various Cassandra resources in Ec2, so the physical connection part is pretty easy using security groups. But, I'm having some configuration issues. 
I have managed to get Cassandra + Hadoop working in the past using a DIY Hadoop cluster, and looking at the configurations in the two environments (EMR vs DIY), I'm not sure what's different that is causing my failures. I should probably note I'm using the Pig integration of Cassandra. Versions: Hadoop 1.0.3, Pig 0.10, Cassandra 1.1.7.

I'm 99% sure I have classpaths working (because I didn't at first, and now EMR can find and instantiate CassandraStorage on master and slaves). What isn't working are the system variables. In my DIY cluster, all I needed to do was:

export PIG_INITIAL_ADDRESS=XXX
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner

and the task trackers somehow magically picked up the values (I never questioned how/why). But in EMR, they do not. Instead, I get an error from CassandraStorage that the initial address isn't set (on the slave; the master is ok). My DIY cluster used CDH3, which was Hadoop 0.20.something. So maybe the problem is a different version of Hadoop? Looking at the CassandraStorage class, I realize I have no idea how it used to work, since it only seems to look at
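A hedged aside on the "magic" above: task JVMs generally do not inherit the submitting shell's environment, but Hadoop 1.x has a mapred.child.env property for injecting environment variables into child task processes. Whether EMR's setup honors it for this case is an assumption, but it would be one way to push the PIG_* variables out to the task trackers; the values below are placeholders from the export block above.

```xml
<!-- mapred-site.xml: set environment variables in child task JVMs.
     Hadoop 1.x property; values are placeholders. -->
<property>
  <name>mapred.child.env</name>
  <value>PIG_INITIAL_ADDRESS=XXX,PIG_RPC_PORT=9160,PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner</value>
</property>
```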
Cassandra / Windows Server 2008
Hi folks - I have a Windows 2008 server that I'm trying to get Cassandra working on. I have disabled the Windows Firewall for the moment, but I still cannot connect to the server. I have tried editing cassandra.yaml to set listen_address to the machine address, as well as leaving it blank or commented out altogether - no change at all. Any suggestion would be most welcome!

-steve

SERVER STARTUP
(* snip *)
INFO 13:58:47,161 Binding thrift service to localhost/127.0.0.1:9160
(* snip *)

LOCAL CLIENT (default/localhost)
C:\Java\apache-cassandra-1.1.7\bin\cassandra-cli
Starting Cassandra Client
Column Family assumptions read from x\assumptions.json
Connected to: Test Cluster on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.1.6
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown]
(Success!)

LOCAL CLIENT USING IP ADDRESS (connecting to localhost but using the IP address)
C:\Java\apache-cassandra-1.1.7\bin\cassandra-cli -h xxx.xxx.xxx.xxx
Starting Cassandra Client
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.cassandra.cli.CliMain.connect(CliMain.java:79)
    at org.apache.cassandra.cli.CliMain.main(CliMain.java:255)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
    ... 3 more
Exception connecting to xxx.xxx.xxx.xxx/9160. Reason: Connection refused: connect.
Column Family assumptions read from xxx\assumptions.json
Welcome to Cassandra CLI version 1.1.6

I get the same result trying to connect from a remote machine.
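One detail worth noting from the startup log above: Thrift is binding to localhost/127.0.0.1:9160, and in Cassandra 1.1 the Thrift bind address is controlled by rpc_address, not listen_address (which only affects inter-node traffic). A minimal cassandra.yaml sketch, with the machine IP as a placeholder:

```yaml
# rpc_address controls where the Thrift (client) service binds;
# listen_address only affects inter-node gossip/storage traffic.
# Either bind clients to the machine's address explicitly...
rpc_address: xxx.xxx.xxx.xxx
# ...or bind to all interfaces:
# rpc_address: 0.0.0.0
rpc_port: 9160
```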
Re: Cassandra / Windows Server 2008
Use linux ;)

More seriously, I'm wondering if it is binding to the IPv6 address? Is that enabled on that NIC? You could try disabling IPv6 and seeing if RPC binds correctly.

From: stephen.m.thomp...@wellsfargo.com
Reply-To: user@cassandra.apache.org
Date: Friday, January 4, 2013 11:23 AM
To: user@cassandra.apache.org
Subject: Cassandra / Windows Server 2008

[quoted message trimmed]
RE: Cassandra / Windows Server 2008
Good suggestion ... I added -Djava.net.preferIPv4Stack=true as a JVM arg in cassandra.bat and got exactly the same result, though.

Stephen Thompson
Wells Fargo Corporation
Internet Authentication Fraud Prevention
704.427.3137 (W) | 704.807.3431 (C)

From: Michael Kjellman [mailto:mkjell...@barracuda.com]
Sent: Friday, January 04, 2013 2:26 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra / Windows Server 2008

[quoted message trimmed]
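For the record, the flag mentioned above has to end up on the JVM command line that cassandra.bat builds. A sketch of how that might be wired in; the JAVA_OPTS variable name is an assumption based on the stock 1.1 Windows script and may differ in other versions:

```bat
rem Appended to the JVM options in cassandra.bat so the daemon prefers
rem IPv4 sockets (variable name assumed from the 1.1 distribution):
set JAVA_OPTS=%JAVA_OPTS% -Djava.net.preferIPv4Stack=true
```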
Re: num_tokens - virtual nodes
https://www.youtube.com/watch?v=GddZ3pXiDys&list=PLC5E3906433F5A165&index=28

This video from Cassandra Summit 2012 mentions the use of 256 tokens by default (though it's no longer in conf/cassandra.yaml). I remember that more tokens could lead to more disk seeks or something; I think 256 is an empirical number based on the speaker's tests. Here are two more related articles that you'll want to read:

http://www.acunu.com/2/post/2012/07/virtual-nodes-strategies.html
http://www.acunu.com/2/post/2012/10/improving-cassandras-uptime-with-virtual-nodes.html

On Sat, Jan 5, 2013 at 12:18 AM, Michael Kjellman mkjell...@barracuda.com wrote:

http://www.mail-archive.com/user@cassandra.apache.org/msg26528.html

From: Alain RODRIGUEZ arodr...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Friday, January 4, 2013 6:00 AM
To: user@cassandra.apache.org
Subject: num_tokens - virtual nodes

Hi, I just discovered the new vnodes feature described here: http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2

In the post above, Brandon says "If you'd like to upgrade an installation to virtual nodes, that's possible too, but I'll save that for a later post." Does this post now exist somewhere?

From cassandra.yaml:

# If you already have a cluster with 1 token per node, and wish to migrate to
# multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
# num_tokens: 256

I see no reference to vnodes or num_tokens on that wiki. So, how do you switch from physical nodes to vnodes? This is useful only if your number of nodes is greater than the RF, right? Why 256 tokens by default? Where does this value come from? Is there any advantage or disadvantage to vnodes beyond improving inter-node data streaming by increasing the number of data sources?

Alain
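On the migration question raised above, the path generally described for 1.2 (hedged, since the wiki page cited in cassandra.yaml doesn't actually cover it) was: set num_tokens on each existing node and restart, which splits the node's current range into that many contiguous tokens without changing data ownership, then redistribute those tokens across the cluster. A cassandra.yaml sketch:

```yaml
# Existing single-token node: uncomment and set num_tokens, then restart.
# On restart the node splits its current range into 256 contiguous tokens,
# so ownership is unchanged until the tokens are redistributed.
num_tokens: 256
```

After all nodes have been restarted this way, the 1.2 distribution ships a shuffle utility (cassandra-shuffle) intended to randomize the contiguous ranges across the cluster; treat that step with care, as the tool was brand new at the time of this thread.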