Re: Error after 1.2.0 upgrade

2013-01-04 Thread Alain RODRIGUEZ
I use the same process as Aaron, and I think that after disabling gossip and
thrift and draining the node, nothing more is written to the node, which is
considered down. But if you are using counters, and while drain is still
broken, I would also consider emptying the commit logs after stopping
Cassandra, to avoid recounting.

I use an Ubuntu package install. Why doesn't service cassandra stop use
one of these best-practice shutdowns? The script could do all the steps
Aaron describes above automatically, couldn't it?

Here is the ticket: https://issues.apache.org/jira/browse/CASSANDRA-5111
On 3 Jan 2013 at 21:31, Edward Capriolo edlinuxg...@gmail.com wrote:

 There is a danger here: disablethrift and disablegossip do not stop the fat
 client.

 On Thu, Jan 3, 2013 at 3:07 PM, aaron morton aa...@thelastpickle.com wrote:

 This is what I do to shut down. Disabling thrift and gossip will stop
 incoming requests, but it won't stop existing streams. However, these do not
 go through the commit log.

 echo Disabling thrift and gossip...
 nodetool -h localhost disablethrift;
 nodetool -h localhost disablegossip;

 echo Sleeping for 10...
 sleep 10;

 echo Drain...
 nodetool -h localhost drain;

 echo Sleeping for 10...
 sleep 10;

 echo Stopping...
 sudo monit stop cassandra;
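
The same sequence could also be driven from Python. This is only a minimal sketch, not a tested tool: the function names `shutdown_commands` and `run_shutdown` are made up for illustration, while the nodetool subcommands, the 10-second pauses, and the `monit` call are taken from the script above. Unlike the shell version, it aborts on the first failing command, which is an assumption about what one would want.

```python
import subprocess
import time


def shutdown_commands(host="localhost"):
    """Return the shutdown steps in order; ("sleep", n) entries are pauses."""
    return [
        ["nodetool", "-h", host, "disablethrift"],
        ["nodetool", "-h", host, "disablegossip"],
        ("sleep", 10),
        ["nodetool", "-h", host, "drain"],
        ("sleep", 10),
        ["sudo", "monit", "stop", "cassandra"],
    ]


def run_shutdown(host="localhost", dry_run=True):
    """Run each step in order; with dry_run=True only print what would run."""
    for step in shutdown_commands(host):
        if isinstance(step, tuple):
            # A pause between phases, as in the shell script.
            if dry_run:
                print("sleep", step[1])
            else:
                time.sleep(step[1])
        elif dry_run:
            print(" ".join(step))
        else:
            # check=True raises CalledProcessError if the command fails.
            subprocess.run(step, check=True)
```

Calling `run_shutdown()` with the default `dry_run=True` only prints the steps, which is a safe way to check the order before pointing it at a real node.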

 A

-
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/01/2013, at 9:02 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 The only true drain is
 1) turn on ip tables to stop all incoming traffic
 2) flush
 3) wait
 4) delete files
 5) upgrade
 6) restart


 On Thu, Jan 3, 2013 at 2:59 PM, Michael Kjellman mkjell...@barracuda.com
  wrote:

 That's why I didn’t create a ticket as I knew there was one. But, I
 thought this had been fixed in 1.1.7 ??

 From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, January 3, 2013 11:57 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Error after 1.2.0 upgrade

 There is a bug filed on this; drain has been in a weird state for a long time.
 In 1.0 it did not work, which was labeled as a known limitation.

 https://issues.apache.org/jira/browse/CASSANDRA-4446



 On Thu, Jan 3, 2013 at 2:49 PM, Michael Kjellman 
 mkjell...@barracuda.com wrote:

  Another thing: for those that use counters this might be a problem.

 I always do a nodetool drain before upgrading a node (as is good
 practice btw). However, in every case on every one of my nodes, the commit
 log was replayed on each node and mutations were created. Could lead to
 double counting of counters…
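
To illustrate why a replayed commit log matters for counters in particular, here is a toy model. It is an assumption-laden sketch, not how Cassandra stores counters or commit logs: the `replay` function and the tuple format are invented for the example. It only shows that re-applying an overwrite is harmless while re-applying an increment is not.

```python
def replay(log, state=None):
    """Apply a list of (op, key, value) mutations to a dict and return it."""
    state = dict(state or {})
    for op, key, value in log:
        if op == "set":
            # Regular write: last value wins, so replaying it changes nothing.
            state[key] = value
        elif op == "incr":
            # Counter increment: every application changes the total.
            state[key] = state.get(key, 0) + value
    return state


commit_log = [("set", "name", "alice"), ("incr", "hits", 5)]

once = replay(commit_log)          # state after the original writes
twice = replay(commit_log, once)   # same log applied again after a restart
```

In this model `once["hits"]` is 5 and `twice["hits"]` is 10, while `name` is unchanged by the second pass.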

 No bug for that yet

 Best,
 Michael

 From: Michael Kjellman mkjell...@barracuda.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, January 3, 2013 11:42 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Error after 1.2.0 upgrade

 Tracking Issues:

 https://issues.apache.org/jira/browse/CASSANDRA-5101
 https://issues.apache.org/jira/browse/CASSANDRA-5104 which was created
 because of https://issues.apache.org/jira/browse/CASSANDRA-5103
 https://issues.apache.org/jira/browse/CASSANDRA-5102

 Also friendly reminder to all that cql2 created indexes will not work
 with cql3. You need to drop them and recreate in cql3, otherwise you'll see
 rpc_timeout issues.

 I'll update with more issues as I see them. The fun bugs never happen
 in your dev environment do they :)

 From: aaron morton aa...@thelastpickle.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, January 3, 2013 11:38 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Error after 1.2.0 upgrade

 Michael,
 Could you share some of your problems ? May be of help for others.

 Cheers

   -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/01/2013, at 5:45 AM, Michael Kjellman mkjell...@barracuda.com
 wrote:

 I'm having huge upgrade issues going from 1.1.7 to 1.2.0 atm, but in a 12-node
 cluster which I am slowly massaging into a good state I haven't seen this
 in 15+ hours of operation…

 This looks related to JNA?

 From: Alain RODRIGUEZ arodr...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, January 3, 2013 8:42 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Error after 1.2.0 upgrade

 In a dev env, C* 1.1.7 - 1.2.0, 1 node.

 I run Cassandra in a 8GB memory environment.

 The upgrade went well, but I sometimes have the following error:

 INFO 17:31:04,143 Node /192.168.100.201 state jump to normal
  INFO 17:31:04,149 Enqueuing flush of Memtable-local@1654799672(32/32
 serialized/live bytes, 2 ops)
  INFO 17:31:04,149 Writing Memtable-local@1654799672(32/32
 serialized/live bytes, 2 ops)
  INFO 17:31:04,371 Completed flushing
 /home/stockage/cassandra/data/system/local/system-local-ia-12-Data.db (91

Re: AWS EMR - Cassandra

2013-01-04 Thread William Oberman
On all tasktrackers, I see:
java.io.IOException: PIG_OUTPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
at org.apache.cassandra.hadoop.pig.CassandraStorage.setStoreLocation(CassandraStorage.java:821)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:170)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:86)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.init(PigOutputCommitter.java:67)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
at org.apache.hadoop.mapred.Task.initialize(Task.java:515)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:358)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
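
The error message suggests a lookup along the following lines. This is a toy model written from the message text alone, not the actual CassandraStorage code: the function name, the lookup order (output-specific variable first), and the placeholder address are all assumptions.

```python
def resolve_output_address(env):
    """Pick the output address from an environment mapping, or fail."""
    for name in ("PIG_OUTPUT_INITIAL_ADDRESS", "PIG_INITIAL_ADDRESS"):
        value = env.get(name)
        if value:
            return value
    raise IOError(
        "PIG_OUTPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS "
        "environment variable not set"
    )


master_env = {"PIG_INITIAL_ADDRESS": "10.0.0.1"}  # placeholder address
task_env = {}  # a task process that did not inherit the exports
```

With `master_env` the lookup succeeds; with `task_env` it raises, which would match a job that works on the machine where the variables were exported and fails on the tasktrackers.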


On Thu, Jan 3, 2013 at 10:45 PM, aaron morton aa...@thelastpickle.com wrote:

 Instead, I get an error from CassandraStorage that the initial address
 isn't set (on the slave, the master is ok).

 Can you post the full error ?

 Cheers
-
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/01/2013, at 11:15 AM, William Oberman ober...@civicscience.com
 wrote:

 Anyone ever try to read or write directly between EMR and Cassandra?

 I'm running various Cassandra resources in Ec2, so the physical
 connection part is pretty easy using security groups.  But, I'm having
 some configuration issues.  I have managed to get Cassandra + Hadoop
 working in the past using a DIY hadoop cluster, and looking at the
 configurations in the two environments (EMR vs DIY), I'm not sure what's
 different that is causing my failures...  I should probably note I'm using
 the Pig integration of Cassandra.

 Versions: Hadoop 1.0.3, Pig 0.10, Cassandra 1.1.7.

 I'm 99% sure I have classpaths working (because I didn't at first, and now
 EMR can find and instantiate CassandraStorage on master and slaves).  What
 isn't working are the system variables.  In my DIY cluster, all I needed to
 do was:
 ---
 export PIG_INITIAL_ADDRESS=XXX
 export PIG_RPC_PORT=9160
 export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
 --
 And the task trackers somehow magically picked up the values (I never
 questioned how/why).  But, in EMR, they do not.  Instead, I get an error
 from CassandraStorage that the initial address isn't set (on the slave, the
 master is ok).

 My DIY cluster used CDH3, which was hadoop 0.20.something.  So, maybe the
 problem is a different version of hadoop?

 Looking at the CassandraStorage class, I realize I have no idea how it
 used to work, since it only seems to look at System variables.  Those
 variables are set on the Job.getConfiguration object.  I don't know how
 that part of hadoop works though... do variables that get set on Job on the
 master get propagated to the task threads?  I do know that on my DIY
 cluster, I do NOT set those system variables on the slaves...

 Thanks!

 will





Re: property 'disk_access_mode' not found in cassandra.yaml

2013-01-04 Thread Alain RODRIGUEZ
Is the default 'auto' the best possible option so that no one has to worry
about it anymore ?

Yes something like that I guess.

You can add the disk_access_mode property to your cassandra.yaml and set it
to standard to disable memory-mapped access; it will work. The same goes for
the auto_bootstrap parameter and a few other properties like these two: they
are no longer written in the default conf but are still read if they exist.
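
The "not written in the conf but still read" behaviour can be pictured as built-in defaults overlaid by whatever the file contains. This is a sketch of that idea only, not Cassandra's config loader: `DEFAULTS` and `effective_settings` are hypothetical names, and the `auto_bootstrap` default shown is an assumption.

```python
DEFAULTS = {
    "disk_access_mode": "auto",  # default value named in this thread
    "auto_bootstrap": True,      # assumed default, for illustration only
}


def effective_settings(file_settings):
    """Overlay values found in the config file on top of built-in defaults."""
    merged = dict(DEFAULTS)
    merged.update(file_settings)
    return merged
```

An empty file yields `auto`, and a file containing only `disk_access_mode: standard` overrides that one key while leaving the rest at their defaults.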

Alain

2013/1/4 DE VITO Dominique dominique.dev...@thalesgroup.com

 Is the default 'auto' the best possible option so that no one has to worry
 about it anymore ?


RE: property 'disk_access_mode' not found in cassandra.yaml

2013-01-04 Thread DE VITO Dominique
That's what I guessed too (as this property can be found in the source code).

But... I am still a bit surprised it's not at least mentioned in the docs
(in some dedicated section).

Dominique






Re: num_tokens - virtual nodes

2013-01-04 Thread Michael Kjellman
http://www.mail-archive.com/user@cassandra.apache.org/msg26528.html

From: Alain RODRIGUEZ arodr...@gmail.com
Reply-To: user@cassandra.apache.org user@cassandra.apache.org
Date: Friday, January 4, 2013 6:00 AM
To: user@cassandra.apache.org user@cassandra.apache.org
Subject: num_tokens - virtual nodes

Hi,

I just discovered the new vnodes feature described here: 
http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2

In the post above, Brandon says "If you'd like to upgrade an installation to 
virtual nodes, that's possible too, but I'll save that for a later post." Does 
this post now exist somewhere?

From the cassandra.yaml:

# If you already have a cluster with 1 token per node, and wish to migrate to
# multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
# num_tokens: 256

I see no reference to vnodes or num_tokens on this wiki.

So, how do I switch from physical nodes to vnodes?

This is useful only if the number of nodes is greater than the RF, right?

Why 256 tokens by default? Where does this value come from?

Are there other advantages or disadvantages to using vnodes besides improving 
internode data streaming by increasing the number of data sources?

Alain

Southfield Public School students safely access the tech tools they need on and 
off campus with the Barracuda Web Filter.

Quick installation and easy to use- try the Barracuda Web Filter free for 30 
days: http://on.fb.me/Vj6JBd


Specifying initial token in 1.2 fails

2013-01-04 Thread Dwight Smith
Hi

Just started evaluating 1.2 - starting a clean Cassandra node - the usual 
practice is to specify the initial token - but when I attempt to start the node 
the following is observed:

INFO [main] 2013-01-03 14:08:57,774 DatabaseDescriptor.java (line 203) disk_failure_policy is stop
DEBUG [main] 2013-01-03 14:08:57,774 DatabaseDescriptor.java (line 205) page_cache_hinting is false
INFO [main] 2013-01-03 14:08:57,774 DatabaseDescriptor.java (line 266) Global memtable threshold is enabled at 339MB
DEBUG [main] 2013-01-03 14:08:58,008 DatabaseDescriptor.java (line 381) setting auto_bootstrap to true
ERROR [main] 2013-01-03 14:08:58,024 DatabaseDescriptor.java (line 495) Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: For input string: 85070591730234615865843651857942052863
at org.apache.cassandra.dht.Murmur3Partitioner$1.validate(Murmur3Partitioner.java:180)
at org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:433)
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:121)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:178)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:397)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:440)


This looks like a bug.

Thanks



Re: Specifying initial token in 1.2 fails

2013-01-04 Thread Michael Kjellman
Murmur3 != MD5 (RandomPartitioner)
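
To spell that out: RandomPartitioner tokens are MD5-based and live in roughly 0 to 2**127, while Murmur3Partitioner tokens are signed 64-bit values. The check below is a small sketch under that assumption; the constant names and the `fits` helper are made up for illustration, and only the token value comes from the error in the original message.

```python
RANDOM_MIN, RANDOM_MAX = 0, 2**127 - 1          # RandomPartitioner (MD5)
MURMUR3_MIN, MURMUR3_MAX = -(2**63), 2**63 - 1  # Murmur3Partitioner (signed 64-bit)


def fits(token, low, high):
    """True if the token lies inside the inclusive range [low, high]."""
    return low <= token <= high


token = 85070591730234615865843651857942052863  # value from the reported error

ok_for_random = fits(token, RANDOM_MIN, RANDOM_MAX)
ok_for_murmur3 = fits(token, MURMUR3_MIN, MURMUR3_MAX)
```

The reported token is 2**126 - 1, so it is fine for RandomPartitioner but far too large to be parsed as a Murmur3 token.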



RE: Specifying initial token in 1.2 fails

2013-01-04 Thread Dwight Smith
Michael

Yes indeed - my mistake.  Thanks.  I can specify RandomPartitioner, since I do 
not use indexing - yet.

Just for informational purposes - with Murmur3 - to achieve a balanced cluster 
- is the initial token method supported?
If so - how should these be generated, the token-generator seems to only apply 
to RandomPartitioner.

Thanks again


--
Join Barracuda Networks in the fight against hunger.
To learn how you can help in your community, please visit: 
http://on.fb.me/UAdL4f
  

Re: Specifying initial token in 1.2 fails

2013-01-04 Thread Michael Kjellman
To be honest I haven't run a cluster with Murmur3.

You can still use indexing with RandomPartitioner (all us old folk are stuck 
on Random btw..)

And there was a thread floating around yesterday where Edward did some 
benchmarks and found that Murmur3 was actually slower than RandomPartitioner.

http://www.mail-archive.com/user@cassandra.apache.org/msg26789.html
http://permalink.gmane.org/gmane.comp.db.cassandra.user/30182

I do know that with vnodes token allocation is now 100% dynamic so no need to 
manually assign tokens to nodes anymore.

Best,
michael



Re: Specifying initial token in 1.2 fails

2013-01-04 Thread Nick Bailey
If you are planning on using murmur3 without vnodes (specifying your own
tokens) there is a quick python script in the datastax docs you can use to
generate balanced tokens.

http://www.datastax.com/docs/1.2/initialize/token_generation#calculating-tokens-for-the-murmur3partitioner
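
For a rough idea of what such a calculation looks like, here is a minimal sketch. It is not a copy of the DataStax script: the function names are invented, and it simply spaces tokens evenly over the signed 64-bit range assumed for Murmur3Partitioner, with the RandomPartitioner equivalent alongside for comparison.

```python
def murmur3_tokens(node_count):
    """Evenly spaced tokens across the signed 64-bit range."""
    step = 2**64 // node_count
    return [i * step - 2**63 for i in range(node_count)]


def random_partitioner_tokens(node_count):
    """Evenly spaced tokens across 0 .. 2**127, for comparison."""
    step = 2**127 // node_count
    return [i * step for i in range(node_count)]
```

For a four-node ring, `murmur3_tokens(4)` gives -2**63, -2**62, 0 and 2**62, one per node as `initial_token`.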





Re: when are keyspace dirs removed?

2013-01-04 Thread B. Todd Burruss
yea, i turned that off because in dev i am creating/dropping keyspaces
regularly for integration testing and was surprised to see all the
directories left behind


On Thu, Jan 3, 2013 at 7:47 PM, aaron morton aa...@thelastpickle.com wrote:

 Are they never removed in fear of removing snapshots?

 Aye.
 There should be snapshots in there
 https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L402

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/01/2013, at 2:04 PM, B. Todd Burruss bto...@gmail.com wrote:

 after dropping a keyspace, the directory in /var/lib/cassandra is still
 there, but empty.

 Are they never removed in fear of removing snapshots?






Re: Specifying initial token in 1.2 fails

2013-01-04 Thread Edward Capriolo
Yes. They were only just introduced, and if you are ready to hitch your
wagon to every new feature you put yourself at considerable risk. That goes
for any piece of software, not just Cassandra.

On Fri, Jan 4, 2013 at 11:59 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 But I don't really get the point of starting a new cluster without
 vnodes... Is there some disadvantage using vnodes ?

 Alain








Re: AWS EMR - Cassandra

2013-01-04 Thread William Oberman
So I've made it work, but I don't get it yet.

I have no idea why my DIY server works when I set the environment variables
on the machine that kicks off pig (master), and in EMR it doesn't.  I
recompiled ConfigHelper and CassandraStorage with tons of debugging, and in
EMR I can see the hadoop Configuration object get the proper values on the
master node, and I can see it does NOT propagate to the task threads.

The other part that was driving me nuts could be made more user friendly.
The issue is this: I started by trying to set cassandra.thrift.address,
cassandra.thrift.port, and cassandra.partitioner.class in mapred-site.xml,
and it didn't work.  After even more painful debugging, I noticed that the
only time Cassandra sets the input/output versions of those settings (and
these input/output specific versions are the only versions really used!) is
when Cassandra maps the system environment variables.  So, having
cassandra.thrift.address in mapred-site.xml does NOTHING, as I needed to
have cassandra.output.thrift.address set.  It would be much nicer if
get{Input/Output}XYZ fell back to getXYZ when the input/output-specific
setting is empty/null.  E.g. in getOutputThriftAddress(), if that setting is
null, it would have been nice if that method returned getThriftAddress().
My problem went away when I put the full cross product in the XML, e.g.
cassandra.input.thrift.address and cassandra.output.thrift.address (and
port, and partitioner).
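
The fallback suggested above can be sketched roughly as follows. This is a Python illustration, not the actual Java ConfigHelper code: a plain dict stands in for the Hadoop Configuration object, and get_scoped is a hypothetical helper name. The key names follow the pattern given in this message.

```python
def get_scoped(conf, direction, suffix):
    """Look up cassandra.<direction>.<suffix>, else fall back to cassandra.<suffix>.

    conf is a plain dict standing in for the Hadoop Configuration object;
    direction is "input" or "output".
    """
    specific = conf.get("cassandra.%s.%s" % (direction, suffix))
    if specific:  # treats both None and "" as "not set"
        return specific
    return conf.get("cassandra.%s" % suffix)


def get_output_thrift_address(conf):
    """Output address, falling back to the generic thrift address."""
    return get_scoped(conf, "output", "thrift.address")


# Only the generic key is set, as in the mapred-site.xml attempt described above.
conf = {"cassandra.thrift.address": "10.0.0.1"}
print(get_output_thrift_address(conf))
```

With a fallback like this, setting only cassandra.thrift.address would have been enough, and the input/output-specific keys would still win whenever they are present.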

I still want to know why the old easy way (of setting the 3 system
variables on the pig starter box, and having the config flow into the task
trackers) doesn't work!

will


On Fri, Jan 4, 2013 at 9:04 AM, William Oberman ober...@civicscience.com wrote:

 On all tasktrackers, I see:
 java.io.IOException: PIG_OUTPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
     at org.apache.cassandra.hadoop.pig.CassandraStorage.setStoreLocation(CassandraStorage.java:821)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:170)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:112)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:86)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:67)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
     at org.apache.hadoop.mapred.Task.initialize(Task.java:515)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:358)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
     at org.apache.hadoop.mapred.Child.main(Child.java:249)


 On Thu, Jan 3, 2013 at 10:45 PM, aaron morton aa...@thelastpickle.com wrote:

 Instead, I get an error from CassandraStorage that the initial address
 isn't set (on the slave, the master is ok).

 Can you post the full error ?

 Cheers
-
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/01/2013, at 11:15 AM, William Oberman ober...@civicscience.com
 wrote:

 Anyone ever try to read or write directly between EMR - Cassandra?

 I'm running various Cassandra resources in Ec2, so the physical
 connection part is pretty easy using security groups.  But, I'm having
 some configuration issues.  I have managed to get Cassandra + Hadoop
 working in the past using a DIY hadoop cluster, and looking at the
 configurations in the two environments (EMR vs DIY), I'm not sure what's
 different that is causing my failures...  I should probably note I'm using
 the Pig integration of Cassandra.

 Versions: Hadoop 1.0.3, Pig 0.10, Cassandra 1.1.7.

 I'm 99% sure I have classpaths working (because I didn't at first, and
 now EMR can find and instantiate CassandraStorage on master and slaves).
  What isn't working are the system variables.  In my DIY cluster, all I
 needed to do was:
 ---
 export PIG_INITIAL_ADDRESS=XXX
 export PIG_RPC_PORT=9160
 export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
 --
 And the task trackers somehow magically picked up the values (I never
 questioned how/why).  But, in EMR, they do not.  Instead, I get an error
 from CassandraStorage that the initial address isn't set (on the slave, the
 master is ok).

 My DIY cluster used CDH3, which was hadoop 0.20.something.  So, maybe the
 problem is a different version of hadoop?

 Looking at the CassandraStorage class, I realize I have no idea how it
 used to work, since it only seems to look at 

Cassandra / Windows Server 2008

2013-01-04 Thread Stephen.M.Thompson
Hi folks - I have a Windows 2008 server that I'm trying to get Cassandra 
working on.  I have disabled the Windows Firewall for the moment but I still 
cannot connect to the server.

I have tried editing the cassandra.yaml to update the listen_address to the 
machine address as well as blank or commented out altogether - no change found 
at all.

Any suggestion at all would be most welcome!

-steve

SERVER STARTUP
(* snip *)
INFO 13:58:47,161 Binding thrift service to localhost/127.0.0.1:9160
(* snip *)


LOCAL CLIENT
(default/localhost)

C:\Java\apache-cassandra-1.1.7\bin>cassandra-cli
Starting Cassandra Client
Column Family assumptions read from x\assumptions.json
Connected to: Test Cluster on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.1.6

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown]

(Success!)

LOCAL CLIENT USING IP ADDRESS
(connecting to localhost but using ip address)

C:\Java\apache-cassandra-1.1.7\bin>cassandra-cli -h xxx.xxx.xxx.xxx
Starting Cassandra Client
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
        at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
        at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
        at org.apache.cassandra.cli.CliMain.connect(CliMain.java:79)
        at org.apache.cassandra.cli.CliMain.main(CliMain.java:255)
Caused by: java.net.ConnectException: Connection refused: connect
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
        ... 3 more
Exception connecting to xxx.xxx.xxx.xxx/9160. Reason: Connection refused: connect.
Column Family assumptions read from xxx\assumptions.json
Welcome to Cassandra CLI version 1.1.6

I get the same result trying to connect from a remote machine.
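
The startup log above shows the Thrift service bound to localhost/127.0.0.1:9160, which would be consistent with connections to the machine's own IP being refused. As far as I know, the Thrift bind address in cassandra.yaml is controlled by rpc_address rather than listen_address (which covers inter-node traffic), so that may be the setting to look at. This is an assumption to verify against the yaml, not something confirmed in this thread. A small probe, sketched in Python, to see which addresses accept connections on the Thrift port (port_open is a made-up helper name):

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


# Example use: 9160 is the Thrift port from the startup log. Substitute the
# machine's real IP, which is masked as xxx.xxx.xxx.xxx in this message.
#   print(port_open("127.0.0.1", 9160))
#   print(port_open("<machine ip>", 9160))
```

If the loopback address answers and the machine address does not, the service is only listening on loopback and the firewall is probably not the issue.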


Re: Cassandra / Windows Server 2008

2013-01-04 Thread Michael Kjellman
Use linux ;)

More seriously, I'm wondering if it is binding to the IPv6 address? Is IPv6 
enabled on that NIC? You could try disabling IPv6 and seeing if RPC binds 
correctly.

From: stephen.m.thomp...@wellsfargo.com
Reply-To: user@cassandra.apache.org
Date: Friday, January 4, 2013 11:23 AM
To: user@cassandra.apache.org
Subject: Cassandra / Windows Server 2008


Southfield Public School students safely access the tech tools they need on and 
off campus with the Barracuda Web Filter.

Quick installation and easy to use- try the Barracuda Web Filter free for 30 
days: http://on.fb.me/Vj6JBd


RE: Cassandra / Windows Server 2008

2013-01-04 Thread Stephen.M.Thompson
Good suggestion ... I added -Djava.net.preferIPv4Stack=true as a JVM arg in 
cassandra.bat and got exactly the same result though.

Stephen Thompson
Wells Fargo Corporation
Internet Authentication  Fraud Prevention
704.427.3137 (W) | 704.807.3431 (C)


From: Michael Kjellman [mailto:mkjell...@barracuda.com]
Sent: Friday, January 04, 2013 2:26 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra / Windows Server 2008



Re: num_tokens - virtual nodes

2013-01-04 Thread Manu Zhang
https://www.youtube.com/watch?v=GddZ3pXiDys&list=PLC5E3906433F5A165&index=28
This video from Cassandra Summit 2012 mentions the use of 256 tokens by
default (though it's no longer in the conf/cassandra.yaml). I remember that
more tokens could lead to more disk seeks or something. I think 256 is an
empirical number based on the speaker's tests.

Here are two more related articles that you may want to read.
http://www.acunu.com/2/post/2012/07/virtual-nodes-strategies.html
http://www.acunu.com/2/post/2012/10/improving-cassandras-uptime-with-virtual-nodes.html
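
To get a feel for why more tokens per node tend to even out ownership, here is a rough simulation in Python. It simply gives every node num_tokens random tokens on a signed 64-bit ring and measures the fraction of the ring each node ends up owning. It is only an illustration under those assumptions (the function name, node count and seed are arbitrary), not how Cassandra itself assigns tokens.

```python
import random

RING_MIN = -2**63   # assuming a signed 64-bit token ring, as with Murmur3Partitioner
RING_SIZE = 2**64


def ownership(num_nodes, num_tokens, seed=0):
    """Fraction of the ring owned by each node with num_tokens random tokens each."""
    rng = random.Random(seed)
    tokens = []
    for node in range(num_nodes):
        for _ in range(num_tokens):
            tokens.append((rng.randrange(RING_MIN, RING_MIN + RING_SIZE), node))
    tokens.sort()
    owned = [0] * num_nodes
    for i, (token, node) in enumerate(tokens):
        # A token owns the range from the previous token (exclusive) up to
        # itself; the first token wraps around to the last one.
        prev = tokens[i - 1][0]
        owned[node] += (token - prev) % RING_SIZE
    return [o / RING_SIZE for o in owned]


for n in (1, 256):
    shares = ownership(num_nodes=6, num_tokens=n)
    print(n, "token(s) per node: min %.3f, max %.3f" % (min(shares), max(shares)))
```

With a single random token per node the shares are usually very uneven, while with 256 they cluster near 1/6 each. That is one reason manual token balancing matters less with vnodes.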



On Sat, Jan 5, 2013 at 12:18 AM, Michael Kjellman mkjell...@barracuda.com wrote:

 http://www.mail-archive.com/user@cassandra.apache.org/msg26528.html

 From: Alain RODRIGUEZ arodr...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Friday, January 4, 2013 6:00 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: num_tokens - virtual nodes

 Hi,

 I just discovered the new vnodes feature described here:
 http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2

 In the post above, Brandon says "If you'd like to upgrade an installation
 to virtual nodes, that's possible too, but I'll save that for a later
 post." Does this post now exist somewhere?

 From the cassandra.yaml:

 # If you already have a cluster with 1 token per node, and wish to migrate to
 # multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
 # num_tokens: 256

 I see no reference to vnodes or num_tokens on this wiki.

 So, how do I switch from physical nodes to vnodes?

 This is useful only if your number of nodes is greater than the RF, right?

 Why 256 tokens by default? Where does this value come from?

 Are there other advantages or disadvantages of using vnodes besides improving
 inter-node data streaming by increasing the number of data sources?

 Alain
