Re: snapshot files locked

2012-03-14 Thread Maki Watanabe
Snapshot files are hardlinks of the original sstables.
As you know, on Windows you can't delete files that are open in another process.
If you try to delete the hardlink, Windows treats it as an attempt to delete
the sstables in production.
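For what it's worth, the hardlink mechanics are easy to demonstrate outside Cassandra. A minimal Python sketch (the file names are made up; only the os.link semantics matter):

```python
import os
import tempfile

def snapshot_demo():
    """Sketch of hardlink semantics, the mechanism behind Cassandra snapshots.

    A snapshot entry and the live sstable are two directory entries for the
    same inode; the data survives until the last link is removed.
    """
    with tempfile.TemporaryDirectory() as d:
        sstable = os.path.join(d, "cf-data.db")      # stand-in for a live sstable
        snapshot = os.path.join(d, "snapshot-link")  # stand-in for the snapshot entry
        with open(sstable, "w") as f:
            f.write("sstable contents")
        os.link(sstable, snapshot)       # this is essentially all a snapshot does
        links = os.stat(sstable).st_nlink  # one inode, two names
        os.remove(snapshot)              # deleting the snapshot entry...
        with open(sstable) as f:         # ...leaves the live data intact
            return links, f.read()

print(snapshot_demo())
```

On Windows the open-file lock described above additionally blocks deleting either name while Cassandra holds the file open.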

maki

2012/3/14 Jim Newsham jnews...@referentia.com:

 Hi,

 I'm using Cassandra 1.0.8, on Windows 7.  When I take a snapshot of the
 database, I find that I am unable to delete the snapshot directory (i.e.,
 dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while
 Cassandra is running: The action can't be completed because the folder or
 a file in it is open in another program. Close the folder or file and try
 again. If I terminate Cassandra, then I can delete the directory with no
 problem. Is there a reason why Cassandra must hold onto these files?

 Thanks,
 Jim



Re: Row iteration over indexed clause

2012-03-14 Thread aaron morton
If you want 100 results per call, ask for 101. Use the first 100; when you get 
to the 101st, do not examine its data. Instead use its key as the start to 
get the next 101.
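A sketch of that pattern in Python, with a stand-in fetch function in place of the real get_indexed_slices call (paginate, fake_fetch and DATA below are illustrative names, not part of any Cassandra client):

```python
def paginate(fetch_page, page_size=100):
    """Yield every row exactly once using the 'ask for one extra' pattern.

    fetch_page(start_key, count) stands in for an indexed-slice call such as
    get_indexed_slices with IndexClause(start_key=..., count=...): it must
    return up to `count` rows ordered by key, starting at start_key inclusive.
    """
    start_key = ""                      # '' means "start at the first key"
    while True:
        rows = fetch_page(start_key, page_size + 1)
        # The extra row (if any) is re-fetched as the head of the next page,
        # so emit only the first page_size rows here.
        for key, value in rows[:page_size]:
            yield key, value
        if len(rows) <= page_size:      # no extra row -> this was the last page
            return
        start_key = rows[page_size][0]  # key of the 101st row starts the next page

# Toy backing store standing in for Cassandra: rows ordered by key.
DATA = [("key%03d" % i, "value%03d" % i) for i in range(250)]

def fake_fetch(start_key, count):
    rows = [r for r in DATA if r[0] >= start_key]
    return rows[:count]

print(len(list(paginate(fake_fetch, page_size=100))))
```

Because start_key is inclusive, the overlap row is fetched twice but emitted once, so no client-side state beyond the last key is needed.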

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 2:03 AM, Vivek Mishra wrote:

 Thanks.
 
 Attribute    Type                   Default  Required  Description
 expressions  list<IndexExpression>  n/a      Y         The list of IndexExpression objects, which must contain one EQ IndexOperator among the expressions
 start_key    binary                 n/a      Y         Start the index query at the specified key; can be set to '' (an empty byte array) to start with the first key
 count        integer                100      Y         The number of results to which the index query will be constrained
 
 
 
 How do I iterate using it? How do I ensure that it does not return 
 previous results (without my having to keep something in memory)?
 
 This is the method I am looking into:
 
 get_indexed_slices(ColumnParent column_parent, IndexClause index_clause, 
 SlicePredicate column_predicate, ConsistencyLevel consistency_level)
 
 It does not have anything like count.
 
 
 Thanks,
 Vivek
 
 On Tue, Mar 13, 2012 at 6:24 PM, Shimi Kiviti shim...@gmail.com wrote:
 Yes.
 
 use get_indexed_slices (http://wiki.apache.org/cassandra/API)
 
 On Tue, Mar 13, 2012 at 2:12 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 Hi,
 Is it possible to iterate and fetch in chunks using thrift API by querying 
 using secondary indexes?
 
 -Vivek
 
 



Re: Does the 'batch' order matter ?

2012-03-14 Thread aaron morton
It may, but it would not be guaranteed.  

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 8:11 AM, A J wrote:

 I know batch operations are not atomic, but does the success of a write
 imply that all writes preceding it in the batch were successful?
 
 For example, using cql:
 BEGIN BATCH USING CONSISTENCY QUORUM AND TTL 864
  INSERT INTO users (KEY, password, name) VALUES ('user2',
 'ch@ngem3b', 'second user')
  UPDATE users SET password = 'ps22dhds' WHERE KEY = 'user2'
  INSERT INTO users (KEY, password) VALUES ('user3', 'ch@ngem3c')
  DELETE name FROM users WHERE key = 'user2'
  INSERT INTO users (KEY, password, name) VALUES ('user4',
 'ch@ngem3c', 'Andrew')
 APPLY BATCH;
 
 Say the batch failed but I see that the third write was present on a
 node. Does it imply that the first insert and the second update
 definitely made it to that node as well?
 
 Thanks.



Re: [Windows] How to configure simple authentication and authorization ?

2012-03-14 Thread Maki Watanabe
Have you built and installed SimpleAuthenticator from the source repository?
It is not included in the binary kit.

maki

2012/3/14 Sabbiolina sabbiol...@gmail.com:
 HI. I followed this:



 To set up simple authentication and authorization
 1. Edit cassandra.yaml, setting
 org.apache.cassandra.auth.SimpleAuthenticator as the
 authenticator value. The default value of AllowAllAuthenticator is
 equivalent to no authentication.
 2. Edit access.properties, adding entries for users and their permissions to
 read and write to specified
 keyspaces and column families. See access.properties below for details on
 the correct format.
 3. Make sure that users specified in access.properties have corresponding
 entries in passwd.properties.
 See passwd.properties below for details and examples.
 4. After making the required configuration changes, you must specify the
 properties files when starting Cassandra
 with the flags -Dpasswd.properties and -Daccess.properties. For example:
 cd $CASSANDRA_HOME
 sh bin/cassandra -f -Dpasswd.properties=conf/passwd.properties
 -Daccess.properties=conf/access.properties
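 For reference, a minimal sketch of the two files, based on the sample files
 shipped in the source distribution's conf/ directory (check those samples
 for the authoritative syntax):

```properties
# passwd.properties -- one user per line, plaintext by default
jsmith=havebadpass

# access.properties -- grant jsmith read/write access to Keyspace1
# (syntax as in the sample file; <rw> denotes read/write)
Keyspace1.<rw>=jsmith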


 I started the services with the additional parameters, but no result, no log,
 nothing.

 I use the DataStax 1.0.8 Community Edition on Windows 7 64-bit.


 Thanks



Re: Why is row lookup much faster than column lookup

2012-03-14 Thread aaron morton
Here is a look at query plans 
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

tl;dr - wide rows require an index to be read from disk; the fastest query uses 
no start and no finish.  

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 6:58 AM, Dave Brosius wrote:

 
 sorry, should have been: Given the hashtable nature of cassandra, finding a 
 row is probably 'relatively' constant no matter how many *rows* you have.
 
 
 - Original Message -
 From: Dave Brosius dbros...@mebigfatguy.com 
 Sent: Tue, March 13, 2012 13:43
 Subject: Re: Why is row lookup much faster than column lookup
 
 Given the hashtable nature of cassandra, finding a row is probably 
 'relatively' constant no matter how many columns you have.
 
 The smaller the number of columns, I suppose the more likely it is that all the 
 columns will be in one sstable. If you've got a ton of columns per row, it is 
 much more likely that these columns will be spread out over multiple sstables. 
 Plus, columns are read in chunks, depending on yaml settings.
 
 
 - Original Message -
 From: A J s5a...@gmail.com 
 Sent: Tue, March 13, 2012 13:35
 Subject: Why is row lookup much faster than column lookup
 
 From my tests, I am seeing that a CF that has fewer than 100 columns
 but millions of rows has much lower latency when reading a column in a
 row than a CF that has only a few thousand rows, each of them a wide
 row with about 20K columns.
 
 Example:
 cf1 has 6 Million rows and each row has about 100 columns.
 t1 = time.time()
 cf1.get(1234,column_count=1)
 t2 = time.time() - t1
 print int(t2*1000)
 takes 3 ms
 
 cf2 has 5K rows and each row has about 18K columns.
 t1 = time.time()
 cf2.get(1234,column_count=1)
 t2 = time.time() - t1
 print int(t2*1000)
 takes 82ms
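 The two measurements above can be made comparable with a small helper.
 A sketch assuming the pycassa-style cf.get calls used above (fake_get below
 is an illustrative stand-in for a real ColumnFamily.get, not part of any
 client library):

```python
import time

def time_ms(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed milliseconds) -- the same
    measurement the snippets above do by hand with time.time()."""
    t0 = time.time()
    result = fn(*args, **kwargs)
    return result, int((time.time() - t0) * 1000)

# Stand-in for cf.get(key, column_count=1); a real run would pass the
# pycassa ColumnFamily method instead.
def fake_get(key, column_count=1):
    return {"col0": "val0"}

result, ms = time_ms(fake_get, 1234, column_count=1)
print(ms)
```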
 
 Anything in general on the Cassandra architecture that causes row
 lookup to be much faster than column lookup ?
 
 Thanks.



Re: CAn't bootstrap a new node to my cluster

2012-03-14 Thread Cyril Scetbon

To bootstrap the node I:

- start a new node with auto_bootstrap=true and join-ring=false
- use the join command

and yes, when I use the join command the first time, it says it can't 
find seed nodes, as I wrote in my first email.


The logfiles I sent you have all this, except that the join error does 
not appear in them.


On 3/13/12 9:36 AM, aaron morton wrote:

Can you provide some context for the log files please.

The original error had to do with bootstrapping a new node into a 
cluster. The log looks like a node is starting with 
-Dcassandra.join_ring=false and then nodetool join is run.


Is there an error when this runs ?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/03/2012, at 11:58 PM, Cyril Scetbon wrote:

I don't know if it helps, but the only thing I see on the cluster's 
nodes is:


== /var/log/cassandra/output.log ==
 INFO 10:57:28,530 InetAddress /10.0.1.70 is now dead.


when I try to join the node 10.0.1.70 to the cluster

On 3/12/12 11:27 AM, Cyril Scetbon wrote:

It's done.

Nothing new on stderr when I use the join command. I send you the 
logfiles after I've tried to add the node.


Regards

On 3/12/12 10:47 AM, aaron morton wrote:
Modify this line in log4j-server.properties. It will normally be 
located in /etc/cassandra:


https://github.com/apache/cassandra/blob/trunk/conf/log4j-server.properties#L21

Change INFO to DEBUG

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/03/2012, at 10:12 PM, Cyril Scetbon wrote:


On 3/12/12 9:50 AM, aaron morton wrote:
It may be the case that the joining node does not have enough 
information. But there is a default 30 second delay while the 
node waits for the ring information to stabilise.


What version are you using ?

1.0.7


Next time you add a new node, can you try it with logging set to 
DEBUG? If you get the error, please add it to 
https://issues.apache.org/jira/browse/CASSANDRA with the relevant 
logs.
Where do I have to add it? I added it to cassandra-env.sh and 
got a lot of output, but are you saying that I must add it to the 
join command? If yes, how?


after the join command fails, as you saw, I have the ring 
information after that. I don't know if it took 30 seconds or not ...


Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



--
Cyril SCETBON








Adding nodes to cluster (Cassandra 1.0.8)

2012-03-14 Thread Rishabh Agrawal
I was able to successfully join a node to an already existing one-node cluster 
(without giving any initial_token), but when I add another machine with 
identical settings (with changes to the listen, broadcast and rpc addresses), I am 
unable to join it to the cluster and it gives me the following error:

 INFO 17:50:35,555 JOINING: schema complete, ready to bootstrap
 INFO 17:50:35,556 JOINING: getting bootstrap token
ERROR 17:50:35,557 Exception encountered during startup
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap.If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed.  Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster.  Usually, this can be solved by giving all nodes the same seed list.
        at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168)
        at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150)
        at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395)
        at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234)
        at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap.If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed.  Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster.  Usually, this can be solved by giving all nodes the same seed list.
        at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168)
        at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150)
        at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395)
        at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234)
        at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
Exception encountered during startup: No other nodes seen!  Unable to bootstrap.If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed.  Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster.  Usually, this can be solved by giving all nodes the same seed list.
 INFO 17:50:35,571 Waiting for messaging service to quiesce
 INFO 17:50:35,571 MessagingService shutting down server thread.


Now when I put some integer value in initial_token, Cassandra starts 
working but is not able to connect to the main cluster, which became evident 
from the command Nodetool -h <ip of this node> ring. It displayed itself with 
100% ownership.


Kindly help me with it asap.

Regards
Rishabh




Impetus to sponsor and exhibit at Structure Data 2012, NY; Mar 21-22. Know more 
about our Big Data quick-start program at the event.

New Impetus webcast 'Cloud-enabled Performance Testing vis-à-vis On-premise' 
available at http://bit.ly/z6zT4L.


NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.


Re: upgrade from 1.0.7 to 1.0.8

2012-03-14 Thread Alain RODRIGUEZ
Did everything go well with your upgrade?

Is version 1.0.8 worth the risk of an upgrade?

Alain

2012/3/11 Tamar Fraenkel ta...@tok-media.com

 Thanks! will check it in the following days :)
 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Sun, Mar 11, 2012 at 10:00 AM, Marcel Steinbach mstei...@gmail.com wrote:

 Check this out:


 http://www.datastax.com/docs/1.0/install/upgrading#upgrading-between-minor-releases-of-cassandra-1-0-x

 Cheers

 Am 11.03.2012 um 07:42 schrieb Tamar Fraenkel ta...@tok-media.com:

 Hi!

 I want to experiment with upgrading. Does anyone have a good link on how
 to upgrade Cassandra?

 Thanks,

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956






Re: Adding nodes to cluster (Cassandra 1.0.8)

2012-03-14 Thread Maki Watanabe
Do you use the same storage_port across all 3 nodes?
Can you access the storage_port of the seed node from the last (failed) node?
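For reference, the usual remedy named in the error message below is an
identical seed list on every node. A sketch of the relevant cassandra.yaml
fragment (the address is illustrative):

```yaml
# cassandra.yaml -- should be identical on every node in the cluster
storage_port: 7000
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # the existing node(s); a bootstrapping node should not list itself
          - seeds: "10.0.1.10"
```

From the joining node, something like telnet 10.0.1.10 7000 confirms the
storage port is reachable.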

2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
 I was able to successfully join a node to already existing one-node cluster
 (without giving any intital_token), but when I add another machine with
 identical settings (with changes in listen broadcast and rpc address). I am
 unable to join it to the cluster and it gives me following error:



 [...]


Re: CAn't bootstrap a new node to my cluster

2012-03-14 Thread Cyril Scetbon

I should add that I didn't set the token either, to let the cluster choose one

On 3/14/12 10:11 AM, Cyril Scetbon wrote:

To bootstrap the node I:

- start a new node with auto_bootstrap=true and join-ring=false
- use the join command

and yes, when I use the join command the first time, it says it can't 
find seed nodes, as I wrote in my first email.


The logfiles I sent you have all this, except that the join error does 
not appear in them.





--
Cyril SCETBON



RE: Adding nodes to cluster (Cassandra 1.0.8)

2012-03-14 Thread Rishabh Agrawal
I am using storage port 7000 (default) across all nodes.

-Original Message-
From: Maki Watanabe [mailto:watanabe.m...@gmail.com]
Sent: Wednesday, March 14, 2012 3:42 PM
To: user@cassandra.apache.org
Subject: Re: Adding nodes to cluster (Cassandra 1.0.8)

Do you use the same storage_port across all 3 nodes?
Can you access the storage_port of the seed node from the last (failed) node?

2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
 [...]

Re: upgrade from 1.0.7 to 1.0.8

2012-03-14 Thread Tamar Fraenkel
Haven't got to it yet, will update you when I do,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Wed, Mar 14, 2012 at 11:40 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 Did everything go well with your upgrade?

 Is version 1.0.8 worth the risk of an upgrade?

 Alain


 2012/3/11 Tamar Fraenkel ta...@tok-media.com

 Thanks! will check it in the following days :)
 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Sun, Mar 11, 2012 at 10:00 AM, Marcel Steinbach mstei...@gmail.com wrote:

 Check this out:


 http://www.datastax.com/docs/1.0/install/upgrading#upgrading-between-minor-releases-of-cassandra-1-0-x

 Cheers

 Am 11.03.2012 um 07:42 schrieb Tamar Fraenkel ta...@tok-media.com:

 Hi!

 I want to experiment with upgrading. Does anyone have a good link on how
 to upgrade Cassandra?

 Thanks,

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956







counter columns question

2012-03-14 Thread Tamar Fraenkel
Hi!

My apologies for the naive question, but I am new to Cassandra and even
more to counter columns.

If I have a counter column, what happens if two threads try to
increment its value at the same time for the same key?

Thanks,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956

Spotify at Cassandra Europe. Last tickets remaining.

2012-03-14 Thread Konrad Kennedy
Hi All,

The first ever edition of Cassandra Europe is now only two weeks away. Only
a few tickets remain so if you intend to attend, make sure to book now to
avoid disappointment. We are happy to announce that Spotify have become
sponsors of Cassandra Europe. In addition to Spotify, we are pleased to
bring you talks from the world's top developers and users of Cassandra
including Netflix, Rackspace and Dachis Group. A full list of talks is
available on the website.

You should attend if:
- you're considering moving away from Oracle / MySQL and exploring NoSQL
solutions;
- you're considering using Cassandra;
- you're curious about what it offers and what other people do with it;
- you want to know how it compares to all the other NoSQL offerings and Hadoop;
- you want to know more about how Cassandra works.

In particular, we're encouraging CTOs, CIOs, CEOs, Sys Admins, Developers,
Dev Ops, and Architects to attend.

More information and tickets can be found here: http://cassandra-eu.org/

Your friendly Cassandra Europe team.


cleanup crashing with java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8

2012-03-14 Thread Thomas van Neerijnen
Hi all

I am trying to run a cleanup on a column family and am getting the
following error returned after about 15 seconds. A cleanup on a slightly
smaller column family completes in about 21 minutes. This is on the Apache
packaged version of Cassandra on Ubuntu 11.10, version 1.0.7.

~# nodetool -h localhost cleanup Player PlayerDetail
Error occured during cleanup
java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203)
        at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237)
        at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:984)
        at org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1635)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
        at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
        at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
        at sun.rmi.transport.Transport$1.run(Transport.java:159)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 8
        at org.apache.cassandra.db.compaction.LeveledManifest.add(LeveledManifest.java:298)
        at org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:186)
        at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:141)
        at org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:494)
        at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:234)
at
org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:1006)
at
org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:791)
at
org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
at
org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:241)
at
org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:182)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
... 3 more


Re: LeveledCompaction and/or SnappyCompressor causing memory pressure during repair

2012-03-14 Thread Thomas van Neerijnen
Thanks for the suggestions but I'd already removed the compression when
your message came thru. That alleviated the problem but didn't solve it.
I'm still looking at a few other possible causes, I'll post back if I work
out what's going on, for now I am running rolling repairs to avoid another
outage.

On Sun, Mar 11, 2012 at 6:32 PM, Edward Capriolo edlinuxg...@gmail.comwrote:

 One thing you may want to look at is the meanRowSize from nodetool
 cfstats and your compression block size. In our case the mean
 compacted size is 560 bytes, and a 64KB block size caused CPU spikes and
 a lot of short-lived memory. I have brought my block size down to 16K.
 The resulting tables are not noticeably larger, and there is less memory
 pressure on the young gen. I might try going down to 4K next.

 On Sat, Mar 10, 2012 at 5:38 PM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
  The only downside of compression is that it does cause more memory
  pressure. I can imagine something like repair could compound this,
  since building the Merkle tree would seem to involve decompressing
  every block on disk.
 
  I have been attempting to determine if the block size being larger or
  smaller has any effect on memory pressure.
 
  On Sat, Mar 10, 2012 at 4:50 PM, Peter Schuller
  peter.schul...@infidyne.com wrote:
  However, when I run a repair my CMS usage graph no longer shows sudden
 drops
  but rather gradual slopes and only manages to clear around 300MB each
 GC.
  This seems to occur on 2 other nodes in my cluster around the same
 time, I
  assume this is because they're the replicas (we use 3 replicas). Parnew
  collections look about the same on my graphs with or without repair
 running
  so no trouble there so far as I can tell.
 
  I don't know why leveled/snappy would affect it, but disregarding
  that, I would have been suggesting that you are seeing additional heap
  usage because of long-running repairs retaining sstables and delaying
  their unload/removal (index sampling/bloom filters filling your heap).
  If it really only happens for leveled/snappy however, I don't know
  what that might be caused by.
 
  --
  / Peter Schuller (@scode, http://worldmodscode.wordpress.com)



Re: Adding nodes to cluster (Cassandra 1.0.8)

2012-03-14 Thread Alain RODRIGUEZ
It looks like Cassandra is not able to find your existing cluster.

Do you use the same cluster name on the 2 nodes?

Are you able to reach the existing node on port 7000 from the new node?

Is the first node defined as a seed in the cassandra.yaml of both nodes?

If the new node can't connect to the existing one, this error is expected:
Cassandra tries to create a new cluster, and the node itself isn't defined
as a seed.

If you still can't get it working and are sure of the points above, tell us
what your infrastructure is, how you installed Cassandra, and give us a
copy of the cassandra.yaml for both of your nodes.
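For reference, the relevant cassandra.yaml fragment looks roughly like this (the cluster name and IP address below are placeholders; substitute the first node's real address, and list it on both machines):

```yaml
cluster_name: 'MyCluster'        # must be identical on every node
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # the existing node's address, listed on BOTH nodes
          - seeds: "192.168.1.10"
```

Only listen_address, broadcast_address and rpc_address should differ between the nodes.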

Alain

2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in

 I am using storage port 7000 (default) across all nodes.

 -Original Message-
 From: Maki Watanabe [mailto:watanabe.m...@gmail.com]
 Sent: Wednesday, March 14, 2012 3:42 PM
 To: user@cassandra.apache.org
 Subject: Re: Adding nodes to cluster (Cassandra 1.0.8)

 Do you use the same storage_port across the 3 nodes?
 Can you access the storage_port of the seed node from the last (failed)
 node?

 2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
  I was able to successfully join a node to already existing one-node
  cluster (without giving any intital_token), but when I add another
  machine with identical settings (with changes in listen broadcast and
  rpc address). I am unable to join it to the cluster and it gives me
 following error:
 
 
 
  INFO 17:50:35,555 JOINING: schema complete, ready to bootstrap
 
  INFO 17:50:35,556 JOINING: getting bootstrap token
 
  ERROR 17:50:35,557 Exception encountered during startup
 
  java.lang.RuntimeException: No other nodes seen!  Unable to
  bootstrap.If you intended to start a single-node cluster, you should
  make sure your broadcast_address (or listen_address) is listed as a
  seed.  Otherwise, you need to determine why the seed being contacted
  has no knowledge of the rest of the cluster.  Usually, this can be
  solved by giving all nodes the same seed list.
 
  at
  org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.
  java:168)
 
  at
  org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.ja
  va:150)
 
  at
  org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.j
  ava:145)
 
  at
  org.apache.cassandra.service.StorageService.joinTokenRing(StorageServi
  ce.java:565)
 
  at
  org.apache.cassandra.service.StorageService.initServer(StorageService.
  java:484)
 
  at
  org.apache.cassandra.service.StorageService.initServer(StorageService.
  java:395)
 
  at
  org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCas
  sandraDaemon.java:234)
 
  at
  org.apache.cassandra.service.AbstractCassandraDaemon.activate(Abstract
  CassandraDaemon.java:356)
 
  at
  org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:
  107)
 
  java.lang.RuntimeException: No other nodes seen!  Unable to
  bootstrap.If you intended to start a single-node cluster, you should
  make sure your broadcast_address (or listen_address) is listed as a
  seed.  Otherwise, you need to determine why the seed being contacted
  has no knowledge of the rest of the cluster.  Usually, this can be
  solved by giving all nodes the same seed list.
 
  at
  org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.
  java:168)
 
  at
  org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.ja
  va:150)
 
  at
  org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.j
  ava:145)
 
  at
  org.apache.cassandra.service.StorageService.joinTokenRing(StorageServi
  ce.java:565)
 
  at
  org.apache.cassandra.service.StorageService.initServer(StorageService.
  java:484)
 
  at
  org.apache.cassandra.service.StorageService.initServer(StorageService.
  java:395)
 
  at
  org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCas
  sandraDaemon.java:234)
 
  at
  org.apache.cassandra.service.AbstractCassandraDaemon.activate(Abstract
  CassandraDaemon.java:356)
 
  at
  org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:
  107)
 
  Exception encountered during startup: No other nodes seen!  Unable to
  bootstrap.If you intended to start a single-node cluster, you should
  make sure your broadcast_address (or listen_address) is listed as a seed.
  Otherwise, you need to determine why the seed being contacted has no
  knowledge of the rest of the cluster.  Usually, this can be solved by
  giving all nodes the same seed list.
 
  INFO 17:50:35,571 Waiting for messaging service to quiesce
 
  INFO 17:50:35,571 MessagingService shutting down server thread.
 
 
 
 
 
  Now when I put some integer value in initial_token, Cassandra
  starts working but is not able to connect to the main cluster, which
  became evident from the 

Re: cleanup crashing with java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8

2012-03-14 Thread Maki Watanabe
Fixed in 1.0.9, 1.1.0
https://issues.apache.org/jira/browse/CASSANDRA-3989

You'd better avoid running cleanup/scrub/upgradesstables on 1.0.7 if
you can, though it will not corrupt sstables.

2012/3/14 Thomas van Neerijnen t...@bossastudios.com:
 Hi all

 I am trying to run a cleanup on a column family and am getting the following
 error returned after about 15 seconds. A cleanup on a slightly smaller
 column family completes in about 21 minutes. This is on the Apache packaged
 version of Cassandra on Ubuntu 11.10, version 1.0.7.

 ~# nodetool -h localhost cleanup Player PlayerDetail
 Error occured during cleanup

Re: Adding nodes to cluster (Cassandra 1.0.8)

2012-03-14 Thread Maki Watanabe
Try:
  % telnet seed_node 7000
on the 3rd node and see what it says.

maki


2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
 I am using storage port 7000 (default) across all nodes.

 -Original Message-
 From: Maki Watanabe [mailto:watanabe.m...@gmail.com]
 Sent: Wednesday, March 14, 2012 3:42 PM
 To: user@cassandra.apache.org
 Subject: Re: Adding nodes to cluster (Cassandra 1.0.8)

 Do you use the same storage_port across the 3 nodes?
 Can you access the storage_port of the seed node from the last (failed)
 node?

 2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
 Now when I put some integer value in initial_token, Cassandra
 starts working but is not able to connect to the main cluster, which
 became evident from the command nodetool -h <ip of this node> ring: it
 displayed itself with 100% ownership.





 Kindly help me with it asap.



 Regards

 Rishabh




 


Re: counter columns question

2012-03-14 Thread Tamar Fraenkel
Thanks!
*Tamar Fraenkel *
Senior Software Engineer, TOK Media

[image: Inline image 1]

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Wed, Mar 14, 2012 at 1:52 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:

 Here you've got a good explanation of how counters work.

 Video : http://blip.tv/datastax/counters-in-cassandra-5497678
 Slides :
 http://www.datastax.com/wp-content/uploads/2011/07/cassandra_sf_counters.pdf

 I think it will answer your question and help you understanding
 interesting things about counters.

 If you need a short answer: concurrency will be handled nicely and
 you'll have the expected result.
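For intuition, the shard-per-replica idea from those slides can be sketched as a toy model (plain Python; this is an illustration, not Cassandra's actual code): each replica only increments its own shard, so two concurrent increments routed to different replicas never overwrite each other, and a read sums all shards.

```python
from collections import defaultdict

class ToyCounter:
    """Toy model of a partitioned counter: one shard per replica."""
    def __init__(self):
        self.shards = defaultdict(int)  # replica_id -> local count

    def increment(self, replica_id, delta=1):
        # Each replica only ever touches its own shard, so concurrent
        # increments landing on different replicas cannot clobber each other.
        self.shards[replica_id] += delta

    def value(self):
        # A read merges (sums) every replica's shard.
        return sum(self.shards.values())

c = ToyCounter()
c.increment("replica-a")     # thread 1 lands on replica A
c.increment("replica-b")     # thread 2 lands on replica B "at the same time"
c.increment("replica-a", 5)
print(c.value())  # 7 -- every increment is counted exactly once
```

The real implementation has more to it (replication, clock handling per shard), but this is the reason two simultaneous increments for the same key both take effect.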

 Alain

 2012/3/14 Tamar Fraenkel ta...@tok-media.com

 Hi!

 My apologies for the naive question, but I am new to Cassandra and even
 more to counter columns.

 If I have a counter column, what happens if two threads are trying to
 increment it's value at the same time for the same key?

 Thanks,

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956






Re: [Windows] How to configure simple authentication and authorization ?

2012-03-14 Thread Sabbiolina
encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra

from log:

INFO [FlushWriter:1] 2012-03-14 14:32:58,948 Memtable.java (line 283) Completed flushing C:\DataStax\data\data\system\LocationInfo-hc-12-Data.db (80 bytes)
ERROR [main] 2012-03-14 14:32:58,952 AbstractCassandraDaemon.java (line 373) Exception encountered during startup
java.io.FileNotFoundException: conf\.truststore


Where do I put the files?




On Wed, Mar 14, 2012 at 8:43 AM, Maki Watanabe watanabe.m...@gmail.comwrote:

 Have you built and installed SimpleAuthenticator from the source
 repository? It is not included in the binary kit.

 maki

 2012/3/14 Sabbiolina sabbiol...@gmail.com:
  HI. I followed this:
 
 
 
  To set up simple authentication and authorization
  1. Edit cassandra.yaml, setting
  org.apache.cassandra.auth.SimpleAuthenticator as the
  authenticator value. The default value of AllowAllAuthenticator is
  equivalent to no authentication.
  2. Edit access.properties, adding entries for users and their
 permissions to
  read and write to specified
  keyspaces and column families. See access.properties below for details on
  the correct format.
  3. Make sure that users specified in access.properties have corresponding
  entries in passwd.properties.
  See passwd.properties below for details and examples.
  4. After making the required configuration changes, you must specify the
  properties files when starting Cassandra
  with the flags -Dpasswd.properties and -Daccess.properties. For example:
  cd $CASSANDRA_HOME
  sh bin/cassandra -f -Dpasswd.properties=conf/passwd.properties
  -Daccess.properties=conf/access.properties
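  As a rough illustration (the username, password and keyspace name below are
  invented; check the sample files shipped with the Cassandra source for the
  exact format), the two files look something like:

```
# conf/passwd.properties -- one user per line, username=password
jsmith=havebadpass

# conf/access.properties -- grant jsmith read/write on Keyspace1
Keyspace1.<rw>=jsmith
```

  Every user named in access.properties must have a matching entry in
  passwd.properties.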
 
 
  I started the service with the additional parameters, but no result, no
  log, nothing

  I use DataStax 1.0.8 Community Edition on Win 7 64-bit
 
 
  Tnxs
 



Re: Composite keys and range queries

2012-03-14 Thread John Laban
Hmm, now I'm really confused.

 This may be of use to you
http://www.datastax.com/dev/blog/schema-in-cassandra-1-1

This article is what I actually used to come up with my schema here.  In
the "Clustering, composite keys, and more" section they're using a schema
very similar to the way I'm trying to use it.  They define a composite key
with two parts, expecting the first part to be used as the partition key
and the second part to be used for ordering.

 The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may
be 1 .

Why?  Shouldn't only uuid-1 be used as the partition key?  (So shouldn't
those two hash to the same location?)

I'm thinking of using supercolumns for this instead as I know they'll work
(where the row key is the uuid and the supercolumn name is the priority),
but aren't composite row keys supposed to essentially replace the need for
supercolumns?

Thanks, and sorry if I'm getting this all wrong,
John



On Wed, Mar 14, 2012 at 12:52 AM, aaron morton aa...@thelastpickle.comwrote:

 You are seeing this http://wiki.apache.org/cassandra/FAQ#range_rp

 The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be
 1.

 You cannot do what you want to. Even if you passed a start of
 (uuid1, empty) and no finish, you would not get only rows where the key
 starts with uuid1.

 This may be of use to you
 http://www.datastax.com/dev/blog/schema-in-cassandra-1-1

 Or you can store all the priorities that are valid for an ID in another
 row.

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 14/03/2012, at 1:05 PM, John Laban wrote:

  Forwarding to the Cassandra mailing list as well, in case this is more
 of an issue on how I'm using Cassandra.
 
  Am I correct to assume that I can use range queries on composite row
 keys, even when using a RandomPartitioner, if I make sure that the first
 part of the composite key is fixed?
 
  Any help would be appreciated,
  John
 
 
 
  On Tue, Mar 13, 2012 at 12:15 PM, John Laban j...@pagerduty.com wrote:
  Hi,
 
  I have a column family that uses a composite key:
 
  (ID, priority) - ...
 
  Where the ID is a UUID and the priority is an integer.
 
  I'm trying to perform a range query now:  I want all the rows where the
 ID matches some fixed UUID, but within a range of priorities.  This is
 supported even if I'm using a RandomPartitioner, right?  (Because the first
 key in the composite key is the partition key, and the second part of the
 composite key is automatically ordered?)
 
  So I perform a range slices query:
 
  val rangeQuery = HFactory.createRangeSlicesQuery(keyspace, new
 CompositeSerializer, StringSerializer.get, BytesArraySerializer.get)
  rangeQuery.setColumnFamily(RouteColumnFamilyName).
  setKeys( new Composite(id, priorityStart), new Composite(id,
 priorityEnd) ).
  setRange( null, null, false, Int.MaxValue )
 
 
  But I get this error:
 
  me.prettyprint.hector.api.exceptions.HInvalidRequestException:
 InvalidRequestException(why:start key's md5 sorts after end key's md5.
  this is not allowed; you probably should not specify end key at all, under
 RandomPartitioner)
 
  Shouldn't they have the same md5, since they have the same partition key?
 
  Am I using the wrong query here, or does Hector not support composte
 range queries, or am I making some mistake in how I think Cassandra's
 composite keys work?
 
  Thanks,
  John
 
 




Large hints column family

2012-03-14 Thread Bryce Godfrey
The system HintsColumnFamily seems large in my cluster, and I want to track 
down why that is.  I tried invoking listEndpointsPendingHints() on 
o.a.c.db.HintedHandoffManager, but it never returns and also freezes the node 
it's invoked against.  It's a 3-node cluster, and all nodes have been up 
and running without issue for a while.  Any help on where to start with this?

   Column Family: HintsColumnFamily
SSTable count: 11
Space used (live): 11271669539
Space used (total): 11271669539
Number of Keys (estimate): 1408
Memtable Columns Count: 338
Memtable Data Size: 0
Memtable Switch Count: 1
Read Count: 3
Read Latency: 4354.669 ms.
Write Count: 848
Write Latency: 0.029 ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.0
Bloom Filter Space Used: 12656
Key cache capacity: 14
Key cache size: 11
Key cache hit rate: 0.
Row cache: disabled
Compacted row minimum size: 105779
Compacted row maximum size: 7152383774
Compacted row mean size: 590818614

Thanks,
Bryce


RE: Large hints column family

2012-03-14 Thread Bryce Godfrey
Forgot to mention that this is on 1.0.8

From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
Sent: Wednesday, March 14, 2012 12:34 PM
To: user@cassandra.apache.org
Subject: Large hints column family



RE: Composite keys and range queries

2012-03-14 Thread Jeremiah Jordan
Right, so until the new CQL stuff exists to actually query with something smart 
enough to know about composite keys, you have to define and query on your 
own.

Row Key = UUID
Column = CompositeColumn(string, string)

You then want to use COLUMN slicing, not row ranges, to query the data, 
slicing on priority as the first part of a composite column name.

See the "Under the hood and historical notes" section of the blog post. You 
want to lay out your data per the "Physical representation of the denormalized 
timeline rows" diagram, where your UUID is the user_id from the example and 
your priority is the tweet_id.
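A toy sketch of that layout (plain Python to illustrate the idea; `rows` and `slice_by_priority` are invented names, not a Cassandra/Hector API): because all priorities for one UUID live in a single sorted row, a "range of priorities" query becomes a contiguous column slice within that row.

```python
import bisect

# One physical row per UUID; columns named by a (priority, field) composite,
# kept sorted by name -- mirroring how Cassandra sorts composite columns.
rows = {
    "uuid-1": [
        ((1, "route"), "a"),
        ((2, "route"), "b"),
        ((5, "route"), "c"),
        ((9, "route"), "d"),
    ]
}

def slice_by_priority(row_key, lo, hi):
    # A "priority between lo and hi" query is a contiguous column slice
    # inside one row, so no row-range query (and hence no order-preserving
    # partitioner) is needed.
    cols = rows[row_key]
    names = [name for name, _ in cols]
    start = bisect.bisect_left(names, (lo,))               # first name >= (lo,)
    end = bisect.bisect_right(names, (hi, chr(0x10FFFF)))  # past last (hi, *)
    return cols[start:end]

print(slice_by_priority("uuid-1", 2, 5))
# -> [((2, 'route'), 'b'), ((5, 'route'), 'c')]
```

This is exactly why the row-range attempt fails under RandomPartitioner while a column slice on the same data works.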

-Jeremiah



From: John Laban [j...@pagerduty.com]
Sent: Wednesday, March 14, 2012 12:37 PM
To: user@cassandra.apache.org
Subject: Re: Composite keys and range queries






Re: snapshot files locked

2012-03-14 Thread Jim Newsham


Hi Maki,

Thanks for the reply.  Yes, I understand that snapshots are hard links.  
However, my understanding is that removing any hard-linked files just 
removes the link (decrementing the link counter of the file on disk) -- 
it does not delete the file itself nor remove any other links which may 
be pointing at the file.  To confirm my understanding, I tested this in 
Windows by terminating Cassandra and then deleting all files in the 
snapshot dir.  None of the corresponding files in the parent keyspace 
directory were removed.
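On a POSIX filesystem the behavior described above is easy to confirm (a sketch; the Windows-specific difference is only that an open file's links cannot be deleted there):

```python
import os
import tempfile

# Create a file and a hard link to it, then delete the link:
# only the link count drops; the original file and its data survive.
d = tempfile.mkdtemp()
orig = os.path.join(d, "orig.db")
link = os.path.join(d, "snapshot.db")

with open(orig, "w") as f:
    f.write("sstable data")

os.link(orig, link)                 # snapshot-style hard link
assert os.stat(orig).st_nlink == 2

os.remove(link)                     # delete the "snapshot"
assert os.path.exists(orig)         # original is untouched
assert os.stat(orig).st_nlink == 1
```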


Regards,
Jim

On 3/13/2012 9:29 PM, Maki Watanabe wrote:

snapshot files are hardlinks of the original sstables.
As you know, on windows, you can't delete files opened by other process.
If you try to delete the hardlink, windows thinks you try to delete
the sstables in production.

maki

2012/3/14 Jim Newsham jnews...@referentia.com:

Hi,

I'm using Cassandra 1.0.8, on Windows 7.  When I take a snapshot of the
database, I find that I am unable to delete the snapshot directory (i.e.,
dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while
Cassandra is running:  The action can't be completed because the folder or
a file in it is open in another program.  Close the folder or file and try
again.  If I terminate Cassandra, then I can delete the directory with no
problem.  Is there a reason why Cassandra must hold onto these files?

Thanks,
Jim





Re: Datastax Enterprise mixed workload cluster configuration

2012-03-14 Thread Tyler Hobbs
Yes, you can do this.

You will want to have three DCs: DC1 with [1, 2, 3], DC2 with [4, 5, 6],
and DC3 with [7, 8, 9].  For your normal data keyspace, the replication
strategy should be NTS, and the strategy_options should have some replicas
in each of the three DCs.  For example: {DC1: 3, DC2: 3, DC3: 3} if you
need that level of replication in each one (although you probably only want
an RF of 1 for DC3).

Your clients that are performing writes should only open connections
against the nodes in DC1, and you should write at CL.ONE or
CL.LOCAL_QUORUM.  Likewise for reads, your clients should only connect to
nodes in DC2, and you should read at CL.ONE or CL.LOCAL_QUORUM.

The nodes in DC3 should run as analytics nodes.  I believe the default CL
for m/r jobs is ONE, which would work.

As far as tokens go, interleaving all three DCs and evenly spacing the
tokens will work.  For example, the ordering of your nodes might be [1, 4,
7, 2, 5, 8, 3, 6, 9].
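As a sketch of that layout (hypothetical node numbering; evenly spaced RandomPartitioner tokens over 0..2^127, assigned in interleaved DC order):

```python
# Nine evenly spaced tokens over the RandomPartitioner range [0, 2**127).
tokens = [i * (2**127 // 9) for i in range(9)]

# Interleave the three DCs around the ring: DC1 = nodes 1-3 (write side),
# DC2 = nodes 4-6 (read side), DC3 = nodes 7-9 (analytics).
ring_order = [1, 4, 7, 2, 5, 8, 3, 6, 9]
assignment = dict(zip(ring_order, tokens))

assert assignment[1] == 0               # first node takes token 0
assert len(set(tokens)) == 9            # all tokens distinct
assert tokens == sorted(tokens)         # evenly increasing around the ring
```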

On Wed, Mar 14, 2012 at 12:05 PM, Alexandru Sicoe adsi...@gmail.com wrote:

 Hi everyone,
  I want to test out the Datastax Enterprise software to have a mixed
 workload setup with an analytics and a real time part.

  However I am not sure how to configure it to achieve what I want: I will
 have 3 real machines on one side of a gateway (1,2,3) and 6 VMs on
 another (4-9).
  1,2,3 will each have a normal Cassandra node that just takes data
 directly from my data sources. I want them to replicate the data to the
 other 6 VMs. Now, out of those 6 VMs 4,5,6 will run normal Cassandra nodes
 and 7,8,9 will run Analytics nodes. So I only want to write to the 1,2,3
 and I only want to serve user reads from 4,5,6 and do analytics on 7,8,9.
 Can I achieve this by configuring 1,2,3,4,5,6 as normal nodes and the rest
 as analytics nodes? If I alternate the tokens as it's explained in
  http://www.datastax.com/docs/1.0/datastax_enterprise/init_dse_cluster#init-dse
  is it analogous to achieving something like 3 DCs each getting their own
  replica?

 Thanks,
 Alex




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Does the 'batch' order matter ?

2012-03-14 Thread Tyler Hobbs
On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote:


 Are you saying the way 'batch mutate' is coded, the order of writes in
 the batch does not mean anything ? You can ask the batch to do A,B,C
 and then D in sequence; but sometimes Cassandra can end up applying
 just C and A,B (and D) may still not be applied ?


No, batch_mutate() is an atomic operation.  When a node locally applies a
batch mutation, either all of the changes are applied or none of them are.

Aaron was referring to the possibility that one of the replicas received
the batch_mutate, but the other replicas did not.

-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: snapshot files locked

2012-03-14 Thread Tyler Hobbs
Can you open a ticket on
https://issues.apache.org/jira/browse/CASSANDRA for this?

On Wed, Mar 14, 2012 at 3:04 PM, Jim Newsham jnews...@referentia.com wrote:


 Hi Maki,

 Thanks for the reply.  Yes, I understand that snapshots are hard links.
  However, my understanding is that removing any hard-linked files just
 removes the link (decrementing the link counter of the file on disk) -- it
 does not delete the file itself nor remove any other links which may be
 pointing at the file.  To confirm my understanding, I tested this in
 Windows by terminating Cassandra and then deleting all files in the
 snapshot dir.  None of the corresponding files in the parent keyspace
 directory were removed.

 Regards,
 Jim


 On 3/13/2012 9:29 PM, Maki Watanabe wrote:

 snapshot files are hardlinks of the original sstables.
 As you know, on windows, you can't delete files opened by other process.
 If you try to delete the hardlink, windows thinks you try to delete
 the sstables in production.

 maki

 2012/3/14 Jim Newsham jnewsham@referentia.com:

 Hi,

 I'm using Cassandra 1.0.8, on Windows 7.  When I take a snapshot of the
 database, I find that I am unable to delete the snapshot directory (i.e.,
 dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while
 Cassandra is running:  The action can't be completed because the folder
 or
 a file in it is open in another program.  Close the folder or file and
 try
 again.  If I terminate Cassandra, then I can delete the directory with
 no
 problem.  Is there a reason why Cassandra must hold onto these files?

 Thanks,
 Jim





-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Question on ByteOrdered rebalancing

2012-03-14 Thread work late
Are the bytes the key I use, or something else?
Maybe you answered the second question and I don't get it, but what would I
pass to the move command when I want to rebalance the cluster?

Thanks

On Tue, Mar 13, 2012 at 11:34 AM, Tyler Hobbs ty...@datastax.com wrote:

 The tokens are hex encoded arrays of bytes.


 On Tue, Mar 13, 2012 at 1:05 PM, work late worklate1...@gmail.com wrote:

 The ring command on nodetool shows as

 Address   DC           Rack   Status  State   Load     Owns     Token
                                                                 Token(bytes[88401b216270ab8ebb690946b0b70eab])
 10.1.1.1  datacenter1  rack1  Up      Normal  69.1 KB  50.00%   Token(bytes[4936c862b88db2bdd92d684583bf0280])
 10.1.1.2  datacenter1  rack1  Up      Normal  69.1 KB  50.00%   Token(bytes[88401b216270ab8ebb690946b0b70eab])


 The token looks like an MD5 value, is that correct? So when rebalancing
  the cluster, what token value am I supposed to give the move command
  (with RP it is a token between 0 and 2^127); what should I use with BOP?


 Thanks




 --
 Tyler Hobbs
 DataStax http://datastax.com/




Re: Composite keys and range queries

2012-03-14 Thread John Laban
Ahhh, ok, I thought that CQL was just being brought up to date with
the functionality already built into composite keys, but I guess I was
mistaken there.

But I guess it's just providing a convenient abstraction, using composite
column names under the hood.  That's where I was confused, thanks.

So, in terms of composite column names vs supercolumns:  is the only
advantage to composite column names that you can do column slicing on
subsets of the subcolumns? I.e. if I don't mind loading all of the
subcolumns for a given supercolumn name in memory at once (since I need
them all anyway), is there any disadvantage to using supercolumns here?
 They seem a little cleaner and more straightforward for my use case, since
I don't have the advantage of the CQL composite key thing.

Thanks,
John


On Wed, Mar 14, 2012 at 12:53 PM, Jeremiah Jordan 
jeremiah.jor...@morningstar.com wrote:

   Right, so until the new CQL stuff exists to actually query with
  something smart enough to know about composite keys, you have to define
  and query them on your own.

 Row Key = UUID
 Column = CompositeColumn(string, string)

  You want to then use COLUMN slicing, not row ranges, to query the data,
  where you slice on priority as the first part of a composite column name.

 See the Under the hood and historical notes section of the blog post.
 You want to layout your data per the Physical representation of the
 denormalized timeline rows diagram.
 Where your UUID is the user_id from the example, and your priority is
 the tweet_id
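Under the hood, each composite component is length-prefixed and followed by an end-of-component byte, which is what makes prefix slices work. A sketch of the packing (simplified; the EOC convention follows Cassandra's CompositeType, where 0 marks an exact component and 1/-1 mark slice bounds):

```python
import struct

def pack_component(value: bytes, eoc: int = 0) -> bytes:
    # CompositeType wire format per component:
    #   2-byte big-endian length, the raw bytes, 1 end-of-component byte.
    return struct.pack(">H", len(value)) + value + struct.pack(">b", eoc)

# Column names are (priority, detail); to slice every column whose first
# component equals b"p1" regardless of the second component:
start  = pack_component(b"p1", 0)    # at or before the first "p1" column
finish = pack_component(b"p1", 1)    # after the last "p1" column

assert pack_component(b"p1") == b"\x00\x02p1\x00"
assert start < finish                # bytewise order brackets the prefix
```

In practice a client like pycassa or Hector builds these bounds for you when you hand it composite slice parameters; the sketch only shows why a prefix slice is cheap.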

 -Jeremiah


  --
 *From:* John Laban [j...@pagerduty.com]
 *Sent:* Wednesday, March 14, 2012 12:37 PM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Composite keys and range queries

   Hmm, now I'm really confused.

   This may be of use to you
 http://www.datastax.com/dev/blog/schema-in-cassandra-1-1

  This article is what I actually used to come up with my schema here.  In
 the Clustering, composite keys, and more section they're using a schema
  very similar to how I'm trying to use it.  They define a composite key
 with two parts, expecting the first part to be used as the partition key
 and the second part to be used for ordering.

   The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2)
 may be 1 .

  Why?  Shouldn't only uuid-1 be used as the partition key?  (So
 shouldn't those two hash to the same location?)

  I'm thinking of using supercolumns for this instead as I know they'll
 work (where the row key is the uuid and the supercolumn name is the
 priority), but aren't composite row keys supposed to essentially replace
 the need for supercolumns?

  Thanks, and sorry if I'm getting this all wrong,
 John



  On Wed, Mar 14, 2012 at 12:52 AM, aaron morton aa...@thelastpickle.com wrote:

 You are seeing this http://wiki.apache.org/cassandra/FAQ#range_rp

 The hash for (uuid-1 , p1) may be 100 and the hash for (uuid-1, p2) may
 be 1 .

 You cannot do what you want to. Even if you passed a start of
  (uuid1, empty) and no finish, you would not get only rows where the key
  starts with uuid1.

 This may be of use to you
 http://www.datastax.com/dev/blog/schema-in-cassandra-1-1

 Or you can store all the priorities that are valid for an ID in another
 row.

 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 14/03/2012, at 1:05 PM, John Laban wrote:

  Forwarding to the Cassandra mailing list as well, in case this is more
 of an issue on how I'm using Cassandra.
 
  Am I correct to assume that I can use range queries on composite row
 keys, even when using a RandomPartitioner, if I make sure that the first
 part of the composite key is fixed?
 
  Any help would be appreciated,
  John
 
 
 
  On Tue, Mar 13, 2012 at 12:15 PM, John Laban j...@pagerduty.com
 wrote:
  Hi,
 
  I have a column family that uses a composite key:
 
  (ID, priority) - ...
 
  Where the ID is a UUID and the priority is an integer.
 
  I'm trying to perform a range query now:  I want all the rows where the
 ID matches some fixed UUID, but within a range of priorities.  This is
 supported even if I'm using a RandomPartitioner, right?  (Because the first
 key in the composite key is the partition key, and the second part of the
 composite key is automatically ordered?)
 
  So I perform a range slices query:
 
  val rangeQuery = HFactory.createRangeSlicesQuery(keyspace, new
 CompositeSerializer, StringSerializer.get, BytesArraySerializer.get)
  rangeQuery.setColumnFamily(RouteColumnFamilyName).
  setKeys( new Composite(id, priorityStart), new
 Composite(id, priorityEnd) ).
  setRange( null, null, false, Int.MaxValue )
 
 
  But I get this error:
 
  me.prettyprint.hector.api.exceptions.HInvalidRequestException:
 InvalidRequestException(why:start key's md5 sorts after end key's md5.
  this is not allowed; you probably should not specify end key at all, under
 RandomPartitioner)
 
  Shouldn't they have the same md5, since they have the same partition key?

Re: Does the 'batch' order matter ?

2012-03-14 Thread A J
 No, batch_mutate() is an atomic operation.  When a node locally applies a 
 batch mutation, either all of the changes are applied or none of them are.
The steps in my batch are not confined to a single CF, nor to a single key.

The documentation says:
datastax:
Column updates are only considered atomic within a given record (row).

Pycassa.batch:
This interface does not implement atomic operations across column
families. All the limitations of the batch_mutate Thrift API call
applies. Remember, a mutation in Cassandra is always atomic per key
per column family only.


On Wed, Mar 14, 2012 at 4:15 PM, Tyler Hobbs ty...@datastax.com wrote:
 On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote:


 Are you saying the way 'batch mutate' is coded, the order of writes in
 the batch does not mean anything ? You can ask the batch to do A,B,C
 and then D in sequence; but sometimes Cassandra can end up applying
 just C and A,B (and D) may still not be applied ?


 No, batch_mutate() is an atomic operation.  When a node locally applies a
 batch mutation, either all of the changes are applied or none of them are.

 Aaron was referring to the possibility that one of the replicas received the
 batch_mutate, but the other replicas did not.

 --
 Tyler Hobbs
 DataStax



Re: snapshot files locked

2012-03-14 Thread Watanabe Maki
It's true on Unix, but you can't delete a hard link to an open file on Windows.
Try the following:
1. Create a text file ORIG.TXT
2. MKLINK /D LINK.TXT ORIG.TXT
   Now you have the hard link LINK.TXT.
3. Open ORIG.TXT with MS Word.
4. DEL LINK.TXT
   It returns an error.

Interesting thing: you can delete LINK.TXT if you open ORIG.TXT with Notepad.
Notepad may close the file after loading, or it may be using a different API.

On 2012/03/15, at 5:04, Jim Newsham jnews...@referentia.com wrote:

 
 Hi Maki,
 
 Thanks for the reply.  Yes, I understand that snapshots are hard links.  
 However, my understanding is that removing any hard-linked files just removes 
 the link (decrementing the link counter of the file on disk) -- it does not 
 delete the file itself nor remove any other links which may be pointing at 
 the file.  To confirm my understanding, I tested this in Windows by 
 terminating Cassandra and then deleting all files in the snapshot dir.  None 
 of the corresponding files in the parent keyspace directory were removed.
 
 Regards,
 Jim
 
 On 3/13/2012 9:29 PM, Maki Watanabe wrote:
 snapshot files are hardlinks of the original sstables.
 As you know, on windows, you can't delete files opened by other process.
 If you try to delete the hardlink, windows thinks you try to delete
 the sstables in production.
 
 maki
 
 2012/3/14 Jim Newsham jnews...@referentia.com:
 Hi,
 
 I'm using Cassandra 1.0.8, on Windows 7.  When I take a snapshot of the
 database, I find that I am unable to delete the snapshot directory (i.e.,
 dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while
 Cassandra is running:  The action can't be completed because the folder or
 a file in it is open in another program.  Close the folder or file and try
 again.  If I terminate Cassandra, then I can delete the directory with no
 problem.  Is there a reason why Cassandra must hold onto these files?
 
 Thanks,
 Jim
 
 


Re: snapshot files locked

2012-03-14 Thread Watanabe Maki
Typo

 2. MKLINK /D LINK.TXT ORIG.TXT

MKLINK /H


On 2012/03/15, at 9:04, Watanabe Maki watanabe.m...@gmail.com wrote:

 It's true on unix, but you can't delete hard links of opened files on windows.
 Try following:
 1  Create a text file ORIG.TXT
 2. MKLINK /D LINK.TXT ORIG.TXT
  Now you have hard link LINK.TXT .
 3  Open ORIG.TXT with MS Word.
 4. DEL LINK.TXT
It returns error.
 
 Interesting thing is that you can delete link.txt if you open orig.txt with 
 notepad. 
 Notepad may close file after load, or it may be using different API.
 
 On 2012/03/15, at 5:04, Jim Newsham jnews...@referentia.com wrote:
 
 
 Hi Maki,
 
 Thanks for the reply.  Yes, I understand that snapshots are hard links.  
 However, my understanding is that removing any hard-linked files just 
 removes the link (decrementing the link counter of the file on disk) -- it 
 does not delete the file itself nor remove any other links which may be 
 pointing at the file.  To confirm my understanding, I tested this in Windows 
 by terminating Cassandra and then deleting all files in the snapshot dir.  
 None of the corresponding files in the parent keyspace directory were 
 removed.
 
 Regards,
 Jim
 
 On 3/13/2012 9:29 PM, Maki Watanabe wrote:
 snapshot files are hardlinks of the original sstables.
 As you know, on windows, you can't delete files opened by other process.
 If you try to delete the hardlink, windows thinks you try to delete
 the sstables in production.
 
 maki
 
 2012/3/14 Jim Newsham jnews...@referentia.com:
 Hi,
 
 I'm using Cassandra 1.0.8, on Windows 7.  When I take a snapshot of the
 database, I find that I am unable to delete the snapshot directory (i.e.,
 dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while
 Cassandra is running:  The action can't be completed because the folder or
 a file in it is open in another program.  Close the folder or file and try
 again.  If I terminate Cassandra, then I can delete the directory with no
 problem.  Is there a reason why Cassandra must hold onto these files?
 
 Thanks,
 Jim
 
 


Re: Does the 'batch' order matter ?

2012-03-14 Thread Tyler Hobbs
Ah, my mistake, you are correct. Not sure why I had forgotten that.

The pycassa docs are slightly wrong there, though.  It's technically atomic
for the same key across multiple column families.  I'll get that fixed.

On Wed, Mar 14, 2012 at 5:22 PM, A J s5a...@gmail.com wrote:

  No, batch_mutate() is an atomic operation.  When a node locally applies
 a batch mutation, either all of the changes are applied or none of them
 are.
 The steps in my batch are not confined to a single CF, nor to a single key.

 The documentation says:
 datastax:
 Column updates are only considered atomic within a given record (row).

 Pycassa.batch:
 This interface does not implement atomic operations across column
 families. All the limitations of the batch_mutate Thrift API call
 applies. Remember, a mutation in Cassandra is always atomic per key
 per column family only.


 On Wed, Mar 14, 2012 at 4:15 PM, Tyler Hobbs ty...@datastax.com wrote:
  On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote:
 
 
  Are you saying the way 'batch mutate' is coded, the order of writes in
  the batch does not mean anything ? You can ask the batch to do A,B,C
  and then D in sequence; but sometimes Cassandra can end up applying
  just C and A,B (and D) may still not be applied ?
 
 
  No, batch_mutate() is an atomic operation.  When a node locally applies a
  batch mutation, either all of the changes are applied or none of them
 are.
 
  Aaron was referring to the possibility that one of the replicas received
 the
  batch_mutate, but the other replicas did not.
 
  --
  Tyler Hobbs
  DataStax
 




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Question on ByteOrdered rebalancing

2012-03-14 Thread Tyler Hobbs
On Wed, Mar 14, 2012 at 4:17 PM, work late worklate1...@gmail.com wrote:

 Are the bytes the key I use, or something else?


Yes.  Partitioners split up the space of possible keys based on the tokens
of the nodes.  For BOP, a node's primary range will include all rows with
keys that are less than its token and greater than the token of the node
with the next smallest token.

So, if you have two nodes with tokens a and b, the node with token b
will be the primary replica for all rows whose key falls between a and
b (essentially, all keys that start with a).

With BOP, the ideal set of tokens would split a sorted list all of the keys
in your data into equal partitions.  You have to know the distribution of
your keys to do this properly, and the correct choice of tokens will change
over time as your data does.  This is why we typically strongly advise
against using the ordered partitioner.

When you're doing a move with BOP using nodetool, you specify the new token
in hex.
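A sketch of that routing rule and of choosing a move token (toy byte-string tokens, purely for illustration; BOP compares raw key bytes, and nodetool move takes the token hex-encoded):

```python
import binascii
import bisect

# Node tokens for a byte-ordered ring, kept sorted; a key is owned by the
# first node whose token is >= the key, wrapping around to the first node.
node_tokens = sorted([b"g", b"n", b"t"])

def primary_for(key: bytes) -> bytes:
    i = bisect.bisect_left(node_tokens, key)
    return node_tokens[i % len(node_tokens)]

assert primary_for(b"alpha") == b"g"
assert primary_for(b"hello") == b"n"
assert primary_for(b"zebra") == b"g"     # wraps around the ring

# To rebalance, pick a key that splits your sorted data evenly and hand
# its hex encoding to nodetool move:
new_token = binascii.hexlify(b"m").decode()
assert new_token == "6d"
```

The hard part, as noted above, is choosing the splitting key: it depends entirely on the distribution of your data, not on a fixed token space as with RP.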


  Maybe you answered the second question and I don't get it, but what would I
  pass to the move command when I want to rebalance the cluster?

 Thanks


  On Tue, Mar 13, 2012 at 11:34 AM, Tyler Hobbs ty...@datastax.com wrote:

 The tokens are hex encoded arrays of bytes.


  On Tue, Mar 13, 2012 at 1:05 PM, work late worklate1...@gmail.com wrote:

 The ring command on nodetool shows as

  Address   DC           Rack   Status  State   Load     Owns     Token
                                                                  Token(bytes[88401b216270ab8ebb690946b0b70eab])
  10.1.1.1  datacenter1  rack1  Up      Normal  69.1 KB  50.00%   Token(bytes[4936c862b88db2bdd92d684583bf0280])
  10.1.1.2  datacenter1  rack1  Up      Normal  69.1 KB  50.00%   Token(bytes[88401b216270ab8ebb690946b0b70eab])


  The token looks like an MD5 value, is that correct? So when rebalancing
  the cluster, what token value am I supposed to give the move command
  (with RP it is a token between 0 and 2^127); what should I use with BOP?


 Thanks




 --
 Tyler Hobbs
 DataStax http://datastax.com/





-- 
Tyler Hobbs
DataStax http://datastax.com/