Re: snapshot files locked
Snapshot files are hardlinks of the original sstables. As you know, on Windows you can't delete files opened by another process, and because the hardlink and the live sstable are the same underlying file, deleting the hardlink looks to Windows like deleting the sstable in production.

maki

2012/3/14 Jim Newsham jnews...@referentia.com:
Hi, I'm using Cassandra 1.0.8 on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., the dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: "The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again." If I terminate Cassandra, then I can delete the directory with no problem. Is there a reason why Cassandra must hold onto these files?

Thanks, Jim
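Maki's point about hardlinks can be demonstrated in a few lines of Python (POSIX semantics; the filenames are made up for the demo). On Windows, the final delete is exactly what fails while Cassandra still has the file open:

```python
import os
import tempfile

# A hardlink is just a second directory entry pointing at the same inode,
# so taking a "snapshot" this way copies no data.
workdir = tempfile.mkdtemp()
sstable = os.path.join(workdir, "data.db")        # stands in for a live sstable
snapshot = os.path.join(workdir, "snapshot.db")   # stands in for the snapshot link

with open(sstable, "w") as f:
    f.write("sstable contents")

os.link(sstable, snapshot)                 # take the "snapshot": no data copied
assert os.stat(sstable).st_nlink == 2      # both names share one inode

os.remove(snapshot)                        # on POSIX this succeeds even if the
                                           # file is open elsewhere...
with open(sstable) as f:
    assert f.read() == "sstable contents"  # ...and the original is untouched
```

On Windows the `os.remove` call is refused while any process holds the shared file open, which is the behaviour Jim is seeing.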
Re: Row iteration over indexed clause
If you want 100 results per call, ask for 101. Use the first 100; when you get to the 101st, do not examine its data. Instead, use its key as the start to get the next 101.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 2:03 AM, Vivek Mishra wrote:
Thanks.

Attribute   | Type                  | Default | Required | Description
expressions | list<IndexExpression> | n/a     | Y        | The list of IndexExpression objects, which must contain one EQ IndexOperator among the expressions
start_key   | binary                | n/a     | Y        | Start the index query at the specified key - can be set to '', i.e. an empty byte array, to start with the first key
count       | integer               | 100     | Y        | The number of results to which the index query will be constrained

How do I iterate using it? How do I ensure that it does not return me previous results (without needing to keep something in memory)? This is the method I am looking into:

get_indexed_slices(ColumnParent column_parent, IndexClause index_clause, SlicePredicate column_predicate, ConsistencyLevel consistency_level)

It does not have anything like count.

Thanks, Vivek

On Tue, Mar 13, 2012 at 6:24 PM, Shimi Kiviti shim...@gmail.com wrote:
Yes, use get_indexed_slices (http://wiki.apache.org/cassandra/API).

On Tue, Mar 13, 2012 at 2:12 PM, Vivek Mishra mishra.v...@gmail.com wrote:
Hi, is it possible to iterate and fetch in chunks using the thrift API by querying using secondary indexes?
-Vivek
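Aaron's ask-for-N+1 pattern can be sketched as follows. The `query` function here is a stand-in for a get_indexed_slices call over an ordered key range, not a real Thrift API; the row data and key format are invented for the demo:

```python
# 250 fake rows keyed so that sorting by key gives a stable iteration order.
ROWS = {f"key{i:04d}": {"val": i} for i in range(250)}

def query(start_key, count):
    """Stand-in for an index query: up to `count` rows with key >= start_key."""
    keys = sorted(k for k in ROWS if k >= start_key)[:count]
    return [(k, ROWS[k]) for k in keys]

def iterate_all(page_size=100):
    start = ""                                  # '' means "start at first key"
    while True:
        page = query(start, page_size + 1)      # ask for one extra row
        for key, row in page[:page_size]:       # use only the first page_size
            yield key, row
        if len(page) <= page_size:              # no extra row: we're done
            break
        start = page[page_size][0]              # extra row's key seeds next call

collected = list(iterate_all())
assert len(collected) == 250                    # every row seen exactly once
```

The extra row is re-fetched as the first row of the next page, so no row is skipped and, because only the first `page_size` rows of each page are consumed, none is returned twice.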
Re: Does the 'batch' order matter ?
It may, but it would not be guaranteed.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 8:11 AM, A J wrote:
I know batch operations are not atomic, but does the success of a write imply that all writes preceding it in the batch were successful? For example, using CQL:

BEGIN BATCH USING CONSISTENCY QUORUM AND TTL 864
  INSERT INTO users (KEY, password, name) VALUES ('user2', 'ch@ngem3b', 'second user')
  UPDATE users SET password = 'ps22dhds' WHERE KEY = 'user2'
  INSERT INTO users (KEY, password) VALUES ('user3', 'ch@ngem3c')
  DELETE name FROM users WHERE key = 'user2'
  INSERT INTO users (KEY, password, name) VALUES ('user4', 'ch@ngem3c', 'Andrew')
APPLY BATCH;

Say the batch failed but I see that the third write was present on a node. Does it imply that the first insert and the second update definitely made it to that node as well?

Thanks.
Re: [Windows] How to configure simple authentication and authorization ?
Have you built and installed SimpleAuthenticator from the source repository? It is not included in the binary kit.

maki

2012/3/14 Sabbiolina sabbiol...@gmail.com:
Hi. I followed this:

To set up simple authentication and authorization:
1. Edit cassandra.yaml, setting org.apache.cassandra.auth.SimpleAuthenticator as the authenticator value. The default value of AllowAllAuthenticator is equivalent to no authentication.
2. Edit access.properties, adding entries for users and their permissions to read and write to specified keyspaces and column families. See access.properties below for details on the correct format.
3. Make sure that users specified in access.properties have corresponding entries in passwd.properties. See passwd.properties below for details and examples.
4. After making the required configuration changes, you must specify the properties files when starting Cassandra with the flags -Dpasswd.properties and -Daccess.properties. For example:

cd $CASSANDRA_HOME
sh bin/cassandra -f -Dpasswd.properties=conf/passwd.properties -Daccess.properties=conf/access.properties

I started the services with the additional parameters, but no result, no log, nothing. I use the DataStax 1.0.8 Community Edition on Win 7 64-bit. Thanks
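For reference, both properties files use a simple key=value format. A hedged sketch, modelled on the sample files that shipped in Cassandra source distributions of that era (the user name, password and keyspace name below are made up; check the comments in the shipped files for the authoritative format):

```
# passwd.properties -- one user per line, user=password
jsmith=havebadpass

# access.properties -- who may modify keyspaces, plus per-keyspace access
<modify-keyspaces>=jsmith
Keyspace1.<rw>=jsmith
```

On Windows, the -Dpasswd.properties and -Daccess.properties flags typically need to go into the JVM options in cassandra.bat (or the service wrapper configuration) rather than being passed to a shell script as in the example above.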
Re: Why is row lookup much faster than column lookup
Here is a look at query plans: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

tl;dr - wide rows require an index to be read from disk; the fastest query uses no start and no finish.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/03/2012, at 6:58 AM, Dave Brosius wrote:
Sorry, should have been: Given the hashtable nature of cassandra, finding a row is probably 'relatively' constant no matter how many *rows* you have.

----- Original Message -----
From: Dave Brosius dbros...@mebigfatguy.com
Sent: Tue, March 13, 2012 13:43
Subject: Re: Why is row lookup much faster than column lookup

Given the hashtable nature of cassandra, finding a row is probably 'relatively' constant no matter how many columns you have. The smaller the number of columns, I suppose, the more likely it is that all the columns will be in one sstable. If you've got a ton of columns per row, it is much more likely that these columns will be spread out over multiple sstables. Plus, columns are read in chunks, depending on yaml settings.

----- Original Message -----
From: A J s5a...@gmail.com
Sent: Tue, March 13, 2012 13:35
Subject: Why is row lookup much faster than column lookup

From my tests, I am seeing that a CF that has fewer than 100 columns but millions of rows has a much lower latency to read a column in a row than a CF that has only a few thousand rows but wide rows, each with 20K columns.

Example: cf1 has 6 million rows and each row has about 100 columns.
t1 = time.time()
cf1.get(1234, column_count=1)
t2 = time.time() - t1
print int(t2*1000)
takes 3 ms

cf2 has 5K rows and each row has about 18K columns.
t1 = time.time()
cf2.get(1234, column_count=1)
t2 = time.time() - t1
print int(t2*1000)
takes 82 ms

Is there anything in general in the Cassandra architecture that causes row lookup to be much faster than column lookup?

Thanks.
Re: Can't bootstrap a new node to my cluster
To bootstrap the node I:
- start a new node with auto_bootstrap=true and join-ring=false
- use the join command

And yes, when I use the join command the first time, it says it can't find seed nodes, as I wrote in my first email. The logfiles I sent you have all this, except that the join error does not appear in them.

On 3/13/12 9:36 AM, aaron morton wrote:
Can you provide some context for the log files please. The original error had to do with bootstrapping a new node into a cluster. The log looks like a node is starting with -Dcassandra.join-ring=false and then nodetool join is run. Is there an error when this runs?

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/03/2012, at 11:58 PM, Cyril Scetbon wrote:
I don't know if it helps, but the only thing I see on the cluster's nodes is:
==> /var/log/cassandra/output.log <==
INFO 10:57:28,530 InetAddress /10.0.1.70 is now dead.
when I try to join the node 10.0.1.70 to the cluster.

On 3/12/12 11:27 AM, Cyril Scetbon wrote:
It's done. Nothing new on stderr when I use the join command. I sent you the logfiles after I'd tried to add the node. Regards

On 3/12/12 10:47 AM, aaron morton wrote:
Modify this line in log4j-server.properties (it will normally be located in /etc/cassandra):
https://github.com/apache/cassandra/blob/trunk/conf/log4j-server.properties#L21
Change INFO to DEBUG.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/03/2012, at 10:12 PM, Cyril Scetbon wrote:
On 3/12/12 9:50 AM, aaron morton wrote:
It may be the case that the joining node does not have enough information. But there is a default 30 second delay while the node waits for the ring information to stabilise. What version are you using?
1.0.7
Next time you add a new node, can you try it with logging set to DEBUG? If you get the error, please add it to https://issues.apache.org/jira/browse/CASSANDRA with the relevant logs.
Where do I have to add it? I added it to cassandra-env.sh and got a lot of things, but are you saying that I must add it to the join command? If yes, how? After the join command fails, as you saw, I have the ring information. I don't know if it took 30 seconds or not...

--
Cyril SCETBON
Adding nodes to cluster (Cassandra 1.0.8)
I was able to successfully join a node to an already existing one-node cluster (without giving any initial_token), but when I add another machine with identical settings (with changes to the listen, broadcast and rpc addresses) I am unable to join it to the cluster, and it gives me the following error:

INFO 17:50:35,555 JOINING: schema complete, ready to bootstrap
INFO 17:50:35,556 JOINING: getting bootstrap token
ERROR 17:50:35,557 Exception encountered during startup
java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list.
    at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168)
    at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150)
    at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395)
    at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234)
    at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
Exception encountered during startup: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list.
INFO 17:50:35,571 Waiting for messaging service to quiesce
INFO 17:50:35,571 MessagingService shutting down server thread.

Now when I put some integer value into initial_token, Cassandra starts working but is not able to connect to the main cluster, which became evident from the command "nodetool -h <ip of this node> ring": it displayed itself with 100% ownership. Kindly help me with it asap.

Regards
Rishabh

Impetus to sponsor and exhibit at Structure Data 2012, NY; Mar 21-22. Know more about our Big Data quick-start program at the event. New Impetus webcast 'Cloud-enabled Performance Testing vis-à-vis On-premise' available at http://bit.ly/z6zT4L.

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee.
If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
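The exception above points at seed configuration. As a rough sketch (the addresses and cluster name are placeholders), the relevant cassandra.yaml settings in the 1.0 line look like this, and the seeds list should be identical on every node:

```yaml
# cassandra.yaml -- give every node the SAME seed list.
cluster_name: 'MyCluster'            # must match on all nodes
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # 10.0.0.1 stands in for the existing node's address; it should
          # NOT be the joining node's own address.
          - seeds: "10.0.0.1"
listen_address: 10.0.0.2             # this node's own address
rpc_address: 10.0.0.2
```

Note that a node which lists its own address as a seed will skip bootstrap, so the joining node's own address should not appear in its seeds.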
Re: upgrade from 1.0.7 to 1.0.8
Did everything go well with your upgrade? Is version 1.0.8 worth the risk of an upgrade?

Alain

2012/3/11 Tamar Fraenkel ta...@tok-media.com:
Thanks! Will check it in the following days :)

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736  Mob: +972 54 8356490  Fax: +972 2 5612956

On Sun, Mar 11, 2012 at 10:00 AM, Marcel Steinbach mstei...@gmail.com wrote:
Check this out: http://www.datastax.com/docs/1.0/install/upgrading#upgrading-between-minor-releases-of-cassandra-1-0-x

Cheers

Am 11.03.2012 um 07:42 schrieb Tamar Fraenkel ta...@tok-media.com:
Hi! I want to experiment with upgrading. Does anyone have a good link on how to upgrade Cassandra?
Thanks,
Tamar Fraenkel
Re: Adding nodes to cluster (Cassandra 1.0.8)
Do you use same storage_port across 3 nodes? Can you access to the storage_port of the seed node from the last (failed) node?

2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
[...]
Re: Can't bootstrap a new node to my cluster
I have to add that I didn't set the token either, to let the cluster choose one.

On 3/14/12 10:11 AM, Cyril Scetbon wrote:
[...]

--
Cyril SCETBON
RE: Adding nodes to cluster (Cassandra 1.0.8)
I am using storage port 7000 (default) across all nodes.

-----Original Message-----
From: Maki Watanabe [mailto:watanabe.m...@gmail.com]
Sent: Wednesday, March 14, 2012 3:42 PM
To: user@cassandra.apache.org
Subject: Re: Adding nodes to cluster (Cassandra 1.0.8)

Do you use same storage_port across 3 nodes? Can you access to the storage_port of the seed node from the last (failed) node?

2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in:
[...]
Re: upgrade from 1.0.7 to 1.0.8
Haven't got to it yet, will update you when I do.

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736  Mob: +972 54 8356490  Fax: +972 2 5612956

On Wed, Mar 14, 2012 at 11:40 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:
[...]
counter columns question
Hi! My apologies for the naive question, but I am new to Cassandra and even more so to counter columns. If I have a counter column, what happens if two threads are trying to increment its value at the same time for the same key?

Thanks,
Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736  Mob: +972 54 8356490  Fax: +972 2 5612956
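Counter updates in Cassandra are applied as commutative deltas rather than as read-modify-write, so two concurrent increments to the same counter should both be counted (though the counter implementation of that era could over-count when a timed-out increment was retried). A minimal illustrative sketch of why ordering doesn't matter for commutative deltas, in plain Python with a lock standing in for the per-replica serialisation Cassandra performs itself:

```python
import threading

# Two writers incrementing one counter. Because each update is "add delta"
# rather than "read, compute, write back", no increment can be clobbered.
counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        with lock:        # stands in for per-replica serialisation of deltas
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 20_000  # both threads' increments are all counted
```

By contrast, two clients doing an unsynchronised read-then-write on an ordinary column could each read the same old value and one update would be lost.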
Spotify at Cassandra Europe. Last tickets remaining.
Hi All,

The first ever edition of Cassandra Europe is now only two weeks away. Only a few tickets remain, so if you intend to attend, make sure to book now to avoid disappointment.

We are happy to announce that Spotify have become sponsors of Cassandra Europe. In addition to Spotify, we are pleased to bring you talks from the world's top developers and users of Cassandra, including Netflix, Rackspace and Dachis Group. A full list of talks is available on the website.

You should attend if:
- you're considering moving away from Oracle / MySQL and exploring NoSQL solutions;
- you're considering using Cassandra;
- you're curious about what it offers and what other people do with it;
- you want to know how it compares to all the other NoSQL offerings and Hadoop;
- you want to know more about how Cassandra works.

In particular, we're encouraging CTOs, CIOs, CEOs, Sys Admins, Developers, Dev Ops, and Architects to attend. More information and tickets can be found here: http://cassandra-eu.org/

Your friendly Cassandra Europe team.
cleanup crashing with java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8
Hi all,

I am trying to run a cleanup on a column family and am getting the following error returned after about 15 seconds. A cleanup on a slightly smaller column family completes in about 21 minutes. This is on the Apache packaged version of Cassandra on Ubuntu 11.10, version 1.0.7.

~# nodetool -h localhost cleanup Player PlayerDetail
Error occured during cleanup
java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203)
    at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237)
    at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:984)
    at org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1635)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
    at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
    at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
    at sun.rmi.transport.Transport$1.run(Transport.java:159)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 8
    at org.apache.cassandra.db.compaction.LeveledManifest.add(LeveledManifest.java:298)
    at org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:186)
    at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:141)
    at org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:494)
    at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:234)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:1006)
    at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:791)
    at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
    at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:241)
    at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:182)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    ... 3 more
Re: LeveledCompaction and/or SnappyCompressor causing memory pressure during repair
Thanks for the suggestions, but I'd already removed the compression by the time your message came through. That alleviated the problem but didn't solve it. I'm still looking at a few other possible causes; I'll post back if I work out what's going on. For now I am running rolling repairs to avoid another outage. On Sun, Mar 11, 2012 at 6:32 PM, Edward Capriolo edlinuxg...@gmail.com wrote: One thing you may want to look at is the meanRowSize from nodetool cfstats and your compression block size. In our case the mean compacted size is 560 bytes, and a 64KB block size caused CPU spikes and a lot of short-lived memory. I have brought my block size down to 16K. The resulting tables are not noticeably larger, and there is less memory pressure on the young gen. I might try going down to 4K next. On Sat, Mar 10, 2012 at 5:38 PM, Edward Capriolo edlinuxg...@gmail.com wrote: The only downside of compression is that it does cause more memory pressure. I can imagine something like repair could compound this, since it would seem that building the Merkle tree involves decompressing every block on disk. I have been attempting to determine whether a larger or smaller block size has any effect on memory pressure. On Sat, Mar 10, 2012 at 4:50 PM, Peter Schuller peter.schul...@infidyne.com wrote: However, when I run a repair my CMS usage graph no longer shows sudden drops but rather gradual slopes, and it only manages to clear around 300MB each GC. This seems to occur on 2 other nodes in my cluster around the same time; I assume this is because they're the replicas (we use 3 replicas). ParNew collections look about the same on my graphs with or without repair running, so no trouble there as far as I can tell. I don't know why leveled/snappy would affect it, but disregarding that, I would have suggested that you are seeing additional heap usage because long-running repairs retain sstables and delay their unload/removal (index sampling/bloom filters filling your heap).
If it really only happens for leveled/snappy, however, I don't know what might be causing it. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
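Edward's block-size point can be made concrete with some back-of-the-envelope arithmetic. This is only a sketch of the trade-off, using the 560-byte mean compacted row size and the chunk sizes from his message above:

```python
# Rough model of read amplification with compressed sstable chunks:
# to read one row, at least the whole chunk containing it must be
# decompressed, however small the row is.
def bytes_decompressed_per_row_read(chunk_size_bytes, mean_row_size_bytes):
    # At minimum one full chunk is decompressed per row read
    # (more if the row spans a chunk boundary; ignored here).
    return max(chunk_size_bytes, mean_row_size_bytes)

mean_row = 560  # bytes, from nodetool cfstats in the example above
for chunk_kb in (64, 16, 4):
    ratio = bytes_decompressed_per_row_read(chunk_kb * 1024, mean_row) / mean_row
    print("chunk %2dKB -> ~%.0fx the row size decompressed per read" % (chunk_kb, ratio))
```

With 64KB chunks each 560-byte row read decompresses roughly 117x the data actually wanted, which is consistent with the short-lived young-gen garbage described above; 16KB and 4KB chunks cut that proportionally.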
Re: Adding nodes to cluster (Cassandra 1.0.8)
It looks like Cassandra is not able to find your existing cluster. Do you use the same cluster name on the 2 nodes? Are you able to contact the existing node on port 7000 from the new node? Is the first node defined as a seed in the cassandra.yaml of both of your nodes? If you can't make a connection between these 2 nodes, this error is normal: Cassandra tries to create a new cluster, and the node itself isn't defined as a seed. If you can't get it working and are sure of the points given above, tell us what your infrastructure is, how you proceeded to install Cassandra, and give us a copy of the cassandra.yaml for both of your nodes. Alain 2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in I am using storage port 7000 (default) across all nodes. -Original Message- From: Maki Watanabe [mailto:watanabe.m...@gmail.com] Sent: Wednesday, March 14, 2012 3:42 PM To: user@cassandra.apache.org Subject: Re: Adding nodes to cluster (Cassandra 1.0.8) Do you use the same storage_port across the 3 nodes? Can you access the storage_port of the seed node from the last (failed) node? 2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in: I was able to successfully join a node to an already existing one-node cluster (without giving any initial_token), but when I add another machine with identical settings (with changes in listen, broadcast and rpc address), I am unable to join it to the cluster and it gives me the following error: INFO 17:50:35,555 JOINING: schema complete, ready to bootstrap INFO 17:50:35,556 JOINING: getting bootstrap token ERROR 17:50:35,557 Exception encountered during startup java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list.
at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168) at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150) at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107) java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list. at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168) at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150) at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107) Exception encountered during startup: No other nodes seen!
Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list. INFO 17:50:35,571 Waiting for messaging service to quiesce INFO 17:50:35,571 MessagingService shutting down server thread. Now when I put some integer value in initial_token, Cassandra starts working but is not able to connect to the main cluster, which became evident from the
Re: cleanup crashing with java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8
Fixed in 1.0.9, 1.1.0: https://issues.apache.org/jira/browse/CASSANDRA-3989 You had better avoid using cleanup/scrub/upgradesstables on 1.0.7 if you can, though it will not corrupt sstables. 2012/3/14 Thomas van Neerijnen t...@bossastudios.com: Hi all, I am trying to run a cleanup on a column family and am getting the following error returned after about 15 seconds. A cleanup on a slightly smaller column family completes in about 21 minutes. This is on the Apache packaged version of Cassandra on Ubuntu 11.10, version 1.0.7. ~# nodetool -h localhost cleanup Player PlayerDetail Error occured during cleanup java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8 at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:984) at org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1635) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ArrayIndexOutOfBoundsException: 8 at org.apache.cassandra.db.compaction.LeveledManifest.add(LeveledManifest.java:298) at org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:186) at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:141) at org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:494) at 
org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:234) at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:1006) at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:791) at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63) at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:241) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:182) at
Re: Adding nodes to cluster (Cassandra 1.0.8)
Try: % telnet seed_node 7000 on the 3rd node and see what it says. maki 2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in: I am using storage port 7000 (default) across all nodes. -Original Message- From: Maki Watanabe [mailto:watanabe.m...@gmail.com] Sent: Wednesday, March 14, 2012 3:42 PM To: user@cassandra.apache.org Subject: Re: Adding nodes to cluster (Cassandra 1.0.8) Do you use the same storage_port across the 3 nodes? Can you access the storage_port of the seed node from the last (failed) node? 2012/3/14 Rishabh Agrawal rishabh.agra...@impetus.co.in: I was able to successfully join a node to an already existing one-node cluster (without giving any initial_token), but when I add another machine with identical settings (with changes in listen, broadcast and rpc address), I am unable to join it to the cluster and it gives me the following error: INFO 17:50:35,555 JOINING: schema complete, ready to bootstrap INFO 17:50:35,556 JOINING: getting bootstrap token ERROR 17:50:35,557 Exception encountered during startup java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list. at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168) at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150) at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107) java.lang.RuntimeException: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list. at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:168) at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:150) at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:145) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:565) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:484) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:395) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:234) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107) Exception encountered during startup: No other nodes seen! Unable to bootstrap. If you intended to start a single-node cluster, you should make sure your broadcast_address (or listen_address) is listed as a seed. Otherwise, you need to determine why the seed being contacted has no knowledge of the rest of the cluster. Usually, this can be solved by giving all nodes the same seed list. INFO 17:50:35,571 Waiting for messaging service to quiesce INFO 17:50:35,571 MessagingService shutting down server thread.
Now when I put some integer value in initial_token, Cassandra starts working but is not able to connect to the main cluster, which became evident from the command nodetool -h <ip of this node> ring. It displayed itself with 100% ownership. Kindly help me with it asap. Regards Rishabh
Re: counter columns question
Thanks! *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Wed, Mar 14, 2012 at 1:52 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: Here you have a good explanation of how counters work. Video: http://blip.tv/datastax/counters-in-cassandra-5497678 Slides: http://www.datastax.com/wp-content/uploads/2011/07/cassandra_sf_counters.pdf I think it will answer your question and help you understand some interesting things about counters. If you need a short answer: the concurrency will be handled nicely and you'll get the expected result. Alain 2012/3/14 Tamar Fraenkel ta...@tok-media.com Hi! My apologies for the naive question, but I am new to Cassandra and even more so to counter columns. If I have a counter column, what happens if two threads try to increment its value at the same time for the same key? Thanks, *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956
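For what it's worth, the "expected result" part can be illustrated outside Cassandra: counter updates are increments (deltas) rather than read-then-overwrite, so concurrent writers commute instead of clobbering each other. A toy sketch in plain Python (nothing here is Cassandra API; the lock stands in for the serialization a replica applies locally):

```python
import threading

# Model a counter column as a running sum of deltas. Two threads
# increment the same key concurrently; because each update is a delta
# applied under the replica's local serialization, none are lost.
counter = {"key1": 0}
lock = threading.Lock()  # stand-in for per-replica serialization

def increment(key, delta, times):
    for _ in range(times):
        with lock:
            counter[key] += delta

t1 = threading.Thread(target=increment, args=("key1", 1, 1000))
t2 = threading.Thread(target=increment, args=("key1", 1, 1000))
t1.start(); t2.start()
t1.join(); t2.join()
print(counter["key1"])  # 2000: both threads' increments are counted
```

Contrast this with a read-modify-write scheme on a regular column, where two concurrent "read 5, write 6" operations would lose one update; that difference is exactly why counters are a separate column type.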
Re: [Windows] How to configure simple authentication and authorization ?
encryption_options: internode_encryption: none keystore: conf/.keystore keystore_password: cassandra truststore: conf/.truststore truststore_password: cassandra from log: INFO [FlushWriter:1] 2012-03-14 14:32:58,948 Memtable.java (line 283) Completed flushing C:\DataStax\data\data\system\LocationInfo-hc-12-Data.db (80 bytes) ERROR [main] 2012-03-14 14:32:58,952 AbstractCassandraDaemon.java (line 373) Exception encountered during startup java.io.FileNotFoundException: conf\.truststore Where do I put the files? On Wed, Mar 14, 2012 at 8:43 AM, Maki Watanabe watanabe.m...@gmail.com wrote: Have you built and installed SimpleAuthenticator from the source repository? It is not included in the binary kit. maki 2012/3/14 Sabbiolina sabbiol...@gmail.com: Hi. I followed this: To set up simple authentication and authorization 1. Edit cassandra.yaml, setting org.apache.cassandra.auth.SimpleAuthenticator as the authenticator value. The default value of AllowAllAuthenticator is equivalent to no authentication. 2. Edit access.properties, adding entries for users and their permissions to read and write to specified keyspaces and column families. See access.properties below for details on the correct format. 3. Make sure that users specified in access.properties have corresponding entries in passwd.properties. See passwd.properties below for details and examples. 4. After making the required configuration changes, you must specify the properties files when starting Cassandra with the flags -Dpasswd.properties and -Daccess.properties. For example: cd $CASSANDRA_HOME sh bin/cassandra -f -Dpasswd.properties=conf/passwd.properties -Daccess.properties=conf/access.properties I started the services with the additional parameters, but no result, no log, nothing. I use the DataStax 1.0.8 Community Edition on Windows 7 64-bit. Thanks
Re: Composite keys and range queries
Hmm, now I'm really confused. This may be of use to you: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 This article is what I actually used to come up with my schema here. In the Clustering, composite keys, and more section they use a schema very similar to how I'm trying to use mine. They define a composite key with two parts, expecting the first part to be used as the partition key and the second part to be used for ordering. The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be 1. Why? Shouldn't only uuid-1 be used as the partition key? (So shouldn't those two hash to the same location?) I'm thinking of using supercolumns for this instead as I know they'll work (where the row key is the uuid and the supercolumn name is the priority), but aren't composite row keys supposed to essentially replace the need for supercolumns? Thanks, and sorry if I'm getting this all wrong, John On Wed, Mar 14, 2012 at 12:52 AM, aaron morton aa...@thelastpickle.com wrote: You are seeing this: http://wiki.apache.org/cassandra/FAQ#range_rp The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be 1. You cannot do what you want to. Even if you passed a start of (uuid1, empty) and no finish, you would not get only rows whose key starts with uuid1. This may be of use to you: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 Or you can store all the priorities that are valid for an ID in another row. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/03/2012, at 1:05 PM, John Laban wrote: Forwarding to the Cassandra mailing list as well, in case this is more of an issue with how I'm using Cassandra. Am I correct to assume that I can use range queries on composite row keys, even when using a RandomPartitioner, if I make sure that the first part of the composite key is fixed?
Any help would be appreciated, John On Tue, Mar 13, 2012 at 12:15 PM, John Laban j...@pagerduty.com wrote: Hi, I have a column family that uses a composite key: (ID, priority) - ... where the ID is a UUID and the priority is an integer. I'm trying to perform a range query now: I want all the rows where the ID matches some fixed UUID, but within a range of priorities. This is supported even if I'm using a RandomPartitioner, right? (Because the first part of the composite key is the partition key, and the second part of the composite key is automatically ordered?) So I perform a range slices query: val rangeQuery = HFactory.createRangeSlicesQuery(keyspace, new CompositeSerializer, StringSerializer.get, BytesArraySerializer.get) rangeQuery.setColumnFamily(RouteColumnFamilyName). setKeys( new Composite(id, priorityStart), new Composite(id, priorityEnd) ). setRange( null, null, false, Int.MaxValue ) But I get this error: me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:start key's md5 sorts after end key's md5. this is not allowed; you probably should not specify end key at all, under RandomPartitioner) Shouldn't they have the same md5, since they have the same partition key? Am I using the wrong query here, or does Hector not support composite range queries, or am I making some mistake in how I think Cassandra's composite keys work? Thanks, John
Large hints column family
The system HintsColumnFamily seems large in my cluster, and I want to track down why that is. I tried invoking listEndpointsPendingHints() on o.a.c.db.HintedHandoffManager and it never returns, and it also freezes the node that it's invoked against. It's a 3-node cluster, and all nodes have been up and running without issue for a while. Any help on where to start with this?

Column Family: HintsColumnFamily
SSTable count: 11
Space used (live): 11271669539
Space used (total): 11271669539
Number of Keys (estimate): 1408
Memtable Columns Count: 338
Memtable Data Size: 0
Memtable Switch Count: 1
Read Count: 3
Read Latency: 4354.669 ms.
Write Count: 848
Write Latency: 0.029 ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.0
Bloom Filter Space Used: 12656
Key cache capacity: 14
Key cache size: 11
Key cache hit rate: 0.
Row cache: disabled
Compacted row minimum size: 105779
Compacted row maximum size: 7152383774
Compacted row mean size: 590818614

Thanks, Bryce
RE: Large hints column family
Forgot to mention that this is on 1.0.8. From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com] Sent: Wednesday, March 14, 2012 12:34 PM To: user@cassandra.apache.org Subject: Large hints column family The system HintsColumnFamily seems large in my cluster, and I want to track down why that is. I tried invoking listEndpointsPendingHints() on o.a.c.db.HintedHandoffManager and it never returns, and it also freezes the node that it's invoked against. It's a 3-node cluster, and all nodes have been up and running without issue for a while. Any help on where to start with this?

Column Family: HintsColumnFamily
SSTable count: 11
Space used (live): 11271669539
Space used (total): 11271669539
Number of Keys (estimate): 1408
Memtable Columns Count: 338
Memtable Data Size: 0
Memtable Switch Count: 1
Read Count: 3
Read Latency: 4354.669 ms.
Write Count: 848
Write Latency: 0.029 ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.0
Bloom Filter Space Used: 12656
Key cache capacity: 14
Key cache size: 11
Key cache hit rate: 0.
Row cache: disabled
Compacted row minimum size: 105779
Compacted row maximum size: 7152383774
Compacted row mean size: 590818614

Thanks, Bryce
RE: Composite keys and range queries
Right, so until the new CQL stuff exists to actually query with something smart enough to know about composite keys, you have to define and query on your own. Row Key = UUID Column = CompositeColumn(string, string) You then want to use COLUMN slicing, not row ranges, to query the data, where you slice on priority as the first part of a composite column name. See the Under the hood and historical notes section of the blog post. You want to lay out your data per the Physical representation of the denormalized timeline rows diagram, where your UUID is the user_id from the example and your priority is the tweet_id. -Jeremiah From: John Laban [j...@pagerduty.com] Sent: Wednesday, March 14, 2012 12:37 PM To: user@cassandra.apache.org Subject: Re: Composite keys and range queries Hmm, now I'm really confused. This may be of use to you: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 This article is what I actually used to come up with my schema here. In the Clustering, composite keys, and more section they use a schema very similar to how I'm trying to use mine. They define a composite key with two parts, expecting the first part to be used as the partition key and the second part to be used for ordering. The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be 1. Why? Shouldn't only uuid-1 be used as the partition key? (So shouldn't those two hash to the same location?) I'm thinking of using supercolumns for this instead as I know they'll work (where the row key is the uuid and the supercolumn name is the priority), but aren't composite row keys supposed to essentially replace the need for supercolumns? Thanks, and sorry if I'm getting this all wrong, John On Wed, Mar 14, 2012 at 12:52 AM, aaron morton aa...@thelastpickle.com wrote: You are seeing this: http://wiki.apache.org/cassandra/FAQ#range_rp The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be 1. You cannot do what you want to.
Even if you passed a start of (uuid1, empty) and no finish, you would not get only rows whose key starts with uuid1. This may be of use to you: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 Or you can store all the priorities that are valid for an ID in another row. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/03/2012, at 1:05 PM, John Laban wrote: Forwarding to the Cassandra mailing list as well, in case this is more of an issue with how I'm using Cassandra. Am I correct to assume that I can use range queries on composite row keys, even when using a RandomPartitioner, if I make sure that the first part of the composite key is fixed? Any help would be appreciated, John On Tue, Mar 13, 2012 at 12:15 PM, John Laban j...@pagerduty.com wrote: Hi, I have a column family that uses a composite key: (ID, priority) - ... where the ID is a UUID and the priority is an integer. I'm trying to perform a range query now: I want all the rows where the ID matches some fixed UUID, but within a range of priorities. This is supported even if I'm using a RandomPartitioner, right? (Because the first part of the composite key is the partition key, and the second part of the composite key is automatically ordered?) So I perform a range slices query: val rangeQuery = HFactory.createRangeSlicesQuery(keyspace, new CompositeSerializer, StringSerializer.get, BytesArraySerializer.get) rangeQuery.setColumnFamily(RouteColumnFamilyName). setKeys( new Composite(id, priorityStart), new Composite(id, priorityEnd) ). setRange( null, null, false, Int.MaxValue ) But I get this error: me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:start key's md5 sorts after end key's md5. this is not allowed; you probably should not specify end key at all, under RandomPartitioner) Shouldn't they have the same md5, since they have the same partition key?
Am I using the wrong query here, or does Hector not support composite range queries, or am I making some mistake in how I think Cassandra's composite keys work? Thanks, John
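Jeremiah's suggested layout can be sketched without Hector or Cassandra at all: one row per UUID, columns sorted by a composite name whose first component is the priority, and the "range by priority" done as a column slice inside that single row, so the partitioner (which only hashes the row key) never gets involved. A minimal model in Python (the priorities and values here are made up for illustration):

```python
import bisect

# Row key = UUID; columns within the row are sorted by composite name
# (priority, entry_id). A priority-range query becomes a column slice.
row = {}  # composite column name tuple -> value
for name, value in [((1, "a"), "urgent-1"), ((1, "b"), "urgent-2"),
                    ((5, "c"), "normal-1"), ((9, "d"), "low-1")]:
    row[name] = value

def slice_by_priority(row, lo, hi):
    """Return columns whose composite name has lo <= priority <= hi."""
    names = sorted(row)
    # (lo,) sorts before every (lo, x); a max sentinel closes the range.
    start = bisect.bisect_left(names, (lo,))
    end = bisect.bisect_right(names, (hi, chr(0x10FFFF)))
    return [(n, row[n]) for n in names[start:end]]

print(slice_by_priority(row, 1, 5))
# the two priority-1 columns and the priority-5 column
```

This mirrors why the column-slice approach works under RandomPartitioner: ordering is only needed *within* a row, and composite column names are always stored in sorted order there.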
Re: snapshot files locked
Hi Maki, Thanks for the reply. Yes, I understand that snapshots are hard links. However, my understanding is that removing any hard-linked file just removes the link (decrementing the link count of the file on disk) -- it does not delete the file itself nor remove any other links which may be pointing at the file. To confirm my understanding, I tested this on Windows by terminating Cassandra and then deleting all files in the snapshot dir. None of the corresponding files in the parent keyspace directory were removed. Regards, Jim On 3/13/2012 9:29 PM, Maki Watanabe wrote: snapshot files are hardlinks of the original sstables. As you know, on Windows, you can't delete files opened by another process. If you try to delete the hardlink, Windows thinks you are trying to delete the sstables in production. maki 2012/3/14 Jim Newsham jnews...@referentia.com: Hi, I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., the dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again. If I terminate Cassandra, then I can delete the directory with no problem. Is there a reason why Cassandra must hold onto these files? Thanks, Jim
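Jim's understanding matches standard hard-link semantics, and it is easy to verify outside Cassandra. A small sketch using a temp directory (file names are invented; `os.link` is what `nodetool snapshot` effectively does to each sstable):

```python
import os
import tempfile

# Deleting one hard link only drops the link count; the underlying file
# survives as long as another link (the "production" sstable) remains.
d = tempfile.mkdtemp()
original = os.path.join(d, "sstable-data.db")
snapshot = os.path.join(d, "snapshot-data.db")

with open(original, "w") as f:
    f.write("sstable contents")

os.link(original, snapshot)            # create the snapshot hard link
assert os.stat(original).st_nlink == 2 # two names, one file on disk

os.remove(snapshot)                    # delete the snapshot link...
with open(original) as f:
    print(f.read())                    # ...the original is untouched
```

So the Windows error Jim sees is about open file handles, not about hard links sharing fate; deleting the snapshot link would be safe if the handles were closed, which is why a JIRA ticket (as suggested below) is reasonable.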
Re: Datastax Enterprise mixed workload cluster configuration
Yes, you can do this. You will want to have three DCs: DC1 with [1, 2, 3], DC2 with [4, 5, 6], and DC3 with [7, 8, 9]. For your normal data keyspace, the replication strategy should be NTS, and the strategy_options should have some replicas in each of the three DCs. For example: {DC1: 3, DC2: 3, DC3: 3} if you need that level of replication in each one (although you probably only want an RF of 1 for DC3). Your clients that are performing writes should only open connections against the nodes in DC1, and you should write at CL.ONE or CL.LOCAL_QUORUM. Likewise for reads, your clients should only connect to nodes in DC2, and you should read at CL.ONE or CL.LOCAL_QUORUM. The nodes in DC3 should run as analytics nodes. I believe the default CL for m/r jobs is ONE, which would work. As far as tokens go, interleaving all three DCs and evenly spacing the tokens will work. For example, the ordering of your nodes might be [1, 4, 7, 2, 5, 8, 3, 6, 9]. On Wed, Mar 14, 2012 at 12:05 PM, Alexandru Sicoe adsi...@gmail.com wrote: Hi everyone, I want to test out the DataStax Enterprise software to have a mixed-workload setup with an analytics and a real-time part. However I am not sure how to configure it to achieve what I want: I will have 3 real machines on one side of a gateway (1, 2, 3) and 6 VMs on the other (4-9). 1, 2, 3 will each have a normal Cassandra node that just takes data directly from my data sources. I want them to replicate the data to the other 6 VMs. Now, out of those 6 VMs, 4, 5, 6 will run normal Cassandra nodes and 7, 8, 9 will run analytics nodes. So I only want to write to 1, 2, 3, and I only want to serve user reads from 4, 5, 6 and do analytics on 7, 8, 9. Can I achieve this by configuring 1, 2, 3, 4, 5, 6 as normal nodes and the rest as analytics nodes? If I alternate the tokens as explained in http://www.datastax.com/docs/1.0/datastax_enterprise/init_dse_cluster#init-dse is it analogous to achieving something like 3 DCs, each getting their own replica?
Thanks, Alex -- Tyler Hobbs DataStax http://datastax.com/
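Tyler's interleaved token layout can be sketched numerically. The following is a rough illustration (not from the thread, and the `interleaved_tokens` helper is hypothetical): it computes evenly spaced RandomPartitioner tokens for the nine nodes and assigns them so that consecutive positions on the ring alternate between the three DCs.

```python
# Sketch: evenly spaced RandomPartitioner tokens, assigned in the
# interleaved DC order [1, 4, 7, 2, 5, 8, 3, 6, 9] Tyler suggests.
RING_SIZE = 2 ** 127  # RandomPartitioner token space: 0 .. 2^127

def interleaved_tokens(dcs):
    """dcs: list of node lists, one per DC, e.g. [[1,2,3],[4,5,6],[7,8,9]].
    Returns {node: token} with tokens evenly spaced around the ring and
    neighboring ring positions alternating between DCs."""
    # Interleave: first node of each DC, then second of each, and so on.
    order = [node for group in zip(*dcs) for node in group]
    spacing = RING_SIZE // len(order)
    return {node: i * spacing for i, node in enumerate(order)}

tokens = interleaved_tokens([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Node 1 gets token 0, node 4 the next token, node 7 the one after, ...
```

Each DC then owns every third segment of the ring, so with NTS placing replicas per-DC, each DC holds a full copy of the data spread evenly across its three nodes.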
Re: Does the 'batch' order matter ?
On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote: Are you saying the way 'batch mutate' is coded, the order of writes in the batch does not mean anything ? You can ask the batch to do A,B,C and then D in sequence; but sometimes Cassandra can end up applying just C and A,B (and D) may still not be applied ? No, batch_mutate() is an atomic operation. When a node locally applies a batch mutation, either all of the changes are applied or none of them are. Aaron was referring to the possibility that one of the replicas received the batch_mutate, but the other replicas did not. -- Tyler Hobbs DataStax http://datastax.com/
Re: snapshot files locked
Can you open a ticket on https://issues.apache.org/jira/browse/CASSANDRA for this? On Wed, Mar 14, 2012 at 3:04 PM, Jim Newsham jnews...@referentia.com wrote: Hi Maki, Thanks for the reply. Yes, I understand that snapshots are hard links. However, my understanding is that removing any hard-linked files just removes the link (decrementing the link counter of the file on disk) -- it does not delete the file itself nor remove any other links which may be pointing at the file. To confirm my understanding, I tested this in Windows by terminating Cassandra and then deleting all files in the snapshot dir. None of the corresponding files in the parent keyspace directory were removed. Regards, Jim On 3/13/2012 9:29 PM, Maki Watanabe wrote: snapshot files are hardlinks of the original sstables. As you know, on windows, you can't delete files opened by other process. If you try to delete the hardlink, windows thinks you try to delete the sstables in production. maki 2012/3/14 Jim Newsham jnews...@referentia.com: Hi, I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again. If I terminate Cassandra, then I can delete the directory with no problem. Is there a reason why Cassandra must hold onto these files? Thanks, Jim -- Tyler Hobbs DataStax http://datastax.com/
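Jim's description of hard-link semantics is easy to verify with a few lines of Python. This sketch shows the POSIX behavior (deleting one link leaves the file and its other links intact); on Windows, the `os.remove` call would instead fail while another process holds the file open, which is the situation in this thread.

```python
# Sketch illustrating Jim's point: deleting one hard link only removes
# that directory entry; the underlying file and its other links survive.
import os
import tempfile

d = tempfile.mkdtemp()
orig = os.path.join(d, "orig-sstable.db")
link = os.path.join(d, "snapshot-link.db")

with open(orig, "w") as f:
    f.write("sstable data")

os.link(orig, link)                 # snapshot-style hard link
assert os.stat(orig).st_nlink == 2  # two directory entries, one file

os.remove(link)                     # delete the "snapshot" link
assert os.path.exists(orig)         # original file still there
assert os.stat(orig).st_nlink == 1  # only the link count changed
with open(orig) as f:
    assert f.read() == "sstable data"
```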
Re: Question on ByteOrdered rebalancing
The bytes being the key I use or what? Maybe you answered the 2nd question and I don't get it, but what would I pass the move command when I want to rebalance the cluster? Thanks

On Tue, Mar 13, 2012 at 11:34 AM, Tyler Hobbs ty...@datastax.com wrote: The tokens are hex-encoded arrays of bytes. On Tue, Mar 13, 2012 at 1:05 PM, work late worklate1...@gmail.com wrote: The ring command on nodetool shows as:

Address   DC           Rack   Status  State   Load     Owns    Token
                                                               Token(bytes[88401b216270ab8ebb690946b0b70eab])
10.1.1.1  datacenter1  rack1  Up      Normal  69.1 KB  50.00%  Token(bytes[4936c862b88db2bdd92d684583bf0280])
10.1.1.2  datacenter1  rack1  Up      Normal  69.1 KB  50.00%  Token(bytes[88401b216270ab8ebb690946b0b70eab])

The token looks like an MD5 value, is that correct? So when rebalancing the cluster, what is the token value I am supposed to give the move command (with RP it is a token between 0 and 2^127); what should I use with BOP? Thanks -- Tyler Hobbs DataStax http://datastax.com/
Re: Composite keys and range queries
Ahhh, ok, I thought that CQL was just being brought up to date with the functionality already built into composite keys, but I guess I was mistaken there. But I guess it's just providing a convenient abstraction, using composite column names under the hood. That's where I was confused, thanks. So, in terms of composite column names vs supercolumns: is the only advantage to composite column names that you can do column slicing on subsets of the subcolumns? I.e., if I don't mind loading all of the subcolumns for a given supercolumn name into memory at once (since I need them all anyway), is there any disadvantage to using supercolumns here? They seem a little cleaner and more straightforward for my use case, since I don't have the advantage of the CQL composite key thing. Thanks, John

On Wed, Mar 14, 2012 at 12:53 PM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: Right, so until the new CQL stuff exists to actually query with something smart enough to know about composite keys, you have to define and query on your own. Row Key = UUID, Column = CompositeColumn(string, string). You want to then use COLUMN slicing, not row ranges, to query the data, where you slice on priority as the first part of a Composite Column Name. See the Under the hood and historical notes section of the blog post. You want to lay out your data per the Physical representation of the denormalized timeline rows diagram, where your UUID is the user_id from the example, and your priority is the tweet_id. -Jeremiah

-- *From:* John Laban [j...@pagerduty.com] *Sent:* Wednesday, March 14, 2012 12:37 PM *To:* user@cassandra.apache.org *Subject:* Re: Composite keys and range queries Hmm, now I'm really confused. This may be of use to you: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 This article is what I actually used to come up with my schema here. In the Clustering, composite keys, and more section they're using a schema very similar to how I'm trying to use it.
They define a composite key with two parts, expecting the first part to be used as the partition key and the second part to be used for ordering. The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be 1. Why? Shouldn't only uuid-1 be used as the partition key? (So shouldn't those two hash to the same location?) I'm thinking of using supercolumns for this instead as I know they'll work (where the row key is the uuid and the supercolumn name is the priority), but aren't composite row keys supposed to essentially replace the need for supercolumns? Thanks, and sorry if I'm getting this all wrong, John On Wed, Mar 14, 2012 at 12:52 AM, aaron morton aa...@thelastpickle.com wrote: You are seeing this: http://wiki.apache.org/cassandra/FAQ#range_rp The hash for (uuid-1, p1) may be 100 and the hash for (uuid-1, p2) may be 1. You cannot do what you want to. Even if you passed a start of (uuid1, empty) and no finish, you would not only get rows where the key starts with uuid1. This may be of use to you: http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 Or you can store all the priorities that are valid for an ID in another row. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/03/2012, at 1:05 PM, John Laban wrote: Forwarding to the Cassandra mailing list as well, in case this is more of an issue on how I'm using Cassandra. Am I correct to assume that I can use range queries on composite row keys, even when using a RandomPartitioner, if I make sure that the first part of the composite key is fixed? Any help would be appreciated, John On Tue, Mar 13, 2012 at 12:15 PM, John Laban j...@pagerduty.com wrote: Hi, I have a column family that uses a composite key: (ID, priority) -> ... Where the ID is a UUID and the priority is an integer. I'm trying to perform a range query now: I want all the rows where the ID matches some fixed UUID, but within a range of priorities.
This is supported even if I'm using a RandomPartitioner, right? (Because the first key in the composite key is the partition key, and the second part of the composite key is automatically ordered?) So I perform a range slices query:

val rangeQuery = HFactory.createRangeSlicesQuery(keyspace,
    new CompositeSerializer, StringSerializer.get, BytesArraySerializer.get)
rangeQuery.setColumnFamily(RouteColumnFamilyName).
    setKeys(new Composite(id, priorityStart), new Composite(id, priorityEnd)).
    setRange(null, null, false, Int.MaxValue)

But I get this error:

me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:start key's md5 sorts after end key's md5. this is not allowed; you probably should not specify end key at all, under RandomPartitioner)

Shouldn't they have the same
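The error John hit follows from how the RandomPartitioner places rows: it hashes the entire serialized row key with MD5, so two composite keys that share a first component land at unrelated ring positions. A rough illustration (the byte encoding below is simplified for the sketch, not the real CompositeType serialization):

```python
# Sketch: RandomPartitioner derives a row's ring position from an MD5
# of the *whole* serialized key, so (uuid1, p1) and (uuid1, p2) hash
# to unrelated positions despite sharing the same first component.
import hashlib

def ring_position(raw_key: bytes) -> int:
    return int.from_bytes(hashlib.md5(raw_key).digest(), "big")

uuid1 = b"6ba7b810-9dad-11d1-80b4-00c04fd430c8"
t1 = ring_position(uuid1 + b":1")  # composite (uuid1, 1), simplified
t2 = ring_position(uuid1 + b":2")  # composite (uuid1, 2), simplified

# The shared uuid prefix does not keep the rows adjacent on the ring,
# and ring order has nothing to do with priority order -- which is why
# a "start key" can legitimately sort after an "end key".
assert t1 != t2
```

This is also why the FAQ entry Aaron linked says key range scans under RandomPartitioner return rows in hash order, not key order.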
Re: Does the 'batch' order matter ?
No, batch_mutate() is an atomic operation. When a node locally applies a batch mutation, either all of the changes are applied or none of them are. The steps in my batch are not confined to a single CF, nor to a single key. The documentation says: datastax: Column updates are only considered atomic within a given record (row). Pycassa.batch: This interface does not implement atomic operations across column families. All the limitations of the batch_mutate Thrift API call apply. Remember, a mutation in Cassandra is always atomic per key per column family only. On Wed, Mar 14, 2012 at 4:15 PM, Tyler Hobbs ty...@datastax.com wrote: On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote: Are you saying the way 'batch mutate' is coded, the order of writes in the batch does not mean anything ? You can ask the batch to do A,B,C and then D in sequence; but sometimes Cassandra can end up applying just C and A,B (and D) may still not be applied ? No, batch_mutate() is an atomic operation. When a node locally applies a batch mutation, either all of the changes are applied or none of them are. Aaron was referring to the possibility that one of the replicas received the batch_mutate, but the other replicas did not. -- Tyler Hobbs DataStax
Re: snapshot files locked
It's true on unix, but you can't delete hard links to opened files on windows. Try the following:

1. Create a text file ORIG.TXT
2. MKLINK /D LINK.TXT ORIG.TXT (now you have the hard link LINK.TXT)
3. Open ORIG.TXT with MS Word
4. DEL LINK.TXT

It returns an error. The interesting thing is that you can delete LINK.TXT if you open ORIG.TXT with Notepad. Notepad may close the file after loading, or it may be using a different API.

On 2012/03/15, at 5:04, Jim Newsham jnews...@referentia.com wrote: Hi Maki, Thanks for the reply. Yes, I understand that snapshots are hard links. However, my understanding is that removing any hard-linked files just removes the link (decrementing the link counter of the file on disk) -- it does not delete the file itself nor remove any other links which may be pointing at the file. To confirm my understanding, I tested this in Windows by terminating Cassandra and then deleting all files in the snapshot dir. None of the corresponding files in the parent keyspace directory were removed. Regards, Jim On 3/13/2012 9:29 PM, Maki Watanabe wrote: snapshot files are hardlinks of the original sstables. As you know, on windows, you can't delete files opened by other process. If you try to delete the hardlink, windows thinks you try to delete the sstables in production. maki 2012/3/14 Jim Newsham jnews...@referentia.com: Hi, I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again. If I terminate Cassandra, then I can delete the directory with no problem. Is there a reason why Cassandra must hold onto these files? Thanks, Jim
Re: snapshot files locked
Typo: in step 2, MKLINK /D should be MKLINK /H. On 2012/03/15, at 9:04, Watanabe Maki watanabe.m...@gmail.com wrote: It's true on unix, but you can't delete hard links to opened files on windows. Try the following: 1. Create a text file ORIG.TXT 2. MKLINK /D LINK.TXT ORIG.TXT (now you have the hard link LINK.TXT) 3. Open ORIG.TXT with MS Word 4. DEL LINK.TXT It returns an error. The interesting thing is that you can delete LINK.TXT if you open ORIG.TXT with Notepad. Notepad may close the file after loading, or it may be using a different API. On 2012/03/15, at 5:04, Jim Newsham jnews...@referentia.com wrote: Hi Maki, Thanks for the reply. Yes, I understand that snapshots are hard links. However, my understanding is that removing any hard-linked files just removes the link (decrementing the link counter of the file on disk) -- it does not delete the file itself nor remove any other links which may be pointing at the file. To confirm my understanding, I tested this in Windows by terminating Cassandra and then deleting all files in the snapshot dir. None of the corresponding files in the parent keyspace directory were removed. Regards, Jim On 3/13/2012 9:29 PM, Maki Watanabe wrote: snapshot files are hardlinks of the original sstables. As you know, on windows, you can't delete files opened by other process. If you try to delete the hardlink, windows thinks you try to delete the sstables in production. maki 2012/3/14 Jim Newsham jnews...@referentia.com: Hi, I'm using Cassandra 1.0.8, on Windows 7. When I take a snapshot of the database, I find that I am unable to delete the snapshot directory (i.e., dir named {datadir}\{keyspacename}\snapshots\{snapshottag}) while Cassandra is running: The action can't be completed because the folder or a file in it is open in another program. Close the folder or file and try again. If I terminate Cassandra, then I can delete the directory with no problem. Is there a reason why Cassandra must hold onto these files? Thanks, Jim
Re: Does the 'batch' order matter ?
Ah, my mistake, you are correct. Not sure why I had forgotten that. The pycassa docs are slightly wrong there, though. It's technically atomic for the same key across multiple column families. I'll get that fixed. On Wed, Mar 14, 2012 at 5:22 PM, A J s5a...@gmail.com wrote: No, batch_mutate() is an atomic operation. When a node locally applies a batch mutation, either all of the changes are applied or none of them are. The steps in my batch are not confined to a single CF, nor to a single key. The documentation says: datastax: Column updates are only considered atomic within a given record (row). Pycassa.batch: This interface does not implement atomic operations across column families. All the limitations of the batch_mutate Thrift API call applies. Remember, a mutation in Cassandra is always atomic per key per column family only. On Wed, Mar 14, 2012 at 4:15 PM, Tyler Hobbs ty...@datastax.com wrote: On Wed, Mar 14, 2012 at 11:50 AM, A J s5a...@gmail.com wrote: Are you saying the way 'batch mutate' is coded, the order of writes in the batch does not mean anything ? You can ask the batch to do A,B,C and then D in sequence; but sometimes Cassandra can end up applying just C and A,B (and D) may still not be applied ? No, batch_mutate() is an atomic operation. When a node locally applies a batch mutation, either all of the changes are applied or none of them are. Aaron was referring to the possibility that one of the replicas received the batch_mutate, but the other replicas did not. -- Tyler Hobbs DataStax -- Tyler Hobbs DataStax http://datastax.com/
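The scope of the guarantee can be seen in the shape of the thrift batch_mutate argument itself: a map of row key to a map of column family to mutations. A plain-data sketch (hypothetical keys and columns, no live cluster involved):

```python
# batch_mutate takes {row_key: {column_family: [mutations]}}. The
# atomic unit on a replica is one row-key entry, which may span
# several column families for that key; separate keys are separate
# units, and their order in the map carries no meaning.
mutation_map = {
    b"user1": {
        "Users":    [("name", "A J")],        # same key, two CFs:
        "Timeline": [("2012-03-14", "...")],  # applied together locally
    },
    b"user2": {
        "Users": [("name", "Jim")],           # different key: a
    },                                        # separate atomic unit
}

atomic_units = len(mutation_map)  # one per row key
assert atomic_units == 2
assert set(mutation_map[b"user1"]) == {"Users", "Timeline"}
```

As Tyler notes earlier in the thread, "atomic" here is per replica: a different replica may fail to receive the batch at all, and only timestamps resolve conflicts.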
Re: Question on ByteOrdered rebalancing
On Wed, Mar 14, 2012 at 4:17 PM, work late worklate1...@gmail.com wrote: The bytes being the key I use or what? Yes. Partitioners split up the space of possible keys based on the tokens of the nodes. For BOP, a node's primary range will include all rows with keys that are less than its token and greater than the token of the node with the next smallest token. So, if you have two nodes with tokens a and b, the node with token b will be the primary replica for all rows whose key falls between a and b (essentially, all keys that start with a).

With BOP, the ideal set of tokens would split a sorted list of all the keys in your data into equal partitions. You have to know the distribution of your keys to do this properly, and the correct choice of tokens will change over time as your data does. This is why we typically strongly advise against using an ordered partitioner. When you're doing a move with BOP using nodetool, you specify the new token in hex.

Maybe you answered the 2nd question and I don't get it, but what would I pass the move command when I want to rebalance the cluster? Thanks On Tue, Mar 13, 2012 at 11:34 AM, Tyler Hobbs ty...@datastax.com wrote: The tokens are hex-encoded arrays of bytes. On Tue, Mar 13, 2012 at 1:05 PM, work late worklate1...@gmail.com wrote: The ring command on nodetool shows as:

Address   DC           Rack   Status  State   Load     Owns    Token
                                                               Token(bytes[88401b216270ab8ebb690946b0b70eab])
10.1.1.1  datacenter1  rack1  Up      Normal  69.1 KB  50.00%  Token(bytes[4936c862b88db2bdd92d684583bf0280])
10.1.1.2  datacenter1  rack1  Up      Normal  69.1 KB  50.00%  Token(bytes[88401b216270ab8ebb690946b0b70eab])

The token looks like an MD5 value, is that correct? So when rebalancing the cluster, what is the token value I am supposed to give the move command (with RP it is a token between 0 and 2^127); what should I use with BOP? Thanks -- Tyler Hobbs DataStax http://datastax.com/ -- Tyler Hobbs DataStax http://datastax.com/
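Tyler's point about splitting a sorted list of keys into equal partitions can be sketched as follows. This is a hypothetical helper (`bop_tokens`, with a made-up key sample), assuming you have a representative sample of your actual row keys; it picks the internal boundary keys and hex-encodes them in the byte-array form that BOP tokens take.

```python
# Sketch: choose balanced ByteOrderedPartitioner tokens by splitting a
# sorted sample of real row keys into equal-sized partitions, then
# hex-encoding the boundary keys for use as node tokens.
import binascii

def bop_tokens(sample_keys, num_nodes):
    """Pick num_nodes - 1 internal boundary keys from a sorted key
    sample and return them hex-encoded."""
    keys = sorted(sample_keys)
    step = len(keys) // num_nodes
    boundaries = [keys[i * step] for i in range(1, num_nodes)]
    return [binascii.hexlify(k).decode() for k in boundaries]

# Example: 8 sample keys, 2 nodes -> one boundary at the median key.
sample = [b"apple", b"banana", b"cherry", b"date",
          b"fig", b"grape", b"kiwi", b"lemon"]
print(bop_tokens(sample, 2))  # -> ['666967'], i.e. hex of b"fig"
```

As Tyler warns, these boundaries only stay balanced while the key distribution does; as data grows, the quantiles shift and the moves have to be redone.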