Re: trouble setting up initial cluster: Host ID collision between active endpoint
They both have 0 for their token, and this is stored in their System keyspace. Scrub them and start again.

> But I found that the tokens that were being generated would require way too much memory

Token assignments have nothing to do with memory usage.

> m1.micro instances

You are better off using your laptop than micro instances. For playing around try m1.large and terminate them when not in use.

To make life easier use this to make the cluster for you http://www.datastax.com/docs/1.2/install/install_ami

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 24/01/2013, at 5:17 AM, Tim Dunphy bluethu...@gmail.com wrote:

> Hello list,
>
> I really do appreciate the advice I've gotten here as I start building familiarity with Cassandra. Aside from the single node instance I setup for a developer friend, I've just been playing with a single node in a VM on my laptop and playing around with the cassandra-cli and PHP.
>
> Well I've decided to setup my first cluster on my amazon ec2 account and I'm running into an issue getting the nodes to gossip.
>
> I've set the IP's of 'node01' and 'node02' ec2 instances in their respective listen_address, rpc_address and made sure that the 'cluster_name' on both was in agreement.
>
> I believe the problem may be in one of two places: either the seeds or the initial_token setting.
>
> For the seeds I have it setup as such. I put the IPs for both machines in the 'seeds' settings for each, thinking this would be how each node would discover each other:
>
>     - seeds: 10.xxx.xxx.248,10.xxx.xxx.123
>
> Initially I tried the tokengen script that I found in the documentation. But I found that the tokens that were being generated would require way too much memory for the m1.micro instances that I'm experimenting with on the Amazon free tier. And according to the docs in the config it is in some cases ok to leave that field blank. So that's what I did on both instances.
>
> Not sure how much/if this matters but I am using the setting
>
>     - endpoint_snitch: Ec2Snitch
>
> Finally, when I start up the first node all goes well.
>
> But when I startup the second node I see this exception on both hosts:
>
> node1
>
>     INFO 11:02:32,231 Listening for thrift clients...
>     INFO 11:02:59,262 Node /10.xxx.xxx.123 is now part of the cluster
>     INFO 11:02:59,268 InetAddress /10.xxx.xxx.123 is now UP
>     ERROR 11:02:59,270 Exception in thread Thread[GossipStage:1,5,main]
>     java.lang.RuntimeException: Host ID collision between active endpoint /10..xxx.248 and /10.xxx.xxx.123 (id=54ce7ccd-1b1d-418e-9861-1c281c078b8f)
>         at org.apache.cassandra.locator.TokenMetadata.updateHostId(TokenMetadata.java:227)
>         at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1296)
>         at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1157)
>         at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1895)
>         at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:805)
>         at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:883)
>         at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:43)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
>
> And on node02 I see:
>
>     INFO 11:02:58,817 Starting Messaging Service on port 7000
>     INFO 11:02:58,835 Using saved token [0]
>     INFO 11:02:58,837 Enqueuing flush of Memtable-local@672636645(84/84 serialized/live bytes, 4 ops)
>     INFO 11:02:58,838 Writing Memtable-local@672636645(84/84 serialized/live bytes, 4 ops)
>     INFO 11:02:58,912 Completed flushing /var/lib/cassandra/data/system/local/system-local-ia-43-Data.db (120 bytes) for commitlog position ReplayPosition(segmentId=1358956977628, position=49266)
>     INFO 11:02:58,922 Enqueuing flush of Memtable-local@1007604537(32/32 serialized/live bytes, 2 ops)
>     INFO 11:02:58,923 Writing Memtable-local@1007604537(32/32 serialized/live bytes, 2 ops)
>     INFO 11:02:58,943 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ia-40-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ia-42-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ia-43-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/local/system-local-ia-41-Data.db')]
>     INFO 11:02:58,953 Node /10.192.179.248 is now part of the cluster
>     INFO 11:02:58,961 InetAddress /10.192.179.248 is now UP
>     INFO 11:02:59,003 Completed flushing /var/lib/cassandra/data/system/local/system-local-ia-44-Data.db (90 bytes) for
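As a side note on the token question in this thread: an initial_token is only a position on the partitioner's ring, which is why it has no bearing on memory. The following is a minimal sketch in Python of how evenly spaced tokens could be computed. It assumes the usual ring sizes (2**127 for RandomPartitioner, 2**64 offset by -2**63 for Murmur3Partitioner); the function name is made up for illustration and is not part of any Cassandra tool.

```python
def evenly_spaced_tokens(node_count, partitioner="murmur3"):
    """Return one initial_token per node, spread evenly around the ring.

    Assumption: ring sizes are 2**127 (RandomPartitioner) and
    2**64 shifted down by 2**63 (Murmur3Partitioner).
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if partitioner == "random":
        return [i * (2 ** 127) // node_count for i in range(node_count)]
    if partitioner == "murmur3":
        return [i * (2 ** 64) // node_count - 2 ** 63 for i in range(node_count)]
    raise ValueError("unknown partitioner: %r" % partitioner)


if __name__ == "__main__":
    # Print the tokens for a hypothetical two-node RandomPartitioner cluster.
    for token in evenly_spaced_tokens(2, "random"):
        print(token)
```

Each node gets a distinct value from the list, unlike the two nodes in the log above, which both report token 0.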
Re: Performing simple CQL Query using Python db-api 2.0 fails
How did you create the table?

Anyway, that looks like a bug. I *think* bug reports should go here: http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/issues/list

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 24/01/2013, at 7:14 AM, Paul van Hoven paul.van.ho...@googlemail.com wrote:

> I am trying to access my local Cassandra database from Python. I installed db-api 2.0 and thrift to access the database. Opening and closing a connection works fine, but a simple query does not.
>
> The script looks like this:
>
>     c = conn.cursor()
>     c.execute("select * from users;")
>     data = c.fetchall()
>     print "Query: select * from users; returned the following result:"
>     print str(data)
>
> The table users looks like this:
>
>     cqlsh:demodb> select * from users;
>
>      user_name | birth_year | gender | password | session_token | state
>     -----------+------------+--------+----------+---------------+-------
>         jsmith |       null |   null |   secret |          null |  null
>
> But when I try to execute it I get the following error:
>
>     Open connection to localhost:9160 on keyspace demodb
>     Traceback (most recent call last):
>       File "/Users/Tom/Freelancing/Company/Python/ApacheCassandra/src/CassandraDemo.py", line 56, in <module>
>         perfromSimpleCQLQuery()
>       File "/Users/Tom/Freelancing/Company/Python/ApacheCassandra/src/CassandraDemo.py", line 46, in perfromSimpleCQLQuery
>         c.execute("select * from users;")
>       File "/Library/Python/2.7/site-packages/cql/cursor.py", line 81, in execute
>         return self.process_execution_results(response, decoder=decoder)
>       File "/Library/Python/2.7/site-packages/cql/thrifteries.py", line 116, in process_execution_results
>         self.get_metadata_info(self.result[0])
>       File "/Library/Python/2.7/site-packages/cql/cursor.py", line 97, in get_metadata_info
>         name, nbytes, vtype, ctype = self.get_column_metadata(colid)
>       File "/Library/Python/2.7/site-packages/cql/cursor.py", line 104, in get_column_metadata
>         return self.decoder.decode_metadata_and_type(column_id)
>       File "/Library/Python/2.7/site-packages/cql/decoders.py", line 45, in decode_metadata_and_type
>         name = self.name_decode_error(e, namebytes, comptype.cql_parameterized_type())
>       File "/Library/Python/2.7/site-packages/cql/decoders.py", line 29, in name_decode_error
>         % (namebytes, expectedtype, err))
>     cql.apivalues.ProgrammingError: column name '\x00\x00\x00' can't be deserialized as 'org.apache.cassandra.db.marshal.CompositeType': global name 'self' is not defined
>
> I'm not sure if this is the right place to ask, but am I doing something wrong here?
multiple reducers with BulkOutputFormat on the same host
Hello,

We see that BulkOutputFormat fails to stream data from multiple reduce instances that run on the same host. We get the same error messages that issue https://issues.apache.org/jira/browse/CASSANDRA-4223 tries to address. It looks like (ip-address + in_out_flag + atomic integer) is not unique enough for a sessionId when multiple JVMs stream from one physical host.

We worked around the problem by setting one reducer per machine in the Hadoop config, but that is not an option we want to deploy.

Thanks,
Alexei Bakanov
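To illustrate the uniqueness problem described in this message, here is a small Python sketch. It is not Cassandra's streaming code: the class, the tuple layout, and the host address are hypothetical stand-ins for the (ip-address + in_out_flag + atomic integer) scheme, contrasted with a random UUID.

```python
import itertools
import uuid


class LegacySessionIds(object):
    """Mimics the scheme described above: (ip, in/out flag, per-JVM counter)."""

    def __init__(self, ip):
        self.ip = ip
        # Every JVM starts its own counter from the same value.
        self._counter = itertools.count(1)

    def next_id(self, outbound=True):
        return (self.ip, outbound, next(self._counter))


def uuid_session_id():
    """A random UUID does not depend on any per-process counter."""
    return uuid.uuid4()


if __name__ == "__main__":
    reducer_a = LegacySessionIds("10.0.0.5")  # hypothetical host address
    reducer_b = LegacySessionIds("10.0.0.5")  # second JVM on the same host
    print(reducer_a.next_id() == reducer_b.next_id())  # the two ids collide
    print(uuid_session_id() == uuid_session_id())
```

Two generators on the same host hand out identical first ids because each process counts from the same starting point, which is the collision the message reports.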
Re: multiple reducers with BulkOutputFormat on the same host
Alexei,

You were right. It was already fixed to use a UUID for the streaming session, and the fix was released in 1.2.0. See https://issues.apache.org/jira/browse/CASSANDRA-4813.

On Thursday, January 24, 2013 at 6:49 AM, Alexei Bakanov wrote:

> Hello, We see that BulkOutputFormat fails to stream data from multiple reduce instances that run on the same host. [...]
Re: multiple reducers with BulkOutputFormat on the same host
Oh, that's nice! Thanks!

On 24 January 2013 13:55, Yuki Morishita mor.y...@gmail.com wrote:

> Alexei, You were right. It was already fixed to use UUID for streaming session and released in 1.2.0. See https://issues.apache.org/jira/browse/CASSANDRA-4813. [...]
Re: trouble setting up initial cluster: Host ID collision between active endpoint
Cool. Thanks for the advice, Aaron. I actually did get this working before I read your reply. The trick for me, apparently, was to use the IP of the first node in the seeds setting of each successive node.

But I like the idea of using larges for an hour or so and terminating them for some basic experimentation.

Also, thanks for pointing me to the DataStax AMIs. I'll be sure to check them out.

Tim

On Thu, Jan 24, 2013 at 3:45 AM, aaron morton aa...@thelastpickle.com wrote:

> They both have 0 for their token, and this is stored in their System keyspace. Scrub them and start again. [...]
Fwd: {kundera-discuss} Kundera 2.3 released
-- Forwarded message --
From: Vivek Mishra vivek.mis...@impetus.co.in
Date: Thu, Jan 24, 2013 at 8:29 PM
Subject: {kundera-discuss} Kundera 2.3 released
To: kundera-disc...@googlegroups.com

Hi All,

We are happy to announce release of Kundera 2.3.

Kundera is a JPA 2.0 compliant, object-datastore mapping library for NoSQL datastores. The idea behind Kundera is to make working with NoSQL Databases drop-dead simple and fun. It currently supports Cassandra, HBase, MongoDB, Redis and relational databases.

Major Changes:
-
1) Added Redis (http://redis.io/) to Kundera's supported database list. ( https://github.com/impetus-opensource/Kundera/wiki/Kundera-over-Redis-Connecting-... )
2) Cassandra 1.2 migration.
3) Changes in HBase schema handling.
4) Stronger query support, like selective column/id search via JPQL.
5) Enable support for @Transient for embeddedColumns and mappedsuperclass.
6) Allow to set record limit on search for mongodb.
7) Performance improvement on Cassandra, HBase, MongoDB.

Github Bug Fixes:
--
https://github.com/impetus-opensource/Kundera/issues/163
https://github.com/impetus-opensource/Kundera/issues/162
https://github.com/impetus-opensource/Kundera/issues/154
https://github.com/impetus-opensource/Kundera/issues/141
https://github.com/impetus-opensource/Kundera/issues/133
https://github.com/impetus-opensource/Kundera/issues/131
https://github.com/impetus-opensource/Kundera/issues/127
https://github.com/impetus-opensource/Kundera/issues/122
https://github.com/impetus-opensource/Kundera/issues/121
https://github.com/impetus-opensource/Kundera/issues/117
https://github.com/impetus-opensource/Kundera/issues/84
https://github.com/impetus-opensource/Kundera/issues/67

@kundera-discuss issues
---
1) Batch operation over Cassandra composite key not working.

We have revamped our wiki, so you might want to have a look at it here:
https://github.com/impetus-opensource/Kundera/wiki

To download, use or contribute to Kundera, visit:
http://github.com/impetus-opensource/Kundera

Latest released tag version is 2.3. Kundera maven libraries are now available at:
https://oss.sonatype.org/content/repositories/releases/com/impetus

Sample codes and examples for using Kundera can be found here:
http://github.com/impetus-opensource/Kundera-Examples
And
https://github.com/impetus-opensource/Kundera/tree/trunk/kundera-tests

Thank you all for your contributions!

Sincerely,
Kundera Team
Issues with CQLSH in Cassandra 1.2
Hi,

I have spent half of the day today trying to get a new Cassandra cluster to work. I have set up a single data center cluster, using NetworkTopologyStrategy, DC1:3. I'm using the latest version of the Astyanax client to connect. After many hours of debugging, I found out that the problem may be in the cqlsh utility.

So, after the cluster was up and running:

    [me@cassandra-node1 cassandra]$ nodetool status
    Datacenter: DC-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
    UN  10.11.1.109  59.1 KB   256     0.0%              726689df-edc3-49a0-b680-370953994a8c  RAC2
    UN  10.11.1.108  67.49 KB  256     0.0%              73cd86a9-4efb-4407-9fe8-9a1b3a277af7  RAC1
    UN  10.11.1.200  59.84 KB  64      0.0%              d6d700d4-28aa-4722-b215-a6a7d304b8e7  RAC3

I went to create the keyspace:

1. First I tried using cqlsh:

    create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3};

After this, I checked that the keyspace was properly created by running:

    cqlsh> select * from system.schema_keyspaces;

     keyspace_name | durable_writes | strategy_class                                       | strategy_options
    ---------------+----------------+------------------------------------------------------+------------------------
     system_auth   | True           | org.apache.cassandra.locator.SimpleStrategy          | {replication_factor:1}
     foo           | True           | org.apache.cassandra.locator.NetworkTopologyStrategy | {dc1:3}
     system        | True           | org.apache.cassandra.locator.LocalStrategy           | {}
     system_traces | True           | org.apache.cassandra.locator.SimpleStrategy          | {replication_factor:1}

But if I run nodetool describering foo, it does not show anything in the endpoints or endpoint_details fields. In this situation, the Astyanax client throws a NoAvailableHostsException. I have used the following configuration:

    withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
        .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
        .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))

The first option did not work at all.

2. I dropped the keyspace created with cqlsh and re-created it with cassandra-cli. This time, nodetool describering foo shows information in the endpoints and endpoint_details fields, and the Astyanax client works properly.

I hope this saves others the time of figuring out how to work around this issue.

Br,
Gabi
Re: Issues with CQLSH in Cassandra 1.2
Hi,

Astyanax is not 1.2 compatible yet: https://github.com/Netflix/astyanax/issues/191

Eran planned to make it compatible in 1.57.x.

On Thursday, 24 January 2013, Gabriel Ciuloaica wrote:

> Hi, I have spent half of the day today trying to make a new Cassandra cluster to work. [...]
Re: Issues with CQLSH in Cassandra 1.2
I do not think that it has anything to do with Astyanax, but after I recreated the keyspace with cassandra-cli, everything is working fine.

Also, as I mentioned in my original message, not even nodetool describering foo showed correct information for the tokens and endpoint_details when the keyspace was created with cqlsh.

Thanks,
Gabi

On 1/24/13 9:21 PM, Ivan Velykorodnyy wrote:

> Hi, Astyanax is not 1.2 compatible yet https://github.com/Netflix/astyanax/issues/191 Eran planned to make it in 1.57.x [...]
Re: Performing simple CQL Query using Python db-api 2.0 fails
The reason for the error was that I opened the connection to the database incorrectly. I did:

    con = cql.connect(host, port, keyspace)

but the correct call is:

    con = cql.connect(host, port, keyspace, cql_version='3.0.0')

Now it works fine. Thanks for reading.

2013/1/24 aaron morton aa...@thelastpickle.com:

> How did you create the table? Anyways that looks like a bug, I *think* they should go here http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/issues/list [...]
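The fix reported in this thread can be wrapped in a small helper so the CQL version is never left to the driver's default. This is only a sketch: the helper name and its default arguments are made up for illustration, and the `connect` callable is assumed to behave like `cql.connect` from cassandra-dbapi2, called the same way as in the message.

```python
def open_cql3_connection(connect, host="localhost", port=9160, keyspace="demodb"):
    """Open a connection that explicitly requests CQL 3.

    `connect` is expected to be `cql.connect` from cassandra-dbapi2. It is
    passed in so this sketch does not need the driver installed to be read
    or exercised.
    """
    return connect(host, port, keyspace, cql_version="3.0.0")


# Intended use, assuming the cql package is installed and a node is listening:
#
#     import cql
#     conn = open_cql3_connection(cql.connect)
#     cursor = conn.cursor()
#     cursor.execute("select * from users;")
```

Passing the connect function in keeps the one important detail, the explicit cql_version argument, in a single place.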
Re: Issues with CQLSH in Cassandra 1.2
Gabriel,

It looks like you used DC1 for the datacenter name in your replication strategy options, while the actual datacenter name was DC-1 (based on the nodetool status output). Perhaps that was causing the problem?

--
Tyler Hobbs
DataStax
http://datastax.com/

On Thu, Jan 24, 2013 at 1:57 PM, Gabriel Ciuloaica gciuloa...@gmail.com wrote:

> I do not think that it has anything to do with Astyanax, but after I have recreated the keyspace with cassandra-cli, everything is working fine. [...]
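The check suggested in this message (do the datacenter names in the replication options match the names the cluster reports?) can be sketched in a few lines of Python. The function name is hypothetical, and the exact, case-sensitive comparison is an assumption of this sketch; it is worth noting that the schema_keyspaces output in this thread shows the option stored as dc1 although the keyspace was created with 'DC1'.

```python
def unknown_datacenters(replication_options, cluster_datacenters):
    """Return replication option keys that match no datacenter name exactly.

    The comparison is case-sensitive on purpose: 'dc1' and 'DC1' are
    treated as different names here (an assumption of this sketch).
    """
    known = set(cluster_datacenters)
    return sorted(
        name for name in replication_options
        if name != "class" and name not in known
    )


if __name__ == "__main__":
    stored = {"class": "NetworkTopologyStrategy", "dc1": 3}
    print(unknown_datacenters(stored, ["DC1"]))  # prints ['dc1']
```

An empty result means every datacenter named in the options exists under exactly that name; anything else is a candidate explanation for empty endpoint lists.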
Re: Issues with CQLSH in Cassandra 1.2
Hi Tyler, No, it was just a typo in the email, I changed names of DC in the email after copy/paste from output of the tools. It is quite easy to reproduce (assuming you have a correct configuration for NetworkTopologyStrategy, with vNodes(default, 256)): 1. launch cqlsh and create the keyspace create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3}; 2. exit cqlsh, run nodetool describering foo you'll see something like this: TokenRange(start_token:2318224911779291128, end_token:2351629206880900296, endpoints:[], rpc_endpoints:[], endpoint_details:[]) TokenRange(start_token:-8291638263612363845, end_token:-8224756763869823639, endpoints:[], rpc_endpoints:[], endpoint_details:[]) 3. start cqlsh, drop keyspace foo; 4. Exit cqlsh, start cassandra-cli create keyspace foo with placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1}; if you run nodetool describering foo you'll see: TokenRange(start_token:2318224911779291128, end_token:2351629206880900296, endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], rpc_endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], endpoint_details:[EndpointDetails(host:10.11.1.200, datacenter:DC1, rack:RAC3), EndpointDetails(host:10.11.1.109, datacenter:DC1, rack:RAC2), EndpointDetails(host:10.11.1.108, datacenter:DC1, rack:RAC1)]) TokenRange(start_token:-8291638263612363845, end_token:-8224756763869823639, endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], rpc_endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], endpoint_details:[EndpointDetails(host:10.11.1.200, datacenter:DC1, rack:RAC3), EndpointDetails(host:10.11.1.109, datacenter:DC1, rack:RAC2), EndpointDetails(host:10.11.1.108, datacenter:DC1, rack:RAC1)]) Br, Gabi On 1/24/13 10:22 PM, Tyler Hobbs wrote: Gabriel, It looks like you used DC1 for the datacenter name in your replication strategy options, while the actual datacenter name was DC-1 (based on the nodetool status output). Perhaps that was causing the problem? 
On Thu, Jan 24, 2013 at 1:57 PM, Gabriel Ciuloaica gciuloa...@gmail.com wrote:

I do not think that it has anything to do with Astyanax, but after I recreated the keyspace with cassandra-cli, everything is working fine. Also, as I mention below, not even nodetool describering foo showed correct information for the tokens and endpoint_details if the keyspace was created with cqlsh.

Thanks,
Gabi

On 1/24/13 9:21 PM, Ivan Velykorodnyy wrote:

Hi, Astyanax is not 1.2 compatible yet: https://github.com/Netflix/astyanax/issues/191 Eran planned to make it in 1.57.x

On Thursday, January 24, 2013, Gabriel Ciuloaica wrote:

Hi, I have spent half of the day today trying to make a new Cassandra cluster work. I have set up a single data center cluster, using NetworkTopologyStrategy, DC1:3. I'm using the latest version of the Astyanax client to connect. After many hours of debugging, I found out that the problem may be in the cqlsh utility. So, after the cluster was up and running:

[me@cassandra-node1 cassandra]$ nodetool status
Datacenter: DC-1
==
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.11.1.109  59.1 KB   256     0.0%              726689df-edc3-49a0-b680-370953994a8c  RAC2
UN  10.11.1.108  67.49 KB  256     0.0%              73cd86a9-4efb-4407-9fe8-9a1b3a277af7  RAC1
UN  10.11.1.200  59.84 KB  64      0.0%              d6d700d4-28aa-4722-b215-a6a7d304b8e7  RAC3

I went to create the keyspace:

1.
First I tried using cqlsh:

   create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3};

After this, I checked that the keyspace was properly created by running in cqlsh:

   select * from system.schema_keyspaces;

    keyspace_name | durable_writes | strategy_class                                       | strategy_options
   ---------------+----------------+------------------------------------------------------+------------------------
    system_auth   | True           | org.apache.cassandra.locator.SimpleStrategy          | {replication_factor:1}
    foo           | True           | org.apache.cassandra.locator.NetworkTopologyStrategy | {dc1:3}
    system        | True           | org.apache.cassandra.locator.LocalStrategy           | {}
    system_traces | True           | org.apache.cassandra.locator.SimpleStrategy          | {replication_factor:1}

But if I run nodetool describering foo, it will not show anything in the endpoints or endpoint_details fields. In this situation, Astyanax client will
Re: trouble setting up initial cluster: Host ID collision between active endpoint
Hi Tim,

If you want to check out Cassandra on AWS you should also have a look at www.instaclustr.com. We are still very much in beta (so if you come across anything, please let us know), but if you have a few minutes and want to deploy a cluster in just a few clicks I highly recommend trying Instaclustr out.

Cheers,
Ben Bromhead
Instaclustr

On Fri, Jan 25, 2013 at 12:35 AM, Tim Dunphy bluethu...@gmail.com wrote:

Cool. Thanks for the advice, Aaron. I actually did get this working before I read your reply. The trick, apparently, for me was to use the IP of the first node in the seeds setting of each successive node. But I like the idea of using larges for an hour or so and terminating them for some basic experimentation. Also, thanks for pointing me to the DataStax AMIs; I'll be sure to check them out.

Tim

On Thu, Jan 24, 2013 at 3:45 AM, aaron morton aa...@thelastpickle.com wrote:

They both have 0 for their token, and this is stored in their System keyspace. Scrub them and start again.

"But I found that the tokens that were being generated would require way too much memory"

Token assignments have nothing to do with memory usage.

"m1.micro instances"

You are better off using your laptop than micro instances. For playing around try m1.large and terminate them when not in use. To make life easier, use this to make the cluster for you: http://www.datastax.com/docs/1.2/install/install_ami

Cheers
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 24/01/2013, at 5:17 AM, Tim Dunphy bluethu...@gmail.com wrote:

Hello list, I really do appreciate the advice I've gotten here as I start building familiarity with Cassandra. Aside from the single node instance I set up for a developer friend, I've just been playing with a single node in a VM on my laptop and playing around with the cassandra-cli and PHP. Well, I've decided to set up my first cluster on my Amazon EC2 account and I'm running into an issue getting the nodes to gossip.
I've set the IPs of the 'node01' and 'node02' EC2 instances in their respective listen_address and rpc_address settings, and made sure that the 'cluster_name' on both was in agreement. I believe the problem may be in one of two places: either the seeds or the initial_token setting.

For the seeds I have it set up as follows. I put the IPs of both machines in the 'seeds' setting of each, thinking this would be how the nodes would discover each other:

   - seeds: 10.xxx.xxx.248,10.xxx.xxx.123

Initially I tried the tokengen script that I found in the documentation. But I found that the tokens that were being generated would require way too much memory for the m1.micro instances that I'm experimenting with on the Amazon free tier. And according to the docs in the config it is in some cases OK to leave that field blank. So that's what I did on both instances.

Not sure how much/if this matters, but I am using the setting:

   - endpoint_snitch: Ec2Snitch

Finally, when I start up the first node all goes well. But when I start up the second node I see this exception on both hosts:

node1

INFO 11:02:32,231 Listening for thrift clients...
INFO 11:02:59,262 Node /10.xxx.xxx.123 is now part of the cluster
INFO 11:02:59,268 InetAddress /10.xxx.xxx.123 is now UP
ERROR 11:02:59,270 Exception in thread Thread[GossipStage:1,5,main]
java.lang.RuntimeException: Host ID collision between active endpoint /10.xxx.xxx.248 and /10.xxx.xxx.123 (id=54ce7ccd-1b1d-418e-9861-1c281c078b8f)
    at org.apache.cassandra.locator.TokenMetadata.updateHostId(TokenMetadata.java:227)
    at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1296)
    at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1157)
    at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1895)
    at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:805)
    at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:883)
    at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:43)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

And on node02 I see:

INFO 11:02:58,817 Starting Messaging Service on port 7000
INFO 11:02:58,835 Using saved token [0]
INFO 11:02:58,837 Enqueuing flush of Memtable-local@672636645(84/84 serialized/live bytes, 4 ops)
INFO 11:02:58,838 Writing Memtable-local@672636645(84/84 serialized/live bytes, 4 ops)
INFO 11:02:58,912 Completed flushing /var/lib/cassandra/data/system/local/system-local-ia-43-Data.db (120 bytes) for commitlog position
CQL3 and clients for new Cluster
Hi all,

I started looking at Cassandra a while ago and got used to the Thrift API. I put it on the back burner for a while, though, until now. To get back up to speed I have read a lot of documentation at the DataStax website, and it appears that the Thrift API is no longer considered the ideal way to interface with Cassandra. So my questions are these:

What is the future of the Thrift API? Should I just ignore it going forward and use CQL?

If CQL is the preferred way to interface with Cassandra, does using any of the clients listed here: http://wiki.apache.org/cassandra/ClientOptions provide me any benefits over using a JDBC driver like the one listed here: http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/

Thanks,
Matt
Re: CQL3 and clients for new Cluster
Some of the mapping libraries can help translate rows into objects so you do not have so much DAO code. PlayOrm has a whole feature list of things that can be helpful. I am sure other high-level clients have features as well that can speed up development time.

https://github.com/deanhiller/playorm#playorm-feature-list

Dean

From: Matthew Langton mjla...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Thursday, January 24, 2013 4:35 PM
To: user@cassandra.apache.org
Subject: CQL3 and clients for new Cluster

Hi all, I started looking at Cassandra a while ago and got used to the Thrift API. I put it on the back burner for a while, though, until now. To get back up to speed I have read a lot of documentation at the DataStax website, and it appears that the Thrift API is no longer considered the ideal way to interface with Cassandra. So my questions are these: What is the future of the Thrift API? Should I just ignore it going forward and use CQL? If CQL is the preferred way to interface with Cassandra, does using any of the clients listed here: http://wiki.apache.org/cassandra/ClientOptions provide me any benefits over using a JDBC driver like the one listed here: http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/

Thanks,
Matt
Re: CQL3 and clients for new Cluster
Either CQL or a higher-level API running on top of Thrift like Hector/Astyanax/etc. Thrift is uh... painful.

On Thu, Jan 24, 2013 at 3:35 PM, Matthew Langton mjla...@gmail.com wrote:

Hi all, I started looking at Cassandra a while ago and got used to the Thrift API. I put it on the back burner for a while, though, until now. To get back up to speed I have read a lot of documentation at the DataStax website, and it appears that the Thrift API is no longer considered the ideal way to interface with Cassandra. So my questions are these: What is the future of the Thrift API? Should I just ignore it going forward and use CQL? If CQL is the preferred way to interface with Cassandra, does using any of the clients listed here: http://wiki.apache.org/cassandra/ClientOptions provide me any benefits over using a JDBC driver like the one listed here: http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/

Thanks,
Matt

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix and Windows
Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin
carpe diem quam minimum credula postero
Re: CQL3 and clients for new Cluster
I use both Thrift and CQL. My biased take is to use CQL for select queries and Thrift for insert/update. I like being able to insert exactly the data type I want for the column name and value. CQL is more user friendly, but it lacks the flexibility of Thrift in terms of using different data types for column names and values. The SQL metaphor only goes so far. In my biased opinion, you're not getting the most out of Cassandra if you're only using CQL.

On Thu, Jan 24, 2013 at 6:38 PM, Aaron Turner synfina...@gmail.com wrote:

Either CQL or a higher-level API running on top of Thrift like Hector/Astyanax/etc. Thrift is uh... painful.

On Thu, Jan 24, 2013 at 3:35 PM, Matthew Langton mjla...@gmail.com wrote:

Hi all, I started looking at Cassandra a while ago and got used to the Thrift API. I put it on the back burner for a while, though, until now. To get back up to speed I have read a lot of documentation at the DataStax website, and it appears that the Thrift API is no longer considered the ideal way to interface with Cassandra. So my questions are these: What is the future of the Thrift API? Should I just ignore it going forward and use CQL? If CQL is the preferred way to interface with Cassandra, does using any of the clients listed here: http://wiki.apache.org/cassandra/ClientOptions provide me any benefits over using a JDBC driver like the one listed here: http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/

Thanks,
Matt

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix and Windows
Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin
carpe diem quam minimum credula postero
Does setstreamthroughput also throttle the network traffic caused by nodetool repair?
In the yaml, it has the following setting:

# Throttles all outbound streaming file transfers on this node to the
# given total throughput in Mbps. This is necessary because Cassandra does
# mostly sequential IO when streaming data during bootstrap or repair, which
# can lead to saturating the network connection and degrading rpc performance.
# When unset, the default is 400 Mbps or 50 MB/s.
# stream_throughput_outbound_megabits_per_sec: 400

Is this the same value as if I call nodetool setstreamthroughput? Should I call it on all the nodes in the cluster? Will that throttle the network traffic caused by nodetool repair?

Thanks.
-Wei
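As a quick check of the units in the yaml comment quoted above (400 Mbps versus 50 MB/s), here is a minimal Python sketch. The function name is made up for illustration and is not part of Cassandra or nodetool; it only does the bits-to-bytes arithmetic and says nothing about whether repair streaming is covered by the throttle.

```python
def megabits_to_megabytes_per_sec(megabits_per_sec: float) -> float:
    """Convert a rate in megabits per second to megabytes per second.

    Hypothetical helper for illustration only: one byte is 8 bits.
    """
    return megabits_per_sec / 8


if __name__ == "__main__":
    default_mbps = 400  # default value quoted in the yaml comment
    rate = megabits_to_megabytes_per_sec(default_mbps)
    print(f"{default_mbps} Mbps = {rate:g} MB/s")  # prints "400 Mbps = 50 MB/s"
```

The same arithmetic applies to any value put in the yaml setting, for example 200 Mbps would correspond to 25 MB/s.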
Re: Issues with CQLSH in Cassandra 1.2
Can you provide details of the snitch configuration and the number of nodes you have?

Cheers
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 25/01/2013, at 9:39 AM, Gabriel Ciuloaica gciuloa...@gmail.com wrote:

Hi Tyler,

No, it was just a typo in the email; I changed the names of the DC in the email after copy/paste from the output of the tools. It is quite easy to reproduce (assuming you have a correct configuration for NetworkTopologyStrategy, with vnodes (default, 256)):

1. Launch cqlsh and create the keyspace:

   create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3};

2. Exit cqlsh and run nodetool describering foo. You'll see something like this:

   TokenRange(start_token:2318224911779291128, end_token:2351629206880900296, endpoints:[], rpc_endpoints:[], endpoint_details:[])
   TokenRange(start_token:-8291638263612363845, end_token:-8224756763869823639, endpoints:[], rpc_endpoints:[], endpoint_details:[])

3. Start cqlsh and run: drop keyspace foo;

4. Exit cqlsh, start cassandra-cli and run:

   create keyspace foo with placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1};

   If you run nodetool describering foo you'll see:

   TokenRange(start_token:2318224911779291128, end_token:2351629206880900296, endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], rpc_endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], endpoint_details:[EndpointDetails(host:10.11.1.200, datacenter:DC1, rack:RAC3), EndpointDetails(host:10.11.1.109, datacenter:DC1, rack:RAC2), EndpointDetails(host:10.11.1.108, datacenter:DC1, rack:RAC1)])
   TokenRange(start_token:-8291638263612363845, end_token:-8224756763869823639, endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], rpc_endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], endpoint_details:[EndpointDetails(host:10.11.1.200, datacenter:DC1, rack:RAC3), EndpointDetails(host:10.11.1.109, datacenter:DC1, rack:RAC2), EndpointDetails(host:10.11.1.108, datacenter:DC1, rack:RAC1)])

Br,
Gabi

On 1/24/13 10:22 PM, Tyler Hobbs wrote:

Gabriel, it looks like you used DC1 for the datacenter name in your replication strategy options, while the actual datacenter name was DC-1 (based on the nodetool status output). Perhaps that was causing the problem?

On Thu, Jan 24, 2013 at 1:57 PM, Gabriel Ciuloaica gciuloa...@gmail.com wrote:

I do not think that it has anything to do with Astyanax, but after I recreated the keyspace with cassandra-cli, everything is working fine. Also, as I mention below, not even nodetool describering foo showed correct information for the tokens and endpoint_details if the keyspace was created with cqlsh.

Thanks,
Gabi

On 1/24/13 9:21 PM, Ivan Velykorodnyy wrote:

Hi, Astyanax is not 1.2 compatible yet: https://github.com/Netflix/astyanax/issues/191 Eran planned to make it in 1.57.x

On Thursday, January 24, 2013, Gabriel Ciuloaica wrote:

Hi, I have spent half of the day today trying to make a new Cassandra cluster work. I have set up a single data center cluster, using NetworkTopologyStrategy, DC1:3. I'm using the latest version of the Astyanax client to connect. After many hours of debugging, I found out that the problem may be in the cqlsh utility. So, after the cluster was up and running:

[me@cassandra-node1 cassandra]$ nodetool status
Datacenter: DC-1
==
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.11.1.109  59.1 KB   256     0.0%              726689df-edc3-49a0-b680-370953994a8c  RAC2
UN  10.11.1.108  67.49 KB  256     0.0%              73cd86a9-4efb-4407-9fe8-9a1b3a277af7  RAC1
UN  10.11.1.200  59.84 KB  64      0.0%              d6d700d4-28aa-4722-b215-a6a7d304b8e7  RAC3

I went to create the keyspace:

1. First I tried using cqlsh:

   create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3};

After this, I checked that the keyspace was properly created by running in cqlsh:

   select * from system.schema_keyspaces;

    keyspace_name | durable_writes | strategy_class                                       | strategy_options
   ---------------+----------------+------------------------------------------------------+------------------------
    system_auth   | True           | org.apache.cassandra.locator.SimpleStrategy          | {replication_factor:1}
    foo           | True           | org.apache.cassandra.locator.NetworkTopologyStrategy | {dc1:3}
    system        | True           | org.apache.cassandra.locator.LocalStrategy           | {}
    system_traces | True           | org.apache.cassandra.locator.SimpleStrategy          | {replication_factor:1}

But if I run nodetool describering foo, it will not show anything
Re: Issues with CQLSH in Cassandra 1.2
Hi Aaron,

I'm using PropertyFileSnitch, and my cassandra-topology.properties looks like this:

# Cassandra Node IP=Data Center:Rack

# default for unknown nodes
default=DC1:RAC1

# all known nodes
10.11.1.108=DC1:RAC1
10.11.1.109=DC1:RAC2
10.11.1.200=DC1:RAC3

Cheers,
Gabi

On 1/25/13 4:38 AM, aaron morton wrote:

Can you provide details of the snitch configuration and the number of nodes you have?

Cheers
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 25/01/2013, at 9:39 AM, Gabriel Ciuloaica gciuloa...@gmail.com wrote:

Hi Tyler,

No, it was just a typo in the email; I changed the names of the DC in the email after copy/paste from the output of the tools. It is quite easy to reproduce (assuming you have a correct configuration for NetworkTopologyStrategy, with vnodes (default, 256)):

1. Launch cqlsh and create the keyspace:

   create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3};

2. Exit cqlsh and run nodetool describering foo. You'll see something like this:

   TokenRange(start_token:2318224911779291128, end_token:2351629206880900296, endpoints:[], rpc_endpoints:[], endpoint_details:[])
   TokenRange(start_token:-8291638263612363845, end_token:-8224756763869823639, endpoints:[], rpc_endpoints:[], endpoint_details:[])

3. Start cqlsh and run: drop keyspace foo;

4. Exit cqlsh, start cassandra-cli and run:

   create keyspace foo with placement_strategy = 'NetworkTopologyStrategy' AND strategy_options={DC1};

   If you run nodetool describering foo you'll see:

   TokenRange(start_token:2318224911779291128, end_token:2351629206880900296, endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], rpc_endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], endpoint_details:[EndpointDetails(host:10.11.1.200, datacenter:DC1, rack:RAC3), EndpointDetails(host:10.11.1.109, datacenter:DC1, rack:RAC2), EndpointDetails(host:10.11.1.108, datacenter:DC1, rack:RAC1)])
   TokenRange(start_token:-8291638263612363845, end_token:-8224756763869823639, endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], rpc_endpoints:[10.11.1.200, 10.11.1.109, 10.11.1.108], endpoint_details:[EndpointDetails(host:10.11.1.200, datacenter:DC1, rack:RAC3), EndpointDetails(host:10.11.1.109, datacenter:DC1, rack:RAC2), EndpointDetails(host:10.11.1.108, datacenter:DC1, rack:RAC1)])

Br,
Gabi

On 1/24/13 10:22 PM, Tyler Hobbs wrote:

Gabriel, it looks like you used DC1 for the datacenter name in your replication strategy options, while the actual datacenter name was DC-1 (based on the nodetool status output). Perhaps that was causing the problem?

On Thu, Jan 24, 2013 at 1:57 PM, Gabriel Ciuloaica gciuloa...@gmail.com wrote:

I do not think that it has anything to do with Astyanax, but after I recreated the keyspace with cassandra-cli, everything is working fine. Also, as I mention below, not even nodetool describering foo showed correct information for the tokens and endpoint_details if the keyspace was created with cqlsh.

Thanks,
Gabi

On 1/24/13 9:21 PM, Ivan Velykorodnyy wrote:

Hi, Astyanax is not 1.2 compatible yet: https://github.com/Netflix/astyanax/issues/191 Eran planned to make it in 1.57.x

On Thursday, January 24, 2013, Gabriel Ciuloaica wrote:

Hi, I have spent half of the day today trying to make a new Cassandra cluster work. I have set up a single data center cluster, using NetworkTopologyStrategy, DC1:3. I'm using the latest version of the Astyanax client to connect. After many hours of debugging, I found out that the problem may be in the cqlsh utility. So, after the cluster was up and running:

[me@cassandra-node1 cassandra]$ nodetool status
Datacenter: DC-1
==
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.11.1.109  59.1 KB   256     0.0%              726689df-edc3-49a0-b680-370953994a8c  RAC2
UN  10.11.1.108  67.49 KB  256     0.0%              73cd86a9-4efb-4407-9fe8-9a1b3a277af7  RAC1
UN  10.11.1.200  59.84 KB  64      0.0%              d6d700d4-28aa-4722-b215-a6a7d304b8e7  RAC3

I went to create the keyspace:

1. First I tried using cqlsh:

   create keyspace foo with replication= {'class':'NetworkTopologyStrategy','DC1':3};

After this, I checked that the keyspace was properly created by running in cqlsh:

   select * from system.schema_keyspaces;

    keyspace_name | durable_writes | strategy_class                                       | strategy_options
   ---------------+----------------+------------------------------------------------------+------------------------
    system_auth   | True           |
Re: Cassandra pending compaction tasks keeps increasing
Do you mean 90% of the reads should come from 1 SSTable?

By the way, after I finished the data migration, I ran nodetool repair -pr on one of the nodes. Before nodetool repair, all the nodes had the same disk space usage. After I ran nodetool repair, the disk space for that node jumped from 135G to 220G, and there are more than 15000 pending compaction tasks. After a while, Cassandra started to throw exceptions like the one below and stopped compacting. I had to restart the node. By the way, we are using 1.1.7. Something doesn't seem right.

INFO [CompactionExecutor:108804] 2013-01-24 22:23:10,427 CompactionTask.java (line 109) Compacting [SSTableReader(path='/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-753782-Data.db')]
INFO [CompactionExecutor:108804] 2013-01-24 22:23:11,610 CompactionTask.java (line 221) Compacted to [/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754996-Data.db,]. 5,259,403 to 5,259,403 (~100% of original) bytes for 1,983 keys at 4.268730MB/s. Time: 1,175ms.
INFO [CompactionExecutor:108805] 2013-01-24 22:23:11,617 CompactionTask.java (line 109) Compacting [SSTableReader(path='/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754880-Data.db')]
INFO [CompactionExecutor:108805] 2013-01-24 22:23:12,828 CompactionTask.java (line 221) Compacted to [/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754997-Data.db,]. 5,272,746 to 5,272,746 (~100% of original) bytes for 1,941 keys at 4.152339MB/s. Time: 1,211ms.
ERROR [CompactionExecutor:108806] 2013-01-24 22:23:13,048 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[CompactionExecutor:108806,1,main]
java.lang.StackOverflowError
    at java.util.AbstractList$Itr.hasNext(Unknown Source)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)

----- Original Message -----
From: aaron morton aa...@thelastpickle.com
To: user@cassandra.apache.org
Sent: Wednesday, January 23, 2013 2:40:45 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

The histogram does not look right to me: too many SSTables for an LCS CF. It's a symptom, not a cause. If LCS is catching up, though, it should be more like the distribution in the linked article.

Cheers
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 23/01/2013, at 10:57 AM, Jim Cistaro jcist...@netflix.com wrote:

What version are you using? Are you seeing any compaction-related assertions in the logs? Might be https://issues.apache.org/jira/browse/CASSANDRA-4411

We experienced this problem of the count only decreasing to a certain number and then stopping. If you are idle, it should go to 0. I have not seen it overestimate for zero, only for non-zero amounts.

As for timeouts etc., you will need to look at things like nodetool tpstats to see if you have pending transactions queueing up.
Jc

From: Wei Zhu wz1...@yahoo.com
Reply-To: user@cassandra.apache.org, Wei Zhu wz1...@yahoo.com
Date: Tuesday, January 22, 2013 12:56 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra pending compaction tasks keeps increasing

Thanks Aaron and Jim for your replies. The data import is done. We have about 135G on each node and about 28K SSTables. For normal operation, we only have about 90 writes per second, but when I run nodetool compactionstats, it remains at 9 and hardly changes. I guess it's just an estimated number.

When I ran the histogram:

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1       2644      0              0             0         18660057
2       8204      0              0             0         9824270
3       11198     0              0             0         6968475
4       4269      6              0             0         5510745
5       517       29             0             0         4595205

You can see about half of the reads result in 3 SSTables. The majority of read latencies are under 5ms; only a dozen are over 10ms. We haven't fully turned on reads yet, only 60 reads per second. We saw about 20 read timeouts during the past 12 hours. Not a single warning in the Cassandra log.

Is it normal for Cassandra to time out some requests? We set the rpc timeout to 1s; it shouldn't time out any of them?

Thanks.
-Wei

From: aaron morton aa...@thelastpickle.com
To: user@cassandra.apache.org
Sent: Monday, January 21, 2013 12:21 AM
Subject: Re: Cassandra pending compaction tasks keeps increasing

The main guarantee LCS gives you is that most reads will only touch 1
Re: Cassandra pending compaction tasks keeps increasing
Increasing the stack size in cassandra-env.sh should help you get past the stack overflow. Doesn't help with your original problem though.

On Fri, Jan 25, 2013 at 12:00 AM, Wei Zhu wz1...@yahoo.com wrote:

Well, even after restart, it throws the same exception. I am basically stuck. Any suggestion to clear the pending compaction tasks? Below is the end of the stack trace:

    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$3.iterator(Sets.java:667)
    at com.google.common.collect.Sets$3.size(Sets.java:670)
    at com.google.common.collect.Iterables.size(Iterables.java:80)
    at org.apache.cassandra.db.DataTracker.buildIntervalTree(DataTracker.java:557)
    at org.apache.cassandra.db.compaction.CompactionController.init(CompactionController.java:69)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:105)
    at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
    at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Any suggestion is very much appreciated.
-Wei

----- Original Message -----
From: Wei Zhu wz1...@yahoo.com
To: user@cassandra.apache.org
Sent: Thursday, January 24, 2013 10:55:07 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

Do you mean 90% of the reads should come from 1 SSTable?

By the way, after I finished the data migration, I ran nodetool repair -pr on one of the nodes. Before nodetool repair, all the nodes had the same disk space usage. After I ran nodetool repair, the disk space for that node jumped from 135G to 220G, and there are more than 15000 pending compaction tasks. After a while, Cassandra started to throw exceptions like the one below and stopped compacting. I had to restart the node. By the way, we are using 1.1.7. Something doesn't seem right.

INFO [CompactionExecutor:108804] 2013-01-24 22:23:10,427 CompactionTask.java (line 109) Compacting [SSTableReader(path='/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-753782-Data.db')]
INFO [CompactionExecutor:108804] 2013-01-24 22:23:11,610 CompactionTask.java (line 221) Compacted to [/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754996-Data.db,]. 5,259,403 to 5,259,403 (~100% of original) bytes for 1,983 keys at 4.268730MB/s. Time: 1,175ms.
INFO [CompactionExecutor:108805] 2013-01-24 22:23:11,617 CompactionTask.java (line 109) Compacting [SSTableReader(path='/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754880-Data.db')]
INFO [CompactionExecutor:108805] 2013-01-24 22:23:12,828 CompactionTask.java (line 221) Compacted to [/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754997-Data.db,]. 5,272,746 to 5,272,746 (~100% of original) bytes for 1,941 keys at 4.152339MB/s. Time: 1,211ms.
ERROR [CompactionExecutor:108806] 2013-01-24 22:23:13,048 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[CompactionExecutor:108806,1,main]
java.lang.StackOverflowError
    at java.util.AbstractList$Itr.hasNext(Unknown Source)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:517)
    at com.google.common.collect.Iterators$3.hasNext(Iterators.java:114)

----- Original Message -----
From: aaron morton aa...@thelastpickle.com
To: user@cassandra.apache.org
Sent: Wednesday, January 23, 2013 2:40:45 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

The histogram does not look right to me: too many SSTables for an LCS CF. It's a symptom, not a cause. If LCS is catching up, though, it should be more like the distribution in the linked article.

Cheers - Aaron Morton Freelance
Re: Cassandra pending compaction tasks keeps increasing
Thanks Derek. In cassandra-env.sh, it says:

# reduce the per-thread stack size to minimize the impact of Thrift
# thread-per-client. (Best practice is for client connections to
# be pooled anyway.) Only do so on Linux where it is known to be
# supported.
# u34 and greater need 180k
JVM_OPTS="$JVM_OPTS -Xss180k"

What value should I use? Java defaults to 400K? Maybe try that first.

Thanks.
-Wei

----- Original Message -----
From: Derek Williams de...@fyrie.net
To: user@cassandra.apache.org, Wei Zhu wz1...@yahoo.com
Sent: Thursday, January 24, 2013 11:06:00 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

Increasing the stack size in cassandra-env.sh should help you get past the stack overflow. Doesn't help with your original problem though.

On Fri, Jan 25, 2013 at 12:00 AM, Wei Zhu wz1...@yahoo.com wrote:

Well, even after restart, it throws the same exception. I am basically stuck. Any suggestion to clear the pending compaction tasks? Below is the end of the stack trace:

    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$1.iterator(Sets.java:578)
    at com.google.common.collect.Sets$3.iterator(Sets.java:667)
    at com.google.common.collect.Sets$3.size(Sets.java:670)
    at com.google.common.collect.Iterables.size(Iterables.java:80)
    at org.apache.cassandra.db.DataTracker.buildIntervalTree(DataTracker.java:557)
    at org.apache.cassandra.db.compaction.CompactionController.init(CompactionController.java:69)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:105)
    at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
    at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Any suggestion is very much appreciated.
-Wei

----- Original Message -----
From: Wei Zhu wz1...@yahoo.com
To: user@cassandra.apache.org
Sent: Thursday, January 24, 2013 10:55:07 PM
Subject: Re: Cassandra pending compaction tasks keeps increasing

Do you mean 90% of the reads should come from 1 SSTable?

By the way, after I finished the data migration, I ran nodetool repair -pr on one of the nodes. Before nodetool repair, all the nodes had the same disk space usage. After I ran nodetool repair, the disk space for that node jumped from 135G to 220G, and there are more than 15000 pending compaction tasks. After a while, Cassandra started to throw exceptions like the one below and stopped compacting. I had to restart the node. By the way, we are using 1.1.7. Something doesn't seem right.

INFO [CompactionExecutor:108804] 2013-01-24 22:23:10,427 CompactionTask.java (line 109) Compacting [SSTableReader(path='/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-753782-Data.db')]
INFO [CompactionExecutor:108804] 2013-01-24 22:23:11,610 CompactionTask.java (line 221) Compacted to [/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754996-Data.db,]. 5,259,403 to 5,259,403 (~100% of original) bytes for 1,983 keys at 4.268730MB/s. Time: 1,175ms.
INFO [CompactionExecutor:108805] 2013-01-24 22:23:11,617 CompactionTask.java (line 109) Compacting [SSTableReader(path='/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754880-Data.db')]
INFO [CompactionExecutor:108805] 2013-01-24 22:23:12,828 CompactionTask.java (line 221) Compacted to [/ssd/cassandra/data/zoosk/friends/zoosk-friends-hf-754997-Data.db,]. 5,272,746 to 5,272,746 (~100% of original) bytes for 1,941 keys at 4.152339MB/s. Time: 1,211ms.
ERROR [CompactionExecutor:108806] 2013-01-24 22:23:13,048