Cassandra leap second
Is it OK to run this fix: https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/ ? We are seeing high CPU consumption for the Cassandra process.
RE: Cassandra leap second
Reboot of the machine worked.

From: nair...@outlook.com
To: user@cassandra.apache.org
Subject: Cassandra leap second
Date: Wed, 1 Jul 2015 02:54:53 +
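For readers who cannot follow the link: the workaround described in that Mozilla post (sketched here from the commonly circulated advice of the time, so treat it as an assumption rather than a verified procedure) re-sets the system clock, which clears the kernel timer state that causes the CPU spin:

```shell
# Hedged sketch of the 2012/2015 leap-second workaround from the linked
# post: stop ntpd so it does not immediately re-step the clock, set the
# date "to itself" to reset the kernel's timekeeping state, then restart
# ntpd. Run as root; init-script paths vary by distribution.
/etc/init.d/ntp stop
date -s "$(date)"
/etc/init.d/ntp start
```

As the thread above notes, a full reboot achieves the same reset if stepping the clock is not an option.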
Re: Issue when node goes away?
That is a GREAT lead! So it looks like I can't add a few nodes of the new version to the cluster, have it settle down, and then upgrade the rest?

On Tue, Jun 30, 2015 at 11:58 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Would it matter that I'm mixing Cassandra versions? From http://docs.datastax.com/en/upgrade/doc/upgrade/datastax_enterprise/upgrdLim.html, "General upgrade limitations": do not run nodetool repair; do not enable new features; do not issue DDL or TRUNCATE queries during a rolling restart. *During upgrades, the nodes on different versions show a schema disagreement.* I think this is a good lead. C*heers, Alain

2015-06-30 20:22 GMT+02:00 David Aronchick aronch...@gmail.com:

I appreciate the thoughts! My issue is that it seems to work perfectly until the node goes away. Would it matter that I'm mixing Cassandra versions (2.1.4 and 2.1.5)?

On Tue, Jun 30, 2015 at 5:23 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi David. What does a nodetool describecluster output look like? My guess is you might have a schema version desynchronisation. If you see a node with a different schema version you might want to try nodetool resetlocalschema (reset the node's local schema and resync it). You asked for any thoughts; this is a thought, not sure if it will help, I hope so. C*heers, Alain

2015-06-30 1:44 GMT+02:00 Robert Coli rc...@eventbrite.com:

On Mon, Jun 29, 2015 at 2:43 PM, David Aronchick aronch...@gmail.com wrote: Ping--- any thoughts here?

I don't have any thoughts on your specific issue at this time, but FWIW #cassandra on freenode is sometimes a better forum for interactive debugging of operational edge cases. =Rob
Re: Error while adding a new node.
Looks like you have a "too many open files" issue. Increase the ulimit for the user. If you are starting the Cassandra daemon as user cassandra, increase the ulimit for that user.

On Jun 30, 2015, at 21:16, Neha Trivedi nehajtriv...@gmail.com wrote:

Hello, I have a 4-node cluster with SimpleSnitch, running Cassandra 2.1.3. I am trying to add a new node (Cassandra 2.1.7) and I get the following error:

ERROR [STREAM-IN-] 2015-06-30 05:13:48,516 JVMStabilityInspector.java:94 - JVM state determined to be unstable. Exiting forcefully due to: java.io.FileNotFoundException: /var/lib/cassandra/data/-Index.db (Too many open files)

I increased the MAX_HEAP_SIZE; then I get:

ERROR [CompactionExecutor:9] 2015-06-30 23:31:44,792 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:9,1,main] java.lang.RuntimeException: java.io.FileNotFoundException: /var/lib/cassandra/data/-Data.db (Too many open files) at org.apache.cassandra.io.compress.CompressedThrottledReader.open(CompressedThrottledReader.java:52) ~[apache-cassandra-2.1.7.jar:2.1.7]

Is it because of the different versions of Cassandra (2.1.3 and 2.1.7)?

regards, N
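Arun's advice can be applied roughly as follows. This is a hedged sketch: the limit value and paths are the common recommendations of the era (DataStax docs suggested 100000 for nofile), not details taken from the thread, and distributions differ in where PAM limits are configured.

```shell
# Inspect the limits the running Cassandra JVM actually has (the
# "Max open files" row is the one that matters here):
cat "/proc/$(pgrep -f CassandraDaemon | head -1)/limits"

# Raise the limit persistently for the cassandra user (requires
# pam_limits, the default on Debian/RHEL; a drop-in under
# /etc/security/limits.d/ works too):
echo 'cassandra - nofile 100000' | sudo tee -a /etc/security/limits.conf

# Restart the daemon so the new limit takes effect:
sudo service cassandra restart
```

Note that limits.conf applies per user, so if the daemon is started via an init script as root that then drops to the cassandra user, the cassandra entry is the one that counts.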
Re: Error while adding a new node.
Thanks Arun! I will try and get back!

On Wed, Jul 1, 2015 at 10:32 AM, Arun arunsi...@gmail.com wrote: Looks like you have a "too many open files" issue. Increase the ulimit for the user. [...]
Decommissioned node still in Gossip
Hi, I have a cluster which had 4 datacenters running 2.0.12. Last week one of the datacenters was decommissioned using nodetool decommission on each of its servers in turn. This seemed to work fine until one of the nodes started appearing in the logs of all of the remaining servers with messages like:

INFO [GossipStage:3] 2015-06-30 11:22:39,189 Gossiper.java (line 924) InetAddress /172.29.8.8 is now DOWN
INFO [GossipStage:3] 2015-06-30 11:22:39,190 StorageService.java (line 1773) Removing tokens [...] for /172.29.8.8

These come up in the log every minute or two. I believe it may have re-appeared after a repair, but I'm not sure. The problem is that this node does not exist in nodetool status, nodetool gossipinfo, or in the system.peers table. So how can I tell the cluster that this node is decommissioned?

Regards, Jeff
Insert (and delete) data loss?
Hi all, I configured a Cassandra cluster (3 nodes). Then I created a keyspace:

CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };

and a table:

CREATE TABLE chiamate_stabili (chiave TEXT PRIMARY KEY, valore BLOB);

I inserted (synchronously) 10.000 rows with a Java client that connects to one of the 3 nodes during the connection phase. Each row has a "valore" that contains an array of 100K bytes. When the Java client ends, I wait some seconds and then I try this command inside cqlsh:

SELECT COUNT(*) FROM test.chiamate_stabili LIMIT 100000;

The result is often 10.000 but sometimes less! The same records are found with my Java client. Here is the Java query code:

for (int i = 1; i <= 10000; i++) {
    String key = "key-" + i;
    Clause eqClause = QueryBuilder.eq("chiave", key);
    Statement statement = QueryBuilder.select().all().from("test", tableName).where(eqClause);
    session.execute(statement);
}

Same behaviour with deletes. When I try to delete all the records, sometimes the table is empty, but sometimes records are still present. I think that something is wrong in my code or (more probably) in the cluster configuration, but I don't know which kind of tuning or configuration options I can explore. Any help is very appreciated. Many thanks in advance.

Moreno

* This e-mail and its attachments are intended for the addressee(s) only and are confidential and/or may contain legally privileged information. If you have received this message by mistake or are not one of the addressees above, you may take no action based on it, and you may not copy or show it to anyone; please reply to this e-mail and point out the error which has occurred.
Re: Insert (and delete) data loss?
Hi Moreno, Which consistency level are you using? If you're using ONE, that may make sense: depending on the partitioning and on the node coordinating the query, different values may be returned. Hope it helps. Regards, Carlos Alonso | Software Engineer | @calonso https://twitter.com/calonso

On 30 June 2015 at 13:34, Mauri Moreno Cesare morenocesare.ma...@italtel.com wrote: Hi all, I configured a Cassandra cluster (3 nodes) [...]
Re: Insert (and delete) data loss?
Can you try two more tests?

1) Write the way you are, perform a repair on all nodes, then read the way you are. Then wipe the data.
2) Write with CL QUORUM, read with CL QUORUM.

On Tue, Jun 30, 2015 at 8:34 AM, Mauri Moreno Cesare morenocesare.ma...@italtel.com wrote: Hi all, I configured a Cassandra cluster (3 nodes) [...]

--
Jason Kushmaul | 517.899.7852 Engineering Manager
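With the DataStax Java driver 2.x (which the QueryBuilder snippets in this thread suggest), the consistency level can be set per statement. A minimal sketch of Jason's test 2) follows; the table and column names are taken from the thread, the rest is illustrative and needs the driver on the classpath plus a running cluster:

```java
// Sketch only: assumes the DataStax Java driver 2.x used elsewhere in
// this thread. With RF=3, QUORUM on both writes and reads means any
// read quorum overlaps any write quorum (R + W > RF), so a successful
// QUORUM read always sees the latest successful QUORUM write.
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.querybuilder.QueryBuilder;

public class QuorumReadExample {
    static void readAtQuorum(Session session, String tableName, String key) {
        Statement statement = QueryBuilder.select().all()
                .from("test", tableName)
                .where(QueryBuilder.eq("chiave", key));
        // Per-statement consistency level; the driver default is ONE.
        statement.setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(statement);
    }
}
```

The same setConsistencyLevel call applies to the insert and delete statements; both sides must use QUORUM for the overlap guarantee to hold.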
RE: Insert (and delete) data loss?
Hi Carlos, first I used consistency ONE, then ALL (I retried with ALL to check whether the problem would disappear). Thanks, Moreno

From: Carlos Alonso [mailto:i...@mrcalonso.com] Sent: martedì 30 giugno 2015 15.24 To: user@cassandra.apache.org Subject: Re: Insert (and delete) data loss?

Hi Moreno, Which consistency level are you using? [...]
Re: Stream failure while adding a new node
Hi David, Are you sure you ran the repair entirely (9 days + repair logs OK on the OpsCenter server) before adding the 10th node? This is important to avoid potential data loss! Did you set auto_bootstrap to true on this 10th node? C*heers, Alain

2015-06-29 14:54 GMT+02:00 David CHARBONNIER david.charbonn...@rgsystem.com:

Hi, We're using Cassandra 2.0.8.39 through DataStax Enterprise 4.5.1 with a 9-node cluster. We need to add a few new nodes to the cluster but we're experiencing an issue we don't know how to solve. Here is exactly what we did:

- We had 8 nodes and needed to add a few more.
- We tried to add a 9th node, but the stream was stuck for a very long time and the bootstrap never finished (related to the streaming_socket_timeout_in_ms default value in cassandra.yaml).
- We applied a solution given by a DataStax architect: restart the node with auto_bootstrap set to false and run a repair.
- After this issue, we went on to patch the default configuration on all our nodes to avoid this problem and made a rolling restart of the cluster.
- Then we tried adding a 10th node, but it receives streams from only one node (node2).
Here are the logs on the problematic node (node10):

INFO [main] 2015-06-26 15:25:59,490 StreamResultFuture.java (line 87) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Executing streaming plan for Bootstrap
INFO [main] 2015-06-26 15:25:59,490 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node6
INFO [main] 2015-06-26 15:25:59,491 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node5
INFO [main] 2015-06-26 15:25:59,492 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node4
INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node3
INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node9
INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node8
INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node7
INFO [main] 2015-06-26 15:25:59,494 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node1
INFO [main] 2015-06-26 15:25:59,494 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node2
INFO [STREAM-IN-/node6] 2015-06-26 15:25:59,515 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node6 is complete
INFO [STREAM-IN-/node4] 2015-06-26 15:25:59,516 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node4 is complete
INFO [STREAM-IN-/node5] 2015-06-26 15:25:59,517 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node5 is complete
INFO [STREAM-IN-/node3] 2015-06-26 15:25:59,527 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node3 is complete
INFO [STREAM-IN-/node1] 2015-06-26 15:25:59,528 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node1 is complete
INFO [STREAM-IN-/node8] 2015-06-26 15:25:59,530 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node8 is complete
INFO [STREAM-IN-/node7] 2015-06-26 15:25:59,531 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node7 is complete
INFO [STREAM-IN-/node9] 2015-06-26 15:25:59,533 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node9 is complete
INFO [STREAM-IN-/node2] 2015-06-26 15:26:04,874 StreamResultFuture.java (line 173) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Prepare completed. Receiving 171 files (14844054090 bytes), sending 0 files (0 bytes)

On the other nodes (not node2, which does stream data), there is an error saying that node10 has no hostID. Did you run into this issue, or do you have any idea how to solve it? Thank you for your help.

Best regards,
David CHARBONNIER
Sysadmin
T : +33 411 934 200
david.charbonn...@rgsystem.com
ZAC Aéroport, 125 Impasse Adam Smith, 34470 Pérols - France
www.rgsystem.com
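For reference, the timeout mentioned in the steps above lives in cassandra.yaml. A non-zero value prevents a bootstrap stream from hanging forever on a half-dead socket; the value below is a commonly recommended setting of that era, not one taken from the thread:

```yaml
# cassandra.yaml (Cassandra 2.0.x): the default of 0 means "no timeout",
# which lets a bootstrap stream hang indefinitely on a dead connection.
# One hour (in milliseconds) was a frequently recommended value.
streaming_socket_timeout_in_ms: 3600000
```

Changing this requires a restart of each node, which matches the rolling restart the thread describes.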
RE: Insert (and delete) data loss?
Thank you Jason! OK, I will try with QUORUM and, if the problem is still present, I'll run the "nodetool repair" command. (But is it good practice to use "nodetool repair" in a production environment?) Moreno

From: Jason Kushmaul [mailto:jkushm...@rocketfuelinc.com] Sent: martedì 30 giugno 2015 15.25 To: user@cassandra.apache.org Subject: Re: Insert (and delete) data loss?

Can you try two more tests? [...]
Re: Error while adding a new node.
Arun, I am logging on to the server as root and running "sudo service cassandra start". regards, Neha

On Wed, Jul 1, 2015 at 11:00 AM, Neha Trivedi nehajtriv...@gmail.com wrote: Thanks Arun! I will try and get back! [...]
Re: Decommissioned node still in Gossip
I was having exactly the same issue with the same version. Check your seed list and make sure it contains only the live nodes. I know that seeds are only read when Cassandra starts, but updating the seed list to the live nodes and then doing a rolling restart fixed this issue for me. I hope this helps you. Thanks

Sent from my iPhone

On Jun 30, 2015, at 4:42 AM, Jeff Williams je...@wherethebitsroam.com wrote: Hi, I have a cluster which had 4 datacenters running 2.0.12. [...]
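Another remedy often suggested for ghost gossip entries on 2.0.x, if the seed-list fix does not clear it, is to force-remove the endpoint over JMX. This is a hedged sketch, not something confirmed in the thread: the MBean operation Gossiper.unsafeAssassinateEndpoint existed in 2.0 (the friendlier `nodetool assassinate` only arrived in later releases), the jmxterm invocation below is illustrative, and the operation is "unsafe" precisely because it drops the node without streaming its data.

```shell
# Illustrative: invoke unsafeAssassinateEndpoint on the ghost address
# via jmxterm (any JMX client pointed at port 7199 works). The IP is
# the one from the log messages above.
java -jar jmxterm.jar -l localhost:7199 -n <<'EOF'
run -b org.apache.cassandra.net:type=Gossiper unsafeAssassinateEndpoint 172.29.8.8
EOF
```

Only reach for this after the node is genuinely gone from nodetool status and system.peers, as is the case here; on a live node it would cause data loss.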
RE: Insert (and delete) data loss?
Problem is still present ☹ First I changed ConsistencyLevel in java client code(from ONE to QUORUM). When I changed ConsistencyLevel I need to re-config Cassandra Cluster in a proper way (according to “Cassandra Calculator for Dummies”, I changed Keyspace’s Replication Factor, from 2 to 3: without this change Calculator told me that my cluster didn’t survive to node loss). Insert was OK (5 or 6 attempts, all of them with 10.000 records inserted). Delete was KO (I tried to delete 10.000 records but, after delete, select count() returned 3334 records: no Exception caught client side ☹). Even after “nodetool repair” (command executed on all 3 nodes) select count() returned 3334 records. Is there something else I can change in cluster configuration (or java client code?). Thanks again Moreno From: Jason Kushmaul [mailto:jkushm...@rocketfuelinc.com] Sent: martedì 30 giugno 2015 15.25 To: user@cassandra.apache.org Subject: Re: Insert (and delete) data loss? Can you try two more tests: 1) Write the way you are, perform a repair on all nodes, then read the way you are. wipe data 2) Write with CL quorum, read with CL quorum. On Tue, Jun 30, 2015 at 8:34 AM, Mauri Moreno Cesare morenocesare.ma...@italtel.commailto:morenocesare.ma...@italtel.com wrote: Hi all, I configured a Cassandra Cluster (3 nodes). Then I created a KEYSPACE: cql CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 }; and a table: cql CREATE TABLE chiamate_stabili (chiave TEXT PRIMARY KEY, valore BLOB); I inserted (synchronous) 10.000 rows (with a Java Client that connects to one of 3 nodes during connection phase). Each row has a “valore” that contains an array of 100K bytes. When Java Client ends, I wait some seconds and then I try this command inside cql: cql SELECT COUNT(*) FROM test.chiamate_stabili LIMIT 1; Result is often 1 but sometime ! The same records are found with my Java Client. 
Here is the Java query code: for (int i = 1; i <= 10000; i++) { String key = "key-" + i; Clause eqClause = QueryBuilder.eq("chiave", key); Statement statement = QueryBuilder.select().all().from("test", tableName).where(eqClause); session.execute(statement); } Same behaviour with deletes. When I try to delete all the records, sometimes the table is empty, but sometimes records are still present. I think that something is wrong in my code or (probably) in the cluster configuration, but I don’t know which kind of tuning or configuration options I can explore. Any help is very appreciated. Many thanks in advance. Moreno Internet Email Confidentiality Footer La presente comunicazione, con le informazioni in essa contenute e ogni documento o file allegato, e' rivolta unicamente alla/e persona/e cui e' indirizzata ed alle altre da questa autorizzata/e a riceverla. Se non siete i destinatari/autorizzati siete avvisati che qualsiasi azione, copia, comunicazione, divulgazione o simili basate sul contenuto di tali informazioni e' vietata e potrebbe essere contro la legge (art. 616 C.P., D.Lgs n. 196/2003 Codice in materia di protezione dei dati personali). Se avete ricevuto questa comunicazione per errore, vi preghiamo di darne immediata notizia al mittente e di distruggere il messaggio originale e ogni file allegato senza farne copia alcuna o riprodurne in alcun modo il contenuto. * This e-mail and its attachments are intended for the addressee(s) only and are confidential and/or may contain legally privileged information. If you have received this message by mistake or are not one of the addressees above, you may take no action based on it, and you may not copy or show it to anyone; please reply to this e-mail and point out the error which has occurred. 
-- Jason Kushmaul | 517.899.7852 Engineering Manager
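Following up on Moreno's replication-factor change above: the "Cassandra Calculator" result is just the quorum formula, quorum = floor(RF/2) + 1. With RF=2 a quorum is 2 replicas, i.e. all of them, so losing one node fails every QUORUM operation; with RF=3 a quorum is still 2, so one node can be down. A small sketch of that arithmetic (my own illustration, not code from the thread):

```java
// Sketch of the quorum arithmetic behind the thread's RF change (2 -> 3).
// Not Cassandra code -- just the standard formula: quorum = floor(RF/2) + 1.
public class QuorumMath {
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    // How many node losses QUORUM can tolerate: RF - quorum(RF).
    static int tolerableFailures(int rf) {
        return rf - quorum(rf);
    }

    public static void main(String[] args) {
        // RF=2: quorum is 2 of 2 replicas -- one dead node breaks QUORUM.
        System.out.println("RF=2 quorum=" + quorum(2) + ", tolerates " + tolerableFailures(2) + " down");
        // RF=3: quorum is 2 of 3 replicas -- one dead node is fine.
        System.out.println("RF=3 quorum=" + quorum(3) + ", tolerates " + tolerableFailures(3) + " down");
    }
}
```

This is why the calculator reported that RF=2 with QUORUM reads/writes "didn't survive node loss" while RF=3 does.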
Re: Read Consistency
Agree, Tyler. I think it's our application problem. If the client returns a failed write in spite of retries, the application must have a rollback mechanism to make sure the old state is restored. A failed write may be because the CL was not met even though one node successfully wrote. Cassandra won't do cleanup or rollback on that one node, so you need to do it yourself to make sure data integrity is maintained where strong consistency is a requirement. Right? We use Hector by the way and are planning to switch to the CQL driver.. Thanks Anuj Wadehra Sent from Yahoo Mail on Android From: Tyler Hobbs ty...@datastax.com Date: Tue, 30 Jun, 2015 at 10:42 pm Subject: Re: Read Consistency I think these scenarios are still possible even when we are writing at QUORUM, if we have dropped mutations in our cluster. It was very strange in our case: we had RF=3 and READ/WRITE CL=QUORUM, and we had dropped mutations for a long time, but we never faced any scenario like scenario 1 where the READ went to nodes 2 and 3 and the read didn't return any data. Any comments on this are welcome?? They are not possible if you write at QUORUM, because QUORUM guarantees that at least two of the nodes will have the most recent version of the data. If fewer than two replicas respond successfully (meaning two replicas dropped mutations), you will get an error on the write. All of the drivers and cqlsh default to consistency level ONE, so I would double check that your application is setting the consistency level correctly. On Sun, Jun 28, 2015 at 12:55 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Sorry for the typo in your name, Owen !! Anuj Sent from Yahoo Mail on Android From: Anuj Wadehra anujw_2...@yahoo.co.in Date: Sun, 28 Jun, 2015 at 11:11 pm Subject: Re: Read Consistency Agree Owem !! Response in both scenarios would depend on the 2 replicas chosen for meeting QUORUM. But the intent is to get the tricky part of scenario 1 answered, i.e. when the 2 nodes selected are one with and one without the data. 
As per my understanding of the Read Path and the documentation https://wiki.apache.org/cassandra/ArchitectureInternals: 1. Data would be read from the closest node and a digest would be received from one more replica. 2. If a mismatch is found between the digests, a blocking read happens on the same 2 replicas (not all replicas, so in scenario 2, if 2 nodes didn't have the latest data and the third node has it, stale data would still be returned). I think these scenarios are still possible even when we are writing at QUORUM, if we have dropped mutations in our cluster. It was very strange in our case: we had RF=3 and READ/WRITE CL=QUORUM, and we had dropped mutations for a long time, but we never faced any scenario like scenario 1 where the READ went to nodes 2 and 3 and the read didn't return any data. Any comments on this are welcome?? Thanks for clarifying further, as the discussion could have misled a few.. Thanks Anuj On Sunday, 28 June 2015 6:16 AM, Owen Kim ohech...@gmail.com wrote: Sorry. I have to jump in and disagree. Data is not guaranteed to be returned in scenario 1. Since two nodes do not have the data and those two nodes may be the only nodes queried at that CL, the read query may return data or not. Similarly, in scenario 2, the query may not return the most recent data because the node with that data may not be queried at all (the other two may). Keep in mind, these scenarios seem to generally assume you are not writing data consistently at QUORUM CL, so your reads may be inconsistent. On Tuesday, June 23, 2015, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Thanks. So all of us agree that in scenario 1, data would be returned, and that was my initial understanding. Anuj Sent from Yahoo Mail on Android From: Anuj Wadehra anujw_2...@yahoo.co.in Date: Wed, 24 Jun, 2015 at 12:15 am Subject: Re: Read Consistency I'm more confused... Different responses. Anyone who can explain with 100% certainty? 
Thanks Anuj Sent from Yahoo Mail on Android From: arun sirimalla arunsi...@gmail.com Date: Wed, 24 Jun, 2015 at 12:00 am Subject: Re: Read Consistency Thanks, good to know that. On Tue, Jun 23, 2015 at 11:27 AM, Philip Thompson philip.thomp...@datastax.com wrote: Yes, that is what he means. CL is for how many nodes need to respond, not agree. On Tue, Jun 23, 2015 at 2:26 PM, arun sirimalla arunsi...@gmail.com wrote: So do you mean that with CL set to QUORUM, if data is only on one node, the query still succeeds? On Tue, Jun 23, 2015 at 11:21 AM, Philip Thompson philip.thomp...@datastax.com wrote: Anuj, In the first scenario, the data from the single node holding data is returned. The query will not fail if the consistency level is met, even if the read was inconsistent. On Tue, Jun 23, 2015 at 2:16 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Why would it fail, and with what Thrift error? What if the data didn't exist on any of the nodes? A query won't fail if it doesn't find data. Not convinced.. Sent from Yahoo Mail on
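The guarantee debated in this thread, that QUORUM writes plus QUORUM reads cannot miss the latest data, follows from set overlap: whenever W + R > RF, any write quorum and any read quorum share at least one replica. A small illustration (my own sketch, not Cassandra internals), enumerating every 2-of-3 replica subset:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: with RF=3, any QUORUM write set (2 replicas) and any QUORUM read
// set (2 replicas) must intersect, because 2 + 2 > 3. If writes went out at
// ONE (W=1), a disjoint read quorum exists -- the thread's "scenario 1".
public class QuorumOverlap {
    // All k-element subsets of replicas {0..rf-1}, each encoded as a bitmask.
    static List<Integer> subsets(int rf, int k) {
        List<Integer> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << rf); mask++) {
            if (Integer.bitCount(mask) == k) out.add(mask);
        }
        return out;
    }

    // True if every write-quorum / read-quorum pair shares at least one replica.
    static boolean alwaysOverlaps(int rf, int w, int r) {
        for (int ws : subsets(rf, w))
            for (int rs : subsets(rf, r))
                if ((ws & rs) == 0) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println("W=2, R=2, RF=3: " + alwaysOverlaps(3, 2, 2)); // true
        System.out.println("W=1, R=2, RF=3: " + alwaysOverlaps(3, 1, 2)); // false
    }
}
```

This is exactly Owen's caveat: the scenarios assume writes were not consistently at QUORUM, in which case the read quorum may miss the one replica holding the data.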
About Contents Suggestions for Cassandra with Python
Hi All, I have a general understanding of how Cassandra works and basic knowledge of Python, and I want to conduct a session that takes the audience from intermediate to advanced. Which contents would you recommend for the workshop with Python? If you can send me links, videos or blogs it will be very useful. Please also suggest additional contents so that the audience will benefit. https://goo.gl/RlPz4s Seeking guidance, Thanks !! -- Cheers, Mayur S Patil, Looking for R&D or Soft Engg positions, Pune, India.
Re: Read Consistency
I think these scenarios are still possible even when we are writing at QUORUM, if we have dropped mutations in our cluster. It was very strange in our case: we had RF=3 and READ/WRITE CL=QUORUM, and we had dropped mutations for a long time, but we never faced any scenario like scenario 1 where the READ went to nodes 2 and 3 and the read didn't return any data. Any comments on this are welcome?? They are not possible if you write at QUORUM, because QUORUM guarantees that at least two of the nodes will have the most recent version of the data. If fewer than two replicas respond successfully (meaning two replicas dropped mutations), you will get an error on the write. All of the drivers and cqlsh default to consistency level ONE, so I would double check that your application is setting the consistency level correctly. On Sun, Jun 28, 2015 at 12:55 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Sorry for the typo in your name, Owen !! Anuj Sent from Yahoo Mail on Android From: Anuj Wadehra anujw_2...@yahoo.co.in Date: Sun, 28 Jun, 2015 at 11:11 pm Subject: Re: Read Consistency Agree Owem !! Response in both scenarios would depend on the 2 replicas chosen for meeting QUORUM. But the intent is to get the tricky part of scenario 1 answered, i.e. when the 2 nodes selected are one with and one without the data. As per my understanding of the Read Path and the documentation https://wiki.apache.org/cassandra/ArchitectureInternals: 1. Data would be read from the closest node and a digest would be received from one more replica. 2. If a mismatch is found between the digests, a blocking read happens on the same 2 replicas (not all replicas, so in scenario 2, if 2 nodes didn't have the latest data and the third node has it, stale data would still be returned). I think these scenarios are still possible even when we are writing at QUORUM, if we have dropped mutations in our cluster. 
It was very strange in our case: we had RF=3 and READ/WRITE CL=QUORUM, and we had dropped mutations for a long time, but we never faced any scenario like scenario 1 where the READ went to nodes 2 and 3 and the read didn't return any data. Any comments on this are welcome?? Thanks for clarifying further, as the discussion could have misled a few.. Thanks Anuj On Sunday, 28 June 2015 6:16 AM, Owen Kim ohech...@gmail.com wrote: Sorry. I have to jump in and disagree. Data is not guaranteed to be returned in scenario 1. Since two nodes do not have the data and those two nodes may be the only nodes queried at that CL, the read query may return data or not. Similarly, in scenario 2, the query may not return the most recent data because the node with that data may not be queried at all (the other two may). Keep in mind, these scenarios seem to generally assume you are not writing data consistently at QUORUM CL, so your reads may be inconsistent. On Tuesday, June 23, 2015, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Thanks. So all of us agree that in scenario 1, data would be returned, and that was my initial understanding. Anuj Sent from Yahoo Mail on Android From: Anuj Wadehra anujw_2...@yahoo.co.in Date: Wed, 24 Jun, 2015 at 12:15 am Subject: Re: Read Consistency I'm more confused... Different responses. Anyone who can explain with 100% certainty? Thanks Anuj Sent from Yahoo Mail on Android From: arun sirimalla arunsi...@gmail.com Date: Wed, 24 Jun, 2015 at 12:00 am Subject: Re: Read Consistency Thanks, good to know that. On Tue, Jun 23, 2015 at 11:27 AM, Philip Thompson philip.thomp...@datastax.com wrote: Yes, that is what he means. CL is for how many nodes need to respond, not agree. On Tue, Jun 23, 2015 at 2:26 PM, arun sirimalla arunsi...@gmail.com wrote: So do you mean that with CL set to QUORUM, if data is only on one node, the query still succeeds? 
On Tue, Jun 23, 2015 at 11:21 AM, Philip Thompson philip.thomp...@datastax.com wrote: Anuj, In the first scenario, the data from the single node holding data is returned. The query will not fail if the consistency level is met, even if the read was inconsistent. On Tue, Jun 23, 2015 at 2:16 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Why would it fail, and with what Thrift error? What if the data didn't exist on any of the nodes? A query won't fail if it doesn't find data. Not convinced.. Sent from Yahoo Mail on Android From: arun sirimalla arunsi...@gmail.com Date: Tue, 23 Jun, 2015 at 11:39 pm Subject: Re: Read Consistency Scenario 1: Read query is fired for a key; data is found on one node and not found on the other two nodes that are responsible for the token corresponding to the key. Your read query will fail, as
Re: Migrate table data to another table
Another option is Brian's cassandra loader: https://github.com/brianmhess/cassandra-loader All the best, Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com On Tue, Jun 30, 2015 at 1:26 PM, John Sanda john.sa...@gmail.com wrote: You might want to take a look at CQLSSTableWriter[1] in the Cassandra source tree. http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated On Tue, Jun 30, 2015 at 1:18 PM, Umut Kocasaraç ukocasa...@gmail.com wrote: Hi, I want to change clustering order column of my table. As far as i know it is not possible to use alter command so i have created new table and i would like to move data from old table to this one. I am using Cassandra 2.0.7 and there is almost 100GB data on table. Is there any easy method to move data except Copy command. Thanks Umut -- - John
Re: Issue when node goes away?
I appreciate the thoughts! My issue is that it seems to work perfectly, until the node goes away. Would it matter that I'm mixing cassandra versions? (2.1.4 and 2.1.5)? On Tue, Jun 30, 2015 at 5:23 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi David, What does a nodetool describecluster output look like? My guess is you might be having a schema version desynchronisation. If you see a node with a different schema version you might want to try a nodetool resetlocalschema - Reset node's local schema and resync. You asked for any thoughts, this is a thought, not sure if it will help, I hope so. C*heers, Alain 2015-06-30 1:44 GMT+02:00 Robert Coli rc...@eventbrite.com: On Mon, Jun 29, 2015 at 2:43 PM, David Aronchick aronch...@gmail.com wrote: Ping--- any thoughts here? I don't have any thoughts on your specific issue at this time, but FWIW #cassandra on freenode is sometimes a better forum for interactive debugging of operational edge cases. =Rob
Re: Read Consistency
On Tue, Jun 30, 2015 at 12:27 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Agree Tyler. I think its our application problem. If client returns failed write in spite of retries, application must have a rollback mechanism to make sure old state is restored. Failed write may be because of the fact that CL was not met even though one node successfully wrote.Cassandra wont do cleanup or rollback on one node so you need to do it yourself to make sure that integrity of data is maintained in case strong consistency is a requirement. Right? Correct, if you get a WriteTimeout error, you don't know if any replicas have written the data or not. It's even possible that all replicas wrote the data but didn't respond to the coordinator in time. I suspect most users handle this situation by retrying the write with the same timestamp (which makes the operation idempotent). It's worth noting that if you get an Unavailable response, you are guaranteed that the data has not been written to any replicas, because the coordinator already knew that the replicas were down when it got the response. -- Tyler Hobbs DataStax http://datastax.com/
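Tyler's suggestion to retry with the same timestamp works because Cassandra reconciles cells by last-write-wins on the write timestamp, so replaying an identical (value, timestamp) pair can never change the stored result. A toy model of that reconciliation rule (my own sketch, not driver or server code):

```java
// Toy model of last-write-wins reconciliation: a retry that reuses the
// original write timestamp is idempotent, while a retry that picks a fresh
// timestamp is a second, distinct write.
public class LwwCell {
    String value;
    long timestamp = Long.MIN_VALUE;

    // Apply a write: the cell keeps it only if its timestamp is newer.
    void write(String v, long ts) {
        if (ts > timestamp) {
            value = v;
            timestamp = ts;
        }
    }

    public static void main(String[] args) {
        LwwCell cell = new LwwCell();
        cell.write("a", 100); // original attempt: client saw a timeout, but it landed
        cell.write("a", 100); // retry with the SAME timestamp: no effect either way
        System.out.println(cell.value + " @ " + cell.timestamp); // a @ 100
    }
}
```

The same retry is safe even when the first attempt did not land: the replay simply becomes the first successful application of the write.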
Re: Migrate table data to another table
You might want to take a look at CQLSSTableWriter[1] in the Cassandra source tree. http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated On Tue, Jun 30, 2015 at 1:18 PM, Umut Kocasaraç ukocasa...@gmail.com wrote: Hi, I want to change clustering order column of my table. As far as i know it is not possible to use alter command so i have created new table and i would like to move data from old table to this one. I am using Cassandra 2.0.7 and there is almost 100GB data on table. Is there any easy method to move data except Copy command. Thanks Umut -- - John
Migrate table data to another table
Hi, I want to change the clustering order column of my table. As far as I know it is not possible to use the alter command, so I have created a new table and I would like to move the data from the old table to this one. I am using Cassandra 2.0.7 and there is almost 100GB of data in the table. Is there any easy method to move the data other than the Copy command? Thanks Umut
Commitlog still replaying after drain shutdown
Hi all, To quote Sebastian Estevez in one recent thread: You said you ran a nodetool drain before the restart, but your logs show commitlogs replayed. That does not add up... The docs seem to generally agree with this: if you did `nodetool drain` before restarting your node there shouldn't be any commitlogs. But my experience has been that if I do `nodetool drain`, I need to wait at least 30-60 seconds after it has finished if I really want no commitlog replay on restart. If I restart immediately (or even 10-20s later) then it replays plenty. (This was true on 2.X and is still true on 2.1.7 for me.) Is this unusual or the same thing others see? Is `nodetool drain` really supposed to wait until all memtables are flushed and commitlogs are deleted before it returns? Thanks, -dan
Re: Issue when node goes away?
Would it matter that I'm mixing cassandra versions? From: http://docs.datastax.com/en/upgrade/doc/upgrade/datastax_enterprise/upgrdLim.html General upgrade limitations: Do not run nodetool repair. Do not enable new features. Do not issue these types of queries during a rolling restart: DDL, TRUNCATE. *During upgrades, the nodes on different versions show a schema disagreement*. I think this is a good lead. C*heers, Alain 2015-06-30 20:22 GMT+02:00 David Aronchick aronch...@gmail.com: I appreciate the thoughts! My issue is that it seems to work perfectly, until the node goes away. Would it matter that I'm mixing cassandra versions? (2.1.4 and 2.1.5)? On Tue, Jun 30, 2015 at 5:23 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi David, What does a nodetool describecluster output look like? My guess is you might be having a schema version desynchronisation. If you see a node with a different schema version you might want to try a nodetool resetlocalschema - Reset node's local schema and resync. You asked for any thoughts, this is a thought, not sure if it will help, I hope so. C*heers, Alain 2015-06-30 1:44 GMT+02:00 Robert Coli rc...@eventbrite.com: On Mon, Jun 29, 2015 at 2:43 PM, David Aronchick aronch...@gmail.com wrote: Ping--- any thoughts here? I don't have any thoughts on your specific issue at this time, but FWIW #cassandra on freenode is sometimes a better forum for interactive debugging of operational edge cases. =Rob
Re: InputCQLPageRowSize seems to be behaving differently (or I am doing something wrong)
Looking at the debug log, I see [2015-06-29 23:38:11] [main] DEBUG CqlRecordReader - cqlQuery SELECT wpid,value FROM qarth_catalog_dev.product_v1 WHERE token(wpid) > ? AND token(wpid) <= ? LIMIT 10 [2015-06-29 23:38:11] [main] DEBUG CqlRecordReader - created org.apache.cassandra.hadoop.cql3.CqlRecordReader$RowIterator@11963225 [2015-06-29 23:38:11] [main] DEBUG CqlRecordReader - Finished scanning 6 rows (estimate was: 0) I know the split has about 1000 rows, so why is the record reader not paging through the whole thing? I guess I am missing something very fundamental and I cannot figure it out from the manuals or the source code for CqlInputFormat and CqlRecordReader. Does anyone have working sample code they can share? Venky Kandaswamy 925-200-7124 On 6/29/15, 8:46 PM, Venkatesh Kandaswamy ve...@walmartlabs.com wrote: Apologies, I meant version C* 2.0.16. The latest 2.1.7 source has a different WordCount example and this does not use the CqlPagingInputFormat. I am comparing the differences to understand why the change was made. But if you can shed some light on the reasoning, it is much appreciated (and will save me a few hours of digging through the code). Venky Kandaswamy 925-200-7124 On 6/29/15, 8:40 PM, Venkatesh Kandaswamy ve...@walmartlabs.com wrote: I was going through the WordCount example in the latest 2.1.7 Apache C* source and there is a reference to org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat, but it is not in the source tree or in the compiled binary. It looks like we really cannot use C* with Hadoop without a paging input format. Is there a reason why this was removed? The example includes it, so I am confused. Please shed some light if you know the answer. Venky Kandaswamy 925-200-7124 On 6/29/15, 1:15 PM, Venkatesh Kandaswamy ve...@walmartlabs.com wrote: All, I converted one of my C* programs to Hadoop 2.x and C* datastax drivers for 2.1.0. 
The original program (Hadoop 1.x) worked fine when we specified InputCQLPageRowSize and InputSplitSize to reasonable values. For example, if we had 60K rows, a row size of 100 and split size of 1 will run 6 mappers and give us 60K rows. When we switched to 2.1.x version of the datastax drivers, the same program now gives only 600 rows. It looks like the paging logic has changed and the page size is only getting the first 100 rows. How do we get all the rows? Venky Kandaswamy 925-200-7124
Re: Commitlog still replaying after drain shutdown
On Tue, Jun 30, 2015 at 11:59 AM, Dan Kinder dkin...@turnitin.com wrote: Is this unusual or the same thing others see? Is `nodetool drain` really supposed to wait until all memtables are flushed and commitlogs are deleted before it returns? nodetool drain *should* work the way you suggest - if one runs it before shutdown and waits for the DRAINED message in the log, one should get no commitlog replay on start. In practice, this has historically not-worked at various times. https://issues.apache.org/jira/browse/CASSANDRA-4446 https://issues.apache.org/jira/browse/CASSANDRA-5911 / https://issues.apache.org/jira/browse/CASSANDRA-3578 (marked fixed in 2.1 beta) It's pretty harmless to over-replay in most cases, one is most-hosed if one is using counters. But if you can repro in 2.1.7 by restarting at any time after the DRAINED message, you should file an issue on issues.apache.org and reply here to let the list know the URL for it. =Rob
Re: Read Consistency
On Tue, Jun 30, 2015 at 11:16 AM, Tyler Hobbs ty...@datastax.com wrote: Correct, if you get a WriteTimeout error, you don't know if any replicas have written the data or not. It's even possible that all replicas wrote the data but didn't respond to the coordinator in time. I suspect most users handle this situation by retrying the write with the same timestamp (which makes the operation idempotent*). *Unless the operation is a counter increment/decrement, in which case it cannot be idempotent. This is why people who need very-accurate counters should probably not use Cassandra for them. :) =Rob
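Rob's caveat about counters is easy to see in the same terms: a counter increment is a delta, not an absolute (value, timestamp) pair, so there is nothing for the replica to deduplicate on, and a blind retry after an ambiguous WriteTimeout can double-count. A sketch of the difference (my own illustration, not Cassandra's counter implementation):

```java
// Sketch: why retrying a counter increment after an ambiguous timeout is
// unsafe. The increment carries no timestamp to reconcile on -- replaying
// it simply adds the delta again.
public class CounterRetry {
    long count = 0;

    void increment(long delta) {
        count += delta;
    }

    public static void main(String[] args) {
        CounterRetry c = new CounterRetry();
        c.increment(1); // first attempt actually applied, but the client saw a timeout
        c.increment(1); // blind retry: the counter is now over-counted
        System.out.println(c.count); // 2, though the client intended 1
    }
}
```

Contrast with a regular write retried at the same timestamp, which last-write-wins reconciliation makes idempotent; that safety net simply does not exist for counter deltas.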