Re: Decommissioned node still in Gossip
Hi, just a guess: there was a way to purge gossip state on a node, at least in version 1.2 (http://docs.datastax.com/en/cassandra/1.2/cassandra/architecture/architectureGossipPurge_t.html). The trick was to add -Dcassandra.load_ring_state=false to the JVM parameters. I'm not sure whether it works for 2.0, since it's not in the 2.x docs.

2015-07-01 10:23 GMT+03:00 Jeff Williams je...@wherethebitsroam.com:

Thanks for the tip Aiman, but this node is not in the seed list anywhere. Jeff

On 30 June 2015 at 18:16, Aiman Parvaiz ai...@flipagram.com wrote:

I was having exactly the same issue with the same version. Check your seed list and make sure it contains only live nodes. I know that seeds are only read when Cassandra starts, but updating the seed list to live nodes and then doing a rolling restart fixed this issue for me. I hope this helps you. Thanks

Sent from my iPhone

On Jun 30, 2015, at 4:42 AM, Jeff Williams je...@wherethebitsroam.com wrote:

Hi, I have a cluster which had 4 datacenters running 2.0.12. Last week one of the datacenters was decommissioned using nodetool decommission on each of its servers in turn. This seemed to work fine until one of the nodes started appearing in the logs of all the remaining servers, with messages like:

INFO [GossipStage:3] 2015-06-30 11:22:39,189 Gossiper.java (line 924) InetAddress /172.29.8.8 is now DOWN
INFO [GossipStage:3] 2015-06-30 11:22:39,190 StorageService.java (line 1773) Removing tokens [...] for /172.29.8.8

These come up in the log every minute or two. I believe it may have re-appeared after a repair, but I'm not sure. The problem is that this node does not exist in nodetool status, nodetool gossipinfo or in the system.peers table. So how can I tell the cluster that this node is decommissioned? Regards, Jeff

-- with best regards, Vitalii Skakun
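For reference, a minimal sketch of applying that flag. The cassandra-env.sh location is an assumption (on a real node it would be something like /etc/cassandra/cassandra-env.sh); the demo appends to a local copy, and the flag should be removed again after one clean start.

```shell
# Sketch: append -Dcassandra.load_ring_state=false so the node rebuilds
# its ring view from live gossip instead of its saved state on disk.
# CONF is a local stand-in; point it at your real cassandra-env.sh.
CONF=./cassandra-env.sh
touch "$CONF"
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"' >> "$CONF"
grep load_ring_state "$CONF"
# sudo service cassandra restart   # then remove the line again afterwards
```

The flag only takes effect on the next start, so it has to be followed by a restart, and leaving it in place permanently just slows down startup.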
Re: Issue when node goes away?
Basically, when you add nodes, add them on the correct version to avoid schema/network issues in your streams. Also, try to update all the nodes using rolling restarts in a reduced time frame, after stopping repairs, with all the nodes up, etc. You must have a healthy cluster before performing an upgrade. Remember to run nodetool upgradesstables after any upgrade (if not needed it will finish fast, so I would always run it as a best practice, just in case). C*heers, Alain

2015-07-01 2:16 GMT+02:00 David Aronchick aronch...@gmail.com:

That is a GREAT lead! So it looks like I can't add a few nodes to the cluster on the new version, let it settle down, and then upgrade the rest?

On Tue, Jun 30, 2015 at 11:58 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

"Would it matter that I'm mixing cassandra versions?" From: http://docs.datastax.com/en/upgrade/doc/upgrade/datastax_enterprise/upgrdLim.html

General upgrade limitations: Do not run nodetool repair. Do not enable new features. Do not issue these types of queries during a rolling restart: DDL, TRUNCATE. *During upgrades, the nodes on different versions show a schema disagreement.* I think this is a good lead. C*heers, Alain

2015-06-30 20:22 GMT+02:00 David Aronchick aronch...@gmail.com:

I appreciate the thoughts! My issue is that it seems to work perfectly, until the node goes away. Would it matter that I'm mixing cassandra versions (2.1.4 and 2.1.5)?

On Tue, Jun 30, 2015 at 5:23 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi David. What does a nodetool describecluster output look like? My guess is you might be having a schema version desynchronisation. If you see a node with a different schema version, you might want to try nodetool resetlocalschema (reset the node's local schema and resync). You asked for any thoughts; this is a thought, not sure if it will help, I hope so.

C*heers, Alain

2015-06-30 1:44 GMT+02:00 Robert Coli rc...@eventbrite.com:

On Mon, Jun 29, 2015 at 2:43 PM, David Aronchick aronch...@gmail.com wrote:

Ping--- any thoughts here?

I don't have any thoughts on your specific issue at this time, but FWIW #cassandra on freenode is sometimes a better forum for interactive debugging of operational edge cases. =Rob
Re: Error while adding a new node.
Just check the process owner to be sure (top, htop, ps, ...). http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html#reference_ds_sxl_gf3_2k__user-resource-limits C*heers, Alain

2015-07-01 7:33 GMT+02:00 Neha Trivedi nehajtriv...@gmail.com:

Arun, I am logging on to the server as root and running (sudo service cassandra start). regards, Neha

On Wed, Jul 1, 2015 at 11:00 AM, Neha Trivedi nehajtriv...@gmail.com wrote:

Thanks Arun! I will try and get back!

On Wed, Jul 1, 2015 at 10:32 AM, Arun arunsi...@gmail.com wrote:

Looks like you have a "too many open files" issue. Increase the ulimit for the user. If you are starting the cassandra daemon as user cassandra, increase the ulimit for that user.

On Jun 30, 2015, at 21:16, Neha Trivedi nehajtriv...@gmail.com wrote:

Hello, I have a 4 node cluster with SimpleSnitch, running Cassandra 2.1.3. I am trying to add a new node (cassandra 2.1.7) and I get the following error:

ERROR [STREAM-IN-] 2015-06-30 05:13:48,516 JVMStabilityInspector.java:94 - JVM state determined to be unstable. Exiting forcefully due to: java.io.FileNotFoundException: /var/lib/cassandra/data/-Index.db (Too many open files)

I increased the MAX_HEAP_SIZE, then I get:

ERROR [CompactionExecutor:9] 2015-06-30 23:31:44,792 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:9,1,main] java.lang.RuntimeException: java.io.FileNotFoundException: /var/lib/cassandra/data/-Data.db (Too many open files) at org.apache.cassandra.io.compress.CompressedThrottledReader.open(CompressedThrottledReader.java:52) ~[apache-cassandra-2.1.7.jar:2.1.7]

Is it because of the different versions of Cassandra (2.1.3 and 2.1.7)? regards, N
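A quick way to verify both points at once is to read the process's effective limit from /proc, since limits.conf changes only apply to new sessions. This is a sketch: $$ (the current shell) stands in for the Cassandra PID, which in practice would come from something like pgrep -f CassandraDaemon.

```shell
# Check which user owns a process and what open-file limit that process
# is actually running with. Substitute the Cassandra PID for $$.
PID=$$
OWNER=$(ps -o user= -p "$PID")
echo "owner: $OWNER"
grep 'Max open files' "/proc/$PID/limits"
```

If the "Max open files" value here is still the old one after editing limits.conf, the daemon simply has not been restarted from a fresh login session.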
Experiencing Timeouts on one node
We have a 28 node cluster, out of which only one node is experiencing timeouts. We thought it was the raid, but there are two other nodes on the same raid without any problem. Also, the problem goes away if we reboot the node, and then reappears after seven days. The following hinted hand-off timeouts are seen on the node experiencing the timeouts. Also, we did not notice any gossip errors. I was wondering if anyone has seen this issue and how they resolved it.

Cassandra Version: 1.2.15.1
OS: Linux cm 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19 12:09:25 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_85

INFO [HintedHandoff:2] 2015-06-17 22:52:08,130 HintedHandOffManager.java (line 296) Started hinted handoff for host: 4fe86051-6bca-4c28-b09c-1b0f073c1588 with IP: /192.168.1.122
INFO [HintedHandoff:1] 2015-06-17 22:52:08,131 HintedHandOffManager.java (line 296) Started hinted handoff for host: bbf0878b-b405-4518-b649-f6cf7c9a6550 with IP: /192.168.1.119
INFO [HintedHandoff:2] 2015-06-17 22:52:17,634 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.122; aborting (0 delivered)
INFO [HintedHandoff:2] 2015-06-17 22:52:17,635 HintedHandOffManager.java (line 296) Started hinted handoff for host: f7b7ab10-4d42-4f0c-af92-2934a075bee3 with IP: /192.168.1.108
INFO [HintedHandoff:1] 2015-06-17 22:52:17,643 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.119; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-06-17 22:52:17,643 HintedHandOffManager.java (line 296) Started hinted handoff for host: ddb79f35-3e2b-4be8-84d8-7942086e2b73 with IP: /192.168.1.104
INFO [HintedHandoff:2] 2015-06-17 22:52:27,143 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.108; aborting (0 delivered)
INFO [HintedHandoff:2] 2015-06-17 22:52:27,144 HintedHandOffManager.java (line 296) Started hinted handoff for host: 6a2fa431-4a51-44cb-af19-1991c960e075 with IP: /192.168.1.117
INFO [HintedHandoff:1] 2015-06-17 22:52:27,153 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.104; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-06-17 22:52:27,154 HintedHandOffManager.java (line 296) Started hinted handoff for host: cf03174a-533c-44d6-a679-e70090ad2bc5 with IP: /192.168.1.107

Thanks -shashi..
Re: Issue when node goes away?
When you say add 2 nodes, do you mean bootstrap, or upgrade in place?

On Wed, Jul 1, 2015 at 11:37 AM David Aronchick aronch...@gmail.com wrote:

This helps - so let me understand:

Starting point:
- 4 nodes running 2.1.4
- System is healthy

Decide to upgrade:
- Add 2 nodes running 2.1.5
- Run nodetool upgradesstables
- Wait until system is healthy
- Stop 2 nodes running 2.1.4
- Run nodetool upgradesstables
- Add 2 nodes running 2.1.5
- Run nodetool upgradesstables
- Stop 2 nodes running 2.1.4
- Run nodetool upgradesstables

Finished?
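One reading of the per-node part of this thread's advice can be sketched as follows. This is not an official procedure: the init-style service name is an assumption, and DRY_RUN=1 makes the function print its steps instead of executing them.

```shell
# Sketch of one node's turn in a rolling upgrade.
# Set DRY_RUN=1 to print the steps instead of executing them.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

upgrade_node() {
  run nodetool drain              # flush memtables, stop accepting traffic
  run sudo service cassandra stop
  # ...install the target Cassandra version here...
  run sudo service cassandra start
  run nodetool upgradesstables    # rewrite sstables in the new on-disk format
}

DRY_RUN=1
upgrade_node
```

Doing this one node at a time, with the rest of the cluster healthy and repairs paused, matches the rolling-restart constraint discussed above.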
Seed gossip version error
Hi, I have a cluster running version 2.1.7. Two of the machines went down and they are not joining the cluster even after a restart. I see the following WARN message in system.log on all the nodes:

WARN [MessagingService-Outgoing-cassandra2.cleartrip.com/172.18.3.32] 2015-07-01 13:00:41,878 OutboundTcpConnection.java:414 - Seed gossip version is -2147483648; will not connect with that version

Please let me know if you have faced the same problem. Regards, Amlan
High load on cassandra node
Hi, I have a 6 node cluster and I ran a major compaction on node 1, but I found that the load reached very high levels on node 2. Is this explainable? Attaching tpstats and metrics:

cassandra-2 ~]$ nodetool tpstats
Pool Name                  Active  Pending  Completed  Blocked  All time blocked
MutationStage                   0        0  185152938        0                 0
ReadStage                       0        0        490        0                 0
RequestResponseStage            0        0  168660091        0                 0
ReadRepairStage                 0        0      21247        0                 0
ReplicateOnWriteStage          32     6186   88699535        0              7163
MiscStage                       0        0          0        0                 0
HintedHandoff                   0        1       1090        0                 0
FlushWriter                     0        0       2059        0                13
MemoryMeter                     0        0       3922        0                 0
GossipStage                     0        0    2246873        0                 0
CacheCleanupExecutor            0        0          0        0                 0
InternalResponseStage           0        0          0        0                 0
CompactionExecutor              0        0      12353        0                 0
ValidationExecutor              0        0          0        0                 0
MigrationStage                  0        0          1        0                 0
commitlog_archiver              0        0          0        0                 0
AntiEntropyStage                0        0          0        0                 0
PendingRangeCalculator          0        0         16        0                 0
MemtablePostFlusher             0        0      10932        0                 0

Message type      Dropped
READ                49051
RANGE_SLICE             0
_TRACE                  0
MUTATION              269
COUNTER_MUTATION      185
BINARY                  0
REQUEST_RESPONSE        0
PAGED_RANGE             0
READ_REPAIR             0

Also I saw that NativeTransportRequests was at 23 active and 23 pending; I found this in OpsCenter. Any settings I can make to keep the load under control? Appreciate any help.. Thanks
Re: High load on cassandra node
Looks like CASSANDRA-6405 (ReplicateOnWriteStage is the counter thread pool). Upgrade to the latest 2.1 version and let us know if the situation improves. Major compactions are usually a bad idea, by the way. Do you really want one huge sstable?

On Jul 1, 2015 10:03 AM, Jayapandian Ponraj pandian...@gmail.com wrote:

Hi, I have a 6 node cluster and I ran a major compaction on node 1, but I found that the load reached very high levels on node 2. Is this explainable?
RE: How to measure disk space used by a keyspace?
That's ok for a single node, but to answer the question "how big is my table across the cluster?" it would be much better if the cluster could provide an answer. Sean Durity

From: Jonathan Haddad [mailto:j...@jonhaddad.com] Sent: Monday, June 29, 2015 8:15 AM To: user Subject: Re: How to measure disk space used by a keyspace?

If you're looking to measure actual disk space, I'd use the du command, assuming you're on linux: http://linuxconfig.org/du-1-manual-page

On Mon, Jun 29, 2015 at 2:26 AM shahab shahab.mok...@gmail.com wrote:

Hi, Probably this question has already been asked on the mailing list, but I couldn't find it. The question is how to measure the disk space used by a keyspace, per column family, excluding snapshots? best, /Shahab
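On the du route, GNU du's --exclude can skip the snapshots directories directly. This is a single-node sketch; the real data path would be something like /var/lib/cassandra/data, and the demo builds a throwaway layout so the exclusion is visible.

```shell
# Measure a keyspace's live on-disk size, excluding snapshot directories.
# DATA is a throwaway stand-in for /var/lib/cassandra/data.
DATA=$(mktemp -d)
mkdir -p "$DATA/ks/cf1/snapshots/snap1"
head -c 4096 /dev/zero > "$DATA/ks/cf1/cf1-ka-1-Data.db"
head -c 4096 /dev/zero > "$DATA/ks/cf1/snapshots/snap1/cf1-ka-1-Data.db"
LIVE=$(du -s --exclude=snapshots "$DATA/ks" | cut -f1)  # live data only
ALL=$(du -s "$DATA/ks" | cut -f1)                       # including snapshots
echo "live=${LIVE}K total=${ALL}K"
```

For the cluster-wide number, summing this per node (or reading the "Space used (live)" line from nodetool cfstats on each node) is still a manual aggregation, which is exactly Sean's point.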
Re: Cassandra leap second
We also experienced the same, i.e. high cpu on a Cassandra 1.1.4 node running in AWS. Restarting the vm worked.

On Wed, Jul 1, 2015 at 4:58 AM, Jason Wee peich...@gmail.com wrote:

same here too, on branch 1.1, and have not seen any high cpu usage.

On Wed, Jul 1, 2015 at 2:52 PM, John Wong gokoproj...@gmail.com wrote:

Which version are you running and what's your kernel version? We are still running on the 1.2 branch but we have not seen any high cpu usage yet...

On Tue, Jun 30, 2015 at 11:10 PM, snair123 . nair...@outlook.com wrote:

reboot of the machine worked

From: nair...@outlook.com To: user@cassandra.apache.org Subject: Cassandra leap second Date: Wed, 1 Jul 2015 02:54:53 +0000

Is it ok to run this https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/ Seeing high cpu consumption for cassandra process

-- Sent from Jeff Dean's printf() mobile console

-- Narendra Sharma, Software Engineer, http://www.aeris.com, http://narendrasharma.blogspot.com/
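For reference, the workaround in the linked Mozilla post boils down to re-setting the system clock to its current value, which clears the kernel's leap-second state without a reboot. Hedged sketch: verify against the post before running it, stop NTP first, and note the real date -s needs root; this demo only prints the command it would run.

```shell
# Sketch of the leap-second workaround: setting the clock to "now"
# clears the kernel's leap-second flag. Requires root and NTP stopped;
# here we only print the command rather than execute it.
NOW=$(date +"%Y-%m-%d %H:%M:%S")
CMD="date -s \"$NOW\""
echo "would run as root: $CMD"
```

Several replies in this thread found that simply rebooting the VM had the same effect.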
Re: Error while adding a new node.
One of the column families has an SSTable count as under:

SSTable count: 98506

Can it be because of the 2.1.3 version of cassandra? I found this: https://issues.apache.org/jira/browse/CASSANDRA-8964 regards Neha

On Wed, Jul 1, 2015 at 5:40 PM, Jason Wee peich...@gmail.com wrote:

nodetool cfstats?
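Besides nodetool cfstats, counting a table's Data.db components on disk approximates its SSTable count, which is useful for watching whether compaction is keeping up. Sketch with an assumed layout: on a real node the path would be /var/lib/cassandra/data/<keyspace>/<table>; the demo uses a throwaway directory.

```shell
# Approximate a table's SSTable count by counting its Data.db components.
# DATA is a throwaway stand-in for /var/lib/cassandra/data.
DATA=$(mktemp -d)
mkdir -p "$DATA/ks/cf"
touch "$DATA/ks/cf/ks-cf-ka-1-Data.db" "$DATA/ks/cf/ks-cf-ka-2-Data.db"
COUNT=$(find "$DATA/ks/cf" -name '*Data.db' | wc -l)
echo "sstables: $COUNT"
```

A count in the tens of thousands for one table, as reported here, usually means compaction has fallen far behind, and each sstable costs open file descriptors, which ties back to the "Too many open files" errors.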
Re: Error while adding a new node.
Hey..

nodetool compactionstats
pending tasks: 0

No pending tasks. Don't have opscenter. How do I monitor sstables?

On Wed, Jul 1, 2015 at 4:28 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:

You also might want to check if you have compactions pending (OpsCenter / nodetool compactionstats). Also, you can monitor the number of sstables. C*heers, Alain
Re: Error while adding a new node.
You also might want to check if you have compactions pending (OpsCenter / nodetool compactionstats). Also, you can monitor the number of sstables. C*heers, Alain

2015-07-01 11:53 GMT+02:00 Neha Trivedi nehajtriv...@gmail.com:

Thanks, I will check it out. I increased the ulimit to 10, but I am getting the same error, though only after a while. regards Neha
Re: Error while adding a new node.
nodetool cfstats?

On Wed, Jul 1, 2015 at 8:08 PM, Neha Trivedi nehajtriv...@gmail.com wrote:

Hey.. nodetool compactionstats reports "pending tasks: 0", so no pending tasks. Don't have opscenter. How do I monitor sstables?
Re: Error while adding a new node.
Thanks, I will check it out. I increased the ulimit to 10, but I am getting the same error, though only after a while. regards Neha

On Wed, Jul 1, 2015 at 2:22 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Just check the process owner to be sure (top, htop, ps, ...). http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html#reference_ds_sxl_gf3_2k__user-resource-limits C*heers, Alain
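For reference, one way to persist a higher open-files limit: drop a fragment under /etc/security/limits.d/ for the user the daemon runs as. The filename, the 100000 value, and the user names are assumptions to tune for your cluster; the sketch writes the file locally, and the daemon must then be restarted from a fresh login session for pam_limits to apply it.

```shell
# Draft a limits.d fragment raising the open-files cap for the users that
# may run Cassandra (values and users are assumptions; adjust as needed).
cat > 90-cassandra-limits.conf <<'EOF'
cassandra  -  nofile  100000
root       -  nofile  100000
EOF
cat 90-cassandra-limits.conf
# sudo install -m 644 90-cassandra-limits.conf /etc/security/limits.d/
# then restart cassandra from a new login session so the limit applies
```

Note that a limit raised only in the current shell (ulimit -n) does not affect an already-running daemon, which is why the error can come back after a restart from an old session.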
Re: Migrate table data to another table
Not yet; there is a ticket for a CTAS-style copy which will use Spark to do what you are looking for.

On Jul 1, 2015 3:31 AM, Umut Kocasaraç ukocasa...@gmail.com wrote: In both methods I have to export the table to a file and then I can move the data to the new table. Can I copy directly from one table to another?

On Tue, 30 Jun 2015 at 20:31, Sebastian Estevez sebastian.este...@datastax.com wrote: Another option is Brian's cassandra-loader: https://github.com/brianmhess/cassandra-loader All the best, Sebastián Estévez, Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Jun 30, 2015 at 1:26 PM, John Sanda john.sa...@gmail.com wrote: You might want to take a look at CQLSSTableWriter in the Cassandra source tree: http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

On Tue, Jun 30, 2015 at 1:18 PM, Umut Kocasaraç ukocasa...@gmail.com wrote: Hi, I want to change the clustering order column of my table. As far as I know it is not possible to use the ALTER command, so I have created a new table and I would like to move the data from the old table to the new one. I am using Cassandra 2.0.7 and there is almost 100GB of data in the table. Is there any easy method to move the data other than the COPY command? Thanks, Umut -- - John
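To answer the direct question: cqlsh's COPY can round-trip through a CSV file without a separate export tool, though at ~100 GB it will be slow compared to CQLSSTableWriter or cassandra-loader. A sketch with placeholder keyspace/table names; `migrate_via_csv` is a made-up helper, not a built-in:

```shell
# Copy all rows from one table to another (same or different schema order)
# via an intermediate CSV, using cqlsh's COPY command.
migrate_via_csv() {  # usage: migrate_via_csv <keyspace> <old_table> <new_table>
  ks=$1; old=$2; new=$3
  csv=$(mktemp --suffix=.csv)
  cqlsh -e "COPY ${ks}.${old} TO '${csv}' WITH HEADER = true"
  cqlsh -e "COPY ${ks}.${new} FROM '${csv}' WITH HEADER = true"
  rm -f "$csv"
}

# migrate_via_csv mykeyspace old_table new_table
```

Note COPY writes the CSV on the machine running cqlsh, so you need local disk for the full export.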
Re: How to measure disk space used by a keyspace?
If you are pushing metric data to graphite, there is org.apache.cassandra.metrics.keyspace.keyspace_name.LiveDiskSpaceUsed.value … for each node; easy enough to graph the sum across machines. Metrics/JMX are tied together in C*, so there is an equivalent value exposed via JMX… I don’t know what it is called off the top of my head, but it would be something similar to the above.

On Jul 1, 2015, at 9:28 AM, sean_r_dur...@homedepot.com wrote: That’s ok for a single node, but to answer the question, “how big is my table across the cluster?”, it would be much better if the cluster could provide an answer. Sean Durity

From: Jonathan Haddad [mailto:j...@jonhaddad.com] Sent: Monday, June 29, 2015 8:15 AM To: user Subject: Re: How to measure disk space used by a keyspace? If you're looking to measure actual disk space, I'd use the du command, assuming you're on Linux: http://linuxconfig.org/du-1-manual-page

On Mon, Jun 29, 2015 at 2:26 AM shahab shahab.mok...@gmail.com wrote: Hi, Probably this question has already been asked on the mailing list, but I couldn't find it. The question is how to measure the disk space used by a keyspace, column family wise, excluding snapshots? best, /Shahab
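Following the du suggestion, here is one way to total a keyspace's on-disk bytes while skipping snapshot directories. A sketch assuming the Cassandra 2.x default layout (/var/lib/cassandra/data/&lt;keyspace&gt;/&lt;table&gt;/) and GNU find; `keyspace_size` is a hypothetical helper:

```shell
# Sum file sizes under one keyspace's data directory, excluding anything
# inside a snapshots/ subdirectory.
keyspace_size() {  # usage: keyspace_size /var/lib/cassandra/data/<keyspace>
  find "$1" -type f -not -path '*/snapshots/*' -printf '%s\n' \
    | awk '{s += $1} END {printf "%d\n", s}'
}

# keyspace_size /var/lib/cassandra/data/mykeyspace
```

Run it per node and add the results; as Sean notes, nothing cluster-wide reports this directly.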
RE: Stream failure while adding a new node
Hi Alain, We still have the timeout problem in OpsCenter and we still haven’t solved it, so no, we didn’t run an entire repair with the repair service. And yes, during this try, we set auto_bootstrap to true and ran a repair on the 9th node after it finished streaming. Thank you for your help. Best regards, David CHARBONNIER, Sysadmin, T: +33 411 934 200, david.charbonn...@rgsystem.com, ZAC Aéroport, 125 Impasse Adam Smith, 34470 Pérols - France, www.rgsystem.com

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com] Sent: Tuesday, June 30, 2015 15:18 To: user@cassandra.apache.org Subject: Re: Stream failure while adding a new node Hi David, Are you sure you ran the repair entirely (9 days + repair logs OK on the OpsCenter server) before adding the 10th node? This is important to avoid potential data loss! Did you set auto_bootstrap to true on this 10th node? C*heers, Alain

2015-06-29 14:54 GMT+02:00 David CHARBONNIER david.charbonn...@rgsystem.com: Hi, We’re using Cassandra 2.0.8.39 through DataStax Enterprise 4.5.1 with a 9-node cluster. We need to add a few new nodes to the cluster but we’re experiencing an issue we don’t know how to solve. Here is exactly what we did:
- We had 8 nodes and needed to add a few more
- We tried to add a 9th node, but the stream got stuck for a very long time and the bootstrap never finished (related to the streaming_socket_timeout_in_ms default value in cassandra.yaml)
- We applied a solution given by a DataStax architect: restart the node with auto_bootstrap set to false and run a repair
- After this issue, we went about patching the default configuration on all our nodes to avoid this problem and made a rolling restart of the cluster
- Then we tried adding a 10th node, but it receives streams from only one node (node2).
Here is the logs on this problematic node (node10) : INFO [main] 2015-06-26 15:25:59,490 StreamResultFuture.java (line 87) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Executing streaming plan for Bootstrap INFO [main] 2015-06-26 15:25:59,490 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node6 INFO [main] 2015-06-26 15:25:59,491 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node5 INFO [main] 2015-06-26 15:25:59,492 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node4 INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node3 INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node9 INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node8 INFO [main] 2015-06-26 15:25:59,493 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node7 INFO [main] 2015-06-26 15:25:59,494 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node1 INFO [main] 2015-06-26 15:25:59,494 StreamResultFuture.java (line 91) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Beginning stream session with /node2 INFO [STREAM-IN-/node6] 2015-06-26 15:25:59,515 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node6 is complete INFO [STREAM-IN-/node4] 2015-06-26 15:25:59,516 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node4 is complete INFO [STREAM-IN-/node5] 2015-06-26 15:25:59,517 StreamResultFuture.java (line 
186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node5 is complete INFO [STREAM-IN-/node3] 2015-06-26 15:25:59,527 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node3 is complete INFO [STREAM-IN-/node1] 2015-06-26 15:25:59,528 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node1 is complete INFO [STREAM-IN-/node8] 2015-06-26 15:25:59,530 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node8 is complete INFO [STREAM-IN-/node7] 2015-06-26 15:25:59,531 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node7 is complete INFO [STREAM-IN-/node9] 2015-06-26 15:25:59,533 StreamResultFuture.java (line 186) [Stream #a5226b30-1c17-11e5-a58b-e35f08264ca1] Session with /node9 is complete INFO [STREAM-IN-/node2] 2015-06-26 15:26:04,874
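Since the thread pins this on the streaming_socket_timeout_in_ms default (0 in 2.0.x, meaning a stream can hang forever on a dropped TCP connection), here is a sketch of the configuration change being described. The 86400000 ms (24 h) value is a commonly suggested figure, not an official recommendation, and `set_streaming_timeout` is a made-up helper:

```shell
# Patch streaming_socket_timeout_in_ms in a cassandra.yaml, whether the
# setting is currently commented out or not. Requires a restart to apply.
set_streaming_timeout() {  # usage: set_streaming_timeout <cassandra.yaml> <ms>
  sed -i "s/^#* *streaming_socket_timeout_in_ms:.*/streaming_socket_timeout_in_ms: $2/" "$1"
}

# set_streaming_timeout /etc/cassandra/cassandra.yaml 86400000
# ...then rolling-restart the cluster, as the thread describes.
```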
Re: Cassandra leap second
I think it is more a kernel / java version issue than a Cassandra version related issue. From Datastax: Leap Second on June 30, 2015 Required Actions: Ensure that you are running on kernel version 3.4 or higher and using JDK version 7u60 or higher. This should protect you from the livelock problems users experienced in 2012. We had recent versions on both things and ran smoothly this night. C*heers 2015-07-01 16:29 GMT+02:00 Narendra Sharma narendra.sha...@gmail.com: We also experienced same, i.e. high cpu on Cassandra 1.1.4 node running in AWS. Restarting the vm worked. On Wed, Jul 1, 2015 at 4:58 AM, Jason Wee peich...@gmail.com wrote: same here too, on branch 1.1 and have not seen any high cpu usage. On Wed, Jul 1, 2015 at 2:52 PM, John Wong gokoproj...@gmail.com wrote: Which version are you running and what's your kernel version? We are still running on 1.2 branch but we have not seen any high cpu usage yet... On Tue, Jun 30, 2015 at 11:10 PM, snair123 . nair...@outlook.com wrote: reboot of the machine worked -- From: nair...@outlook.com To: user@cassandra.apache.org Subject: Cassandra leap second Date: Wed, 1 Jul 2015 02:54:53 + Is it ok to run this https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/ Seeing high cpu consumption for cassandra process -- Sent from Jeff Dean's printf() mobile console -- Narendra Sharma Software Engineer *http://www.aeris.com http://www.aeris.com* *http://narendrasharma.blogspot.com/ http://narendrasharma.blogspot.com/*
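The advisory above reduces to two version checks: kernel >= 3.4 and JDK >= 7u60. A small pre-check sketch, assuming GNU coreutils (`sort -V`); `version_ge` is a hypothetical helper, and the JDK check is left as a comment because java's version string format varies:

```shell
# True if the first version string is >= the second, by version sort.
version_ge() {  # usage: version_ge <have> <want>
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if version_ge "$(uname -r | cut -d- -f1)" 3.4; then
  echo "kernel OK for leap second"
else
  echo "kernel older than 3.4 - at risk of the 2012-style livelock"
fi
# java -version 2>&1 | head -n1   # look for 1.7.0_60 or later by hand
```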
Re: Issue when node goes away?
Adding new nodes of a different version is asking for trouble, as others have said. I don't know if this particular version bump would cause any issues, but why risk it? If you want to upgrade, do it to the nodes in place. Don't bootstrap new nodes, don't run repair, don't remove nodes. My 2 cents.

On Wed, Jul 1, 2015 at 11:59 AM David Aronchick aronch...@gmail.com wrote: I mean add an additional two nodes to my cluster, pointing them at the other nodes in the cluster, to handle data migration.

On Wed, Jul 1, 2015 at 11:40 AM, Jonathan Haddad j...@jonhaddad.com wrote: When you say add 2 nodes, do you mean bootstrap, or upgrade in place?

On Wed, Jul 1, 2015 at 11:37 AM David Aronchick aronch...@gmail.com wrote: This helps - so let me understand:
Starting point:
- 4 nodes running 2.1.4
- System is healthy
Decide to upgrade:
- Add 2 nodes running 2.1.5
- Run nodetool upgradesstables
- Wait until system is healthy
- Stop 2 nodes running 2.1.4
- Run nodetool upgradesstables
- Add 2 nodes running 2.1.5
- Run nodetool upgradesstables
- Stop 2 nodes running 2.1.4
- Run nodetool upgradesstables
Finished?

On Wed, Jul 1, 2015 at 1:49 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Basically, when you add nodes, add them on the correct version to avoid schema / network issues in your streams. Also, try to update all the nodes using rolling restarts in a reduced time frame, after stopping repairs, with all the nodes up, etc. You must have a healthy cluster before performing an upgrade. Remember to run nodetool upgradesstables after any upgrade (if not needed it will finish fast, so I would always run it as a best practice, just in case). C*heers, Alain

2015-07-01 2:16 GMT+02:00 David Aronchick aronch...@gmail.com: That is a GREAT lead! So it looks like I can't add a few nodes to the cluster on the new version, have it settle down, and then upgrade the rest?

On Tue, Jun 30, 2015 at 11:58 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Would it matter that I'm mixing cassandra versions?
From: http://docs.datastax.com/en/upgrade/doc/upgrade/datastax_enterprise/upgrdLim.html General upgrade limitations: Do not run nodetool repair. Do not enable new features. Do not issue these types of queries during a rolling restart: DDL, TRUNCATE. *During upgrades, the nodes on different versions show a schema disagreement.* I think this is a good lead. C*heers, Alain

2015-06-30 20:22 GMT+02:00 David Aronchick aronch...@gmail.com: I appreciate the thoughts! My issue is that it seems to work perfectly, until the node goes away. Would it matter that I'm mixing cassandra versions? (2.1.4 and 2.1.5)?

On Tue, Jun 30, 2015 at 5:23 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi David, What does a nodetool describecluster output look like? My guess is you might be having a schema version desynchronisation. If you see a node with a different schema version you might want to try nodetool resetlocalschema (reset the node's local schema and resync). You asked for any thoughts; this is a thought, not sure if it will help, I hope so. C*heers, Alain

2015-06-30 1:44 GMT+02:00 Robert Coli rc...@eventbrite.com: On Mon, Jun 29, 2015 at 2:43 PM, David Aronchick aronch...@gmail.com wrote: Ping--- any thoughts here? I don't have any thoughts on your specific issue at this time, but FWIW #cassandra on freenode is sometimes a better forum for interactive debugging of operational edge cases. =Rob
Re: Issue when node goes away?
On Wed, Jul 1, 2015 at 12:14 PM, Jonathan Haddad j...@jonhaddad.com wrote: Adding new nodes of a different version is asking for trouble, as others have said. I don't know if this particular version bump would cause any issues, but why risk it? If you want to upgrade, do it to the nodes in place. Don't bootstrap new nodes, don't run repair, don't remove nodes. My 2 cents. If someone else adds in a penny, we'll have a nickel. :D =Rob
Re: Cassandra leap second
Here is the full Datastax article: http://www.datastax.com/dev/blog/preparing-for-the-leap-second We had issues with C* and MySQL last time around (2012) but no issues this time thanks to upgraded systems. On 1 July 2015 at 14:39, Alain RODRIGUEZ arodr...@gmail.com wrote: I think it is more a kernel / java version issue than a Cassandra version related issue. From Datastax: Leap Second on June 30, 2015 Required Actions: Ensure that you are running on kernel version 3.4 or higher and using JDK version 7u60 or higher. This should protect you from the livelock problems users experienced in 2012. We had recent versions on both things and ran smoothly this night. C*heers 2015-07-01 16:29 GMT+02:00 Narendra Sharma narendra.sha...@gmail.com: We also experienced same, i.e. high cpu on Cassandra 1.1.4 node running in AWS. Restarting the vm worked. On Wed, Jul 1, 2015 at 4:58 AM, Jason Wee peich...@gmail.com wrote: same here too, on branch 1.1 and have not seen any high cpu usage. On Wed, Jul 1, 2015 at 2:52 PM, John Wong gokoproj...@gmail.com wrote: Which version are you running and what's your kernel version? We are still running on 1.2 branch but we have not seen any high cpu usage yet... On Tue, Jun 30, 2015 at 11:10 PM, snair123 . nair...@outlook.com wrote: reboot of the machine worked -- From: nair...@outlook.com To: user@cassandra.apache.org Subject: Cassandra leap second Date: Wed, 1 Jul 2015 02:54:53 + Is it ok to run this https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/ Seeing high cpu consumption for cassandra process -- Sent from Jeff Dean's printf() mobile console -- Narendra Sharma Software Engineer *http://www.aeris.com http://www.aeris.com* *http://narendrasharma.blogspot.com/ http://narendrasharma.blogspot.com/*
Re: How to measure disk space used by a keyspace?
nodetool cfstats would be your best bet. Sum the column family info within a keyspace to get to the number you are looking for. Jan

On Wednesday, July 1, 2015 9:05 AM, graham sanderson gra...@vast.com wrote: If you are pushing metric data to graphite, there is org.apache.cassandra.metrics.keyspace.keyspace_name.LiveDiskSpaceUsed.value … for each node; easy enough to graph the sum across machines. Metrics/JMX are tied together in C*, so there is an equivalent value exposed via JMX… I don’t know what it is called off the top of my head, but it would be something similar to the above.

On Jul 1, 2015, at 9:28 AM, sean_r_dur...@homedepot.com wrote: That’s ok for a single node, but to answer the question, “how big is my table across the cluster?”, it would be much better if the cluster could provide an answer. Sean Durity

From: Jonathan Haddad [mailto:j...@jonhaddad.com] Sent: Monday, June 29, 2015 8:15 AM To: user Subject: Re: How to measure disk space used by a keyspace? If you're looking to measure actual disk space, I'd use the du command, assuming you're on Linux: http://linuxconfig.org/du-1-manual-page

On Mon, Jun 29, 2015 at 2:26 AM shahab shahab.mok...@gmail.com wrote: Hi, Probably this question has already been asked on the mailing list, but I couldn't find it. The question is how to measure the disk space used by a keyspace, column family wise, excluding snapshots? best, /Shahab
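Jan's cfstats suggestion can be automated: sum the "Space used (live)" lines across all column families in a keyspace. A sketch; the label text varies slightly between Cassandra versions (some print "Space used (live), bytes:"), so the pattern is deliberately loose, and `sum_live_space` is a made-up helper:

```shell
# Read `nodetool cfstats <keyspace>` output on stdin and print the total
# live bytes across all of its column families.
sum_live_space() {
  awk -F: '/Space used \(live\)/ {gsub(/[^0-9]/, "", $2); s += $2}
           END {print s + 0}'
}

# nodetool cfstats mykeyspace | sum_live_space   # bytes on this node;
# run on every node and add the results for the cluster-wide figure.
```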
Re: Stream failure while adding a new node
David: bring down all the nodes with the exception of the 'seed' node. Now bring up the 10th node. Run 'nodetool status' and wait until this 10th node is UP. Bring up the rest of the nodes after that. Run 'nodetool status' again and check that all the nodes are UP.

Alternatively: decommission the 10th node completely and drop it from the cluster. Build a new node with the same IP and hostname and have it join the running cluster.

Hope this helps, Jan

On Wednesday, July 1, 2015 7:56 AM, David CHARBONNIER david.charbonn...@rgsystem.com wrote: Hi Alain, We still have the timeout problem in OpsCenter and we still haven’t solved it, so no, we didn’t run an entire repair with the repair service.
Truncate really slow
I have two test clusters, both 2.0.15. One has a single node and one has three nodes. Truncate on the three node cluster is really slow, but is quite fast on the single-node cluster. My test cases truncate tables before each test, and 95% of the time in my test cases is spent truncating tables on the 3-node cluster. Auto-snapshotting is off. I know there’s some coordination that has to occur when a truncate happens, but it seems really excessive. Almost one second to truncate each table with an otherwise idle cluster. Any thoughts? Thanks in advance Robert
Lots of write timeouts and missing data during decomission/bootstrap
We get lots of write timeouts when we decommission a node. About 80% of them are write timeouts and about 20% are read timeouts. We’ve tried adjusting stream throughput (and compaction throughput, for that matter) and that doesn’t resolve the issue. We’ve increased write_request_timeout_in_ms … and the read timeout as well. Is there anything else I should be looking at? I can’t seem to find documentation that explains what the heck is happening. -- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
Re: Lots of write timeouts and missing data during decomission/bootstrap
Looks like all of this is happening because we’re using CAS operations and the driver is going to SERIAL consistency level. From "SERIAL and LOCAL_SERIAL write failure scenarios" (http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html?scroll=concept_ds_umf_5xx_zj__failure-scenarios): If one of three nodes is down, the Paxos commit fails under the following conditions:
- CQL query-configured consistency level of ALL
- Driver-configured serial consistency level of SERIAL
- Replication factor of 3
I don’t understand why this would fail.. it seems completely broken in this situation. We were having write timeouts at a replication factor of 2, and a lot of people on the list said "of course", because 2 nodes with 1 node down means there’s no quorum and Paxos needs a quorum... and not sure why I missed that :-P So we went with 3 replicas, and a quorum, but this is new and I didn’t see this documented. We set the driver to QUORUM, but then I guess the driver sees that this is a CAS operation and forces it back to SERIAL? Doesn’t this mean that all decommissions result in failures of CAS? This is Cassandra 2.0.9 btw.

On Wed, Jul 1, 2015 at 2:22 PM, Kevin Burton bur...@spinn3r.com wrote: We get lots of write timeouts when we decommission a node. About 80% of them are write timeouts and about 20% are read timeouts. We’ve tried adjusting stream throughput (and compaction throughput, for that matter) and that doesn’t resolve the issue. We’ve increased write_request_timeout_in_ms … and the read timeout as well. Is there anything else I should be looking at? I can’t seem to find documentation that explains what the heck is happening.
-- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
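For anyone experimenting with this, cqlsh (in 2.1 and later; availability in 2.0.9 is not guaranteed) lets you set the serial consistency for conditional writes separately from the regular consistency, which makes it easy to reproduce the behaviour the driver shows. A session sketch with a made-up keyspace and table:

```
cqlsh> CONSISTENCY QUORUM;
cqlsh> SERIAL CONSISTENCY LOCAL_SERIAL;
cqlsh> INSERT INTO ks.locks (name, owner) VALUES ('job-1', 'host-a') IF NOT EXISTS;
```

Note that with a single datacenter LOCAL_SERIAL and SERIAL behave the same; the Paxos phase still needs a quorum of replicas, which is why a down or leaving replica can fail the CAS even when the plain write consistency would be satisfiable.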
Re: Truncate really slow
Hi, you have to enable -Dcassandra.unsafesystem=true in cassandra-env.sh. Also disable durable writes for your CFs. This should speed things up and should reduce IOWait dramatically. kind regards, Christian

On Wed, Jul 1, 2015 at 11:52 PM, Robert Wille rwi...@fold3.com wrote: I have two test clusters, both 2.0.15. One has a single node and one has three nodes. Truncate on the three node cluster is really slow, but is quite fast on the single-node cluster. My test cases truncate tables before each test, and 95% of the time in my test cases is spent truncating tables on the 3-node cluster. Auto-snapshotting is off. I know there’s some coordination that has to occur when a truncate happens, but it seems really excessive. Almost one second to truncate each table with an otherwise idle cluster. Any thoughts? Thanks in advance Robert
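Concretely, Christian's two suggestions look like this. A sketch for TEST clusters only; `test_ks` is a placeholder keyspace name:

```shell
# cassandra-env.sh fragment: skip the fsync Cassandra normally does on
# schema/system-table changes (part of what truncate waits on, on every
# node). Never use this in production; a crash can corrupt system tables.
JVM_OPTS="$JVM_OPTS -Dcassandra.unsafesystem=true"

# And from cqlsh, per test keyspace (durable_writes is keyspace-level,
# not per-CF): skip the commit log for its writes.
#   ALTER KEYSPACE test_ks WITH durable_writes = false;
```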
Re: High load on cassandra node
We heavily use counters; we will upgrade and check if it solves the issue. The current cluster version is 2.0.14. We do a lot of delete operations and run major compactions to remove the tombstones. Is there any better way?

On 1 July 2015 at 20:02, Sebastian Estevez sebastian.este...@datastax.com wrote: Looks like CASSANDRA-6405 (ReplicateOnWrite is the counter thread pool). Upgrade to the latest 2.1 version and let us know if the situation improves. Major compactions are usually a bad idea, by the way. Do you really want one huge sstable?

On Jul 1, 2015 10:03 AM, Jayapandian Ponraj pandian...@gmail.com wrote: Hi, I have a 6 node cluster and I ran a major compaction on node 1, but I found that the load reached very high levels on node 2. Is this explainable? Attaching tpstats and metrics:

cassandra-2 ~]$ nodetool tpstats
Pool Name               Active  Pending  Completed  Blocked  All time blocked
MutationStage                0        0  185152938        0         0
ReadStage                    0        0        490        0         0
RequestResponseStage         0        0  168660091        0         0
ReadRepairStage              0        0      21247        0         0
ReplicateOnWriteStage       32     6186   88699535        0      7163
MiscStage                    0        0          0        0         0
HintedHandoff                0        1       1090        0         0
FlushWriter                  0        0       2059        0        13
MemoryMeter                  0        0       3922        0         0
GossipStage                  0        0    2246873        0         0
CacheCleanupExecutor         0        0          0        0         0
InternalResponseStage        0        0          0        0         0
CompactionExecutor           0        0      12353        0         0
ValidationExecutor           0        0          0        0         0
MigrationStage               0        0          1        0         0
commitlog_archiver           0        0          0        0         0
AntiEntropyStage             0        0          0        0         0
PendingRangeCalculator       0        0         16        0         0
MemtablePostFlusher          0        0      10932        0         0

Message type      Dropped
READ                49051
RANGE_SLICE             0
_TRACE                  0
MUTATION              269
COUNTER_MUTATION      185
BINARY                  0
REQUEST_RESPONSE        0
PAGED_RANGE             0
READ_REPAIR             0

Also I saw that NativeTransportRequests was 23 active and 23 pending; I found this in OpsCenter. Any settings I can make to keep the load under control? Appreciate any help. Thanks
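On "is there any better way" than major compactions for tombstone cleanup: size-tiered compaction exposes subproperties that let Cassandra compact individual SSTables whose estimated tombstone ratio is high (unchecked_tombstone_compaction was added around 2.0.9, so it should be available on 2.0.14). A sketch with placeholder keyspace/table names and illustrative values:

```sql
-- Let single SSTables with >20% droppable tombstones be compacted on
-- their own, without waiting for a size-tiered bucket to fill up.
ALTER TABLE mykeyspace.mytable
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': '0.2',
    'unchecked_tombstone_compaction': 'true'
  };
```

Tombstones still only become droppable after gc_grace_seconds has elapsed, so this tuning complements, rather than replaces, a sensible gc_grace_seconds for the table.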
Re: Decommissioned node still in Gossip
Thanks for the tip Aiman, but this node is not in the seed list anywhere. Jeff On 30 June 2015 at 18:16, Aiman Parvaiz ai...@flipagram.com wrote: I was having exactly the same issue with the same version. Check your seed list and make sure it contains only the live nodes. I know that seeds are only read when cassandra starts, but updating the seed list to live nodes and then doing a rolling restart fixed this issue for me. I hope this helps you. Thanks Sent from my iPhone On Jun 30, 2015, at 4:42 AM, Jeff Williams je...@wherethebitsroam.com wrote: Hi, I have a cluster which had 4 datacenters running 2.0.12. Last week one of the datacenters was decommissioned using nodetool decommission on each of the servers in turn. This seemed to work fine until one of the nodes started appearing in the logs of all of the remaining servers with messages like: INFO [GossipStage:3] 2015-06-30 11:22:39,189 Gossiper.java (line 924) InetAddress /172.29.8.8 is now DOWN INFO [GossipStage:3] 2015-06-30 11:22:39,190 StorageService.java (line 1773) Removing tokens [...] for /172.29.8.8 These come up in the log every minute or two. I believe it may have re-appeared after a repair, but I'm not sure. The problem is that this node does not exist in nodetool status, nodetool gossipinfo or in the system.peers table. So how can I tell the cluster that this node is decommissioned? Regards, Jeff
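The gossip-state purge suggested elsewhere in the thread is done with a rolling restart; a sketch, assuming Debian-style paths and an init-script install (adjust for your packaging):

```shell
# On each remaining node in turn: restart with the saved ring state
# discarded, so gossip forgets the decommissioned endpoint.
# The flag is read once at startup.
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"' \
  >> /etc/cassandra/cassandra-env.sh
service cassandra restart

# Remove the flag again afterwards so later restarts load the
# saved ring state normally.
```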
Re: Migrate table data to another table
In both methods I have to export the table to a file and then I can move the data to the new table. Can I copy directly from one table to another? On Tue, 30 Jun 2015 at 20:31, Sebastian Estevez sebastian.este...@datastax.com wrote: Another option is Brian's cassandra loader: https://github.com/brianmhess/cassandra-loader All the best, Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com On Tue, Jun 30, 2015 at 1:26 PM, John Sanda john.sa...@gmail.com wrote: You might want to take a look at CQLSSTableWriter[1] in the Cassandra source tree. [1] http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated On Tue, Jun 30, 2015 at 1:18 PM, Umut Kocasaraç ukocasa...@gmail.com wrote: Hi, I want to change the clustering order column of my table. As far as I know it is not possible to use the ALTER command, so I have created a new table and I would like to move the data from the old table to this one. I am using Cassandra 2.0.7 and there is almost 100GB of data in the table. Is there any easy method to move the data except the COPY command? Thanks Umut -- - John
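To answer the question above: cqlsh's COPY cannot move rows table-to-table directly; it always goes through a file. A minimal sketch with hypothetical keyspace/table/column names (for ~100 GB, the bulk-loading approaches mentioned in the thread will be much faster than COPY):

```shell
# Export the old table to CSV, then import into the new table.
# Column lists must match the CSV column order.
cqlsh <<'CQL'
COPY old_ks.events (id, ts, value) TO '/tmp/events.csv';
COPY old_ks.events_by_ts (id, ts, value) FROM '/tmp/events.csv';
CQL
```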
Re: Issue when node goes away?
I understand - I was actually trying to use the containerized Cassandra, so upgrading in place doesn't make sense. I guess I'll wait until that's better supported. On Wed, Jul 1, 2015 at 12:14 PM, Jonathan Haddad j...@jonhaddad.com wrote: Adding new nodes of a different version is asking for trouble, as others have said. I don't know if this particular version bump would expect any issues, but why risk it? If you want to upgrade, do it to the nodes in place. Don't bootstrap new nodes, don't run repair, don't remove nodes. My 2 cents. On Wed, Jul 1, 2015 at 11:59 AM David Aronchick aronch...@gmail.com wrote: I mean adding an additional two nodes to my cluster and pointing them at the other nodes in the cluster, to handle data migration. On Wed, Jul 1, 2015 at 11:40 AM, Jonathan Haddad j...@jonhaddad.com wrote: When you say add 2 nodes, do you mean bootstrap, or upgrade in place? On Wed, Jul 1, 2015 at 11:37 AM David Aronchick aronch...@gmail.com wrote: This helps - so let me understand:

Starting point:
- 4 nodes running 2.1.4
- System is healthy

Decide to upgrade:
- Add 2 nodes running 2.1.5
- Run nodetool upgradesstables
- Wait until system is healthy
- Stop 2 nodes running 2.1.4
- Run nodetool upgradesstables
- Add 2 nodes running 2.1.5
- Run nodetool upgradesstables
- Stop 2 nodes running 2.1.4
- Run nodetool upgradesstables

Finished? On Wed, Jul 1, 2015 at 1:49 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Basically, when you add nodes, add them on the correct version to avoid schema / network issues in your streams. Also, try to update all the nodes using rolling restarts in a reduced time frame after stopping repairs, with all the nodes up, etc. You must have a healthy cluster before performing an upgrade. Remember to run nodetool upgradesstables after any upgrade (if not needed it will end fast, so I would run it always as a best practice, just in case). C*heers, Alain 2015-07-01 2:16 GMT+02:00 David Aronchick aronch...@gmail.com: That is a GREAT lead!
So it looks like I can't add a few nodes to the cluster of the new version, have it settle down, and then upgrade the rest? On Tue, Jun 30, 2015 at 11:58 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Would it matter that I'm mixing cassandra versions? From: http://docs.datastax.com/en/upgrade/doc/upgrade/datastax_enterprise/upgrdLim.html General upgrade limitations: Do not run nodetool repair. Do not enable new features. Do not issue these types of queries during a rolling restart: DDL, TRUNCATE. During upgrades, the nodes on different versions show a schema disagreement. I think this is a good lead. C*heers, Alain 2015-06-30 20:22 GMT+02:00 David Aronchick aronch...@gmail.com: I appreciate the thoughts! My issue is that it seems to work perfectly, until the node goes away. Would it matter that I'm mixing cassandra versions? (2.1.4 and 2.1.5)? On Tue, Jun 30, 2015 at 5:23 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi David. What does a nodetool describecluster output look like? My guess is you might be having a schema version desynchronisation. If you see a node with a different schema version you might want to try nodetool resetlocalschema - Reset node's local schema and resync. You asked for any thoughts, this is a thought, not sure if it will help, I hope so. C*heers, Alain 2015-06-30 1:44 GMT+02:00 Robert Coli rc...@eventbrite.com: On Mon, Jun 29, 2015 at 2:43 PM, David Aronchick aronch...@gmail.com wrote: Ping--- any thoughts here? I don't have any thoughts on your specific issue at this time, but FWIW #cassandra on freenode is sometimes a better forum for interactive debugging of operational edge cases. =Rob
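For reference, the safer in-place path Jonathan describes earlier in the thread looks roughly like this on each node in turn; a sketch assuming a package-managed install (the package command and version pin are illustrative):

```shell
# Rolling, in-place upgrade of one node; repeat node by node,
# with no repairs, bootstraps, or decommissions in flight.
nodetool drain                     # flush memtables, stop accepting writes
service cassandra stop
apt-get install cassandra=2.1.5    # hypothetical package/version pin
service cassandra start
nodetool upgradesstables           # rewrite sstables on the new version
```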
Re: Truncate really slow
On Wed, Jul 1, 2015 at 2:58 PM, horschi hors...@gmail.com wrote: you have to enable -Dcassandra.unsafesystem=true in cassandra-env.sh. Also disable durable writes for your CFs. This should speed things up and should reduce IOWait dramatically. The above two suggestions are almost always bad advice anywhere but in a test environment. =Rob
Re: Lots of write timeouts and missing data during decomission/bootstrap
On Wed, Jul 1, 2015 at 2:58 PM, Kevin Burton bur...@spinn3r.com wrote: Looks like all of this is happening because we’re using CAS operations and the driver is going to SERIAL consistency level. ... This is Cassandra 2.0.9 btw. https://issues.apache.org/jira/browse/CASSANDRA-8640 =Rob (credit to iamaleksey on IRC for remembering the JIRA #)
Cassandra leap second
Which version are you running and what's your kernel version? We are still running on the 1.2 branch but we have not seen any high cpu usage yet... On Tue, Jun 30, 2015 at 11:10 PM, snair123 . nair...@outlook.com wrote: reboot of the machine worked -- From: nair...@outlook.com To: user@cassandra.apache.org Subject: Cassandra leap second Date: Wed, 1 Jul 2015 02:54:53 + Is it ok to run this https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/ Seeing high cpu consumption for cassandra process -- Sent from Jeff Dean's printf() mobile console
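The fix in the linked Mozilla post (which the reboot also achieves) is to reset the kernel clock, clearing the stuck leap-second state that makes futex-heavy processes like the JVM spin; a sketch, requires root and assumes a Linux host:

```shell
# Re-set the system clock to its current value; this clears the
# kernel's leap-second state without restarting Cassandra.
date -s "$(LC_ALL=C date)"
```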
Re: Error while adding a new node.
also:

root@cas03:~# sudo service cassandra start
root@cas03:~# lsof -n | grep java | wc -l
5315
root@cas03:~# lsof -n | grep java | wc -l
977317
root@cas03:~# lsof -n | grep java | wc -l
880240
root@cas03:~# lsof -n | grep java | wc -l
882402

On Wed, Jul 1, 2015 at 6:31 PM, Neha Trivedi nehajtriv...@gmail.com wrote: One of the column families has the SSTable count as under: SSTable count: 98506 Can it be because of the 2.1.3 version of Cassandra? I found this: https://issues.apache.org/jira/browse/CASSANDRA-8964 regards Neha On Wed, Jul 1, 2015 at 5:40 PM, Jason Wee peich...@gmail.com wrote: nodetool cfstats? On Wed, Jul 1, 2015 at 8:08 PM, Neha Trivedi nehajtriv...@gmail.com wrote: Hey.. nodetool compactionstats pending tasks: 0 no pending tasks. Don't have OpsCenter. How do I monitor sstables? On Wed, Jul 1, 2015 at 4:28 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: You also might want to check if you have compactions pending (OpsCenter / nodetool compactionstats). Also you can monitor the number of sstables. C*heers Alain 2015-07-01 11:53 GMT+02:00 Neha Trivedi nehajtriv...@gmail.com: Thanks, I will check it out. I increased the ulimit to 10, but I am getting the same error, but after a while. regards Neha On Wed, Jul 1, 2015 at 2:22 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: Just check the process owner to be sure (top, htop, ps, ...) http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html#reference_ds_sxl_gf3_2k__user-resource-limits C*heers, Alain 2015-07-01 7:33 GMT+02:00 Neha Trivedi nehajtriv...@gmail.com: Arun, I am logging on to the server as root and running (sudo service cassandra start) regards Neha On Wed, Jul 1, 2015 at 11:00 AM, Neha Trivedi nehajtriv...@gmail.com wrote: Thanks Arun! I will try and get back! On Wed, Jul 1, 2015 at 10:32 AM, Arun arunsi...@gmail.com wrote: Looks like you have a too-many-open-files issue. Increase the ulimit for the user.
If you are starting the cassandra daemon using user cassandra, increase the ulimit for that user. On Jun 30, 2015, at 21:16, Neha Trivedi nehajtriv...@gmail.com wrote: Hello, I have a 4 node cluster with SimpleSnitch. Cassandra: Cassandra 2.1.3 I am trying to add a new node (cassandra 2.1.7) and I get the following error. ERROR [STREAM-IN-] 2015-06-30 05:13:48,516 JVMStabilityInspector.java:94 - JVM state determined to be unstable. Exiting forcefully due to: java.io.FileNotFoundException: /var/lib/cassandra/data/-Index.db (Too many open files) I increased the MAX_HEAP_SIZE, then I get: ERROR [CompactionExecutor:9] 2015-06-30 23:31:44,792 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:9,1,main] java.lang.RuntimeException: java.io.FileNotFoundException: /var/lib/cassandra/data/-Data.db (Too many open files) at org.apache.cassandra.io.compress.CompressedThrottledReader.open(CompressedThrottledReader.java:52) ~[apache-cassandra-2.1.7.jar:2.1.7] Is it because of the different versions of Cassandra (2.1.3 and 2.1.7)? regards N
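The limit can be checked for the user that runs the daemon, and raised persistently via limits.conf; a sketch following the DataStax recommended-settings page linked earlier in the thread (the values are the commonly recommended ones, not from this thread):

```shell
# Show the current open-files limit for this shell/user.
ulimit -n

# To persist a higher limit for the cassandra user, add to
# /etc/security/limits.conf (takes effect on next login/restart):
#   cassandra - memlock unlimited
#   cassandra - nofile  100000
```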
Re: Lots of write timeouts and missing data during decomission/bootstrap
WOW.. nice. you rock!! On Wed, Jul 1, 2015 at 3:18 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Jul 1, 2015 at 2:58 PM, Kevin Burton bur...@spinn3r.com wrote: Looks like all of this is happening because we’re using CAS operations and the driver is going to SERIAL consistency level. ... This is Cassandra 2.0.9 btw. https://issues.apache.org/jira/browse/CASSANDRA-8640 =Rob (credit to iamaleksey on IRC for remembering the JIRA #) -- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts