How to disconnect a data center with Astyanax
Hi all,

I need to remove a Cassandra data center from a set of connected Cassandra data centers (I use Cassandra 2.0.7). I found a related document at http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_decomission_dc_t.html, but I need to do these steps from my Java programs.

My understanding is that step #3 is to change the keyspace definition on all remaining data centers. E.g. I have three connected data centers, dc1, dc2 and dc3, and I now need to disconnect dc3 from them. After executing steps #1 and #2, I need to run step #3 on dc1 and dc2.

Can anyone tell me how to do this with the Astyanax APIs? Can I use KeyspaceDefinition.setStrategyOptions() to do this?

Thanks,
Boying
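Step #3 boils down to an ALTER KEYSPACE whose replication map lists only the remaining data centers. A sketch of building that statement in plain Java follows; the keyspace name, DC names and replication factors are made-up examples, and this only constructs the CQL, it does not talk to a cluster:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: build the ALTER KEYSPACE statement for step #3, i.e. the
// NetworkTopologyStrategy replication map with the departing data
// center removed. All names/RFs below are illustrative assumptions.
class DropDcFromKeyspace {

    static String alterKeyspaceCql(String keyspace,
                                   Map<String, Integer> dcReplication,
                                   String dcToDrop) {
        StringBuilder cql = new StringBuilder();
        cql.append("ALTER KEYSPACE ").append(keyspace)
           .append(" WITH replication = {'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> e : dcReplication.entrySet()) {
            if (e.getKey().equals(dcToDrop)) {
                continue; // leave dc3 out of the new replication map
            }
            cql.append(", '").append(e.getKey()).append("': ").append(e.getValue());
        }
        return cql.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> dcs = new LinkedHashMap<>();
        dcs.put("dc1", 3);
        dcs.put("dc2", 3);
        dcs.put("dc3", 3);
        // -> ALTER KEYSPACE my_ks WITH replication =
        //    {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}
        System.out.println(alterKeyspaceCql("my_ks", dcs, "dc3"));
    }
}
```

You could then run the resulting statement through Astyanax's CQL support (e.g. keyspace.prepareQuery(cf).withCql(cql).execute()) against a node in dc1 or dc2; the schema change propagates cluster-wide. Passing the same options map to KeyspaceDefinition.setStrategyOptions() followed by a keyspace update may also work, but I have not verified that code path.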
order by on different columns
Hi,

We need to retrieve data stored in Cassandra in something other than its natural order. We are looking for possible ways to sort results from a column family based on columns that are not part of the primary key. Is denormalizing the data into another column family the only option?

Thanks,
Tommaso
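Denormalizing into a second column family whose clustering key is the sort column is the usual answer (secondary indexes don't give you ordering). A toy, pure-Java sketch of the pattern, with two in-memory "tables" standing in for the two column families; all names here are illustrative:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the denormalization pattern: every write goes to two
// "tables" -- one keyed for point lookups, one ordered by the column
// we want to sort on (like a CF clustered by that column).
class DenormalizedSort {
    // base table: user_id -> score
    static final Map<String, Integer> byUser = new HashMap<>();
    // denormalized table: "clustered" by score, descending
    static final TreeMap<Integer, List<String>> byScore =
            new TreeMap<>(Comparator.reverseOrder());

    static void write(String userId, int score) {
        byUser.put(userId, score);
        byScore.computeIfAbsent(score, k -> new ArrayList<>()).add(userId);
        // NB: a real implementation must also remove the old entry when a
        // score changes -- keeping both copies in sync is the price of
        // denormalization.
    }

    static List<String> topUsers(int n) {
        List<String> out = new ArrayList<>();
        for (List<String> users : byScore.values()) {
            for (String u : users) {
                if (out.size() == n) return out;
                out.add(u);
            }
        }
        return out;
    }
}
```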
MemtablePostFlusher and FlushWriter
Hi all,

I have a small cluster (2 nodes, RF 2) running C* 2.0.6 on I2 Extra Large (AWS) with SSD disks. nodetool tpstats shows many MemtablePostFlusher pending tasks and FlushWriter "All time blocked". The two nodes have the default configuration. All CFs use the size-tiered compaction strategy. There are 10 times more reads than writes (1300 reads/s and 150 writes/s).

ubuntu@node1:~$ nodetool tpstats
Pool Name            Active  Pending  Completed  Blocked  All time blocked
MemtablePostFlusher       1     1158     159590        0                 0
FlushWriter               0        0      11568        0              1031

ubuntu@node1:~$ nodetool compactionstats
pending tasks: 90
Active compaction remaining time : n/a

ubuntu@node2:~$ nodetool tpstats
Pool Name            Active  Pending  Completed  Blocked  All time blocked
MemtablePostFlusher       1     1020      50987        0                 0
FlushWriter               0        0       6672        0               948

ubuntu@node2:~$ nodetool compactionstats
pending tasks: 89
Active compaction remaining time : n/a

I think there is something wrong. Thank you for your help.
Only one node in Cassandra AMI cluster
When I start a Cassandra AMI cluster with 4 nodes on AWS and open OpsCenter, only one node is shown. I followed the doc http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installAMILaunch.html

I chose region us-east-1, AMI ami-ada2b6c4, set the number of EC2 instances to 4, and set the advanced details to:

--clustername myDSCcluster --totalnodes 4 --version community

Then, after launching and waiting a bit, I can only see one node in the cluster:

ubuntu@ip-:~$ nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: us-east
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load      Tokens  Owns    Host ID  Rack
UN  XXX      59.24 KB  256     100.0%  1c
Re: Only one node in Cassandra AMI cluster
Have you set up your security group correctly so that all nodes can communicate? Is there anything in the logs to suggest nodes could not communicate with each other? When you log into the other instances, is Cassandra running correctly?

On Tue, Jul 15, 2014 at 6:28 PM, Lars Schouw schou...@yahoo.com wrote:

When I start a Cassandra AMI cluster with 4 nodes on AWS and open OpsCenter, only one node is shown. ...
Re: Index creation sometimes fails
FWIW I was able to work around this problem by having my code run the following loop:

while (the index doesn't exist and we haven't hit our limit on the number of tries):
    try to create the index
    read from system.schema_columns to see if the index exists
    if so, then break
    if not, pause 5 seconds

This loop took three iterations to create the index. Is this expected? This seems really weird!

Best regards,
Clint

On Mon, Jul 14, 2014 at 5:54 PM, Clint Kelly clint.ke...@gmail.com wrote:

BTW I have seen this using versions 2.0.1 and 2.0.3 of the Java driver on a three-node cluster with DSE 4.5.

On Mon, Jul 14, 2014 at 5:51 PM, Clint Kelly clint.ke...@gmail.com wrote:

Hi everyone,

I have some code that I've been fiddling with today that uses the DataStax Java driver to create a table and then create a secondary index on a column in that table. I've been testing this code fairly thoroughly on a single-node Cassandra instance on my laptop and in unit tests (using the CassandraDaemon). When running on a three-node cluster, however, I see strange behavior. Although my table always gets created, the secondary index often does not! If I delete the table and then create it again (through the same code that I've written), I've never seen the index fail to appear the second time.

Does anyone have any idea what to look for here? I have no experience working on a Cassandra cluster and I wonder if maybe I am doing something dumb (I basically just installed DSE and started up the three nodes and that was it). I don't see anything that looks unusual in OpsCenter for DSE.

The only thing I've noticed is that output like the following from my program, after executing the command to create the index, is perfectly correlated with successful creation of the index:

14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Received event EVENT CREATED kiji_retail2.t_model_repo, scheduling delivery
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection: [Control connection] Refreshing schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.Cluster: Refreshing schema for kiji_retail2
14/07/14 17:40:01 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [9a8d72f9-e384-3aa8-bc85-185e2c303ade, b309518a-35d2-3790-bb66-ea39bb0d188c]
14/07/14 17:40:02 DEBUG com.datastax.driver.core.ControlConnection: Checking for schema agreement: versions are [b309518a-35d2-3790-bb66-ea39bb0d188c]

If anyone can give me a hand, I would really appreciate it. I am out of ideas!

Best regards,
Clint
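The workaround loop above can be written as a small self-contained helper. This is only a sketch, not driver API: the BooleanSupplier would wrap a SELECT against system.schema_columns and the Runnable would issue the CREATE INDEX statement (ideally CREATE INDEX IF NOT EXISTS, so re-running it is harmless); the try/pause policy is the one from the message.

```java
import java.util.function.BooleanSupplier;

// Self-contained version of the retry-until-visible workaround:
// issue the DDL, then poll until the schema change is visible or we
// give up after maxTries attempts.
class SchemaRetry {
    static boolean createIndexWithRetry(BooleanSupplier indexExists,
                                        Runnable createIndex,
                                        int maxTries, long pauseMillis) {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            createIndex.run();
            if (indexExists.getAsBoolean()) {
                return true; // index is visible, we are done
            }
            try {
                Thread.sleep(pauseMillis); // give the schema time to propagate
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // gave up after maxTries attempts
    }
}
```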
Re: Index creation sometimes fails
As far as I know, schema propagation always takes some time in the cluster. Some people on this mailing list have faced similar behavior in the past.

On Tue, Jul 15, 2014 at 8:20 PM, Clint Kelly clint.ke...@gmail.com wrote:

FWIW I was able to work around this problem by having my code run the following loop: ...
Re: Index creation sometimes fails
Hi DuyHai,

Thanks for the response. Is the recommended best practice therefore to do what I describe above? Is there some way to get the driver to block until the schema change has propagated everywhere? My current solution feels rather janky!

Thanks!

Best regards,
Clint

On Tue, Jul 15, 2014 at 11:32 AM, DuyHai Doan doanduy...@gmail.com wrote:

As far as I know, schema propagation always takes some time in the cluster. ...
Re: Only one node in Cassandra AMI cluster
Good to hear.

Mark

On Tue, Jul 15, 2014 at 8:16 PM, Lars Schouw schou...@yahoo.com wrote:

Yes, it works when setting the security group so that all traffic is allowed from anywhere. ...
Re: Only one node in Cassandra AMI cluster
Yes, it works when setting the security group so that all traffic is allowed from anywhere. I also found the doc that tells me how to add a more restrictive security group: http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installAMISecurityGroup.html

On Tuesday, July 15, 2014 6:43 PM, Mark Reddy mark.re...@boxever.com wrote:

Have you set up your security group correctly so that all nodes can communicate? ...
Re: MemtablePostFlusher and FlushWriter
I have seen this behaviour when commitlog files got deleted (or their permissions were set to read-only). MemtablePostFlusher is the stage that marks the commitlog as flushed. When those tasks fail, it usually means there is something wrong with the commitlog files. Check your log files for any commitlog-related errors.

Regards,
Christian

On Tue, Jul 15, 2014 at 7:03 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi all, I have a small cluster (2 nodes, RF 2) running C* 2.0.6 on I2 Extra Large (AWS) with SSD disks, and nodetool tpstats shows many MemtablePostFlusher pending tasks and FlushWriter "All time blocked". ...
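A minimal sketch of that log check in code; the log line formats used in the test are invented for illustration, so adjust the keywords to whatever your system.log actually contains:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: pick out commitlog-related error lines from a system.log.
// The keyword matching is an assumption, not an official log format.
class CommitLogErrors {
    static List<String> commitLogErrors(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            String lower = line.toLowerCase();
            boolean aboutCommitLog = lower.contains("commitlog");
            boolean looksBad = lower.contains("error") || lower.contains("exception");
            if (aboutCommitLog && looksBad) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```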
Re: high pending compactions
The number of compactions can spike in perfectly reasonable scenarios (i.e. after a repair or bootstrap), so no matter what you set the threshold to, there will likely be periodic false alarms. I would, however, put a number like 15 on the limit of any thread pool backlog as a matter of principle, until woken up too many times in the middle of the night. concurrent_compactors will likely be too low (depending on the number of cores).

---
Chris Lohfink

On Jul 14, 2014, at 7:31 PM, Greg Bone gbon...@gmail.com wrote:

I'm looking into creating monitoring thresholds for Cassandra to report on its health. Does it make sense to set an alert threshold on compaction stats? If so, would setting it to a value equal to or greater than concurrent compactions make sense?

Thanks,
Greg

On Mon, Jun 9, 2014 at 2:14 PM, S C as...@outlook.com wrote:

Thank you all for the quick responses.

From: clohf...@blackbirdit.com
Subject: Re: high pending compactions
Date: Mon, 9 Jun 2014 14:11:36 -0500
To: user@cassandra.apache.org

Bean: org.apache.cassandra.db.CompactionManager. Also, nodetool compactionstats gives you how many are in the queue plus an estimate of how many will be needed. In 1.1 you will OOM far before you hit the limit. In theory, though, the compaction executor is a little special-cased and will actually throw an exception (normally it will block).

Chris

On Jun 9, 2014, at 7:49 AM, S C as...@outlook.com wrote:

Thank you all for the valuable suggestions. A couple more questions: how do I check the compaction queue (an MBean, or the C* system log)? What happens if the queue is full?

From: colinkuo...@gmail.com
Date: Mon, 9 Jun 2014 18:53:41 +0800
Subject: Re: high pending compactions
To: user@cassandra.apache.org

As Jake suggested, you could first increase compaction_throughput_mb_per_sec and concurrent_compactions to suitable values if system resources allow. From my understanding, major compaction will internally acquire a lock before running. In your case, there might be a major compaction blocking the following pending compaction tasks. You could check the result of nodetool compactionstats and the C* system log to double-check. If the running compaction is compacting a wide row for a long time, you could try tuning the in_memory_compaction_limit_in_mb value.

Thanks,

On Sun, Jun 8, 2014 at 11:27 PM, S C as...@outlook.com wrote:

I am using Cassandra 1.1 (sorry, a bit old) and I am seeing a high pending compaction count (pending tasks: 67) while there are no more than 5 active compaction tasks. I have a 24-CPU machine. Shouldn't I be seeing more compactions? Is this a pattern of high writes and compactions backing up? How can I improve this? Here are my thoughts:

Increase memtable_total_space_in_mb
Increase compaction_throughput_mb_per_sec
Increase concurrent_compactions

Sorry if this was discussed already. Any pointers are much appreciated.

Thanks,
Kumar
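A sketch of such a monitoring check against nodetool compactionstats output follows; the threshold of 15 is only the rule-of-thumb number suggested above, not an official limit:

```java
// Sketch of a monitoring check: parse "nodetool compactionstats" output
// and flag a backlog over a threshold. Remember that pending tasks
// legitimately spike after repair or bootstrap, so in practice you
// should alert on sustained values only.
class CompactionAlert {
    static int pendingTasks(String compactionStatsOutput) {
        for (String raw : compactionStatsOutput.split("\n")) {
            String line = raw.trim();
            if (line.startsWith("pending tasks:")) {
                return Integer.parseInt(
                        line.substring("pending tasks:".length()).trim());
            }
        }
        return -1; // line not found
    }

    static boolean shouldAlert(int pending, int threshold) {
        return pending >= 0 && pending > threshold;
    }
}
```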
Re: MemtablePostFlusher and FlushWriter
The MemtablePostFlusher is also used for flushing non-CF-backed (Solr) indexes. Are you using DSE and Solr by chance?

Chris

On Jul 15, 2014, at 5:01 PM, horschi hors...@gmail.com wrote:

I have seen this behaviour when commitlog files got deleted (or their permissions were set to read-only). ...