Upgrading 1.1 to 1.2 in-place
Hello list,

I have a 2-DC setup with a DC1:3, DC2:3 replication factor. DC1 has 6 nodes, DC2 has 3. The whole setup runs on AWS, on Cassandra 1.1. Here's my nodetool ring:

1.1.1.1  eu-west  1a  Up  Normal  55.07 GB   50.00%   0
2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%  1
1.1.1.2  eu-west  1b  Up  Normal  53.98 GB   50.00%   28356863910078205288614550619314017622
1.1.1.3  eu-west  1c  Up  Normal  54.85 GB   50.00%   56713727820156410577229101238628035242
2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%  56713727820156410577229101238628035243
1.1.1.4  eu-west  1a  Up  Normal  54.99 GB   50.00%   85070591730234615865843651857942052863
1.1.1.5  eu-west  1b  Up  Normal  55.1 GB    50.00%   113427455640312821154458202477256070484
2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%  113427455640312821154458202477256070485
1.1.1.6  eu-west  1c  Up  Normal  55.01 GB   50.00%   141784319550391026443072753096570088105

I am going to upgrade my machine type, upgrade to 1.2, and shrink DC1 from 6 nodes to 3. I will have to do it on the live system. I'd appreciate any comments on my plan:

1. Decommission a 1.1 node.
2. Bootstrap a new one in-place on Cassandra 1.2 with vnodes enabled (I am trying to avoid a rebalance later on).
3. When done, decommission nodes 4-6 in DC1.

Issues I've spotted:

1. I'm guessing I will have an unbalanced cluster for the period where I have 1.2+vnodes and 1.1 mixed.
2. Rollback is cumbersome; snapshots won't help here.

Any feedback appreciated,
Katriel
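The flattened ring listing above follows the fixed column layout of 1.1-era `nodetool ring` (address, DC, rack, status, state, load, owns, token). A hypothetical helper (not from the thread) that parses one such line, e.g. to sanity-check per-DC ownership before and after the node count change:

```python
def parse_ring_line(line):
    """Parse one `nodetool ring` line into a dict.

    Assumes the 1.1-era column order:
      address dc rack status state load unit owns% token
    and that load is reported in GB (as in the listing above).
    """
    parts = line.split()
    return {
        "address": parts[0],
        "dc": parts[1],
        "rack": parts[2],
        "status": parts[3],            # Up / Down
        "state": parts[4],             # Normal / Leaving / ...
        "load_gb": float(parts[5]),    # parts[6] is the "GB" unit token
        "owns_pct": float(parts[7].rstrip("%")),
        "token": int(parts[8]),
    }
```

Feeding it the first ring line would let you group nodes by `dc` and sum `owns_pct` to confirm each DC covers the full range.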
Re: Upgrading 1.1 to 1.2 in-place
No, this is not going to work. The vnodes feature requires the Murmur3 partitioner, which was introduced with Cassandra 1.2. Since you are currently on 1.1, you must be using the random partitioner, which is not compatible with vnodes. Because the partitioner determines the physical layout of all of your data on disk and across the cluster, it is not possible to change partitioner without taking downtime to rewrite all of your data. You should probably plan on an upgrade to 1.2, but without also switching to vnodes at this point.

-Tupshin
Re: Upgrading 1.1 to 1.2 in-place
Hi,

I don't know how your application works, but I explained at the last Cassandra Summit Europe how we migrated from a relational database to Cassandra without any interruption of service. You can have a look at the video: http://www.youtube.com/watch?v=mefOE9K7sLI and use the mod_dup module: https://github.com/Orange-OpenSource/mod_dup

For copying data from your Cassandra 1.1 cluster to the Cassandra 1.2 cluster, you can back up your data and then use sstableloader (in this case, you will not have to modify the timestamps as I did for the migration from relational to Cassandra).

Hope that helps!

Jean Armel
Opscenter's Meaning of 'Requests'
Hi guys,

I have started learning Cassandra and am working with it recently. I have created two column families.

For CF1, a write is an insert into a unique row with all column values. E.g.:

Key  Col1  Col2  Col3
k1   c11   c12   c13
k2   c21   c22   c23

For CF2, a write is an insert into a time-stamped column of a row. E.g.:

Key  timeCol1  timeCol2
k1   ct11      ct12
k2   ct21      ct22

I am using YCSB with the Thrift-based *client.batch_mutate()* call. For CF1, I send all column values for a row through the call. For CF2, I send the new column values for a row.

Now say OpsCenter reports the write requests as, say, 1000 *operations*/sec when the record count is, say, 1 record. The OpsCenter API docs describe 'Write Requests' as requests per second. What does an operation/request mean from the OpsCenter perspective? Does it mean unique row inserts across all column families? Does it mean a count of each mutation for a row? How does OpsCenter identify a unique operation/request? Is it related to, or dependent on, the row count or mutation count of the batch_mutate() call? From the application perspective an operation means something different for each column family. Can someone guide me?

Thanks,
Arun
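The two candidate meanings of a "request" can be made concrete. A hypothetical sketch (the dict mirrors the shape of a Thrift batch_mutate mutation_map; it is not OpsCenter's actual accounting, which the thread is asking about):

```python
def count_ops(mutation_map):
    """Count a batch_mutate payload two plausible ways.

    mutation_map mirrors the Thrift shape:
        {row_key: {column_family: [mutation, ...]}}

    Returns (row_inserts, column_mutations):
      - row_inserts: one per distinct row key in the batch
      - column_mutations: one per individual column mutation
    """
    row_inserts = len(mutation_map)
    column_mutations = sum(
        len(muts)
        for cf_map in mutation_map.values()
        for muts in cf_map.values()
    )
    return row_inserts, column_mutations
```

For the CF1 example above, a batch writing k1 and k2 with three columns each is 2 by the first counting and 6 by the second, which is exactly the ambiguity the question raises.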
Re: Upgrading 1.1 to 1.2 in-place
What is the technical limitation that makes vnodes need Murmur3? That seems uncool for long-time users.

--
Sorry, this was sent from mobile. Will do less grammar and spell check than usual.
Re: Upgrading 1.1 to 1.2 in-place
Hi,

Random partitioner + vnodes is a supported combo according to the DataStax documentation: http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/architecture/architecturePartitionerAbout_c.html

How else would you even migrate from 1.1 to vnodes, since migration from one partitioner to another is such a huge amount of work?

Cheers,
Hannu
Re: Upgrading 1.1 to 1.2 in-place
Sorry for the misinformation. I totally forgot about that combination being supported, since I've never seen it actually used. Correct that it should work, though.

-Tupshin
Re: Upgrading 1.1 to 1.2 in-place
OK. Given the correction of my unfortunate partitioner error: you can, and probably should, upgrade in place to 1.2, but with num_tokens=1, so that it will initially behave like non-vnodes 1.1 would. Then you can do a rolling conversion to more than one vnode per node, and once complete, shuffle your vnodes: http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes-2

There should be no time where your cluster is unbalanced.

-Tupshin
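The two-phase path above maps onto two cassandra.yaml settings. A sketch of what each phase might look like on one DC1 node (the token is taken from the ring listing earlier in the thread; num_tokens: 256 is just the common vnode default, not a value prescribed here):

```yaml
# Phase 1: upgrade binaries to 1.2, keep single-token behaviour
num_tokens: 1
initial_token: 28356863910078205288614550619314017622   # this node's existing token

# Phase 2 (rolling, per node; afterwards run the shuffle utility):
# num_tokens: 256
# initial_token:    # leave unset once vnodes are enabled
```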
CQL 3, Schema change management and best practices
Are there published best practices for managing schema with CQL 3? Say, for bootstrapping the schema for a new feature. Do folks query system.schema_keyspaces on startup and create the necessary schema if it doesn't exist? Or do you have one-off scripts that create schema? Is there a more accepted way of dealing with this on an ongoing basis?

Thanks!
tc
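The query-on-startup approach mentioned above can be kept driver-agnostic by separating the diff logic from the cluster calls. A hypothetical sketch (the function names and the flow are assumptions, not an established pattern from the thread): fetch what exists from system.schema_keyspaces / system.schema_columnfamilies, then apply only the missing DDL:

```python
def missing_ddl(existing_tables, desired):
    """Return the CREATE statements that still need to be run.

    existing_tables: set of table (column family) names already present
                     in the keyspace, e.g. read from
                     system.schema_columnfamilies on startup.
    desired: dict mapping table name -> its CREATE statement.

    Statements come back in a stable (name-sorted) order so repeated
    startups apply them deterministically; the call is idempotent.
    """
    return [stmt for name, stmt in sorted(desired.items())
            if name not in existing_tables]
```

On startup the app would run each returned statement against the cluster; a second startup returns an empty list and does nothing.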
Re: Upgrading 1.1 to 1.2 in-place
On Mon, Dec 30, 2013 at 6:45 AM, Tupshin Harper tups...@tupshin.com wrote:

OK. Given the correction of my unfortunate partitioner error, you can, and probably should, upgrade in place to 1.2, but with num_tokens=1 so it will initially behave like 1.1 non-vnodes would. Then you can do a rolling conversion to more than one vnode per node, and once complete, shuffle your vnodes.

@OP:

1) You should remove nodes, then upgrade in place to 1.2, then optionally convert to vnodes. Bootstrapping into a mixed-version cluster is, if I understand correctly, not supported.
2) Depending on your data size and the process used, using shuffle to convert to vnodes may fill your nodes/take forever/not work. [1]
3) I have not personally heard a single report of someone successfully using shuffle on a production cluster with real data. Try it on a representative data size in non-production first, and if you succeed, let the list know! :D

If I were you, I would probably:

a) question how much I really need vnodes on a cluster with 3 physical nodes per DC
b) if I decided I didn't really need vnodes, first decommission from 6 nodes down to 3 in DC1 on 1.1
c) then upgrade my 3 nodes to 1.2 via a rolling restart
d) then use auto_bootstrap:false to upgrade instance hardware on the nodes, by:
   i) pre-copying the data directory to the target with rsync
   ii) configuring the target node with the same initial_token as the source node
   iii) draining and stopping the source node
   iv) re-copying the data directory with rsync --delete, to do a final sync of the data directories
   v) starting the new node with auto_bootstrap:false in the conf file [2]

=Rob

[1] https://issues.apache.org/jira/browse/CASSANDRA-5525
[2] https://engineering.eventbrite.com/changing-the-ip-address-of-a-cassandra-node-with-auto_bootstrapfalse/
Re: Can't write to row key, even at ALL. Tombstones?
On Fri, Dec 27, 2013 at 6:13 PM, Josh Dzielak j...@keen.io wrote:

Our suspicion is that we somehow have a row-level tombstone that is future-dated and has not gone away (we've lowered gc_grace_seconds in the hope that it'd get compacted, but no luck so far, even though the sstables that hold the row key have all cycled since).

What version of Cassandra? ecapriolo is right: use sstablekeys/sstable2json to inspect suspect rows in SSTables. If that's the problem, you can follow this procedure: http://thelastpickle.com/2011/12/15/Anatomy-of-a-Cassandra-Partition/

Or you can dump/reload with sstable2json/json2sstable and filter out bad values.

=Rob
[RELEASE] Apache Cassandra 2.0.4
The Cassandra team is pleased to announce the release of Apache Cassandra version 2.0.4. Cassandra is a highly scalable second-generation distributed database; you can read more here: http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download section: http://cassandra.apache.org/download/

This version is a bug-fix[1] release and a recommended upgrade. As always, please pay attention to the release notes[2] and let us know[3] if you encounter any problems. Enjoy!

P.S. Once again, I'm afraid that an update to the APT repository will have to wait until Sylvain's return, my apologies. Until then, you can access an updated Debian package from my home directory[4].

[1]: http://goo.gl/6OM7dZ (CHANGES.txt)
[2]: http://goo.gl/At9VU3 (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA
[4]: http://people.apache.org/~eevans (Debian package)

--
Eric Evans
eev...@sym-link.com
Re: CentOS - Could not setup cluster(snappy error)
You can add something like this to cassandra-env.sh:

JVM_OPTS="$JVM_OPTS -Dorg.xerial.snappy.tempdir=/path/that/allows/executables"

- Erik -

On 12/28/2013 08:36 AM, Edward Capriolo wrote:

Check your fstab settings. On some systems /tmp has noexec set, and unpacking a library into tmp and trying to run it does not work.

On Fri, Dec 27, 2013 at 5:33 PM, Víctor Hugo Oliveira Molinar vhmoli...@gmail.com wrote:

Hi, I'm not able to start a multi-node cluster in a CentOS environment, due to a Snappy loading error. Here is my current setup for both machines (nodes 1 and 2):

CentOS: CentOS release 6.5 (Final)
Java: java version 1.7.0_25, Java(TM) SE Runtime Environment (build 1.7.0_25-b15), Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
Cassandra version: 2.0.3

Also, I've already replaced the current Snappy jar (snappy-java-1.0.5.jar) with the older one (snappy-java-1.0.4.1.jar), although the following error still happens when I try to start the second node:

INFO 20:25:51,879 Handshaking version with /200.219.219.51
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:312)
    at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
    at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:66)
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.0.4.1-libsnappyjava.so: /tmp/snappy-1.0.4.1-libsnappyjava.so: failed to map segment from shared object: Operation not permitted
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1957)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1882)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1843)
    at java.lang.Runtime.load0(Runtime.java:795)
    at java.lang.System.load(System.java:1061)
    at org.xerial.snappy.SnappyNativeLoader.load(SnappyNativeLoader.java:39)
    ... 11 more
ERROR 20:25:52,201 Exception in thread Thread[WRITE-/200.219.219.51,5,main]
org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
    at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
    at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
    at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:66)
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
ERROR 20:26:22,924 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160)
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:416)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:608)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:576)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:475)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504)

What else can I do to fix it?

Att,
Víctor Hugo Molinar
Replication Latency between cross data centers
I want to determine the data replication latency between data centers. Are there any metrics available to capture it in JConsole, or in other ways?
Re: Slow pre-decommission repair
On Tue, Dec 17, 2013 at 1:46 PM, Joel Segerlind j...@kogito.se wrote:

Thanks for the info. However, wouldn't this also affect nodetool repair -pr (although not as much), which I ran on the same node the other day in about 35 minutes? I cannot understand how it can take 35 minutes for the primary range, and 25 hours for a full repair.

Sure, it would. I agree that this seems pathologically long, even with vnodes.

=Rob
Re: Adding nodes to a cluster and 2 minutes rule
On Mon, Nov 18, 2013 at 10:28 AM, Carlos Alvarez cbalva...@gmail.com wrote:

Here http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/operations/ops_add_node_to_cluster_t.html it says that you need to wait 2 minutes between adding nodes. I was trying to figure out why, and how to check whether after 2 minutes the conditions to add more nodes are met or I have to wait longer... any clues?

The "why" includes: https://issues.apache.org/jira/browse/CASSANDRA-2434

=Rob
Re: Crash with TombstoneOverwhelmingException
On Wed, Dec 25, 2013 at 10:01 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

I have to hijack this thread. There seem to be many problems with the 2.0.3 release.

+1. There is no 2.0.x release I consider production-ready, even after today's 2.0.4.

Outside of passing all unit tests, what factors into the release voting process? What other type of extended real-world testing should be done to find bugs like this one that unit testing won't?

I also +1 these questions. Voting seems of limited use given the outputs of the process.

Here is a wacky idea that I am half serious about: make a CMS for http://cassandra.apache.org that back-ends its data and reporting into Cassandra. No release unless the Cassandra db that serves the site is upgraded first. :)

I agree wholeheartedly that eating one's own dogfood is informative.

=Rob
Re: MUTATION messages dropped
I ended up changing memtable_flush_queue_size to be large enough to contain the biggest flood I saw.

As part of the flush process, the "switch lock" is taken to synchronise around the commit log. This is a reentrant read/write lock; the flush path takes the write lock and the write path takes the read part. When flushing a CF, the write lock is taken, the commit log is updated, and the memtable is added to the flush queue. If the queue is full, the write lock will be held, blocking the write threads from taking the read lock.

There are a few reasons why the queue may be full. The simple one is that the disk IO is not fast enough. Others are that the commit log segments are too small, there are lots of CFs and/or lots of secondary indexes, or nodetool flush is called frequently. Increasing the size of the queue is a good workaround, and the correct approach if you have a lot of CFs and/or secondary indexes.

Hope that helps.

- Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 21/12/2013, at 6:03 am, Ken Hancock ken.hanc...@schange.com wrote:

I ended up changing memtable_flush_queue_size to be large enough to contain the biggest flood I saw. I monitored tpstats over time using a collection script and an analysis script that I wrote, to figure out what my largest peaks were. In my case, all my mutation drops correlated with hitting the maximum memtable_flush_queue_size, and mutation drops stopped as soon as the queue size dropped below the max. I threw the scripts up on GitHub in case they're useful: https://github.com/hancockks/tpstats

On Fri, Dec 20, 2013 at 1:08 AM, Alexander Shutyaev shuty...@gmail.com wrote:

Thanks for your answers. srmore, we are using v2.0.0. As for GC, I guess it does not correlate in our case, because we had Cassandra running for 9 days under production load with no dropped messages, and I guess that during this time there were a lot of GCs.

Ken, I've checked the values you indicated. Here they are: node1 6498, node2 6476, node3 6642. I guess this is not good :) What can we do to fix this problem?

2013/12/19 Ken Hancock ken.hanc...@schange.com

We had issues where the flushes of a number of CFs would align and then block writes for a very brief period. If that happened when a bunch of writes came in, we'd see a spike in mutation drops. Check nodetool tpstats for "FlushWriter all time blocked".

On Thu, Dec 19, 2013 at 7:12 AM, Alexander Shutyaev shuty...@gmail.com wrote:

Hi all! We've had a problem with Cassandra recently. We had two one-minute periods when we got a lot of timeouts on the client side (the only timeouts during the 9 days we have been using Cassandra in production). In the logs we found corresponding messages saying something about MUTATION messages dropped. Now, the official FAQ [1] says that this is an indicator that the load is too high. We checked our monitoring and found that the 1-minute average CPU load had a local peak at the time of the problem, but it was about 0.8 against the usual 0.2, which I guess is nothing for a 2-core virtual machine. We also checked Java threads - there was no peak there, and their count was reasonable, ~240-250. Can anyone give us a hint - what should we monitor to see this high load, and what should we tune to make it acceptable?

Thanks in advance,
Alexander

[1] http://wiki.apache.org/cassandra/FAQ#dropped_messages

--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park, Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
Office: +1 (978) 889-3329

This e-mail and any attachments may contain information which is SeaChange International confidential. The information enclosed is intended only for the addressees herein and may not be copied or forwarded without permission from SeaChange International.
Re: Astyanax - multiple key search with pagination
You will need to paginate the list of keys to read in your app. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 21/12/2013, at 12:58 pm, Parag Patel parag.pa...@fusionts.com wrote: Hi, I'm using Astyanax and trying to do a search for multiple keys with pagination. I tried ".getKeySlice" with a list of primary keys, but it doesn't allow pagination. Does anyone know how to tackle this issue with Astyanax? Parag
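Aaron's suggestion amounts to chunking the key list client-side and issuing one multiget per chunk. A language-neutral sketch of that loop (Python here; `fetch_rows` is a hypothetical stand-in for the client call, e.g. Astyanax's getKeySlice):

```python
def paginate_keys(keys, page_size):
    """Yield successive fixed-size chunks of a key list."""
    for i in range(0, len(keys), page_size):
        yield keys[i:i + page_size]

def fetch_all(keys, page_size, fetch_rows):
    """Fetch rows page by page.

    fetch_rows(page) stands in for one multiget call (e.g. getKeySlice)
    and is assumed to return a {key: row} mapping for that page.
    """
    results = {}
    for page in paginate_keys(keys, page_size):
        results.update(fetch_rows(page))  # one round trip per page
    return results
```

Keeping the page small bounds the size of each request, which is the point of paginating in the app when the client API offers no built-in pagination for multi-key reads.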
Re: Broken pipe with Thrift
One question, which is confusing: is it a server side issue or a client side one?

Check the server log for errors to make sure it's not a server side issue. Also check if there could be something in the network that is killing long-lived connections. Check that the thrift lib the client is using is the same as the one in the cassandra lib on the server. Can you do some simple tests using cqlsh from the client machine? That would eliminate the client driver. Hope that helps. - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 25/12/2013, at 4:35 am, Steven A Robenalt srobe...@stanford.edu wrote: In our case, the issue was on the server side, but since you're on the 1.2.x branch, it's not likely to be the same issue. Hopefully, someone else who is using the 1.2.x branch will have more insight than I do.

On Mon, Dec 23, 2013 at 11:52 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi Steven, One question, which is confusing: is it a server side issue or a client side one? -Vivek

On Tue, Dec 24, 2013 at 12:30 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi Steven, Thanks for your reply. We are using version 1.2.9. -Vivek

On Tue, Dec 24, 2013 at 12:27 PM, Steven A Robenalt srobe...@stanford.edu wrote: Hi Vivek, Which release are you using? We had an issue with 2.0.2 that was solved by a fix in 2.0.3.

On Mon, Dec 23, 2013 at 10:47 PM, Vivek Mishra mishra.v...@gmail.com wrote: Also to add: it works absolutely fine on a single node. -Vivek

On Tue, Dec 24, 2013 at 12:15 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, I have a 6 node, 2 DC cluster setup. I have configured the consistency level to QUORUM. But very often i am getting Broken pipe: com.impetus.client.cassandra.CassandraClientBase (CassandraClientBase.java:1926) - Error while executing native CQL query Caused by:
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
    at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
    at org.apache.cassandra.thrift.Cassandra$Client.send_execute_cql3_query(Cassandra.java:1556)
    at org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1546)

I am simply reading a few records from a column family (not a huge amount of data). Connection pooling and socket timeout are properly configured. I have even modified read_request_timeout_in_ms, request_timeout_in_ms and write_request_timeout_in_ms in cassandra.yaml to higher values. Any idea? Is it an issue at the server side or with the client API? -Vivek

-- Steve Robenalt Software Architect HighWire | Stanford University 425 Broadway St, Redwood City, CA 94063 srobe...@stanford.edu http://highwire.stanford.edu

-- Steve Robenalt Software Architect HighWire | Stanford University 425 Broadway St, Redwood City, CA 94063 srobe...@stanford.edu http://highwire.stanford.edu
Re: querying time series from hadoop
So now i will try to patch my cassandra 1.2.11 installation, but i just wanted to ask you guys first if there is any other solution that does not involve a release.

That patch in CASSANDRA-6311 is for 2.0; you cannot apply it to 1.2.

but when i am using the java driver, the driver already uses row key for token statements and i cannot execute the query above, therefore it does a full scan of rows.

The ColumnFamilyRecordReader is designed to read lots of rows, not a single row. You should be able to use the java driver from a hadoop task, though, to read a single row. Can you provide some more info on what you are doing?

Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 26/12/2013, at 9:56 pm, mete efk...@gmail.com wrote: Hello folks, i have come up with a basic time series cql schema based on the articles here: http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra so simply put its something like: rowkey, timestamp, col3, col4 etc... where rowkey and timestamp are compound keys. Where i am having issues is querying this data structure efficiently. When i use cqlsh the query is perfectly fine: select * from table where rowkey='row key' and date > xxx and date <= yyy but when i am using the java driver, the driver already uses row key for token statements and i cannot execute the query above, therefore it does a full scan of rows. The issue that i am having is discussed here: http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig i have gone through the relevant jira issues, 6151 and 6311. This behaviour is supposed to be fixed in 2.0.x but so far it is not there. So now i will try to patch my cassandra 1.2.11 installation, but i just wanted to ask you guys first if there is any other solution that does not involve a release.
i assume that this is a somewhat common use case; the articles i referred to seem old enough, and unless i am missing something obvious i cannot query this schema efficiently with the current version (1.2.x or 2.0.x). Does anyone have a similar issue? Any pointers are welcome. Regards Mete
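For context, the schema and the slice query under discussion look roughly like this in CQL 3 (table and column names are illustrative, following the layout mete describes):

```sql
-- Illustrative time-series table: partition key plus clustering column,
-- in the style of the DataStax advanced-time-series article.
CREATE TABLE events (
    rowkey text,
    date   timestamp,
    col3   text,
    col4   text,
    PRIMARY KEY (rowkey, date)
);

-- Efficient slice within a single partition (works fine from cqlsh):
SELECT * FROM events
WHERE rowkey = 'row key'
  AND date >= '2013-12-01' AND date <= '2013-12-31';
```

The complaint in the thread is that Hadoop-side readers of that era built token()-based WHERE clauses themselves, so this per-partition slice could not be pushed down and the job fell back to scanning all rows.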
Re: Offline migration: Random-Murmur
I wrote a small (yet untested) utility, which should be able to read SSTable files from disk and write them into a cassandra cluster using Hector.

Consider using the SSTableSimpleUnsortedWriter (see http://www.datastax.com/dev/blog/bulk-loading) to create the SSTables; you can then bulk load them into the destination system. This will be much faster. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 29/12/2013, at 6:26 am, Edward Capriolo edlinuxg...@gmail.com wrote: Internally we have a tool that does get range slice on the source cluster and replicates to the destination. Remember that writes are idempotent. Our tool can optionally replicate only the data between two timestamps, allowing incremental transfers. So if you get your application writing new data to both clusters, you can run a range scanning program to copy all the data.

On Monday, December 23, 2013, horschi hors...@gmail.com wrote: Interesting that you even dare to do a live migration :-) Do you do all Murmur writes with the timestamp from the Random data? So that all migrated data is written with timestamps from the past.

On Mon, Dec 23, 2013 at 3:59 PM, Rahul Menon ra...@apigee.com wrote: Christian, I have been planning to migrate my cluster from random to murmur3 in a similar manner. I intend to use pycassa to read and then write to the newer cluster. My only concern would be ensuring the consistency of already-migrated data, as the cluster (with random) would be constantly serving the production traffic. I was able to do this on a non-prod cluster, but production is a different game. I would also like to hear more about this, especially if someone was able to successfully do this. Thanks Rahul

On Mon, Dec 23, 2013 at 6:45 PM, horschi hors...@gmail.com wrote: Hi list, has anyone ever tried to migrate a cluster from Random to Murmur? We would like to do so, to have a more standardized setup.
I wrote a small (yet untested) utility, which should be able to read SSTable files from disk and write them into a cassandra cluster using Hector. This migration would be offline of course and would only work for smaller clusters. Any thoughts on the topic? kind regards, Christian PS: The reason for doing so are not performance. It is to simplify operational stuff for the years to come. :-) -- Sorry this was sent from mobile. Will do less grammar and spell check than usual.
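The range-scan copy Edward describes relies on writes being idempotent when the original timestamps are preserved. A client-agnostic sketch of that loop (`read_fn` and `write_fn` are hypothetical stand-ins for the source and destination client calls; a real tool would use pycassa, Hector, or sstable readers as discussed above):

```python
def copy_rows(read_fn, write_fn, since=None, until=None):
    """Replicate columns from source to destination, keeping the original
    write timestamps so re-runs and overlapping transfers are harmless.

    read_fn() yields (row_key, column_name, value, timestamp) tuples;
    write_fn(row_key, column_name, value, timestamp) does one insert with
    an explicit timestamp. [since, until) optionally bounds an incremental
    transfer window, as in Edward's tool.
    """
    copied = 0
    for row_key, name, value, ts in read_fn():
        if since is not None and ts < since:
            continue  # already transferred in an earlier run
        if until is not None and ts >= until:
            continue  # leave for a later incremental run
        write_fn(row_key, name, value, ts)  # preserve the source timestamp
        copied += 1
    return copied
```

Because the destination applies each column with its original timestamp, data written live to the new cluster after the cutover always wins over migrated data, which is the point of horschi's question about timestamps.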
Re: cassandra monitoring
JMX is doing its thing on the cassandra node and is running on port 8081

Have you set the JMX port for the cluster in Ops Centre? The default JMX port has been 7199 for a while. Off the top of my head it's in the same area where you specify the initial nodes in the cluster, maybe behind an "Advanced" button. The Ops Centre agent talks to the server to find out what JMX port it should use to talk to the local Cassandra install. Also check the logs in /var/log/datastax. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 30/12/2013, at 2:21 am, Tim Dunphy bluethu...@gmail.com wrote: Hi all, I'm attempting to configure the datastax agent so that opscenter can monitor cassandra. I am running cassandra 2.0.3 and opscenter-4.0.1-2.noarch. Cassandra is running on a centos 5.9 host and the opscenter host is running centos 6.5. A ps shows the agent running:

[root@beta:~] #ps -ef | grep datastax | grep -v grep
root 2166 1 0 03:31 ? 00:00:00 /bin/bash /usr/share/datastax-agent/bin/datastax_agent_monitor
106 2187 1 0 03:31 ? 00:01:37 /etc/alternatives/javahome/bin/java -Xmx40M -Xms40M -Djavax.net.ssl.trustStore=/var/lib/datastax-agent/ssl/agentKeyStore -Djavax.net.ssl.keyStore=/var/lib/datastax-agent/ssl/agentKeyStore -Djavax.net.ssl.keyStorePassword=opscenter -Dagent-pidfile=/var/run/datastax-agent/datastax-agent.pid -Dlog4j.configuration=/etc/datastax-agent/log4j.properties -jar datastax-agent-4.0.2-standalone.jar /var/lib/datastax-agent/conf/address.yaml

And the service itself claims that it is running:

[root@beta:~] #service datastax-agent status
datastax-agent (pid 2187) is running...

On the cassandra node I have ports 61620 and 61621 open on the firewall. But if I do an lsof and look for those ports I see no activity there.
[root@beta:~] #lsof -i :61620
[root@beta:~] #lsof -i :61621

And a netstat turns up nothing either:

[root@beta:~] #netstat -tapn | egrep "(datastax|ops)"

So I guess it should come as no surprise that the opscenter interface reports the node as down. And trying to reinstall the agent remotely by clicking the 'fix' link errors out: "g is null. If you need to make changes, you can press Retry and the installations will be retried." On another attempt I also got: "Cannot call method 'getRequstStatus' of null." I'm really wondering what I'm doing wrong here, and how I can work my way out of this quagmire. It would be beyond awesome to actually get this working!

I've also attempted to get Cassandra Cluster Admin working. JMX is doing its thing on the cassandra node and is running on port 8081. CCA is running on the same host as the opscenter. But cca gives me this error once I log in:

Cassandra Cluster Admin Logout Fatal error: Uncaught exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from beta.jokefire.com:9160' in /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TSocket.php:268
Stack trace:
#0 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TTransport.php(87): TSocket->read(4)
#1 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4)
#2 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame()
#3 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TTransport.php(87): TFramedTransport->read(4)
#4 /var/www/Cassandra-Cluster-Admin/include/thrift/protocol/TBinaryProtocol.php(300): TTransport->readAll(4)
#5 /var/www/Cassandra-Cluster-Admin/include/thrift/protocol/TBinaryProtocol.php(192): TBinaryProtocol->readI32(NULL)
#6 /var/www/Cassandra-Cluster-Admin/include/thrift/packages/cassandra/cassandra.Cassandra.client.php(1017): TBinaryProtocol->readMessageBegin(NULL, 0, 0) # in
/var/www/Cassandra-Cluster-Admin/include/thrift/transport/TSocket.php on line 268

Any advice I could get on my CCA problem and/or my OpsCenter problem would be great and appreciated. Thanks Tim -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: Cleanup and old files
Check the SSTable is actually in use by cassandra; if it's missing a component or otherwise corrupt it will not be opened at run time, and so not included in all the fun games the other SSTables get to play. If you have the last startup in the logs, check for an "Opening…" message or an ERROR about the file. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 30/12/2013, at 1:28 pm, David McNelis dmcne...@gmail.com wrote: I am currently running a cluster with 1.2.8. One of my larger column families on one of my nodes has a keyspace-tablename-ic--Data.db with a modify date in August. Since August we have added several nodes (with vnodes), with the same number of vnodes as all the existing nodes. As a result (we've since gone from 15 to 21 nodes), ~32% of the data of the original 15 nodes should have been essentially balanced out to the 6 new nodes (1/15 + 1/16 + 1/21). When I run a cleanup, however, the old data files never get updated, and I can't believe that they all should have remained the same. The only recently updated files in that data directory are secondary index sstable files. Am I doing something wrong here? Am I thinking about this wrong? David
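Aaron's check can be done with a grep over the startup portion of the log. A sketch (paths, table name, and the log excerpt are all made up for illustration; point it at your real system.log and the sstable's name prefix):

```shell
# Check whether a given SSTable generation was opened at startup.
LOG=/tmp/sample_system.log
SSTABLE="keyspace-tablename-ic-1234"   # hypothetical generation number

# Hypothetical log excerpt, standing in for Cassandra's system.log:
cat > "$LOG" <<'EOF'
 INFO 12:00:01 Opening /data/keyspace/tablename/keyspace-tablename-ic-1234 (55 bytes)
 INFO 12:00:02 Opening /data/keyspace/tablename/keyspace-tablename-ic-1235 (60 bytes)
EOF

# An SSTable that appears in neither an "Opening" line nor an ERROR at
# startup is not live, so cleanup and compaction will never touch it.
grep -E "Opening|ERROR" "$LOG" | grep "$SSTABLE"
```

If the file never shows up as Opening and never produces an ERROR, that is consistent with a leftover artifact rather than live data, which is the situation David goes on to describe.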
Re: cassandra monitoring
Hi Aaron, You were right. JMX is running on port 7199; it's just the web interface that's on 8081. My mistake. But what I did was to delete my existing cluster and try to build a new cluster within opscenter, pointing it at my existing cassandra node. Just one node for now, but when we go to production we plan to scale out. When I tried to install the agent with opscenter, the installation begins but fails with this message a few moments later:

Install Errored: Failure installing agent on beta.jokefire.com. Error output: /var/lib/opscenter/ssl/agentKeyStore.pem: No such file or directory Exit code: 1

I was wondering where I could go from here. Also I would like to password protect my OpsCenter installation (assuming I can ever get any useful data into it). Are there any docs on how I can do that? Thanks Tim

- Original Message - From: Aaron Morton aa...@thelastpickle.com To: Cassandra User user@cassandra.apache.org Sent: Monday, December 30, 2013 9:19:05 PM Subject: Re: cassandra monitoring [snip]
Re: Cleanup and old files
I see the SSTable in this log statement: "Stream context metadata" (along with a bunch of other files), but I do not see it in the list of files Opening (which I see quite a bit of, as expected). Safe to try moving that file off the server (to a backup location)? If I tried this, would I want to shut down the node first and monitor startup to see if it all of a sudden is 'missing' something / throws an error then?

On Mon, Dec 30, 2013 at 9:26 PM, Aaron Morton aa...@thelastpickle.com wrote: Check the SSTable is actually in use by cassandra; if it's missing a component or otherwise corrupt it will not be opened at run time, and so not included in all the fun games the other SSTables get to play. If you have the last startup in the logs, check for an "Opening…" message or an ERROR about the file. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com

On 30/12/2013, at 1:28 pm, David McNelis dmcne...@gmail.com wrote: I am currently running a cluster with 1.2.8. One of my larger column families on one of my nodes has a keyspace-tablename-ic--Data.db with a modify date in August. Since August we have added several nodes (with vnodes), with the same number of vnodes as all the existing nodes. As a result (we've since gone from 15 to 21 nodes), ~32% of the data of the original 15 nodes should have been essentially balanced out to the 6 new nodes (1/15 + 1/16 + 1/21). When I run a cleanup, however, the old data files never get updated, and I can't believe that they all should have remained the same. The only recently updated files in that data directory are secondary index sstable files. Am I doing something wrong here? Am I thinking about this wrong? David
Opscenter Metrics
Hi guys, I am using YCSB with the thrift-based client.batch_mutate() call. Now say opscenter reports the write requests as, say, 1000 operations/sec when a record count is, say, 1 records. The OpsCenter API docs describe 'Write Requests' as requests per second. 1) What does an 'operation' or 'request' mean from the opscenter perspective? 2) Does it mean unique row inserts across all column families, or does it mean a count of each mutation for a row? Can someone guide me? Thanks, Arun