Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Katriel Traum
Hello list,

I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6 nodes,
DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
Here's my nodetool ring:
1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105


I am going to upgrade my machine type, upgrade to 1.2 and change the 6-node
to 3 nodes. I will have to do it on the live system.
I'd appreciate any comments about my plan.
1. Decommission a 1.1 node.
2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am trying
to avoid a re-balance later on).
3. When done, decommission nodes 4-6 at DC1

Issues I've spotted:
1. I'm guessing I will have an unbalanced cluster for the time period where
I have 1.2+vnodes and 1.1 mixed.
2. Rollback is cumbersome, snapshots won't help here.

Any feedback appreciated

Katriel


Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Tupshin Harper
No.  This is not going to work.  The vnodes feature requires the murmur3
partitioner which was introduced with Cassandra 1.2.

Since you are currently using 1.1, you must be using the random
partitioner, which is not compatible with vnodes.

Because the partitioner determines the physical layout of all of your data
on disk and across the cluster, it is not possible to change partitioner
without taking some downtime to rewrite all of your data.

You should probably plan on an upgrade to 1.2 but without also switching to
vnodes at this point.

-Tupshin
On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:

 Hello list,

 I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6
 nodes, DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
 Here's my nodetool ring:
 1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
 2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
 1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
 1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
 2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
 1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
 1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
 2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
 1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105


 I am going to upgrade my machine type, upgrade to 1.2 and change the
 6-node to 3 nodes. I will have to do it on the live system.
 I'd appreciate any comments about my plan.
 1. Decommission a 1.1 node.
 2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am
 trying to avoid a re-balance later on).
 3. When done, decommission nodes 4-6 at DC1

 Issues I've spotted:
 1. I'm guessing I will have an unbalanced cluster for the time period
 where I have 1.2+vnodes and 1.1 mixed.
 2. Rollback is cumbersome, snapshots won't help here.

 Any feedback appreciated

 Katriel




Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Jean-Armel Luce
Hi,

I don't know how your application works, but I explained during the last
Cassandra Summit Europe how we did the migration from relational database
to Cassandra without any interruption of service.

You can have a look at the video http://www.youtube.com/watch?v=mefOE9K7sLI

And use the mod-dup module https://github.com/Orange-OpenSource/mod_dup

For copying data from your Cassandra 1.1 cluster to the Cassandra 1.2 cluster,
you can back up your data and then use sstableloader (in this case, you
will not have to modify the timestamps as I did for the migration from
relational to Cassandra).
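
For reference, a rough sketch of that backup/load step (keyspace, table, paths
and contact points below are placeholders, not details from this thread; worth
verifying on a test cluster that the 1.2 tools accept the 1.1-format files):

    # on a 1.1 node: snapshot the table to get a consistent set of SSTables
    nodetool snapshot mykeyspace -t migrate
    # stage the snapshot so the last two path components are keyspace/table
    mkdir -p /tmp/load/mykeyspace/mytable
    cp /var/lib/cassandra/data/mykeyspace/mytable/snapshots/migrate/* /tmp/load/mykeyspace/mytable/
    # stream into the 1.2 cluster with its sstableloader
    sstableloader -d 10.0.0.1,10.0.0.2 /tmp/load/mykeyspace/mytable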

Hope that helps !!

Jean Armel



2013/12/30 Tupshin Harper tups...@tupshin.com

 No.  This is not going to work.  The vnodes feature requires the murmur3
 partitioner which was introduced with Cassandra 1.2.

 Since you are currently using 1.1, you must be using the random
 partitioner, which is not compatible with vnodes.

 Because the partitioner determines the physical layout of all of your data
 on disk and across the cluster, it is not possible to change partitioner
 without taking some downtime to rewrite all of your data.

 You should probably plan on an upgrade to 1.2 but without also switching
 to vnodes at this point.

 -Tupshin
 On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:

 Hello list,

 I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6
 nodes, DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
 Here's my nodetool ring:
 1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
 2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
 1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
 1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
 2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
 1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
 1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
 2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
 1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105


 I am going to upgrade my machine type, upgrade to 1.2 and change the
 6-node to 3 nodes. I will have to do it on the live system.
 I'd appreciate any comments about my plan.
 1. Decommission a 1.1 node.
 2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am
 trying to avoid a re-balance later on).
 3. When done, decommission nodes 4-6 at DC1

 Issues I've spotted:
 1. I'm guessing I will have an unbalanced cluster for the time period
 where I have 1.2+vnodes and 1.1 mixed.
 2. Rollback is cumbersome, snapshots won't help here.

 Any feedback appreciated

 Katriel




Opscenter's Meaning of 'Requests'

2013-12-30 Thread Arun Kumar K
Hi Guys,

I have started understanding Cassandra and am working with it recently.

I have created two Column Families. For CF1, a write is an insert into a
unique row with all column values. Eg:

  Key Col1  Col2   Col3
  k1  c11   c12   c13
  k2  c21   c22   c23

For CF2, a write is an insert into a time-stamped column of a row. Eg:

 Key  timeCol1  timeCol2
 k1   ct11
 k1             ct12
 k2   ct21
 k2             ct22

I am using YCSB and using the thrift based *client.batch_mutate()* call. For
CF1, I send all column vals for a row through the call. For CF2, I send the
new column vals for a row.

Now say opscenter reports the write requests as, say, 1000 *operations*/sec
when the record count is, say, 1 records.

OpsCenter API docs describe 'Write Requests' as requests per second.

What does an operation/request mean from opscenter's perspective? Does it
mean unique row inserts across all column families? Or does it mean the count
of each mutation for a row?

How does opscenter identify a unique operation/request? Is it related to or
dependent on the row count or mutation count of the batch_mutate() call?

From the application's perspective an operation means something different for
each column family.

Can someone guide me?

Thanks,

Arun


Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Edward Capriolo
What is the technical limitation that vnodes need murmur? That seems uncool
for long time users?

On Monday, December 30, 2013, Jean-Armel Luce jaluc...@gmail.com wrote:
 Hi,

 I don't know how your application works, but I explained during the last
Cassandra Summit Europe how we did the migration from relational database
to Cassandra without any interruption of service.

 You can have a look at the video C* Summit EU 2013: The Cassandra
Experience at Orange

 And use the mod-dup module https://github.com/Orange-OpenSource/mod_dup

 For copying data from your Cassandra cluster 1.1 to the Cassandra cluster
1.2, you can backup your data and then use sstableloader (in this case, you
will not have to modify the timestamp as I did for the migration from
relational to Cassandra).

 Hope that helps !!

 Jean Armel



 2013/12/30 Tupshin Harper tups...@tupshin.com

 No.  This is not going to work.  The vnodes feature requires the murmur3
partitioner which was introduced with Cassandra 1.2.

 Since you are currently using 1.1, you must be using the random
partitioner, which is not compatible with vnodes.

 Because the partitioner determines the physical layout of all of your
data on disk and across the cluster, it is not possible to change
partitioner without taking some downtime to rewrite all of your data.

 You should probably plan on an upgrade to 1.2 but without also switching
to vnodes at this point.

 -Tupshin

 On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:

 Hello list,
 I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6
nodes, DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
 Here's my nodetool ring:
 1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
 2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
 1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
 1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
 2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
 1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
 1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
 2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
 1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105

 I am going to upgrade my machine type, upgrade to 1.2 and change the
6-node to 3 nodes. I will have to do it on the live system.
 I'd appreciate any comments about my plan.
 1. Decommission a 1.1 node.
 2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am
trying to avoid a re-balance later on).
 3. When done, decommission nodes 4-6 at DC1
 Issues I've spotted:
 1. I'm guessing I will have an unbalanced cluster for the time period
where I have 1.2+vnodes and 1.1 mixed.
 2. Rollback is cumbersome, snapshots won't help here.
 Any feedback appreciated
 Katriel



-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.


Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Hannu Kröger
Hi,

Random Partitioner + VNodes are a supported combo based on DataStax
documentation:
http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/architecture/architecturePartitionerAbout_c.html

How else would you even migrate from 1.1 to Vnodes since migration from one
partitioner to another is such a huge amount of work?

Cheers,
Hannu


2013/12/30 Edward Capriolo edlinuxg...@gmail.com

 What is the technical limitation that vnodes need murmur? That seems
 uncool for long time users?


 On Monday, December 30, 2013, Jean-Armel Luce jaluc...@gmail.com wrote:
  Hi,
 
  I don't know how your application works, but I explained during the last
 Cassandra Summit Europe how we did the migration from relational database
 to Cassandra without any interruption of service.
 
  You can have a look at the video C* Summit EU 2013: The Cassandra
 Experience at Orange

 
  And use the mod-dup module https://github.com/Orange-OpenSource/mod_dup
 
  For copying data from your Cassandra cluster 1.1 to the Cassandra
 cluster 1.2, you can backup your data and then use sstableloader (in this
 case, you will not have to modify the timestamp as I did for the migration
 from relational to Cassandra).
 
  Hope that helps !!
 
  Jean Armel
 
 
 
  2013/12/30 Tupshin Harper tups...@tupshin.com
 
  No.  This is not going to work.  The vnodes feature requires the
 murmur3 partitioner which was introduced with Cassandra 1.2.
 
  Since you are currently using 1.1, you must be using the random
 partitioner, which is not compatible with vnodes.
 
  Because the partitioner determines the physical layout of all of your
 data on disk and across the cluster, it is not possible to change
 partitioner without taking some downtime to rewrite all of your data.
 
  You should probably plan on an upgrade to 1.2 but without also
 switching to vnodes at this point.
 
  -Tupshin
 
  On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:
 
  Hello list,
  I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6
 nodes, DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
  Here's my nodetool ring:
  1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
  2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
  1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
  1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
  2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
  1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
  1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
  2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
  1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105
 
  I am going to upgrade my machine type, upgrade to 1.2 and change the
 6-node to 3 nodes. I will have to do it on the live system.
  I'd appreciate any comments about my plan.
  1. Decommission a 1.1 node.
  2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am
 trying to avoid a re-balance later on).
  3. When done, decommission nodes 4-6 at DC1
  Issues I've spotted:
  1. I'm guessing I will have an unbalanced cluster for the time period
 where I have 1.2+vnodes and 1.1 mixed.
  2. Rollback is cumbersome, snapshots won't help here.
  Any feedback appreciated
  Katriel
 
 

 --
 Sorry this was sent from mobile. Will do less grammar and spell check than
 usual.



Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Tupshin Harper
Sorry for the misinformation.  Totally forgot about that being supported
since I've never seen the combination actually used.  Correct that it
should work, though.
On Dec 30, 2013 2:18 PM, Hannu Kröger hkro...@gmail.com wrote:

 Hi,

 Random Partitioner + VNodes are a supported combo based on DataStax
 documentation:

 http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/architecture/architecturePartitionerAbout_c.html

 How else would you even migrate from 1.1 to Vnodes since migration from
 one partitioner to another is such a huge amount of work?

 Cheers,
 Hannu


 2013/12/30 Edward Capriolo edlinuxg...@gmail.com

 What is the technical limitation that vnodes need murmur? That seems
 uncool for long time users?


 On Monday, December 30, 2013, Jean-Armel Luce jaluc...@gmail.com wrote:
  Hi,
 
  I don't know how your application works, but I explained during the
 last Cassandra Summit Europe how we did the migration from relational
 database to Cassandra without any interruption of service.
 
  You can have a look at the video C* Summit EU 2013: The Cassandra
 Experience at Orange

 
  And use the mod-dup module https://github.com/Orange-OpenSource/mod_dup
 
  For copying data from your Cassandra cluster 1.1 to the Cassandra
 cluster 1.2, you can backup your data and then use sstableloader (in this
 case, you will not have to modify the timestamp as I did for the migration
 from relational to Cassandra).
 
  Hope that helps !!
 
  Jean Armel
 
 
 
  2013/12/30 Tupshin Harper tups...@tupshin.com
 
  No.  This is not going to work.  The vnodes feature requires the
 murmur3 partitioner which was introduced with Cassandra 1.2.
 
  Since you are currently using 1.1, you must be using the random
 partitioner, which is not compatible with vnodes.
 
  Because the partitioner determines the physical layout of all of your
 data on disk and across the cluster, it is not possible to change
 partitioner without taking some downtime to rewrite all of your data.
 
  You should probably plan on an upgrade to 1.2 but without also
 switching to vnodes at this point.
 
  -Tupshin
 
  On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:
 
  Hello list,
  I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6
 nodes, DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
  Here's my nodetool ring:
  1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
  2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
  1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
  1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
  2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
  1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
  1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
  2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
  1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105
 
  I am going to upgrade my machine type, upgrade to 1.2 and change the
 6-node to 3 nodes. I will have to do it on the live system.
  I'd appreciate any comments about my plan.
  1. Decommission a 1.1 node.
  2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am
 trying to avoid a re-balance later on).
  3. When done, decommission nodes 4-6 at DC1
  Issues I've spotted:
  1. I'm guessing I will have an unbalanced cluster for the time period
 where I have 1.2+vnodes and 1.1 mixed.
  2. Rollback is cumbersome, snapshots won't help here.
  Any feedback appreciated
  Katriel
 
 

 --
 Sorry this was sent from mobile. Will do less grammar and spell check
 than usual.





Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Tupshin Harper
OK.  Given the correction of my unfortunate partitioner error, you can, and
probably should, upgrade in place to 1.2, but with num_tokens=1 so it will
initially behave the way 1.1 (without vnodes) would. Then you can do a rolling
conversion to more than one vnode per node and, once complete, shuffle your
vnodes.

http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes-2

There should be no time where your cluster is unbalanced.
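
To make that concrete, a minimal sketch of the per-node settings involved (the
file path and the 256 token count are illustrative assumptions, not values from
this thread):

    # cassandra.yaml on each node, first pass after the 1.2 binary upgrade
    # (vnodes off, so the node behaves as it did on 1.1):
    #     num_tokens: 1
    # rolling conversion later, one node at a time, then restart:
    #     num_tokens: 256
    sudo service cassandra restart
    # once every node is on vnodes, redistribute the ranges with the shuffle
    # utility described in the blog post linked above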

-Tupshin
On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:

 Hello list,

 I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6
 nodes, DC2 has 3. This whole setup runs on AWS, running cassandra 1.1.
 Here's my nodetool ring:
 1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%    0
 2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%    1
 1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%    28356863910078205288614550619314017622
 1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%    56713727820156410577229101238628035242
 2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%    56713727820156410577229101238628035243
 1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%    85070591730234615865843651857942052863
 1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%    113427455640312821154458202477256070484
 2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%    113427455640312821154458202477256070485
 1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%    141784319550391026443072753096570088105


 I am going to upgrade my machine type, upgrade to 1.2 and change the
 6-node to 3 nodes. I will have to do it on the live system.
 I'd appreciate any comments about my plan.
 1. Decommission a 1.1 node.
 2. Bootstrap a new one in-place, cassandra 1.2, vnodes enabled (I am
 trying to avoid a re-balance later on).
 3. When done, decommission nodes 4-6 at DC1

 Issues I've spotted:
 1. I'm guessing I will have an unbalanced cluster for the time period
 where I have 1.2+vnodes and 1.1 mixed.
 2. Rollback is cumbersome, snapshots won't help here.

 Any feedback appreciated

 Katriel




CQL 3, Schema change management and best practices

2013-12-30 Thread Todd Carrico
Are there published best practices for managing Schema with CQL 3.0?

Say for bootstrapping the schema for a new feature?

Do folks query the system.schema_keyspaces on startup and create the necessary 
schema if it doesn't exist?

Or do you have one-off scripts that create schema?

Is there a more accepted way of dealing with this on an on-going basis?
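
For what it's worth, the check-and-create-on-startup option mentioned above can
be as small as this sketch (the keyspace name and replication settings are
made-up placeholders; on 2.0 you could instead rely on CREATE ... IF NOT EXISTS):

    # does the keyspace already exist? (system.schema_keyspaces on 1.2/2.0)
    if ! echo "SELECT keyspace_name FROM system.schema_keyspaces;" | cqlsh | grep -qw myapp
    then
      echo "CREATE KEYSPACE myapp WITH replication =
              {'class': 'SimpleStrategy', 'replication_factor': 3};" | cqlsh
    fi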

Thanks!

tc




Re: Upgrading 1.1 to 1.2 in-place

2013-12-30 Thread Robert Coli
On Mon, Dec 30, 2013 at 6:45 AM, Tupshin Harper tups...@tupshin.com wrote:

 OK.  Given the correction of my unfortunate partitioner error, you can,
 and probably should, upgrade in place to 1.2, but with num_tokens=1 so it
 will initially behave like 1.1 non vnodes would. Then you can do a rolling
 conversion to more than one vnode per node, and once complete, shuffle your
 vnodes.


@OP :

1) You should remove nodes, then upgrade in place to 1.2, then optionally
convert to vnodes. Bootstrapping into a mixed version cluster is, if I
understand correctly, not supported.
2) Depending on your data size and process used, using shuffle to convert
to vnodes may fill your nodes/take forever/not work. [1]
3) I have not personally heard a single report of someone successfully
using shuffle on a production cluster with real data. Try it on a
representative data size in non-production first, and if you succeed, let
the list know! :D

If I were you, I would probably :

a) question how much I really need vnodes on a 3 physical nodes per DC
cluster
b) if I decided I didn't really need vnodes, I would first decommission from
6 nodes down to 3 on 1.1
c) Then upgrade my 3 nodes to 1.2 via rolling restart
d) Then use auto_bootstrap:false to upgrade instance hardware on the nodes
by
   i) pre-copy data directory to target with rsync
   ii) configure target node with same initial_token as source node
   iii) drain and stop source node
   iv) re-copy data directory with rsync with --delete, to do final sync of
data directories
   v) start new node with auto_bootstrap:false in conf file [2]
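
A rough shell sketch of steps i) through v) above (the host name, paths and
service commands are placeholders for whatever your environment uses):

    # i) pre-copy while the old node is still serving traffic
    rsync -a /var/lib/cassandra/data/ newhost:/var/lib/cassandra/data/
    # ii) on the new node, set the same initial_token as the old node in cassandra.yaml
    # iii) on the old node, flush and stop
    nodetool drain
    sudo service cassandra stop
    # iv) final sync, deleting anything that changed since the first pass
    rsync -a --delete /var/lib/cassandra/data/ newhost:/var/lib/cassandra/data/
    # v) on the new node, start with auto_bootstrap: false set in cassandra.yaml
    sudo service cassandra start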

=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-5525
[2]
https://engineering.eventbrite.com/changing-the-ip-address-of-a-cassandra-node-with-auto_bootstrapfalse/


Re: Can't write to row key, even at ALL. Tombstones?

2013-12-30 Thread Robert Coli
On Fri, Dec 27, 2013 at 6:13 PM, Josh Dzielak j...@keen.io wrote:

 Our suspicion is that we somehow have a row level tombstone that
 is future-dated and has not gone away (we’ve lowered gc_grace_seconds in
 hope that it’d get compacted, but no luck so far, even though the sstables
 that hold the row key have all cycled since).


What version of Cassandra?

ecapriolo is right, use sstablekeys/sstable2json to inspect suspect rows in
SSTables.
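
For anyone following along, those tools look roughly like this (the path and
key are placeholders; if I recall correctly, sstable2json's -k expects the key
in the same hex form that sstablekeys prints):

    # list the row keys present in a suspect SSTable
    sstablekeys /var/lib/cassandra/data/MyKS/MyCF/MyKS-MyCF-ic-1234-Data.db
    # dump one row, which will show a row-level tombstone and its timestamp if present
    sstable2json /var/lib/cassandra/data/MyKS/MyCF/MyKS-MyCF-ic-1234-Data.db -k 6b657931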

If that's the problem, you can follow this procedure :

http://thelastpickle.com/2011/12/15/Anatomy-of-a-Cassandra-Partition/

Or you can dump/reload with sstable2json/json2sstable and filter out bad
values.

=Rob


[RELEASE] Apache Cassandra 2.0.4

2013-12-30 Thread Eric Evans


The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.0.4.

Cassandra is a highly scalable second-generation distributed database; 
You can read more here:

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix[1] release, and a recommended upgrade. As always
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!


P.S. Once again, I'm afraid that an update to the APT repository will have
to wait until Sylvain's return, my apologies.  Until such time, you can 
access an updated Debian package from my home directory[4].


[1]: http://goo.gl/6OM7dZ (CHANGES.txt)
[2]: http://goo.gl/At9VU3 (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA
[4]: http://people.apache.org/~eevans (Debian package)


--
Eric Evans
eev...@sym-link.com


signature.asc
Description: Digital signature


Re: CentOS - Could not setup cluster(snappy error)

2013-12-30 Thread Erik Forkalsud


You can add something like this to cassandra-env.sh :

JVM_OPTS="$JVM_OPTS -Dorg.xerial.snappy.tempdir=/path/that/allows/executables"
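
(And a quick way to confirm the noexec situation Edward describes below, if you
would rather remount /tmp than redirect snappy; the mount commands assume a
typical Linux setup:)

    # if the mount options include noexec, the JVM cannot run the unpacked library
    mount | grep ' /tmp '
    # remount with exec allowed (or make the change permanent in /etc/fstab)
    sudo mount -o remount,exec /tmp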



- Erik -


On 12/28/2013 08:36 AM, Edward Capriolo wrote:
Check your fstab settings. On some systems /tmp has noexec set, and 
unpacking a library into /tmp and trying to run it does not work.



On Fri, Dec 27, 2013 at 5:33 PM, Víctor Hugo Oliveira Molinar 
vhmoli...@gmail.com wrote:


Hi, I'm not being able to start a multiple node cluster in a
CentOs environment due to snappy loading error.

Here is my current setup for both machines(Node 1 and 2),
CentOs:
   CentOS release 6.5 (Final)

Java
   java version 1.7.0_25
   Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
   Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

Cassandra Version:
   2.0.3

Also, I've already replaced the current snappy
jar(snappy-java-1.0.5.jar) by the older(snappy-java-1.0.4.1.jar).
Although the following error is still happening when I try to
start the second node:


 INFO 20:25:51,879 Handshaking version with /200.219.219.51
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:312)
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
at org.xerial.snappy.Snappy.clinit(Snappy.java:44)
at
org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:79)
at
org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:66)
at

org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359)
at

org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.0.4.1-libsnappyjava.so:
/tmp/snappy-1.0.4.1-libsnappyjava.so: failed to map segment
from shared object: Operation not permitted
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1957)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1882)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1843)
at java.lang.Runtime.load0(Runtime.java:795)
at java.lang.System.load(System.java:1061)
at
org.xerial.snappy.SnappyNativeLoader.load(SnappyNativeLoader.java:39)
... 11 more
ERROR 20:25:52,201 Exception in thread Thread[WRITE-/200.219.219.51,5,main]
org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
at org.xerial.snappy.Snappy.clinit(Snappy.java:44)
at
org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:79)
at
org.xerial.snappy.SnappyOutputStream.init(SnappyOutputStream.java:66)
at

org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:359)
at

org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:150)
ERROR 20:26:22,924 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at
org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160)
at

org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:416)
at

org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:608)
at

org.apache.cassandra.service.StorageService.initServer(StorageService.java:576)
at

org.apache.cassandra.service.StorageService.initServer(StorageService.java:475)
at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346)
at

org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504)




What else can I do to fix it?
Att,
/Víctor Hugo Molinar/






Replication Latency between cross data centers

2013-12-30 Thread Senthil, Athinanthny X. -ND
I want to determine data replication latency between data centers. Are there any 
metrics available to capture it in JConsole, or other ways to measure it?







Re: Slow pre-decommission repair

2013-12-30 Thread Robert Coli
On Tue, Dec 17, 2013 at 1:46 PM, Joel Segerlind j...@kogito.se wrote:

 Thanks for the info. However, wouldn't this also affect nodetool repair -pr (although
 not as much), which I ran on the same node the other day in about 35 min?
 I cannot understand how it can take 35 min for the primary range, and 25 h
 for a full repair.


Sure, it would. I agree that this seems pathologically long, even with
vnodes.

=Rob


Re: Adding nodes to a cluster and 2 minutes rule

2013-12-30 Thread Robert Coli
On Mon, Nov 18, 2013 at 10:28 AM, Carlos Alvarez cbalva...@gmail.com wrote:

 Here
 http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/operations/ops_add_node_to_cluster_t.html
 it says that you need to wait 2 minutes between adding nodes.

 I was trying to figure out why, and how to check if after 2 minutes the
 conditions to add more nodes are met or I have to wait more... any clues?


Why includes : https://issues.apache.org/jira/browse/CASSANDRA-2434

=Rob


Re: Crash with TombstoneOverwhelmingException

2013-12-30 Thread Robert Coli
 On Wed, Dec 25, 2013 at 10:01 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 I have to hijack this thread. There seem to be many problems with the
 2.0.3 release.


+1. There is no 2.0.x release I consider production ready, even after
today's 2.0.4.

Outside of passing all unit tests, what factors into the release voting process?
 What other type of extended real world testing should be done to find bugs
 like this one that unit testing won't?


I also +1 these questions. Voting seems of limited use given the outputs of
the process.


 Here is a wacky idea that I am half serious about. Make a CMS for
 http://cassandra.apache.org that back-ends its data and reporting into
 cassandra. No release unless the Cassandra db that serves the site is upgraded
 first. :)


I agree wholeheartedly that eating one's own dogfood is informative.

=Rob


Re: MUTATION messages dropped

2013-12-30 Thread Aaron Morton
 I ended up changing memtable_flush_queue_size to be large enough to contain 
 the biggest flood I saw.
As part of the flush process the “Switch Lock” is taken to synchronise around 
the commit log. This is a reentrant Read Write lock, the flush path takes the 
write lock and write path takes the read part. When flushing a CF the write 
lock is taken, the commit log is updated, and memtable is added to the flush 
queue. If the queue is full then the write lock will be held blocking the write 
threads from taking the read lock. 

There are a few reasons why the queue may be full; the simple one is that the disk 
IO is not fast enough. Others are that the commit log segments are too small, 
there are lots of CF’s and/or lots of secondary indexes, or nodetool flush is 
called frequently. 

Increasing the size of the queue is a good work around, and the correct 
approach if you have a lot of CF’s and/or secondary indexes. 

Hope that helps.


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 21/12/2013, at 6:03 am, Ken Hancock ken.hanc...@schange.com wrote:

 I ended up changing memtable_flush_queue_size to be large enough to contain 
 the biggest flood I saw.
 
 I monitored tpstats over time using a collection script and an analysis 
 script that I wrote to figure out what my largest peaks were.  In my case, 
 all my mutation drops correlated with hitting the maximum 
 memtable_flush_queue_size and then mutations drops stopped as soon as the 
 queue size dropped below the max.
 
 I threw the scripts up on github in case they're useful...
 
 https://github.com/hancockks/tpstats
 
 
 
 
 On Fri, Dec 20, 2013 at 1:08 AM, Alexander Shutyaev shuty...@gmail.com 
 wrote:
 Thanks for you answers.
 
 srmore,
 
 We are using v2.0.0. As for GC I guess it does not correlate in our case, 
 because we had cassandra running 9 days under production load and no dropped 
 messages and I guess that during this time there were a lot of GCs.
 
 Ken,
 
 I've checked the values you indicated. Here they are:
 
 node1 6498
 node2 6476
 node3 6642
 
 I guess this is not good :) What can we do to fix this problem?
 
 
 2013/12/19 Ken Hancock ken.hanc...@schange.com
 We had issues where the number of CF families that were being flushed would 
 align and then block writes for a very brief period. If that happened when a 
 bunch of writes came in, we'd see a spike in Mutation drops.
 
 Check nodetool tpstats for FlushWriter all time blocked.
 
 
 On Thu, Dec 19, 2013 at 7:12 AM, Alexander Shutyaev shuty...@gmail.com 
 wrote:
 Hi all!
 
 We've had a problem with cassandra recently. We had 2 one-minute periods when 
 we got a lot of timeouts on the client side (the only timeouts during 9 days 
 we are using cassandra in production). In the logs we've found corresponding 
 messages saying something about MUTATION messages dropped.
 
 Now, the official faq [1] says that this is an indicator that the load is too 
 high. We've checked our monitoring and found out that 1-minute average cpu 
 load had a local peak at the time of the problem, but it was like 0.8 against 
 0.2 usual which I guess is nothing for a 2 core virtual machine. We've also 
 checked java threads - there was no peak there and their count was reasonable 
 ~240-250.
 
 Can anyone give us a hint - what should we monitor to see this high load 
 and what should we tune to make it acceptable?
 
 Thanks in advance,
 Alexander
 
 [1] http://wiki.apache.org/cassandra/FAQ#dropped_messages
 
 
 
 -- 
 Ken Hancock | System Architect, Advanced Advertising 
 SeaChange International 
 50 Nagog Park
 Acton, Massachusetts 01720
 ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC 
 Office: +1 (978) 889-3329 |  ken.hanc...@schange.com | hancockks | hancockks  
 
 
 This e-mail and any attachments may contain information which is SeaChange 
 International confidential. The information enclosed is intended only for the 
 addressees herein and may not be copied or forwarded without permission from 
 SeaChange International.
 
 
 
 
 -- 
 Ken Hancock | System Architect, Advanced Advertising 
 SeaChange International 
 50 Nagog Park
 Acton, Massachusetts 01720
 ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC 
 Office: +1 (978) 889-3329 |  ken.hanc...@schange.com | hancockks | hancockks  
 
 
 This e-mail and any attachments may contain information which is SeaChange 
 International confidential. The information enclosed is intended only for the 
 addressees herein and may not be copied or forwarded without permission from 
 SeaChange International.



Re: Astyanax - multiple key search with pagination

2013-12-30 Thread Aaron Morton
You will need to paginate the list of keys to read in your app. 

Cheers


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 21/12/2013, at 12:58 pm, Parag Patel parag.pa...@fusionts.com wrote:

 Hi,
  
 I’m using Astyanax and trying to do a search for multiple keys with pagination. 
  I tried “.getKeySlice” with a list of primary keys, but it doesn’t allow 
 pagination.  Does anyone know how to tackle this issue with Astyanax?
  
 Parag



Re: Broken pipe with Thrift

2013-12-30 Thread Aaron Morton
 One question, which is confusing , it's a server side issue or client side?
Check the server log for errors to make sure it’s not a server side issue. 
Also check if there could be something in the network that is killing long-lived 
connections. 
Check that the thrift lib the client is using is the same as the one in the 
cassandra lib on the server. 

Can you do some simple tests using cqlsh from the client machine? That would 
eliminate the client driver. 
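
For example, something as simple as this from the client host (the address,
keyspace and table are placeholders; on 1.2 cqlsh connects over Thrift, port
9160 by default):

    cqlsh 10.0.0.1 9160
    cqlsh> SELECT * FROM mykeyspace.mytable LIMIT 1;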

Hope that helps.


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 25/12/2013, at 4:35 am, Steven A Robenalt srobe...@stanford.edu wrote:

 In our case, the issue was on the server side, but since you're on the 1.2.x 
 branch, it's not likely to be the same issue. Hopefully, somone else who is 
 using the 1.2.x branch will have more insight than I do.
 
 
 On Mon, Dec 23, 2013 at 11:52 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 Hi Steven,
 One question, which is confusing , it's a server side issue or client side?
 
 -Vivek
 
 
 
 
 On Tue, Dec 24, 2013 at 12:30 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 Hi Steven,
 Thanks for your reply. We are using version 1.2.9.
 
 -Vivek
 
 
 On Tue, Dec 24, 2013 at 12:27 PM, Steven A Robenalt srobe...@stanford.edu 
 wrote:
 Hi Vivek,
 
 Which release are you using? We had an issue with 2.0.2 that was solved by a 
 fix in 2.0.3.
 
 
 On Mon, Dec 23, 2013 at 10:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 Also to add. It works absolutely fine on single node.
 
 -Vivek
 
 
 On Tue, Dec 24, 2013 at 12:15 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 Hi,
 I have a 6 node, 2DC cluster setup. I have configured consistency level to 
 QUORUM.  But very often i am getting Broken pipe
 com.impetus.client.cassandra.CassandraClientBase
 (CassandraClientBase.java:1926) - Error while executing native CQL
 query Caused by: .
 org.apache.thrift.transport.TTransportExceptionjava.net.SocketException: 
 Broken pipe
at
 org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
 at 
 org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
 at
 org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
 at
 org.apache.cassandra.thrift.Cassandra$Client.send_execute_cql3_query(Cassandra.java:1556)
 at
 org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1546)
 
 
  I am simply reading a few records from a column family (not a huge amount of data)
 
 Connection pooling and socket time out is properly configured. I have even 
 modified 
 read_request_timeout_in_ms
 request_timeout_in_ms
 write_request_timeout_in_ms  in cassandra.yaml to higher value.
 
 
 any idea? Is it an issue at server side or with client API?
 
 -Vivek
 
 
 
 
 -- 
 Steve Robenalt
 Software Architect
 HighWire | Stanford University 
 425 Broadway St, Redwood City, CA 94063 
 
 srobe...@stanford.edu 
 http://highwire.stanford.edu 
 
 
 
 
 
 
 
 
 
 
 -- 
 Steve Robenalt
 Software Architect
 HighWire | Stanford University 
 425 Broadway St, Redwood City, CA 94063 
 
 srobe...@stanford.edu 
 http://highwire.stanford.edu 
 
 
 
 
 



Re: querying time series from hadoop

2013-12-30 Thread Aaron Morton
 So now i will try to patch my cassandra 1.2.11 installation but i just wanted 
 to ask you guys first, if there is any other solution that does not involve a 
 release.
That patch in CASSANDRA-6311 is for 2.0; you cannot apply it to 1.2.

 but when i am using the java driver, the driver already uses row key for 
 token statements and i cannot execute the query above, therefore it does a 
 full scan of rows.
The  ColumnFamilyRecordReader is designed to read lots of rows, not a single 
row. 

You should be able to use the java driver from a hadoop task though to read a 
single row. Can you provide some more info on what you are doing ? 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/12/2013, at 9:56 pm, mete efk...@gmail.com wrote:

 Hello  folks, 
 
 i have come up with a basic time series cql schema based on the articles here:
 
 http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
 
 so simply put its something like:
 
 rowkey, timestamp, col3, col4 etc... 
 
 where rowkey and timestamp are compound keys.
 
 Where i am having issues is to efficiently query this data structure. 
 
 When i use cqlsh and query it is perfectly fine:
 
 select * from table where rowkey='row key' and date  xxx and date = yyy
 
 but when i am using the java driver, the driver already uses row key for 
 token statements and i cannot execute the query above, therefore it does a 
 full scan of rows.
 
 The issue that i am having is discussed here:
 
 http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig
 
 i have gone through the relevant jira issues 6151 and 6311. This behaviour is 
 supposed to be fixed in 2.0.x but so far it is not there. So now i will try 
 to patch my cassandra 1.2.11 installation but i just wanted to ask you guys 
 first, if there is any other solution that does not involve a release.
 
 i assume that this is somewhat a common use case, the articles i referred 
 seems to be old enough and unless i am missing something obvious i cannot 
 query this schema efficiently with the current version (1.2.x or 2.0.x)
 
 Does anyone has a similar issue? Any pointers are welcome.
 
 Regards
 Mete
 
 
 
 



Re: Offline migration: Random-Murmur

2013-12-30 Thread Aaron Morton
  I wrote a small (yet untested) utility, which should be able to read SSTable 
 files from disk and write them into a cassandra cluster using Hector.
Consider using the SSTableSimpleUnsortedWriter (see 
http://www.datastax.com/dev/blog/bulk-loading) to create the SSTables; you can 
then bulk load them into the destination system. This will be much faster. 


Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 29/12/2013, at 6:26 am, Edward Capriolo edlinuxg...@gmail.com wrote:

 Internally we have a tool that does get range slice on the source cluster and 
 replicates to destination.
 
 Remember that writes are idempotent. Our tool can optionally only replicate 
 data between two timestamps, allowing incremental transfers.
 
 So if you get your application writing new data to both clusters you can run 
 a range scanning program to copy all the data.
 
 On Monday, December 23, 2013, horschi hors...@gmail.com wrote:
  Interesting you even dare to do a live migration :-)
 
  Do you do all Murmur-writes with the timestamp from the Random-data? So 
  that all migrated data is written with timestamps from the past.
 
 
 
  On Mon, Dec 23, 2013 at 3:59 PM, Rahul Menon ra...@apigee.com wrote:
 
  Christian,
 
  I have been planning to migrate my cluster from random to murmur3 in a 
  similar manner. I intend to use pycassa to read and then write to the 
  newer cluster. My only concern would be ensuring the consistency of 
  already migrated data as the cluster ( with random ) would be constantly 
  serving the production traffic. I was able to do this on a non prod 
  cluster, but production is a different game.
 
  I would also like to hear more about this, especially if someone was able 
  to successfully do this.
 
  Thanks
  Rahul
 
 
  On Mon, Dec 23, 2013 at 6:45 PM, horschi hors...@gmail.com wrote:
 
  Hi list,
 
  has anyone ever tried to migrate a cluster from Random to Murmur?
 
  We would like to do so, to have a more standardized setup. I wrote a 
  small (yet untested) utility, which should be able to read SSTable files 
  from disk and write them into a cassandra cluster using Hector. This 
  migration would be offline of course and would only work for smaller 
  clusters.
 
  Any thoughts on the topic?
 
  kind regards,
  Christian
 
  PS: The reason for doing so are not performance. It is to simplify 
  operational stuff for the years to come. :-)
 
 
 
 
 -- 
 Sorry this was sent from mobile. Will do less grammar and spell check than 
 usual.



Re: cassandra monitoring

2013-12-30 Thread Aaron Morton
  JMX is doing its thing on the cassandra node and is running on port 8081
Have you set the JMX port for the cluster in Ops Centre? The default JMX port 
has been 7199 for a while.

Off the top of my head it’s in the same area where you specify the initial 
nodes in the cluster, maybe behind an “Advanced” button. 

The Ops Centre agent talks to the server to find out what JMX port it should 
use to talk to the local Cassandra install. 

Also check the logs in /var/log/datastax 
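
A quick way to confirm what the node is actually using (the paths are the usual
package locations; adjust for your install):

    # the JMX port cassandra was started with
    grep -i jmx_port /etc/cassandra/cassandra-env.sh
    # and whether anything is listening on it
    netstat -tlnp | grep 7199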

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/12/2013, at 2:21 am, Tim Dunphy bluethu...@gmail.com wrote:

 Hi all,
 
 I'm attempting to configure datastax agent so that opscenter can monitor 
 cassandra. I am running cassandra 2.0.3 and opscenter-4.0.1-2.noarch running. 
 Cassandra is running on a centos 5.9 host and the opscenter host is running 
 on centos 6.5
 
 A ps shows the agent running
 
 [root@beta:~] #ps -ef | grep datastax | grep -v grep 
 root  2166 1  0 03:31 ?00:00:00 /bin/bash 
 /usr/share/datastax-agent/bin/datastax_agent_monitor
 106   2187 1  0 03:31 ?00:01:37 
 /etc/alternatives/javahome/bin/java -Xmx40M -Xms40M 
 -Djavax.net.ssl.trustStore=/var/lib/datastax-agent/ssl/agentKeyStore 
 -Djavax.net.ssl.keyStore=/var/lib/datastax-agent/ssl/agentKeyStore 
 -Djavax.net.ssl.keyStorePassword=opscenter 
 -Dagent-pidfile=/var/run/datastax-agent/datastax-agent.pid 
 -Dlog4j.configuration=/etc/datastax-agent/log4j.properties -jar 
 datastax-agent-4.0.2-standalone.jar /var/lib/datastax-agent/conf/address.yaml
 
 And the service itself claims that it is running:
 
 [root@beta:~] #service datastax-agent status 
 datastax-agent (pid  2187) is running...
 
 On the cassandra node I have ports 61620 and 61621 open on the firewall.
 
 But if I do an lsof and look for those ports I see no activity there.
 
 [root@beta:~] #lsof -i :61620 
 [root@beta:~] #lsof -i :61621
 
 And a netstat turns up nothing either:
 [root@beta:~] #netstat -tapn | egrep (datastax|ops)
 
 
 So I guess it should come as no surprise that the opscenter interface reports 
 the node as down.
 
 And trying to reinstall the agent remotely by clicking the 'fix' link errors 
 out:
 
 g is null
 
 If you need to make changes, you can press Retry and the installations will 
 be retried.
 
 And also I got on another attempt:
 
 Cannot call method 'getRequstStatus' of null. 
 
  I'm really wondering what I'm doing wrong here, and how I can work my way out 
 of this quagmire. It would be beyond awesome to actually get this working!
 
 I've also attempted to get Cassandra Cluster Admin working. JMX is doing it's 
 thing on the cassandra node and is running on port 8081. CCA is running on 
 the same host as the opscenter.
 
 But cca gives me this error once I log in:
 
 Cassandra Cluster Admin
 
 Logout
 
 Fatal error: Uncaught exception 'TTransportException' with message 'TSocket: 
 timed out reading 4 bytes from beta.jokefire.com:9160' in 
 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TSocket.php:268 
 Stack trace: #0 
 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TTransport.php(87): 
 TSocket-read(4) #1 
 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TFramedTransport.php(135):
  TTransport-readAll(4) #2 
 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TFramedTransport.php(102):
  TFramedTransport-readFrame() #3 
 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TTransport.php(87): 
 TFramedTransport-read(4) #4 
 /var/www/Cassandra-Cluster-Admin/include/thrift/protocol/TBinaryProtocol.php(300):
  TTransport-readAll(4) #5 
 /var/www/Cassandra-Cluster-Admin/include/thrift/protocol/TBinaryProtocol.php(192):
  TBinaryProtocol-readI32(NULL) #6 
 /var/www/Cassandra-Cluster-Admin/include/thrift/packages/cassandra/cassandra.Cassandra.client.php(1017):
  TBinaryProtocol-readMessageBegin(NULL, 0, 0) # in 
 /var/www/Cassandra-Cluster-Admin/include/thrift/transport/TSocket.php on line 
 268
 
 Any advice I could get on my CCA problem and /or my Opcenter problem would be 
 great and appreciated.
 
 Thanks
 Tim
 
 -- 
 GPG me!!
 
 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
 



Re: Cleanup and old files

2013-12-30 Thread Aaron Morton
Check the SSTable is actually in use by cassandra, if it’s missing a component 
or otherwise corrupt it will not be opened at run time and so not included in 
all the fun games the other SSTables get to play. 

If you have the last startup in the logs check for an “Opening… “ message or an 
ERROR about the file. 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder  Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/12/2013, at 1:28 pm, David McNelis dmcne...@gmail.com wrote:

 I am currently running a cluster with 1.2.8.  One of my larger column 
 families on one of my nodes has keyspace-tablename-ic--Data.db with a 
 modify date in August.
 
 Since august we have added several nodes (with vnodes), with the same number 
 of vnodes as all the existing nodes.
 
 As a result, (we've since gone from 15 to 21 nodes), then ~32% of my data of 
 the original 15 nodes should have been essentially balanced out to the 6 new 
 nodes.  (1/15 + 1/16 +  1/21).
 
 When I run a cleanup, however, the old data files never get updated, and I 
 can't believe that they all should have remained the same.
 
 The only recently updated files in that data directory are secondary index 
 sstable files.  Am I doing something wrong here?  Am I thinking about this 
 wrong?
 
 David



Re: cassandra monitoring

2013-12-30 Thread Timothy P. Dunphy
Hi Aaron, 

You were right. JMX is running on port 7199, it's just the web interface that's 
on 8081. My mistake. But what I did was to delete my existing cluster and try 
to build a new cluster within opscenter and try pointing it at my existing 
cassandra node. Just one node for now, but when we go to production we plan to 
scale out. 

When I tried to install the agent with opscenter, the installation begins but 
fails with this message a few moments later: 

Install Errored: Failure installing agent on beta.jokefire.com. Error output: 
/var/lib/opscenter/ssl/agentKeyStore.pem: No such file or directory Exit code: 
1 

I was wondering where I could go from here. Also I would like to password 
protect my OpsCenter installation (assuming I can ever get any useful data into 
it). Are there any docs on how I can do that? 

Thanks 
Tim 


- Original Message -

From: Aaron Morton aa...@thelastpickle.com 
To: Cassandra User user@cassandra.apache.org 
Sent: Monday, December 30, 2013 9:19:05 PM 
Subject: Re: cassandra monitoring 




JMX is doing it's thing on the cassandra node and is running on port 8081 


Have you set the JMX port for the cluster in Ops Centre ? The default JMX port 
has been 7199 for a while. 

Off the top of the my head it’s in the same area where you specify the initial 
nodes in the cluster, maybe behind an “Advanced” button. 

The Ops Centre agent talks to the server to find out what JMX port it should 
use to talk to the local Cassandra install. 

Also check the logs in /var/log/datastax 

Cheers 

- 
Aaron Morton 
New Zealand 
@aaronmorton 

Co-Founder  Principal Consultant 
Apache Cassandra Consulting 
http://www.thelastpickle.com 

On 30/12/2013, at 2:21 am, Tim Dunphy  bluethu...@gmail.com  wrote: 



Hi all, 

I'm attempting to configure datastax agent so that opscenter can monitor 
cassandra. I am running cassandra 2.0.3 and opscenter-4.0.1-2.noarch running. 
Cassandra is running on a centos 5.9 host and the opscenter host is running on 
centos 6.5 

A ps shows the agent running 

[root@beta:~] #ps -ef | grep datastax | grep -v grep 
root 2166 1 0 03:31 ? 00:00:00 /bin/bash 
/usr/share/datastax-agent/bin/datastax_agent_monitor 
106 2187 1 0 03:31 ? 00:01:37 /etc/alternatives/javahome/bin/java -Xmx40M 
-Xms40M -Djavax.net.ssl.trustStore=/var/lib/datastax-agent/ssl/agentKeyStore 
-Djavax.net.ssl.keyStore=/var/lib/datastax-agent/ssl/agentKeyStore 
-Djavax.net.ssl.keyStorePassword=opscenter 
-Dagent-pidfile=/var/run/datastax-agent/datastax-agent.pid 
-Dlog4j.configuration=/etc/datastax-agent/log4j.properties -jar 
datastax-agent-4.0.2-standalone.jar /var/lib/datastax-agent/conf/address.yaml 

And the service itself claims that it is running: 

[root@beta:~] #service datastax-agent status 
datastax-agent (pid 2187) is running... 

On the cassandra node I have ports 61620 and 61621 open on the firewall. 

But if I do an lsof and look for those ports I see no activity there. 

[root@beta:~] #lsof -i :61620 
[root@beta:~] #lsof -i :61621 

And a netstat turns up nothing either: 
[root@beta:~] #netstat -tapn | egrep (datastax|ops) 


So I guess it should come as no surprise that the opscenter interface reports 
the node as down. 

And trying to reinstall the agent remotely by clicking the 'fix' link errors 
out: 

g is null 

If you need to make changes, you can press Retry and the installations will 
be retried. 

And also I got on another attempt: 

Cannot call method 'getRequstStatus' of null. 

I'm really wondering what I'm doing wrong here, and how I can work my way out of 
this quagmire. It would be beyond awesome to actually get this working! 

I've also attempted to get Cassandra Cluster Admin working. JMX is doing it's 
thing on the cassandra node and is running on port 8081. CCA is running on the 
same host as the opscenter. 

But cca gives me this error once I log in: 

Cassandra Cluster Admin 
Logout 

Fatal error : Uncaught exception 'TTransportException' with message 'TSocket: 
timed out reading 4 bytes from beta.jokefire.com:9160 ' in 
/var/www/Cassandra-Cluster-Admin/include/thrift/transport/TSocket.php:268 Stack 
trace: #0 
/var/www/Cassandra-Cluster-Admin/include/thrift/transport/TTransport.php(87): 
TSocket-read(4) #1 
/var/www/Cassandra-Cluster-Admin/include/thrift/transport/TFramedTransport.php(135):
 TTransport-readAll(4) #2 
/var/www/Cassandra-Cluster-Admin/include/thrift/transport/TFramedTransport.php(102):
 TFramedTransport-readFrame() #3 
/var/www/Cassandra-Cluster-Admin/include/thrift/transport/TTransport.php(87): 
TFramedTransport-read(4) #4 
/var/www/Cassandra-Cluster-Admin/include/thrift/protocol/TBinaryProtocol.php(300):
 TTransport-readAll(4) #5 
/var/www/Cassandra-Cluster-Admin/include/thrift/protocol/TBinaryProtocol.php(192):
 TBinaryProtocol-readI32(NULL) #6 
/var/www/Cassandra-Cluster-Admin/include/thrift/packages/cassandra/cassandra.Cassandra.client.php(1017):
 TBinaryProtocol-readMessageBegin(NULL, 0, 0) # in 

Re: Cleanup and old files

2013-12-30 Thread David McNelis
I see the SSTable in this log statement: 'Stream context metadata' (along
with a bunch of other files) but I do not see it in the list of files
Opening (which I see quite a bit of, as expected).

Safe to try moving that file off server (to a backup location)?  If I tried
this, would I want to shut down the node first and monitor startup to see
if it all of a sudden is 'missing' something / throws an error then?


On Mon, Dec 30, 2013 at 9:26 PM, Aaron Morton aa...@thelastpickle.com wrote:

 Check the SSTable is actually in use by cassandra, if it’s missing a
 component or otherwise corrupt it will not be opened at run time and so not
 included in all the fun games the other SSTables get to play.

 If you have the last startup in the logs check for an “Opening… “ message
 or an ERROR about the file.

 Cheers

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder  Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 30/12/2013, at 1:28 pm, David McNelis dmcne...@gmail.com wrote:

 I am currently running a cluster with 1.2.8.  One of my larger column
 families on one of my nodes has keyspace-tablename-ic--Data.db with a
 modify date in August.

 Since august we have added several nodes (with vnodes), with the same
 number of vnodes as all the existing nodes.

 As a result, (we've since gone from 15 to 21 nodes), then ~32% of my data
 of the original 15 nodes should have been essentially balanced out to the 6
 new nodes.  (1/15 + 1/16 +  1/21).

 When I run a cleanup, however, the old data files never get updated, and I
 can't believe that they all should have remained the same.

 The only recently updated files in that data directory are secondary index
 sstable files.  Am I doing something wrong here?  Am I thinking about this
 wrong?

 David





Opscenter Metrics

2013-12-30 Thread Arun Kumar K
Hi guys,

I am using YCSB and using thrift based *client.batch_mutate()* call.

Now say opscenter reports the write requests as, say, 1000 *operations*/sec
when the record count is, say, 1 records.

OpsCenter API docs describe 'Write Requests' as requests per second.

1. What does an 'operation or request' mean from opscenter's perspective?

2. Does it mean unique row inserts across all column families? Or does it mean
   the count of each mutation for a row?

Can someone guide me?


Thanks,

 Arun