Re: failure node rejoin

2016-10-20 Thread Ben Slater
OK. Are you certain your tests don’t generate any overlapping inserts (by
PK)? Cassandra basically treats any inserts with the same primary key as
updates (so 1000 insert operations may not necessarily result in 1000 rows
in the DB).
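
For example, a minimal sketch of that upsert behaviour (the table and values
are illustrative, not from your test):

cqlsh> CREATE TABLE testkeyspace.t (id int PRIMARY KEY, val text);
cqlsh> INSERT INTO testkeyspace.t (id, val) VALUES (1, 'a');
cqlsh> INSERT INTO testkeyspace.t (id, val) VALUES (1, 'b');  -- same PK: overwrites the first insert
cqlsh> SELECT count(*) FROM testkeyspace.t;                   -- returns 1, not 2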

On Fri, 21 Oct 2016 at 16:30 Yuji Ito  wrote:

> thanks Ben,
>
> > 1) At what stage did you have (or expect to have) 1000 rows (and have
> the mismatch between actual and expected) - at the end of operation (2) or
> after operation (3)?
>
> after operation 3), at operation 4) which reads all rows by cqlsh with
> CL.SERIAL
>
> > 2) What replication factor and replication strategy is used by the test
> keyspace? What consistency level is used by your operations?
>
> - create keyspace testkeyspace WITH REPLICATION =
> {'class':'SimpleStrategy','replication_factor':3};
> - consistency level is SERIAL
>
>
> On Fri, Oct 21, 2016 at 12:04 PM, Ben Slater 
> wrote:
>
>
> A couple of questions:
> 1) At what stage did you have (or expect to have) 1000 rows (and have the
> mismatch between actual and expected) - at the end of operation (2) or
> after operation (3)?
> 2) What replication factor and replication strategy is used by the test
> keyspace? What consistency level is used by your operations?
>
>
> Cheers
> Ben
>
> On Fri, 21 Oct 2016 at 13:57 Yuji Ito  wrote:
>
> Thanks Ben,
>
> I tried to run a rebuild and repair after the failure node rejoined the
> cluster as a "new" node with -Dcassandra.replace_address_first_boot.
> The failure node could rejoin and I could read all rows successfully.
> (Sometimes a repair failed because the node could not access another node.
> If it failed, I retried the repair.)
>
> But some rows were lost after my destructive test had been repeated for
> about 5-6 hours.
> After the test inserted 1000 rows, there were only 953 rows at the end of
> the test.
>
> My destructive test:
> - each C* node is killed & restarted at random intervals (within about
> 5 min) throughout this test
> 1) truncate all tables
> 2) insert initial rows (check if all rows are inserted successfully)
> 3) request a lot of read/write to random rows for about 30min
> 4) check all rows
> If operation 1), 2) or 4) fails due to a C* failure, the test retries the
> operation.
>
> Does anyone have a similar problem?
> What causes the data loss?
> Does the test need any operation when a C* node is restarted? (Currently, I
> just restart the C* process.)
>
> Regards,
>
>
> On Tue, Oct 18, 2016 at 2:18 PM, Ben Slater 
> wrote:
>
> OK, that’s a bit more unexpected (to me at least) but I think the solution
> of running a rebuild or repair still applies.
>
> On Tue, 18 Oct 2016 at 15:45 Yuji Ito  wrote:
>
> Thanks Ben, Jeff
>
> Sorry that my explanation confused you.
>
> Only node1 is the seed node.
> Node2 whose C* data is deleted is NOT a seed.
>
> I restarted the failure node (node2) after restarting the seed node (node1).
> Restarting node2 succeeded without the exception.
> (I couldn't restart node2 before restarting node1, as expected.)
>
> Regards,
>
>
> On Tue, Oct 18, 2016 at 1:06 PM, Jeff Jirsa 
> wrote:
>
> The unstated "problem" here is that node1 is a seed, which implies
> auto_bootstrap=false (can't bootstrap a seed, so it was almost certainly
> setup to start without bootstrapping).
>
> That means once the data dir is wiped, it's going to start again without a
> bootstrap, and make a single node cluster or join an existing cluster if
> the seed list is valid
>
>
>
> --
> Jeff Jirsa
>
>
> On Oct 17, 2016, at 8:51 PM, Ben Slater 
> wrote:
>
> OK, sorry - I think I understand what you are asking now.
>
> However, I’m still a little confused by your description. I think your
> scenario is:
> 1) Stop C* on all nodes in a cluster (Nodes A,B,C)
> 2) Delete all data from Node A
> 3) Restart Node A
> 4) Restart Node B,C
>
> Is this correct?
>
> If so, this isn’t a scenario I’ve tested/seen but I’m not surprised Node A
> starts successfully as there are no running nodes to tell it via gossip that
> it shouldn’t start up without the “replaces” flag.
>
> I think the right way to recover in this scenario is to run a nodetool
> rebuild on Node A after the other two nodes are running. You could
> theoretically also run a repair (which would be good practice after a weird
> failure scenario like this) but rebuild will probably be quicker given you
> know all the data needs to be re-streamed.
>
> Cheers
> Ben
>
> On Tue, 18 Oct 2016 at 14:03 Yuji Ito  wrote:
>
> Thank you Ben, Yabin
>
> I understood the rejoin was illegal.
> I expected this rejoin would fail with the exception.
> But I could add the failure node to the cluster without the
> exception after 2) and 3).
> I want to know why the rejoin succeeds. Should the exception happen?
>
> Regards,
>
>
> On Tue, Oct 18, 2016 at 1:51 AM, Yabin Meng 

Re: failure node rejoin

2016-10-20 Thread Yuji Ito
thanks Ben,

> 1) At what stage did you have (or expect to have) 1000 rows (and have the
mismatch between actual and expected) - at the end of operation (2) or
after operation (3)?

after operation 3), at operation 4) which reads all rows by cqlsh with
CL.SERIAL

> 2) What replication factor and replication strategy is used by the test
keyspace? What consistency level is used by your operations?

- create keyspace testkeyspace WITH REPLICATION =
{'class':'SimpleStrategy','replication_factor':3};
- consistency level is SERIAL
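
(For reference, the check in operation 4) is roughly the following; the table
name is a placeholder:)

cqlsh> CONSISTENCY SERIAL;
cqlsh> SELECT count(*) FROM testkeyspace.testtable;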


On Fri, Oct 21, 2016 at 12:04 PM, Ben Slater 
wrote:

>
> A couple of questions:
> 1) At what stage did you have (or expect to have) 1000 rows (and have the
> mismatch between actual and expected) - at the end of operation (2) or
> after operation (3)?
> 2) What replication factor and replication strategy is used by the test
> keyspace? What consistency level is used by your operations?
>
>
> Cheers
> Ben
>
> On Fri, 21 Oct 2016 at 13:57 Yuji Ito  wrote:
>
>> Thanks Ben,
>>
>> I tried to run a rebuild and repair after the failure node rejoined the
>> cluster as a "new" node with -Dcassandra.replace_address_first_boot.
>> The failure node could rejoin and I could read all rows successfully.
>> (Sometimes a repair failed because the node could not access another node.
>> If it failed, I retried the repair.)
>>
>> But some rows were lost after my destructive test had been repeated for
>> about 5-6 hours.
>> After the test inserted 1000 rows, there were only 953 rows at the end of
>> the test.
>>
>> My destructive test:
>> - each C* node is killed & restarted at random intervals (within about
>> 5 min) throughout this test
>> 1) truncate all tables
>> 2) insert initial rows (check if all rows are inserted successfully)
>> 3) request a lot of read/write to random rows for about 30min
>> 4) check all rows
>> If operation 1), 2) or 4) fails due to a C* failure, the test retries the
>> operation.
>>
>> Does anyone have a similar problem?
>> What causes the data loss?
>> Does the test need any operation when a C* node is restarted? (Currently, I
>> just restart the C* process.)
>>
>> Regards,
>>
>>
>> On Tue, Oct 18, 2016 at 2:18 PM, Ben Slater 
>> wrote:
>>
>> OK, that’s a bit more unexpected (to me at least) but I think the
>> solution of running a rebuild or repair still applies.
>>
>> On Tue, 18 Oct 2016 at 15:45 Yuji Ito  wrote:
>>
>> Thanks Ben, Jeff
>>
>> Sorry that my explanation confused you.
>>
>> Only node1 is the seed node.
>> Node2 whose C* data is deleted is NOT a seed.
>>
>> I restarted the failure node (node2) after restarting the seed node (node1).
>> Restarting node2 succeeded without the exception.
>> (I couldn't restart node2 before restarting node1, as expected.)
>>
>> Regards,
>>
>>
>> On Tue, Oct 18, 2016 at 1:06 PM, Jeff Jirsa 
>> wrote:
>>
>> The unstated "problem" here is that node1 is a seed, which implies
>> auto_bootstrap=false (can't bootstrap a seed, so it was almost certainly
>> setup to start without bootstrapping).
>>
>> That means once the data dir is wiped, it's going to start again without
>> a bootstrap, and make a single node cluster or join an existing cluster if
>> the seed list is valid
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Oct 17, 2016, at 8:51 PM, Ben Slater 
>> wrote:
>>
>> OK, sorry - I think I understand what you are asking now.
>>
>> However, I’m still a little confused by your description. I think your
>> scenario is:
>> 1) Stop C* on all nodes in a cluster (Nodes A,B,C)
>> 2) Delete all data from Node A
>> 3) Restart Node A
>> 4) Restart Node B,C
>>
>> Is this correct?
>>
>> If so, this isn’t a scenario I’ve tested/seen but I’m not surprised Node
>> A starts successfully as there are no running nodes to tell it via gossip
>> that it shouldn’t start up without the “replaces” flag.
>>
>> I think the right way to recover in this scenario is to run a nodetool
>> rebuild on Node A after the other two nodes are running. You could
>> theoretically also run a repair (which would be good practice after a weird
>> failure scenario like this) but rebuild will probably be quicker given you
>> know all the data needs to be re-streamed.
>>
>> Cheers
>> Ben
>>
>> On Tue, 18 Oct 2016 at 14:03 Yuji Ito  wrote:
>>
>> Thank you Ben, Yabin
>>
>> I understood the rejoin was illegal.
>> I expected this rejoin would fail with the exception.
>> But I could add the failure node to the cluster without the
>> exception after 2) and 3).
>> I want to know why the rejoin succeeds. Should the exception happen?
>>
>> Regards,
>>
>>
>> On Tue, Oct 18, 2016 at 1:51 AM, Yabin Meng  wrote:
>>
>> The exception you run into is expected behavior. This is because as Ben
>> pointed out, when you delete everything (including system schemas), C*
>> cluster thinks you're bootstrapping a new node. However,  

Re: How to throttle up/down compactions without a restart

2016-10-20 Thread Jeff Jirsa
You can also set concurrent compactors through JMX – in the CompactionManager 
mbean, you have CoreCompactionThreads and MaxCompactionThreads – you can adjust 
them at runtime, but do it in an order such that Max is always higher than Core
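
For example, with a generic JMX client such as jmxterm. This is only a sketch;
the attribute names (CoreCompactorThreads / MaximumCompactorThreads on some
versions) should be verified against the CompactionManager MBean on your build:

$ java -jar jmxterm-uber.jar -l localhost:7199
$> bean org.apache.cassandra.db:type=CompactionManager
$> set MaximumCompactorThreads 4
$> set CoreCompactorThreads 4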

 

 

 

From: kurt Greaves 
Reply-To: "user@cassandra.apache.org" 
Date: Thursday, October 20, 2016 at 9:54 PM
To: "user@cassandra.apache.org" , 
"thomasjul...@zoho.com" 
Subject: Re: How to throttle up/down compactions without a restart

 

You can throttle compactions using nodetool setcompactionthroughput x, where x
is in MB/s. If you're using 2.2 or later this applies immediately to all
running compactions, otherwise it applies only to any "new" compactions. You
will want to be careful of allowing compactions to utilise too much disk
bandwidth. If you need to alter this in peak periods you may be starting to
overload your nodes with writes, or potentially something else is not ideal,
like memtables flushing too frequently.


Kurt Greaves 

k...@instaclustr.com

www.instaclustr.com

 

On 21 October 2016 at 04:41, Thomas Julian  wrote:

Hello,

I was going through this presentation and Slide-55 caught my attention,
i.e. "Throttled down compactions during high load period, throttled up during
low load period".

Can we throttle down compactions without a restart?

If this can be done, what are all the parameters (JMX?) to work with? How to
implement this for the below compaction strategies:

1. Size Tiered Compaction Strategy
2. Leveled Compaction Strategy

Any help is much appreciated.

Best Regards,
Julian.



Re: How to throttle up/down compactions without a restart

2016-10-20 Thread kurt Greaves
You can throttle compactions using nodetool setcompactionthroughput x, where x
is in MB/s. If you're using 2.2 or later this applies immediately
to all running compactions, otherwise it applies only to any "new" compactions.
You will want to be careful of allowing compactions to utilise too much
disk bandwidth. If you need to alter this in peak periods you may be
starting to overload your nodes with writes, or potentially something else
is not ideal, like memtables flushing too frequently.
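
For example (the value is illustrative; a value of 0 disables throttling):

nodetool setcompactionthroughput 16
nodetool getcompactionthroughput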

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 21 October 2016 at 04:41, Thomas Julian  wrote:

> Hello,
>
>
> I was going through this presentation and Slide-55 caught my attention,
> i.e. "Throttled down compactions during high load period, throttled up
> during low load period".
>
> Can we throttle down compactions without a restart?
>
> If this can be done, what are all the parameters(JMX?) to work with? How
> to implement this for below Compaction Strategies.
>
>1. Size Tiered Compaction Strategy.
>2. Leveled Compaction Strategy
>
> Any help is much appreciated.
>
> Best Regards,
> Julian.
>
>
>
>
>


How to throttle up/down compactions without a restart

2016-10-20 Thread Thomas Julian
Hello,

I was going through this presentation and Slide-55 caught my attention,
i.e. "Throttled down compactions during high load period, throttled up during
low load period".

Can we throttle down compactions without a restart?

If this can be done, what are all the parameters (JMX?) to work with? How to
implement this for the below compaction strategies:

1. Size Tiered Compaction Strategy
2. Leveled Compaction Strategy

Any help is much appreciated.

Best Regards,
Julian.

Re: Cluster Maintenance Mishap

2016-10-20 Thread kurt Greaves
On 20 October 2016 at 20:58, Branton Davis 
wrote:

> Would they have taken on the token ranges of the original nodes or acted
> like new nodes and got new token ranges?  If the latter, is it possible
> that any data moved from the healthy nodes to the "new" nodes or
> would restarting them with the original data (and repairing) put
> the cluster's token ranges back into a normal state?


It sounds like you stopped them before they completed joining, so you
should have nothing to worry about. If they had finished joining, you would
see them marked as DN from the other nodes in the cluster. If you did stop
them before they finished joining, they wouldn't have assumed the token
ranges and you shouldn't have any issues.

You can just copy the original data back (including system tables) and they
should assume their own ranges again, and then you can repair to fix any
missing replicas.

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com


Re: Cluster Maintenance Mishap

2016-10-20 Thread Jeremiah D Jordan
The easiest way to figure out what happened is to examine the system log.  It 
will tell you what happened.  But I’m pretty sure your nodes got new tokens 
during that time.

If you want to get back the data inserted during the 2 hours you could use 
sstableloader to send all the data from the /var/data/cassandra_new/cassandra/* 
folders back into the cluster if you still have it.
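
For example, per table directory (the hosts and the keyspace/table path
components are placeholders; sstableloader expects a directory whose last two
components are the keyspace and table names):

sstableloader -d 10.0.0.1,10.0.0.2 /var/data/cassandra_new/cassandra/data/my_keyspace/my_table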

-Jeremiah


> On Oct 20, 2016, at 3:58 PM, Branton Davis  wrote:
> 
> Howdy folks.  I asked some about this in IRC yesterday, but we're looking to 
> hopefully confirm a couple of things for our sanity.
> 
> Yesterday, I was performing an operation on a 21-node cluster (vnodes, 
> replication factor 3, NetworkTopologyStrategy, and the nodes are balanced 
> across 3 AZs on AWS EC2).  The plan was to swap each node's existing 1TB 
> volume (where all cassandra data, including the commitlog, is stored) with a 
> 2TB volume.  The plan for each node (one at a time) was basically:
> - rsync while the node is live (repeated until there were only minor
> differences from new data)
> - stop cassandra on the node
> - rsync again
> - replace the old volume with the new
> - start cassandra
> However, there was a bug in the rsync command.  Instead of copying the 
> contents of /var/data/cassandra to /var/data/cassandra_new, it copied it to 
> /var/data/cassandra_new/cassandra.  So, when cassandra was started after the 
> volume swap, there was some behavior that was similar to bootstrapping a new 
> node (data started streaming in from other nodes).  But there was also some 
> behavior that was similar to a node replacement (nodetool status showed the 
> same IP address, but a different host ID).  This happened with 3 nodes (one 
> from each AZ).  The nodes had received 1.4GB, 1.2GB, and 0.6GB of data 
> (whereas the normal load for a node is around 500-600GB).
> 
> The cluster was in this state for about 2 hours, at which point cassandra was 
> stopped on them.  Later, I moved the data from the original volumes back into 
> place (so, should be the original state before the operation) and started 
> cassandra back up.
> 
> Finally, the questions.  We've accepted the potential loss of new data within 
> the two hours, but our primary concern now is what was happening with the 
> bootstrapping nodes.  Would they have taken on the token ranges of the 
> original nodes or acted like new nodes and got new token ranges?  If the 
> latter, is it possible that any data moved from the healthy nodes to the 
> "new" nodes or would restarting them with the original data (and repairing) 
> put the cluster's token ranges back into a normal state?
> 
> Hopefully that was all clear.  Thanks in advance for any info!



Re: Cluster Maintenance Mishap

2016-10-20 Thread Branton Davis
I guess I'm either not understanding how that answers the question
and/or I've just done a terrible job of asking it.  I'll sleep on it and
maybe I'll think of a better way to describe it tomorrow ;)

On Thu, Oct 20, 2016 at 8:45 PM, Yabin Meng  wrote:

> I believe you're using VNodes (because token range change doesn't make
> sense for single-token setup unless you change it explicitly). If you
> bootstrap a new node with VNodes, I think the way that the token ranges are
> assigned to the node is random (I'm not 100% sure here, but should be so
> logically). If so, the ownership of the data that each node is responsible
> for will be changed. The part of the data that doesn't belong to the node
> under the new ownership, however, will still be kept on that node.
> Cassandra won't remove it automatically unless you run "nodetool cleanup".
> So to answer your question, I don't think the data have been moved away.
> More likely you have some extra duplicate data here.
>
> Yabin
>
> On Thu, Oct 20, 2016 at 6:41 PM, Branton Davis  > wrote:
>
>> Thanks for the response, Yabin.  However, if there's an answer to my
>> question here, I'm apparently too dense to see it ;)
>>
>> I understand that, since the system keyspace data was not there, it
>> started bootstrapping.  What's not clear is if they took over the token
>> ranges of the previous nodes or got new token ranges.  I'm mainly
>> concerned about the latter.  We've got the nodes back in place with the
>> original data, but the fear is that some data may have been moved off of
>> other nodes.  I think that this is very unlikely, but I'm just looking for
>> confirmation.
>>
>>
>> On Thursday, October 20, 2016, Yabin Meng  wrote:
>>
>>> Most likely the issue is caused by the fact that when you move the data,
>>> you move the system keyspace data away as well. Meanwhile, due to the error
>>> of data being copied into a different location than what C* is expecting,
>>> when C* starts, it can not find the system metadata info and therefore
>>> tries to start as a fresh new node. If you keep keyspace data in the right
>>> place, you should see all old info. as expected.
>>>
>>> I've seen a few such occurrences from customers. As a best practice, I
>>> would always suggest to totally separate Cassandra application data
>>> directory from system keyspace directory (e.g. they don't share common
>>> parent folder, and such).
>>>
>>> Regards,
>>>
>>> Yabin
>>>
>>> On Thu, Oct 20, 2016 at 4:58 PM, Branton Davis <
>>> branton.da...@spanning.com> wrote:
>>>
 Howdy folks.  I asked some about this in IRC yesterday, but we're
 looking to hopefully confirm a couple of things for our sanity.

 Yesterday, I was performing an operation on a 21-node cluster (vnodes,
 replication factor 3, NetworkTopologyStrategy, and the nodes are balanced
 across 3 AZs on AWS EC2).  The plan was to swap each node's existing
 1TB volume (where all cassandra data, including the commitlog, is stored)
 with a 2TB volume.  The plan for each node (one at a time) was
 basically:

- rsync while the node is live (repeated until there were
only minor differences from new data)
- stop cassandra on the node
- rsync again
- replace the old volume with the new
- start cassandra

 However, there was a bug in the rsync command.  Instead of copying the
 contents of /var/data/cassandra to /var/data/cassandra_new, it copied it to
 /var/data/cassandra_new/cassandra.  So, when cassandra was started
 after the volume swap, there was some behavior that was similar to
 bootstrapping a new node (data started streaming in from other nodes).
  But there was also some behavior that was similar to a node
 replacement (nodetool status showed the same IP address, but a
 different host ID).  This happened with 3 nodes (one from each AZ).  The
 nodes had received 1.4GB, 1.2GB, and 0.6GB of data (whereas the normal load
 for a node is around 500-600GB).

 The cluster was in this state for about 2 hours, at which
 point cassandra was stopped on them.  Later, I moved the data from the
 original volumes back into place (so, should be the original state before
 the operation) and started cassandra back up.

 Finally, the questions.  We've accepted the potential loss of new data
 within the two hours, but our primary concern now is what was happening
 with the bootstrapping nodes.  Would they have taken on the token
 ranges of the original nodes or acted like new nodes and got new token
 ranges?  If the latter, is it possible that any data moved from the
 healthy nodes to the "new" nodes or would restarting them with the original
 data (and repairing) put the cluster's token ranges back into a normal
 state?

 Hopefully that was all clear.  Thanks in advance for any info!


Re: failure node rejoin

2016-10-20 Thread Ben Slater
A couple of questions:
1) At what stage did you have (or expect to have) 1000 rows (and have the
mismatch between actual and expected) - at the end of operation (2) or
after operation (3)?
2) What replication factor and replication strategy is used by the test
keyspace? What consistency level is used by your operations?


Cheers
Ben

On Fri, 21 Oct 2016 at 13:57 Yuji Ito  wrote:

> Thanks Ben,
>
> I tried to run a rebuild and repair after the failure node rejoined the
> cluster as a "new" node with -Dcassandra.replace_address_first_boot.
> The failure node could rejoin and I could read all rows successfully.
> (Sometimes a repair failed because the node could not access another node.
> If it failed, I retried the repair.)
>
> But some rows were lost after my destructive test had been repeated for
> about 5-6 hours.
> After the test inserted 1000 rows, there were only 953 rows at the end of
> the test.
>
> My destructive test:
> - each C* node is killed & restarted at random intervals (within about
> 5 min) throughout this test
> 1) truncate all tables
> 2) insert initial rows (check if all rows are inserted successfully)
> 3) request a lot of read/write to random rows for about 30min
> 4) check all rows
> If operation 1), 2) or 4) fails due to a C* failure, the test retries the
> operation.
>
> Does anyone have a similar problem?
> What causes the data loss?
> Does the test need any operation when a C* node is restarted? (Currently, I
> just restart the C* process.)
>
> Regards,
>
>
> On Tue, Oct 18, 2016 at 2:18 PM, Ben Slater 
> wrote:
>
> OK, that’s a bit more unexpected (to me at least) but I think the solution
> of running a rebuild or repair still applies.
>
> On Tue, 18 Oct 2016 at 15:45 Yuji Ito  wrote:
>
> Thanks Ben, Jeff
>
> Sorry that my explanation confused you.
>
> Only node1 is the seed node.
> Node2 whose C* data is deleted is NOT a seed.
>
> I restarted the failure node (node2) after restarting the seed node (node1).
> Restarting node2 succeeded without the exception.
> (I couldn't restart node2 before restarting node1, as expected.)
>
> Regards,
>
>
> On Tue, Oct 18, 2016 at 1:06 PM, Jeff Jirsa 
> wrote:
>
> The unstated "problem" here is that node1 is a seed, which implies
> auto_bootstrap=false (can't bootstrap a seed, so it was almost certainly
> setup to start without bootstrapping).
>
> That means once the data dir is wiped, it's going to start again without a
> bootstrap, and make a single node cluster or join an existing cluster if
> the seed list is valid
>
>
>
> --
> Jeff Jirsa
>
>
> On Oct 17, 2016, at 8:51 PM, Ben Slater 
> wrote:
>
> OK, sorry - I think I understand what you are asking now.
>
> However, I’m still a little confused by your description. I think your
> scenario is:
> 1) Stop C* on all nodes in a cluster (Nodes A,B,C)
> 2) Delete all data from Node A
> 3) Restart Node A
> 4) Restart Node B,C
>
> Is this correct?
>
> If so, this isn’t a scenario I’ve tested/seen but I’m not surprised Node A
> starts successfully as there are no running nodes to tell it via gossip that
> it shouldn’t start up without the “replaces” flag.
>
> I think the right way to recover in this scenario is to run a nodetool
> rebuild on Node A after the other two nodes are running. You could
> theoretically also run a repair (which would be good practice after a weird
> failure scenario like this) but rebuild will probably be quicker given you
> know all the data needs to be re-streamed.
>
> Cheers
> Ben
>
> On Tue, 18 Oct 2016 at 14:03 Yuji Ito  wrote:
>
> Thank you Ben, Yabin
>
> I understood the rejoin was illegal.
> I expected this rejoin would fail with the exception.
> But I could add the failure node to the cluster without the
> exception after 2) and 3).
> I want to know why the rejoin succeeds. Should the exception happen?
>
> Regards,
>
>
> On Tue, Oct 18, 2016 at 1:51 AM, Yabin Meng  wrote:
>
> The exception you run into is expected behavior. This is because as Ben
> pointed out, when you delete everything (including system schemas), C*
> cluster thinks you're bootstrapping a new node. However,  node2's IP is
> still in gossip and this is why you see the exception.
>
> I'm not clear on the reasoning why you need to delete the C* data directory.
> That is a dangerous action, especially considering that you delete the system
> schemas. If the failure node is gone for a while, what you need
> to do is remove the node first before doing the "rejoin".
>
> Cheers,
>
> Yabin
>
> On Mon, Oct 17, 2016 at 1:48 AM, Ben Slater 
> wrote:
>
> To cassandra, the node where you deleted the files looks like a brand new
> machine. It doesn’t automatically rebuild machines to prevent accidental
> replacement. You need to tell it to build the “new” machines as a
> replacement for the “old” machine with that IP by setting 
> 

Re: failure node rejoin

2016-10-20 Thread Yuji Ito
Thanks Ben,

I tried to run a rebuild and repair after the failure node rejoined the
cluster as a "new" node with -Dcassandra.replace_address_first_boot.
The failure node could rejoin and I could read all rows successfully.
(Sometimes a repair failed because the node could not access another node. If it
failed, I retried the repair.)
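
(For reference, the rejoin-and-repair sequence was roughly the following; the
address and keyspace name are placeholders, and the flag is normally added via
cassandra-env.sh:)

JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.2"
nodetool rebuild
nodetool repair -pr testkeyspace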

But some rows were lost after my destructive test had been repeated for about
5-6 hours.
After the test inserted 1000 rows, there were only 953 rows at the end of
the test.

My destructive test:
- each C* node is killed & restarted at random intervals (within about 5
min) throughout this test
1) truncate all tables
2) insert initial rows (check if all rows are inserted successfully)
3) request a lot of read/write to random rows for about 30min
4) check all rows
If operation 1), 2) or 4) fails due to a C* failure, the test retries the
operation.

Does anyone have a similar problem?
What causes the data loss?
Does the test need any operation when a C* node is restarted? (Currently, I
just restart the C* process.)

Regards,


On Tue, Oct 18, 2016 at 2:18 PM, Ben Slater 
wrote:

> OK, that’s a bit more unexpected (to me at least) but I think the solution
> of running a rebuild or repair still applies.
>
> On Tue, 18 Oct 2016 at 15:45 Yuji Ito  wrote:
>
>> Thanks Ben, Jeff
>>
>> Sorry that my explanation confused you.
>>
>> Only node1 is the seed node.
>> Node2 whose C* data is deleted is NOT a seed.
>>
>> I restarted the failure node (node2) after restarting the seed node (node1).
>> Restarting node2 succeeded without the exception.
>> (I couldn't restart node2 before restarting node1, as expected.)
>>
>> Regards,
>>
>>
>> On Tue, Oct 18, 2016 at 1:06 PM, Jeff Jirsa 
>> wrote:
>>
>> The unstated "problem" here is that node1 is a seed, which implies
>> auto_bootstrap=false (can't bootstrap a seed, so it was almost certainly
>> setup to start without bootstrapping).
>>
>> That means once the data dir is wiped, it's going to start again without
>> a bootstrap, and make a single node cluster or join an existing cluster if
>> the seed list is valid
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Oct 17, 2016, at 8:51 PM, Ben Slater 
>> wrote:
>>
>> OK, sorry - I think I understand what you are asking now.
>>
>> However, I’m still a little confused by your description. I think your
>> scenario is:
>> 1) Stop C* on all nodes in a cluster (Nodes A,B,C)
>> 2) Delete all data from Node A
>> 3) Restart Node A
>> 4) Restart Node B,C
>>
>> Is this correct?
>>
>> If so, this isn’t a scenario I’ve tested/seen but I’m not surprised Node
>> A starts successfully as there are no running nodes to tell it via gossip
>> that it shouldn’t start up without the “replaces” flag.
>>
>> I think the right way to recover in this scenario is to run a nodetool
>> rebuild on Node A after the other two nodes are running. You could
>> theoretically also run a repair (which would be good practice after a weird
>> failure scenario like this) but rebuild will probably be quicker given you
>> know all the data needs to be re-streamed.
>>
>> Cheers
>> Ben
>>
>> On Tue, 18 Oct 2016 at 14:03 Yuji Ito  wrote:
>>
>> Thank you Ben, Yabin
>>
>> I understood the rejoin was illegal.
>> I expected this rejoin would fail with the exception.
>> But I could add the failure node to the cluster without the
>> exception after 2) and 3).
>> I want to know why the rejoin succeeds. Should the exception happen?
>>
>> Regards,
>>
>>
>> On Tue, Oct 18, 2016 at 1:51 AM, Yabin Meng  wrote:
>>
>> The exception you run into is expected behavior. This is because as Ben
>> pointed out, when you delete everything (including system schemas), C*
>> cluster thinks you're bootstrapping a new node. However,  node2's IP is
>> still in gossip and this is why you see the exception.
>>
>> I'm not clear on the reasoning why you need to delete the C* data directory.
>> That is a dangerous action, especially considering that you delete the system
>> schemas. If the failure node is gone for a while, what you need
>> to do is remove the node first before doing the "rejoin".
>>
>> Cheers,
>>
>> Yabin
>>
>> On Mon, Oct 17, 2016 at 1:48 AM, Ben Slater 
>> wrote:
>>
>> To cassandra, the node where you deleted the files looks like a brand new
>> machine. It doesn’t automatically rebuild machines to prevent accidental
>> replacement. You need to tell it to build the “new” machines as a
>> replacement for the “old” machine with that IP by setting
>> -Dcassandra.replace_address_first_boot=. See
>> http://cassandra.apache.org/doc/latest/operating/topo_changes.html
>> 

Re: Cluster Maintenance Mishap

2016-10-20 Thread Yabin Meng
I believe you're using VNodes (because token range change doesn't make
sense for single-token setup unless you change it explicitly). If you
bootstrap a new node with VNodes, I think the way that the token ranges are
assigned to the node is random (I'm not 100% sure here, but should be so
logically). If so, the ownership of the data that each node is responsible
for will be changed. The part of the data that doesn't belong to the node
under the new ownership, however, will still be kept on that node.
Cassandra won't remove it automatically unless you run "nodetool cleanup".
So to answer your question, I don't think the data have been moved away.
More likely you have some extra duplicate data here.
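
Any data a node no longer owns once the ranges settle can be reclaimed
explicitly, either for all keyspaces or one at a time (the keyspace name
below is a placeholder):

nodetool cleanup
nodetool cleanup my_keyspace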

Yabin

On Thu, Oct 20, 2016 at 6:41 PM, Branton Davis 
wrote:

> Thanks for the response, Yabin.  However, if there's an answer to my
> question here, I'm apparently too dense to see it ;)
>
> I understand that, since the system keyspace data was not there, it
> started bootstrapping.  What's not clear is if they took over the token
> ranges of the previous nodes or got new token ranges.  I'm mainly
> concerned about the latter.  We've got the nodes back in place with the
> original data, but the fear is that some data may have been moved off of
> other nodes.  I think that this is very unlikely, but I'm just looking for
> confirmation.
>
>
> On Thursday, October 20, 2016, Yabin Meng  wrote:
>
>> Most likely the issue is caused by the fact that when you move the data,
>> you move the system keyspace data away as well. Meanwhile, due to the error
>> of data being copied into a different location than what C* is expecting,
>> when C* starts, it can not find the system metadata info and therefore
>> tries to start as a fresh new node. If you keep keyspace data in the right
>> place, you should see all old info. as expected.
>>
>> I've seen a few such occurrences from customers. As a best practice, I
>> would always suggest to totally separate Cassandra application data
>> directory from system keyspace directory (e.g. they don't share common
>> parent folder, and such).
>>
>> Regards,
>>
>> Yabin
>>
>> On Thu, Oct 20, 2016 at 4:58 PM, Branton Davis <
>> branton.da...@spanning.com> wrote:
>>
>>> Howdy folks.  I asked some about this in IRC yesterday, but we're
>>> looking to hopefully confirm a couple of things for our sanity.
>>>
>>> Yesterday, I was performing an operation on a 21-node cluster (vnodes,
>>> replication factor 3, NetworkTopologyStrategy, and the nodes are balanced
>>> across 3 AZs on AWS EC2).  The plan was to swap each node's existing
>>> 1TB volume (where all cassandra data, including the commitlog, is stored)
>>> with a 2TB volume.  The plan for each node (one at a time) was
>>> basically:
>>>
>>>- rsync while the node is live (repeated until there were only minor
>>>differences from new data)
>>>- stop cassandra on the node
>>>- rsync again
>>>- replace the old volume with the new
>>>- start cassandra
>>>
>>> However, there was a bug in the rsync command.  Instead of copying the
>>> contents of /var/data/cassandra to /var/data/cassandra_new, it copied it to
>>> /var/data/cassandra_new/cassandra.  So, when cassandra was started
>>> after the volume swap, there was some behavior that was similar to
>>> bootstrapping a new node (data started streaming in from other nodes).  But
>>> there was also some behavior that was similar to a node replacement
>>> (nodetool status showed the same IP address, but a different host ID).  This
>>> happened with 3 nodes (one from each AZ).  The nodes had received
>>> 1.4GB, 1.2GB, and 0.6GB of data (whereas the normal load for a node is
>>> around 500-600GB).
>>>
>>> The cluster was in this state for about 2 hours, at which
>>> point cassandra was stopped on them.  Later, I moved the data from the
>>> original volumes back into place (so, should be the original state before
>>> the operation) and started cassandra back up.
>>>
>>> Finally, the questions.  We've accepted the potential loss of new data
>>> within the two hours, but our primary concern now is what was happening
>>> with the bootstrapping nodes.  Would they have taken on the token
>>> ranges of the original nodes or acted like new nodes and got new token
>>> ranges?  If the latter, is it possible that any data moved from the
>>> healthy nodes to the "new" nodes or would restarting them with the original
>>> data (and repairing) put the cluster's token ranges back into a normal
>>> state?
>>>
>>> Hopefully that was all clear.  Thanks in advance for any info!
>>>
>>
>>


Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Yabin Meng
Sorry, I'm not aware of it

On Thu, Oct 20, 2016 at 6:00 PM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Thank you Yabin, is there an existing JIRA that I can refer to?
>
> On Thu, Oct 20, 2016 at 2:05 PM, Yabin Meng  wrote:
>
>> I have seen this on other releases, on 2.2.x. The workaround is exactly
>> like yours; some other system keyspaces also need similar changes.
>>
>> I would say this is a benign bug.
>>
>> Yabin
>>
>> On Thu, Oct 20, 2016 at 4:41 PM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>>> thanks,
>>>
>>> This always works on 2.1.13 and 2.1.16 version but not on 3.0.8.
>>> definitely not a firewall issue
>>>
>>> On Thu, Oct 20, 2016 at 1:16 PM, sai krishnam raju potturi <
>>> pskraj...@gmail.com> wrote:
>>>
 we faced a similar issue earlier, but that was more related to firewall
 rules. The newly added datacenter was not able to communicate with the
 existing datacenters on port 7000 (inter-node communication). Yours
 might be a different issue, but just saying.


 On Thu, Oct 20, 2016 at 4:12 PM, Jai Bheemsen Rao Dhanwada <
 jaibheem...@gmail.com> wrote:

> Hello All,
>
> I have single datacenter with 3 C* nodes and we are trying to expand
> the cluster to another region/DC. I am seeing the below error while doing 
> a
> "nodetool rebuild -- name_of_existing_data_center" .
>
> [user@machine ~]$ nodetool rebuild DC1
> nodetool: Unable to find sufficient sources for streaming range
> (-402178150752044282,-396707578307430827] in keyspace
> system_distributed
> See 'nodetool help' or 'nodetool help '.
> [user@machine ~]$
>
> user@cqlsh> SELECT * from system_schema.keyspaces where
> keyspace_name='system_distributed';
>
>  keyspace_name | durable_writes | replication
> ---++---
> --
>  system_distributed |   True | {'class':
> 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor':
> '3'}
>
> (1 rows)
>
> To overcome this I have updated system_distributed keyspace to DC1:3
> and DC2:3 with NetworkTopologyStrategy
>
> C* Version - 3.0.8
>
> Is this a bug that is introduced in 3.0.8 version of cassandra? as I
> haven't seen this issue with the older versions?
>


>>>
>>
>


Re: Introducing Cassandra 3.7 LTS

2016-10-20 Thread sankalp kohli
I will also publish the 3.0 backports once we are running 3.0.

On Thu, Oct 20, 2016 at 4:23 PM, Ben Bromhead  wrote:

> Thanks Sankalp, we are also reviewing our internal 2.1 list against what
> you published (though we are trying to upgrade everyone to later versions
> e.g. 2.2). It's great to compare notes.
>
> On Thu, 20 Oct 2016 at 16:19 sankalp kohli  wrote:
>
>> This is awesome. I have sent out the patches which we backported into
>> 2.1 on the dev list.
>>
>> On Wed, Oct 19, 2016 at 4:33 PM, kurt Greaves 
>> wrote:
>>
>>
>> On 19 October 2016 at 21:07, sfesc...@gmail.com 
>> wrote:
>>
>> Wow, thank you for doing this. This sentiment regarding stability seems
>> to be widespread. Is the team reconsidering the whole tick-tock cadence? If
>> not, I would add my voice to those asking that it is revisited.
>>
>>
>> There has certainly been discussion regarding the tick-tock cadence, and
>> it seems safe to say it will change. There hasn't been any official
>> announcement yet, however.
>>
>> Kurt Greaves
>> k...@instaclustr.com
>> www.instaclustr.com
>>
>>
>> --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Managed Cassandra / Spark on AWS, Azure and Softlayer
>


Re: Introducing Cassandra 3.7 LTS

2016-10-20 Thread Ben Bromhead
Thanks Sankalp, we are also reviewing our internal 2.1 list against what
you published (though we are trying to upgrade everyone to later versions
e.g. 2.2). It's great to compare notes.

On Thu, 20 Oct 2016 at 16:19 sankalp kohli  wrote:

> This is awesome. I have sent out the patches which we backported into 2.1
> on the dev list.
>
> On Wed, Oct 19, 2016 at 4:33 PM, kurt Greaves 
> wrote:
>
>
> On 19 October 2016 at 21:07, sfesc...@gmail.com 
> wrote:
>
> Wow, thank you for doing this. This sentiment regarding stability seems to
> be widespread. Is the team reconsidering the whole tick-tock cadence? If
> not, I would add my voice to those asking that it is revisited.
>
>
> There has certainly been discussion regarding the tick-tock cadence, and
> it seems safe to say it will change. There hasn't been any official
> announcement yet, however.
>
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
>
>
> --
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Introducing Cassandra 3.7 LTS

2016-10-20 Thread sankalp kohli
This is awesome. I have sent out the patches which we backported into 2.1
on the dev list.

On Wed, Oct 19, 2016 at 4:33 PM, kurt Greaves  wrote:

>
> On 19 October 2016 at 21:07, sfesc...@gmail.com 
> wrote:
>
>> Wow, thank you for doing this. This sentiment regarding stability seems
>> to be widespread. Is the team reconsidering the whole tick-tock cadence? If
>> not, I would add my voice to those asking that it is revisited.
>
>
> There has certainly been discussion regarding the tick-tock cadence, and
> it seems safe to say it will change. There hasn't been any official
> announcement yet, however.
>
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
>


Re: Cluster Maintenance Mishap

2016-10-20 Thread Branton Davis
Thanks for the response, Yabin.  However, if there's an answer to my
question here, I'm apparently too dense to see it ;)

I understand that, since the system keyspace data was not there, it started
bootstrapping.  What's not clear is if they took over the token ranges of
the previous nodes or got new token ranges.  I'm mainly concerned about the
latter.  We've got the nodes back in place with the original data, but the
fear is that some data may have been moved off of other nodes.  I think
that this is very unlikely, but I'm just looking for confirmation.

On Thursday, October 20, 2016, Yabin Meng  wrote:

> Most likely the issue is caused by the fact that when you move the data,
> you move the system keyspace data away as well. Meanwhile, due to the error
> of data being copied into a different location than what C* is expecting,
> when C* starts, it can not find the system metadata info and therefore
> tries to start as a fresh new node. If you keep keyspace data in the right
> place, you should see all old info. as expected.
>
> I've seen a few such occurrences from customers. As a best practice, I
> would always suggest to totally separate Cassandra application data
> directory from system keyspace directory (e.g. they don't share common
> parent folder, and such).
>
> Regards,
>
> Yabin
>
> On Thu, Oct 20, 2016 at 4:58 PM, Branton Davis  > wrote:
>
>> Howdy folks.  I asked some about this in IRC yesterday, but we're
>> looking to hopefully confirm a couple of things for our sanity.
>>
>> Yesterday, I was performing an operation on a 21-node cluster (vnodes,
>> replication factor 3, NetworkTopologyStrategy, and the nodes are balanced
>> across 3 AZs on AWS EC2).  The plan was to swap each node's existing 1TB
>> volume (where all cassandra data, including the commitlog, is stored) with
>> a 2TB volume.  The plan for each node (one at a time) was basically:
>>
>>- rsync while the node is live (repeated until there were only minor
>>differences from new data)
>>- stop cassandra on the node
>>- rsync again
>>- replace the old volume with the new
>>- start cassandra
>>
>> However, there was a bug in the rsync command.  Instead of copying the
>> contents of /var/data/cassandra to /var/data/cassandra_new, it copied it to
>> /var/data/cassandra_new/cassandra.  So, when cassandra was started after
>> the volume swap, there was some behavior that was similar to bootstrapping
>> a new node (data started streaming in from other nodes).  But there
>> was also some behavior that was similar to a node replacement (nodetool
>> status showed the same IP address, but a different host ID).  This
>> happened with 3 nodes (one from each AZ).  The nodes had received 1.4GB,
>> 1.2GB, and 0.6GB of data (whereas the normal load for a node is around
>> 500-600GB).
>>
>> The cluster was in this state for about 2 hours, at which point cassandra
>> was stopped on them.  Later, I moved the data from the original volumes
>> back into place (so, should be the original state before the operation) and
>> started cassandra back up.
>>
>> Finally, the questions.  We've accepted the potential loss of new data
>> within the two hours, but our primary concern now is what was happening
>> with the bootstrapping nodes.  Would they have taken on the token ranges
>> of the original nodes or acted like new nodes and got new token ranges?  If
>> the latter, is it possible that any data moved from the healthy nodes to
>> the "new" nodes or would restarting them with the original data (and
>> repairing) put the cluster's token ranges back into a normal state?
>>
>> Hopefully that was all clear.  Thanks in advance for any info!
>>
>
>


Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Jai Bheemsen Rao Dhanwada
Thank you Yabin, is there an existing JIRA that I can refer to?

On Thu, Oct 20, 2016 at 2:05 PM, Yabin Meng  wrote:

> I have seen this on other releases, on 2.2.x. The workaround is exactly
> like yours; some other system keyspaces also need similar changes.
>
> I would say this is a benign bug.
>
> Yabin
>
> On Thu, Oct 20, 2016 at 4:41 PM, Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> thanks,
>>
>> This always works on 2.1.13 and 2.1.16 version but not on 3.0.8.
>> definitely not a firewall issue
>>
>> On Thu, Oct 20, 2016 at 1:16 PM, sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>>> we faced a similar issue earlier, but that was more related to firewall
>>> rules. The newly added datacenter was not able to communicate with the
>>> existing datacenters on port 7000 (inter-node communication). Yours
>>> might be a different issue, but just saying.
>>>
>>>
>>> On Thu, Oct 20, 2016 at 4:12 PM, Jai Bheemsen Rao Dhanwada <
>>> jaibheem...@gmail.com> wrote:
>>>
 Hello All,

 I have single datacenter with 3 C* nodes and we are trying to expand
 the cluster to another region/DC. I am seeing the below error while doing a
 "nodetool rebuild -- name_of_existing_data_center" .

 [user@machine ~]$ nodetool rebuild DC1
 nodetool: Unable to find sufficient sources for streaming range
 (-402178150752044282,-396707578307430827] in keyspace
 system_distributed
 See 'nodetool help' or 'nodetool help '.
 [user@machine ~]$

 user@cqlsh> SELECT * from system_schema.keyspaces where
 keyspace_name='system_distributed';

  keyspace_name | durable_writes | replication
 ---++---
 --
  system_distributed |   True | {'class':
 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor':
 '3'}

 (1 rows)

 To overcome this I have updated system_distributed keyspace to DC1:3
 and DC2:3 with NetworkTopologyStrategy

 C* Version - 3.0.8

 Is this a bug that is introduced in 3.0.8 version of cassandra? as I
 haven't seen this issue with the older versions?

>>>
>>>
>>
>


Re: Cluster Maintenance Mishap

2016-10-20 Thread Yabin Meng
Most likely the issue is caused by the fact that when you move the data,
you move the system keyspace data away as well. Meanwhile, due to the error
of data being copied into a different location than what C* is expecting,
when C* starts, it can not find the system metadata info and therefore
tries to start as a fresh new node. If you keep keyspace data in the right
place, you should see all old info. as expected.

I've seen a few such occurrences from customers. As a best practice, I
would always suggest to totally separate Cassandra application data
directory from system keyspace directory (e.g. they don't share common
parent folder, and such).

Regards,

Yabin

On Thu, Oct 20, 2016 at 4:58 PM, Branton Davis 
wrote:

> Howdy folks.  I asked some about this in IRC yesterday, but we're looking
> to hopefully confirm a couple of things for our sanity.
>
> Yesterday, I was performing an operation on a 21-node cluster (vnodes,
> replication factor 3, NetworkTopologyStrategy, and the nodes are balanced
> across 3 AZs on AWS EC2).  The plan was to swap each node's existing 1TB
> volume (where all cassandra data, including the commitlog, is stored) with
> a 2TB volume.  The plan for each node (one at a time) was basically:
>
>- rsync while the node is live (repeated until there were only minor
>differences from new data)
>- stop cassandra on the node
>- rsync again
>- replace the old volume with the new
>- start cassandra
>
> However, there was a bug in the rsync command.  Instead of copying the
> contents of /var/data/cassandra to /var/data/cassandra_new, it copied it to
> /var/data/cassandra_new/cassandra.  So, when cassandra was started after
> the volume swap, there was some behavior that was similar to bootstrapping
> a new node (data started streaming in from other nodes).  But there
> was also some behavior that was similar to a node replacement (nodetool
> status showed the same IP address, but a different host ID).  This
> happened with 3 nodes (one from each AZ).  The nodes had received 1.4GB,
> 1.2GB, and 0.6GB of data (whereas the normal load for a node is around
> 500-600GB).
>
> The cluster was in this state for about 2 hours, at which point cassandra
> was stopped on them.  Later, I moved the data from the original volumes
> back into place (so, should be the original state before the operation) and
> started cassandra back up.
>
> Finally, the questions.  We've accepted the potential loss of new data
> within the two hours, but our primary concern now is what was happening
> with the bootstrapping nodes.  Would they have taken on the token ranges
> of the original nodes or acted like new nodes and got new token ranges?  If
> the latter, is it possible that any data moved from the healthy nodes to
> the "new" nodes or would restarting them with the original data (and
> repairing) put the cluster's token ranges back into a normal state?
>
> Hopefully that was all clear.  Thanks in advance for any info!
>


Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Yabin Meng
I have seen this on other releases, on 2.2.x. The workaround is exactly
like yours; some other system keyspaces also need similar changes.

I would say this is a benign bug.

Yabin

On Thu, Oct 20, 2016 at 4:41 PM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> thanks,
>
> This always works on 2.1.13 and 2.1.16 version but not on 3.0.8.
> definitely not a firewall issue
>
> On Thu, Oct 20, 2016 at 1:16 PM, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
>> we faced a similar issue earlier, but that was more related to firewall
>> rules. The newly added datacenter was not able to communicate with the
>> existing datacenters on port 7000 (inter-node communication). Yours
>> might be a different issue, but just saying.
>>
>>
>> On Thu, Oct 20, 2016 at 4:12 PM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>>> Hello All,
>>>
>>> I have single datacenter with 3 C* nodes and we are trying to expand the
>>> cluster to another region/DC. I am seeing the below error while doing a 
>>> "nodetool
>>> rebuild -- name_of_existing_data_center" .
>>>
>>> [user@machine ~]$ nodetool rebuild DC1
>>> nodetool: Unable to find sufficient sources for streaming range
>>> (-402178150752044282,-396707578307430827] in keyspace system_distributed
>>> See 'nodetool help' or 'nodetool help '.
>>> [user@machine ~]$
>>>
>>> user@cqlsh> SELECT * from system_schema.keyspaces where
>>> keyspace_name='system_distributed';
>>>
>>>  keyspace_name | durable_writes | replication
>>> ---++---
>>> --
>>>  system_distributed |   True | {'class':
>>> 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor':
>>> '3'}
>>>
>>> (1 rows)
>>>
>>> To overcome this I have updated system_distributed keyspace to DC1:3 and
>>> DC2:3 with NetworkTopologyStrategy
>>>
>>> C* Version - 3.0.8
>>>
>>> Is this a bug that is introduced in 3.0.8 version of cassandra? as I
>>> haven't seen this issue with the older versions?
>>>
>>
>>
>


Cluster Maintenance Mishap

2016-10-20 Thread Branton Davis
Howdy folks.  I asked some about this in IRC yesterday, but we're looking
to hopefully confirm a couple of things for our sanity.

Yesterday, I was performing an operation on a 21-node cluster (vnodes,
replication factor 3, NetworkTopologyStrategy, and the nodes are balanced
across 3 AZs on AWS EC2).  The plan was to swap each node's existing 1TB
volume (where all cassandra data, including the commitlog, is stored) with
a 2TB volume.  The plan for each node (one at a time) was basically:

   - rsync while the node is live (repeated until there were only minor
   differences from new data)
   - stop cassandra on the node
   - rsync again
   - replace the old volume with the new
   - start cassandra

However, there was a bug in the rsync command.  Instead of copying the
contents of /var/data/cassandra to /var/data/cassandra_new, it copied it to
/var/data/cassandra_new/cassandra.  So, when cassandra was started after
the volume swap, there was some behavior that was similar to bootstrapping
a new node (data started streaming in from other nodes).  But there
was also some behavior that was similar to a node replacement (nodetool
status showed the same IP address, but a different host ID).  This happened
with 3 nodes (one from each AZ).  The nodes had received 1.4GB, 1.2GB, and
0.6GB of data (whereas the normal load for a node is around 500-600GB).
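
(For reference, the difference comes down to rsync's trailing-slash rule; a
sketch with the paths above:)

rsync -a /var/data/cassandra/ /var/data/cassandra_new/   # copies the contents, as intended
rsync -a /var/data/cassandra /var/data/cassandra_new/    # creates cassandra_new/cassandra, the bug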

The cluster was in this state for about 2 hours, at which point cassandra
was stopped on them.  Later, I moved the data from the original volumes
back into place (so, should be the original state before the operation) and
started cassandra back up.

Finally, the questions.  We've accepted the potential loss of new data
within the two hours, but our primary concern now is what was happening
with the bootstrapping nodes.  Would they have taken on the token ranges of
the original nodes or acted like new nodes and got new token ranges?  If
the latter, is it possible that any data moved from the healthy nodes to
the "new" nodes or would restarting them with the original data (and
repairing) put the cluster's token ranges back into a normal state?

Hopefully that was all clear.  Thanks in advance for any info!


Re: Rebuild failing while adding new datacenter

2016-10-20 Thread Jai Bheemsen Rao Dhanwada
thanks,

This always works on 2.1.13 and 2.1.16 version but not on 3.0.8. definitely
not a firewall issue

On Thu, Oct 20, 2016 at 1:16 PM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:

> we faced a similar issue earlier, but that was more related to firewall
> rules. The newly added datacenter was not able to communicate with the
> existing datacenters on port 7000 (inter-node communication). Yours
> might be a different issue, but just saying.
>
>
> On Thu, Oct 20, 2016 at 4:12 PM, Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> Hello All,
>>
>> I have single datacenter with 3 C* nodes and we are trying to expand the
>> cluster to another region/DC. I am seeing the below error while doing a 
>> "nodetool
>> rebuild -- name_of_existing_data_center" .
>>
>> [user@machine ~]$ nodetool rebuild DC1
>> nodetool: Unable to find sufficient sources for streaming range
>> (-402178150752044282,-396707578307430827] in keyspace system_distributed
>> See 'nodetool help' or 'nodetool help '.
>> [user@machine ~]$
>>
>> user@cqlsh> SELECT * from system_schema.keyspaces where
>> keyspace_name='system_distributed';
>>
>>  keyspace_name | durable_writes | replication
>> ---++---
>> --
>>  system_distributed |   True | {'class':
>> 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
>>
>> (1 rows)
>>
>> To overcome this I have updated system_distributed keyspace to DC1:3 and
>> DC2:3 with NetworkTopologyStrategy
>>
>> C* Version - 3.0.8
>>
>> Is this a bug that is introduced in 3.0.8 version of cassandra? as I
>> haven't seen this issue with the older versions?
>>
>
>


Re: Rebuild failing while adding new datacenter

2016-10-20 Thread sai krishnam raju potturi
we faced a similar issue earlier, but that was more related to firewall
rules. The newly added datacenter was not able to communicate with the
existing datacenters on port 7000 (inter-node communication). Yours
might be a different issue, but just saying.
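
A quick way to check inter-node connectivity from a node in the new datacenter
(the address is a placeholder):

[user@machine ~]$ nc -vz 10.0.1.5 7000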


On Thu, Oct 20, 2016 at 4:12 PM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Hello All,
>
> I have single datacenter with 3 C* nodes and we are trying to expand the
> cluster to another region/DC. I am seeing the below error while doing a 
> "nodetool
> rebuild -- name_of_existing_data_center" .
>
> [user@machine ~]$ nodetool rebuild DC1
> nodetool: Unable to find sufficient sources for streaming range
> (-402178150752044282,-396707578307430827] in keyspace system_distributed
> See 'nodetool help' or 'nodetool help <command>'.
> [user@machine ~]$
>
> user@cqlsh> SELECT * from system_schema.keyspaces where
> keyspace_name='system_distributed';
>
>  keyspace_name | durable_writes | replication
> ---++---
> --
>  system_distributed |   True | {'class':
> 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
>
> (1 rows)
>
> To overcome this I have updated system_distributed keyspace to DC1:3 and
> DC2:3 with NetworkTopologyStrategy
>
> C* Version - 3.0.8
>
> Is this a bug that is introduced in 3.0.8 version of cassandra? as I
> haven't seen this issue with the older versions?
>


Rebuild failing while adding new datacenter

2016-10-20 Thread Jai Bheemsen Rao Dhanwada
Hello All,

I have a single datacenter with 3 C* nodes and we are trying to expand the
cluster to another region/DC. I am seeing the below error while doing a
"nodetool rebuild -- name_of_existing_data_center".

[user@machine ~]$ nodetool rebuild DC1
nodetool: Unable to find sufficient sources for streaming range
(-402178150752044282,-396707578307430827] in keyspace system_distributed
See 'nodetool help' or 'nodetool help <command>'.
[user@machine ~]$

user@cqlsh> SELECT * from system_schema.keyspaces where
keyspace_name='system_distributed';

 keyspace_name      | durable_writes | replication
--------------------+----------------+---------------------------------------------------------------------------------------
 system_distributed |           True | {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}

(1 rows)

To overcome this I have updated the system_distributed keyspace to
NetworkTopologyStrategy with DC1:3 and DC2:3, as sketched below.
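
For reference, the change was roughly this (a sketch only, using the DC names
above):

ALTER KEYSPACE system_distributed
WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'DC1': 3,
    'DC2': 3
};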

C* Version - 3.0.8

Is this a bug introduced in Cassandra 3.0.8? I haven't seen this issue with
older versions.


Re: Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Justin Cameron
You can, but it is not really very efficient or cost-effective. You may
encounter issues with streaming, repairs and compaction if you have very
large blobs (100MB+), so try to keep them under 10MB if possible.

I'd suggest storing blobs in something like Amazon S3 and keeping just the
bucket name & blob id in Cassandra.
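
A minimal sketch of that kind of pointer table (keyspace, table and column
names here are illustrative only):

CREATE TABLE media.blob_refs (
    blob_id uuid PRIMARY KEY,   -- identifier your application hands out
    s3_bucket text,             -- where the object actually lives
    s3_key text,
    size_bytes bigint,
    content_type text
);

The application reads the row to locate the object and then streams it from
S3, so Cassandra only ever stores and replicates small rows.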

On Thu, 20 Oct 2016 at 12:03 Vikas Jaiman  wrote:

> Hi,
>
> Normally people would like to store smaller values in Cassandra. Is there
> anyone using it to store for larger values (e.g 500KB or more) and if so
> what are the issues you are facing . I Would like to know the tweaks also
> which you are considering.
>
> Thanks,
> Vikas
>
-- 

Justin Cameron

Senior Software Engineer | Instaclustr






Re: Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Harikrishnan Pillai
We use Cassandra to store images. Any data above 2 MB we chunk and store. It
works perfectly.
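
A rough sketch of what the chunked layout can look like (table and column
names are illustrative, not our exact schema):

CREATE TABLE images.image_chunks (
    image_id uuid,
    chunk_index int,    -- 0..n-1, each chunk kept well under 2 MB
    data blob,
    PRIMARY KEY (image_id, chunk_index)
);

Readers select all chunks for an image_id (they come back ordered by
chunk_index) and reassemble the image client-side.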

Sent from my iPhone

> On Oct 20, 2016, at 12:09 PM, Vikas Jaiman  wrote:
> 
> Hi,
> 
> Normally people would like to store smaller values in Cassandra. Is there 
> anyone using it to store for larger values (e.g 500KB or more) and if so what 
> are the issues you are facing . I Would like to know the tweaks also which 
> you are considering.
> 
> Thanks,
> Vikas


Does anyone store larger values in Cassandra E.g. 500 KB?

2016-10-20 Thread Vikas Jaiman
Hi,

Normally people would like to store smaller values in Cassandra. Is there
anyone using it to store larger values (e.g. 500KB or more), and if so,
what issues are you facing? I would also like to know about any tweaks you
are considering.

Thanks,
Vikas


Re: Handle Leap Seconds with Cassandra

2016-10-20 Thread Ben Bromhead
http://www.datastax.com/dev/blog/preparing-for-the-leap-second gives a
pretty good overview

If you are using a timestamp as part of your primary key, this is the
situation where you could end up overwriting data. I would suggest using
timeuuid instead, which will ensure that you get different primary keys even
for data inserted at the exact same timestamp.
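
A purely illustrative sketch of that switch (the schema here is made up, not
tied to any particular table):

CREATE TABLE events_by_source (
    source_id text,
    event_id timeuuid,   -- generate with now(); unique even for identical wall-clock times
    payload text,
    PRIMARY KEY (source_id, event_id)
) WITH CLUSTERING ORDER BY (event_id DESC);

INSERT INTO events_by_source (source_id, event_id, payload)
VALUES ('source-1', now(), 'some event');

You can still query by time range on event_id using minTimeuuid() and
maxTimeuuid().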

The blog post also suggests using certain monotonic timestamp classes in
Java; however, these will not help you if you have multiple clients that may
overwrite data.

As for the interleaving or out-of-order problem, this is hard to address in
Cassandra without resorting to external coordination or LWTs. If you are
relying on a wall clock to guarantee order in a distributed system, you will
get yourself into trouble even without leap seconds (clock drift, NTP
inaccuracy, etc.).

On Thu, 20 Oct 2016 at 10:30 Anuj Wadehra  wrote:

> Hi,
>
> I would like to know how you guys handle leap seconds with Cassandra.
>
> I am not bothered about the livelock issue as we are using appropriate
> versions of Linux and Java. I am more interested in finding an optimum
> answer for the following question:
>
> How do you handle wrong ordering of multiple writes (on same row and
> column) during the leap second? You may overwrite the new value with old
> one (disaster).
>
> And Downtime is no option :)
>
> I can see that CASSANDRA-9131 is still open..
>
> FYI..we are on 2.0.14 ..
>
>
> Thanks
> Anuj
>
-- 
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Handle Leap Seconds with Cassandra

2016-10-20 Thread Anuj Wadehra
Hi,
I would like to know how you guys handle leap seconds with Cassandra.

I am not bothered about the livelock issue as we are using appropriate
versions of Linux and Java. I am more interested in finding an optimum answer
for the following question:

How do you handle wrong ordering of multiple writes (on the same row and
column) during the leap second? You may overwrite the new value with the old
one (disaster).

And downtime is no option :)

I can see that CASSANDRA-9131 is still open..
FYI, we are on 2.0.14.

Thanks
Anuj

strange node load decrease after nodetool repair -pr

2016-10-20 Thread Oleg Krayushkin
Hi. After I've run a token-range repair from the node at 12.5.13.125 with

nodetool repair -full -st ${start_tokens[i]} -et ${end_tokens[i]}

on every token range, I got this node load:

--  Address       Load       Tokens  Owns   Rack
UN  12.5.13.141   23.94 GB   256     32.3%  rack1
DN  12.5.13.125   34.71 GB   256     31.8%  rack1
UN  12.5.13.46    29.01 GB   512     58.1%  rack1
UN  12.5.13.228   41.17 GB   512     58.5%  rack1
UN  12.5.13.34    45.93 GB   512     59.8%  rack1
UN  12.5.13.82    42.05 GB   512     59.4%  rack1

Then I've run partitioner-range repair from the same node with

nodetool repair -full -pr

And unexpectedly I got such a different load:

--  Address       Load       Tokens  Owns   Rack
UN  12.5.13.141   22.93 GB   256     32.3%  rack1
UN  12.5.13.125   30.94 GB   256     31.8%  rack1
UN  12.5.13.46    27.38 GB   512     58.1%  rack1
UN  12.5.13.228   39.51 GB   512     58.5%  rack1
UN  12.5.13.34    41.58 GB   512     59.8%  rack1
UN  12.5.13.82    33.9 GB    512     59.4%  rack1

What are possible reasons for such a load decrease after the last repair?
Maybe some compactions that were not done after the token-range repairs? But
12.5.13.82 lost about 8 GB!

Additional info:

   - There were no writes to the db during these periods.
   - All repair operations completed without errors, exceptions or failures.
   - Before the first repair I ran sstablescrub on every node -- maybe
   this gives a clue?
   - The Cassandra version is 3.0.8

-- 

Oleg Krayushkin


Re: Inconsistencies in materialized views

2016-10-20 Thread siddharth verma
Hi Edward,
Thanks a lot for your help. It helped us narrow down the problem.

Regards


On Mon, Oct 17, 2016 at 9:33 PM, Edward Capriolo 
wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-11198
>
> Which has problems "maybe" fixed by:
>
> https://issues.apache.org/jira/browse/CASSANDRA-11475
>
> Which has its own set of problems.
>
> One of these patches was merged into 3.7, which tells you that you are
> running version 3.6 with known bugs. Also, as the feature is "new ish" you
> should be aware that "new ish" major features usually take 4-6 versions to
> solidify.
>
>
>
> On Mon, Oct 17, 2016 at 3:19 AM, siddharth verma <
> sidd.verma29.l...@gmail.com> wrote:
>
>> Hi,
>> We have a base table with ~300 million entries.
>> And in a recent sanity activity, I saw approx ~33k entries (in one DC)
>> which were in the materialized view, but not in the base table. (reads with
>> quorum, DCAware)
>> (I haven't done it the other way round yet, i.e. entries in base table
>> but not in materialized view)
>>
>> Could someone suggest a possible cause for the same?
>> We saw some glitches in cassandra cluster
>> 1. node down.
>> If this is the case, will repair fix the issue?
>> 2. IOPS maxed out in one DC.
>> 3. Another DC added with some glitches.
>>
>> Could someone suggest how we could replicate the inconsistency between the
>> base table and the materialized view? Any help would be appreciated.
>>
>> C* 3.6
>> Regards
>> SIddharth Verma
>> (Visit https://github.com/siddv29/cfs for a high speed cassandra full
>> table scan)
>>
>
>


-- 
Siddharth Verma
(Visit https://github.com/siddv29/cfs for a high speed cassandra full table
scan)


Re: time series data model

2016-10-20 Thread wxn...@zjqunshuo.com
Hi Kurt,
I do need to align the time windows to the day bucket to prevent any one row
from becoming too big, and event_time is a timestamp (in milliseconds) since
the unix epoch. If I use bigint as the type of event_time, can I do the
queries you mentioned?
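
For example, the kind of query I mean (newest points first for one device and
day; the values come from my sample data, and the LIMIT is just illustrative):

SELECT * FROM cargts.eventdata
WHERE deviceid = 186628 AND date = 20160928
  AND event_time > 1474992052000
ORDER BY event_time DESC
LIMIT 100;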

-Simon Wu
 
From: kurt Greaves
Date: 2016-10-20 16:18
To: user
Subject: Re: time series data model
If event_time is timestamps since unix epoch you 1. may want to use the 
in-built timestamps type, and 2. order by event_time DESC. 2 applies if you 
want to do queries such as "select * from eventdata where ... and event_time > 
x" (i.e; get latest events).

Other than that your model seems workable, I assume you're using DTCS/TWCS, and 
aligning the time windows to your day bucket. (If not you should do that)

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 20 October 2016 at 07:29, wxn...@zjqunshuo.com  wrote:
Hi All,
I'm trying to migrate my time series data which is GPS trace from mysql to C*. 
I want a wide row to hold one day data. I designed the data model as below. 
Please help to see if there is any problem. Any suggestion is appreciated.

Table Model:
CREATE TABLE cargts.eventdata (
deviceid int,
date int,
event_time bigint,
position text,
PRIMARY KEY ((deviceid, date), event_time)
)

A slice of data:
cqlsh:cargts> SELECT * FROM eventdata WHERE deviceid =186628 and date = 
20160928 LIMIT 10;

 deviceid | date | event_time| position
--+--+---+-
   186628 | 20160928 | 1474992002000 |  
{"latitude":30.343443936386247,"longitude":120.08751351828943,"speed":41,"heading":48}
   186628 | 20160928 | 1474992012000 |   
{"latitude":30.34409508979662,"longitude":120.08840022183352,"speed":45,"heading":53}
   186628 | 20160928 | 1474992022000 |   
{"latitude":30.34461639856887,"longitude":120.08946100336443,"speed":28,"heading":65}
   186628 | 20160928 | 1474992032000 |   
{"latitude":30.34469478717028,"longitude":120.08973154015409,"speed":11,"heading":67}
   186628 | 20160928 | 1474992042000 |   
{"latitude":30.34494998929474,"longitude":120.09027263811151,"speed":19,"heading":47}
   186628 | 20160928 | 1474992052000 | 
{"latitude":30.346057349126617,"longitude":120.08967091817931,"speed":41,"heading":323}
   186628 | 20160928 | 1474992062000 |
{"latitude":30.346997145708,"longitude":120.08883508853253,"speed":52,"heading":323}
   186628 | 20160928 | 1474992072000 | 
{"latitude":30.348131044340988,"longitude":120.08774702315581,"speed":65,"heading":321}
   186628 | 20160928 | 1474992082000 | 
{"latitude":30.349438164412838,"longitude":120.08652612959328,"speed":68,"heading":322}

-Simon Wu



Re: time series data model

2016-10-20 Thread wxn...@zjqunshuo.com
Thank you Kurt. I thought the one column identified by the composite key
(deviceid+date+event_time) could hold only one value, so I packaged all the
info into one JSON string. Maybe I was wrong. I rewrote the table as below.

CREATE TABLE cargts.eventdata (
deviceid int,
date int,
event_time bigint,
heading int,
lat decimal,
lon decimal,
speed int,
PRIMARY KEY ((deviceid, date), event_time)
)

cqlsh:cargts> select * from eventdata;

 deviceid | date     | event_time    | heading | lat       | lon        | speed
----------+----------+---------------+---------+-----------+------------+-------
   186628 | 20160928 | 1474992002005 |      48 | 30.343443 | 120.087514 |    41

-Simon Wu

From: kurt Greaves
Date: 2016-10-20 16:23
To: user
Subject: Re: time series data model
Ah didn't pick up on that but looks like he's storing JSON within position. Is 
there any strong reason for this or as Vladimir mentioned can you store the 
fields under "position" in separate columns?

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 20 October 2016 at 08:17, Vladimir Yudovin  wrote:
Hi Simon,

Why position is text and not float? Text takes much more place.
Also speed and headings can be calculated basing on latest positions, so you 
can also save them. If you really need it in data base you can save them as 
floats, or compose single float value like speed.heading: 41.173 (or opposite, 
heading.speed) and save column storage overhead.


Best regards, Vladimir Yudovin, 
Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.


 On Thu, 20 Oct 2016 03:29:16 -0400 wrote 

Hi All,
I'm trying to migrate my time series data which is GPS trace from mysql to C*. 
I want a wide row to hold one day data. I designed the data model as below. 
Please help to see if there is any problem. Any suggestion is appreciated.

Table Model:
CREATE TABLE cargts.eventdata (
deviceid int,
date int,
event_time bigint,
position text,
PRIMARY KEY ((deviceid, date), event_time)
)

A slice of data:
cqlsh:cargts> SELECT * FROM eventdata WHERE deviceid =186628 and date = 
20160928 LIMIT 10;

 deviceid | date | event_time| position
--+--+---+-
   186628 | 20160928 | 1474992002000 |  
{"latitude":30.343443936386247,"longitude":120.08751351828943,"speed":41,"heading":48}
   186628 | 20160928 | 1474992012000 |   
{"latitude":30.34409508979662,"longitude":120.08840022183352,"speed":45,"heading":53}
   186628 | 20160928 | 1474992022000 |   
{"latitude":30.34461639856887,"longitude":120.08946100336443,"speed":28,"heading":65}
   186628 | 20160928 | 1474992032000 |   
{"latitude":30.34469478717028,"longitude":120.08973154015409,"speed":11,"heading":67}
   186628 | 20160928 | 1474992042000 |   
{"latitude":30.34494998929474,"longitude":120.09027263811151,"speed":19,"heading":47}
   186628 | 20160928 | 1474992052000 | 
{"latitude":30.346057349126617,"longitude":120.08967091817931,"speed":41,"heading":323}
   186628 | 20160928 | 1474992062000 |
{"latitude":30.346997145708,"longitude":120.08883508853253,"speed":52,"heading":323}
   186628 | 20160928 | 1474992072000 | 
{"latitude":30.348131044340988,"longitude":120.08774702315581,"speed":65,"heading":321}
   186628 | 20160928 | 1474992082000 | 
{"latitude":30.349438164412838,"longitude":120.08652612959328,"speed":68,"heading":322}

-Simon Wu




Re: non incremental repairs with cassandra 2.2+

2016-10-20 Thread kurt Greaves
probably because I was looking at the wrong version of the codebase :p


Re: time series data model

2016-10-20 Thread kurt Greaves
If event_time is a timestamp since the unix epoch you 1. may want to use the
built-in timestamp type, and 2. order by event_time DESC. 2 applies if you
want to do queries such as "select * from eventdata where ... and
event_time > x" (i.e. get the latest events).

Other than that your model seems workable. I assume you're using DTCS/TWCS
and aligning the time windows to your day bucket (if not, you should do
that); see the sketch below.
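
Something like this sketch (TWCS option names assume a Cassandra version that
ships TimeWindowCompactionStrategy; on older versions substitute DTCS with a
one-day window):

CREATE TABLE cargts.eventdata (
    deviceid int,
    date int,
    event_time timestamp,
    position text,
    PRIMARY KEY ((deviceid, date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC)
  AND compaction = {
      'class': 'TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': '1'
  };

With the DESC clustering order, "latest events" queries don't need an explicit
ORDER BY.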

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 20 October 2016 at 07:29, wxn...@zjqunshuo.com 
wrote:

> Hi All,
> I'm trying to migrate my time series data which is GPS trace from mysql to
> C*. I want a wide row to hold one day data. I designed the data model as
> below. Please help to see if there is any problem. Any suggestion is
> appreciated.
>
> Table Model:
> CREATE TABLE cargts.eventdata (
> deviceid int,
> date int,
> event_time bigint,
> position text,
> PRIMARY KEY ((deviceid, date), event_time)
> )
>
> A slice of data:
> cqlsh:cargts> SELECT * FROM eventdata WHERE deviceid =
> 186628 and date = 20160928 LIMIT 10;
>
>  deviceid | date | event_time| position
> --+--+---+--
> ---
>186628 | 20160928 | 1474992002000 |  {"latitude":
> 30.343443936386247,"longitude":120.08751351828943,"speed":41,"heading":48}
>186628 | 20160928 | 1474992012000 |   {"latitude":
> 30.34409508979662,"longitude":120.08840022183352,"speed":45,"heading":53}
>186628 | 20160928 | 1474992022000 |   {"latitude":
> 30.34461639856887,"longitude":120.08946100336443,"speed":28,"heading":65}
>186628 | 20160928 | 1474992032000 |   {"latitude":
> 30.34469478717028,"longitude":120.08973154015409,"speed":11,"heading":67}
>186628 | 20160928 | 1474992042000 |   {"latitude":
> 30.34494998929474,"longitude":120.09027263811151,"speed":19,"heading":47}
>186628 | 20160928 | 1474992052000 | {"latitude":
> 30.346057349126617,"longitude":120.08967091817931,"speed":
> 41,"heading":323}
>186628 | 20160928 | 1474992062000 |{"latitude"
> :30.346997145708,"longitude":120.08883508853253,"speed":52,"heading":323}
>186628 | 20160928 | 1474992072000 | {"latitude":
> 30.348131044340988,"longitude":120.08774702315581,"speed":
> 65,"heading":321}
>186628 | 20160928 | 1474992082000 | {"latitude":
> 30.349438164412838,"longitude":120.08652612959328,"speed":
> 68,"heading":322}
>
> -Simon Wu
>


Re: time series data model

2016-10-20 Thread kurt Greaves
Ah, didn't pick up on that, but it looks like he's storing JSON within position.
Is there any strong reason for this, or, as Vladimir mentioned, can you store
the fields under "position" in separate columns?

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 20 October 2016 at 08:17, Vladimir Yudovin  wrote:

> Hi Simon,
>
> Why *position *is text and not float? Text takes much more place.
> Also speed and headings can be calculated basing on latest positions, so
> you can also save them. If you really need it in data base you can save
> them as floats, or compose single float value like speed.heading: 41.173
> (or opposite, heading.speed) and save column storage overhead.
>
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone - Hosted Cloud Cassandra*
> *Launch your cluster in minutes.*
>
>
>  On Thu, 20 Oct 2016 03:29:16 -0400 wxn...@zjqunshuo.com wrote 
>
> Hi All,
> I'm trying to migrate my time series data which is GPS trace from mysql to
> C*. I want a wide row to hold one day data. I designed the data model as
> below. Please help to see if there is any problem. Any suggestion is
> appreciated.
>
> Table Model:
> CREATE TABLE cargts.eventdata (
> deviceid int,
> date int,
> event_time bigint,
> position text,
> PRIMARY KEY ((deviceid, date), event_time)
> )
>
> A slice of data:
> cqlsh:cargts> SELECT * FROM eventdata WHERE deviceid =
> 186628 and date = 20160928 LIMIT 10;
>
>  deviceid | date | event_time| position
> --+--+---+--
> ---
>186628 | 20160928 | 1474992002000 |  {"latitude":
> 30.343443936386247,"longitude":120.08751351828943,"speed":41,"heading":48}
>186628 | 20160928 | 1474992012000 |   {"latitude":
> 30.34409508979662,"longitude":120.08840022183352,"speed":45,"heading":53}
>186628 | 20160928 | 1474992022000 |   {"latitude":
> 30.34461639856887,"longitude":120.08946100336443,"speed":28,"heading":65}
>186628 | 20160928 | 1474992032000 |   {"latitude":
> 30.34469478717028,"longitude":120.08973154015409,"speed":11,"heading":67}
>186628 | 20160928 | 1474992042000 |   {"latitude":
> 30.34494998929474,"longitude":120.09027263811151,"speed":19,"heading":47}
>186628 | 20160928 | 1474992052000 | {"latitude":
> 30.346057349126617,"longitude":120.08967091817931,"speed":
> 41,"heading":323}
>186628 | 20160928 | 1474992062000 |{"latitude"
> :30.346997145708,"longitude":120.08883508853253,"speed":52,"heading":323}
>186628 | 20160928 | 1474992072000 | {"latitude":
> 30.348131044340988,"longitude":120.08774702315581,"speed":
> 65,"heading":321}
>186628 | 20160928 | 1474992082000 | {"latitude":
> 30.349438164412838,"longitude":120.08652612959328,"speed":
> 68,"heading":322}
>
> -Simon Wu
>
>
>


Re: non incremental repairs with cassandra 2.2+

2016-10-20 Thread kurt Greaves
Welp, that's good but wasn't apparent in the codebase :S.

Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 20 October 2016 at 05:02, Alexander Dejanovski 
wrote:

> Hi Kurt,
>
> we're not actually.
> Reaper performs full repair by subrange but does incremental repair on all
> ranges at once, node by node.
> Subrange is incompatible with incremental repair anyway.
>
> Cheers,
>
> On Thu, Oct 20, 2016 at 5:24 AM kurt Greaves  wrote:
>
>>
>> On 19 October 2016 at 17:13, Alexander Dejanovski wrote:
>>
>> There aren't that many tools I know to orchestrate repairs and we
>> maintain a fork of Reaper, that was made by Spotify, and handles
>> incremental repair : https://github.com/thelastpickle/cassandra-reaper
>>
>>
>> Looks like you're using subranges with incremental repairs. This will
>> generate a lot of anticompactions as you'll only repair a portion of the
>> SSTables. You should use forceRepairAsync for incremental repairs so that
>> it's possible for the repair to act on the whole SSTable, minimising
>> anticompactions.
>>
>> Kurt Greaves
>> k...@instaclustr.com
>> www.instaclustr.com
>>
> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Re: time series data model

2016-10-20 Thread Vladimir Yudovin
Hi Simon,

Why is position text and not float? Text takes much more space.
Also, speed and heading can be calculated from the latest positions, so you
could avoid storing them. If you really need them in the database you can save
them as floats, or compose a single float value like speed.heading: 41.173 (or
the opposite, heading.speed) and save column storage overhead.



Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.





 On Thu, 20 Oct 2016 03:29:16 -0400 wxn...@zjqunshuo.com wrote 




Hi All,

I'm trying to migrate my time series data which is GPS trace from mysql to C*. 
I want a wide row to hold one day data. I designed the data model as below. 
Please help to see if there is any problem. Any suggestion is appreciated.



Table Model:

CREATE TABLE cargts.eventdata (
deviceid int,
date int,
event_time bigint,
position text,
PRIMARY KEY ((deviceid, date), event_time)
)


A slice of data:

cqlsh:cargts> SELECT * FROM eventdata WHERE deviceid =186628 and date = 
20160928 LIMIT 10;

 deviceid | date | event_time| position
--+--+---+-
   186628 | 20160928 | 1474992002000 |  
{"latitude":30.343443936386247,"longitude":120.08751351828943,"speed":41,"heading":48}
   186628 | 20160928 | 1474992012000 |   
{"latitude":30.34409508979662,"longitude":120.08840022183352,"speed":45,"heading":53}
   186628 | 20160928 | 1474992022000 |   
{"latitude":30.34461639856887,"longitude":120.08946100336443,"speed":28,"heading":65}
   186628 | 20160928 | 1474992032000 |   
{"latitude":30.34469478717028,"longitude":120.08973154015409,"speed":11,"heading":67}
   186628 | 20160928 | 1474992042000 |   
{"latitude":30.34494998929474,"longitude":120.09027263811151,"speed":19,"heading":47}
   186628 | 20160928 | 1474992052000 | 
{"latitude":30.346057349126617,"longitude":120.08967091817931,"speed":41,"heading":323}
   186628 | 20160928 | 1474992062000 |
{"latitude":30.346997145708,"longitude":120.08883508853253,"speed":52,"heading":323}
   186628 | 20160928 | 1474992072000 | 
{"latitude":30.348131044340988,"longitude":120.08774702315581,"speed":65,"heading":321}
   186628 | 20160928 | 1474992082000 | 
{"latitude":30.349438164412838,"longitude":120.08652612959328,"speed":68,"heading":322}


-Simon Wu








time series data model

2016-10-20 Thread wxn...@zjqunshuo.com
Hi All,
I'm trying to migrate my time series data, which is GPS traces, from MySQL to C*.
I want a wide row to hold one day of data. I designed the data model as below.
Please help me check whether there is any problem. Any suggestion is appreciated.

Table Model:
CREATE TABLE cargts.eventdata (
deviceid int,
date int,
event_time bigint,
position text,
PRIMARY KEY ((deviceid, date), event_time)
)

A slice of data:
cqlsh:cargts> SELECT * FROM eventdata WHERE deviceid =186628 and date = 
20160928 LIMIT 10;

 deviceid | date | event_time| position
--+--+---+-
   186628 | 20160928 | 1474992002000 |  
{"latitude":30.343443936386247,"longitude":120.08751351828943,"speed":41,"heading":48}
   186628 | 20160928 | 1474992012000 |   
{"latitude":30.34409508979662,"longitude":120.08840022183352,"speed":45,"heading":53}
   186628 | 20160928 | 1474992022000 |   
{"latitude":30.34461639856887,"longitude":120.08946100336443,"speed":28,"heading":65}
   186628 | 20160928 | 1474992032000 |   
{"latitude":30.34469478717028,"longitude":120.08973154015409,"speed":11,"heading":67}
   186628 | 20160928 | 1474992042000 |   
{"latitude":30.34494998929474,"longitude":120.09027263811151,"speed":19,"heading":47}
   186628 | 20160928 | 1474992052000 | 
{"latitude":30.346057349126617,"longitude":120.08967091817931,"speed":41,"heading":323}
   186628 | 20160928 | 1474992062000 |
{"latitude":30.346997145708,"longitude":120.08883508853253,"speed":52,"heading":323}
   186628 | 20160928 | 1474992072000 | 
{"latitude":30.348131044340988,"longitude":120.08774702315581,"speed":65,"heading":321}
   186628 | 20160928 | 1474992082000 | 
{"latitude":30.349438164412838,"longitude":120.08652612959328,"speed":68,"heading":322}

-Simon Wu