Re: Adding datacenter and data verification

2018-09-17 Thread Pradeep Chhetri
Hi Eunsu,

By going through the documentation, I think you are right: you shouldn't
use withUsedHostsPerRemoteDc because it will contact nodes in other
datacenters. No, I don't use withUsedHostsPerRemoteDc; instead I use the
withLocalDc option.
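For reference, a minimal sketch of that client configuration (DataStax Java driver 3.x assumed; the contact point and DC name are placeholders to adapt):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

public final class LocalDcClient {
    public static Cluster build() {
        return Cluster.builder()
                .addContactPoint("10.0.0.1")            // placeholder contact point
                .withLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder()
                        .withLocalDc("datacenter1")     // assumed local DC name
                        .build())                       // note: no withUsedHostsPerRemoteDc()
                .build();
    }
}
```

Without withUsedHostsPerRemoteDc, the policy only ever routes requests to nodes in the named local DC.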

On Tue, Sep 18, 2018 at 11:02 AM, Eunsu Kim  wrote:

> Yes, I altered the system_auth keyspace before adding the data center.
>
> However, I suspect that the new data center did not receive the system_auth
> data and therefore clients could not authenticate against it, because
> altering the keyspace did not give the new data center any replicas.
>
> Do your clients have the 'withUsedHostsPerRemoteDc' option?
>
>
> On 18 Sep 2018, at 1:17 PM, Pradeep Chhetri  wrote:
>
> Hello Eunsu,
>
> I am also using PasswordAuthenticator in my cassandra cluster. I didn't
> come across this issue while doing the exercise on preprod.
>
> Are you sure that you changed the configuration of system_auth keyspace
> before adding the new datacenter using this:
>
> ALTER KEYSPACE system_auth WITH REPLICATION = {'class':
> 'NetworkTopologyStrategy', 'datacenter1': '3'};
>
> Regards,
> Pradeep
>
>
>
> On Tue, Sep 18, 2018 at 7:23 AM, Eunsu Kim  wrote:
>
>>
>> In my case, there were authentication issues when adding data centers.
>>
>> I was using a PasswordAuthenticator.
>>
>> As soon as the datacenter was added, the following authentication error
>> log was recorded on the client log file.
>>
>> com.datastax.driver.core.exceptions.AuthenticationException:
>> Authentication error on host /xxx.xxx.xxx.xx:9042: Provided username apm
>> and/or password are incorrect
>>
>> I was using DCAwareRoundRobinPolicy, but I guess it's probably because of
>> the withUsedHostsPerRemoteDc option.
>>
>> I took several steps and the error log disappeared. It was probably the
>> 'nodetool rebuild' after altering the system_auth keyspace.
>>
>> However, the procedure was not clearly defined.
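The steps Eunsu describes can be sketched roughly as follows (a hedged reconstruction, not a verified runbook; DC names and the superuser are placeholders):

```shell
# 1) Give system_auth replicas in both DCs (DC names are placeholders):
cqlsh -u <superuser> -e "ALTER KEYSPACE system_auth WITH REPLICATION =
  {'class': 'NetworkTopologyStrategy', 'datacenter1': 3, 'datacenter2': 3};"

# 2) On each node of the NEW datacenter, stream the existing data
#    (system_auth included) from the old one:
nodetool rebuild -- datacenter1

# 3) Optionally repair system_auth afterwards, to be safe:
nodetool repair --full system_auth
```

Until the rebuild (or a repair) has run, the new DC's nodes may simply have no credential data to authenticate against.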
>>
>>
>> On 18 Sep 2018, at 2:40 AM, Pradeep Chhetri 
>> wrote:
>>
>> Hello Alain,
>>
>> Thank you very much for reviewing it. Your answer on seed nodes cleared my
>> doubts. I will update it as per your suggestion.
>>
>> I have a few follow-up questions on decommissioning the datacenter:
>>
>> - Do I need to run nodetool repair -full on each of the nodes (old + new
>> dc nodes) before starting the decommissioning process of the old dc?
>> - We have around 15 apps using the cassandra cluster. I want to make sure
>> that all queries before starting the new datacenter are going with the right
>> consistency level, i.e. LOCAL_QUORUM instead of QUORUM. Is there a way I can
>> log the consistency level of each query in some log file?
>>
>> Regards,
>> Pradeep
>>
>> On Mon, Sep 17, 2018 at 9:26 PM, Alain RODRIGUEZ 
>> wrote:
>>
>>> Hello Pradeep,
>>>
>>> It looks good to me and it's a cool runbook for you to follow and for
>>> others to reuse.
>>>
>>> To make sure that cassandra nodes in one datacenter can see the nodes of
>>>> the other datacenter, add the seed node of the new datacenter in any of the
>>>> old datacenter’s nodes and restart that node.
>>>
>>>
>>> Nodes seeing each other across datacenters is not related to seeds.
>>> It's indeed recommended to use seeds from all the datacenters (a couple or
>>> three per DC). I guess it's to increase availability of seed nodes and/or
>>> to make sure local seeds are available.
>>>
>>> You can perfectly (and even have to) add your second datacenter nodes
>>> using seeds from the first data center. A bootstrapping node should never
>>> be in the list of seeds unless it's the first node of the cluster. Add
>>> nodes, then make them seeds.
>>>
>>>
>>> Le lun. 17 sept. 2018 à 11:25, Pradeep Chhetri 
>>> a écrit :
>>>
>>>> Hello everyone,
>>>>
>>>> Can someone please help me in validating the steps i am following to
>>>> migrate cassandra snitch.
>>>>
>>>> Regards,
>>>> Pradeep
>>>>
>>>> On Wed, Sep 12, 2018 at 1:38 PM, Pradeep Chhetri wrote:
>>>>
>>>>> Hello
>>>>>
>>>>> I am running cassandra 3.11.3 5-node cluster on AWS with SimpleSnitch.
>>>>> I was testing the process to migrate to GPFS using AWS region as the
>>>>> datacenter name and AWS zone as the rack name in my preprod environment
>>>>> and was able to achieve it.
>>>>>
>>>>> But before decommissioning the older datacenter, I want to verify that
>>>>> the data in newer dc is in consistence with the one in older dc. Is there
>>>>> any easy way to do that.
>>>>>
>>>>> Do you suggest running a full repair before decommissioning the nodes
>>>>> of older datacenter ?
>>>>>
>>>>> I am using the steps documented here: https://medium.com/p/465e9bf28d99
>>>>> I will be very happy if someone can confirm that I am doing the right
>>>>> steps.
>>>>>
>>>>> Regards,
>>>>> Pradeep
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>
>>
>
>


Re: Adding datacenter and data verification

2018-09-17 Thread Pradeep Chhetri
Hello Eunsu,

I am also using PasswordAuthenticator in my cassandra cluster. I didn't
come across this issue while doing the exercise on preprod.

Are you sure that you changed the configuration of system_auth keyspace
before adding the new datacenter using this:

ALTER KEYSPACE system_auth WITH REPLICATION = {'class':
'NetworkTopologyStrategy', 'datacenter1': '3'};
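Note that the statement above only assigns replicas to the existing DC; once the new datacenter has joined, the keyspace has to be altered again to give it replicas too. A sketch, with 'datacenter2' as a placeholder name:

```sql
-- Before adding the new DC (existing DC only):
ALTER KEYSPACE system_auth WITH REPLICATION =
  {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};

-- After the new DC has joined ('datacenter2' is a placeholder name):
ALTER KEYSPACE system_auth WITH REPLICATION =
  {'class': 'NetworkTopologyStrategy', 'datacenter1': 3, 'datacenter2': 3};
```

After the second ALTER, a 'nodetool rebuild -- datacenter1' on the new nodes is what actually streams the auth data over.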

Regards,
Pradeep



On Tue, Sep 18, 2018 at 7:23 AM, Eunsu Kim  wrote:

>
> In my case, there were authentication issues when adding data centers.
>
> I was using a PasswordAuthenticator.
>
> As soon as the datacenter was added, the following authentication error
> log was recorded on the client log file.
>
> com.datastax.driver.core.exceptions.AuthenticationException:
> Authentication error on host /xxx.xxx.xxx.xx:9042: Provided username apm
> and/or password are incorrect
>
> I was using DCAwareRoundRobinPolicy, but I guess it's probably because of
> the withUsedHostsPerRemoteDc option.
>
> I took several steps and the error log disappeared. It was probably the
> 'nodetool rebuild' after altering the system_auth keyspace.
>
> However, the procedure was not clearly defined.
>
>
> On 18 Sep 2018, at 2:40 AM, Pradeep Chhetri  wrote:
>
> Hello Alain,
>
> Thank you very much for reviewing it. You answer on seed nodes cleared my
> doubts. I will update it as per your suggestion.
>
> I have few followup questions on decommissioning of datacenter:
>
> - Do i need to run nodetool repair -full on each of the nodes (old + new
> dc nodes) before starting the decommissioning process of old dc.
> - We have around 15 apps using cassandra cluster. I want to make sure that
> all queries before starting the new datacenter are going with right
> consistency level i.e LOCAL_QUORUM instead of QUORUM. Is there a way i can
> log the consistency level of each query somehow in some log file.
>
> Regards,
> Pradeep
>
> On Mon, Sep 17, 2018 at 9:26 PM, Alain RODRIGUEZ 
> wrote:
>
>> Hello Pradeep,
>>
>> It looks good to me and it's a cool runbook for you to follow and for
>> others to reuse.
>>
>> To make sure that cassandra nodes in one datacenter can see the nodes of
>>> the other datacenter, add the seed node of the new datacenter in any of the
>>> old datacenter’s nodes and restart that node.
>>
>>
>> Nodes seeing each other from the distinct rack is not related to seeds.
>> It's indeed recommended to use seeds from all the datacenter (a couple or
>> 3). I guess it's to increase availability on seeds node and/or maybe to
>> make sure local seeds are available.
>>
>> You can perfectly (and even have to) add your second datacenter nodes
>> using seeds from the first data center. A bootstrapping node should never
>> be in the list of seeds unless it's the first node of the cluster. Add
>> nodes, then make them seeds.
>>
>>
>> Le lun. 17 sept. 2018 à 11:25, Pradeep Chhetri  a
>> écrit :
>>
>>> Hello everyone,
>>>
>>> Can someone please help me in validating the steps i am following to
>>> migrate cassandra snitch.
>>>
>>> Regards,
>>> Pradeep
>>>
>>> On Wed, Sep 12, 2018 at 1:38 PM, Pradeep Chhetri 
>>> wrote:
>>>
>>>> Hello
>>>>
>>>> I am running cassandra 3.11.3 5-node cluster on AWS with SimpleSnitch.
>>>> I was testing the process to migrate to GPFS using AWS region as the
>>>> datacenter name and AWS zone as the rack name in my preprod environment and
>>>> was able to achieve it.
>>>>
>>>> But before decommissioning the older datacenter, I want to verify that
>>>> the data in newer dc is in consistence with the one in older dc. Is there
>>>> any easy way to do that.
>>>>
>>>> Do you suggest running a full repair before decommissioning the nodes
>>>> of older datacenter ?
>>>>
>>>> I am using the steps documented here: https://medium.com/p/465e9bf28d99
>>>> I will be very happy if someone can confirm me that i am doing the right
>>>> steps.
>>>>
>>>> Regards,
>>>> Pradeep
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>
>


Re: Adding datacenter and data verification

2018-09-17 Thread Pradeep Chhetri
Hello Alain,

Thank you very much for reviewing it. Your answer on seed nodes cleared my
doubts. I will update it as per your suggestion.

I have a few follow-up questions on decommissioning the datacenter:

- Do I need to run nodetool repair -full on each of the nodes (old + new dc
nodes) before starting the decommissioning process of the old dc?
- We have around 15 apps using the cassandra cluster. I want to make sure that
all queries before starting the new datacenter are going with the right
consistency level, i.e. LOCAL_QUORUM instead of QUORUM. Is there a way I can
log the consistency level of each query in some log file?
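There is no built-in per-query consistency log that I know of, but one hedged option (DataStax Java driver 3.x assumed) is to funnel statement execution through a small wrapper and log Statement.getConsistencyLevel(); the driver's QueryLogger is another thing worth checking, though I'd verify what it records:

```java
import java.util.logging.Logger;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;

public final class CLLoggingExecutor {
    private static final Logger LOG = Logger.getLogger("query-cl");

    // Execute a statement, logging the consistency level it will use.
    public static ResultSet execute(Session session, Statement stmt) {
        ConsistencyLevel cl = stmt.getConsistencyLevel(); // null => cluster default
        LOG.info("CL=" + (cl != null ? cl : "cluster-default") + " stmt=" + stmt);
        return session.execute(stmt);
    }
}
```

Routing all 15 apps through a shared wrapper like this (or grepping the resulting log for anything that is not LOCAL_*) would surface stray QUORUM queries before the cutover.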

Regards,
Pradeep

On Mon, Sep 17, 2018 at 9:26 PM, Alain RODRIGUEZ  wrote:

> Hello Pradeep,
>
> It looks good to me and it's a cool runbook for you to follow and for
> others to reuse.
>
> To make sure that cassandra nodes in one datacenter can see the nodes of
>> the other datacenter, add the seed node of the new datacenter in any of the
>> old datacenter’s nodes and restart that node.
>
>
> Nodes seeing each other from the distinct rack is not related to seeds.
> It's indeed recommended to use seeds from all the datacenter (a couple or
> 3). I guess it's to increase availability on seeds node and/or maybe to
> make sure local seeds are available.
>
> You can perfectly (and even have to) add your second datacenter nodes
> using seeds from the first data center. A bootstrapping node should never
> be in the list of seeds unless it's the first node of the cluster. Add
> nodes, then make them seeds.
>
>
> Le lun. 17 sept. 2018 à 11:25, Pradeep Chhetri  a
> écrit :
>
>> Hello everyone,
>>
>> Can someone please help me in validating the steps i am following to
>> migrate cassandra snitch.
>>
>> Regards,
>> Pradeep
>>
>> On Wed, Sep 12, 2018 at 1:38 PM, Pradeep Chhetri 
>> wrote:
>>
>>> Hello
>>>
>>> I am running cassandra 3.11.3 5-node cluster on AWS with SimpleSnitch. I
>>> was testing the process to migrate to GPFS using AWS region as the
>>> datacenter name and AWS zone as the rack name in my preprod environment and
>>> was able to achieve it.
>>>
>>> But before decommissioning the older datacenter, I want to verify that
>>> the data in newer dc is in consistence with the one in older dc. Is there
>>> any easy way to do that.
>>>
>>> Do you suggest running a full repair before decommissioning the nodes of
>>> older datacenter ?
>>>
>>> I am using the steps documented here: https://medium.com/p/465e9bf28d99
>>> I will be very happy if someone can confirm me that i am doing the right
>>> steps.
>>>
>>> Regards,
>>> Pradeep
>>>
>>>
>>>
>>>
>>>
>>


Re: Adding datacenter and data verification

2018-09-17 Thread Pradeep Chhetri
Hello everyone,

Can someone please help me validate the steps I am following to
migrate the cassandra snitch?

Regards,
Pradeep

On Wed, Sep 12, 2018 at 1:38 PM, Pradeep Chhetri 
wrote:

> Hello
>
> I am running cassandra 3.11.3 5-node cluster on AWS with SimpleSnitch. I
> was testing the process to migrate to GPFS using AWS region as the
> datacenter name and AWS zone as the rack name in my preprod environment and
> was able to achieve it.
>
> But before decommissioning the older datacenter, I want to verify that the
> data in newer dc is in consistence with the one in older dc. Is there any
> easy way to do that.
>
> Do you suggest running a full repair before decommissioning the nodes of
> older datacenter ?
>
> I am using the steps documented here: https://medium.com/p/465e9bf28d99 I
> will be very happy if someone can confirm me that i am doing the right
> steps.
>
> Regards,
> Pradeep
>
>
>
>
>


Adding datacenter and data verification

2018-09-12 Thread Pradeep Chhetri
Hello

I am running a 5-node cassandra 3.11.3 cluster on AWS with SimpleSnitch. I
was testing the process to migrate to GPFS using the AWS region as the
datacenter name and the AWS zone as the rack name in my preprod environment,
and was able to achieve it.

But before decommissioning the older datacenter, I want to verify that the
data in the newer dc is consistent with the data in the older dc. Is there
any easy way to do that?
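One low-tech way to spot-check (not an official tool, just a hedged sketch): fetch the same sample of rows through a coordinator in each DC and compare an order-independent digest of the results:

```python
import hashlib

def digest_rows(rows):
    """Order-independent digest of (key, value) row pairs."""
    h = hashlib.sha256()
    for key, value in sorted(rows):
        h.update(("%s=%s;" % (key, value)).encode("utf-8"))
    return h.hexdigest()

# Example: rows fetched from each DC (hypothetical data)
old_dc_rows = [("user:1", "alice"), ("user:2", "bob")]
new_dc_rows = [("user:2", "bob"), ("user:1", "alice")]  # same data, any order

print(digest_rows(old_dc_rows) == digest_rows(new_dc_rows))  # True if in sync
```

In practice you'd fetch the rows via cqlsh or the driver at LOCAL_QUORUM against each DC; for authoritative verification, a full repair remains the safer route.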

Do you suggest running a full repair before decommissioning the nodes of the
older datacenter?

I am using the steps documented here: https://medium.com/p/465e9bf28d99
I will be very happy if someone can confirm that I am doing the right
steps.

Regards,
Pradeep


Re: Default Single DataCenter -> Multi DataCenter

2018-09-10 Thread Pradeep Chhetri
Hello Eunsu,

I am going through the same exercise at my job. I was making notes as I was
testing the steps in my preproduction environment. Although I haven't
tested end to end, hopefully this might help you:
https://medium.com/p/465e9bf28d99

Regards,
Pradeep

On Mon, Sep 10, 2018 at 5:59 PM, Alain RODRIGUEZ  wrote:

> Adding a data center for the first time is a bit tricky when you
> haven't been considering it from the start.
>
> I operate 5 nodes cluster (3.11.0) in a single data center with
>> SimpleSnitch, SimpleStrategy and all client policy RoundRobin.
>>
>
> You will need:
>
> - To change clients, make them 'DCAware'. This depends on the client, but
> you should be able to find this in your Cassandra driver (client side).
> - To change clients, make them use 'LOCAL_' consistency
> ('LOCAL_ONE'/'LOCAL_QUORUM' being the most common).
> - To change 'SimpleSnitch' for 'EC2Snitch' or
> 'GossipingPropertyFileSnitch' for example, depending on your
> context/preference
> - To change 'SimpleStrategy' for 'NetworkTopologyStrategy' for all the
> keyspaces, with the desired RF. I take the chance to say that switching to
> 1 replica only is often a mistake, you can indeed have data loss (which you
> accept) but also service going down, anytime you restart a node or that a
> node goes down. If you are ok with RF=1, RDBMS might be a better choice.
> It's an anti-pattern of some kind to run Cassandra with RF=1. Yet up to
> you, this is not our topic :). In the same kind of off-topic
> recommendations, I would not stick with C*3.11.0, but go to C*3.11.3 (if
> you do not perform slice delete, there is still a bug with this apparently)
>
> So this all needs to be done *before* starting adding the new data
> center. Changing the snitch is tricky, make sure that the new snitch uses
> the racks and dc names currently in use in your cluster for the current
> cluster, if not the data could not be accessible after the configuration
> change.
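The keyspace part of the checklist above boils down to a statement like this (keyspace name, DC name, and replication factor are placeholders to adapt):

```sql
-- Move each keyspace off SimpleStrategy before adding the new DC:
ALTER KEYSPACE my_keyspace WITH REPLICATION =
  {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
-- Repeat for every non-local keyspace, system_auth included.
```

The DC name given here must exactly match what the new snitch reports for the existing nodes, otherwise data can become unreachable, as the paragraph above warns.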
>
> Then the procedure to add a data center is probably described around. I
> know I did this detailed description in 2014, here it is:
> https://mail-archives.apache.org/mod_mbox/cassandra-user/201406.mbox/%3CCA+VSrLopop7Th8nX20aOZ3As75g2jrJm3ryx119deklynhq...@mail.gmail.com%3E,
> but you might find better/more recent documentation than this one for this
> relatively common process, like the documentation you linked.
>
> If you are not confident or have doubts, you can share more about the
> context and post your exact plan, as I did years ago in the mail previously
> linked. People here should be able to confirm the process is ok before you
> move forward, giving you an extra confidence.
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> Le lun. 10 sept. 2018 à 11:05, Eunsu Kim  a
> écrit :
>
>> Hello everyone
>>
>> I operate 5 nodes cluster (3.11.0) in a single data center with
>> SimpleSnitch, SimpleStrategy and all client policy RoundRobin.
>>
>> At this point, I am going to create clusters of the same size in
>> different data centers.
>>
>> I think these two documents are appropriate, but there is confusion
>> because they are referenced to each other.
>>
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSwitchSnitch.html
>>
>> Anyone who can clearly guide the order? Currently RF is 2 and I want to
>> have only one replica in the NetworkTopologyStrategy.
>> A little data loss is okay.
>>
>> Thank you in advanced..
>>
>>
>>
>>
>>
>>
>>


Re: Upgrade from 2.1 to 3.11

2018-08-28 Thread Pradeep Chhetri
You may want to try upgrading to 3.11.3 instead, which has some memory leak
fixes.

On Tue, Aug 28, 2018 at 9:59 AM, Mun Dega  wrote:

> I am surprised that no one else ran into any issues with this version.  GC
> can't catch up fast enough and there is constant Full GC taking place.
>
> The result? Unresponsive nodes making the entire cluster unusable.
>
> Any insight on this issue from anyone that is using this version would be
> appreciated.
>
> Ma
>
> On Fri, Aug 24, 2018, 04:30 Mohamadreza Rostami <
> mohamadrezarosta...@gmail.com> wrote:
>
>> You have a very large heap; it takes most of the CPU time in the GC stage.
>> You should set the heap to at most 12 GB and enable the row cache to make
>> your cluster faster.
>>
>> On Friday, 24 August 2018, Mun Dega  wrote:
>>
>>> 120G data
>>> 28G heap out of 48 on system
>>> 9 node cluster, RF3
>>>
>>>
>>> On Thu, Aug 23, 2018, 17:19 Mohamadreza Rostami <
>>> mohamadrezarosta...@gmail.com> wrote:
>>>
 Hi,
 How much data do you have? How much RAM do your servers have? How large
 is your heap?
 On Thu, Aug 23, 2018 at 10:14 PM Mun Dega  wrote:

> Hello,
>
> We recently upgraded from Cassandra 2.1 to 3.11.2 on one cluster.  The
> process went OK including upgradesstable but we started to experience high
> latency for r/w, occasional OOM and long GC pause after.
>
> For the same cluster with 2.1, we didn't have any issues like this.  We
> also kept server specs, heap, all the same in post upgrade
>
> Has anyone else had similar issues going to 3.11 and what are the
> major changes that could have such a major setback in the new version?
>
> Ma Dega
>



Re: Switching Snitch

2018-08-26 Thread Pradeep Chhetri
Hello Joshua

Thank you very much for the replies. I will go through the tickets you sent
to understand the process.

Regards,
Pradeep

On Mon, Aug 27, 2018 at 6:15 AM, rajasekhar kommineni 
wrote:

> Hi Pradeep,
>
> For changing the snitch you have to decommission and re-add the node with
> the new snitch and updated properties files.
>
> Thanks,
>
>
> On Aug 26, 2018, at 2:15 PM, Joshua Galbraith wrote:
>
> Pradeep,
>
> Here are some related tickets that may also be helpful in understanding
> the current behavior of these options.
>
> * https://issues.apache.org/jira/browse/CASSANDRA-5897
> * https://issues.apache.org/jira/browse/CASSANDRA-9474
> * https://issues.apache.org/jira/browse/CASSANDRA-10243
> * https://issues.apache.org/jira/browse/CASSANDRA-10242
>
> On Sun, Aug 26, 2018 at 1:20 PM, Joshua Galbraith wrote:
>
>> Pradeep,
>>
>> That being said, I haven't experimented with -Dcassandra.ignore_dc=true
>> -Dcassandra.ignore_rack=true before.
>>
>> The description here may be helpful:
>> https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L685-L693
>>
>> I would spin up a small test cluster with data you don't care about and
>> verify that your above assumptions are correct there first.
>>
>> On Sun, Aug 26, 2018 at 1:09 PM, Joshua Galbraith <
>> jgalbra...@newrelic.com> wrote:
>>
>>> Pradeep.
>>>
>>> Right, so from that documentation is sounds like you actually have to
>>> stop all nodes in the cluster at once and bring them back up one at a time.
>>> A rolling restart won't work here.
>>>
>>> On Sun, Aug 26, 2018 at 11:46 AM, Pradeep Chhetri >> > wrote:
>>>
>>>> Hi Joshua,
>>>>
>>>> Thank you for the reply. Sorry i forgot to mention that I already went
>>>> through that documentation. There are few missing things regarding which I
>>>> have few questions:
>>>>
>>>> 1) One thing which isn't mentioned there is that cassandra fails to
>>>> restart when we change the datacenter name *or* rack name of a node.
>>>> So whether should i first rolling restart cassandra with flag
>>>> "-Dcassandra.ignore_dc=true -Dcassandra.ignore_rack=true", then run
>>>> sequential repair and then cleanup and then rolling restart cassandra
>>>> without that flag.
>>>>
>>>> 2) Should i not allow any read/write operation from applications during
>>>> the time when sequential repair is running.
>>>>
>>>> Regards,
>>>> Pradeep
>>>>
>>>> On Mon, Aug 27, 2018 at 12:19 AM, Joshua Galbraith <
>>>> jgalbra...@newrelic.com.invalid> wrote:
>>>>
>>>>> Pradeep, it sounds like what you're proposing counts as a topology
>>>>> change because you are changing the datacenter name and rack name.
>>>>>
>>>>> Please refer to the documentation here about what to do in that
>>>>> situation:
>>>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSwitchSnitch.html
>>>>>
>>>>> In particular:
>>>>>
>>>>> Simply altering the snitch and replication to move some nodes to a new
>>>>>> datacenter will result in data being replicated incorrectly.
>>>>>
>>>>>
>>>>> Topology changes may occur when the replicas are placed in different
>>>>>> places by the new snitch. Specifically, the replication strategy places 
>>>>>> the
>>>>>> replicas based on the information provided by the new snitch.
>>>>>
>>>>>
>>>>> If the topology of the network has changed, but no datacenters are
>>>>>> added:
>>>>>> a. Shut down all the nodes, then restart them.
>>>>>> b. Run a sequential repair and nodetool cleanup on each node.
>>>>>
>>>>>
>>>>> On Sun, Aug 26, 2018 at 11:14 AM, Pradeep Chhetri <
>>>>> prad...@stashaway.com> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> Since i didn't hear from anyone, just want to describe my question
>>>>>> again:
>>>>>>
>>>>>> Am i correct in understanding that i need to do following steps to
>>>>>> migrate data from SimpleSnitch to GPFS changing datacenter name and rack
>>>>>> name to AWS region and Availability zone respectively

Re: Switching Snitch

2018-08-26 Thread Pradeep Chhetri
Hi Joshua,

Thank you for the reply. Sorry i forgot to mention that I already went
through that documentation. There are few missing things regarding which I
have few questions:

1) One thing which isn't mentioned there is that cassandra fails to restart
when we change the datacenter name *or* rack name of a node. So should I
first do a rolling restart of cassandra with the flag "-Dcassandra.ignore_dc=true
-Dcassandra.ignore_rack=true", then run a sequential repair and a cleanup,
and then rolling-restart cassandra without that flag?

2) Should I keep applications from doing any read/write operations while the
sequential repair is running?

Regards,
Pradeep

On Mon, Aug 27, 2018 at 12:19 AM, Joshua Galbraith <
jgalbra...@newrelic.com.invalid> wrote:

> Pradeep, it sounds like what you're proposing counts as a topology change
> because you are changing the datacenter name and rack name.
>
> Please refer to the documentation here about what to do in that situation:
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSwitchSnitch.html
>
> In particular:
>
> Simply altering the snitch and replication to move some nodes to a new
>> datacenter will result in data being replicated incorrectly.
>
>
> Topology changes may occur when the replicas are placed in different
>> places by the new snitch. Specifically, the replication strategy places the
>> replicas based on the information provided by the new snitch.
>
>
> If the topology of the network has changed, but no datacenters are added:
>> a. Shut down all the nodes, then restart them.
>> b. Run a sequential repair and nodetool cleanup on each node.
>
>
> On Sun, Aug 26, 2018 at 11:14 AM, Pradeep Chhetri 
> wrote:
>
>> Hello everyone,
>>
>> Since i didn't hear from anyone, just want to describe my question again:
>>
>> Am i correct in understanding that i need to do following steps to
>> migrate data from SimpleSnitch to GPFS changing datacenter name and rack
>> name to AWS region and Availability zone respectively
>>
>> 1) Update the rack and datacenter fields in cassandra-rackdc.properties
>> file and rolling restart cassandra with this flag
>> "-Dcassandra.ignore_dc=true -Dcassandra.ignore_rack=true"
>>
>> 2) Run nodetool repair --sequential and nodetool cleanup.
>>
>> 3) Rolling restart cassandra removing the flag  "-Dcassandra.ignore_dc=true
>> -Dcassandra.ignore_rack=true"
>>
>> Regards,
>> Pradeep
>>
>> On Thu, Aug 23, 2018 at 10:53 PM, Pradeep Chhetri 
>> wrote:
>>
>>> Hello,
>>>
>>> I am currently running a 3.11.2 cluster in SimpleSnitch hence the
>>> datacenter is datacenter1 and rack is rack1 for all nodes on AWS. I want to
>>> switch to GPFS by changing the rack name to the availability-zone name and
>>> datacenter name to region name.
>>>
>>> When I try to restart individual nodes by changing those values, it
>>> failed to start throwing the error about dc and rack name mismatch but
>>> gives me an option to set ignore_dc and ignore_rack to true to bypass it.
>>>
>>> I am not sure if it is safe to set those two flags to true and if there
>>> is any drawback now or in future when i add a new datacenter to the
>>> cluster. I went through the documentation on Switching Snitches but didn't
>>> get much explanation.
>>>
>>> Regards,
>>> Pradeep
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> *Joshua Galbraith *| Lead Software Engineer | New Relic
>


Re: Switching Snitch

2018-08-26 Thread Pradeep Chhetri
Hello everyone,

Since i didn't hear from anyone, just want to describe my question again:

Am I correct in understanding that I need to do the following steps to migrate
data from SimpleSnitch to GPFS, changing the datacenter name and rack name to
the AWS region and Availability Zone respectively:

1) Update the rack and datacenter fields in cassandra-rackdc.properties
file and rolling restart cassandra with this flag
"-Dcassandra.ignore_dc=true -Dcassandra.ignore_rack=true"

2) Run nodetool repair --sequential and nodetool cleanup.

3) Rolling restart cassandra removing the flag  "-Dcassandra.ignore_dc=true
-Dcassandra.ignore_rack=true"
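Expressed as a hedged ops sketch (flag placement depends on your install; verify against the switching-snitch docs before running):

```shell
# 1) Add the ignore flags (e.g. in cassandra-env.sh), then restart each
#    node in turn with the new dc/rack values in cassandra-rackdc.properties:
JVM_OPTS="$JVM_OPTS -Dcassandra.ignore_dc=true -Dcassandra.ignore_rack=true"

# 2) Once the whole cluster is up with the new names:
nodetool repair --sequential
nodetool cleanup

# 3) Remove the two flags from cassandra-env.sh and do one more
#    rolling restart.
```

Note the DataStax docs quoted later in this thread suggest a full stop/start of all nodes rather than a rolling restart when the topology changes, so treat the rolling variant with caution.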

Regards,
Pradeep

On Thu, Aug 23, 2018 at 10:53 PM, Pradeep Chhetri 
wrote:

> Hello,
>
> I am currently running a 3.11.2 cluster in SimpleSnitch hence the
> datacenter is datacenter1 and rack is rack1 for all nodes on AWS. I want to
> switch to GPFS by changing the rack name to the availability-zone name and
> datacenter name to region name.
>
> When I try to restart individual nodes by changing those values, it failed
> to start throwing the error about dc and rack name mismatch but gives me an
> option to set ignore_dc and ignore_rack to true to bypass it.
>
> I am not sure if it is safe to set those two flags to true and if there is
> any drawback now or in future when i add a new datacenter to the cluster. I
> went through the documentation on Switching Snitches but didn't get much
> explanation.
>
> Regards,
> Pradeep
>
>
>
>
>
>
>
>


Switching Snitch

2018-08-23 Thread Pradeep Chhetri
Hello,

I am currently running a 3.11.2 cluster with SimpleSnitch, hence the
datacenter is datacenter1 and the rack is rack1 for all nodes on AWS. I want
to switch to GPFS by changing the rack name to the availability-zone name and
the datacenter name to the region name.

When I try to restart individual nodes after changing those values, they fail
to start, throwing an error about a dc and rack name mismatch, but give me an
option to set ignore_dc and ignore_rack to true to bypass it.

I am not sure if it is safe to set those two flags to true and if there is
any drawback now or in future when i add a new datacenter to the cluster. I
went through the documentation on Switching Snitches but didn't get much
explanation.

Regards,
Pradeep


Re: C* in multiple AWS AZ's

2018-06-29 Thread Pradeep Chhetri
Ohh, I see now. It makes sense. Thanks a lot.

On Fri, Jun 29, 2018 at 9:17 PM, Randy Lynn  wrote:

> data is only lost if you stop the node. between restarts the storage is
> fine.
>
> On Fri, Jun 29, 2018 at 10:39 AM, Pradeep Chhetri 
> wrote:
>
>> Isnt NVMe storage an instance storage ie. the data will be lost in case
>> the instance restarts. How are you going to make sure that there is no data
>> loss in case instance gets rebooted?
>>
>> On Fri, 29 Jun 2018 at 7:00 PM, Randy Lynn  wrote:
>>
>>> GPFS - Rahul FTW! Thank you for your help!
>>>
>>> Yes, Pradeep - migrating to i3 from r3. moving for NVMe storage, I did
>>> not have the benefit of doing benchmarks.. but we're moving from 1,500 IOPS
>>> so I intrinsically know we'll get better throughput.
>>>
>>> On Fri, Jun 29, 2018 at 7:21 AM, Rahul Singh <
>>> rahul.xavier.si...@gmail.com> wrote:
>>>
>>>> Totally agree. GPFS for the win. EC2 multi region snitch is an
>>>> automation tool like Ansible or Puppet. Unless you have two orders of
>>>> magnitude more servers than you do now, you don’t need it.
>>>>
>>>> Rahul
>>>> On Jun 29, 2018, 6:18 AM -0400, kurt greaves ,
>>>> wrote:
>>>>
>>>> Yes. You would just end up with a rack named differently to the AZ.
>>>> This is not a problem as racks are just logical. I would recommend
>>>> migrating all your DCs to GPFS though for consistency.
>>>>
>>>> On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:
>>>>
>>>>> So we have two data centers already running..
>>>>>
>>>>> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site
>>>>> tunnel.. I'm wanting to move the current US-EAST from AZ 1a to 1e..
>>>>> I know all docs say use ec2multiregion for multi-DC.
>>>>>
>>>>> I like the GPFS idea. would that work with the multi-DC too?
>>>>> What's the downside? status would report rack of 1a, even though in 1e?
>>>>>
>>>>> Thanks in advance for the help/thoughts!!
>>>>>
>>>>>
>>>>> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
>>>>> wrote:
>>>>>
>>>>>> There is a need for a repair with both DCs as rebuild will not stream
>>>>>> all replicas, so unless you can guarantee you were perfectly consistent 
>>>>>> at
>>>>>> time of rebuild you'll want to do a repair after rebuild.
>>>>>>
>>>>>> On another note you could just replace the nodes but use GPFS instead
>>>>>> of EC2 snitch, using the same rack name.
>>>>>>
>>>>>> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, <
>>>>>> rahul.xavier.si...@gmail.com> wrote:
>>>>>>
>>>>>>> Parallel load is the best approach and then switch your Data access
>>>>>>> code to only access the new hardware. After you verify that there are no
>>>>>>> local read / writes on the OLD dc and that the updates are only via 
>>>>>>> Gossip,
>>>>>>> then go ahead and change the replication factor on the key space to have
>>>>>>> zero replicas in the old DC. Then you can decommissioned.
>>>>>>>
>>>>>>> This way you are hundred percent sure that you aren’t missing any
>>>>>>> new data. No need for a DC to DC repair but a repair is always healthy.
>>>>>>>
>>>>>>> Rahul
>>>>>>> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn ,
>>>>>>> wrote:
>>>>>>>
>>>>>>> Already running with Ec2.
>>>>>>>
>>>>>>> My original thought was a new DC parallel to the current, and then
>>>>>>> decommission the other DC.
>>>>>>>
>>>>>>> Also my data load is small right now.. I know small is relative
>>>>>>> term.. each node is carrying about 6GB..
>>>>>>>
>>>>>>> So given the data size, would you go with parallel DC or let the new
>>>>>>> AZ carry a heavy load until the others are migrated over?
>>>>>>> and then I think "repair" to cleanup the replications?
>>>>>>>
>>>>>>>
>>>>>>> On Thu

Re: C* in multiple AWS AZ's

2018-06-29 Thread Pradeep Chhetri
Isn't NVMe storage instance storage, i.e. the data will be lost if the
instance restarts? How are you going to make sure that there is no data
loss if the instance gets rebooted?

On Fri, 29 Jun 2018 at 7:00 PM, Randy Lynn  wrote:

> GPFS - Rahul FTW! Thank you for your help!
>
> Yes, Pradeep - migrating to i3 from r3. moving for NVMe storage, I did not
> have the benefit of doing benchmarks.. but we're moving from 1,500 IOPS so
> I intrinsically know we'll get better throughput.
>
> On Fri, Jun 29, 2018 at 7:21 AM, Rahul Singh  > wrote:
>
>> Totally agree. GPFS for the win. EC2 multi region snitch is an automation
>> tool like Ansible or Puppet. Unless you have two orders of magnitude more
>> servers than you do now, you don’t need it.
>>
>> Rahul
>> On Jun 29, 2018, 6:18 AM -0400, kurt greaves ,
>> wrote:
>>
>> Yes. You would just end up with a rack named differently to the AZ. This
>> is not a problem as racks are just logical. I would recommend migrating all
>> your DCs to GPFS though for consistency.
>>
>> On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:
>>
>>> So we have two data centers already running..
>>>
>>> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site
>>> tunnel.. I'm wanting to move the current US-EAST from AZ 1a to 1e..
>>> I know all docs say use ec2multiregion for multi-DC.
>>>
>>> I like the GPFS idea. would that work with the multi-DC too?
>>> What's the downside? status would report rack of 1a, even though in 1e?
>>>
>>> Thanks in advance for the help/thoughts!!
>>>
>>>
>>> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
>>> wrote:
>>>
 There is a need for a repair with both DCs as rebuild will not stream
 all replicas, so unless you can guarantee you were perfectly consistent at
 time of rebuild you'll want to do a repair after rebuild.

 On another note you could just replace the nodes but use GPFS instead
 of EC2 snitch, using the same rack name.

 On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
 wrote:

> Parallel load is the best approach and then switch your Data access
> code to only access the new hardware. After you verify that there are no
> local read / writes on the OLD dc and that the updates are only via 
> Gossip,
> then go ahead and change the replication factor on the key space to have
> zero replicas in the old DC. Then you can decommission it.
>
> This way you are hundred percent sure that you aren’t missing any new
> data. No need for a DC to DC repair but a repair is always healthy.
>
> Rahul
> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn ,
> wrote:
>
> Already running with Ec2.
>
> My original thought was a new DC parallel to the current, and then
> decommission the other DC.
>
> Also my data load is small right now.. I know small is relative term..
> each node is carrying about 6GB..
>
> So given the data size, would you go with parallel DC or let the new
> AZ carry a heavy load until the others are migrated over?
> and then I think "repair" to cleanup the replications?
>
>
> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
> rahul.xavier.si...@gmail.com> wrote:
>
>> You don’t have to use EC2 snitch on AWS but if you have already
>> started with it , it may put a node in a different DC.
>>
>> If your data density won’t be ridiculous You could add 3 to different
>> DC/ Region and then sync up. After the new DC is operational you can 
>> remove
>> one at a time on the old DC and at the same time add to the new one.
>>
>> Rahul
>> On Jun 28, 2018, 9:03 AM -0500, Randy Lynn ,
>> wrote:
>>
>> I have a 6-node cluster I'm migrating to the new i3 types.
>> But at the same time I want to migrate to a different AZ.
>>
>> What happens if I do the "running node replace method" with 1 node at
>> a time moving to the new AZ. Meaning, I'll have temporarily;
>>
>> 5 nodes in AZ 1c
>> 1 new node in AZ 1e.
>>
>> I'll wash-rinse-repeat till all 6 are on the new machine type and in
>> the new AZ.
>>
>> Any thoughts about whether this gets weird with the Ec2Snitch and a
>> RF 3?
>>
>> --
>> Randy Lynn
>> rl...@getavail.com
>>
>> office:
>> 859.963.1616 <+1-859-963-1616> ext 202
>> 163 East Main Street - Lexington, KY 40507 - USA
>> 
>>
>>  getavail.com 
>>
>>
>
>
> --
> Randy Lynn
> rl...@getavail.com
>
> office:
> 859.963.1616 <+1-859-963-1616> ext 202
> 163 East Main Street - Lexington, KY 40507 - USA
> 
>
>  getavail.com 
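
The GPFS endorsed above is the GossipingPropertyFileSnitch; per node it is driven by cassandra-rackdc.properties, and kurt's point about keeping the same rack name translates into a sketch like this (the path and the dc/rack names are illustrative, not from the thread):

```shell
# Sketch only: switching a node from Ec2Snitch to GPFS
# (GossipingPropertyFileSnitch). The dc/rack values must match what
# Ec2Snitch was already reporting (dc = region, rack = AZ suffix, e.g.
# us-east / 1a), otherwise replica placement would change.
mkdir -p /tmp/cassandra-conf
cat > /tmp/cassandra-conf/cassandra-rackdc.properties <<'EOF'
dc=us-east
rack=1a
EOF
# cassandra.yaml would then set: endpoint_snitch: GossipingPropertyFileSnitch
cat /tmp/cassandra-conf/cassandra-rackdc.properties
```

Keeping rack=1a configured on a node that physically sits in 1e is exactly the "downside" asked about above: nodetool status reports the configured rack, not the real AZ.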

Re: C* in multiple AWS AZ's

2018-06-28 Thread Pradeep Chhetri
Just curious -

From which instance type are you migrating to the i3 type, and what are the
reasons to move to i3?

Are you going to benefit from the NVMe instance storage - if yes, how?

We are also migrating our cluster on AWS - we are currently using r4
instances, so I was interested to know if you did a comparison between the
r4 and i3 types.

Regards,
Pradeep

On Fri, Jun 29, 2018 at 4:49 AM, Randy Lynn  wrote:

> So we have two data centers already running..
>
> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site tunnel..
> I'm wanting to move the current US-EAST from AZ 1a to 1e..
> I know all docs say use ec2multiregion for multi-DC.
>
> I like the GPFS idea. would that work with the multi-DC too?
> What's the downside? status would report rack of 1a, even though in 1e?
>
> Thanks in advance for the help/thoughts!!
>
>
> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
> wrote:
>
>> There is a need for a repair with both DCs as rebuild will not stream all
>> replicas, so unless you can guarantee you were perfectly consistent at time
>> of rebuild you'll want to do a repair after rebuild.
>>
>> On another note you could just replace the nodes but use GPFS instead of
>> EC2 snitch, using the same rack name.
>>
>> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
>> wrote:
>>
>>> Parallel load is the best approach and then switch your Data access code
>>> to only access the new hardware. After you verify that there are no local
>>> read / writes on the OLD dc and that the updates are only via Gossip, then
>>> go ahead and change the replication factor on the key space to have zero
>>> replicas in the old DC. Then you can decommission it.
>>>
>>> This way you are hundred percent sure that you aren’t missing any new
>>> data. No need for a DC to DC repair but a repair is always healthy.
>>>
>>> Rahul
>>> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn , wrote:
>>>
>>> Already running with Ec2.
>>>
>>> My original thought was a new DC parallel to the current, and then
>>> decommission the other DC.
>>>
>>> Also my data load is small right now.. I know small is relative term..
>>> each node is carrying about 6GB..
>>>
>>> So given the data size, would you go with parallel DC or let the new AZ
>>> carry a heavy load until the others are migrated over?
>>> and then I think "repair" to cleanup the replications?
>>>
>>>
>>> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
>>> rahul.xavier.si...@gmail.com> wrote:
>>>
 You don’t have to use EC2 snitch on AWS but if you have already started
 with it , it may put a node in a different DC.

 If your data density won’t be ridiculous You could add 3 to different
 DC/ Region and then sync up. After the new DC is operational you can remove
 one at a time on the old DC and at the same time add to the new one.

 Rahul
 On Jun 28, 2018, 9:03 AM -0500, Randy Lynn , wrote:

 I have a 6-node cluster I'm migrating to the new i3 types.
 But at the same time I want to migrate to a different AZ.

 What happens if I do the "running node replace method" with 1 node at a
 time moving to the new AZ. Meaning, I'll have temporarily;

 5 nodes in AZ 1c
 1 new node in AZ 1e.

 I'll wash-rinse-repeat till all 6 are on the new machine type and in
 the new AZ.

 Any thoughts about whether this gets weird with the Ec2Snitch and a RF
 3?



>>>
>>>
>>>
>>>
>
>
>


Re: copy sstables while cassandra is running

2018-06-24 Thread Pradeep Chhetri
I doubt mv will run instantly, because the copy is across two different
filesystems.

On Sun, 24 Jun 2018 at 9:26 PM, Nitan Kainth  wrote:

> To be safe you could follow the below process on each node, one at a time:
> Stop Cassandra
> Move sstable— mv will do it instantly
> Start Cassandra
>
> If you do it online, a read request that comes for the same data while it is
> being moved will fail.
>
> On Jun 23, 2018, at 11:23 PM, onmstester onmstester 
> wrote:
>
> Hi
> I'm using two directories on different disks as cassandra data storage;
> the small disk is 90% full and the bigger disk is 30% full (the bigger one
> was added later, when we found out we needed more storage!!),
> so I want to move all data to the big disk. One way is to stop my
> application and copy all sstables from the small disk to the big one, but that
> would take some hours and is not acceptable due to QoS.
> I thought maybe I could copy the big sstables (the ones that won't be
> compacted for weeks) to the big disk (near the cassandra data but not right
> there) while cassandra and my app are still running,
> then stop cassandra and my app, move the big files to the exact directory of
> cassandra data on the big disk (would take a few seconds) and then move the
> remaining small sstables from the small disk to the big one.
> Are all of the sstable-related files (data, index, summary, ...) immutable,
> changed only by compactions? Any better workaround for this scenario would
> be appreciated.
> Thanks in Advance
>
> Sent using Zoho Mail 
>
>
>
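
The hybrid procedure described above can be sketched as follows, simulated here with local directories (the real data paths and the cassandra stop/start, e.g. `nodetool drain` followed by stopping the service, are assumptions to adapt):

```shell
small=/tmp/move-demo/small; stage=/tmp/move-demo/big-stage; big=/tmp/move-demo/big-data
mkdir -p "$small" "$stage" "$big"
printf 'cold' > "$small/mc-10-big-Data.db"   # large sstable that won't compact for weeks
printf 'hot'  > "$small/mc-11-big-Data.db"   # small sstable that may still be compacted

# Phase 1, node still running: pre-copy the big immutable sstables to a
# staging dir on the big disk. Safe because a written sstable never changes;
# compaction only deletes whole files, so re-check they still exist at cutover.
cp "$small/mc-10-big-Data.db" "$stage/"

# Phase 2, node stopped: mv within the big disk is a rename (seconds),
# then move the remaining small files across and restart cassandra.
mv "$stage/mc-10-big-Data.db" "$big/"
mv "$small/mc-11-big-Data.db" "$big/"
ls "$big"
```

The key property this relies on is the one asked about in the thread: sstable data/index/summary files are immutable once written, and only compaction removes them.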


Re: Single Host: Fix "Unknown CF" issue

2018-06-06 Thread Pradeep Chhetri
Hi Michael,

We have faced the same situation as yours in our production environment
where we suddenly got "Unknown CF Exception" for materialized views too. We
are using Lagom apps with cassandra for persistence. In our case, since
these views can be regenerated from the original events, we were able to
safely recover.

Few suggestions from my operations experience:

1) Upgrade your cassandra cluster to 3.11.2 because there are lots of bug
fixes specific to materialized views.
2) Never let your application create/update/delete cassandra
tables/materialized views. Always create them manually to make sure that
only one connection is doing the operation.

Regards,
Pradeep



On Wed, Jun 6, 2018 at 9:44 PM,  wrote:

> Hi Evelyn,
>
> thanks a lot for your detailed response message.
>
> The data is not important. We've already wiped the data and created a new
> cassandra installation. The data re-import task is already running. We've
> lost the data for a couple of months but in this case this does not matter.
>
> Nevertheless we will try what you told us - just to be smarter/faster if
> this happens in production (where we will setup a cassandra cluster with
> multiple cassandra nodes anyway). I will drop you a note when we are done.
>
> Hmmm... the problem is within a "View". Are this the materialized views?
>
> I'm asking this because:
> * Someone on the internet (stackoverflow if I recall correctly) mentioned
> that materialized views are to be deprecated.
> * I had been on a datastax workshop in Zurich a couple of days ago where a
> datastax employee told me that we should not use materialized views - it is
> better to create & fill all tables directly.
>
> Would you also recommend not to use materialized views? As this problem is
> related to a view - maybe we could avoid this problem simply by following
> this recommendation.
>
> Thanks a lot again!
>
> Greetings,
> Michael
>
>
>
>
> On 06.06.2018 16:48, Evelyn Smith wrote:
>
>> Hi Michael,
>>
>> So I looked at the code, here are some stages of your error message:
>> 1. at
>> org.apache.cassandra.service.CassandraDaemon.setup(Cassandra
>> Daemon.java:292)
>> [apache-cassandra-3.11.0.jar:3.11.0
>>  At this step Cassandra is running through the keyspaces in its
>> schema, turning off compactions for all tables before it starts
>> rerunning the commit log (so it isn’t an issue with the commit log).
>> 2. at org.apache.cassandra.db.Keyspace.open(Keyspace.java:127)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>  Loading key space related to the column family that is erroring out
>> 3. at org.apache.cassandra.db.Keyspace.(Keyspace.java:324)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>  Cassandra has initialised the column family and is reloading the view
>> 4. at
>> org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:204)
>> ~[apache-cassandra-3.11.0.jar:3.11.0]
>>  At this point I haven’t had enough time to tell if Cassandra is
>> requesting info on a column specifically or still requesting
>> information on a column family. Regardless, given we already rule out
>> issues with the SSTables and their directory and Cassandra is yet to
>> start processing the commit log this to me suggests it’s something
>> wrong in one of the system keyspaces storing the schema information.
>>
>> There should definitely be a way to resolve this with zero data loss
>> by either:
>> 1. Fixing the issue in the system keyspace SSTables (hard)
>> 2. Rerunning the commit log on a new Cassandra node that has been
>> restored from the current one (I’m not sure if this is possible but
>> I’ll figure it out tomorrow)
>>
>> The alternative is if you are ok with losing the commitlog then you
>> can backup the data and restore it to a new node (or the same node but
>> with everything blown away). This isn’t a trivial process though
>> I’ve done it a few times.
>>
>> How important is the data?
>>
>> Happy to come back to this tomorrow (need some sleep)
>>
>> Regards,
>> Eevee.
>>
>> On 5 Jun 2018, at 7:32 pm, m...@vis.at wrote:
>>> Keyspace.getColumnFamilyStore
>>>
>>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Using K8s to Manage Cassandra in Production

2018-05-18 Thread Pradeep Chhetri
Hello Hassaan,

We use cassandra helm chart[0] for deploying cassandra over kubernetes in
production. We have around 200GB cas data. It works really well. You can
scale up nodes easily (I haven't tested scaling down).

I would say that if you are worried about running cassandra over k8s in
production, maybe you should first try setting it up for your
staging/preproduction and gain confidence over time.

I have tested situations where I have killed the host running the cassandra
container, and have seen that the container moves to a different node and
joins the cluster properly. So from my experience it's pretty good. No issues yet.

[0]: https://github.com/kubernetes/charts/tree/master/incubator/cassandra
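
For what it's worth, deploying that chart reduces to something like the following, shown as an echoed dry run; the repo URL, release name and value names are from memory of the 2018-era incubator chart and should be checked against its values.yaml:

```shell
# Dry run: echo the helm 2 style commands instead of executing them.
echo helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
echo helm install --name cassandra incubator/cassandra \
  --set config.cluster_size=3 \
  --set persistence.size=100Gi
```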


Regards,
Pradeep

On Fri, May 18, 2018 at 1:01 PM, Павел Сапежко 
wrote:

> Hi, Hassaan! For example we are using C* in k8s in production for our
> video surveillance system. Moreover, we are using Ceph RBD as our storage
> for cassandra. Today we have 8 C* nodes, each managing 2TB of data.
>
> On Fri, May 18, 2018 at 9:27 AM Hassaan Pasha  wrote:
>
>> Hi,
>>
>> I am trying to craft a deployment strategy for deploying and maintaining
>> a C* cluster. I was wondering if there are actual production deployments of
>> C* using K8s as the orchestration layer.
>>
>> I have been given the impression that K8s managing a C* cluster can be a
>> recipe for disaster, especially if you aren't well versed with the
>> intricacies of a scale-up/down event. I know use cases where people are
>> using Mesos or a custom tool built with terraform/chef etc to run their
>> production clusters but have yet to find a real K8s use case.
>>
>> *Questions?*
>> Is K8s a reasonable choice for managing a production C* cluster?
>> Are there documented use cases for this?
>>
>> Any help would be greatly appreciated.
>>
>> --
>> Regards,
>>
>>
>> *Hassaan Pasha*
>>
> --
>
> Regrads,
>
> Pavel Sapezhko
>
>


Re: JVM Tuning post

2018-04-11 Thread Pradeep Chhetri
Thank you for writing this. The post is really very helpful.

One question - My understanding is that GC tuning depends a lot on the
read/write workload and the data size. What would be the right way to
simulate the production workload in a non-production environment in the
cassandra world?

On Wed, Apr 11, 2018 at 8:54 PM, Russell Bateman 
wrote:

> Nice write-up. G1GC became the default garbage collection mechanism
> beginning in Java 9, right?
>
>
> On 04/11/2018 09:05 AM, Joao Serrachinha wrote:
>
> Many thanks to "The Last Pickle", also for the TWCS advice. Especially for
> the new C* features in version 3.11.1.
>
> Regards,
> João
>
> On 11/04/2018 16:00, Jon Haddad wrote:
>
> Hey folks,
>
> We (The Last Pickle) have helped a lot of teams with JVM tuning over
> the years, finally managed to write some stuff down.  We’re hoping the
> community finds it helpful.
> http://thelastpickle.com/blog/2018/04/11/gc-tuning.html
>
> Jon
>
>
>
>
>


Re: Refresh from Prod to Dev

2018-02-09 Thread Pradeep Chhetri
Hi Anshu,

We used to have similar requirements in my workplace.

We tried multiple options, like taking a snapshot and restoring it, but the
one that worked best for us was building a preprod cluster with the same
number of nodes, doing a parallel scp of the data directly from production
to preprod, and then running a nodetool refresh.
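
A sketch of that node-by-node parallel copy plus refresh, written as a dry run that only prints the plan (hostnames, paths, keyspace and table names are illustrative):

```shell
# With equal cluster sizes, map prod node i -> preprod node i, copy the
# keyspace data directory across, then nodetool refresh on each target.
prod_nodes="prod-cass-1 prod-cass-2 prod-cass-3"
dev_nodes="dev-cass-1 dev-cass-2 dev-cass-3"
plan=/tmp/refresh-plan.txt; : > "$plan"
set -- $dev_nodes                 # $1..$3 become the target nodes
for p in $prod_nodes; do
  d=$1; shift
  echo "scp -r $p:/var/lib/cassandra/data/my_ks $d:/var/lib/cassandra/data/" >> "$plan"
  echo "ssh $d nodetool refresh my_ks my_table" >> "$plan"
done
cat "$plan"                       # review before running for real
```

Note this mapping only works because the clusters have the same size; if the token assignments differ between the mapped nodes, sstableloader is the safer route.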

On Fri, 9 Feb 2018 at 12:03 PM, Anshu Vajpayee 
wrote:

> Team ,
>
> I want to validate and POC on production data. Data on production
> is huge. What would be the optimal method to move the data from Prod to the
> Dev environment? I know there are a few solutions, but which is the most
> efficient method to refresh the dev env?
>
> --
> *C*heers,*
> *Anshu V*
>
>
>


Re: Restoring fresh new cluster from existing snapshot

2018-01-11 Thread Pradeep Chhetri
Hello Jeff,

Thank you for the reply.

One doubt - if I copy /var/lib/cassandra one to one from the source cluster
to the destination cluster nodes, change the cluster name in the
configuration, delete the system.peers table and restart each cassandra node,
do you think the cluster will come up properly? The clusters are in different
VPCs in AWS, but I still want to make sure that they get different
cluster names.

Regards.

On Thu, Jan 11, 2018 at 9:37 PM, Jeff Jirsa  wrote:

> Make sure the new cluster has a different cluster name, and avoid copying
> system.peers if you can avoid it. Doing so risks merging your new cluster
> and old cluster if they’re able to reach each other.
>
> --
> Jeff Jirsa
>
>
> On Jan 11, 2018, at 1:41 AM, Pradeep Chhetri 
> wrote:
>
Thank you very much Jean. Since I don't have any constraints, as you said,
I will try copying the complete system keyspace node by node first and will
do nodetool refresh and see if it works.
>
>
>
> On Thu, Jan 11, 2018 at 3:21 PM, Jean Carlo 
> wrote:
>
>> Hello,
>>
>> Basically, every node has to have the same token ranges. So yes, you have
>> to play with initial_token, using the same number of tokens per node as
>> the source cluster. To save time, and if you don't have any constraints about
>> the name of the cluster etc., you can just copy and paste the complete
>> system keyspace node by node.
>>
>> So you will have the same cluster( cluster name, confs, etc)
>>
>>
>> Saludos
>>
>> Jean Carlo
>>
>> "The best way to predict the future is to invent it" Alan Kay
>>
>> On Thu, Jan 11, 2018 at 10:28 AM, Pradeep Chhetri 
>> wrote:
>>
>>> Hello Jean,
>>>
>>> I am running cassandra 3.11.1.
>>>
>>> Since i dont have much cassandra operations experience yet, I have a
>>> follow-up question - how can i ensure the same token ranges distribution ?
>>> Do i need to set initial_token configuration for each cassandra node ?
>>>
>>> Thank you for the quick response.
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jan 11, 2018 at 3:04 PM, Jean Carlo 
>>> wrote:
>>>
>>>> Hello Pradeep,
>>>>
>>>> Actually the key here is to know if your cluster has the same token
>>>> ranges distribution. So it is not only the same size but also the same
>>>> tokens match node by node, from cluster source to cluster destination. In
>>>> that case, you can use nodetool refresh. So after copying all your sstables
>>>> node by node, it is enough to run nodetool refresh on every node to
>>>> restore your data. You can also restart cassandra instead of doing nodetool
>>>> refresh; it will help you to avoid the compactions after refreshing.
>>>>
>>>>
>>>> Saludos
>>>>
>>>> Jean Carlo
>>>>
>>>> "The best way to predict the future is to invent it" Alan Kay
>>>>
>>>> On Thu, Jan 11, 2018 at 9:58 AM, Pradeep Chhetri >>> > wrote:
>>>>
>>>>> Hello everyone,
>>>>>
>>>>> We are running cassandra cluster inside containers over Kubernetes. We
>>>>> have a requirement where we need to restore a fresh new cluster with
>>>>> existing snapshot on weekly basis.
>>>>>
>>>>> Currently, while doing it manually. i need to copy the snapshot folder
>>>>> inside container and then run sstableloader utility to load those tables.
>>>>>
>>>>> Since the source and destination cluster size is equal, I was thinking
>>>>> if there are some easy way to just copy and paste the complete data
>>>>> directory by mapping the nodes one to one.
>>>>>
>>>>> Since I wasn't able to find documentation around other backup
>>>>> restoration methods apart from nodetool snapshot and sstableloader, I
>>>>> haven't explored much. I recently came across this project -
>>>>> https://github.com/Netflix/Priam - but haven't tried it yet.
>>>>>
>>>>> Would be very happy if i can get some ideas around various ways of
>>>>> backup/restoration while running inside containers.
>>>>>
>>>>> Thank you
>>>>>
>>>>
>>>>
>>>
>>
>


Re: Restoring fresh new cluster from existing snapshot

2018-01-11 Thread Pradeep Chhetri
Thank you very much Jean. Since I don't have any constraints, as you said,
I will try copying the complete system keyspace node by node first and will
do nodetool refresh and see if it works.



On Thu, Jan 11, 2018 at 3:21 PM, Jean Carlo 
wrote:

> Hello,
>
> Basically, every node has to have the same token ranges. So yes, you have to
> play with initial_token, using the same number of tokens per node as the
> source cluster. To save time, and if you don't have any constraints about the
> name of the cluster etc., you can just copy and paste the complete system
> keyspace node by node.
>
> So you will have the same cluster( cluster name, confs, etc)
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Thu, Jan 11, 2018 at 10:28 AM, Pradeep Chhetri 
> wrote:
>
>> Hello Jean,
>>
>> I am running cassandra 3.11.1.
>>
>> Since i dont have much cassandra operations experience yet, I have a
>> follow-up question - how can i ensure the same token ranges distribution ?
>> Do i need to set initial_token configuration for each cassandra node ?
>>
>> Thank you for the quick response.
>>
>>
>>
>>
>>
>> On Thu, Jan 11, 2018 at 3:04 PM, Jean Carlo 
>> wrote:
>>
>>> Hello Pradeep,
>>>
>>> Actually the key here is to know if your cluster has the same token
>>> ranges distribution. So it is not only the same size but also the same
>>> tokens match node by node, from cluster source to cluster destination. In
>>> that case, you can use nodetool refresh. So after copying all your sstables
>>> node by node, it is enough to run nodetool refresh on every node to
>>> restore your data. You can also restart cassandra instead of doing nodetool
>>> refresh; it will help you to avoid the compactions after refreshing.
>>>
>>>
>>> Saludos
>>>
>>> Jean Carlo
>>>
>>> "The best way to predict the future is to invent it" Alan Kay
>>>
>>> On Thu, Jan 11, 2018 at 9:58 AM, Pradeep Chhetri 
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> We are running cassandra cluster inside containers over Kubernetes. We
>>>> have a requirement where we need to restore a fresh new cluster with
>>>> existing snapshot on weekly basis.
>>>>
>>>> Currently, while doing it manually. i need to copy the snapshot folder
>>>> inside container and then run sstableloader utility to load those tables.
>>>>
>>>> Since the source and destination cluster size is equal, I was thinking
>>>> if there are some easy way to just copy and paste the complete data
>>>> directory by mapping the nodes one to one.
>>>>
>>>> Since I wasn't able to find documentation around other backup
>>>> restoration methods apart from nodetool snapshot and sstableloader, I
>>>> haven't explored much. I recently came across this project -
>>>> https://github.com/Netflix/Priam - but haven't tried it yet.
>>>>
>>>> Would be very happy if i can get some ideas around various ways of
>>>> backup/restoration while running inside containers.
>>>>
>>>> Thank you
>>>>
>>>
>>>
>>
>


Re: Restoring fresh new cluster from existing snapshot

2018-01-11 Thread Pradeep Chhetri
Hello Jean,

I am running cassandra 3.11.1.

Since I don't have much cassandra operations experience yet, I have a
follow-up question - how can I ensure the same token range distribution?
Do I need to set the initial_token configuration for each cassandra node?

Thank you for the quick response.





On Thu, Jan 11, 2018 at 3:04 PM, Jean Carlo 
wrote:

> Hello Pradeep,
>
> Actually the key here is to know if your cluster has the same token ranges
> distribution. So it is not only the same size but also the same tokens
> match node by node, from cluster source to cluster destination. In that
> case, you can use nodetool refresh. So after copying all your sstables node by
> node, it is enough to run nodetool refresh on every node to restore
> your data. You can also restart cassandra instead of doing nodetool refresh;
> it will help you to avoid the compactions after refreshing.
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Thu, Jan 11, 2018 at 9:58 AM, Pradeep Chhetri 
> wrote:
>
>> Hello everyone,
>>
>> We are running cassandra cluster inside containers over Kubernetes. We
>> have a requirement where we need to restore a fresh new cluster with
>> existing snapshot on weekly basis.
>>
>> Currently, while doing it manually. i need to copy the snapshot folder
>> inside container and then run sstableloader utility to load those tables.
>>
>> Since the source and destination cluster size is equal, I was thinking if
>> there are some easy way to just copy and paste the complete data directory
>> by mapping the nodes one to one.
>>
>> Since I wasn't able to find documentation around other backup
>> restoration methods apart from nodetool snapshot and sstableloader, I
>> haven't explored much. I recently came across this project -
>> https://github.com/Netflix/Priam - but haven't tried it yet.
>>
>> Would be very happy if i can get some ideas around various ways of
>> backup/restoration while running inside containers.
>>
>> Thank you
>>
>
>
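
On the initial_token question above: one way to pin a destination node to its source node's token ranges is to capture the source tokens and join them into a single initial_token line for cassandra.yaml. A sketch, with a saved sample standing in for real `nodetool info -T` output (whose exact formatting may differ; with vnodes there would be num_tokens lines):

```shell
cat > /tmp/source-tokens.txt <<'EOF'
Token: -9181802192617819638
Token: -8712719856612353535
Token: 1203981237
EOF
# Grab the token values and join them with commas for cassandra.yaml.
tokens=$(awk '/^Token/ {print $2}' /tmp/source-tokens.txt | paste -sd, -)
echo "initial_token: $tokens"
```

The destination node must use the same num_tokens and must start with an empty data directory for initial_token to take effect.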


Restoring fresh new cluster from existing snapshot

2018-01-11 Thread Pradeep Chhetri
Hello everyone,

We are running cassandra cluster inside containers over Kubernetes. We have
a requirement where we need to restore a fresh new cluster with existing
snapshot on weekly basis.

Currently, while doing it manually. i need to copy the snapshot folder
inside container and then run sstableloader utility to load those tables.

Since the source and destination cluster size is equal, I was thinking if
there are some easy way to just copy and paste the complete data directory
by mapping the nodes one to one.

Since I wasn't able to find documentation around other backup restoration
methods apart from nodetool snapshot and sstableloader, I haven't explored
much. I recently came across this project - https://github.com/Netflix/Priam -
but haven't tried it yet.

Would be very happy if i can get some ideas around various ways of
backup/restoration while running inside containers.

Thank you


Re: Problem adding a new node to a cluster

2017-12-17 Thread Pradeep Chhetri
Hello Kurt,

I realized the issue was caused by a RAM shortage. I bumped up the memory of
the machine and the node bootstrap started, but this time I hit this
cassandra 3.9 bug:

https://issues.apache.org/jira/browse/CASSANDRA-12905

I tried running nodetool bootstrap resume multiple times, but every time it
fails with an exception after completing around 963%:

https://gist.github.com/chhetripradeep/93567ad24c44ba72d0753d4088a10ce4

Do you think there is some workaround for this, or do you suggest upgrading
to v3.11, which has the fix?

Also, can we just upgrade cassandra from 3.9 -> 3.11 in a rolling fashion,
or do we need to take care of something during the upgrade?

Thanks.






On Mon, Dec 18, 2017 at 5:45 AM, kurt greaves  wrote:

> You haven't provided enough logs for us to really tell what's wrong. I
> suggest running *nodetool netstats | grep -v 100%* to see if any
> streams are still ongoing, and also running *nodetool compactionstats -H* to
> see if there are any index builds the node might be waiting for prior to
> joining the ring.
>
> If neither of those provide any useful information, send us the full
> system.log and debug.log
>
> On 17 December 2017 at 11:19, Pradeep Chhetri 
> wrote:
>
>> Hello all,
>>
>> I am trying to add a 4th node to a 3-node cluster which is using
>> SimpleSnitch. But this new node has been stuck in the Joining state for the
>> last 20 hours. We have around 10GB of data per node with RF 3.
>>
>> It's mostly stuck in the redistributing index summaries phase.
>>
>> Here are the logs:
>>
>> https://gist.github.com/chhetripradeep/37e4f232ddf0dd3b830091ca9829416d
>>
>> # nodetool status
>> Datacenter: datacenter1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  AddressLoad   Tokens   Owns (effective)  Host ID
>>  Rack
>> UJ  10.42.187.43   9.73 GiB   256  ?
>>  36384dc5-a183-4a5b-ae2d-ee67c897df3d  rack1
>> UN  10.42.106.184  9.95 GiB   256  100.0%
>> 42cd09e9-8efb-472f-ace6-c7bb98634887  rack1
>> UN  10.42.169.195  10.35 GiB  256  100.0%
>> 9fcc99a1-6334-4df8-818d-b097b1920bb9  rack1
>> UN  10.42.209.245  8.54 GiB   256  100.0%
>> 9b99d5d8-818e-4741-9533-259d0fc0e16d  rack1
>>
>> Not sure what is going on here; it would be very helpful if someone could
>> help in identifying the issue.
>>
>> Thank you.
>>
>>
>>
>


Problem adding a new node to a cluster

2017-12-17 Thread Pradeep Chhetri
Hello all,

I am trying to add a 4th node to a 3-node cluster which is using
SimpleSnitch. But this new node has been stuck in the Joining state for the
last 20 hours. We have around 10GB of data per node with RF 3.

It's mostly stuck in the redistributing index summaries phase.

Here are the logs:

https://gist.github.com/chhetripradeep/37e4f232ddf0dd3b830091ca9829416d

# nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens   Owns (effective)  Host ID
   Rack
UJ  10.42.187.43   9.73 GiB   256  ?
 36384dc5-a183-4a5b-ae2d-ee67c897df3d  rack1
UN  10.42.106.184  9.95 GiB   256  100.0%
42cd09e9-8efb-472f-ace6-c7bb98634887  rack1
UN  10.42.169.195  10.35 GiB  256  100.0%
9fcc99a1-6334-4df8-818d-b097b1920bb9  rack1
UN  10.42.209.245  8.54 GiB   256  100.0%
9b99d5d8-818e-4741-9533-259d0fc0e16d  rack1

Not sure what is going on here; it would be very helpful if someone could
help in identifying the issue.

Thank you.


Re: Snapshot verification

2017-10-31 Thread Pradeep Chhetri
Hi Varun,

Thank you for the reply. I was looking for some kind of automated way (e.g.
getting some kind of md5 per table while taking the snapshot and comparing
it with the md5 after restoring that snapshot).

Regards.

On Tue, Oct 31, 2017 at 10:47 PM, Varun Gupta  wrote:

> We use the COPY command to generate a file from both source and destination.
> After that you can use a diff tool.
>
> On Mon, Oct 30, 2017 at 10:11 PM Pradeep Chhetri 
> wrote:
>
>> Hi,
>>
>> We are taking daily snapshots to back up our Cassandra data and then
>> use those backups to restore in a different environment. I would like to
>> verify that the data is consistent and that all the data present at the time
>> the backup was taken is actually restored.
>>
>> Currently I just count the number of rows in each table. I was wondering if
>> there is any inbuilt way to accomplish this.
>>
>> Thank you.
>> Pradeep
>>
>


Snapshot verification

2017-10-30 Thread Pradeep Chhetri
Hi,

We are taking daily snapshots to back up our Cassandra data and then use
those backups to restore in a different environment. I would like to verify
that the data is consistent and that all the data present at the time the
backup was taken is actually restored.

Currently I just count the number of rows in each table. I was wondering if
there is any inbuilt way to accomplish this.

Thank you.
Pradeep


Re: Restore cassandra snapshots

2017-10-18 Thread Pradeep Chhetri
Hi Jean,

Thank you very much for verifying the steps.

Regards.

On Wed, Oct 18, 2017 at 11:52 AM, Jean Carlo 
wrote:

> Hi Pradeep,
>
> Because you use sstableloader, you don't need to restore the system
> keyspace.
>
> Your procedure looks correct to me.
>
> Best regards
>
> On Oct 18, 2017 4:22 AM, "Pradeep Chhetri"  wrote:
>
> Hi Anthony
>
> I did the following steps to restore. Please let me know if I missed
> something.
>
> - Took snapshots on the 3 nodes of the existing cluster simultaneously
> - Copied those snapshots respectively to the 3 nodes of the freshly created
> cluster
> - Ran sstableloader on each of the application tables (I didn't restore
> the system-related tables) on all three nodes
>
> I was assuming that since I ran it from all three snapshots, all the
> tokens should be covered, so I thought this would not cause any data loss.
>
> Do you see any way I might have lost data? I am not sure how to verify
> data loss, although I did run a count on a few tables to verify the row count.
>
> Thank you for the help.
>
>
> On Wed, 18 Oct 2017 at 5:39 AM, Anthony Grasso 
> wrote:
>
>> Hi Pradeep,
>>
>> If you are going to copy N snapshots to N nodes you will need to make
>> sure you have the System keyspace as part of that snapshot. The System
>> keyspace that is local to each node, contains the token allocations for
>> that particular node. This allows the node to work out what data it is
>> responsible for. Further to that, if you are restoring the System keyspace
>> from snapshots, make sure that the cluster name of the new cluster is
>> exactly the same as the cluster which generated the System keyspace
>> snapshots.
>>
>> Regards,
>> Anthony
>>
>> On 16 October 2017 at 23:28, Jean Carlo 
>> wrote:
>>
>>> Hi,
>>>
>>> Yes, of course, you can use sstableloader from every sstable to your new
>>> cluster. Actually this is the common procedure. Just check the Cassandra
>>> logs; you shouldn't see any streaming errors.
>>>
>>>
>>> However, given that you are migrating from one cluster of N nodes
>>> to another of N nodes, I believe you can just copy your data node
>>> by node and run a nodetool refresh, obviously checking the correct names
>>> of your sstables.
>>> You can check the tokens of your node using nodetool info -T
>>>
>>> But I think sstableloader is the easy way :)
>>>
>>>
>>>
>>>
>>> Saludos
>>>
>>> Jean Carlo
>>>
>>> "The best way to predict the future is to invent it" Alan Kay
>>>
>>> On Mon, Oct 16, 2017 at 1:55 PM, Pradeep Chhetri 
>>> wrote:
>>>
>>>> Hi Jean,
>>>>
>>>> Thank you for the quick response. I am not sure how to achieve that.
>>>> Can I set the tokens for a node via cqlsh?
>>>>
>>>> I know that I can check nodetool ring to get the tokens allocated
>>>> to a node.
>>>>
>>>> I was thinking of basically running sstableloader for each of the snapshots
>>>> and was assuming it would load the complete data properly. Isn't that the
>>>> case?
>>>>
>>>> Thank you.
>>>>
>>>> On Mon, Oct 16, 2017 at 5:21 PM, Jean Carlo 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Be sure that you have the same token distribution as your original
>>>>> cluster. So if you are going to restore from old node 1 to new node 1,
>>>>> make sure that the new node and the old node have the same tokens.
>>>>>
>>>>>
>>>>> Saludos
>>>>>
>>>>> Jean Carlo
>>>>>
>>>>> "The best way to predict the future is to invent it" Alan Kay
>>>>>
>>>>> On Mon, Oct 16, 2017 at 1:40 PM, Pradeep Chhetri <
>>>>> prad...@stashaway.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to restore an empty 3-node cluster with the three
>>>>>> snapshots taken on another 3-node cluster.
>>>>>>
>>>>>> What is the best approach to achieve this without losing any data
>>>>>> present in the snapshot?
>>>>>>
>>>>>> Thank you.
>>>>>> Pradeep
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Restore cassandra snapshots

2017-10-17 Thread Pradeep Chhetri
Hi Anthony

I did the following steps to restore. Please let me know if I missed
something.

- Took snapshots on the 3 nodes of the existing cluster simultaneously
- Copied those snapshots respectively to the 3 nodes of the freshly created
cluster
- Ran sstableloader on each of the application tables (I didn't restore
the system-related tables) on all three nodes

I was assuming that since I ran it from all three snapshots, all the
tokens should be covered, so I thought this would not cause any data loss.

Do you see any way I might have lost data? I am not sure how to verify
data loss, although I did run a count on a few tables to verify the row count.

Thank you for the help.
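Running sstableloader once per application table, as in the steps above, is easy to script by walking the copied snapshot's directory layout. This is a sketch under two assumptions: the copied data follows the usual `<keyspace>/<table>` layout, and `10.0.0.1` stands in for a real target node. It only builds the command lines; it does not run them.

```python
import os
import tempfile
from pathlib import Path

def sstableloader_commands(snapshot_root: str, target_host: str) -> list:
    """One `sstableloader` invocation per <keyspace>/<table> directory,
    skipping system keyspaces (their contents are local to each node)."""
    root = Path(snapshot_root)
    return [
        "sstableloader -d {} {}".format(target_host, table_dir)
        for table_dir in sorted(p for p in root.glob("*/*") if p.is_dir())
        if not table_dir.parent.name.startswith("system")
    ]

# Build a fake snapshot layout just to demonstrate.
base = tempfile.mkdtemp()
for ks, tbl in [("app", "orders"), ("app", "users"), ("system_auth", "roles")]:
    os.makedirs(os.path.join(base, ks, tbl))

cmds = sstableloader_commands(base, "10.0.0.1")
print(cmds)  # two commands for the app tables; system_auth is skipped
```

Skipping the system keyspaces here mirrors the step above, since each node's system keyspace holds node-local state such as its token allocations.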


On Wed, 18 Oct 2017 at 5:39 AM, Anthony Grasso 
wrote:

> Hi Pradeep,
>
> If you are going to copy N snapshots to N nodes you will need to make sure
> you have the System keyspace as part of that snapshot. The System keyspace
> that is local to each node, contains the token allocations for that
> particular node. This allows the node to work out what data it is
> responsible for. Further to that, if you are restoring the System keyspace
> from snapshots, make sure that the cluster name of the new cluster is
> exactly the same as the cluster which generated the System keyspace
> snapshots.
>
> Regards,
> Anthony
>
> On 16 October 2017 at 23:28, Jean Carlo  wrote:
>
>> Hi,
>>
>> Yes, of course, you can use sstableloader from every sstable to your new
>> cluster. Actually this is the common procedure. Just check the Cassandra
>> logs; you shouldn't see any streaming errors.
>>
>>
>> However, given that you are migrating from one cluster of N nodes to
>> another of N nodes, I believe you can just copy your data node
>> by node and run a nodetool refresh, obviously checking the correct names
>> of your sstables.
>> You can check the tokens of your node using nodetool info -T
>>
>> But I think sstableloader is the easy way :)
>>
>>
>>
>>
>> Saludos
>>
>> Jean Carlo
>>
>> "The best way to predict the future is to invent it" Alan Kay
>>
>> On Mon, Oct 16, 2017 at 1:55 PM, Pradeep Chhetri 
>> wrote:
>>
>>> Hi Jean,
>>>
>>> Thank you for the quick response. I am not sure how to achieve that. Can
>>> I set the tokens for a node via cqlsh?
>>>
>>> I know that I can check nodetool ring to get the tokens allocated
>>> to a node.
>>>
>>> I was thinking of basically running sstableloader for each of the snapshots
>>> and was assuming it would load the complete data properly. Isn't that the
>>> case?
>>>
>>> Thank you.
>>>
>>> On Mon, Oct 16, 2017 at 5:21 PM, Jean Carlo 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Be sure that you have the same token distribution as your original
>>>> cluster. So if you are going to restore from old node 1 to new node 1, make
>>>> sure that the new node and the old node have the same tokens.
>>>>
>>>>
>>>> Saludos
>>>>
>>>> Jean Carlo
>>>>
>>>> "The best way to predict the future is to invent it" Alan Kay
>>>>
>>>> On Mon, Oct 16, 2017 at 1:40 PM, Pradeep Chhetri >>> > wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to restore an empty 3-node cluster with the three
>>>>> snapshots taken on another 3-node cluster.
>>>>>
>>>>> What is the best approach to achieve this without losing any data
>>>>> present in the snapshot?
>>>>>
>>>>> Thank you.
>>>>> Pradeep
>>>>>
>>>>
>>>>
>>>
>>
>


Re: Restore cassandra snapshots

2017-10-16 Thread Pradeep Chhetri
Quick questions: 1) I have around 2GB of Cassandra snapshots; do you
suggest using sstableloader? 2) What do you mean by "restore option"? Do you
mean copying the snapshot directory directly to the nodes of the new cluster?

On Mon, Oct 16, 2017 at 6:19 PM, Nitan Kainth  wrote:

> sstableloader is good for small datasets; for bigger snapshots a restore
> is a better option
>
> Sent from my iPhone
>
> On Oct 16, 2017, at 7:28 AM, Jean Carlo  wrote:
>
> Hi,
>
> Yes, of course, you can use sstableloader from every sstable to your new
> cluster. Actually this is the common procedure. Just check the Cassandra
> logs; you shouldn't see any streaming errors.
>
>
> However, given that you are migrating from one cluster of N nodes to
> another of N nodes, I believe you can just copy your data node
> by node and run a nodetool refresh, obviously checking the correct names
> of your sstables.
> You can check the tokens of your node using nodetool info -T
>
> But I think sstableloader is the easy way :)
>
>
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Mon, Oct 16, 2017 at 1:55 PM, Pradeep Chhetri 
> wrote:
>
>> Hi Jean,
>>
>> Thank you for the quick response. I am not sure how to achieve that. Can
>> I set the tokens for a node via cqlsh?
>>
>> I know that I can check nodetool ring to get the tokens allocated to
>> a node.
>>
>> I was thinking of basically running sstableloader for each of the snapshots
>> and was assuming it would load the complete data properly. Isn't that the
>> case?
>>
>> Thank you.
>>
>> On Mon, Oct 16, 2017 at 5:21 PM, Jean Carlo 
>> wrote:
>>
>>> Hi,
>>>
>>> Be sure that you have the same token distribution as your original
>>> cluster. So if you are going to restore from old node 1 to new node 1, make
>>> sure that the new node and the old node have the same tokens.
>>>
>>>
>>> Saludos
>>>
>>> Jean Carlo
>>>
>>> "The best way to predict the future is to invent it" Alan Kay
>>>
>>> On Mon, Oct 16, 2017 at 1:40 PM, Pradeep Chhetri 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to restore an empty 3-node cluster with the three snapshots
>>>> taken on another 3-node cluster.
>>>>
>>>> What is the best approach to achieve this without losing any data
>>>> present in the snapshot?
>>>>
>>>> Thank you.
>>>> Pradeep
>>>>
>>>
>>>
>>
>


Re: Restore cassandra snapshots

2017-10-16 Thread Pradeep Chhetri
Hi Jean,

Thank you for the quick response. I am not sure how to achieve that. Can I
set the tokens for a node via cqlsh?

I know that I can check nodetool ring to get the tokens allocated to a
node.

I was thinking of basically running sstableloader for each of the snapshots and
was assuming it would load the complete data properly. Isn't that the case?

Thank you.
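Checking that an old and a new node carry the same tokens can be done by diffing the token lists rather than eyeballing them. The sketch below assumes `nodetool info -T` prints one `Token : <value>` line per token; only the comparison step is shown, with made-up token values.

```python
def token_set(info_output: str) -> set:
    """Extract token values, assuming one 'Token : <value>' line per token
    (the `nodetool info -T` output format is an assumption here)."""
    return {
        line.split(":", 1)[1].strip()
        for line in info_output.splitlines()
        if line.strip().startswith("Token") and ":" in line
    }

old_node = "Token : -9182\nToken : 777\nToken : 1024\n"
new_node = "Token : 1024\nToken : -9182\nToken : 777\n"

# Identical token sets: a per-node copy plus `nodetool refresh` would be safe.
assert token_set(old_node) == token_set(new_node)

# A missing token shows up as a set difference.
assert token_set(old_node) - token_set("Token : 777\n") == {"-9182", "1024"}
```

Comparing as sets deliberately ignores ordering, since only the assignment of tokens to a node matters, not the order in which they are printed.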

On Mon, Oct 16, 2017 at 5:21 PM, Jean Carlo 
wrote:

> Hi,
>
> Be sure that you have the same token distribution as your original
> cluster. So if you are going to restore from old node 1 to new node 1, make
> sure that the new node and the old node have the same tokens.
>
>
> Saludos
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Mon, Oct 16, 2017 at 1:40 PM, Pradeep Chhetri 
> wrote:
>
>> Hi,
>>
>> I am trying to restore an empty 3-node cluster with the three snapshots
>> taken on another 3-node cluster.
>>
>> What is the best approach to achieve this without losing any data present
>> in the snapshot?
>>
>> Thank you.
>> Pradeep
>>
>
>


Restore cassandra snapshots

2017-10-16 Thread Pradeep Chhetri
Hi,

I am trying to restore an empty 3-node cluster with the three snapshots
taken on another 3-node cluster.

What is the best approach to achieve this without losing any data present in
the snapshot?

Thank you.
Pradeep


Re: Schema Mismatch Issue in Production

2017-10-12 Thread Pradeep Chhetri
Got the cluster to converge on the same schema by restarting just the node
that had the different version.

Thanks.

On Thu, Oct 12, 2017 at 2:23 PM, Pradeep Chhetri 
wrote:

> Hi Carlos,
>
> Thank you for the reply.
>
> I am running Cassandra 3.9.
>
> I am also not sure what effect this has on the applications talking
> to Cassandra.
>
> So is there no way to converge the cluster schema version without downtime?
>
> Thank you.
>
> On Thu, Oct 12, 2017 at 2:16 PM, Carlos Rolo  wrote:
>
>> Which version are you running? I got stuck in a similar situation (with a
>> lot more nodes) and the only way to fix it was to stop the whole
>> cluster and start the nodes one by one.
>>
>>
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
>> *linkedin.com/in/carlosjuzarterolo
>> <http://linkedin.com/in/carlosjuzarterolo>*
>> Mobile: +351 918 918 100
>> www.pythian.com
>>
>> On Thu, Oct 12, 2017 at 5:53 AM, Pradeep Chhetri 
>> wrote:
>>
>>> Hello everyone,
>>>
>>> We had some issues yesterday in our 3-node cluster where the
>>> application tried to create the same table twice in quick succession and the
>>> cluster became unstable.
>>>
>>> Temporarily, we reduced it to a single-node cluster, which gave us some
>>> relief.
>>>
>>> Now, when we try to bootstrap a new node and add it to the cluster,
>>> we're seeing a schema mismatch issue.
>>>
>>> # nodetool status
>>> Datacenter: datacenter1
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  AddressLoad   Tokens   Owns (effective)  Host ID
>>>Rack
>>> UN  10.42.247.173  3.07 GiB   256  100.0%
>>> dffc39e5-d4ba-4b10-872e-0e3cc10f5e08  rack1
>>> UN  10.42.209.245  2.25 GiB   256  100.0%
>>> 9b99d5d8-818e-4741-9533-259d0fc0e16d  rack1
>>>
>>> root@cassandra-2:~# nodetool describecluster
>>> Cluster Information:
>>> Name: sa-cassandra
>>> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
>>> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>> Schema versions:
>>> e2275d0f-a5fc-39d9-8f11-268b5e9dc295: [10.42.209.245]
>>>
>>> 5f5f66f5-d6aa-3b90-b674-e08811d4d412: [10.42.247.173]
>>>
>>> Freshly bootstrapped node - 10.42.247.173
>>> Single node from original cluster - 10.42.209.245
>>>
>>> I read https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/schemaDisagree.html
>>> and tried restarting the new node, but it didn't help.
>>>
>>> Please do suggest. We are facing this issue in production.
>>>
>>> Thank you.
>>>
>>
>>
>> --
>>
>>
>>
>>
>


Re: Schema Mismatch Issue in Production

2017-10-12 Thread Pradeep Chhetri
Hi Carlos,

Thank you for the reply.

I am running Cassandra 3.9.

I am also not sure what effect this has on the applications talking to
Cassandra.

So is there no way to converge the cluster schema version without downtime?

Thank you.

On Thu, Oct 12, 2017 at 2:16 PM, Carlos Rolo  wrote:

> Which version are you running? I got stuck in a similar situation (with a
> lot more nodes) and the only way to fix it was to stop the whole
> cluster and start the nodes one by one.
>
>
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> *linkedin.com/in/carlosjuzarterolo
> <http://linkedin.com/in/carlosjuzarterolo>*
> Mobile: +351 918 918 100
> www.pythian.com
>
> On Thu, Oct 12, 2017 at 5:53 AM, Pradeep Chhetri 
> wrote:
>
>> Hello everyone,
>>
>> We had some issues yesterday in our 3-node cluster where the application
>> tried to create the same table twice in quick succession and the cluster
>> became unstable.
>>
>> Temporarily, we reduced it to a single-node cluster, which gave us some
>> relief.
>>
>> Now, when we try to bootstrap a new node and add it to the cluster,
>> we're seeing a schema mismatch issue.
>>
>> # nodetool status
>> Datacenter: datacenter1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  AddressLoad   Tokens   Owns (effective)  Host ID
>>  Rack
>> UN  10.42.247.173  3.07 GiB   256  100.0%
>> dffc39e5-d4ba-4b10-872e-0e3cc10f5e08  rack1
>> UN  10.42.209.245  2.25 GiB   256  100.0%
>> 9b99d5d8-818e-4741-9533-259d0fc0e16d  rack1
>>
>> root@cassandra-2:~# nodetool describecluster
>> Cluster Information:
>> Name: sa-cassandra
>> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
>> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>> Schema versions:
>> e2275d0f-a5fc-39d9-8f11-268b5e9dc295: [10.42.209.245]
>>
>> 5f5f66f5-d6aa-3b90-b674-e08811d4d412: [10.42.247.173]
>>
>> Freshly bootstrapped node - 10.42.247.173
>> Single node from original cluster - 10.42.209.245
>>
>> I read https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/schemaDisagree.html
>> and tried restarting the new node, but it didn't help.
>>
>> Please do suggest. We are facing this issue in production.
>>
>> Thank you.
>>
>
>
> --
>
>
>
>


Schema Mismatch Issue in Production

2017-10-11 Thread Pradeep Chhetri
Hello everyone,

We had some issues yesterday in our 3-node cluster where the application
tried to create the same table twice in quick succession and the cluster
became unstable.

Temporarily, we reduced it to a single-node cluster, which gave us some relief.

Now, when we try to bootstrap a new node and add it to the cluster, we're
seeing a schema mismatch issue.

# nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens   Owns (effective)  Host ID
   Rack
UN  10.42.247.173  3.07 GiB   256  100.0%
dffc39e5-d4ba-4b10-872e-0e3cc10f5e08  rack1
UN  10.42.209.245  2.25 GiB   256  100.0%
9b99d5d8-818e-4741-9533-259d0fc0e16d  rack1

root@cassandra-2:~# nodetool describecluster
Cluster Information:
Name: sa-cassandra
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
e2275d0f-a5fc-39d9-8f11-268b5e9dc295: [10.42.209.245]

5f5f66f5-d6aa-3b90-b674-e08811d4d412: [10.42.247.173]

Freshly bootstrapped node - 10.42.247.173
Single node from original cluster - 10.42.209.245

I read https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/schemaDisagree.html
and tried restarting the new node, but it didn't help.

Please do suggest. We are facing this issue in production.

Thank you.
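For what it's worth, the disagreement is easy to detect mechanically: the "Schema versions" section of `nodetool describecluster` maps each schema UUID to the nodes reporting it, so more than one entry means a mismatch. A sketch that parses the output format shown above:

```python
import re

# One UUID per schema version, followed by the hosts reporting it,
# matching the `nodetool describecluster` lines shown above.
SCHEMA_LINE = re.compile(
    r"^\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}):\s*\[(.*)\]"
)

def schema_versions(describe_output: str) -> dict:
    """Map each schema-version UUID to the list of nodes reporting it."""
    versions = {}
    for line in describe_output.splitlines():
        m = SCHEMA_LINE.match(line)
        if m:
            versions[m.group(1)] = [h.strip() for h in m.group(2).split(",")]
    return versions

sample = """\
Schema versions:
e2275d0f-a5fc-39d9-8f11-268b5e9dc295: [10.42.209.245]
5f5f66f5-d6aa-3b90-b674-e08811d4d412: [10.42.247.173]
"""

versions = schema_versions(sample)
assert len(versions) > 1  # more than one schema version => disagreement
```

A check like this in monitoring would have caught the split as soon as the new node bootstrapped, rather than when clients started failing.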