Re: Upgrade strategy for high number of nodes

2019-11-29 Thread Shishir Kumar
Thanks for the pointer. We haven't changed the data model in a long time, so
before applying workarounds (scrub) it is worth understanding the root cause
of the problem. This might also be the reason why running upgradesstables in
parallel is not recommended.
-Shishir

On Sat, 30 Nov 2019, 10:37 Jeff Jirsa,  wrote:

> Scrub really shouldn’t be required here.
>
> If there’s ever a step that reports corruption, it’s either a very very
> old table where you dropped columns previously or did something “wrong” in
> the past or a software bug. The old dropped column really should be obvious
> in the stack trace - anything else deserves a bug report.
>
> It’s unfortunate that people jump to just scrubbing the unreadable data -
> would appreciate an anonymized JIRA if possible. Alternatively work with
> your vendor to make sure they don’t have bugs in their readers somehow.
>
>
>
>
> On Nov 29, 2019, at 8:58 PM, Shishir Kumar 
> wrote:
>
> 
> Some more background. We are planning (and have tested) a binary upgrade
> across all nodes without downtime. The next step is running upgradesstables.
> The C* SSTable file format and version change from format big, version mc to
> format bti, version aa (refer to
> https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/tools/toolsSStables/ToolsSSTableupgrade.html
> - the upgrade is from DSE 5.1 to 6.x), and that underlying change explains
> why the upgrade takes so much time.
> Running upgradesstables in parallel across racks is where I am unsure of the
> impact (the documentation recommends running one node at a time). During
> upgradesstables there are scenarios where it reports file corruption, which
> requires a corrective step, i.e. scrub. Because of SSTable corruption, nodes
> sometimes go down or end up at ~100% CPU usage. Performing the above in
> parallel *without downtime* might result in more inconsistency across nodes.
> We have not tested this scenario, so we would appreciate the group's help if
> anyone has done a similar upgrade in the past (i.e. the scenarios/complexity
> that need to be considered, and why the guideline recommends running
> upgradesstables one node at a time).
> -Shishir
>
> On Fri, Nov 29, 2019 at 11:52 PM Josh Snyder  wrote:
>
>> Hello Shishir,
>>
>> It shouldn't be necessary to take downtime to perform upgrades of a
>> Cassandra cluster. It sounds like the biggest issue you're facing is the
>> upgradesstables step. upgradesstables is not strictly necessary before a
>> Cassandra node re-enters the cluster to serve traffic; in my experience it
>> is purely for optimizing the performance of the database once the software
>> upgrade is complete. I recommend trying out an upgrade in a test
>> environment without using upgradesstables, which should bring the 5 hours
>> per node down to just a few minutes.
>>
>> If you're running NetworkTopologyStrategy and you want to optimize
>> further, you could consider performing the upgrade on multiple nodes within
>> the same rack in parallel. When correctly configured,
>> NetworkTopologyStrategy can protect your database from an outage of an
>> entire rack. So performing an upgrade on a few nodes at a time within a
>> rack is the same as a partial rack outage, from the database's perspective.
>>
>> Have a nice upgrade!
>>
>> Josh
>>
>> On Fri, Nov 29, 2019 at 7:22 AM Shishir Kumar 
>> wrote:
>>
>>> Hi,
>>>
>>> We need input on a Cassandra upgrade strategy for the environment below:
>>> 1. We have datacenters across 4 geographies (multiple isolated deployments
>>> in each DC).
>>> 2. The number of Cassandra nodes in each deployment is between 6 and 24.
>>> 3. The data volume on each node is between 150 and 400 GB.
>>> 4. All production environments have DR set up.
>>> 5. We do not want downtime during the upgrade.
>>>
>>> We are planning a stack upgrade, but upgradesstables takes approx. 5 hours
>>> per node (when the data volume is approx. 200 GB).
>>> Options:
>>> No downtime - Per the recommendation (DataStax documentation) to upgrade
>>> one node at a time, i.e. in sequence, the upgrade cycle for one environment
>>> will take weeks, which is a DevOps concern.
>>> Read-only (no downtime) - Route read-only load to the DR system. We have
>>> resilience built in to handle mutation scenarios, but if the upgrade takes
>>> more than, say, 3-4 hours, there will be a long catch-up exercise. The
>>> maintenance cost seems too high because of the unknowns.
>>> Downtime - Upgrade all nodes in parallel while there are no live customers.
>>> This has a direct customer impact, so we would need to justify the
>>> maintenance cost against the customer impact.
>>> Please suggest how other organisations (those with 100+ nodes) are solving
>>> this scenario.
>>>
>>> Regards
>>> Shishir
>>>
>>>


Re: Upgrade strategy for high number of nodes

2019-11-29 Thread Jeff Jirsa
Scrub really shouldn’t be required here. 

If there’s ever a step that reports corruption, it’s either a very very old 
table where you dropped columns previously or did something “wrong” in the past 
or a software bug. The old dropped column really should be obvious in the stack 
trace - anything else deserves a bug report.

It’s unfortunate that people jump to just scrubbing the unreadable data - would 
appreciate an anonymized JIRA if possible. Alternatively work with your vendor 
to make sure they don’t have bugs in their readers somehow. 
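If it helps to narrow that down before anyone reaches for scrub, a minimal
first pass could look like the following (a sketch only - the log path is the
package default and the keyspace/table names are placeholders):

  # pull the corruption stack traces out of the log for a JIRA / vendor ticket
  grep -A 20 'CorruptSSTableException' /var/log/cassandra/system.log

  # checksum-verify the suspect table without rewriting anything
  nodetool verify -e my_keyspace my_table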




> On Nov 29, 2019, at 8:58 PM, Shishir Kumar  wrote:
> 
> 
> Some more background. We are planning (and have tested) a binary upgrade
> across all nodes without downtime. The next step is running upgradesstables.
> The C* SSTable file format and version change from format big, version mc to
> format bti, version aa (refer to
> https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/tools/toolsSStables/ToolsSSTableupgrade.html
> - the upgrade is from DSE 5.1 to 6.x), and that underlying change explains
> why the upgrade takes so much time.
> Running upgradesstables in parallel across racks is where I am unsure of the
> impact (the documentation recommends running one node at a time). During
> upgradesstables there are scenarios where it reports file corruption, which
> requires a corrective step, i.e. scrub. Because of SSTable corruption, nodes
> sometimes go down or end up at ~100% CPU usage. Performing the above in
> parallel *without downtime* might result in more inconsistency across nodes.
> We have not tested this scenario, so we would appreciate the group's help if
> anyone has done a similar upgrade in the past (i.e. the scenarios/complexity
> that need to be considered, and why the guideline recommends running
> upgradesstables one node at a time).
> -Shishir
> 
>> On Fri, Nov 29, 2019 at 11:52 PM Josh Snyder  wrote:
>> Hello Shishir,
>> 
>> It shouldn't be necessary to take downtime to perform upgrades of a 
>> Cassandra cluster. It sounds like the biggest issue you're facing is the 
>> upgradesstables step. upgradesstables is not strictly necessary before a 
>> Cassandra node re-enters the cluster to serve traffic; in my experience it 
>> is purely for optimizing the performance of the database once the software 
>> upgrade is complete. I recommend trying out an upgrade in a test environment 
>> without using upgradesstables, which should bring the 5 hours per node down 
>> to just a few minutes.
>> 
>> If you're running NetworkTopologyStrategy and you want to optimize further, 
>> you could consider performing the upgrade on multiple nodes within the same 
>> rack in parallel. When correctly configured, NetworkTopologyStrategy can 
>> protect your database from an outage of an entire rack. So performing an 
>> upgrade on a few nodes at a time within a rack is the same as a partial rack 
>> outage, from the database's perspective.
>> 
>> Have a nice upgrade!
>> 
>> Josh
>> 
>>> On Fri, Nov 29, 2019 at 7:22 AM Shishir Kumar  
>>> wrote:
>>> Hi,
>>> 
>>> We need input on a Cassandra upgrade strategy for the environment below:
>>> 1. We have datacenters across 4 geographies (multiple isolated deployments
>>> in each DC).
>>> 2. The number of Cassandra nodes in each deployment is between 6 and 24.
>>> 3. The data volume on each node is between 150 and 400 GB.
>>> 4. All production environments have DR set up.
>>> 5. We do not want downtime during the upgrade.
>>>
>>> We are planning a stack upgrade, but upgradesstables takes approx. 5 hours
>>> per node (when the data volume is approx. 200 GB).
>>> Options:
>>> No downtime - Per the recommendation (DataStax documentation) to upgrade
>>> one node at a time, i.e. in sequence, the upgrade cycle for one environment
>>> will take weeks, which is a DevOps concern.
>>> Read-only (no downtime) - Route read-only load to the DR system. We have
>>> resilience built in to handle mutation scenarios, but if the upgrade takes
>>> more than, say, 3-4 hours, there will be a long catch-up exercise. The
>>> maintenance cost seems too high because of the unknowns.
>>> Downtime - Upgrade all nodes in parallel while there are no live customers.
>>> This has a direct customer impact, so we would need to justify the
>>> maintenance cost against the customer impact.
>>> Please suggest how other organisations (those with 100+ nodes) are solving
>>> this scenario.
>>> 
>>> Regards 
>>> Shishir 
>>> 


Re: Upgrade strategy for high number of nodes

2019-11-29 Thread Shishir Kumar
Some more background. We are planning (and have tested) a binary upgrade
across all nodes without downtime. The next step is running upgradesstables.
The C* SSTable file format and version change from format big, version mc to
format bti, version aa (refer to
https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/tools/toolsSStables/ToolsSSTableupgrade.html
- the upgrade is from DSE 5.1 to 6.x), and that underlying change explains
why the upgrade takes so much time.
Running upgradesstables in parallel across racks is where I am unsure of the
impact (the documentation recommends running one node at a time). During
upgradesstables there are scenarios where it reports file corruption, which
requires a corrective step, i.e. scrub. Because of SSTable corruption, nodes
sometimes go down or end up at ~100% CPU usage. Performing the above in
parallel *without downtime* might result in more inconsistency across nodes.
We have not tested this scenario, so we would appreciate the group's help if
anyone has done a similar upgrade in the past (i.e. the scenarios/complexity
that need to be considered, and why the guideline recommends running
upgradesstables one node at a time).
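For reference, a rough way to watch the rewrite progress on a node (a sketch
only; the data directory and the big/bti filename patterns are assumptions
based on the DataStax page linked above, so adjust for your install):

  # count sstables still in the old big/mc format vs. the new bti format
  find /var/lib/cassandra/data -name '*-big-Data.db' | wc -l
  find /var/lib/cassandra/data -name '*-bti-Data.db' | wc -l

  # rewrite the remaining old-format sstables, one compaction job at a time
  nodetool upgradesstables -j 1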
-Shishir

On Fri, Nov 29, 2019 at 11:52 PM Josh Snyder  wrote:

> Hello Shishir,
>
> It shouldn't be necessary to take downtime to perform upgrades of a
> Cassandra cluster. It sounds like the biggest issue you're facing is the
> upgradesstables step. upgradesstables is not strictly necessary before a
> Cassandra node re-enters the cluster to serve traffic; in my experience it
> is purely for optimizing the performance of the database once the software
> upgrade is complete. I recommend trying out an upgrade in a test
> environment without using upgradesstables, which should bring the 5 hours
> per node down to just a few minutes.
>
> If you're running NetworkTopologyStrategy and you want to optimize
> further, you could consider performing the upgrade on multiple nodes within
> the same rack in parallel. When correctly configured,
> NetworkTopologyStrategy can protect your database from an outage of an
> entire rack. So performing an upgrade on a few nodes at a time within a
> rack is the same as a partial rack outage, from the database's perspective.
>
> Have a nice upgrade!
>
> Josh
>
> On Fri, Nov 29, 2019 at 7:22 AM Shishir Kumar 
> wrote:
>
>> Hi,
>>
>> We need input on a Cassandra upgrade strategy for the environment below:
>> 1. We have datacenters across 4 geographies (multiple isolated deployments
>> in each DC).
>> 2. The number of Cassandra nodes in each deployment is between 6 and 24.
>> 3. The data volume on each node is between 150 and 400 GB.
>> 4. All production environments have DR set up.
>> 5. We do not want downtime during the upgrade.
>>
>> We are planning a stack upgrade, but upgradesstables takes approx. 5 hours
>> per node (when the data volume is approx. 200 GB).
>> Options:
>> No downtime - Per the recommendation (DataStax documentation) to upgrade
>> one node at a time, i.e. in sequence, the upgrade cycle for one environment
>> will take weeks, which is a DevOps concern.
>> Read-only (no downtime) - Route read-only load to the DR system. We have
>> resilience built in to handle mutation scenarios, but if the upgrade takes
>> more than, say, 3-4 hours, there will be a long catch-up exercise. The
>> maintenance cost seems too high because of the unknowns.
>> Downtime - Upgrade all nodes in parallel while there are no live customers.
>> This has a direct customer impact, so we would need to justify the
>> maintenance cost against the customer impact.
>> Please suggest how other organisations (those with 100+ nodes) are solving
>> this scenario.
>>
>> Regards
>> Shishir
>>
>>


Re: Upgrade strategy for high number of nodes

2019-11-29 Thread Josh Snyder
Hello Shishir,

It shouldn't be necessary to take downtime to perform upgrades of a
Cassandra cluster. It sounds like the biggest issue you're facing is the
upgradesstables step. upgradesstables is not strictly necessary before a
Cassandra node re-enters the cluster to serve traffic; in my experience it
is purely for optimizing the performance of the database once the software
upgrade is complete. I recommend trying out an upgrade in a test
environment without using upgradesstables, which should bring the 5 hours
per node down to just a few minutes.
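For what it's worth, the per-node loop then shrinks to something like the
sketch below (the service name, package command, and post-checks are
assumptions that differ per install, so treat it as an outline rather than a
recipe):

  nodetool drain                     # flush memtables; node stops taking writes
  sudo systemctl stop cassandra
  sudo yum -y upgrade cassandra      # or your vendor's package / tarball swap
  sudo systemctl start cassandra
  nodetool status                    # wait for this node to report UN
  # run 'nodetool upgradesstables' later, in a separate off-peak pass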

If you're running NetworkTopologyStrategy and you want to optimize further,
you could consider performing the upgrade on multiple nodes within the same
rack in parallel. When correctly configured, NetworkTopologyStrategy can
protect your database from an outage of an entire rack. So performing an
upgrade on a few nodes at a time within a rack is the same as a partial
rack outage, from the database's perspective.
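If you go that route, something like this can confirm which nodes share a
rack before you batch them (a sketch; the awk fields assume the usual
'nodetool status' column layout, which can vary between versions):

  # print rack and address for every up/normal node, grouped by rack
  nodetool status | awk '/^UN/ {print $NF, $2}' | sort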

Have a nice upgrade!

Josh

On Fri, Nov 29, 2019 at 7:22 AM Shishir Kumar 
wrote:

> Hi,
>
> We need input on a Cassandra upgrade strategy for the environment below:
> 1. We have datacenters across 4 geographies (multiple isolated deployments
> in each DC).
> 2. The number of Cassandra nodes in each deployment is between 6 and 24.
> 3. The data volume on each node is between 150 and 400 GB.
> 4. All production environments have DR set up.
> 5. We do not want downtime during the upgrade.
>
> We are planning a stack upgrade, but upgradesstables takes approx. 5 hours
> per node (when the data volume is approx. 200 GB).
> Options:
> No downtime - Per the recommendation (DataStax documentation) to upgrade
> one node at a time, i.e. in sequence, the upgrade cycle for one environment
> will take weeks, which is a DevOps concern.
> Read-only (no downtime) - Route read-only load to the DR system. We have
> resilience built in to handle mutation scenarios, but if the upgrade takes
> more than, say, 3-4 hours, there will be a long catch-up exercise. The
> maintenance cost seems too high because of the unknowns.
> Downtime - Upgrade all nodes in parallel while there are no live customers.
> This has a direct customer impact, so we would need to justify the
> maintenance cost against the customer impact.
> Please suggest how other organisations (those with 100+ nodes) are solving
> this scenario.
>
> Regards
> Shishir
>
>


Upgrade strategy for high number of nodes

2019-11-29 Thread Shishir Kumar
Hi,

We need input on a Cassandra upgrade strategy for the environment below:
1. We have datacenters across 4 geographies (multiple isolated deployments
in each DC).
2. The number of Cassandra nodes in each deployment is between 6 and 24.
3. The data volume on each node is between 150 and 400 GB.
4. All production environments have DR set up.
5. We do not want downtime during the upgrade.

We are planning a stack upgrade, but upgradesstables takes approx. 5 hours
per node (when the data volume is approx. 200 GB).
Options:
No downtime - Per the recommendation (DataStax documentation) to upgrade
one node at a time, i.e. in sequence, the upgrade cycle for one environment
will take weeks, which is a DevOps concern.
Read-only (no downtime) - Route read-only load to the DR system. We have
resilience built in to handle mutation scenarios, but if the upgrade takes
more than, say, 3-4 hours, there will be a long catch-up exercise. The
maintenance cost seems too high because of the unknowns.
Downtime - Upgrade all nodes in parallel while there are no live customers.
This has a direct customer impact, so we would need to justify the
maintenance cost against the customer impact.
Please suggest how other organisations (those with 100+ nodes) are solving
this scenario.

Regards
Shishir


Re: AWS instance stop and start with EBS

2019-11-29 Thread Georg Brandemann
Hi Rahul,

Also have a look at https://issues.apache.org/jira/browse/CASSANDRA-14358.
We saw this on a 2.1.x cluster, where it also took ~10 minutes until the
restarted node was really fully available in the cluster; the echo ACKs
from some nodes simply never seemed to reach the target.
Georg

On Wed, 6 Nov 2019 at 21:41, Rahul Reddy <
rahulreddy1...@gmail.com> wrote:

> Thanks Daemeon,
>
> I will do that and post the results.
> I found a JIRA in the open state with a similar issue:
> https://issues.apache.org/jira/browse/CASSANDRA-13984
>
> On Wed, Nov 6, 2019 at 1:49 PM daemeon reiydelle 
> wrote:
>
>> No connection timeouts? No TCP-level retries? I am sorry, truly sorry, but
>> you have exceeded my capability. I have never seen a java.io timeout
>> without either a half-open session failure (no response) or multiple
>> retries.
>>
>> I am out of my depth, so please feel free to ignore this, but did you see
>> the packets that are making the initial connection (which must have timed
>> out)? Out of curiosity, a netstat -arn should be showing bad packets,
>> timeouts, etc. To see progress, create a simple shell script that dumps the
>> date, dumps netstat, sleeps 100 seconds, and repeats. During that window
>> stop the remote node, wait 10 seconds, and restart it.
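>> Something along these lines, for example (a rough sketch of the loop just
>> described; redirect the output to a file and adjust the interval to taste):
>>
>>   # dump the date and netstat every 100 seconds, forever
>>   while true; do
>>     date
>>     netstat -arn
>>     sleep 100
>>   done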
>>
>> <==>
>> Made weak by time and fate, but strong in will,
>> To strive, to seek, to find, and not to yield.
>> Ulysses - A. Lord Tennyson
>>
>> *Daemeon C.M. Reiydelle*
>>
>> *email: daeme...@gmail.com *
>> *San Francisco 1.415.501.0198/Skype daemeon.c.m.reiydelle*
>>
>>
>>
>> On Wed, Nov 6, 2019 at 9:11 AM Rahul Reddy 
>> wrote:
>>
>>> Thank you.
>>>
>>> I have stopped the instance in the east. I see that all other instances can
>>> gossip to that instance, and only one instance in the west is having issues
>>> gossiping to that node. When I enable debug mode I see the following on the
>>> west node.
>>>
>>> I see the messages below from 16:32 to 16:47:
>>> DEBUG [RMI TCP Connection(272)-127.0.0.1] 2019-11-06 16:44:50,417 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response from everybody:
>>> 424 StorageProxy.java:2361 - Hosts not in agreement. Didn't get a response from everybody:
>>>
>>> Later I see a timeout:
>>>
>>> DEBUG [MessagingService-Outgoing-/eastip-Gossip] 2019-11-06 16:47:04,831 OutboundTcpConnection.java:350 - Error writing to /eastip
>>> java.io.IOException: Connection timed out
>>>
>>> Then:
>>> INFO  [GossipStage:1] 2019-11-06 16:47:05,792 StorageService.java:2289 - Node /eastip state jump to NORMAL
>>>
>>> DEBUG [GossipStage:1] 2019-11-06 16:47:06,244 MigrationManager.java:99 - Not pulling schema from /eastip, because schema versions match: local/real=cdbb639b-1675-31b3-8a0d-84aca18e86bf, local/compatible=49bf1daa-d585-38e0-a72b-b36ce82da9cb, remote=cdbb639b-1675-31b3-8a0d-84aca18e86bf
>>>
>>> I tried running tcpdump during that time and I don't see any packet loss.
>>> I am still unsure why the east instance, which was stopped and started, was
>>> unreachable from the west node for almost 15 minutes.
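>>> For reference, the capture was roughly of this form (a sketch; the
>>> interface, the placeholder address, and the default inter-node port 7000
>>> are assumptions for this cluster):
>>>
>>>   sudo tcpdump -nn -i eth0 'host <east-node-ip> and port 7000'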
>>>
>>>
>>> On Tue, Nov 5, 2019 at 10:14 PM daemeon reiydelle 
>>> wrote:
>>>
 10 minutes is 600 seconds, and there are several timeouts that are set
 to that, including the data center timeout as I recall.

 You may be forced to tcpdump the interface(s) to see where the chatter
 is. Out of curiosity, when you restart the node, have you snapped the jvm's
 memory to see if e.g. heap is even in use?


 On Tue, Nov 5, 2019 at 7:03 PM Rahul Reddy 
 wrote:

> Thanks Ben,
> Before stopping the EC2 instance I did run nodetool drain, so I ruled that
> out, and system.log also doesn't show commit logs being applied.
>
>
>
>
>
> On Tue, Nov 5, 2019, 7:51 PM Ben Slater 
> wrote:
>
>> The logs between first start and handshaking should give you a
>> clue but my first guess would be replaying commit logs.
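>> A quick check for that on the restarted node is something like (a sketch;
>> the log path is the package default and may differ):
>>
>>   grep -iE 'replay|commitlog' /var/log/cassandra/system.log | tail -20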
>>
>> Cheers
>> Ben
>>
>> ---
>>
>>
>> *Ben Slater**Chief Product Officer*
>>
>> Read our latest technical blog posts here.
>>
>> This email has been sent on behalf of Instaclustr Pty. Limited
>> (Australia) and Instaclustr Inc (USA).
>>
>> This email and any attachments may contain confidential and legally
>> privileged information.  If you are not the intended recipient, do not 
>> copy
>> or disclose its content, but please reply to this email immediately and
>> highlight the error to the sender and then immediately delete the 
>> message.
>>
>>
>> On Wed, 6 Nov 2019 at 04:36, Rahul Reddy 
>> wrote:
>>
>>> I can reproduce the issue.
>>>
>>> I did drain Cassandra node then stop and started Cassandra 

performance

2019-11-29 Thread hahaha sc
Query based on a non-primary-key field that has a secondary index, and then
update based on the primary key. Can this be more efficient than MySQL?
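In CQL terms the pattern being asked about looks roughly like the sketch
below (the keyspace, table, and column names are made up, and the statements
are run through cqlsh):

  cqlsh -e "CREATE INDEX IF NOT EXISTS ON shop.orders (status);"

  # read path: filter on a non-primary-key column via the secondary index
  cqlsh -e "SELECT order_id FROM shop.orders WHERE status = 'pending';"

  # write path: update a single row by its primary key
  cqlsh -e "UPDATE shop.orders SET status = 'shipped' WHERE order_id = 42;"

Whether the read side beats MySQL depends mostly on how selective the indexed
column is, since a secondary-index query may fan out to many nodes; the
primary-key update itself is just a regular Cassandra write.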