Re: Dropping down replication factor

2017-08-12 Thread Brian Spindler
Nothing in the logs on the node it was streaming from.

However, I think I found the issue on the other node in the C rack:

ERROR [STREAM-IN-/10.40.17.114] 2017-08-12 16:48:53,354
StreamSession.java:512 - [Stream #08957970-7f7e-11e7-b2a2-a31e21b877e5]
Streaming error occurred
org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted:
/ephemeral/cassandra/data/...

I ran 'cat /var/log/cassandra/system.log | grep Corrupt' and it looks like it's
a single Index.db file that's affected; nothing similar on the other node.

I think nodetool scrub or offline sstablescrub might be in order but with
the current load I'm not sure I can take it offline for very long.
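
For reference, a rough sketch of the two options (keyspace/table names are
placeholders, since the corrupted Index.db path is truncated above):

    # online: the node stays up, but scrub is I/O heavy
    nodetool scrub <keyspace> <table>

    # offline: stop cassandra on that node first, then run
    sstablescrub <keyspace> <table>

If the corruption really is limited to that one table's Index.db, scrubbing
just that table should be enough.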

Thanks again for the help.


On Sat, Aug 12, 2017 at 9:38 PM Jeffrey Jirsa  wrote:

> Compaction is backed up – that may be normal write load (because of the
> rack imbalance), or it may be a secondary index build. Hard to say for
> sure. ‘nodetool compactionstats’ if you’re able to provide it. The jstack
> probably not necessary, streaming is being marked as failed and it’s
> turning itself off. Not sure why streaming is marked as failing, though,
> anything on the sending sides?
>
>
>
>
>
> From: Brian Spindler 
> Reply-To: 
> Date: Saturday, August 12, 2017 at 6:34 PM
> To: 
> Subject: Re: Dropping down replication factor
>
> Thanks for replying Jeff.
>
> Responses below.
>
> On Sat, Aug 12, 2017 at 8:33 PM Jeff Jirsa  wrote:
>
>> Answers inline
>>
>> --
>> Jeff Jirsa
>>
>>
>> > On Aug 12, 2017, at 2:58 PM, brian.spind...@gmail.com wrote:
>> >
>> > Hi folks, hopefully a quick one:
>> >
>> > We are running a 12 node cluster (2.1.15) in AWS with Ec2Snitch.  It's
>> all in one region but spread across 3 availability zones.  It was nicely
>> balanced with 4 nodes in each.
>> >
>> > But with a couple of failures and subsequent provisions to the wrong az
>> we now have a cluster with :
>> >
>> > 5 nodes in az A
>> > 5 nodes in az B
>> > 2 nodes in az C
>> >
>> > Not sure why, but when adding a third node in AZ C it fails to stream
>> after getting all the way to completion and no apparent error in logs.
>> I've looked at a couple of bugs referring to scrubbing and possible OOM
>> bugs due to metadata writing at end of streaming (sorry don't have ticket
>> handy).  I'm worried I might not be able to do much with these since the
>> disk space usage is high and they are under a lot of load given the small
>> number of them for this rack.
>>
>> You'll definitely have higher load on az C instances with rf=3 in this
>> ratio
>
>
>> Streaming should still work - are you sure it's not busy doing something?
>> Like building secondary index or similar? jstack thread dump would be
>> useful, or at least nodetool tpstats
>>
> Only other thing might be a backup.  We do incrementals x1hr and
> snapshots x24h; they are shipped to s3 then links are cleaned up.  The
> error I get on the node I'm trying to add to rack C is:
>
> ERROR [main] 2017-08-12 23:54:51,546 CassandraDaemon.java:583 - Exception
> encountered during startup
> java.lang.RuntimeException: Error during boostrap: Stream failed
> at
> org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:87)
> ~[apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1166)
> ~[apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:944)
> ~[apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:740)
> ~[apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:617)
> ~[apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391)
> [apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566)
> [apache-cassandra-2.1.15.jar:2.1.15]
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655)
> [apache-cassandra-2.1.15.jar:2.1.15]
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
> at
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> ~[apache-cassandra-2.1.15.jar:2.1.15]
> at
> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
> ~[guava-16.0.jar:na]
> at
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
> ~[guava-16.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> ~[guava-16.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> ~[guava-16.0.jar:na]
> at
> 

Re: Dropping down replication factor

2017-08-12 Thread Jeffrey Jirsa
Compaction is backed up – that may be normal write load (because of the rack
imbalance), or it may be a secondary index build. Hard to say for sure.
'nodetool compactionstats' if you're able to provide it. The jstack probably
not necessary, streaming is being marked as failed and it's turning itself
off. Not sure why streaming is marked as failing, though, anything on the
sending sides?
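
If it helps, that's just run on the affected node; the pending task count and
the compaction types listed are the interesting parts:

    nodetool compactionstats

A long backlog of pending tasks, or entries that are index builds rather than
plain compactions, would point at the causes above.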





From:  Brian Spindler 
Reply-To:  
Date:  Saturday, August 12, 2017 at 6:34 PM
To:  
Subject:  Re: Dropping down replication factor

Thanks for replying Jeff.

Responses below. 

On Sat, Aug 12, 2017 at 8:33 PM Jeff Jirsa  wrote:
> Answers inline
> 
> --
> Jeff Jirsa
> 
> 
>> > On Aug 12, 2017, at 2:58 PM, brian.spind...@gmail.com wrote:
>> >
>> > Hi folks, hopefully a quick one:
>> >
>> > We are running a 12 node cluster (2.1.15) in AWS with Ec2Snitch.  It's all
>> in one region but spread across 3 availability zones.  It was nicely balanced
>> with 4 nodes in each.
>> >
>> > But with a couple of failures and subsequent provisions to the wrong az we
>> now have a cluster with :
>> >
>> > 5 nodes in az A
>> > 5 nodes in az B
>> > 2 nodes in az C
>> >
>> > Not sure why, but when adding a third node in AZ C it fails to stream after
>> getting all the way to completion and no apparent error in logs.  I've looked
>> at a couple of bugs referring to scrubbing and possible OOM bugs due to
>> metadata writing at end of streaming (sorry don't have ticket handy).  I'm
>> worried I might not be able to do much with these since the disk space usage
>> is high and they are under a lot of load given the small number of them for
>> this rack.
> 
> You'll definitely have higher load on az C instances with rf=3 in this ratio
> 
> Streaming should still work - are you sure it's not busy doing something? Like
> building secondary index or similar? jstack thread dump would be useful, or at
> least nodetool tpstats
> 
Only other thing might be a backup.  We do incrementals x1hr and snapshots
x24h; they are shipped to s3 then links are cleaned up.  The error I get on
the node I'm trying to add to rack C is:

ERROR [main] 2017-08-12 23:54:51,546 CassandraDaemon.java:583 - Exception
encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed
at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:87) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1166) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:944) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.15.jar:2.1.15]
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-2.1.15.jar:2.1.15]
at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-16.0.jar:na]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-16.0.jar:na]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-16.0.jar:na]
at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:209) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:185) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:413) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.streaming.StreamSession.maybeCompleted(StreamSession.java:700) ~[apache-cassandra-2.1.15.jar:2.1.15]
at org.apache.cassandra.streaming.StreamSession.taskCompleted(StreamSession.java:661) ~[apache-cassandra-2.1.15.jar:2.1.15]
at 

Re: Dropping down replication factor

2017-08-12 Thread Brian Spindler
Thanks for replying Jeff.

Responses below.

On Sat, Aug 12, 2017 at 8:33 PM Jeff Jirsa  wrote:

> Answers inline
>
> --
> Jeff Jirsa
>
>
> > On Aug 12, 2017, at 2:58 PM, brian.spind...@gmail.com wrote:
> >
> > Hi folks, hopefully a quick one:
> >
> > We are running a 12 node cluster (2.1.15) in AWS with Ec2Snitch.  It's
> all in one region but spread across 3 availability zones.  It was nicely
> balanced with 4 nodes in each.
> >
> > But with a couple of failures and subsequent provisions to the wrong az
> we now have a cluster with :
> >
> > 5 nodes in az A
> > 5 nodes in az B
> > 2 nodes in az C
> >
> > Not sure why, but when adding a third node in AZ C it fails to stream
> after getting all the way to completion and no apparent error in logs.
> I've looked at a couple of bugs referring to scrubbing and possible OOM
> bugs due to metadata writing at end of streaming (sorry don't have ticket
> handy).  I'm worried I might not be able to do much with these since the
> disk space usage is high and they are under a lot of load given the small
> number of them for this rack.
>
> You'll definitely have higher load on az C instances with rf=3 in this
> ratio


> Streaming should still work - are you sure it's not busy doing something?
> Like building secondary index or similar? jstack thread dump would be
> useful, or at least nodetool tpstats
>
Only other thing might be a backup.  We do incrementals x1hr and snapshots
x24h; they are shipped to s3 then links are cleaned up.  The error I get on
the node I'm trying to add to rack C is:

ERROR [main] 2017-08-12 23:54:51,546 CassandraDaemon.java:583 - Exception
encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed
at
org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:87)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1166)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:944)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:740)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:617)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391)
[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566)
[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655)
[apache-cassandra-2.1.15.jar:2.1.15]
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
at
org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
~[guava-16.0.jar:na]
at
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
~[guava-16.0.jar:na]
at
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
~[guava-16.0.jar:na]
at
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
~[guava-16.0.jar:na]
at
com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
~[guava-16.0.jar:na]
at
org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:209)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:185)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:413)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.streaming.StreamSession.maybeCompleted(StreamSession.java:700)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.streaming.StreamSession.taskCompleted(StreamSession.java:661)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:179)
~[apache-cassandra-2.1.15.jar:2.1.15]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
~[na:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
~[na:1.8.0_112]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[na:1.8.0_112]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
~[na:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_112]
WARN  [StorageServiceShutdownHook] 2017-08-12 23:54:51,582
Gossiper.java:1462 - No local state or state is in silent 

Re: Dropping down replication factor

2017-08-12 Thread Jeff Jirsa
Answers inline

-- 
Jeff Jirsa


> On Aug 12, 2017, at 2:58 PM, brian.spind...@gmail.com wrote:
> 
> Hi folks, hopefully a quick one:
> 
> We are running a 12 node cluster (2.1.15) in AWS with Ec2Snitch.  It's all in 
> one region but spread across 3 availability zones.  It was nicely balanced 
> with 4 nodes in each.
> 
> But with a couple of failures and subsequent provisions to the wrong az we 
> now have a cluster with : 
> 
> 5 nodes in az A
> 5 nodes in az B
> 2 nodes in az C
> 
> Not sure why, but when adding a third node in AZ C it fails to stream after 
> getting all the way to completion and no apparent error in logs.  I've looked 
> at a couple of bugs referring to scrubbing and possible OOM bugs due to 
> metadata writing at end of streaming (sorry don't have ticket handy).  I'm 
> worried I might not be able to do much with these since the disk space usage 
> is high and they are under a lot of load given the small number of them for 
> this rack.

You'll definitely have higher load on az C instances with rf=3 in this ratio

Streaming should still work - are you sure it's not busy doing something? Like 
building secondary index or similar? jstack thread dump would be useful, or at 
least nodetool tpstats
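
For a quick first look before reaching for jstack:

    nodetool tpstats

The pending and blocked columns are the ones worth eyeballing - CompactionExecutor
in particular, since index builds generally show up there too.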


> 
> Rather than troubleshoot this further, what I was thinking about doing was:
> - drop the replication factor on our keyspace to two

Repair before you do this, or you'll lose your consistency guarantees
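
Roughly, and assuming the keyspace uses NetworkTopologyStrategy with the
Ec2Snitch region as its DC name (both names below are placeholders):

    # full repair first, run on every node
    nodetool repair -pr <keyspace>

    # then, in cqlsh, drop RF for that DC to 2
    ALTER KEYSPACE <keyspace>
      WITH replication = {'class': 'NetworkTopologyStrategy', '<region_dc>': 2};

followed by 'nodetool cleanup' on each node to reclaim the space for replicas
that node no longer owns.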

> - hopefully this would reduce load on these two remaining nodes 

It should. Rack awareness guarantees one replica per rack if rf == num racks, so
right now those 2 C machines have 2.5x as much data as the others. This will
drop that requirement and drop the load significantly.
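
(The arithmetic: rack awareness puts one full copy of the data in each rack, so
each A/B node holds roughly 1/5 of a copy while each C node holds 1/2, and
0.5 / 0.2 = 2.5.)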

> - run repairs/cleanup across the cluster 
> - then shoot these two nodes in the 'c' rack

Why shoot the c instances? Why not drop RF and then add 2 more C instances, 
then increase RF back to 3, run repair, then Decom the extra instances in a and 
b?
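
Sketched out (untested, purely as an ordering illustration), that alternative
would look something like:

    # 1. repair, then ALTER KEYSPACE ... RF 2 for the DC, as above
    # 2. bootstrap two new nodes in az C (auto_bootstrap defaults to true)
    # 3. ALTER KEYSPACE ... back to RF 3 for the DC
    # 4. nodetool repair on every node, then nodetool cleanup
    # 5. nodetool decommission one extra node in az A and one in az B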


> - run repairs/cleanup across the cluster
> 
> Would this work with minimal/no disruption? 

The big risk of running rf=2 is that quorum==all - any gc pause or node 
restarting will make you lose HA or strong consistency guarantees.
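
(Concretely: QUORUM = floor(RF/2) + 1, so with RF=2 that is floor(2/2) + 1 = 2,
i.e. every replica must respond; a single node down or pausing fails QUORUM
reads and writes for its ranges.)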

> Should I update their "rack" before hand or after ?

You can't change a node's rack once it's in the cluster; it SHOULD refuse to
start if you do that.

> What else am I not thinking about? 
> 
> My main goal atm is to get back to where the cluster is in a clean consistent 
> state that allows nodes to properly bootstrap.
> 
> Thanks for your help in advance.
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Dropping down replication factor

2017-08-12 Thread brian . spindler
Hi folks, hopefully a quick one:

We are running a 12 node cluster (2.1.15) in AWS with Ec2Snitch.  It's all in 
one region but spread across 3 availability zones.  It was nicely balanced with 
4 nodes in each.

But with a couple of failures and subsequent provisions to the wrong az we now 
have a cluster with : 

5 nodes in az A
5 nodes in az B
2 nodes in az C

Not sure why, but when adding a third node in AZ C it fails to stream after 
getting all the way to completion and no apparent error in logs.  I've looked 
at a couple of bugs referring to scrubbing and possible OOM bugs due to 
metadata writing at end of streaming (sorry don't have ticket handy).  I'm 
worried I might not be able to do much with these since the disk space usage is 
high and they are under a lot of load given the small number of them for this 
rack.

Rather than troubleshoot this further, what I was thinking about doing was:
- drop the replication factor on our keyspace to two
- hopefully this would reduce load on these two remaining nodes 
- run repairs/cleanup across the cluster 
- then shoot these two nodes in the 'c' rack
- run repairs/cleanup across the cluster

Would this work with minimal/no disruption? 
Should I update their "rack" before hand or after ?
What else am I not thinking about? 

My main goal atm is to get back to where the cluster is in a clean consistent 
state that allows nodes to properly bootstrap.

Thanks for your help in advance.
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Answering the question of Cassandra Summit 2017

2017-08-12 Thread Jeff Jirsa
Thanks for sending that out Patrick!

Really sad not to have a 2017 summit, but looking forward to 2018
summit(s).




On Fri, Aug 11, 2017 at 5:27 PM, Patrick McFadin  wrote:

> Hello Cassandra Community,
>
> I know this is a hot topic so let me TL;DR for those of you looking for
> the short answer. Will there be a Cassandra Summit in 2017? DataStax will
> not be holding a Cassandra Summit in 2017, but instead multiple DataStax
> Summits in 2018.
>
> More details. Last year was pretty chaotic for the Cassandra community and
> where DataStax fit with the project. I don’t need to re-cap all the drama.
> You can go look at the dev and user lists around this time last year if you
> want to re-live it. It’s safe to say 2016 is a year I wouldn’t want to do
> again for a lot of reasons. Those of us at the Cassandra Summit 2016 knew
> it was the end of something, now the question is what’s next and what will
> it be?
>
> Having a place for people to come together and talk about what we as a community
> do is really important. Those of you that know me, know how much I live
> that. When we started talking summit inside DataStax, we realized it would
> be a hot button issue. When I started talking to people in the community,
> it was even more of a hot button issue. Having DataStax run the Cassandra
> Summit was going to cause a lot of heartache and would further divide the
> community with questions and accusations. It’s just much easier to hold a
> DataStax Summit so we are just out there plainly and move forward.
>
> What DataStax will be doing different.
>
> We will be moving back to a more regional format instead of the big bang
> single event in San Jose starting in 2018. Fun fact. Almost 80% of
> attendees of Cassandra Summit were from the Bay Area. That means we have
> developers and operators from a lot of other places being excluded which
> isn’t cool. We will also be inviting talks from the Cassandra Community.
> You don’t have to be a DataStax customer or partner to get on the speaking
> list.
>
> If there is some new group or company that launches a Cassandra Summit,
> DataStax will happily be a sponsor. There are some for-profit, professional
> conference companies like the Linux Foundation out there that just may and
> if so, I’ll see you there. After being involved in making the Cassandra
> Summit happen for years, I can say it’s no small effort.
>
> There it is. Fire away with your questions, comments. All I ask is keep it
> respectful because this is a community of amazing people. You have changed
> the world over these years and I know it won’t stop. You know I got a hug
> for you wherever we just happen to meet.
>
> Patrick
>
>