[jira] [Commented] (CASSANDRA-13555) Thread leak during repair

2017-05-31 Thread Simon Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032401#comment-16032401
 ] 

Simon Zhou commented on CASSANDRA-13555:


Thanks [~tjake] for the comment. I'll work on the patch, but I'm not sure 
that is the best fix. Reasons:
1. The "executor" is created in RepairRunnable and runs all RepairJobs for a 
given keyspace. It's not a single RepairSession instance's responsibility to 
stop the "executor", nor does it have a reference to it.
2. The bigger problem is: why do we handle "node down" in RepairSession? 
IMHO it should be handled at a higher level. That is, once an endpoint is 
down, we should stop all RepairRunnables. Sure, there could be improvements, 
e.g., stopping only the affected RepairSessions (token ranges). But we 
are not doing this today, and it deserves a separate change.

What do you think? I know there is a bigger change coming in 4.0, but I don't 
want a band-aid fix that just makes things messy.
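
To make point 2 a bit more concrete, here is a minimal sketch of "handle node down at the level that owns the executor". It is only an illustration under stated assumptions, not Cassandra code: the `Coordinator` class, its methods, and the endpoint address are hypothetical, and JDK `CompletableFuture` stands in for the futures the repair code actually uses.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CoordinatorAbortSketch {

    /** Hypothetical stand-in for whatever creates and owns the shared executor. */
    static class Coordinator {
        private final ExecutorService executor = Executors.newFixedThreadPool(2);
        private final List<CompletableFuture<Void>> outstanding = new CopyOnWriteArrayList<>();

        /** Submits a job that blocks until its (simulated) validation result arrives. */
        void submitJob() {
            CompletableFuture<Void> validation = new CompletableFuture<>();
            outstanding.add(validation);
            executor.submit(() -> {
                try {
                    validation.join();
                } catch (RuntimeException e) {
                    // Validation failed: the job ends instead of waiting forever.
                }
            });
        }

        /** Called at the top level when an endpoint is reported down. */
        void onEndpointDown(String endpoint) {
            IOException cause = new IOException("Endpoint " + endpoint + " died");
            for (CompletableFuture<Void> f : outstanding) {
                f.completeExceptionally(cause); // no-op for futures that already completed
            }
            // The owner of the executor is the one that stops it. Already
            // submitted jobs still run, and they finish because they are unblocked.
            executor.shutdown();
        }

        boolean awaitStopped(long timeoutSeconds) throws InterruptedException {
            return executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        Coordinator coordinator = new Coordinator();
        coordinator.submitJob();
        coordinator.submitJob();
        coordinator.onEndpointDown("/10.0.0.1"); // hypothetical address
        System.out.println("stopped=" + coordinator.awaitStopped(5)); // prints "stopped=true"
    }
}
```

The design choice being illustrated is only about ownership: the component that created the executor both releases the blocked jobs and shuts the executor down, so no single session needs a reference to it.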

> Thread leak during repair
> -------------------------
>
> Key: CASSANDRA-13555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13555
> Project: Cassandra
> Issue Type: Bug
> Reporter: Simon Zhou
> Assignee: Simon Zhou
>
> The symptom is similar to what happened in [CASSANDRA-13204 | 
> https://issues.apache.org/jira/browse/CASSANDRA-13204]: a thread 
> waits forever doing nothing. This one happened during "nodetool repair -pr 
> -seq -j 1" in production, but I can easily reproduce the problem with just 
> "nodetool repair" in a dev environment (CCM). I'll try to explain what 
> happened using the 3.0.13 code base.
> 1. One node went down during repair. This is the error I saw in production:
> {code}
> ERROR [GossipTasks:1] 2017-05-19 15:00:10,545 RepairSession.java:334 - [repair #bc9a3cd1-3ca3-11e7-a44a-e30923ac9336] session completed with the following error
> java.io.IOException: Endpoint /10.185.43.15 died
>     at org.apache.cassandra.repair.RepairSession.convict(RepairSession.java:333) ~[apache-cassandra-3.0.11.jar:3.0.11]
>     at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:306) [apache-cassandra-3.0.11.jar:3.0.11]
>     at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:766) [apache-cassandra-3.0.11.jar:3.0.11]
>     at org.apache.cassandra.gms.Gossiper.access$800(Gossiper.java:66) [apache-cassandra-3.0.11.jar:3.0.11]
>     at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:181) [apache-cassandra-3.0.11.jar:3.0.11]
>     at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) [apache-cassandra-3.0.11.jar:3.0.11]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_121]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_121]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.11.jar:3.0.11]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {code}
> 2. At this moment the repair coordinator hasn't received the response 
> (MerkleTrees) from the node that was marked down. This means RepairJob#run 
> will never return, because it waits for validations to finish:
> {code}
> // Wait for validation to complete
> Futures.getUnchecked(validations);
> {code}
> Note that all RepairJobs (as Runnables) run on a shared executor created 
> in RepairRunnable#runMayThrow, while all snapshot, validation and sync 
> work happens on a per-RepairSession "taskExecutor". RepairJob#run will only 
> return when it receives MerkleTrees (or null) from all endpoints for a given 
> column family and token range.
> As evidence of the thread leak, below is an excerpt from the thread dump. I 
> get the same stack trace when simulating the issue in a dev environment.
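> {code}
> "Repair#129:56" #406373 daemon prio=5 os_prio=0 tid=0x7fc495028400 nid=0x1a77d waiting on condition [0x7fc02153]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0x0002d7c00198> (a com.google.common.util.concurrent.AbstractFuture$Sync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>     at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
>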

[jira] [Commented] (CASSANDRA-13555) Thread leak during repair

2017-05-30 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029431#comment-16029431
 ] 

T Jake Luciani commented on CASSANDRA-13555:


Just realised you already found that :)

We only block on validations because we want to throttle them (since 
they require compactors). If we don't, it can overwhelm the replicas with 
pending work for each subrange.


[jira] [Commented] (CASSANDRA-13555) Thread leak during repair

2017-05-30 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16029421#comment-16029421
 ] 

T Jake Luciani commented on CASSANDRA-13555:


The issue here is that we aren't finishing off the validations when the repair 
session terminates:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairSession.java#L302

So we just need to walk that array and set the results to null. The same goes 
for the syncTasks.
[~szhou] would you be able to make a patch (and ideally a dtest)?
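
For illustration only, here is a rough sketch of the "walk the pending results and set them to null" idea. It is not the actual patch: the class and method names are hypothetical, plain strings stand in for MerkleTrees, and JDK `CompletableFuture` is swapped in for the Guava futures shown in the issue description so the example runs without extra dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ValidationReleaseSketch {

    /** One not-yet-completed result per endpoint (hypothetical stand-in for validation tasks). */
    static List<CompletableFuture<String>> newPendingValidations(int endpoints) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (int i = 0; i < endpoints; i++) {
            pending.add(new CompletableFuture<>());
        }
        return pending;
    }

    /** Blocks until every validation has a result, like the wait in the job. */
    static void awaitAll(List<CompletableFuture<String>> validations) {
        CompletableFuture.allOf(validations.toArray(new CompletableFuture[0])).join();
    }

    /** When the session terminates, complete whatever is still outstanding with null. */
    static int releasePending(List<CompletableFuture<String>> validations) {
        int released = 0;
        for (CompletableFuture<String> f : validations) {
            if (f.complete(null)) { // returns false if the future was already completed
                released++;
            }
        }
        return released;
    }

    public static void main(String[] args) throws Exception {
        List<CompletableFuture<String>> validations = newPendingValidations(3);
        validations.get(0).complete("tree-from-node-1");
        validations.get(1).complete("tree-from-node-2");
        // The third node died: its future would never complete on its own.

        ExecutorService jobExecutor = Executors.newSingleThreadExecutor();
        Future<?> job = jobExecutor.submit(() -> awaitAll(validations));

        int released = releasePending(validations);
        job.get(5, TimeUnit.SECONDS); // returns only because the last future was released
        jobExecutor.shutdown();
        System.out.println("released=" + released + " jobDone=" + job.isDone());
        // prints "released=1 jobDone=true"
    }
}
```

Without the `releasePending` call, the waiting thread in this sketch would stay parked indefinitely, which is the same shape as the leaked "Repair#..." thread in the dump.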



