> On Jan. 15, 2016, 4:33 p.m., John Sirois wrote: > > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java, > > line 152 > > <https://reviews.apache.org/r/42332/diff/1/?file=1197802#file1197802line152> > > > > You could centralize this handling in deleteTasks - it has all the ids > > to give as useful a message as you have here, and that localizes the > > otherwise overly broad looking RTE catch. It'll clean up the temp you had > > to introduce below as well. Further, it'll give you 1 nice spot to add a > > comment explaining why termination on 1 failed prune is sane; ie: you can > > point out the inevitable history growth issue and that rolling back is > > better than waiting for the failure and then rolling back. > > Zameer Manji wrote: > The objective is to catch any unexpected exceptions in the Runnable given > to the executor. For example the exception that triggered this investigation > did not come from delete tasks but from sorting the set of tasks to delete: > ```` > FluentIterable > .from(Tasks.LATEST_ACTIVITY.sortedCopy(inactiveTasks)) > .filter(safeToDelete) > .limit(tasksToPrune) > .transform(Tasks::id) > .toSet(); > ```` > > I agree that this approach is not ideal. I'm not sure on how to change > this without changing the executor to have some sort of ListenableFuture that > does something `onFailure`.
Right, and that sounds like a great idea to me, it would be nice to have a more centralized way to deal with this and a ListenableFutures provide it: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/ListeningExecutorService.html http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/MoreExecutors.html#listeningDecorator(java.util.concurrent.ExecutorService) Not sure if you want to pursue that idea and drop this or do this change, add the larger change then flip over to it later. - John ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/42332/#review114795 ----------------------------------------------------------- On Jan. 14, 2016, 5:53 p.m., Zameer Manji wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/42332/ > ----------------------------------------------------------- > > (Updated Jan. 14, 2016, 5:53 p.m.) > > > Review request for Aurora, John Sirois and Maxim Khutornenko. > > > Bugs: AURORA-1582 > https://issues.apache.org/jira/browse/AURORA-1582 > > > Repository: aurora > > > Description > ------- > > Task pruning is critical to the operation of a large cluster. Failure to > prune tasks can lead to storage growing in unexpected ways leading to > scheduling slowdown. This patch adds shutdown on task pruning failure to > prevent the scheduler from entering an undefined state. > > > Diffs > ----- > > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java > 2064089937f5178b1413d386a312f4173a0e35fb > > src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java > 295960f13693c6ba0d7075a8ef7f9680a91ae69d > > Diff: https://reviews.apache.org/r/42332/diff/ > > > Testing > ------- > > ./gradlew build -Pq > > > Thanks, > > Zameer Manji > >
