> On Jan. 15, 2016, 3:33 p.m., John Sirois wrote:
> > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java, 
> > line 152
> > <https://reviews.apache.org/r/42332/diff/1/?file=1197802#file1197802line152>
> >
> >     You could centralize this handling in deleteTasks - it has all the ids 
> > to give as useful a message as you have here, and that localizes the 
> > otherwise overly broad looking RTE catch.  It'll clean up the temp you had 
> > to introduce below as well.  Further, it'll give you 1 nice spot to add a 
> > comment explaining why termination on 1 failed prune is sane; ie: you can 
> > point out the inevitable history growth issue and that rolling back is 
> > better than waiting for the failure and then rolling back.

The objective is to catch any unexpected exceptions in the Runnable given to 
the executor. For example the exception that triggered this investigation did 
not come from delete tasks but from sorting the set of tasks to delete:
````
FluentIterable
        .from(Tasks.LATEST_ACTIVITY.sortedCopy(inactiveTasks))
        .filter(safeToDelete)
        .limit(tasksToPrune)
        .transform(Tasks::id)
        .toSet();
````

I agree that this approach is not ideal. I'm not sure on how to change this 
without changing the executor to have some sort of ListenableFuture that does 
something `onFailure`.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42332/#review114795
-----------------------------------------------------------


On Jan. 14, 2016, 4:53 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42332/
> -----------------------------------------------------------
> 
> (Updated Jan. 14, 2016, 4:53 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1582
>     https://issues.apache.org/jira/browse/AURORA-1582
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Task pruning is critical to the operation of a large cluster. Failure to 
> prune tasks can lead to storage growing in unexpected ways leading to 
> scheduling slowdown. This patch adds shutdown on task pruning failure to 
> prevent the scheduler from entering an undefined state.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java 
> 2064089937f5178b1413d386a312f4173a0e35fb 
>   
> src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
> 295960f13693c6ba0d7075a8ef7f9680a91ae69d 
> 
> Diff: https://reviews.apache.org/r/42332/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew build -Pq
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>

Reply via email to