keith-turner commented on a change in pull request #1646:
URL: https://github.com/apache/accumulo/pull/1646#discussion_r448499885



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/MinorCompactor.java
##########
@@ -148,6 +149,13 @@ public CompactionStats call() {
           reportedProblem = true;
         } catch (CompactionCanceledException e) {
           throw new IllegalStateException(e);
+        } catch (Throwable t) {

Review comment:
       > Now I am thinking that a Halt just does not feel right, and I fear 
that given the appropriate circumstances could result in a cascade of tserver 
deaths. 
   
   That could happen.  Part of the problem is that some errors are benign and 
others are likely an indication of a catastrophe.  Maybe configuration is the 
best option; we could have three configuration items:
   
    * Configurable list of errors to retry on.
    * Configurable list of errors to halt on.
    * Action to take for errors that fall in neither list : hang, retry, halt. 
   
   This way, when a new error is encountered, as in #1644, its error class 
could just be added to the config for retry.
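
   A minimal sketch of how the three items above might fit together. All names 
here (`ErrorPolicy`, `Action`, `actionFor`) are hypothetical and not existing 
Accumulo code. Matching on superclasses, with the most specific listed class 
winning, is an assumption of this sketch and not something settled in this 
discussion.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Accumulo.
final class ErrorPolicy {

  // The three fallback actions named in the list above.
  enum Action { HANG, RETRY, HALT }

  private final Set<String> retryOn; // fully qualified class names to retry on
  private final Set<String> haltOn;  // fully qualified class names to halt on
  private final Action defaultAction; // used when neither list matches

  ErrorPolicy(Set<String> retryOn, Set<String> haltOn, Action defaultAction) {
    this.retryOn = new HashSet<>(retryOn);
    this.haltOn = new HashSet<>(haltOn);
    this.defaultAction = defaultAction;
  }

  // Assumption: walk up the class hierarchy so that listing a superclass also
  // covers its subclasses, and the most specific listed class wins.
  Action actionFor(Throwable t) {
    for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
      String name = c.getName();
      if (haltOn.contains(name)) {
        return Action.HALT;
      }
      if (retryOn.contains(name)) {
        return Action.RETRY;
      }
    }
    return defaultAction;
  }
}
```

   A `catch (Throwable t)` block like the one in this diff could then switch on 
`actionFor(t)` instead of hard-coding a halt.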
   
   >  It could be that all of these changes go away and I simply catch the 
NoClassDefFoundError and call it a day.
   
   Maybe a follow-on issue would be best.  I feel like this is a wider issue, 
because any thread could encounter an error, and having a single mechanism for 
handling unexpected errors in server processes seems useful.
   
   I am also wondering if emitting metrics for errors would be useful.  If the 
cardinality is deemed low enough, we could emit metrics for each error class 
name.  Then the metrics system would show when a tserver has an out-of-memory 
error.  This would be a follow-on issue if it seems like something that might 
be useful.
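
   As a rough illustration of per-error-class counting, assuming nothing about 
the actual metrics system: the `ErrorCounters` class below is hypothetical and 
only keeps in-memory counters keyed by class name. A real version would publish 
these through whatever metrics system the server uses.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; not part of Accumulo's metrics code.
final class ErrorCounters {

  // One counter per error class name; cardinality grows with distinct classes.
  private final ConcurrentMap<String, LongAdder> counts = new ConcurrentHashMap<>();

  // Record one occurrence of the given error, keyed by its class name.
  void record(Throwable t) {
    counts.computeIfAbsent(t.getClass().getName(), k -> new LongAdder()).increment();
  }

  // Current count for a class name, or 0 if it was never recorded.
  long count(String className) {
    LongAdder adder = counts.get(className);
    return adder == null ? 0L : adder.sum();
  }
}
```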



