kfaraz commented on issue #11165:
URL: https://github.com/apache/druid/issues/11165#issuecomment-832720263


   Thanks a lot for the feedback, @jihoonson! Sorry if the description was 
somewhat vague.
   
   > If I understand correctly, the ultimate goal is Richer error messages 
detailing what went wrong and possible actions for mitigation for Easier 
debugging and RCA (without looking at server logs)
   Yes, your understanding is correct.
   
   By `a unified error handling and reporting system`, I simply mean using 
the error codes and the corresponding exceptions wherever applicable.
   
   > From the user perspective, richer error message for easier debugging is 
useful when the error is something that the user can work around by adjusting 
some user settings.
   
   Agreed.
   
   > The Druid query system already has an error-code based error reporting 
system (see Query exception class). In the query system, there are a few 
user-facing errors and the current error reporting system has been working 
well. I think it already provides richer error message with a potential 
workaround.
   
   The proposed error code system closely resembles the existing system we have 
for query failures. We already have sufficient error codes and error messages 
for such failures. So, as you suggest, we can focus only on ingestion task 
failures in this proposal.
   
   > Thinking about what the "error code" actually is, is it simply a key to 
choose the proper error formatter when the error is exposed to users?
   
   Yes, "Error Code" is primarily a key to choose the correct error formatter 
while showing errors to the users.
   It is meant to serve two more (minor) purposes:
   - It is a concise way of representing specific failures in error logs
   - In a corner case where the Overlord encounters an error code that it does 
not understand (for example, one raised by an extension that is loaded only on 
the middle managers), the error code might be displayed as part of a generic 
message to the user, e.g. `Error occurred on middle manager while running task. 
Error Code: kafka-indexer.offset.outofrange.`
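   
   As a rough illustration of the lookup and the fallback described above, 
here is a minimal sketch. `ErrorFormatterRegistry`, `register` and `format` are 
hypothetical names chosen for this example and are not existing Druid APIs. Only 
the generic fallback message is taken from the text above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names exist in the Druid codebase.
final class ErrorFormatterRegistry {
    // Maps an error code to a formatter that builds the user-facing message.
    private final Map<String, Function<String, String>> formatters = new HashMap<>();

    void register(String errorCode, Function<String, String> formatter) {
        formatters.put(errorCode, formatter);
    }

    // Known code: use its curated formatter.
    // Unknown code (e.g. from an extension not loaded on the Overlord):
    // fall back to a generic message that still includes the code.
    String format(String errorCode, String detail) {
        Function<String, String> formatter = formatters.get(errorCode);
        if (formatter == null) {
            return "Error occurred on middle manager while running task. Error Code: "
                + errorCode + ".";
        }
        return formatter.apply(detail);
    }
}
```

   The fallback branch is what keeps an unrecognized code useful: the user 
still sees a stable identifier that can be searched for in the logs.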
   
   > Specifying the severity of errors and other potential side effects
   > Hiding implementation and other sensitive details from the end user
   
   In some places, we currently expose the names of exception classes in the 
error message itself. Through this proposal, we can prevent leaks of such 
information, as there would be a curated message corresponding to each known 
code, and any non-coded error could be represented as a miscellaneous error.
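   
   A minimal sketch of that behaviour, under assumptions: `UserFacingErrors`, 
`Coded`, `CodedException` and the wording of the miscellaneous message are all 
hypothetical and only illustrate the idea of never echoing the raw exception.

```java
import java.util.Map;

// Hypothetical sketch: these types are illustrative, not part of Druid.
final class UserFacingErrors {
    // Assumed wording for the catch-all message.
    static final String MISCELLANEOUS = "An unexpected error occurred while running the task.";

    // Marker for exceptions that carry an error code.
    interface Coded {
        String errorCode();
    }

    // Example of a coded exception.
    static final class CodedException extends RuntimeException implements Coded {
        private final String errorCode;

        CodedException(String errorCode, String internalMessage) {
            super(internalMessage);
            this.errorCode = errorCode;
        }

        @Override
        public String errorCode() {
            return errorCode;
        }
    }

    // Returns the curated message for a known code. Anything else, including
    // coded errors without a curated message, becomes the miscellaneous message,
    // so class names and internal messages never reach the user.
    static String toUserMessage(Throwable t, Map<String, String> curatedMessages) {
        if (t instanceof Coded) {
            String curated = curatedMessages.get(((Coded) t).errorCode());
            if (curated != null) {
                return curated;
            }
        }
        return MISCELLANEOUS;
    }
}
```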
   
   > In DruidTypedException, what does "typed" mean?
   
   The `Type` refers to the fact that this exception is raised for a specific 
`ErrorType` (contains code, moduleName and message format). Sorry, I couldn't 
think of a more appropriate name for this.
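   
   To make the relationship concrete, here is one possible shape for the two 
classes. Only the names `DruidTypedException` and `ErrorType` and the three 
pieces of data (code, moduleName, message format) come from the proposal. The 
constructors, the accessor and the use of `String.format` are assumptions made 
for this sketch.

```java
// Hypothetical sketch of the proposed types; the actual design may differ.
final class ErrorType {
    final String moduleName;     // e.g. "kafka-indexer"
    final String code;           // e.g. "offset.outofrange"
    final String messageFormat;  // assumed to be a String.format pattern

    ErrorType(String moduleName, String code, String messageFormat) {
        this.moduleName = moduleName;
        this.code = code;
        this.messageFormat = messageFormat;
    }

    // Builds the message by filling the format with the given arguments.
    String formatMessage(Object... args) {
        return String.format(messageFormat, args);
    }
}

// An exception that is always raised for one specific ErrorType.
final class DruidTypedException extends RuntimeException {
    private final ErrorType errorType;

    DruidTypedException(ErrorType errorType, Object... messageArgs) {
        super(errorType.formatMessage(messageArgs));
        this.errorType = errorType;
    }

    ErrorType getErrorType() {
        return errorType;
    }
}
```

   With this shape, a caller can catch `DruidTypedException` and read the 
module name and code from `getErrorType()` without parsing the message text.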
   
   Please let me know what you think.

