kfaraz commented on issue #11165: URL: https://github.com/apache/druid/issues/11165#issuecomment-832720263
Thanks a lot for the feedback, @jihoonson! Sorry if the description was somewhat vague.

> If I understand correctly, the ultimate goal is Richer error messages detailing what went wrong and possible actions for mitigation for Easier debugging and RCA (without looking at server logs)

Yes, your understanding is correct. By `a unified error handling and reporting system`, I simply mean using the error codes and the corresponding exception wherever applicable.

> From the user perspective, richer error message for easier debugging is useful when the error is something that the user can work around by adjusting some user settings.

Agreed.

> The Druid query system already has an error-code based error reporting system (see Query exception class). In the query system, there are a few user-facing errors and the current error reporting system has been working well. I think it already provides richer error message with a potential workaround.

The proposed error code system closely resembles the existing one for query failures, which already has sufficient error codes and error messages. So, as you suggest, this proposal can focus only on ingestion task failures.

> Thinking about what the "error code" actually is, is it simply a key to choose the proper error formatter when the error is exposed to users?

Yes, the "Error Code" is primarily a key for choosing the correct error formatter when showing errors to users. It is meant to serve two more (minor) purposes:

- It is a concise way of representing specific failures in error logs.
- In a corner case where the Overlord encounters an error code that it does not understand (perhaps one arising from an extension that is loaded only on, say, middle managers), the error code might be displayed to the user as part of a generic message, e.g. `Error occurred on middle manager while running task. Error Code: kafka-indexer.offset.outofrange.`

> Specifying the severity of errors and other potential side effects

> Hiding implementation and other sensitive details from the end user

In some places, we currently expose the name of the Exception class in the error message itself. With this proposal, we can prevent such leaks, since there would be a curated message for each known code and any non-coded error could be reported as a miscellaneous error.

> In DruidTypedException, what does "typed" mean?

`Typed` refers to the fact that this exception is raised for a specific `ErrorType` (which contains a code, a moduleName and a message format). Sorry, I couldn't think of a more appropriate name for this. Please let me know what you think.
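
To make the "error code as a formatter key" idea a bit more concrete, here is a rough Java sketch of how it might look. It is only an illustration under assumptions: `ErrorCodeSketch`, `ErrorType`, `TypedException`, `ErrorTypeRegistry`, the message format and the sample arguments are all hypothetical names and values, not existing Druid classes or the proposed implementation. Only the sample error code and the wording of the generic fallback message are taken from the comment above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Druid under these names.
final class ErrorCodeSketch {
    private ErrorCodeSketch() {}

    /** An error type bundles a module name, a code and a message format. */
    static final class ErrorType {
        final String moduleName;
        final String code;
        final String messageFormat;

        ErrorType(String moduleName, String code, String messageFormat) {
            this.moduleName = moduleName;
            this.code = code;
            this.messageFormat = messageFormat;
        }
    }

    /** Exception raised for one specific ErrorType; args fill the message format. */
    static final class TypedException extends RuntimeException {
        final ErrorType type;
        final Object[] args;

        TypedException(ErrorType type, Object... args) {
            super(type.code); // logs carry only the concise code
            this.type = type;
            this.args = args.clone();
        }
    }

    /** Error types known to one service, keyed by error code. */
    static final class ErrorTypeRegistry {
        private final Map<String, ErrorType> known = new HashMap<>();

        void register(ErrorType type) {
            known.put(type.code, type);
        }

        /** Curated message for a known code, generic message otherwise. */
        String describe(String code, Object... args) {
            ErrorType type = known.get(code);
            if (type == null) {
                // e.g. the code comes from an extension this service has not loaded
                return "Error occurred on middle manager while running task. Error Code: " + code + ".";
            }
            return String.format(Locale.ROOT, type.messageFormat, args);
        }
    }

    public static void main(String[] argv) {
        ErrorType outOfRange = new ErrorType(
            "kafka-indexer",
            "kafka-indexer.offset.outofrange",
            "Kafka offset %s is out of range for partition %s."); // assumed wording

        ErrorTypeRegistry withExtension = new ErrorTypeRegistry();
        withExtension.register(outOfRange);
        ErrorTypeRegistry withoutExtension = new ErrorTypeRegistry();

        try {
            throw new TypedException(outOfRange, 42L, 3);
        } catch (TypedException e) {
            System.out.println(withExtension.describe(e.type.code, e.args));
            System.out.println(withoutExtension.describe(e.type.code, e.args));
        }
    }
}
```

In this sketch, a service that has the error type registered prints the curated message, while one that does not falls back to the generic message containing only the code. Neither path exposes the exception class name to the user.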
