jihoonson commented on issue #11165:
URL: https://github.com/apache/druid/issues/11165#issuecomment-832341499


   Hmm, it was a bit hard to understand what is the problem you want solve and 
how your proposal can address it. If I understand correctly, the ultimate goal 
is `Richer error messages detailing what went wrong and possible actions for 
mitigation` for `Easier debugging and RCA (without looking at server logs)`. 
And what you are proposing seems actually 2 things, 1) error propagation using 
the error code and 2) a unified error handling and reporting system based on 
1). The ultimate goal sounds reasonable if we are not doing it already, but 
what it's unclear to me is how the unified system you are proposing achieves 
the goal. To be more precise, the error propagation using the error code seems 
enough to me to achieve the goal. So, here is my question around the point 2). 
   
   - From the user perspective, richer error message for easier debugging is 
useful when the error is something that the user can work around by adjusting 
some user settings.
   - The Druid query system already has an error-code based error reporting 
system (see [Query exception 
class](https://github.com/apache/druid/blob/master/core/src/main/java/org/apache/druid/query/QueryException.java)).
 In the query system, there are a few [user-facing 
errors](http://druid.apache.org/docs/latest/querying/querying.html#query-errors)
 and the current error reporting system has been working well. I _think_ it 
already provides richer error message with a potential workaround. 
   - If we didn't have this query error reporting system, it would be worth 
considering having a unified system that can be used everywhere. However, given 
that we already have such a system for querying, what is the benefit of having 
a unified system?
     - If you are thinking of the registrable errors in custom extensions, I 
would say we don't probably need it for query system. You can currently develop 
[custom aggregators, complexMetrics, or query 
types](http://druid.apache.org/docs/latest/development/modules.html#adding-aggregators).
 Custom aggregators and complexMetrics will mostly need to handle 
not-enough-resources style errors which are covered by the error types 
currently we have. If you develop a new query type, it will require you to 
develop a new query engine as well to process your custom query type. You can 
freely add your custom error type in this case.
   
   Based on this, I would suggest focusing on the ingestion side in this 
proposal. What do you think?
   
   Besides the comment on introducing a unified system, here are my comments on 
other parts.
   
   > - Specifying the severity of errors and other potential side effects
   > - Hiding implementation and other sensitive details from the end user
   
   1. Can you elaborate more on these? What is the problem of the current 
status you think? And how does the proposal address the problem?
   
   2. How would the proposed error-code-based system integrate with the current 
ingestion system? Looking at the code snippets, it seems like we have to modify 
all codes where throw exceptions to convert them to `DruidTypedException`? If 
this is what you are proposing, it seems pretty error-prone because it's easy 
to forget catching and converting exceptions especially when they are unchecked 
exceptions. Also, please add a code snippet of `TaskStatus` after this change. 
It will help to understand the proposal.
   
   > Reference containing rich documentation for each error code, thus greatly 
simplifying debugging
   
   3. This is one of the advantages you listed. However, I'm not quite sure how 
it can simplify debugging because the error code will not be exposed to users? 
Can you add some more details on this?
   
   4. Thinking about what the "error code" actually is, is it simply a key to 
choose the proper error formatter when the error is exposed to users?
   
   5. In `DruidTypedException`, what does "typed" mean? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to