paul-rogers commented on issue #13123:
URL: https://github.com/apache/druid/issues/13123#issuecomment-1261666459

   @gianm's example points out one more item about errors: some of the 
information the user wants is not available at the point that the error is 
thrown. In Gian's example, the query ID and task ID are not likely known to the 
thing that is computing row layout and hits the column limit. The MSQ (really, 
an Overlord task) has a standard way to handle such things at the "top" of a 
task: if stuff fails, create a report with the overall context.
   
   Still, there are things in the "middle" that are known neither to the top 
level nor at the site of the error. For example, consider a bad record. We'd 
like to know the row that failed (which we probably have) and the query (which 
MSQ can provide). But we'd also like to know which file was being ingested, 
among the 50 for this job. How do we get this mid-level context?
   
   In Drill, we sometimes passed the information down to the error site via an 
"error context." That works, but is rather unsatisfying: it is extra work 
needed just because Drill's `UserError` is immutable, and logged at the point 
it is created. We can learn from that and try a different approach: allow the 
`DruidException` to be extended as it bubbles back up. For our bad-record 
example, making up a simple file parser:
   
   ```java
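     try (Handle handle = openFile(fileName)) {
       while (handle.hasMore()) {
         try {
           parseRow(handle.getLine());
         } catch (DruidException e) {
           // Catch inside the try-with-resources body: `handle` is out of
           // scope (and already closed) in a catch on the outer try.
           // withContext() is the proposed method, not an existing API.
           throw e.withContext(
               "file name", fileName,
               "line number", handle.getLineNumber());
         }
       }
     }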
   ```
   
   Now the user gets the specific error ("cannot parse JSON record"), the 
high-level context (query id) and the mid-level context (the file and line 
number).
   
   
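   To make this work, we'd log the error at the top of the stack, before it is 
forwarded to the user (via some API, or in an error report). Logging any 
earlier, at the point the exception is created, would miss the context added 
on the way up.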
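
   As a rough sketch of the whole flow, here is one way an extensible 
exception could look. Everything in it is made up for illustration: 
`SketchDruidException` stands in for the proposed `DruidException`, and the 
`withContext` varargs signature, the file name, and the query ID are 
assumptions, not existing Druid code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the proposed DruidException; not the real class.
class SketchDruidException extends RuntimeException {
  // Insertion-ordered, so context reads innermost-first in the report.
  private final Map<String, Object> context = new LinkedHashMap<>();

  SketchDruidException(String message) {
    super(message);
  }

  // Adds key/value pairs and returns this, so callers can rethrow directly.
  SketchDruidException withContext(Object... keyValues) {
    if (keyValues.length % 2 != 0) {
      throw new IllegalArgumentException("expected key/value pairs");
    }
    for (int i = 0; i < keyValues.length; i += 2) {
      // Keep the value set closest to the error site if a key repeats.
      context.putIfAbsent(String.valueOf(keyValues[i]), keyValues[i + 1]);
    }
    return this;
  }

  Map<String, Object> getContext() {
    return Collections.unmodifiableMap(context);
  }
}

class ErrorContextDemo {
  // Error site: knows only that the record is bad.
  static void parseRow(String line) {
    if (!line.startsWith("{")) {
      throw new SketchDruidException("cannot parse JSON record");
    }
  }

  // Middle layer: knows the file and line, adds them as the error passes by.
  static void readFile(String fileName, List<String> lines) {
    int lineNumber = 0;
    for (String line : lines) {
      lineNumber++;
      try {
        parseRow(line);
      } catch (SketchDruidException e) {
        throw e.withContext("file name", fileName, "line number", lineNumber);
      }
    }
  }

  // Top of the stack: adds the query ID, then builds the text that would be
  // logged and forwarded to the user.
  static String runQuery(String queryId) {
    try {
      readFile("events-07.json", List.of("{\"a\": 1}", "oops"));
      return "ok";
    } catch (SketchDruidException e) {
      e.withContext("query id", queryId);
      return e.getMessage() + " " + e.getContext();
    }
  }

  public static void main(String[] args) {
    System.out.println(runQuery("query-123"));
  }
}
```

   The sketch mutates the exception in place rather than wrapping it, so the 
original stack trace survives and each layer only adds what it alone knows.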
   
     


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

