paul-rogers commented on issue #13123:
URL: https://github.com/apache/druid/issues/13123#issuecomment-1261666459
@gianm's example points out one more item about errors: some of the
information the user wants is not available at the point that the error is
thrown. In Gian's example, the query ID and task ID are not likely known to the
thing that is computing row layout and hits the column limit. The MSQ (really,
an Overlord task) has a standard way to handle such things at the "top" of a
task: if stuff fails, create a report with the overall context.
Still, there are things in the "middle" known neither to the top level nor
at the site of the error. For example, consider a bad record. We'd like
to know the row that failed (which we probably have) and the query (which MSQ
can provide.) But, we'd also like to know the file being ingested, among the 50
for this job. How do we get this mid-level context?
In Drill, we sometimes passed the information down to the error site via an
"error context." That works, but is rather unsatisfying: it is extra work
needed just because Drill's `UserError` is immutable, and logged at the point
it is created. We can learn from that and try a different approach: allow the
`DruidException` to be extended as it bubbles back up. For our bad-record
example, making up a simple file parser:
```java
try (Handle handle = openFile(fileName)) {
  try {
    while (handle.hasMore()) {
      parseRow(handle.getLine());
    }
  } catch (DruidException e) {
    // Catch inside the try-with-resources body so handle is still in scope.
    throw e.withContext("file name", fileName, "line number",
        handle.getLineNumber());
  }
}
```
Now the user gets the specific error ("cannot parse JSON record"), the
high-level context (query id) and the mid-level context (the file and line
number).
To make this work, we'd log the error at the top of the stack, before it is
forwarded to the user (via some API, or in an error report.)
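
To make the flow concrete, here is a minimal, self-contained sketch of the idea. Everything in it is made up for illustration: `ContextualException` is a hypothetical stand-in for the proposed `DruidException`, and `IngestDemo`, `parseRow`, `parseFile` and `runQuery` are invented names, not Druid or MSQ code. It only shows a mutable exception picking up context at each layer and being reported once at the top.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed DruidException; not Druid's actual API.
class ContextualException extends RuntimeException {
  // Insertion-ordered, so context reads from the error site outward.
  private final Map<String, Object> context = new LinkedHashMap<>();

  ContextualException(String message) {
    super(message);
  }

  // Takes alternating key/value arguments and returns this, so callers can
  // write "throw e.withContext(...)" as the exception bubbles up.
  ContextualException withContext(Object... keyValues) {
    if (keyValues.length % 2 != 0) {
      throw new IllegalArgumentException("Expected key/value pairs");
    }
    for (int i = 0; i < keyValues.length; i += 2) {
      context.put(String.valueOf(keyValues[i]), keyValues[i + 1]);
    }
    return this;
  }

  Map<String, Object> getContext() {
    return context;
  }
}

class IngestDemo {
  // Error site: knows only the specific failure.
  static void parseRow(String line) {
    if (!line.startsWith("{")) {
      throw new ContextualException("cannot parse JSON record");
    }
  }

  // Middle layer: knows the file and line number, and adds them on the way up.
  static void parseFile(String fileName, String[] lines) {
    for (int i = 0; i < lines.length; i++) {
      try {
        parseRow(lines[i]);
      } catch (ContextualException e) {
        throw e.withContext("file name", fileName, "line number", i + 1);
      }
    }
  }

  // Top level: adds the query ID, then reports once with the full context.
  // A real task would log here and build an error report instead of a string.
  static String runQuery(String queryId, String fileName, String[] lines) {
    try {
      parseFile(fileName, lines);
      return "OK";
    } catch (ContextualException e) {
      e.withContext("query id", queryId);
      return e.getMessage() + " " + e.getContext();
    }
  }
}
```

With this sketch, `IngestDemo.runQuery("q-1", "data.json", new String[] {"{}", "oops"})` returns `cannot parse JSON record {file name=data.json, line number=2, query id=q-1}`: the specific error, then the mid-level and high-level context in the order they were added.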