GitHub user paul-rogers commented on the issue: https://github.com/apache/drill/pull/518

As it turns out, the sample code shown was actually tested with a stock Jackson JSON parser: it does work. No parser changes are needed.

The issue is not whether we can make the parser do what is needed: the code posted in the comment above demonstrated a solution. The issue is how we incorporate that code into the JSON parser to clean up partial records and prevent schema changes. When I have time, I'll investigate that question in greater depth.

IMHO, without a proper fix, we should simply state that Drill does not support malformed JSON. If an input file might be incorrect, run it through a clean-up step before allowing Drill to scan it. Otherwise, we are opening the door to many hard-to-resolve bugs when people ask Drill to scan corrupt JSON: the result, without a proper fix, would be undefined -- which is worse than the current behavior, which simply fails the scan with an error.

Let's follow up again after I (or someone else) have had a chance to figure out whether we can undo a partially built record. If we can do that, then we've got a path to a clean solution: recover the parser (as shown earlier) and discard the in-flight record (which we still need to research).
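To illustrate the kind of clean-up step suggested above, here is a minimal sketch in Python. It is not Drill code and not the parser-recovery code referred to in this thread. It assumes the input holds one JSON object per line, which may not match a given file, and the names `clean_json_lines` and `clean_file` are hypothetical. Malformed or truncated records are dropped before the file is handed to Drill.

```python
import json
from typing import Iterable, Iterator


def clean_json_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that parse as a complete JSON object.

    Assumption: the input is newline-delimited JSON (one record per line).
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            continue  # malformed or truncated record: drop it
        if isinstance(record, dict):
            # Re-serialize so every emitted line is a well-formed object.
            yield json.dumps(record)


def clean_file(src_path: str, dst_path: str) -> int:
    """Copy the well-formed records from src_path to dst_path; return the count kept."""
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for clean in clean_json_lines(src):
            dst.write(clean + "\n")
            kept += 1
    return kept
```

A pre-pass like this silently discards bad records, so it is only appropriate when losing them is acceptable; logging or counting the dropped lines would be a natural extension.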