[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15505198#comment-15505198
 ] 

ASF GitHub Bot commented on DRILL-4653:
---------------------------------------

Github user ssriniva123 commented on the issue:

    https://github.com/apache/drill/pull/518
  
    Apologies for getting back to this thread late; I got tied up with some 
issues at work.
    
    Paul,
    The JSON parser is not just a tokenizer. It keeps track of the JSON 
structure and understands aspects of it such as the root and array/object 
contexts, and all parsing is done under that context.
    
    - We cannot keep track of {} accurately. For example, the counting JSON 
processor calls parser.skipChildren(), which tries to skip to the end of the 
JSON object, but this can roll over to the next line when there is malformed 
JSON in the bottom-most JSON sub-object. See the example below (missing " in 
the last JSON structure). JsonReader behaves the same way.
    
    {"balance": 1000.0,"num": 100,"is_vip": true,"name": 
"foo3","curr":{"denom":"pound","test":{"value  :false}}}
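
To illustrate why a missing quote defeats brace tracking, here is a minimal sketch in plain Java. It is a hand-rolled scanner, not Jackson's parser or Drill's counting processor; the class name SkipRollover, the method skipObject, and the second ("good") record are hypothetical and exist only for this illustration. The scanner ignores braces inside string literals, as any JSON-aware skipper must, so an unterminated string swallows the closing braces and the newline that follows them.

```java
final class SkipRollover {
    /**
     * Scans from 'start' and returns the index just past the '}' that closes
     * the object beginning there. If the object is never closed, returns
     * input.length(), meaning the scan consumed everything that was left.
     * Braces that appear inside string literals are not counted.
     */
    static int skipObject(String input, int start) {
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                 // skip the escaped character
                } else if (c == '"') {
                    inString = false;    // closing quote
                }
            } else if (c == '"') {
                inString = true;         // opening quote
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i + 1;            // object closed
            }
        }
        return input.length();           // ran out of input
    }

    public static void main(String[] args) {
        // The malformed record from this comment (missing quote after "value).
        String bad = "{\"balance\": 1000.0,\"num\": 100,\"is_vip\": true,"
                + "\"name\": \"foo3\",\"curr\":{\"denom\":\"pound\","
                + "\"test\":{\"value  :false}}}";
        // A hypothetical well-formed record on the next line.
        String good = "{\"balance\": 2000.0,\"name\": \"foo4\"}";
        String input = bad + "\n" + good + "\n";

        int newline = input.indexOf('\n');
        int stopped = skipObject(input, 0);
        System.out.println("first record ends at index " + newline);
        System.out.println("scan stopped at index " + stopped);
        System.out.println("rolled into next line: " + (stopped > newline));
    }
}
```

With the malformed record first, the scan runs past the first newline, so the well-formed record after it is consumed as well. A real parser may throw instead of scanning silently, but by then its read position has already moved past the line boundary.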
    
    - One possible solution is to rewind the input source to reset the stream, 
which is not recommended, and there is no guarantee that all streams support 
mark/reset semantics.
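
The mark/reset caveat can be checked with the standard java.io API. The sketch below is only an illustration: the class MarkResetDemo and the helpers unmarkable and readThenRewind are hypothetical names, and the anonymous stream merely stands in for a source that lacks mark/reset support. It does not reflect how Drill opens its input.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MarkResetDemo {
    /** Hypothetical stand-in for a stream that does not support mark/reset. */
    static InputStream unmarkable(byte[] data) {
        ByteArrayInputStream source = new ByteArrayInputStream(data);
        // A bare InputStream subclass inherits markSupported() == false
        // and a reset() that throws IOException.
        return new InputStream() {
            @Override
            public int read() {
                return source.read();
            }
        };
    }

    /** Marks, reads n bytes, rewinds, and returns the first byte after the rewind. */
    static int readThenRewind(InputStream in, int n) throws IOException {
        in.mark(n);                      // remember the current position
        for (int i = 0; i < n; i++) {
            in.read();
        }
        in.reset();                      // throws IOException if unsupported
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        byte[] record = "{\"num\": 100}\n".getBytes(StandardCharsets.UTF_8);

        InputStream raw = unmarkable(record);
        System.out.println("raw markSupported: " + raw.markSupported());
        try {
            readThenRewind(raw, 5);
        } catch (IOException e) {
            System.out.println("raw rewind failed: " + e.getMessage());
        }

        // Wrapping adds mark/reset, at the cost of buffering up to the read limit.
        InputStream buffered = new BufferedInputStream(unmarkable(record));
        System.out.println("buffered markSupported: " + buffered.markSupported());
        System.out.println("first byte after rewind: " + (char) readThenRewind(buffered, 5));
    }
}
```

Wrapping a stream in BufferedInputStream makes a rewind possible, but only within the read limit passed to mark(), so a record longer than that limit could still not be replayed.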
    
    Given where we are, I think the proposed solution works well for almost 
all malformed JSON.
    
    



> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
>
>
> Currently a Drill query terminates upon first encountering an invalid JSON 
> line. Drill should continue progressing after ignoring the bad records. 
> Something similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
