[ 
https://issues.apache.org/jira/browse/DRILL-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305703#comment-16305703
 ] 

Paul Rogers commented on DRILL-6035:
------------------------------------

h4. Conclusion

The net-net conclusion from all of the above is:

* A large amount of work would be needed to provide solid JSON support in Drill.
* That work may not be justified given that Parquet resolves the ambiguities 
and provides better performance.

The take-away for Drill users is simple:

* If JSON is used with Drill, it must be very simple and follow Drill's JSON 
format rules as explained above.
* Use a purpose-built ETL tool to convert JSON to Parquet and point Drill at 
the Parquet file instead of JSON.

From a work perspective, it may be far faster, cheaper and more effective to 
back off Drill's lavish claims for JSON than to do the work needed to achieve 
those promises.
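
The first take-away above can be approximated with a pre-flight check before pointing Drill at a JSON file. The sketch below is illustrative only: it assumes "very simple" means that every record is an object with the same fields, each holding a single non-null type. That is this example's reading of the topics listed in the ticket (varying data types, missing values), not Drill's documented rules. The names `json_type` and `find_schema_problems` are hypothetical, and the input is assumed to hold one JSON object per line.

```python
import json


def json_type(value):
    """Return a coarse type name for a decoded JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def find_schema_problems(lines):
    """Report top-level fields that are missing, null, or change type.

    `lines` is an iterable of strings, one JSON object per line.
    The notion of a "problem" here is an assumption of this sketch.
    """
    seen = {}    # field name -> set of type names observed
    counts = {}  # field name -> number of records containing the field
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError("each line must be a JSON object")
        total += 1
        for key, value in record.items():
            seen.setdefault(key, set()).add(json_type(value))
            counts[key] = counts.get(key, 0) + 1

    problems = []
    for key in sorted(seen):
        if counts[key] < total:
            missing = total - counts[key]
            problems.append(f"{key}: missing in {missing} of {total} records")
        if "null" in seen[key]:
            problems.append(f"{key}: contains null")
        non_null = sorted(seen[key] - {"null"})
        if len(non_null) > 1:
            problems.append(f"{key}: mixed types {', '.join(non_null)}")
    return problems
```

An empty result suggests the file is a reasonable candidate for querying as JSON. Any reported problem is a hint that the second take-away applies and the data is better converted to Parquet first. The check only looks at top-level fields, so nested structures would need a recursive variant.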

> Specify Drill's JSON behavior
> -----------------------------
>
>                 Key: DRILL-6035
>                 URL: https://issues.apache.org/jira/browse/DRILL-6035
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Pritesh Maker
>
> Drill supports JSON as its native data format. However, experience suggests 
> that Drill may be limited in the kinds of JSON it can handle. This ticket 
> asks to clarify Drill's expected behavior on various kinds of JSON.
> Topics to be addressed:
> * Relational vs. non-relational structures
> * JSON structures used in practice and how they map to Drill
> * Support for varying data types
> * Support for missing values, especially across files
> These topics are complex, hence the request to provide a detailed 
> specification that clarifies what Drill does and does not support (or what 
> it should and should not support).
> As noted below, the "net-net" conclusion for users is to use an ETL tool to 
> convert JSON to Parquet, then allow Drill to query the Parquet.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
