[ 
https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939640#comment-14939640
 ] 

Steven Phillips commented on DRILL-3229:
----------------------------------------

i) In this first iteration, Union types will be enabled through an option. When 
the option is enabled, the JSON reader and the Mongo reader will create them 
automatically, and every field will be a Union type. A future patch will work 
on promoting from a non-union type only when promotion becomes necessary.
ii) Your understanding is correct, with one change from the earlier comment: 
there is no separate "bits" vector. The underlying primitive type vectors will 
each have their own "bits" for tracking nulls. A value of zero in the type 
vector will also indicate null.
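
To make the layout in (ii) concrete, here is a minimal Java sketch. It is not 
the code in the patch: the class name, the type codes other than zero, and the 
method names are all assumptions made for illustration. It only shows the idea 
of a per-row type vector in which zero means null, sitting alongside primitive 
vectors that each keep their own null-tracking bits.

```java
import java.util.BitSet;

/** Illustrative only; names and type codes are hypothetical, not Drill's API. */
final class UnionVectorSketch {
    static final byte TYPE_NULL = 0;     // zero in the type vector means null
    static final byte TYPE_BIGINT = 1;   // assumed code, chosen for this sketch
    static final byte TYPE_VARCHAR = 2;  // assumed code, chosen for this sketch

    private final byte[] types;           // the type vector: one code per row
    private final long[] bigIntValues;    // primitive vector for BIGINT values
    private final BitSet bigIntBits;      // that vector's own null-tracking bits
    private final String[] varCharValues; // primitive vector for VARCHAR values
    private final BitSet varCharBits;     // that vector's own null-tracking bits

    UnionVectorSketch(int capacity) {
        types = new byte[capacity];       // all zero, so every row starts null
        bigIntValues = new long[capacity];
        bigIntBits = new BitSet(capacity);
        varCharValues = new String[capacity];
        varCharBits = new BitSet(capacity);
    }

    void setBigInt(int index, long value) {
        setNull(index);                   // drop whatever the row held before
        types[index] = TYPE_BIGINT;
        bigIntValues[index] = value;
        bigIntBits.set(index);
    }

    void setVarChar(int index, String value) {
        setNull(index);
        types[index] = TYPE_VARCHAR;
        varCharValues[index] = value;
        varCharBits.set(index);
    }

    void setNull(int index) {
        types[index] = TYPE_NULL;
        bigIntBits.clear(index);
        varCharBits.clear(index);
        varCharValues[index] = null;
    }

    byte typeOf(int index) {
        return types[index];
    }

    boolean isNull(int index) {
        return types[index] == TYPE_NULL;
    }

    /** Returns the boxed value for the row, or null when the row is null. */
    Object get(int index) {
        switch (types[index]) {
            case TYPE_BIGINT:
                return bigIntBits.get(index) ? Long.valueOf(bigIntValues[index]) : null;
            case TYPE_VARCHAR:
                return varCharBits.get(index) ? varCharValues[index] : null;
            default:
                return null;
        }
    }
}
```

In this sketch a reader would call setBigInt or setVarChar per row depending on 
the JSON token it sees, so a single column can hold mixed types without a 
separate top-level "bits" vector.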

Without going into much detail at this point, I can answer the next paragraph 
of questions by saying that this patch will allow reading any valid JSON. It 
also represents the JSON more literally, e.g. null values will be treated as 
null instead of as empty maps/lists. The patch also includes functions for 
inspecting the type of a field, which can be used with case statements to 
handle the data according to its type. Though it may be somewhat cumbersome, 
with these tools you should be able to run almost any query against dynamic 
JSON data. This will generally involve using introspection and case statements 
to remove the Union types early in the query. Future work will eliminate the 
need for this in many cases. One notable exception is that flatten is not 
supported in this initial patch.
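
As a rough illustration of the "inspect the type, then branch" pattern, the 
sketch below resolves a mixed-type column to a single type in plain Java. It is 
an analogy only: it does not use the functions added by the patch or any Drill 
API, and the class and method names are hypothetical. The policy of turning 
unconvertible values into null is likewise an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; mimics a CASE over the runtime type of each value. */
final class UnionResolveSketch {

    /** Collapses one dynamically typed value to a Long, or null if it cannot be converted. */
    static Long toBigInt(Object value) {
        if (value == null) {
            return null;                          // JSON null stays null
        }
        if (value instanceof Long) {
            return (Long) value;                  // already the target type
        }
        if (value instanceof Integer) {
            return ((Integer) value).longValue(); // widen smaller integers
        }
        if (value instanceof String) {
            try {
                return Long.parseLong(((String) value).trim());
            } catch (NumberFormatException e) {
                return null;                      // non-numeric text: assumed policy is null
            }
        }
        return null;                              // maps, lists, booleans, and so on
    }

    /** Applies the per-value resolution to a whole column. */
    static List<Long> resolveColumn(List<Object> column) {
        List<Long> resolved = new ArrayList<>(column.size());
        for (Object value : column) {
            resolved.add(toBigInt(value));
        }
        return resolved;
    }
}
```

Doing this kind of resolution early means everything downstream sees one 
concrete type rather than a Union.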

> Create a new EmbeddedVector
> ---------------------------
>
>                 Key: DRILL-3229
>                 URL: https://issues.apache.org/jira/browse/DRILL-3229
>             Project: Apache Drill
>          Issue Type: Sub-task
>          Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>            Reporter: Jacques Nadeau
>            Assignee: Steven Phillips
>             Fix For: Future
>
>
> Embedded Vector will leverage a binary encoding for holding information about 
> type for each individual field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)