Paul Rogers created DRILL-5283:
----------------------------------

             Summary: Support "is not present" as subtype of "is null" for JSON 
data
                 Key: DRILL-5283
                 URL: https://issues.apache.org/jira/browse/DRILL-5283
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.10.0
            Reporter: Paul Rogers


JSON files consist of a series of "objects", each of which has name/value 
pairs. Values can be in one of three states:

* Not present (the value does not appear)
* Null (the name appears and the value is null)
* Non-null (the field is one of the JSON data types)
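As an illustration only, the three states can be told apart when a JSON object is parsed with Python's standard `json` module. The helper name `field_state` and the sample records are made up for this sketch and are not part of Drill.

```python
import json

def field_state(obj, name):
    """Classify one field of a parsed JSON object into the three states."""
    if name not in obj:
        return "not present"   # the name does not appear at all
    if obj[name] is None:
        return "null"          # the name appears with a JSON null
    return "non-null"          # the name appears with a real value

# Hypothetical sample input: one JSON object per line.
lines = ['{"a": 1}', '{"a": 1, "b": null}', '{"a": 1, "b": "x"}']
states = [field_state(json.loads(line), "b") for line in lines]
```

Here `states` holds one classification of field `b` per input object, in input order.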

Drill, however, has only a single null state and so Drill collapses "not 
present" and "null" into the same state.

The not-present and present-but-null states work identically for calculations 
inside Drill. But when doing a CTAS from JSON to JSON, the collapsed state 
means that the user does not get out of Drill what was put in: every null or 
missing value is either written as an explicit null or omitted entirely 
(depending on the Drill version).
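The loss can be shown with a toy model of the round trip. This is a sketch under assumptions, not Drill's reader or writer: `read_collapsed`, `write`, and the `emit_nulls` switch are hypothetical names standing in for the two output behaviors described above.

```python
import json

columns = ["a", "b"]
rows_in = ['{"a": 1}', '{"a": 1, "b": null}']

def read_collapsed(line):
    """Read one object; a missing field and a null field both become None."""
    obj = json.loads(line)
    return {c: obj.get(c) for c in columns}

def write(row, emit_nulls):
    """Write one row, either emitting every null or omitting every null."""
    out = {k: v for k, v in row.items() if emit_nulls or v is not None}
    return json.dumps(out)

table = [read_collapsed(line) for line in rows_in]
as_nulls = [write(row, emit_nulls=True) for row in table]
as_omitted = [write(row, emit_nulls=False) for row in table]
```

With a single null state, neither writer setting reproduces `rows_in`: one turns the missing field into an explicit null, the other drops the explicit null.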

This ticket asks to repurpose the "bit" fields in nullable vectors. Rename the 
vector to "nullState". Then, use these values:

* 0: value is set
* 1: value is null
* 3: value is not present

The column is null if the null state is non-zero. The column is not null if the 
null state is 0.
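A minimal sketch of how such a null state could drive a JSON round trip follows. The numeric values are taken from the list above as written; the constant names, `read_field`, and `write_row` are assumptions made for illustration and do not correspond to Drill's vector classes.

```python
import json

VALUE_SET = 0          # value is set
VALUE_NULL = 1         # value is null
VALUE_NOT_PRESENT = 3  # value is not present (number as given in the ticket)

def is_null(null_state):
    """The column is null whenever the null state is non-zero."""
    return null_state != VALUE_SET

def read_field(obj, name):
    """Return (null_state, value) for one field of a parsed JSON object."""
    if name not in obj:
        return VALUE_NOT_PRESENT, None
    if obj[name] is None:
        return VALUE_NULL, None
    return VALUE_SET, obj[name]

def write_row(names, cells):
    """Serialize one row, skipping fields whose state is 'not present'."""
    out = {}
    for name, (state, value) in zip(names, cells):
        if state == VALUE_NOT_PRESENT:
            continue
        out[name] = None if state == VALUE_NULL else value
    return json.dumps(out)

names = ["a", "b"]
rows_in = ['{"a": 1}', '{"a": 1, "b": null}', '{"a": 1, "b": 2}']
rows_out = []
for line in rows_in:
    obj = json.loads(line)
    cells = [read_field(obj, n) for n in names]
    rows_out.append(write_row(names, cells))
```

Calculations only need `is_null`, which treats both non-zero states alike, while the writer consults the full state to decide whether to emit the field.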

This change requires reversing the "polarity" of the bit field, and so is a 
major change.
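To make the polarity point concrete, here is a hedged sketch of converting existing flags to the proposed encoding. It assumes the current convention is 1 for a set value and 0 for a null; `bits_to_null_state` is a hypothetical helper, not an existing Drill function.

```python
VALUE_SET = 0   # proposed: value is set
VALUE_NULL = 1  # proposed: value is null

def bits_to_null_state(bits):
    """Flip assumed old-style flags (1 = set, 0 = null) to the new encoding."""
    return [VALUE_SET if bit == 1 else VALUE_NULL for bit in bits]

old_bits = [1, 0, 1, 1]
null_states = bits_to_null_state(old_bits)
```

Existing data can only map to "set" or "null", since the old flags never recorded whether a field was absent.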



