Paul Rogers created DRILL-6048:
----------------------------------

             Summary: ListVector is incomplete and broken, RepeatedListVector works
                 Key: DRILL-6048
                 URL: https://issues.apache.org/jira/browse/DRILL-6048
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.10.0
            Reporter: Paul Rogers


Drill provides two kinds of "list vectors": {{ListVector}} and 
{{RepeatedListVector}}. I attempted to use {{ListVector}} to implement 
lists in JSON. While some parts work, others are broken, and JIRA tickets were 
filed for them.

Once things worked well enough to run a query, it turned out that the Project 
operator failed. Digging into the cause, it appears that {{ListVector}} is 
incomplete and not used. Its implementation of {{makeTransferPair()}} was 
clearly never tested. A list has contents, but when this method creates the 
target vector, it fails to create the target's list contents.

Elsewhere, we saw that the constructor did not correctly create the vector, and 
that {{promoteToUnion()}} had holes. The sheer number of bugs leads to the 
conclusion that this class is not, in fact, used or usable.

Looking more carefully at the JSON and older writer code, it appears that 
{{ListVector}} was *not* used for JSON, and that JSON has the limitations of a 
repeated vector (it cannot support lists with null elements).
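
To make the limitation concrete, here is a minimal sketch in plain Java. It is 
a toy model, not Drill's vector classes: the names {{ToyListEncoding}}, 
{{Repeated}}, {{NullableList}}, {{encodeRepeated}}, {{encodeList}} and 
{{readRow}} are all made up for illustration. It assumes a "repeated" layout is 
an offsets array over a dense values array, and that a list layout with null 
support adds one is-set flag per element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these are NOT Drill's vector classes.
final class ToyListEncoding {

  // "Repeated" layout: offsets plus dense values. Row i spans
  // values[offsets[i] .. offsets[i + 1]). No place to mark an element as null.
  static final class Repeated {
    final int[] offsets;
    final int[] values;
    Repeated(int[] offsets, int[] values) {
      this.offsets = offsets;
      this.values = values;
    }
  }

  // List layout with null support: the same two arrays plus a per-element flag.
  static final class NullableList {
    final int[] offsets;
    final int[] values;
    final boolean[] isSet;
    NullableList(int[] offsets, int[] values, boolean[] isSet) {
      this.offsets = offsets;
      this.values = values;
      this.isSet = isSet;
    }
  }

  // Rejects null elements because the layout cannot represent them.
  static Repeated encodeRepeated(List<List<Integer>> rows) {
    int[] offsets = new int[rows.size() + 1];
    List<Integer> flat = new ArrayList<>();
    for (int i = 0; i < rows.size(); i++) {
      for (Integer v : rows.get(i)) {
        if (v == null) {
          throw new IllegalArgumentException("repeated layout cannot store a null element");
        }
        flat.add(v);
      }
      offsets[i + 1] = flat.size();
    }
    int[] values = new int[flat.size()];
    for (int i = 0; i < values.length; i++) {
      values[i] = flat.get(i);
    }
    return new Repeated(offsets, values);
  }

  // Accepts null elements by leaving the slot's flag unset.
  static NullableList encodeList(List<List<Integer>> rows) {
    int total = 0;
    for (List<Integer> row : rows) {
      total += row.size();
    }
    int[] offsets = new int[rows.size() + 1];
    int[] values = new int[total];
    boolean[] isSet = new boolean[total];
    int pos = 0;
    for (int i = 0; i < rows.size(); i++) {
      for (Integer v : rows.get(i)) {
        if (v != null) {
          values[pos] = v;
          isSet[pos] = true;
        }
        pos++;
      }
      offsets[i + 1] = pos;
    }
    return new NullableList(offsets, values, isSet);
  }

  // Reads one row back, turning unset slots into nulls.
  static List<Integer> readRow(NullableList v, int row) {
    List<Integer> out = new ArrayList<>();
    for (int i = v.offsets[row]; i < v.offsets[row + 1]; i++) {
      out.add(v.isSet[i] ? Integer.valueOf(v.values[i]) : null);
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<Integer>> rows = Arrays.asList(Arrays.asList(1, null, 3), Arrays.asList(4));
    System.out.println(readRow(encodeList(rows), 0));
  }
}
```

In this model, the JSON array {{[1, null, 3]}} round-trips through 
{{encodeList}} but is rejected by {{encodeRepeated}}, which is the gap 
described above.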

This implies that the JSON reader itself is broken: it does not fully support 
JSON semantics because it does not use the {{ListVector}} that was intended for 
this purpose.

So, the conclusion is that JSON uses:

* Repeated vectors for single-dimensional arrays (without null support)
* {{RepeatedListVector}} for two-dimensional arrays

This raises the question: what do we do for three-dimensional arrays?
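
As a way to frame that question, the sketch below shows that, in the abstract, 
an N-dimensional array needs one offsets level per dimension stacked over a 
single leaf array. This is a toy model in plain Java, not a description of how 
{{RepeatedListVector}} behaves or whether it can be nested; the class 
{{ToyNestedOffsets}} and its members are hypothetical names, and non-null 
integer leaves are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Drill code. One offsets list per array dimension.
final class ToyNestedOffsets {
  // levels.get(0) is the outermost dimension; the last level indexes the leaves.
  final List<List<Integer>> levels = new ArrayList<>();
  final List<Integer> leaves = new ArrayList<>();

  // Encodes rows where every row is an array nested exactly `depth` deep.
  static ToyNestedOffsets encode(List<?> rows, int depth) {
    ToyNestedOffsets enc = new ToyNestedOffsets();
    for (int d = 0; d < depth; d++) {
      List<Integer> offsets = new ArrayList<>();
      offsets.add(0);  // every offsets list starts at zero
      enc.levels.add(offsets);
    }
    for (Object row : rows) {
      enc.write(row, 0, depth);
    }
    return enc;
  }

  private void write(Object node, int level, int depth) {
    if (level == depth) {
      leaves.add((Integer) node);  // assumed: leaves are non-null integers
      return;
    }
    for (Object child : (List<?>) node) {
      write(child, level + 1, depth);
    }
    // The end offset is the count of entries written so far one level down.
    int end = (level + 1 == depth)
        ? leaves.size()
        : levels.get(level + 1).size() - 1;
    levels.get(level).add(end);
  }

  public static void main(String[] args) {
    // One row holding the three-dimensional array [[[1], [2, 3]], [[4]]].
    List<?> row = Arrays.asList(
        Arrays.asList(Arrays.asList(1), Arrays.asList(2, 3)),
        Arrays.asList(Arrays.asList(4)));
    List<Object> rows = new ArrayList<>();
    rows.add(row);
    ToyNestedOffsets enc = encode(rows, 3);
    System.out.println(enc.levels + " " + enc.leaves);
  }
}
```

With {{depth}} set to 1 this reduces to the repeated-vector case, and with 2 to 
the list-of-repeated case; a third dimension only adds another offsets level in 
this model.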



