[jira] [Commented] (DRILL-1843) Support per-batch schema change at RecordBatchLoader

Hanifi Gunes (JIRA) Thu, 11 Dec 2014 14:37:39 -0800

    [ 
https://issues.apache.org/jira/browse/DRILL-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243278#comment-14243278
 ]


Hanifi Gunes commented on DRILL-1843:
-------------------------------------

Assume a field X is all null in batch 1 and later takes a non-null, concrete 
value whose type is not nullable int. That is, the first vector the Drillbit 
sends for X is a NullableIntVector, but the second one is not.

RecordBatchLoader instantiates a NullableIntVector upon receiving the first 
batch and memoizes it. For subsequent batches, it only checks whether a vector 
already exists for field name X; it does not check the type the incoming batch 
expects. It therefore tries to load non-NullableIntVector data into the cached 
NullableIntVector, which fails as expected.

The fix is to verify that the type of the cached vector matches the type the 
incoming batch expects. If it does not, discard the existing vector, 
instantiate a vector of the expected type, and load the data into that vector.
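
A minimal sketch of that check, for illustration only. It does not use Drill's 
actual classes: LoaderSketch, SketchVector, and the string type tags are 
hypothetical stand-ins, and the real RecordBatchLoader code may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a loader that caches one vector per field name.
class LoaderSketch {

    // Hypothetical stand-in for a value vector; only the type tag matters here.
    static final class SketchVector {
        final String field;
        final String type;
        Object data;

        SketchVector(String field, String type) {
            this.field = field;
            this.type = type;
        }

        void load(Object data) { this.data = data; }

        void clear() { this.data = null; }
    }

    // Vectors memoized from previous batches, keyed by field name.
    private final Map<String, SketchVector> cache = new HashMap<>();

    // Reuse the cached vector only if its type matches the expected type;
    // otherwise drop it and create a vector of the expected type.
    SketchVector load(String field, String expectedType, Object data) {
        SketchVector v = cache.get(field);
        if (v == null || !v.type.equals(expectedType)) {
            if (v != null) {
                v.clear(); // release the stale vector before replacing it
            }
            v = new SketchVector(field, expectedType);
            cache.put(field, v);
        }
        v.load(data);
        return v;
    }
}
```

With this sketch, loading X twice with the same type tag returns the same 
cached vector, while loading X with a different tag (any type other than 
nullable int) replaces the cached vector instead of failing.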

> Support per-batch schema change at RecordBatchLoader
> ----------------------------------------------------
>
>                 Key: DRILL-1843
>                 URL: https://issues.apache.org/jira/browse/DRILL-1843
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Hanifi Gunes
>            Assignee: Hanifi Gunes
>
> RecordBatchLoader maintains a map of vectors previously loaded. If the type 
> of a vector changes across batches, the load fails. This issue proposes to 
> make RecordBatchLoader support per-batch schema changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
