[ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
------------------------------------------
    Component/s: Execution - Data Types

> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-4001
>                 URL: https://issues.apache.org/jira/browse/DRILL-4001
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>            Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  (This caused some of the 
> {{IndexOutOfBoundsException}} errors seen in fixing DRILL-2288.)
> The condition seems to be that a child field (e.g., an HBase column in an 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector is created (in the MapVector for the HBase 
> column family) while the earlier batch is loaded.  While the later batch is 
> loaded, all vectors are reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ are loaded and set to 
> the length of the batch.  Other vectors, created from earlier 
> messages/{{load}} calls, are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-xxxx) mark and workaround in {{MapVector.getObject(int)}}.
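
To make the reported sequence concrete, here is a minimal sketch of the failure mode using a toy model, not Drill's actual classes. ToyMapVector, load, loadPadded, length and values are all hypothetical names, and plain Java lists stand in for value vectors. The loadPadded variant illustrates the "fill with nulls" idea mentioned in the description; it is an assumption about one possible repair, not the fix adopted in Drill.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a map vector. Hypothetical; this is NOT Drill's MapVector. */
class ToyMapVector {
    // Child "vectors" persist across load calls, keyed by field name.
    private final Map<String, List<Object>> children = new LinkedHashMap<>();

    /** Mimics the reported behavior: reset every child, then load only the fields in the batch. */
    void load(Map<String, List<Object>> batch) {
        for (List<Object> child : children.values()) {
            child.clear(); // every existing child is reset to zero length
        }
        for (Map.Entry<String, List<Object>> field : batch.entrySet()) {
            // Only fields present in this batch are (re)filled; absent ones stay empty.
            children.computeIfAbsent(field.getKey(), k -> new ArrayList<>())
                    .addAll(field.getValue());
        }
    }

    /** Assumed repair: after loading, pad short children with nulls up to recordCount. */
    void loadPadded(Map<String, List<Object>> batch, int recordCount) {
        load(batch);
        for (List<Object> child : children.values()) {
            while (child.size() < recordCount) {
                child.add(null);
            }
        }
    }

    /** Length of the child vector for the given field (the field must already exist). */
    int length(String field) {
        return children.get(field).size();
    }

    /** Convenience helper for building a column of values. */
    static List<Object> values(Object... vs) {
        return new ArrayList<>(Arrays.asList(vs));
    }
}
```

In this model, loading a first batch with fields "a" and "b" and then a second batch with three records containing only "a" leaves "a" at length 3 and "b" at length 0, which is the sibling-length mismatch described above. Calling loadPadded with a record count of 3 brings "b" up to length 3, filled with nulls.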



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
