[
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Barclay (Drill) updated DRILL-4001:
------------------------------------------
Component/s: Execution - Data Types
> Empty vectors from previous batch left by
> MapVector.load(...)/RecordBatchLoader.load(...)
> -----------------------------------------------------------------------------------------
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Data Types
> Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a
> length of zero instead of having a length matching the length of sibling
> vectors and the number of records in the batch. (This caused some of the
> {{IndexOutOfBoundsException}} errors seen while fixing DRILL-2288.)
> The condition seems to be that a child field (e.g., an HBase column in an
> HBase column family) appears in an earlier batch and does not appear in a
> later batch.
> (The HBase column's child vector is created, in the MapVector for the HBase
> column family, during loading of the earlier batch. During loading of the
> later batch, all vectors are reset to zero length, and then only vectors for
> fields _appearing in the batch message being loaded_ are loaded and set to
> the length of the batch. Other vectors created from earlier
> messages/{{load}} calls are left with a length of zero, instead of, say,
> being filled with nulls to the length of their siblings and the current
> record batch.)
> See the TODO(DRILL-xxxx) mark and workaround in {{MapVector.getObject(int)}}.
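
The sequence described above can be mimicked with a small toy model. This is only a sketch under stated assumptions: ToyMap, load, padMissing and length are hypothetical names invented for illustration and are not Drill's MapVector or RecordBatchLoader API. Plain lists stand in for child vectors. The padMissing step shows the "fill with nulls" alternative mentioned in the description, not necessarily the fix that was adopted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical and are NOT Drill's API.
class MapLoadSketch {

    /** Stand-in for a map vector: child field name -> child values, one per record. */
    static class ToyMap {
        final Map<String, List<String>> children = new LinkedHashMap<>();

        /** Mimics the reported behavior: reset every child, then load only the batch's fields. */
        void load(Map<String, List<String>> batch) {
            for (List<String> child : children.values()) {
                child.clear(); // every previously created child drops to length zero
            }
            for (Map.Entry<String, List<String>> e : batch.entrySet()) {
                // Only fields present in this batch are (re)filled.
                children.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                        .addAll(e.getValue());
            }
        }

        /** Assumed repair: pad children that were not in the batch with nulls. */
        void padMissing(int recordCount) {
            for (List<String> child : children.values()) {
                while (child.size() < recordCount) {
                    child.add(null);
                }
            }
        }

        int length(String name) {
            return children.get(name).size();
        }
    }

    public static void main(String[] args) {
        ToyMap family = new ToyMap();

        // Earlier batch: both columns appear, two records.
        Map<String, List<String>> earlier = new LinkedHashMap<>();
        earlier.put("c1", Arrays.asList("a", "b"));
        earlier.put("c2", Arrays.asList("x", "y"));
        family.load(earlier);

        // Later batch: only c1 appears, three records.
        family.load(Collections.singletonMap("c1", Arrays.asList("c", "d", "e")));
        System.out.println("c1=" + family.length("c1") + " c2=" + family.length("c2")); // c1=3 c2=0

        // Padding brings the stale child back in line with its sibling.
        family.padMissing(3);
        System.out.println("c1=" + family.length("c1") + " c2=" + family.length("c2")); // c1=3 c2=3
    }
}
```

In this model, reading record 1 or 2 from c2 before padding would fail with an out-of-bounds error, which corresponds to the symptom reported above.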
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)