[
https://issues.apache.org/jira/browse/HIVE-12202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967376#comment-14967376
]
Elliot West commented on HIVE-12202:
------------------------------------
I've checked to see that {{AcidUtils.serializeDeltas}} is being used correctly
in conjunction with {{AcidUtils.deserializeDeltas}}. It appears that
{{serializeDeltas}} does indeed create {{DeltaMetaData}} instances with an
empty list for the statement IDs for delta paths containing only
{{$startTxnId}} and {{$endTxnId}}. However, the deserialization process in
{{AcidInputFormat.DeltaMetaData.readFields(DataInput)}} incorrectly sets
{{stmtIds}} to {{null}} at line 152 if no statement count was serialized. As a
result, {{AcidUtils.deserializeDeltas}} throws an NPE at line 371 when it calls
{{dmd.getStmtIds().isEmpty()}}.
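To illustrate the symmetry that the serialize/deserialize pair needs, here is a
minimal, self-contained sketch. It is not the Hive source: the class
{{DeltaMetaDataSketch}}, its field layout, its wire format (two longs, a count,
then the IDs) and the {{roundTrip}} helper are all assumptions made for
illustration. The only point it shows is that {{readFields}} can reset
{{stmtIds}} to an empty list before reading, so a legacy delta with no
statement IDs comes back as an empty list rather than {{null}}.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AcidInputFormat.DeltaMetaData; NOT the Hive class.
class DeltaMetaDataSketch {
    private long minTxnId;
    private long maxTxnId;
    // Invariant: never null. An empty list means "legacy delta, no statement IDs".
    private List<Integer> stmtIds = new ArrayList<Integer>();

    DeltaMetaDataSketch() {
    }

    DeltaMetaDataSketch(long minTxnId, long maxTxnId, List<Integer> stmtIds) {
        this.minTxnId = minTxnId;
        this.maxTxnId = maxTxnId;
        if (stmtIds != null) {
            this.stmtIds = new ArrayList<Integer>(stmtIds);
        }
    }

    List<Integer> getStmtIds() {
        return stmtIds;
    }

    // Assumed wire format: minTxnId, maxTxnId, statement count, then each ID.
    void write(DataOutput out) throws IOException {
        out.writeLong(minTxnId);
        out.writeLong(maxTxnId);
        out.writeInt(stmtIds.size());
        for (int id : stmtIds) {
            out.writeInt(id);
        }
    }

    void readFields(DataInput in) throws IOException {
        minTxnId = in.readLong();
        maxTxnId = in.readLong();
        int count = in.readInt();
        // Reset to an empty list first so a zero count cannot leave null behind.
        stmtIds = new ArrayList<Integer>();
        for (int i = 0; i < count; i++) {
            stmtIds.add(in.readInt());
        }
    }

    // Serializes to a byte array and reads the bytes back into a fresh instance.
    static DeltaMetaDataSketch roundTrip(DeltaMetaDataSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            DeltaMetaDataSketch copy = new DeltaMetaDataSketch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, a caller can evaluate {{getStmtIds().isEmpty()}} on a
round-tripped legacy delta without needing an extra null check.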
> NPE thrown when reading legacy ACID delta files
> -----------------------------------------------
>
> Key: HIVE-12202
> URL: https://issues.apache.org/jira/browse/HIVE-12202
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0
> Reporter: Elliot West
> Assignee: Elliot West
> Labels: transactions
>
> When reading legacy ACID deltas of the form {{delta_$startTxnId_$endTxnId}}, a
> {{NullPointerException}} is thrown at:
> {code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
> if(dmd.getStmtIds().isEmpty()) {
> {code}
> The older ACID data format (pre-Hive 1.3.0), which does not include the
> statement ID, and code written for that format should still be supported.
> Therefore the above condition should also include a null check, or
> alternatively {{AcidInputFormat.DeltaMetaData}} should never return null and
> should instead return an empty list in this specific scenario.