[
https://issues.apache.org/jira/browse/HIVE-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14957102#comment-14957102
]
Elliot West commented on HIVE-11030:
------------------------------------
When I use this to read legacy ACID deltas of the form
{{delta_$startTxnId_$endTxnId}}, I get an NPE at:
{code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
if(dmd.getStmtIds().isEmpty()) {
{code}
I assume the older ACID data format, which does not include the statement ID,
should still be supported. If so, I think the condition above should also
include a null check. Alternatively, {{AcidInputFormat.DeltaMetaData}} should
never return {{null}} from {{getStmtIds()}} and should return an empty list in
this scenario. I believe the latter can be achieved with a couple of minor changes:
{code}
--- a/ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java
@@ -115,7 +114,7 @@
private List<Integer> stmtIds;
public DeltaMetaData() {
- this(0,0,null);
+ this(0,0,new ArrayList<Integer>());
}
DeltaMetaData(long minTxnId, long maxTxnId, List<Integer> stmtIds) {
this.minTxnId = minTxnId;
@@ -136,7 +135,7 @@ public void write(DataOutput out) throws IOException {
out.writeLong(minTxnId);
out.writeLong(maxTxnId);
out.writeInt(stmtIds.size());
- if(stmtIds == null) {
+ if(stmtIds.isEmpty()) {
return;
}
for(Integer id : stmtIds) {
{code}
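To illustrate the intent of the change, here is a minimal, self-contained sketch of the "never return null" approach. It is not the Hive class: {{DeltaMetaDataSketch}} and its {{roundTrip}} helper are hypothetical names, and the serialization layout simply mirrors the {{write}} snippet above. The sketch also normalizes {{null}} in the three-argument constructor, which goes one step further than the diff and is an assumption on my part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for AcidInputFormat.DeltaMetaData.
class DeltaMetaDataSketch {
  private long minTxnId;
  private long maxTxnId;
  private List<Integer> stmtIds;

  DeltaMetaDataSketch() {
    this(0, 0, null);
  }

  DeltaMetaDataSketch(long minTxnId, long maxTxnId, List<Integer> stmtIds) {
    this.minTxnId = minTxnId;
    this.maxTxnId = maxTxnId;
    // Normalize null to an empty list so legacy deltas (no statement ID)
    // never expose a null to callers such as dmd.getStmtIds().isEmpty().
    this.stmtIds = (stmtIds == null)
        ? new ArrayList<Integer>()
        : new ArrayList<Integer>(stmtIds);
  }

  long getMinTxnId() { return minTxnId; }
  long getMaxTxnId() { return maxTxnId; }
  List<Integer> getStmtIds() { return stmtIds; }

  void write(DataOutput out) throws IOException {
    out.writeLong(minTxnId);
    out.writeLong(maxTxnId);
    // Safe: stmtIds is never null, so size() cannot throw an NPE.
    out.writeInt(stmtIds.size());
    for (Integer id : stmtIds) {
      out.writeInt(id);
    }
  }

  void readFields(DataInput in) throws IOException {
    minTxnId = in.readLong();
    maxTxnId = in.readLong();
    int count = in.readInt();
    // A count of zero yields an empty list rather than null.
    stmtIds = new ArrayList<Integer>(count);
    for (int i = 0; i < count; i++) {
      stmtIds.add(in.readInt());
    }
  }

  // Serializes and deserializes a copy, as would happen when a split is shipped to a task.
  static DeltaMetaDataSketch roundTrip(DeltaMetaDataSketch source) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      source.write(new DataOutputStream(bytes));
      DeltaMetaDataSketch copy = new DeltaMetaDataSketch();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this shape, a legacy delta such as {{new DeltaMetaDataSketch(100, 101, null)}} reports an empty statement ID list both before and after a round trip, so the {{isEmpty()}} check in {{deserializeDeltas}} would not need a separate null guard.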
> Enhance storage layer to create one delta file per write
> --------------------------------------------------------
>
> Key: HIVE-11030
> URL: https://issues.apache.org/jira/browse/HIVE-11030
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Affects Versions: 1.2.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Fix For: 1.3.0
>
> Attachments: HIVE-11030.2.patch, HIVE-11030.3.patch,
> HIVE-11030.4.patch, HIVE-11030.5.patch, HIVE-11030.6.patch,
> HIVE-11030.7.patch, HIVE-11030.8.branch1.patch, HIVE-11030.8.patch
>
>
> Currently each txn using ACID insert/update/delete will generate a delta
> directory like delta_0000100_0000101. In order to support multi-statement
> transactions we must generate one delta per operation within the transaction
> so the deltas would be named like delta_0000100_0000101_0001, etc.
> Support for MERGE (HIVE-10924) would need the same.