[ 
https://issues.apache.org/jira/browse/HIVE-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14957102#comment-14957102
 ] 

Elliot West commented on HIVE-11030:
------------------------------------

When using this to read legacy ACID deltas of the form 
{{delta_$startTxnId_$endTxnId}}, I get an NPE at:
{code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
if(dmd.getStmtIds().isEmpty()) {
{code}
I assume that the older ACID data format, which does not include the statement 
ID, should still be supported? If so, I think the above condition should also 
include a null check, or alternatively {{AcidInputFormat.DeltaMetaData}} should 
never return {{null}} and should instead return an empty list in this scenario. 
I believe this can be achieved with a couple of minor changes:
{code}
--- a/ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java
@@ -115,7 +114,7 @@
     private List<Integer> stmtIds;

     public DeltaMetaData() {
-      this(0,0,null);
+      this(0,0,new ArrayList<Integer>());
     }
     DeltaMetaData(long minTxnId, long maxTxnId, List<Integer> stmtIds) {
       this.minTxnId = minTxnId;
@@ -136,7 +135,7 @@ public void write(DataOutput out) throws IOException {
       out.writeLong(minTxnId);
       out.writeLong(maxTxnId);
       out.writeInt(stmtIds.size());
-      if(stmtIds == null) {
+      if(stmtIds.isEmpty()) {
         return;
       }
       for(Integer id : stmtIds) {
{code}
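
To illustrate the intent of the change outside of Hive, here is a minimal, standalone sketch. {{DeltaMetaDataSketch}} and its {{serialize}} helper are hypothetical stand-ins written for this example, not the actual {{AcidInputFormat.DeltaMetaData}} class. It assumes the desired behaviour is that a legacy delta (no statement IDs) is represented by an empty list rather than {{null}}, so that both {{getStmtIds().isEmpty()}} and {{write}} are safe to call.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AcidInputFormat.DeltaMetaData (not the Hive class).
final class DeltaMetaDataSketch {
  private final long minTxnId;
  private final long maxTxnId;
  private final List<Integer> stmtIds;

  DeltaMetaDataSketch(long minTxnId, long maxTxnId, List<Integer> stmtIds) {
    this.minTxnId = minTxnId;
    this.maxTxnId = maxTxnId;
    // Assumption: a null list means "legacy delta, no statement IDs",
    // so normalize it to an empty list and never expose null.
    this.stmtIds = (stmtIds == null)
        ? new ArrayList<>()
        : new ArrayList<>(stmtIds);
  }

  List<Integer> getStmtIds() {
    return Collections.unmodifiableList(stmtIds);
  }

  // Writes minTxnId, maxTxnId, the statement ID count, then each statement ID.
  // With an empty list the loop simply does not run, so no early return is needed.
  void write(DataOutput out) throws IOException {
    out.writeLong(minTxnId);
    out.writeLong(maxTxnId);
    out.writeInt(stmtIds.size());
    for (Integer id : stmtIds) {
      out.writeInt(id);
    }
  }

  // Convenience helper for the example: serialize to a byte array.
  static byte[] serialize(DeltaMetaDataSketch dmd) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      dmd.write(out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }
}
```

With this shape, constructing the object with {{null}} for a legacy delta leaves {{getStmtIds()}} empty, and {{write}} emits only the two transaction IDs followed by a zero count.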



> Enhance storage layer to create one delta file per write
> --------------------------------------------------------
>
>                 Key: HIVE-11030
>                 URL: https://issues.apache.org/jira/browse/HIVE-11030
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 1.2.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 1.3.0
>
>         Attachments: HIVE-11030.2.patch, HIVE-11030.3.patch, 
> HIVE-11030.4.patch, HIVE-11030.5.patch, HIVE-11030.6.patch, 
> HIVE-11030.7.patch, HIVE-11030.8.branch1.patch, HIVE-11030.8.patch
>
>
> Currently each txn using ACID insert/update/delete will generate a delta 
> directory like delta_0000100_0000101.  In order to support multi-statement 
> transactions we must generate one delta per operation within the transaction 
> so the deltas would be named like delta_0000100_0000101_0001, etc.
> Support for MERGE (HIVE-10924) would need the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
