Peter Rozsa created IMPALA-13759:
------------------------------------

             Summary: Hive ACID table base folder identification procedure is 
inconsistent with Hive
                 Key: IMPALA-13759
                 URL: https://issues.apache.org/jira/browse/IMPALA-13759
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
            Reporter: Peter Rozsa


Impala and Hive use different rules to decide whether a base folder can be read 
when there are open writeIds. This can cause read inconsistencies between the 
two engines, because Hive reads the base folder even when an open writeId 
precedes the writeId of a newer base.
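
To make the difference concrete, here is a minimal sketch of two possible validity rules for a base folder. It is not the code of either project: the class name, method names, and parameters are hypothetical, and the assumption that the stricter rule resembles Impala's check and the more lenient one resembles Hive's is drawn only from the description above. The actual logic is in the links below.

```java
import java.util.Set;

// Hypothetical illustration only; not taken from Impala's or Hive's AcidUtils.
public class BaseFolderCheck {

    // Assumed "strict" rule: the base is readable only if it is not above the
    // high watermark and no writeId at or below the base writeId is still open.
    public static boolean strictRule(long baseWriteId, long highWatermark,
                                     Set<Long> openWriteIds) {
        if (baseWriteId > highWatermark) {
            return false;
        }
        for (long open : openWriteIds) {
            if (open <= baseWriteId) {
                return false;
            }
        }
        return true;
    }

    // Assumed "lenient" rule: the base is readable if its own writeId is
    // visible, even when an earlier writeId is still open.
    public static boolean lenientRule(long baseWriteId, long highWatermark,
                                      Set<Long> openWriteIds) {
        return baseWriteId <= highWatermark && !openWriteIds.contains(baseWriteId);
    }

    public static void main(String[] args) {
        // Scenario from the description: writeId 3 is open and a newer base
        // was written with writeId 5 (for example a folder named base_5).
        Set<Long> open = Set.of(3L);
        System.out.println("strict:  " + strictRule(5, 10, open));
        System.out.println("lenient: " + lenientRule(5, 10, open));
    }
}
```

In this scenario the two rules disagree on the same snapshot, which is the kind of divergence that could make the two engines pick different base folders.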

Impala's validation: 
[https://github.com/apache/impala/blob/b8f4034754b691a4790e502af214935486aa3ced/fe/src/main/java/org/apache/impala/util/AcidUtils.java#L261]

Hive's validation: 
[https://github.com/apache/hive/blob/0759352ddddc793c0e717c460f0e08eb3f14c1e9/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1774-L1797]

PR that changed the behavior: 
[https://github.com/apache/hive/commit/8ee3497f87f81fa84ee1023e891dc54087c2cd5e]

It is also worth clarifying whether Hive considers the described situation 
valid in the first place.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
