[
https://issues.apache.org/jira/browse/IMPALA-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928064#comment-17928064
]
Csaba Ringhofer commented on IMPALA-13759:
------------------------------------------
>Also, it's worth mentioning whether the described situation is considered
>valid in the first place from Hive's side.
At first I assumed that this is not valid, as an INSERT OVERWRITE shouldn't
start while there are open transactions (and locks). However, based on
[~boroknagyz]'s observation that this doesn't hold for partitioned tables
(where partition-level locks are used), it was easy to reproduce the scenario
with Impala:
{code}
-- impala client 1:
set default_transactional_type=insert_only;
create table tacid (i int) partitioned by (p string);
-- the long sleep keeps the insert into partition "a" (and its transaction) open
insert into tacid partition(p="a") select sleep(100000);
-- impala client 2:
insert into tacid partition (p="b") values (2);
insert overwrite tacid partition (p="b") values (3);
insert into tacid partition (p="b") values (4);
select * from tacid;
{code}
The final select can return different result sets:
1. if insert to partition "a" is not committed yet:
{code}
2 b
4 b
{code}
2. after insert to partition "a" is committed:
{code}
1 a
2 b
4 b
{code}
3. after running REFRESH once the insert to partition "a" is committed:
{code}
1 a
3 b
4 b
{code}
1. and 2. are incorrect, as we should never observe 4b without seeing 3b (which
was committed before the insert of 4b started).
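To illustrate why 1. differs from 3., here is a simplified model of base
directory selection for partition "b". It is only a sketch and not the actual
Impala or Hive code: the names (Dir, visible_rows, strict_base_check) are made
up, and it assumes that the insert into partition "a" got write id 1 and the
three statements on partition "b" got write ids 2, 3 and 4. The strict check
mirrors how the issue describes Impala's validation (a base is rejected while a
lower write id is open); the lenient check mirrors how it describes Hive's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dir:
    kind: str       # "base" (written by INSERT OVERWRITE) or "delta" (INSERT INTO)
    write_id: int
    rows: tuple     # values of column i stored in this directory


def visible_rows(dirs, open_write_ids, strict_base_check):
    """Return the rows a reader sees in one partition (hypothetical model)."""

    def committed(d):
        return d.write_id not in open_write_ids

    def base_ok(d):
        if not committed(d):
            return False
        if strict_base_check:
            # Reject the base while any lower write id is still open.
            return all(w > d.write_id for w in open_write_ids)
        # Lenient: the base only needs its own write id to be committed.
        return True

    bases = [d for d in dirs if d.kind == "base" and base_ok(d)]
    best = max(bases, key=lambda d: d.write_id, default=None)
    floor = best.write_id if best else 0
    rows = list(best.rows) if best else []
    # Add committed deltas that are newer than the chosen base.
    for d in sorted(dirs, key=lambda d: d.write_id):
        if d.kind == "delta" and d.write_id > floor and committed(d):
            rows.extend(d.rows)
    return rows


# Partition p="b" after the three statements of client 2 (assumed write ids).
PARTITION_B = [Dir("delta", 2, (2,)), Dir("base", 3, (3,)), Dir("delta", 4, (4,))]

strict_while_open = visible_rows(PARTITION_B, {1}, strict_base_check=True)
lenient_while_open = visible_rows(PARTITION_B, {1}, strict_base_check=False)
strict_after_commit = visible_rows(PARTITION_B, set(), strict_base_check=True)

print(strict_while_open)    # [2, 4]
print(lenient_while_open)   # [3, 4]
print(strict_after_commit)  # [3, 4]
```

With write id 1 still open, the strict check skips base 3 and falls back to
the deltas, which gives the 2b/4b result of case 1. Case 2 is not modeled
here; it presumably comes from metadata that was loaded before the commit and
not yet refreshed.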
I also noticed another (rare) scenario that could lead to an INSERT OVERWRITE
running while an earlier write id is still open:
https://github.com/apache/impala/blob/aac67a077eb80e23f433eef5fcbf9edda75deb75/fe/src/main/java/org/apache/impala/service/Frontend.java#L2689
Impala allocates the new writeId before locking the table, so if something
happens (e.g. the coordinator crashes) between the two steps, Impala is left
with an open transaction that has a write id but no corresponding lock.
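The window can be pictured with a toy replay of the two steps. This is only an
illustration of the ordering; run_insert and unprotected_write_ids are
hypothetical helpers and do not correspond to functions in Impala or the Hive
Metastore.

```python
def run_insert(crash_after=None):
    """Replay the steps in the order described above, stopping after `crash_after`."""
    state = {"open_write_ids": set(), "locks": set()}
    steps = [
        ("allocate_write_id", lambda: state["open_write_ids"].add(1)),
        ("acquire_lock", lambda: state["locks"].add(1)),
    ]
    for name, action in steps:
        action()
        if name == crash_after:
            break  # simulated coordinator crash
    return state


def unprotected_write_ids(state):
    """Write ids that are open but not covered by any lock."""
    return state["open_write_ids"] - state["locks"]


normal = unprotected_write_ids(run_insert())
crashed = unprotected_write_ids(run_insert(crash_after="allocate_write_id"))
print(normal)   # set()
print(crashed)  # {1}
```

In the crashed case the write id stays open with nothing to block a later
INSERT OVERWRITE, which is the same precondition as in the partitioned
reproduction.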
> Hive ACID table base folder identification procedure is inconsistent with Hive
> ------------------------------------------------------------------------------
>
> Key: IMPALA-13759
> URL: https://issues.apache.org/jira/browse/IMPALA-13759
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Peter Rozsa
> Priority: Major
>
> Impala's base folder identification uses a different approach to decide
> whether a base folder is feasible for reading or not in the sense of open
> writeIds. This could cause read inconsistencies with Hive, as Hive reads the
> base folder even if there's an open writeId before a newer base writeId.
> Impala's validation:
> [https://github.com/apache/impala/blob/b8f4034754b691a4790e502af214935486aa3ced/fe/src/main/java/org/apache/impala/util/AcidUtils.java#L261]
> Hive's validation:
> [https://github.com/apache/hive/blob/0759352ddddc793c0e717c460f0e08eb3f14c1e9/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1774-L1797]
> PR that changed the behavior:
> [https://github.com/apache/hive/commit/8ee3497f87f81fa84ee1023e891dc54087c2cd5e]
>
> Also, it's worth mentioning whether the described situation is considered
> valid in the first place from Hive's side.