Zoltán Borók-Nagy created IMPALA-10757:
------------------------------------------
Summary: ACID table locking for DML statements is faulty
Key: IMPALA-10757
URL: https://issues.apache.org/jira/browse/IMPALA-10757
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Zoltán Borók-Nagy
Plain SELECT queries don't take ACID locks. They use the latest snapshot of the
table that is loaded by CatalogD.
However, DML statements lock all the tables they reference, not just the target
table.
E.g.:
{noformat}
INSERT INTO target_table SELECT * FROM source_table;
{noformat}
acquires locks for both target_table and source_table. However, after acquiring
the locks, Impala doesn't reload the tables.
Therefore the following situation is possible:
{noformat}
INSERT OVERWRITE foo SELECT ...; (takes an exclusive lock for foo)
{noformat}
while the following statement also tries to take a SHARED_LOCK for foo:
{noformat}
INSERT INTO bar SELECT * FROM foo;
{noformat}
This means the INSERT INTO statement might wait for the INSERT OVERWRITE
statement to complete, but since it doesn't reload foo, it still uses the old
snapshot of foo. Waiting for the lock therefore brings no benefit.
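The sequence above can be modeled with a small, self-contained Python sketch.
This is only an illustration of the ordering problem, not Impala code: the
Snapshot and Catalog classes and the insert_select_without_reload function are
hypothetical names, and the "wait for the lock" step is reduced to a callback
that commits the concurrent INSERT OVERWRITE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Immutable view of a table as cached by the catalog (hypothetical model)."""
    version: int
    rows: tuple


class Catalog:
    """Toy stand-in for CatalogD: maps a table name to its latest snapshot."""

    def __init__(self):
        self._tables = {}

    def commit(self, name, rows):
        # Every commit (e.g. an INSERT OVERWRITE) publishes a new snapshot.
        prev = self._tables.get(name)
        version = 1 if prev is None else prev.version + 1
        self._tables[name] = Snapshot(version, tuple(rows))

    def load(self, name):
        return self._tables[name]


def insert_select_without_reload(catalog, source, concurrent_overwrite):
    """Models INSERT INTO bar SELECT * FROM <source> as described in this issue."""
    # 1. Analysis: the statement picks up the current snapshot of the source.
    snapshot = catalog.load(source)
    # 2. Lock acquisition blocks until the concurrent INSERT OVERWRITE commits.
    concurrent_overwrite()
    # 3. The lock is granted, but the snapshot from step 1 is used unchanged.
    return snapshot


catalog = Catalog()
catalog.commit("foo", ["old"])
used = insert_select_without_reload(
    catalog, "foo", lambda: catalog.commit("foo", ["new"])
)
# 'used' is still version 1 of foo, while the catalog already holds version 2.
```

In this model the statement pays the cost of waiting in step 2 but reads
exactly what it would have read without taking the lock at all.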
Possible solutions:
# Reload tables after the locks are acquired.
# Only take a lock for the target table. This would be better than the current
behavior, and it would be consistent with plain SELECT queries.
I think reloading should be favored as Impala should run every statement (that
involves ACID tables) in a transaction and take proper locks, see IMPALA-8788.
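A rough sketch of the two options, under the same caveats: this is a toy model
in Python, not the Impala frontend, and ToyCatalog, tables_to_lock and
plan_with_reload are hypothetical names introduced only for illustration.
Table versions are plain integers, and waiting for locks is a callback during
which other writers may commit.

```python
class ToyCatalog:
    """Hypothetical catalog that tracks one version number per table."""

    def __init__(self):
        self.versions = {}

    def commit(self, table):
        self.versions[table] = self.versions.get(table, 0) + 1

    def load(self, table):
        return self.versions[table]


def tables_to_lock(target, sources, target_only):
    """Option 2 locks only the target; the current behavior locks everything."""
    return [target] if target_only else [target, *sources]


def plan_with_reload(catalog, tables, wait_for_locks):
    """Option 1: reload every locked table once the locks are held."""
    planned = {t: catalog.load(t) for t in tables}  # snapshots seen at analysis
    wait_for_locks()                                # may block; writers may commit
    fresh = {t: catalog.load(t) for t in tables}    # reload after the locks
    # Tables whose snapshot changed while waiting would need re-analysis.
    changed = sorted(t for t in tables if fresh[t] != planned[t])
    return fresh, changed


catalog = ToyCatalog()
catalog.commit("foo")
catalog.commit("bar")
locked = tables_to_lock("bar", ["foo"], target_only=False)
fresh, changed = plan_with_reload(catalog, locked, lambda: catalog.commit("foo"))
# 'changed' lists foo, the table that was overwritten while the locks were pending.
```

With the reload in place, the wait in the lock step is what guarantees that the
statement sees the result of the INSERT OVERWRITE, which is the behavior a
reader would expect from taking the lock in the first place.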