Zoltán Borók-Nagy created IMPALA-10757:
------------------------------------------

             Summary: ACID table locking for DML statements is faulty
                 Key: IMPALA-10757
                 URL: https://issues.apache.org/jira/browse/IMPALA-10757
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
            Reporter: Zoltán Borók-Nagy


Plain SELECT queries don't take ACID locks. They use the latest snapshot of the 
table that is loaded by CatalogD.

However, DML statements lock all the tables they reference, not just the target 
table.

E.g.:
{noformat}
INSERT INTO target_table SELECT * FROM source_table;
{noformat}
acquires locks for both target_table and source_table. However, after acquiring 
the locks Impala doesn't reload the tables.

Therefore the following situation is possible:
{noformat}
INSERT OVERWRITE foo SELECT ...; (takes an exclusive lock for foo)
{noformat}
while the following statement also tries to take a SHARED_LOCK for foo:
{noformat}
INSERT INTO bar SELECT * FROM foo;
{noformat}
This means the INSERT INTO statement might wait for the INSERT OVERWRITE 
statement to complete, but since it doesn't reload foo afterwards it still uses 
the old snapshot of foo. Hence there is no benefit in waiting for the lock.
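
The sequence above can be illustrated with a small toy model. This is only a sketch of the ordering problem, not Impala's frontend code: ToyCatalog, insert_into_bar_from_foo and the wait_for_lock callback are hypothetical names, and the "lock wait" is simulated by a callback during which the concurrent INSERT OVERWRITE commits.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for the catalog: table name -> (version, rows)."""
    tables: dict = field(default_factory=dict)

    def snapshot(self, name):
        # Return an immutable copy of the table's current state.
        version, rows = self.tables[name]
        return version, tuple(rows)

    def overwrite(self, name, rows):
        # Models INSERT OVERWRITE: replace the rows and bump the version.
        version, _ = self.tables[name]
        self.tables[name] = (version + 1, list(rows))


def insert_into_bar_from_foo(catalog, wait_for_lock):
    """Return the snapshot of foo that the INSERT INTO would read."""
    # 1. Analysis: the snapshot of foo is loaded before any lock is taken.
    planned = catalog.snapshot("foo")
    # 2. Lock acquisition: blocks until the concurrent writer has finished.
    wait_for_lock()
    # 3. Execution: nothing is reloaded, so the pre-lock snapshot is used.
    return planned
```

If wait_for_lock is a callback that calls catalog.overwrite("foo", ...), the function still returns the version of foo that existed before the overwrite, which is the behavior described above.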

Possible solutions:
 # Reload tables after the lock is acquired.
 # Only take a lock for the target table. This would be better than the current 
behavior, and it would be consistent with plain SELECT queries.

I think reloading should be favored as Impala should run every statement (that 
involves ACID tables) in a transaction and take proper locks, see IMPALA-8788.
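
A minimal sketch of the reload-after-lock option, assuming the statement can simply load its snapshots a second time once the locks are held. All names here (run_dml, load_snapshot, acquire_locks, release_locks, demo) are hypothetical and do not correspond to Impala's actual frontend API; the concurrent writer is simulated by bumping a version number while the lock is being acquired.

```python
def run_dml(source_tables, load_snapshot, acquire_locks, release_locks,
            reload_after_lock=True):
    """Return the snapshots the statement would read from."""
    # Analysis/planning: snapshots are loaded before any lock is held.
    snapshots = {name: load_snapshot(name) for name in source_tables}
    acquire_locks(source_tables)  # may block behind a concurrent writer
    try:
        if reload_after_lock:
            # Refresh the snapshots now that the locks are held.
            snapshots = {name: load_snapshot(name) for name in source_tables}
        return snapshots
    finally:
        release_locks(source_tables)


def demo(reload_after_lock):
    """Run one statement against a table that changes during the lock wait."""
    versions = {"foo": 1}

    def acquire_locks(names):
        # Simulate waiting for a concurrent INSERT OVERWRITE that commits
        # a new version of each table before the lock is granted.
        for name in names:
            versions[name] += 1

    return run_dml(["foo"], versions.__getitem__, acquire_locks,
                   lambda names: None, reload_after_lock)
```

With reload_after_lock=True the statement sees the version committed while it was waiting; with reload_after_lock=False it keeps the version loaded during analysis, which matches the current behavior described in this issue.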


