[ 
https://issues.apache.org/jira/browse/IMPALA-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577027#comment-17577027
 ] 

ASF subversion and git services commented on IMPALA-11335:
----------------------------------------------------------

Commit 8a04e0d3d5d925b4d3ca9d3a2ec1e43a57700627 in impala's branch 
refs/heads/branch-4.1.1 from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8a04e0d3d ]

IMPALA-11335: allocate WriteId before taking locks during INSERT to ACID tables

It turned out that for INSERT, HMS saves the writeId during lock
creation and later uses it to clean up aborted / timed-out
transactions. See the Jira for more details.

Testing:
- It is tricky to test this, so no new test was added. Hive should
  check whether there is already a new writeId for a table during
  lock creation and return an error if not - this would ensure
  that the correct behavior of Impala is tested.

Change-Id: Ic13b8822662474a0d2d4d1a31f12745159c758f4
Reviewed-on: http://gerrit.cloudera.org:8080/18583
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Zoltan Borok-Nagy <[email protected]>


> WriteId must be requested before taking locks during inserts
> ------------------------------------------------------------
>
>                 Key: IMPALA-11335
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11335
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Frontend
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: ACID
>             Fix For: Impala 4.2.0
>
>
> It turned out that the writeId is saved to the DB by HMS during lock creation for 
> inserts, because this info is used to delete the folders created by 
> aborted/timed-out inserts. This seems a bit hacky but makes sense, as during 
> lock creation we express the intention of the transaction for the given table 
> ( 
> https://github.infra.cloudera.com/CDH/hive/blob/4604ca6f1077dd808055539e95e9b9be97cdb312/standalone-metastore/src/main/thrift/hive_metastore.thrift#L1123
>  ), while this information is not expressed in the other APIs (open_txns, 
> allocate_table_write_ids).
> Currently Impala takes the lock first, which can cause issues during the 
> cleanup of aborted/timed-out inserts. 
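
The ordering described above, together with the stricter check suggested in the commit's Testing note, can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeMetastore, InsertOrdering and all of their methods are hypothetical names invented for this sketch, not the HMS client API or Impala's code, and the real HMS does not currently reject a lock request that lacks a writeId.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for the metastore; not the real HMS API. */
class FakeMetastore {
    private long nextTxnId = 1;
    private long nextWriteId = 1;
    // (txnId, table) pairs that already have a writeId allocated.
    private final Set<String> allocated = new HashSet<>();
    // Records the order of calls so the sequence can be inspected.
    final List<String> calls = new ArrayList<>();

    long openTxn() {
        calls.add("open_txn");
        return nextTxnId++;
    }

    long allocateTableWriteId(long txnId, String table) {
        calls.add("allocate_write_id");
        allocated.add(txnId + "/" + table);
        return nextWriteId++;
    }

    /**
     * Assumed behavior, modeled on the check proposed in the Testing note:
     * refuse an INSERT lock when no writeId exists yet for the table.
     */
    void lockForInsert(long txnId, String table) {
        calls.add("lock");
        if (!allocated.contains(txnId + "/" + table)) {
            throw new IllegalStateException(
                "no writeId allocated for " + table + " in txn " + txnId);
        }
    }
}

class InsertOrdering {
    /** Ordering after the fix: allocate the writeId, then take the lock. */
    static long insertFixed(FakeMetastore hms, String table) {
        long txnId = hms.openTxn();
        long writeId = hms.allocateTableWriteId(txnId, table);
        hms.lockForInsert(txnId, table);
        return writeId;
    }

    /** Ordering before the fix: take the lock first, allocate afterwards. */
    static long insertOld(FakeMetastore hms, String table) {
        long txnId = hms.openTxn();
        hms.lockForInsert(txnId, table); // rejected by the assumed check
        return hms.allocateTableWriteId(txnId, table);
    }

    public static void main(String[] args) {
        long writeId = insertFixed(new FakeMetastore(), "db.t");
        System.out.println("fixed order, writeId: " + writeId);
        try {
            insertOld(new FakeMetastore(), "db.t");
        } catch (IllegalStateException e) {
            System.out.println("old order rejected: " + e.getMessage());
        }
    }
}
```

With a check like this on the metastore side, a client that takes the lock first would fail immediately, which is the kind of error the Testing note says would make the correct ordering testable.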



