[
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822618#comment-17822618
]
Rui Li commented on HIVE-26882:
-------------------------------
Hi [~pvary], we're trying to improve the performance of concurrent writes to
Iceberg tables and are evaluating this feature. While it considerably improves
commit throughput, there seems to be a consistency issue.
Here's how we test: we enable the no-lock feature on an Iceberg table and
launch 120 processes to update the table concurrently. Each process randomly
generates a unique key/value pair and adds it to the table properties. After
all the processes finish, we find that the number of successful processes
doesn't match the number of newly added properties in the table, e.g. 72
processes succeeded but only 37 properties were added.
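To make the expected invariant concrete, here is a minimal in-memory sketch. It is neither the actual test code nor the HMS implementation: FakeTable, alterIfUnchanged and the version marker are hypothetical stand-ins for the table, the conditional alter call and the checked table parameter. When the check and the write are truly atomic, every commit reported as successful adds exactly one property, so the two counts must match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConditionalAlterSketch {

    /** Consistent view of the table: version marker plus a copy of the properties. */
    static final class State {
        final String version;
        final Map<String, String> props;

        State(String version, Map<String, String> props) {
            this.version = version;
            this.props = props;
        }
    }

    /** Hypothetical in-memory stand-in for a metastore table (an assumption). */
    static final class FakeTable {
        private String version = "v0";
        private Map<String, String> props = new HashMap<>();

        synchronized State read() {
            return new State(version, new HashMap<>(props));
        }

        /** Check and write share one lock, so no other update can slip in between. */
        synchronized boolean alterIfUnchanged(String expectedVersion,
                                              Map<String, String> newProps) {
            if (!version.equals(expectedVersion)) {
                return false; // another writer committed first: report a conflict
            }
            props = new HashMap<>(newProps);
            version = UUID.randomUUID().toString();
            return true;
        }
    }

    /** Runs single-attempt commits; returns {successes, propertiesAdded}. */
    static int[] run(int writers) throws InterruptedException {
        FakeTable table = new FakeTable();
        AtomicInteger successes = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < writers; i++) {
            pool.submit(() -> {
                State seen = table.read();
                Map<String, String> updated = new HashMap<>(seen.props);
                updated.put(UUID.randomUUID().toString(), "value");
                // No retry: a conflicting writer simply counts as failed.
                if (table.alterIfUnchanged(seen.version, updated)) {
                    successes.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return new int[] {successes.get(), table.read().props.size()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(120);
        System.out.println("successful commits: " + result[0]
                + ", properties added: " + result[1]);
    }
}
```

If the check and the write were not atomic, two writers could both pass the check against the same version and one would silently overwrite the other, which would produce exactly the kind of mismatch described above.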
I searched the HMS logs. There are some commit conflicts [detected by the
HMS|https://github.com/apache/hive/blob/branch-2.3/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L163],
but I don't see any errors from the underlying DBMS (which, IIUC, implies that
the isolation level is not taking effect).
I also tried changing the isolation level to SERIALIZABLE, but it didn't help.
So I wonder whether we're missing some configuration or misusing this feature.
Any input would be helpful.
Our test env:
* HMS built from the latest
[branch-2.3|https://github.com/apache/hive/tree/branch-2.3]
* Apache Iceberg 1.4.3
* Backend DBMS is 10.1.45-MariaDB
> Allow transactional check of Table parameter before altering the Table
> ----------------------------------------------------------------------
>
> Key: HIVE-26882
> URL: https://issues.apache.org/jira/browse/HIVE-26882
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Reporter: Peter Vary
> Assignee: Peter Vary
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.3.10, 4.0.0-beta-1
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> We should add the possibility to transactionally check if a Table parameter
> is changed before altering the table in the HMS.
> This would provide an alternative, less error-prone and faster way to commit
> an Iceberg table, as the Iceberg table currently needs to:
> - Create an exclusive lock
> - Get the table metadata to check if the current snapshot is not changed
> - Update the table metadata
> - Release the lock
> After the change these 4 HMS calls could be substituted with a single alter
> table call.
> Also, we could avoid cases where locks are left hanging by failed processes.