dengzhhu653 commented on PR #5929:
URL: https://github.com/apache/hive/pull/5929#issuecomment-3454010825
> -- MVCC style (optimistic)
> BEGIN;
> SELECT balance FROM accounts WHERE id = 1; -- snapshot read
> -- some processing
> UPDATE accounts SET balance = balance - 100 WHERE id = 1;
> COMMIT;
> -- If another transaction modified same row → commit fails (conflict)
As I explained, this would cause a lost update.
After some research, I found that it is hard to do this with DataNucleus
without a pure "UPDATE" query:
mTable.setLastAccessTime((int) (System.currentTimeMillis()/1000));
pm.flush(); // here might flush the old MTable into data store
pm.refresh(mTable);
In pm.flush(), the old state of the table can overwrite the one in the data
store, resulting in some columns missing from COLUMN_STATS_ACCURATE, for example:
{"COLUMN_STATS":{"col_0":"true","col_1":"true","col_2":"true","col_3":"true","col_5":"true","col_6":"true","col_7":"true","col_8":"true"}}
col_4 and col_9 are missing.
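
To make the failure mode above concrete, here is a minimal in-memory sketch. It does not use DataNucleus. `StaleFlushDemo`, `Row`, `flushWholeObject` and `updateLastAccessTimeOnly` are hypothetical names, and the sketch assumes that a flush writes back every field of the cached object, which is the behaviour described above rather than something verified against the DataNucleus internals.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the data store. Not DataNucleus; all names are hypothetical.
public class StaleFlushDemo {

    /** Minimal table row: a timestamp column plus the stats-accuracy value. */
    public static final class Row {
        int lastAccessTime;
        String columnStatsAccurate;

        Row(int lastAccessTime, String columnStatsAccurate) {
            this.lastAccessTime = lastAccessTime;
            this.columnStatsAccurate = columnStatsAccurate;
        }

        Row copy() {
            return new Row(lastAccessTime, columnStatsAccurate);
        }
    }

    static final Map<Long, Row> store = new HashMap<>();

    /** Whole-object write-back: every cached field replaces the stored row (assumed flush behaviour). */
    static void flushWholeObject(long tblId, Row cached) {
        store.put(tblId, cached.copy());
    }

    /** Targeted write: only the timestamp column is touched, like a pure UPDATE statement. */
    static void updateLastAccessTimeOnly(long tblId, int time) {
        store.get(tblId).lastAccessTime = time;
    }

    /** Runs the interleaving and returns the stats value left in the store. */
    public static String run(boolean wholeObject) {
        store.clear();
        store.put(1L, new Row(0, "col_0,col_1"));

        // Session A loads the table and keeps a cached copy.
        Row cached = store.get(1L).copy();

        // Session B commits new column stats in the meantime.
        store.get(1L).columnStatsAccurate = "col_0,col_1,col_4";

        // Session A only wants to bump the last access time.
        if (wholeObject) {
            cached.lastAccessTime = 100;
            flushWholeObject(1L, cached);      // stale stats overwrite session B's update
        } else {
            updateLastAccessTimeOnly(1L, 100); // session B's update survives
        }
        return store.get(1L).columnStatsAccurate;
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // prints col_0,col_1
        System.out.println(run(false)); // prints col_0,col_1,col_4
    }
}
```

In the first run col_4 disappears in the same way col_4 and col_9 do in the example above; in the second run it is kept because only the timestamp column is written.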
> Every iceberg table commits involves alter table operation and it's
> non-blocking ATM.
The Iceberg commit relies on DB transaction atomicity, so it should involve
a row lock behind the scenes, though the lock is quite small (TBL_ID,
PARAM_KEY). If the table has multiple commits at the same time, only one is
allowed to alter `TABLE_PARAMS`.
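
The "only one commit wins" behaviour can be sketched as a compare-and-swap on a single (TBL_ID, PARAM_KEY) entry. This is an in-memory analogy, not the actual HMS or Iceberg code: `ParamCasDemo`, `key` and `commit` are hypothetical names, and the `metadata_location` key and the conditional `UPDATE` mentioned in the comment are assumptions used for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory analogy of a conditional update on one TABLE_PARAMS row.
// Not HMS code; all names are hypothetical.
public class ParamCasDemo {

    public static final ConcurrentHashMap<String, String> tableParams = new ConcurrentHashMap<>();

    /** Builds the composite key, mirroring the (TBL_ID, PARAM_KEY) pair. */
    public static String key(long tblId, String paramKey) {
        return tblId + "/" + paramKey;
    }

    /**
     * Rough analogue of:
     *   UPDATE TABLE_PARAMS SET PARAM_VALUE = ?
     *   WHERE TBL_ID = ? AND PARAM_KEY = ? AND PARAM_VALUE = ?
     * Returns true only if the stored value still equals the expected one.
     */
    public static boolean commit(long tblId, String paramKey, String expected, String updated) {
        return tableParams.replace(key(tblId, paramKey), expected, updated);
    }

    public static void main(String[] args) {
        tableParams.put(key(1L, "metadata_location"), "v1");

        // Two writers both started from "v1"; only the first one can apply its change.
        boolean first = commit(1L, "metadata_location", "v1", "v2-a");
        boolean second = commit(1L, "metadata_location", "v1", "v2-b");

        System.out.println(first);  // prints true
        System.out.println(second); // prints false
        System.out.println(tableParams.get(key(1L, "metadata_location"))); // prints v2-a
    }
}
```

The losing writer has to re-read the current value and retry, which matches the point above that the contention is limited to a single small row.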
> That’s the opposite of what CU experienced on a highly loaded MySQL
> cluster with S4U on the NEXT_TXN_ID table.
That is because every HMS request has to lock the single `NEXT_TXN_ID` row,
whereas here locks of the same level are distributed among different tables.