raptond opened a new pull request #1873:
URL: https://github.com/apache/iceberg/pull/1873


   A constant 50-millisecond sleep between lock-status checks thrashes the Hive 
metastore database when multiple jobs try to commit to the same Iceberg table. 
This fix makes the frequency of checking a WAITING lock's status configurable 
and uses Tasks to back off exponentially.
   
   Every time the lock is checked, the HMS heartbeats the lock record and the 
transaction record. As the number of jobs committing to the same table at the 
same time grows, the HMS eventually fails with the errors below. Being able to 
configure the delay between retries, and to slow retries down further 
exponentially, would help. Thanks.
   
   ```
   MetaException(message:Unable to update transaction database 
org.postgresql.util.PSQLException: ERROR: could not serialize access due to 
read/write dependencies among transactions
   Detail: Reason code: Canceled on identification as a pivot, during write.
   Hint: The transaction might succeed if retried.
   ```
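
   For illustration only, the backoff idea described above can be sketched in 
plain Java. This is a minimal sketch, not the code in this pull request: the 
class `LockBackoffSketch`, its methods, the doubling factor, and every 
parameter name are hypothetical, and the actual change relies on Iceberg's 
Tasks utility rather than a hand-written loop.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: none of these names come from Iceberg or Hive.
final class LockBackoffSketch {
  private LockBackoffSketch() {}

  // Delay before retry number `attempt` (0-based):
  // minWaitMs * scale^attempt, capped at maxWaitMs.
  static long delayMillis(int attempt, long minWaitMs, long maxWaitMs, double scale) {
    double raw = minWaitMs * Math.pow(scale, attempt);
    return (long) Math.min(raw, (double) maxWaitMs);
  }

  // Polls `lockAcquired` until it returns true or `timeoutMs` elapses,
  // doubling the sleep between checks (assumed scale factor of 2.0).
  // Returns true if the lock was acquired before the timeout.
  static boolean waitForLock(
      BooleanSupplier lockAcquired, long minWaitMs, long maxWaitMs, long timeoutMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    int attempt = 0;
    while (true) {
      if (lockAcquired.getAsBoolean()) {
        return true;
      }
      long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
      if (remainingMs <= 0) {
        return false;
      }
      // Never sleep past the deadline.
      long sleepMs = Math.min(delayMillis(attempt, minWaitMs, maxWaitMs, 2.0), remainingMs);
      attempt++;
      Thread.sleep(sleepMs);
    }
  }
}
```

   With a 50 ms minimum and a 5000 ms cap (assumed values), the waits between 
checks grow as 50, 100, 200, 400 ms and so on until they reach the cap, so a 
long-waiting job checks the lock, and triggers the associated heartbeats, far 
less often than with a constant 50 ms sleep.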


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


