[
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980019#comment-15980019
]
Eugene Koifman edited comment on HIVE-16321 at 4/22/17 4:39 PM:
----------------------------------------------------------------
committed to branch-2 (2.4.0)
https://github.com/apache/hive/commit/59faf36952f70d243ba997e54c1349ea2c01e670
was (Author: ekoifman):
committed to branch-2
https://github.com/apache/hive/commit/59faf36952f70d243ba997e54c1349ea2c01e670
> Possible deadlock in metastore with Acid enabled
> ------------------------------------------------
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Blocker
> Attachments: HIVE-16321.01-branch-2.3.patch,
> HIVE-16321.01.branch-2.patch, HIVE-16321.01.patch,
> HIVE-16321.02.branch-2.patch, HIVE-16321.02.patch,
> HIVE-16321.03-branch-2.patch, HIVE-16321.03.patch,
> HIVE-16321.05-branch-2.3.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can
> coordinate their operations. It uses a JDBC Connection to achieve this.
> In some cases this may lead to deadlock. TxnHandler uses a connection pool
> of fixed size. Suppose you have X simultaneous calls to TxnHandler.lock(),
> where X >= the size of the pool. These calls take all connections from the
> pool, so when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat}
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the
> pool is empty: every caller holds a connection while waiting for another one,
> so the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (with size > that of
> the 'primary' connection pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after
> enqueueing the lock with the expectation that the caller will always follow
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]
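
The pool exhaustion described above, and the separate-pool option, can be illustrated with a small standalone sketch. This is not Hive code: a java.util.concurrent.Semaphore stands in for a fixed-size connection pool, and the class name PoolDeadlockSketch, the method run(), and the 200 ms timeout are all hypothetical names and values chosen for illustration. A timed tryAcquire is used so the sketch reports starvation instead of hanging the way a real blocking pool would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDeadlockSketch {

    /**
     * Starts poolSize workers. Each takes one "connection" from the primary
     * pool, then tries to take a second one for the mutex while still holding
     * the first. Returns how many workers obtained the second connection.
     */
    static int run(int poolSize, boolean separateMutexPool) {
        // Assumption: a Semaphore models a fixed-size connection pool.
        Semaphore primary = new Semaphore(poolSize);
        // Either share the primary pool or use a dedicated, larger one.
        Semaphore mutexPool = separateMutexPool ? new Semaphore(poolSize + 1) : primary;

        CountDownLatch allHoldPrimary = new CountDownLatch(poolSize);
        CountDownLatch allAttempted = new CountDownLatch(poolSize);
        AtomicInteger succeeded = new AtomicInteger();

        Thread[] workers = new Thread[poolSize];
        for (int i = 0; i < poolSize; i++) {
            workers[i] = new Thread(() -> {
                try {
                    primary.acquire();              // connection for the operation itself
                    try {
                        allHoldPrimary.countDown();
                        allHoldPrimary.await();     // wait until the primary pool is fully drained
                        // Second connection for the mutex; times out instead of blocking forever.
                        if (mutexPool.tryAcquire(200, TimeUnit.MILLISECONDS)) {
                            succeeded.incrementAndGet();
                            mutexPool.release();
                        }
                        allAttempted.countDown();
                        allAttempted.await();       // keep holding the primary until everyone has tried
                    } finally {
                        primary.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return succeeded.get();
    }

    public static void main(String[] args) {
        System.out.println("shared pool:   " + run(4, false) + " of 4 got the mutex");
        System.out.println("separate pool: " + run(4, true) + " of 4 got the mutex");
    }
}
```

With a shared pool, no worker gets a second connection, because every permit is held by a worker that is itself waiting. With a dedicated mutex pool, all of them do.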
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)