[
https://issues.apache.org/jira/browse/SENTRY-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447837#comment-15447837
]
Colin Ma commented on SENTRY-1422:
----------------------------------
Hi, [~akolb],
1. This problem is caused by Sentry-1321, and HMSFollower is came from that
JIRA. For the stack traces, we only get the log about
SentryStore.alterSentryRoleGrantPrivileges(), but you can find the sentence
"The selected victim is XID : 712" from log. When deadlock is detected, one
transaction will be selected as valid, and the other will be failed. That's why
there is no stack traces for HMSFollower pasted here.
2. How the deadlock happen? Check the code of
SentryStore.alterSentryRoleGrantPrivileges() and
SentryStore.dropPrivilege()(called by HMSFollower), they both get the
privileges first and update the privileges. The involved tables are
SENTRY_ROLE_DB_PRIVILEGE_MAP and SENTRY_DB_PRIVILEGE. Get the privileges will
get the shared lock and update will get the exclusive lock, these 2 type locks
are conflict. If the transaction is executed as following order:
a) SentryStore.alterSentryRoleGrantPrivileges() get the privileges (shared
lock get, ok)
b) SentryStore.dropPrivilege() get the privileges (shared lock get, ok)
c) SentryStore.alterSentryRoleGrantPrivileges(exclusive lock get,
failed, hold by b)
d) SentryStore.dropPrivilege(exclusive lock get, failed, hold by a)
Check the log again, we need update both SENTRY_ROLE_DB_PRIVILEGE_MAP and
SENTRY_DB_PRIVILEGE in 2 transactions, but the transaction 1 get the shared
lock for SENTRY_ROLE_DB_PRIVILEGE_MAP and exclusive lock for
SENTRY_DB_PRIVILEGE, and transaction 2 is opposite, that cause the deadlock:
{code}
Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
Waiting XID : {712, S} , SENTRY, SELECT
'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON
A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
Granted XID : {704, X}
Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
Waiting XID : {704, S} , SENTRY, SELECT
'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND A0.DB_NAME
= 'db_2' AND A0."ACTION" = 'create'
Granted XID : {712, X}
{code}
3. For transactions and concurrent, based on the point 2, I think it's
impossible for the current implementation.
4.Retry the failed transaction can be an enhancement for Sentry later, but for
this problem, we know how the deadlock happens, I think we should fix this.
> JDO deadlocks while processing grant while a background thread processes
> Notificationlogs
> -----------------------------------------------------------------------------------------
>
> Key: SENTRY-1422
> URL: https://issues.apache.org/jira/browse/SENTRY-1422
> Project: Sentry
> Issue Type: Sub-task
> Components: Hdfs Plugin
> Reporter: Sravya Tirukkovalur
> Assignee: Colin Ma
> Fix For: sentry-ha-redesign
>
> Attachments: SENTRY-1422.008-sentry-ha-redesign.patch,
> SENTRY-1422.009-sentry-ha-redesign.patch,
> SENTRY-1422.010-sentry-ha-redesign.patch
>
>
> As I was working on Sentry-1321. I see that test case
> TestDbPrivilegeCleanupOnDrop#testDropObjects fails while "Granting select on
> table" with following deadlock.
> {code}
> java.sql.SQLException: Error while processing statement: FAILED: Execution
> Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.SentryGrantRevokeTask. Unknown error for
> request: TAlterSentryRoleGrantPrivilegeRequest(protocol_version:2,
> requestorUserName:admin1, roleName:select_tbl1,
> privileges:[TSentryPrivilege(privilegeScope:TABLE, serverName:server1,
> dbName:db_2, tableName:tb_1, URI:, action:select, createTime:1469562185573,
> grantOption:FALSE, columnName:)]), message: Iteration request failed : SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
> FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?. Server
> Stacktrace: javax.jdo.JDODataStoreException: Iteration request failed :
> SELECT 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
> FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
> at
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
> at
> org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:165)
> at
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitTransaction(SentryStore.java:279)
> at
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitUpdateTransaction(SentryStore.java:261)
> at
> org.apache.sentry.provider.db.service.persistent.SentryStore.alterSentryRoleGrantPrivileges(SentryStore.java:458)
> at
> org.apache.sentry.provider.db.service.thrift.SentryPolicyStoreProcessor.alter_sentry_role_grant_privilege(SentryPolicyStoreProcessor.java:261)
> at
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1237)
> at
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1222)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at
> org.apache.sentry.provider.db.service.thrift.SentryProcessorWrapper.process(SentryProcessorWrapper.java:35)
> at
> org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> NestedThrowablesStackTrace:
> java.sql.SQLTransactionRollbackException: A lock could not be obtained due to
> a deadlock, cycle of locks and waiters is:
> Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
> Waiting XID : {712, S} , SENTRY, SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
> FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
> Granted XID : {704, X}
> Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
> Waiting XID : {704, S} , SENTRY, SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
> FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND
> A0.DB_NAME = 'db_2' AND A0."ACTION" = 'create'
> Granted XID : {712, X}
> . The selected victim is XID : 712.
> at
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown
> Source)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown
> Source)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown
> Source)
> at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown
> Source)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown
> Source)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(Unknown Source)
> at
> com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:172)
> at
> org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381)
> at
> org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:559)
> at
> org.datanucleus.store.rdbms.scostore.JoinSetStore.iterator(JoinSetStore.java:967)
> at org.datanucleus.store.types.backed.Set.loadFromStore(Set.java:323)
> at org.datanucleus.store.types.backed.Set.initialise(Set.java:272)
> at
> org.datanucleus.store.types.SCOUtils.createSCOWrapper(SCOUtils.java:256)
> at
> org.datanucleus.store.types.SCOUtils.newSCOInstance(SCOUtils.java:142)
> at
> org.datanucleus.store.rdbms.mapping.java.AbstractContainerMapping.replaceFieldWithWrapper(AbstractContainerMapping.java:399)
> at
> org.datanucleus.store.rdbms.mapping.java.AbstractContainerMapping.postFetch(AbstractContainerMapping.java:417)
> at
> org.datanucleus.store.rdbms.request.FetchRequest.execute(FetchRequest.java:420)
> at
> org.datanucleus.store.rdbms.RDBMSPersistenceHandler.fetchObject(RDBMSPersistenceHandler.java:324)
> at
> org.datanucleus.state.AbstractStateManager.loadFieldsFromDatastore(AbstractStateManager.java:1122)
> at
> org.datanucleus.state.JDOStateManager.loadUnloadedFieldsInFetchPlan(JDOStateManager.java:3000)
> at
> org.datanucleus.state.AbstractStateManager.loadFieldsInFetchPlan(AbstractStateManager.java:1064)
> at
> org.datanucleus.ExecutionContextImpl.performDetachAllOnTxnEndPreparation(ExecutionContextImpl.java:4574)
> at
> org.datanucleus.ExecutionContextImpl.preCommit(ExecutionContextImpl.java:4259)
> at
> org.datanucleus.ExecutionContextImpl.transactionPreCommit(ExecutionContextImpl.java:654)
> at
> org.datanucleus.TransactionImpl.internalPreCommit(TransactionImpl.java:379)
> at org.datanucleus.TransactionImpl.commit(TransactionImpl.java:268)
> at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:98)
> at
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitTransaction(SentryStore.java:279)
> at
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitUpdateTransaction(SentryStore.java:261)
> at
> org.apache.sentry.provider.db.service.persistent.SentryStore.alterSentryRoleGrantPrivileges(SentryStore.java:458)
> at
> org.apache.sentry.provider.db.service.thrift.SentryPolicyStoreProcessor.alter_sentry_role_grant_privilege(SentryPolicyStoreProcessor.java:261)
> at
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1237)
> at
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1222)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at
> org.apache.sentry.provider.db.service.thrift.SentryProcessorWrapper.process(SentryProcessorWrapper.java:35)
> at
> org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.sql.SQLException: A lock could not be obtained due to a
> deadlock, cycle of locks and waiters is:
> Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
> Waiting XID : {712, S} , SENTRY, SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
> FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
> Granted XID : {704, X}
> Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
> Waiting XID : {704, S} , SENTRY, SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
> FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND
> A0.DB_NAME = 'db_2' AND A0."ACTION" = 'create'
> Granted XID : {712, X}
> . The selected victim is XID : 712.
> at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown
> Source)
> ... 44 more
> Caused by: ERROR 40001: A lock could not be obtained due to a deadlock, cycle
> of locks and waiters is:
> Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
> Waiting XID : {712, S} , SENTRY, SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
> FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
> Granted XID : {704, X}
> Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
> Waiting XID : {704, S} , SENTRY, SELECT
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS
> NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
> FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND
> A0.DB_NAME = 'db_2' AND A0."ACTION" = 'create'
> Granted XID : {712, X}
> . The selected victim is XID : 712.
> at org.apache.derby.iapi.error.StandardException.newException(Unknown
> Source)
> at org.apache.derby.impl.services.locks.Deadlock.buildException(Unknown
> Source)
> at
> org.apache.derby.impl.services.locks.ConcurrentLockSet.lockObject(Unknown
> Source)
> at
> org.apache.derby.impl.services.locks.ConcurrentLockSet.zeroDurationLockObject(Unknown
> Source)
> at
> org.apache.derby.impl.services.locks.AbstractPool.zeroDurationlockObject(Unknown
> Source)
> at
> org.apache.derby.impl.services.locks.ConcurrentPool.zeroDurationlockObject(Unknown
> Source)
> at
> org.apache.derby.impl.store.raw.xact.RowLocking2nohold.lockRecordForRead(Unknown
> Source)
> at
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
> at
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
> at
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3.lockRowOnPage(Unknown
> Source)
> at
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3._lockScanRow(Unknown
> Source)
> at
> org.apache.derby.impl.store.access.btree.index.B2IRowLockingRR.lockScanRow(Unknown
> Source)
> at
> org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(Unknown
> Source)
> at
> org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(Unknown
> Source)
> at
> org.apache.derby.impl.sql.execute.BulkTableScanResultSet.reloadArray(Unknown
> Source)
> at
> org.apache.derby.impl.sql.execute.BulkTableScanResultSet.getNextRowCore(Unknown
> Source)
> at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(Unknown
> Source)
> at
> org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(Unknown
> Source)
> at
> org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown Source)
> at
> org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
> at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown
> Source)
> ... 38 more
> at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
> at
> org.apache.sentry.tests.e2e.dbprovider.TestDbPrivilegeCleanupOnDrop.setupPrivileges(TestDbPrivilegeCleanupOnDrop.java:273)
> at
> org.apache.sentry.tests.e2e.dbprovider.TestDbPrivilegeCleanupOnDrop.testDropObjects(TestDbPrivilegeCleanupOnDrop.java:104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
> at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
> at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
> at
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)