[ 
https://issues.apache.org/jira/browse/SENTRY-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447837#comment-15447837
 ] 

Colin Ma commented on SENTRY-1422:
----------------------------------

Hi, [~akolb],
1. This problem is caused by Sentry-1321, and HMSFollower is came from that 
JIRA. For the stack traces, we only get the log about 
SentryStore.alterSentryRoleGrantPrivileges(), but you can find the sentence 
"The selected victim is XID : 712" from log. When deadlock is detected, one 
transaction will be selected as valid, and the other will be failed. That's why 
there is no stack traces for HMSFollower pasted here.
2.  How the deadlock happen? Check the code of 
SentryStore.alterSentryRoleGrantPrivileges() and 
SentryStore.dropPrivilege()(called by HMSFollower), they both get the 
privileges first and update the privileges. The involved tables are 
SENTRY_ROLE_DB_PRIVILEGE_MAP and SENTRY_DB_PRIVILEGE. Get the privileges will 
get the shared lock and update will get the exclusive lock, these 2 type locks 
are conflict. If the transaction is executed as following order:
    a) SentryStore.alterSentryRoleGrantPrivileges() get the privileges (shared 
lock get, ok)
        b) SentryStore.dropPrivilege() get the privileges (shared lock get, ok)
        c) SentryStore.alterSentryRoleGrantPrivileges(exclusive lock get, 
failed, hold by b)
        d) SentryStore.dropPrivilege(exclusive lock get, failed, hold by a)
        
    Check the log again, we need update both SENTRY_ROLE_DB_PRIVILEGE_MAP and 
SENTRY_DB_PRIVILEGE in 2 transactions, but the transaction 1 get the shared 
lock for SENTRY_ROLE_DB_PRIVILEGE_MAP and exclusive lock for 
SENTRY_DB_PRIVILEGE, and  transaction 2 is opposite, that cause the deadlock:
{code}
Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
  Waiting XID : {712, S} , SENTRY, SELECT 
'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
 FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON 
A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
  Granted XID : {704, X} 
Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
  Waiting XID : {704, S} , SENTRY, SELECT 
'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
 FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND A0.DB_NAME 
= 'db_2' AND A0."ACTION" = 'create'
  Granted XID : {712, X}
{code}

3. For transactions and concurrent, based on the point 2, I think it's 
impossible for the current implementation.
4.Retry the failed transaction can be an enhancement for Sentry later, but for 
this problem, we know how the deadlock happens, I think we should fix this. 

> JDO deadlocks while processing grant while a background thread processes 
> Notificationlogs
> -----------------------------------------------------------------------------------------
>
>                 Key: SENTRY-1422
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1422
>             Project: Sentry
>          Issue Type: Sub-task
>          Components: Hdfs Plugin
>            Reporter: Sravya Tirukkovalur
>            Assignee: Colin Ma
>             Fix For: sentry-ha-redesign
>
>         Attachments: SENTRY-1422.008-sentry-ha-redesign.patch, 
> SENTRY-1422.009-sentry-ha-redesign.patch, 
> SENTRY-1422.010-sentry-ha-redesign.patch
>
>
> As I was working on Sentry-1321. I see that test case 
> TestDbPrivilegeCleanupOnDrop#testDropObjects fails while "Granting select on 
> table" with following deadlock.
> {code}
> java.sql.SQLException: Error while processing statement: FAILED: Execution 
> Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.SentryGrantRevokeTask. Unknown error for 
> request: TAlterSentryRoleGrantPrivilegeRequest(protocol_version:2, 
> requestorUserName:admin1, roleName:select_tbl1, 
> privileges:[TSentryPrivilege(privilegeScope:TABLE, serverName:server1, 
> dbName:db_2, tableName:tb_1, URI:, action:select, createTime:1469562185573, 
> grantOption:FALSE, columnName:)]), message: Iteration request failed : SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
>  FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON 
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?. Server 
> Stacktrace: javax.jdo.JDODataStoreException: Iteration request failed : 
> SELECT 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
>  FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON 
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
>       at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
>       at 
> org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:165)
>       at 
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitTransaction(SentryStore.java:279)
>       at 
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitUpdateTransaction(SentryStore.java:261)
>       at 
> org.apache.sentry.provider.db.service.persistent.SentryStore.alterSentryRoleGrantPrivileges(SentryStore.java:458)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryPolicyStoreProcessor.alter_sentry_role_grant_privilege(SentryPolicyStoreProcessor.java:261)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1237)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1222)
>       at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>       at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryProcessorWrapper.process(SentryProcessorWrapper.java:35)
>       at 
> org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
>       at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> NestedThrowablesStackTrace:
> java.sql.SQLTransactionRollbackException: A lock could not be obtained due to 
> a deadlock, cycle of locks and waiters is:
> Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
>   Waiting XID : {712, S} , SENTRY, SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
>  FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON 
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
>   Granted XID : {704, X} 
> Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
>   Waiting XID : {704, S} , SENTRY, SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
>  FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND 
> A0.DB_NAME = 'db_2' AND A0."ACTION" = 'create'
>   Granted XID : {712, X} 
> . The selected victim is XID : 712.
>       at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
>       at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown 
> Source)
>       at 
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(Unknown Source)
>       at 
> com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:172)
>       at 
> org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381)
>       at 
> org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:559)
>       at 
> org.datanucleus.store.rdbms.scostore.JoinSetStore.iterator(JoinSetStore.java:967)
>       at org.datanucleus.store.types.backed.Set.loadFromStore(Set.java:323)
>       at org.datanucleus.store.types.backed.Set.initialise(Set.java:272)
>       at 
> org.datanucleus.store.types.SCOUtils.createSCOWrapper(SCOUtils.java:256)
>       at 
> org.datanucleus.store.types.SCOUtils.newSCOInstance(SCOUtils.java:142)
>       at 
> org.datanucleus.store.rdbms.mapping.java.AbstractContainerMapping.replaceFieldWithWrapper(AbstractContainerMapping.java:399)
>       at 
> org.datanucleus.store.rdbms.mapping.java.AbstractContainerMapping.postFetch(AbstractContainerMapping.java:417)
>       at 
> org.datanucleus.store.rdbms.request.FetchRequest.execute(FetchRequest.java:420)
>       at 
> org.datanucleus.store.rdbms.RDBMSPersistenceHandler.fetchObject(RDBMSPersistenceHandler.java:324)
>       at 
> org.datanucleus.state.AbstractStateManager.loadFieldsFromDatastore(AbstractStateManager.java:1122)
>       at 
> org.datanucleus.state.JDOStateManager.loadUnloadedFieldsInFetchPlan(JDOStateManager.java:3000)
>       at 
> org.datanucleus.state.AbstractStateManager.loadFieldsInFetchPlan(AbstractStateManager.java:1064)
>       at 
> org.datanucleus.ExecutionContextImpl.performDetachAllOnTxnEndPreparation(ExecutionContextImpl.java:4574)
>       at 
> org.datanucleus.ExecutionContextImpl.preCommit(ExecutionContextImpl.java:4259)
>       at 
> org.datanucleus.ExecutionContextImpl.transactionPreCommit(ExecutionContextImpl.java:654)
>       at 
> org.datanucleus.TransactionImpl.internalPreCommit(TransactionImpl.java:379)
>       at org.datanucleus.TransactionImpl.commit(TransactionImpl.java:268)
>       at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:98)
>       at 
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitTransaction(SentryStore.java:279)
>       at 
> org.apache.sentry.provider.db.service.persistent.SentryStore.commitUpdateTransaction(SentryStore.java:261)
>       at 
> org.apache.sentry.provider.db.service.persistent.SentryStore.alterSentryRoleGrantPrivileges(SentryStore.java:458)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryPolicyStoreProcessor.alter_sentry_role_grant_privilege(SentryPolicyStoreProcessor.java:261)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1237)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryPolicyService$Processor$alter_sentry_role_grant_privilege.getResult(SentryPolicyService.java:1222)
>       at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>       at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>       at 
> org.apache.sentry.provider.db.service.thrift.SentryProcessorWrapper.process(SentryProcessorWrapper.java:35)
>       at 
> org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
>       at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.sql.SQLException: A lock could not be obtained due to a 
> deadlock, cycle of locks and waiters is:
> Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
>   Waiting XID : {712, S} , SENTRY, SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
>  FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON 
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
>   Granted XID : {704, X} 
> Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
>   Waiting XID : {704, S} , SENTRY, SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
>  FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND 
> A0.DB_NAME = 'db_2' AND A0."ACTION" = 'create'
>   Granted XID : {712, X} 
> . The selected victim is XID : 712.
>       at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown
>  Source)
>       ... 44 more
> Caused by: ERROR 40001: A lock could not be obtained due to a deadlock, cycle 
> of locks and waiters is:
> Lock : ROW, SENTRY_ROLE_DB_PRIVILEGE_MAP, (1,10)
>   Waiting XID : {712, S} , SENTRY, SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A1.URI,A1."ACTION",A1."COLUMN_NAME",A1.CREATE_TIME,A1.DB_NAME,A1.WITH_GRANT_OPTION,A1.PRIVILEGE_SCOPE,A1."SERVER_NAME",A1."TABLE_NAME",A1.DB_PRIVILEGE_ID
>  FROM SENTRY_ROLE_DB_PRIVILEGE_MAP A0 INNER JOIN SENTRY_DB_PRIVILEGE A1 ON 
> A0.DB_PRIVILEGE_ID = A1.DB_PRIVILEGE_ID WHERE A0.ROLE_ID = ?
>   Granted XID : {704, X} 
> Lock : ROW, SENTRY_DB_PRIVILEGE, (1,11)
>   Waiting XID : {704, S} , SENTRY, SELECT 
> 'org.apache.sentry.provider.db.service.model.MSentryPrivilege' AS 
> NUCLEUS_TYPE,A0.URI,A0."ACTION",A0."COLUMN_NAME",A0.CREATE_TIME,A0.DB_NAME,A0.WITH_GRANT_OPTION,A0.PRIVILEGE_SCOPE,A0."SERVER_NAME",A0."TABLE_NAME",A0.DB_PRIVILEGE_ID
>  FROM SENTRY_DB_PRIVILEGE A0 WHERE A0."SERVER_NAME" = 'server1' AND 
> A0.DB_NAME = 'db_2' AND A0."ACTION" = 'create'
>   Granted XID : {712, X} 
> . The selected victim is XID : 712.
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown 
> Source)
>       at org.apache.derby.impl.services.locks.Deadlock.buildException(Unknown 
> Source)
>       at 
> org.apache.derby.impl.services.locks.ConcurrentLockSet.lockObject(Unknown 
> Source)
>       at 
> org.apache.derby.impl.services.locks.ConcurrentLockSet.zeroDurationLockObject(Unknown
>  Source)
>       at 
> org.apache.derby.impl.services.locks.AbstractPool.zeroDurationlockObject(Unknown
>  Source)
>       at 
> org.apache.derby.impl.services.locks.ConcurrentPool.zeroDurationlockObject(Unknown
>  Source)
>       at 
> org.apache.derby.impl.store.raw.xact.RowLocking2nohold.lockRecordForRead(Unknown
>  Source)
>       at 
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
>       at 
> org.apache.derby.impl.store.access.heap.HeapController.lockRow(Unknown Source)
>       at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3.lockRowOnPage(Unknown
>  Source)
>       at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLocking3._lockScanRow(Unknown
>  Source)
>       at 
> org.apache.derby.impl.store.access.btree.index.B2IRowLockingRR.lockScanRow(Unknown
>  Source)
>       at 
> org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(Unknown 
> Source)
>       at 
> org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(Unknown 
> Source)
>       at 
> org.apache.derby.impl.sql.execute.BulkTableScanResultSet.reloadArray(Unknown 
> Source)
>       at 
> org.apache.derby.impl.sql.execute.BulkTableScanResultSet.getNextRowCore(Unknown
>  Source)
>       at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(Unknown 
> Source)
>       at 
> org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(Unknown 
> Source)
>       at 
> org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown Source)
>       at 
> org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
>       at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown 
> Source)
>       ... 38 more
>       at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
>       at 
> org.apache.sentry.tests.e2e.dbprovider.TestDbPrivilegeCleanupOnDrop.setupPrivileges(TestDbPrivilegeCleanupOnDrop.java:273)
>       at 
> org.apache.sentry.tests.e2e.dbprovider.TestDbPrivilegeCleanupOnDrop.testDropObjects(TestDbPrivilegeCleanupOnDrop.java:104)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to