[jira] [Commented] (SENTRY-2476) Optimize deleting specific paths for objects
[ https://issues.apache.org/jira/browse/SENTRY-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715974#comment-16715974 ]

Hadoop QA commented on SENTRY-2476:
---

Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12951284/SENTRY-2476.04.patch against master.

{color:green}Overall:{color} +1 all checks pass

{color:green}SUCCESS:{color} all tests passed

Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/4288/console

This message is automatically generated.

> Optimize deleting specific paths for objects
>
>
> Key: SENTRY-2476
> URL: https://issues.apache.org/jira/browse/SENTRY-2476
> Project: Sentry
> Issue Type: Bug
> Reporter: Arjun Mishra
> Assignee: Arjun Mishra
> Priority: Major
> Attachments: SENTRY-2476.01.patch, SENTRY-2476.02.patch,
> SENTRY-2476.03.patch, SENTRY-2476.04.patch
>
>
> Right now when we process a drop partition event, we fetch each path object
> for paths_mapping object then find the one we want to delete and then delete
> it. We should avoid fetching all objects and directly delete the path that
> needs to be deleted

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
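The change described in the issue above can be illustrated with a small in-memory sketch. This is not Sentry code: the class `PathDeleteSketch`, its methods, and the visit counter are hypothetical, and a plain Java map stands in for the datastore. It only contrasts the two access patterns, "fetch every path for the object, then find and delete one" versus "delete the one path by key".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for a path mapping store; every name here is hypothetical. */
final class PathDeleteSketch {
    // authz object name (for example "dbx.tbly") -> paths registered for it
    private final Map<String, Set<String>> pathsByObject = new HashMap<>();
    // number of path entries touched, used as a proxy for objects fetched
    private int visited = 0;

    void addPath(String authzObj, String path) {
        pathsByObject.computeIfAbsent(authzObj, k -> new LinkedHashSet<>()).add(path);
    }

    /** Fetch-then-find: copies every path of the object before deleting one. */
    boolean deleteByScan(String authzObj, String path) {
        Set<String> paths = pathsByObject.get(authzObj);
        if (paths == null) {
            return false;
        }
        List<String> fetched = new ArrayList<>(paths); // "fetch" all paths
        visited += fetched.size();
        for (String candidate : fetched) {
            if (candidate.equals(path)) {
                return paths.remove(candidate);
            }
        }
        return false;
    }

    /** Direct delete: touches only the entry identified by the path name. */
    boolean deleteDirect(String authzObj, String path) {
        Set<String> paths = pathsByObject.get(authzObj);
        if (paths == null) {
            return false;
        }
        visited++; // only the targeted entry is looked up
        return paths.remove(path);
    }

    int visited() {
        return visited;
    }

    void resetCounter() {
        visited = 0;
    }

    int pathCount(String authzObj) {
        Set<String> paths = pathsByObject.get(authzObj);
        return paths == null ? 0 : paths.size();
    }
}
```

With many partitions under one table, `deleteByScan` touches every registered path per drop event, while `deleteDirect` touches one. In a real JDO-backed store the equivalent saving would come from querying by path name rather than loading the whole collection, but how that is wired up is outside this sketch.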
[jira] [Commented] (SENTRY-2476) Optimize deleting specific paths for objects
[ https://issues.apache.org/jira/browse/SENTRY-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16715474#comment-16715474 ]

Hadoop QA commented on SENTRY-2476:
---

Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12951143/SENTRY-2476.03.patch against master.

{color:green}Overall:{color} +1 all checks pass

{color:green}SUCCESS:{color} all tests passed

Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/4286/console

This message is automatically generated.
[jira] [Commented] (SENTRY-2476) Optimize deleting specific paths for objects
[ https://issues.apache.org/jira/browse/SENTRY-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714307#comment-16714307 ]

Hadoop QA commented on SENTRY-2476:
---

Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12951143/SENTRY-2476.03.patch against master.

{color:red}Overall:{color} -1 due to 4 errors

{color:red}ERROR:{color} mvn test exited 1
{color:red}ERROR:{color} Failed: org.apache.sentry.tests.e2e.hdfs.TestHDFSIntegrationWithHA
{color:red}ERROR:{color} Failed: org.apache.sentry.tests.e2e.hdfs.TestHDFSIntegrationWithHA
{color:red}ERROR:{color} Failed: org.apache.sentry.tests.e2e.hdfs.TestHDFSIntegrationWithHA

Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/4284/console

This message is automatically generated.
[jira] [Commented] (SENTRY-2476) Optimize deleting specific paths for objects
[ https://issues.apache.org/jira/browse/SENTRY-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713452#comment-16713452 ]

Hadoop QA commented on SENTRY-2476:
---

Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12951057/SENTRY-2476.01.patch against master.

{color:green}Overall:{color} +1 all checks pass

{color:green}SUCCESS:{color} all tests passed

Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/4281/console

This message is automatically generated.
[jira] [Commented] (SENTRY-2476) Optimize deleting specific paths for objects
[ https://issues.apache.org/jira/browse/SENTRY-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713420#comment-16713420 ]

Hadoop QA commented on SENTRY-2476:
---

Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12951065/SENTRY-2476.02.patch against master.

{color:red}Overall:{color} -1 due to 3 errors

{color:red}ERROR:{color} mvn test exited 1
{color:red}ERROR:{color} Failed: org.apache.sentry.provider.db.service.persistent.TestSentryStore
{color:red}ERROR:{color} Failed: org.apache.sentry.provider.db.service.persistent.TestSentryStore

Console output: https://builds.apache.org/job/PreCommit-SENTRY-Build/4282/console

This message is automatically generated.
[jira] [Commented] (SENTRY-2476) Optimize deleting specific paths for objects
[ https://issues.apache.org/jira/browse/SENTRY-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713361#comment-16713361 ]

Arjun Mishra commented on SENTRY-2476:
--

Found a new solution that deletes paths and processes events in less than 10 ms. The optimization is aimed at not caching objects; that caching was primarily triggered by the *_paths_* attribute in the MAuthzPathsMapping class. See below for a breakdown of the steps:

# Get mapping objects for object name
{noformat}
2018-12-07 14:05:12,670 INFO org.apache.sentry.provider.db.service.persistent.SentryStore: Start getAllMAuthzPathsMappingCore for object dbx.tbly
2018-12-07 14:05:12,670 DEBUG DataNucleus.Persistence: ExecutionContext.internalFlush() process started using ordered flush - 2 enlisted objects
2018-12-07 14:05:12,670 DEBUG DataNucleus.Persistence: ExecutionContext.internalFlush() process finished
2018-12-07 14:05:12,671 DEBUG DataNucleus.Connection: Connection found in the pool : org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl@1ad3c7d5 [conn=com.jolbox.bonecp.ConnectionHandle@16c2b4fe, commitOnRelease=false, closeOnRelease=false, closeOnTxnEnd=true] for key=org.datanucleus.ExecutionContextThreadedImpl@50b469b4 in factory=ConnectionFactory:tx[org.datanucleus.store.rdbms.ConnectionFactoryImpl@5d5d9e5]
2018-12-07 14:05:12,671 DEBUG DataNucleus.Datastore.Native: BATCH [INSERT INTO `SENTRY_HMS_NOTIFICATION_ID` (`NOTIFICATION_ID`) VALUES (<21958>)]
2018-12-07 14:05:12,671 DEBUG DataNucleus.Datastore: Execution Time = 0 ms (number of rows = [1]) on PreparedStatement "org.datanucleus.store.rdbms.ParamLoggingPreparedStatement@496079d2"
2018-12-07 14:05:12,671 DEBUG DataNucleus.Datastore: Closing PreparedStatement "com.jolbox.bonecp.PreparedStatementHandle@254a68f2"
2018-12-07 14:05:12,672 DEBUG DataNucleus.Datastore.Native: SELECT 'org.apache.sentry.provider.db.service.model.MAuthzPathsMapping' AS NUCLEUS_TYPE,`A0`.`AUTHZ_OBJ_NAME`,`A0`.`AUTHZ_SNAPSHOT_ID`,`A0`.`CREATE_TIME_MS`,`A0`.`AUTHZ_OBJ_ID` FROM `AUTHZ_PATHS_MAPPING` `A0` WHERE `A0`.`AUTHZ_OBJ_NAME` = <'dbx.tbly'>
2018-12-07 14:05:12,672 DEBUG DataNucleus.Datastore.Retrieve: Execution Time = 0 ms
2018-12-07 14:05:12,672 INFO org.apache.sentry.provider.db.service.persistent.SentryStore: End getAllMAuthzPathsMappingCore for object dbx.tbly
{noformat}
# Get MPath objects for path name
{noformat}
2018-12-07 14:05:12,672 INFO org.apache.sentry.provider.db.service.persistent.SentryStore: Start getMAuthzPathsCore for path [user/hive/warehouse/dbx.db/tbly/col10=1]
2018-12-07 14:05:12,672 DEBUG DataNucleus.Connection: Connection found in the pool : org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl@1ad3c7d5 [conn=com.jolbox.bonecp.ConnectionHandle@16c2b4fe, commitOnRelease=false, closeOnRelease=false, closeOnTxnEnd=true] for key=org.datanucleus.ExecutionContextThreadedImpl@50b469b4 in factory=ConnectionFactory:tx[org.datanucleus.store.rdbms.ConnectionFactoryImpl@5d5d9e5]
2018-12-07 14:05:12,672 DEBUG DataNucleus.Datastore: Closing PreparedStatement "com.jolbox.bonecp.PreparedStatementHandle@2be897c0"
2018-12-07 14:05:12,672 DEBUG DataNucleus.Datastore.Native: SELECT 'org.apache.sentry.provider.db.service.model.MPath' AS NUCLEUS_TYPE,`A0`.`PATH_NAME`,`A0`.`PATH_ID` FROM `AUTHZ_PATH` `A0` WHERE `A0`.`PATH_NAME` = <'user/hive/warehouse/dbx.db/tbly/col10=1'>
2018-12-07 14:05:12,676 DEBUG DataNucleus.Datastore.Retrieve: Execution Time = 4 ms
2018-12-07 14:05:12,676 DEBUG DataNucleus.Persistence: Retrieved object with OID "11255[OID]org.apache.sentry.provider.db.service.model.MPath"
2018-12-07 14:05:12,676 DEBUG DataNucleus.Cache: Object with id "11255[OID]org.apache.sentry.provider.db.service.model.MPath" not found in Level 1 cache [cache size = 2]
2018-12-07 14:05:12,676 DEBUG DataNucleus.Cache: Object "org.apache.sentry.provider.db.service.model.MPath@48707c3" (id="11255[OID]org.apache.sentry.provider.db.service.model.MPath") added to Level 1 cache (loadedFlags="[N]")
2018-12-07 14:05:12,676 INFO org.apache.sentry.provider.db.service.persistent.SentryStore: End getMAuthzPathsCore for path [user/hive/warehouse/dbx.db/tbly/col10=1]. Size 1
{noformat}
# Delete the path objects
{noformat}
2018-12-07 14:05:12,676 INFO org.apache.sentry.provider.db.service.persistent.SentryStore: Start deleting all objects for path [user/hive/warehouse/dbx.db/tbly/col10=1]
2018-12-07 14:05:12,676 DEBUG DataNucleus.Persistence: Deleting object from persistence : "org.apache.sentry.provider.db.service.model.MPath@48707c3"
2018-12-07 14:05:12,676 DEBUG DataNucleus.Lifecycle: Object "org.apache.sentry.provider.db.service.model.MPath@48707c3" (id="11255[OID]org.apache.sentry.provider.db.service.model.MPath") has a lifecycle change : "HOLLOW"->"P_DELETED"
2018-12-07 14:05:12,676 DEBUG DataNucleus.Transaction: Object "org.apache.sentry.provider.db.service.model.MPath@48707c3"