Re: Review Request 69076: SENTRY-2301: Log where sentry stands in the snapshot fetching process, periodically

2018-10-18 Thread Arjun Mishra via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69076/
---

(Updated Oct. 18, 2018, 9:29 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Repository: sentry


Description
---

When sentry is fetching snapshot from HMS, it should log periodically on where 
it stands in the snapshot process. This will help person debugging it and help 
him understand the progress.

 

This is important as this process could take magnitude of minutes when the HMS 
data is huge.


Diffs (updated)
-

  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
 214d78c53 
  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/service/thrift/FullUpdateInitializer.java
 3e27d1bbe 
  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
 4baeb6725 


Diff: https://reviews.apache.org/r/69076/diff/2/

Changes: https://reviews.apache.org/r/69076/diff/1-2/


Testing
---


Thanks,

Arjun Mishra



Review Request 69076: SENTRY-2301: Log where sentry stands in the snapshot fetching process, periodically

2018-10-18 Thread Arjun Mishra via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69076/
---

Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Repository: sentry


Description
---

When sentry is fetching snapshot from HMS, it should log periodically on where 
it stands in the snapshot process. This will help person debugging it and help 
him understand the progress.

 

This is important as this process could take magnitude of minutes when the HMS 
data is huge.


Diffs
-

  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
 214d78c53 
  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/service/thrift/FullUpdateInitializer.java
 3e27d1bbe 
  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
 4baeb6725 


Diff: https://reviews.apache.org/r/69076/diff/1/


Testing
---


Thanks,

Arjun Mishra



Re: Review Request 68968: SENTRY-2425: Add metric to track the time taken to update the owner privilege

2018-10-18 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68968/#review209762
---




sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryPolicyStoreProcessor.java
Lines 1515 (patched)


Sergio is moving owner privilege update from post event listener to 
notification processor. So your code here will be affected


- Na Li


On Oct. 18, 2018, 5:38 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68968/
> ---
> 
> (Updated Oct. 18, 2018, 5:38 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2425
> https://issues.apache.org/jira/browse/SENTRY-2425
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> New metric should be added to track the time taken by sentry server in 
> granting/updating owner privileges.
> 
> 
> Diffs
> -
> 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
>  214d78c5332d4113e440b76077bb7b2a6dae81d6 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryPolicyStoreProcessor.java
>  709434c388689b63d5db7d71cd6cc47952d647bf 
> 
> 
> Diff: https://reviews.apache.org/r/68968/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 68973: SENTRY-2305: Optimize time taken for persistence HMS snapshot by persisting in parallel

2018-10-18 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68973/#review209760
---




sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
Lines 3427 (patched)


can you log the value of authzPaths.size() and 
getMAuthzPathsMappingCountCore(pm, nextSnapshotID)? That could give us a sense 
how big a difference they are.



sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
Lines 5001 (patched)


why do you have this field? I don't see it is used in this class.



sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
Lines 5020 (patched)


can you log the value of snapshotID as well?


- Na Li


On Oct. 10, 2018, 11:04 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68973/
> ---
> 
> (Updated Oct. 10, 2018, 11:04 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2305
> https://issues.apache.org/jira/browse/SENTRY-2305
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> I have considered multiple options. Persisting in batches is not an option 
> with out changing the schema as the data nucleus does not persist row in 
> batches for tables which have foreign key on other tables.
> 
> I see that best option is to persist the paths in parallel. It gave good 
> results.
> 
> Solution Approach:
> 
> I have used a thread pool to persist the snapshot. Size of this thread pool 
> is configurable. Paths for each object database/table are submitted to this 
> thread pool. If for reason some of the paths are not pesisted, snapshot is 
> removed and exception is throw back.
> 
>   This patch along with SENTRY-2423 was 5 times faster when tested with below.
> 
>  
> 
> Object Type   Count
> Databases 209
> Tables2100
> Partitions24
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/service/common/ServiceConstants.java
>  092060c450c6a906850630cb10454737157af5fe 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  7a736ca9604eb0bb182a159b5a2aed274275c16e 
>   
> sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  0d62941a7bd45e0d8a67b5d95fdb39a8801f5a26 
>   
> sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  4a9afe303672baff39be01d4f190034b2bfb75fe 
> 
> 
> Diff: https://reviews.apache.org/r/68973/diff/3/
> 
> 
> Testing
> ---
> 
> Added new test and also made sure that existing tests passed.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Review Request 69074: SENTRY-2429: Transfer database owner drops table owner

2018-10-18 Thread Na Li via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69074/
---

Review request for sentry, Arjun Mishra, kalyan kumar kalvagadda, and Sergio 
Pena.


Bugs: sentry-2429
https://issues.apache.org/jira/browse/sentry-2429


Repository: sentry


Description
---

1) What is the problem?
When changing owner of a database, the owners of all tables belonging to this 
database are removed.

2) For example:
2.1) user_A is owner of DB1, user_B is owner of DB1.TABLE1, and user_D is owner 
of DB1.TABLE2.
2.2) admin with" ALL with grant option" changes the owner of DB1 from user_A to 
user_C through command "ALTER DATABASE DB1 SET OWNER USER user_C".
2.3) Now, user_C is owner of DB1, there is no owner on DB1.TABLE1, and no owner 
on DB1.TABLE2.. In other words, the owners of DB1.TABLE1 and DB1.TABLE2 are 
removed when changing owner of their database.


Diffs
-

  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
 29f83a8 
  
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestOwnerPrivileges.java
 880fa94 


Diff: https://reviews.apache.org/r/69074/diff/1/


Testing
---

add new test cases and verified the fix in e2e test


Thanks,

Na Li



Re: Review Request 68968: SENTRY-2425: Add metric to track the time taken to update the owner privilege

2018-10-18 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68968/
---

(Updated Oct. 18, 2018, 5:38 p.m.)


Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.


Changes
---

addressed review comments.


Bugs: SENTRY-2425
https://issues.apache.org/jira/browse/SENTRY-2425


Repository: sentry


Description
---

New metric should be added to track the time taken by sentry server in 
granting/updating owner privileges.


Diffs (updated)
-

  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
 214d78c5332d4113e440b76077bb7b2a6dae81d6 
  
sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryPolicyStoreProcessor.java
 709434c388689b63d5db7d71cd6cc47952d647bf 


Diff: https://reviews.apache.org/r/68968/diff/2/

Changes: https://reviews.apache.org/r/68968/diff/1-2/


Testing
---


Thanks,

kalyan kumar kalvagadda



Re: Review Request 68968: SENTRY-2425: Add metric to track the time taken to update the owner privilege

2018-10-18 Thread kalyan kumar kalvagadda via Review Board


> On Oct. 9, 2018, 6:19 p.m., Arjun Mishra wrote:
> > sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
> > Lines 124 (patched)
> > 
> >
> > Can we have one for granting and one for revoking owner privilege?
> 
> kalyan kumar kalvagadda wrote:
> Do you see a value in having seperate merics?
> 
> Arjun Mishra wrote:
> Is there a plan in the future to allow explicit granting and revoking of 
> owner privileges? If so, its better to keep them seperated

There is no plan to provide a way to explcitly revoke owner privilege. I will 
be adding metrics for adding and updating owner privileges.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68968/#review209376
---


On Oct. 9, 2018, 6:01 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68968/
> ---
> 
> (Updated Oct. 9, 2018, 6:01 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2425
> https://issues.apache.org/jira/browse/SENTRY-2425
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> New metric should be added to track the time taken by sentry server in 
> granting/updating owner privileges.
> 
> 
> Diffs
> -
> 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
>  d34340cd6a41ba01943bd3b2b42c9bddc9fa722f 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryPolicyStoreProcessor.java
>  709434c388689b63d5db7d71cd6cc47952d647bf 
> 
> 
> Diff: https://reviews.apache.org/r/68968/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 68968: SENTRY-2425: Add metric to track the time taken to update the owner privilege

2018-10-18 Thread kalyan kumar kalvagadda via Review Board


> On Oct. 11, 2018, 2:52 a.m., Na Li wrote:
> > sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
> > Lines 124 (patched)
> > 
> >
> > We should use different metric for "granting" vs "update" since update 
> > is much more complicated and would take more time. Once we support "revoke" 
> > owner privilege, we can add a metric for that.

There is no plan to provide a way to explcitly revoke owner privilege. I will 
be adding metrics for adding and updating owner privileges.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68968/#review209429
---


On Oct. 9, 2018, 6:01 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68968/
> ---
> 
> (Updated Oct. 9, 2018, 6:01 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2425
> https://issues.apache.org/jira/browse/SENTRY-2425
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> New metric should be added to track the time taken by sentry server in 
> granting/updating owner privileges.
> 
> 
> Diffs
> -
> 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryMetrics.java
>  d34340cd6a41ba01943bd3b2b42c9bddc9fa722f 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/api/service/thrift/SentryPolicyStoreProcessor.java
>  709434c388689b63d5db7d71cd6cc47952d647bf 
> 
> 
> Diff: https://reviews.apache.org/r/68968/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 68973: SENTRY-2305: Optimize time taken for persistence HMS snapshot by persisting in parallel

2018-10-18 Thread kalyan kumar kalvagadda via Review Board


> On Oct. 11, 2018, 2:55 a.m., Na Li wrote:
> > do you have measurement on how much time saved on saving big snapshot in 
> > parallel vs the number of threads? When multiple threads accessing the same 
> > table, the improvement may not be linear.
> 
> kalyan kumar kalvagadda wrote:
> Lina I agree. Performence would not increase linearly with the increare 
> in number of threads. That is why limited the thread count to 2. 
> I tests reults depend on the hardware where it is running and other load 
> on the CPU where the tests are run. That is why I did not capture the results 
> with variant thread count. I limited my tests with 1,2,3 threads. Performence 
> was increasing as I increased the threads to 3 but same may not be true if 
> increase furthur.

One side effect of this approach is that there will be out-of-order inserts and 
gaps in PATH_IDin AUTHZ_PATH table. This is not an issue.


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68973/#review209430
---


On Oct. 10, 2018, 11:04 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68973/
> ---
> 
> (Updated Oct. 10, 2018, 11:04 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2305
> https://issues.apache.org/jira/browse/SENTRY-2305
> 
> 
> Repository: sentry
> 
> 
> Description
> ---
> 
> I have considered multiple options. Persisting in batches is not an option 
> with out changing the schema as the data nucleus does not persist row in 
> batches for tables which have foreign key on other tables.
> 
> I see that best option is to persist the paths in parallel. It gave good 
> results.
> 
> Solution Approach:
> 
> I have used a thread pool to persist the snapshot. Size of this thread pool 
> is configurable. Paths for each object database/table are submitted to this 
> thread pool. If for reason some of the paths are not pesisted, snapshot is 
> removed and exception is throw back.
> 
>   This patch along with SENTRY-2423 was 5 times faster when tested with below.
> 
>  
> 
> Object Type   Count
> Databases 209
> Tables2100
> Partitions24
> 
> 
> Diffs
> -
> 
>   
> sentry-core/sentry-core-common/src/main/java/org/apache/sentry/service/common/ServiceConstants.java
>  092060c450c6a906850630cb10454737157af5fe 
>   
> sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java
>  7a736ca9604eb0bb182a159b5a2aed274275c16e 
>   
> sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java
>  0d62941a7bd45e0d8a67b5d95fdb39a8801f5a26 
>   
> sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java
>  4a9afe303672baff39be01d4f190034b2bfb75fe 
> 
> 
> Diff: https://reviews.apache.org/r/68973/diff/3/
> 
> 
> Testing
> ---
> 
> Added new test and also made sure that existing tests passed.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>