> On Oct. 11, 2018, 2:55 a.m., Na Li wrote:
> > Do you have measurements of how much time is saved when persisting a big 
> > snapshot in parallel, as a function of the number of threads? When multiple 
> > threads access the same table, the improvement may not be linear.

Lina, I agree. Performance would not increase linearly with the number of 
threads, which is why I limited the thread count to 2. 
The test results depend on the hardware the tests run on and on the other load 
on that CPU, which is why I did not capture results for varying thread counts. 
I limited my tests to 1, 2, and 3 threads. Performance kept improving as I 
increased the thread count to 3, but the same may not hold if it is increased 
further.


- kalyan kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68973/#review209430
-----------------------------------------------------------


On Oct. 10, 2018, 11:04 p.m., kalyan kumar kalvagadda wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68973/
> -----------------------------------------------------------
> 
> (Updated Oct. 10, 2018, 11:04 p.m.)
> 
> 
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2305
>     https://issues.apache.org/jira/browse/SENTRY-2305
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> I have considered multiple options. Persisting in batches is not an option 
> without changing the schema, because DataNucleus does not persist rows in 
> batches for tables that have a foreign key to other tables.
> 
> I see that the best option is to persist the paths in parallel. It gave good 
> results.
> 
> Solution Approach:
> 
> I have used a thread pool to persist the snapshot. The size of this thread 
> pool is configurable. The paths for each object (database/table) are 
> submitted to this thread pool. If for any reason some of the paths are not 
> persisted, the snapshot is removed and an exception is thrown back.
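
The approach described above can be illustrated with a minimal sketch. This is
not the code from the patch: the class ParallelSnapshotSketch, the in-memory
STORE map, and the persistPaths helper are hypothetical stand-ins for the
SentryStore logic, and the failure condition (an empty path list) exists only
to simulate a write error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSnapshotSketch {
    // Hypothetical stand-in for the backing database; not SentryStore's API.
    static final Map<String, Collection<String>> STORE = new ConcurrentHashMap<>();

    // Hypothetical per-object write. An empty path list simulates a failed write.
    static void persistPaths(String authzObj, Collection<String> paths) {
        if (paths.isEmpty()) {
            throw new IllegalStateException("cannot persist paths for " + authzObj);
        }
        STORE.put(authzObj, paths);
    }

    // Submits one task per object (database/table) to a fixed-size pool.
    // If any task fails, everything written so far is removed and an
    // exception is thrown back to the caller.
    static void persistSnapshot(Map<String, Collection<String>> snapshot, int threadCount)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map.Entry<String, Collection<String>> e : snapshot.entrySet()) {
                pending.add(pool.submit(() -> persistPaths(e.getKey(), e.getValue())));
            }
            // Wait for every task so no write can land after the cleanup below.
            Throwable failure = null;
            for (Future<?> f : pending) {
                try {
                    f.get();
                } catch (ExecutionException ex) {
                    if (failure == null) {
                        failure = ex.getCause();
                    }
                }
            }
            if (failure != null) {
                STORE.clear(); // remove the partially persisted snapshot
                throw new Exception("snapshot was not fully persisted", failure);
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, Collection<String>> snapshot = new HashMap<>();
        snapshot.put("db1", Arrays.asList("/warehouse/db1.db"));
        snapshot.put("db1.t1", Arrays.asList("/warehouse/db1.db/t1"));
        persistSnapshot(snapshot, 2); // thread count would come from configuration
        System.out.println("persisted objects: " + STORE.size());
    }
}
```

The sketch waits for all submitted tasks before cleaning up, so a slow task
cannot write into the store after the partial snapshot has been removed.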
> 
> This patch, along with SENTRY-2423, was 5 times faster when tested with the 
> data set below.
> 
>  
> 
> Object Type   Count
> Databases     209
> Tables        2100
> Partitions    200004
> 
> 
> Diffs
> -----
> 
>   sentry-core/sentry-core-common/src/main/java/org/apache/sentry/service/common/ServiceConstants.java 092060c450c6a906850630cb10454737157af5fe 
>   sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 7a736ca9604eb0bb182a159b5a2aed274275c16e 
>   sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java 0d62941a7bd45e0d8a67b5d95fdb39a8801f5a26 
>   sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java 4a9afe303672baff39be01d4f190034b2bfb75fe 
> 
> 
> Diff: https://reviews.apache.org/r/68973/diff/3/
> 
> 
> Testing
> -------
> 
> Added a new test and made sure that the existing tests pass.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>
