> On Oct. 11, 2018, 2:55 a.m., Na Li wrote:
> > Do you have measurements of how much time is saved when a big snapshot is
> > persisted in parallel, as a function of the number of threads? When multiple
> > threads access the same table, the improvement may not be linear.

Lina, I agree. Performance would not increase linearly with the number of
threads, which is why I limited the thread count to 2. The test results depend
on the hardware they run on and on the other load on the CPU at the time, so I
did not record results for varying thread counts. I limited my tests to 1, 2,
and 3 threads. Performance improved as I increased the thread count to 3, but
the same may not hold if it is increased further.


- kalyan kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68973/#review209430
-----------------------------------------------------------


On Oct. 10, 2018, 11:04 p.m., kalyan kumar kalvagadda wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68973/
> -----------------------------------------------------------
>
> (Updated Oct. 10, 2018, 11:04 p.m.)
>
>
> Review request for sentry, Arjun Mishra, Na Li, and Sergio Pena.
>
>
> Bugs: SENTRY-2305
>     https://issues.apache.org/jira/browse/SENTRY-2305
>
>
> Repository: sentry
>
>
> Description
> -------
>
> I considered multiple options. Persisting in batches is not an option
> without changing the schema, because DataNucleus does not persist rows in
> batches for tables that have foreign keys to other tables.
>
> The best option I see is to persist the paths in parallel. It gave good
> results.
>
> Solution Approach:
>
> I used a thread pool to persist the snapshot. The size of this thread pool
> is configurable. The paths for each object (database or table) are submitted
> to this thread pool. If for any reason some of the paths are not persisted,
> the snapshot is removed and an exception is thrown back.
>
> This patch, along with SENTRY-2423, was 5 times faster when tested with the
> data below.
>
>
> Object Type    Count
> Databases      209
> Tables         2100
> Partitions     200004
>
>
> Diffs
> -----
>
>   sentry-core/sentry-core-common/src/main/java/org/apache/sentry/service/common/ServiceConstants.java 092060c450c6a906850630cb10454737157af5fe
>   sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 7a736ca9604eb0bb182a159b5a2aed274275c16e
>   sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestHMSFollower.java 0d62941a7bd45e0d8a67b5d95fdb39a8801f5a26
>   sentry-service/sentry-service-server/src/test/java/org/apache/sentry/provider/db/service/persistent/TestSentryStore.java 4a9afe303672baff39be01d4f190034b2bfb75fe
>
>
> Diff: https://reviews.apache.org/r/68973/diff/3/
>
>
> Testing
> -------
>
> Added new test and also made sure that existing tests passed.
>
>
> Thanks,
>
> kalyan kumar kalvagadda
>
>
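
The thread-pool approach in the quoted description can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the code in
the patch: the class ParallelSnapshotSketch, the PathWriter interface, and the
removeSnapshot callback are hypothetical names that stand in for whatever
SentryStore actually does. It shows a fixed-size pool, one task per object, and
snapshot removal plus an exception when any task fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names are taken from SentryStore.
final class ParallelSnapshotSketch {

  /** Hypothetical stand-in for the call that persists the paths of one object. */
  interface PathWriter {
    void write(String authzObject, Collection<String> paths) throws Exception;
  }

  /**
   * Submits one task per object (database or table) to a fixed-size pool.
   * If any task fails, stops the pool, runs removeSnapshot, and throws.
   */
  static void persistSnapshot(Map<String, Collection<String>> pathsByObject,
                              int threadCount,
                              PathWriter writer,
                              Runnable removeSnapshot) {
    ExecutorService pool = Executors.newFixedThreadPool(threadCount);
    List<Future<?>> pending = new ArrayList<>();
    try {
      for (Map.Entry<String, Collection<String>> e : pathsByObject.entrySet()) {
        // The lambda returns a value, so it is submitted as a Callable and may
        // throw checked exceptions.
        pending.add(pool.submit(() -> {
          writer.write(e.getKey(), e.getValue());
          return null;
        }));
      }
      for (Future<?> f : pending) {
        f.get(); // throws ExecutionException if that task failed
      }
    } catch (ExecutionException | InterruptedException ex) {
      if (ex instanceof InterruptedException) {
        Thread.currentThread().interrupt();
      }
      drain(pool);          // cancel queued tasks and wait for running ones
      removeSnapshot.run(); // assumption: cleanup is a single callback
      throw new IllegalStateException("Snapshot persistence failed; snapshot removed", ex);
    } finally {
      pool.shutdown();
    }
  }

  private static void drain(ExecutorService pool) {
    pool.shutdownNow();
    try {
      pool.awaitTermination(1, TimeUnit.MINUTES);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    Map<String, Collection<String>> paths = new HashMap<>();
    paths.put("db1", Arrays.asList("/warehouse/db1"));
    paths.put("db1.t1", Arrays.asList("/warehouse/db1/t1"));
    persistSnapshot(paths, 2,
        (obj, p) -> System.out.println("persisted " + obj + " -> " + p),
        () -> System.out.println("snapshot removed"));
  }
}
```

Waiting on every Future before returning is what lets a single failed object
invalidate the whole snapshot. The pool size is passed in as a parameter here
only to mirror the "configurable" thread count mentioned in the description.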