> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote: > > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java > > Line 102 (original), 105 (patched) > > <https://reviews.apache.org/r/74144/diff/1/?file=2270242#file2270242line108> > > > > This needs a review to reduce the M x N complexity. If there is a map > > of <resource-id, service-resource> maintained (in memory) for existing > > service-tags, then the loop at line 115 may reduce to a map lookup by > > resource-id. > > Madhan Neethiraj wrote: > Yes. Updating this block to replace inner loop with a map look up will > result in significant improvement. I was planning to address that in a > subsequent patch. Without the current fix, delta processing didn't complete > even after 1h35m; with the the delta processing gets to completion within few > minutes. > > Also, I think we should review in-place updating of serviceTags instance > passed as argument. This would be an issue if that instance is used in > another thread while it is being updated.
I dont think serviceTags instance passed as an argument will be used in other thread, given that it is called within a lock.tryLock block from getLatestOrCached() function. (Caveat: Of course, there is a case where the tryLock times out while the updates to cached service-tags are going on. It is also called from RangerTagRefresher.populateTag() in the plugins. If the tag download interval is too small, then there is a possibility of two threads accessing service-tags concurrently.) > On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote: > > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java > > Line 1251 (original), 1250 (patched) > > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1255> > > > > Although this change reduces dependency on JPA cache behavior, it also > > removes one consistency check on the sanity of deltas and the state of the > > database. Perhaps, performing all of these computations/database > > interactions in a dedicated, read-only new transaction will obviate need > > for such peep-hole optimizations. This also applies to changes around line > > #1269. > > Madhan Neethiraj wrote: > Yes. Use of read-only transactions can help avoid JPA overheads. However, > wouldn't the optimizations introduced in this patch still be helpful to avoid > overheads? Yes, the current optimizations will help, but for the sake of maintainability in the face of future changes, the real solution may be to explore using separate, read-only transaction for delta creation. - Abhay ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/74144/#review224714 ----------------------------------------------------------- On Sept. 28, 2022, 5:44 a.m., Madhan Neethiraj wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/74144/ > ----------------------------------------------------------- > > (Updated Sept. 28, 2022, 5:44 a.m.) > > > Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay > Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, > Subhrat Chaudhary, and Velmurugan Periasamy. > > > Bugs: RANGER-3934 > https://issues.apache.org/jira/browse/RANGER-3934 > > > Repository: ranger > > > Description > ------- > > - updated several retrieval of XXService JPA object by replacing with > retrival of serviceId > - replaced several instances of Map<Long, Long> with Set<Long>, as these > instances had same value for key and value > - avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) > and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId) > - updated several methods in XXTagResourceMapDao to avoid loading of > XXTagResourceMap JPA object; instead created XXTagResourceMap object from > individial fields retrieved from query > > > Diffs > ----- > > > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java > 8d9241c1c > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java > d8154b7de > > security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java > afa754ba2 > security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java > 3cc3d9cef > security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java > 3f8b5b718 > > security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java > b0f71e138 > security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 > security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 > > > Diff: https://reviews.apache.org/r/74144/diff/1/ > > > Testing > ------- > > - with Ranger database having ~1m service-resources, ~2m tags and delta of > 20k resource & 40k tags: > -- before these optmizations, tag-cache update from delta didn't complete > event after a long time (1h35m) > -- with these optmizations, tag-cache update from the same delta completed > within 10 minutes > - there is scope for further optimization > RangerServiceTagsDeltaUtil.applyDelta() - as this took almost 97% time in > updating the cache (588 seconds); compare this to > TagDBStore.getServiceTagsDelta() which completed in 13 seeconds > > > Thanks, > > Madhan Neethiraj > >
