> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
> > Line 102 (original), 105 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270242#file2270242line108>
> >
> >     This needs a review to reduce the M x N complexity. If there is a map 
> > of <resource-id, service-resource> maintained (in memory) for existing 
> > service-tags, then the loop at line 115 may reduce to a map lookup by 
> > resource-id.

Yes. Updating this block to replace inner loop with a map look up will result 
in significant improvement. I was planning to address that in a subsequent 
patch. Without the current fix, delta processing didn't complete even after 
1h35m; with the the delta processing gets to completion within few minutes.

Also, I think we should review in-place updating of serviceTags instance passed 
as argument. This would be an issue if that instance is used in another thread 
while it is being updated.


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Lines 1217 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1219>
> >
> >     This change may result in processing the same tag or resource id 
> > multiple times if the same object appears multiple times in the list of 
> > deltas. Does the change cause significant performance improvement? If not, 
> > please consider either reverting it or replacing the Map to a HashSet to 
> > ensure that each object appears at most once in the subsequent loop.

This change replaces use of Map<Long, Long> with Set<Long> - since the map was 
populated with the same value for key and value. This change doesn't introduce 
any change in behavior.


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Line 1251 (original), 1250 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1255>
> >
> >     Although this change reduces dependency on JPA cache behavior, it also 
> > removes one consistency check on the sanity of deltas and the state of the 
> > database. Perhaps, performing all of these computations/database 
> > interactions in a dedicated, read-only new transaction will obviate need 
> > for such peep-hole optimizations. This also applies to changes around line 
> > #1269.

Yes. Use of read-only transactions can help avoid JPA overheads. However, 
wouldn't the optimizations introduced in this patch still be helpful to avoid 
overheads?


> On Sept. 28, 2022, 6:12 p.m., Abhay Kulkarni wrote:
> > security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java
> > Lines 1282 (patched)
> > <https://reviews.apache.org/r/74144/diff/1/?file=2270243#file2270243line1290>
> >
> >     Please review if this code fragment is required after reviewing the 
> > following API and its callers. The  changes to each of tags, resources and 
> > tag-resource-mappings may be completely kept track in the change-records 
> > created in the delta-tables.
> >     
> >     ---
> >     
> >     XXServiceVersionInfoDao.updateTagVersionAndTagUpdateTime().

Earlier version used rangerTagResourceMapService.getTagIdsForResourceId() to 
get IDs of tags associated with a given resources. This patch replaces this 
logic to obtain tag IDs from XXServiceResource.tags field.

It is not clear hwo updateTagVersionAndTagUpdateTime() can help here. Can you 
please add details? Thanks.


- Madhan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74144/#review224714
-----------------------------------------------------------


On Sept. 28, 2022, 5:44 a.m., Madhan Neethiraj wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74144/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2022, 5:44 a.m.)
> 
> 
> Review request for ranger, Ankita Sinha, Kishor Gollapalliwar, Abhay 
> Kulkarni, Mehul Parikh, Pradeep Agrawal, Ramesh Mani, Sailaja Polavarapu, 
> Subhrat Chaudhary, and Velmurugan Periasamy.
> 
> 
> Bugs: RANGER-3934
>     https://issues.apache.org/jira/browse/RANGER-3934
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> - updated several retrieval of XXService JPA object by replacing with 
> retrival of serviceId
> - replaced several instances of Map<Long, Long> with Set<Long>, as these 
> instances had same value for key and value
> - avoided expensive calls to RangerTagResourceMapService.getByTagId(tagId) 
> and RangerTagResourceMapService.getTagIdsForResourceId(serviceResourceId)
> - updated several methods in XXTagResourceMapDao to avoid loading of 
> XXTagResourceMap JPA object; instead created XXTagResourceMap object from 
> individial fields retrieved from query
> 
> 
> Diffs
> -----
> 
>   
> agents-common/src/main/java/org/apache/ranger/plugin/util/RangerServiceTagsDeltaUtil.java
>  8d9241c1c 
>   security-admin/src/main/java/org/apache/ranger/biz/TagDBStore.java 
> d8154b7de 
>   
> security-admin/src/main/java/org/apache/ranger/db/XXRMSServiceResourceDao.java
>  afa754ba2 
>   security-admin/src/main/java/org/apache/ranger/db/XXServiceDao.java 
> 3cc3d9cef 
>   security-admin/src/main/java/org/apache/ranger/db/XXTagResourceMapDao.java 
> 3f8b5b718 
>   
> security-admin/src/main/java/org/apache/ranger/patch/PatchForAtlasServiceDefUpdate_J10013.java
>  b0f71e138 
>   security-admin/src/main/java/org/apache/ranger/rest/TagREST.java c7cf3bfb8 
>   security-admin/src/main/resources/META-INF/jpa_named_queries.xml e4a2354b0 
> 
> 
> Diff: https://reviews.apache.org/r/74144/diff/1/
> 
> 
> Testing
> -------
> 
> - with Ranger database having ~1m service-resources, ~2m tags and delta of 
> 20k resource & 40k tags:
>   -- before these optmizations, tag-cache update from delta didn't complete 
> event after a long time (1h35m)
>   -- with these optmizations, tag-cache update from the same delta completed 
> within 10 minutes
> - there is scope for further optimization 
> RangerServiceTagsDeltaUtil.applyDelta() - as this took almost 97% time in 
> updating the cache (588 seconds); compare this to 
> TagDBStore.getServiceTagsDelta() which completed in 13 seeconds
> 
> 
> Thanks,
> 
> Madhan Neethiraj
> 
>

Reply via email to