> On Jan. 28, 2025, 4:12 p.m., Madhan Neethiraj wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/model/RangerGds.java
> > Lines 880 (patched)
> > <https://reviews.apache.org/r/75347/diff/2/?file=2297494#file2297494line880>
> >
> >     `filters` doesn't seem to be the appropriate name for this field. 
> > Looking at its contents in "Testing Done" section, it contains counts of 
> > labels and keywords. Are these counts specific to datasets included in 
> > `datasetSummary` field? Or are these counts across all datasets?
> >     
> >     I suggest considering separate APIs to get summaries of labels and 
> > keywords, such as:
> >     
> >     service/gds/labels/summary
> >     service/gds/keywords/summary
> 
> Radhika Kundam wrote:
>     Regarding the name 'filters': yes, I updated the patch to use additionalInfo.
>     Regarding the counts in the filters section: these counts cover only the 
> datasets that match the search criteria of that particular summary request, 
> computed before pagination is applied.
>     This JIRA was raised to show all the labels and keywords for only the 
> datasets that match the search criteria; separate APIs for labels and 
> keywords would not serve that purpose.
> 
> Madhan Neethiraj wrote:
>     Radhika - do dataset details in the response include labels and keywords 
> added to each dataset?
>     - if yes, what is the need for aggregating them in additionalInfo?
>     - if not, can dataset details be updated to include labels and keywords?
> 
> Radhika Kundam wrote:
>     Currently, the summary API response is limited by pagination, returning 
> only a subset of datasets per page. While each dataset summary includes 
> labels and keywords, this restriction prevents retrieving a complete list of 
> matching labels and keywords across all datasets.
>     
>     Challenges:
>     1. Labels and keywords are only available for datasets on the current 
> page.
>     2. A full list of labels and keywords across all matching datasets is not 
> accessible, making it difficult to filter datasets effectively.
>     
>     Solution:
>     To overcome this limitation, a new endpoint, /enhancedsummary, will be 
> introduced. This enhanced API will return dataset summaries along with 
> aggregated labels and keywords, ensuring users have access to the full set of 
> matching values regardless of pagination.
>     
>     Note: The existing /summary API will remain unchanged.
>     
>     Example Scenario:
>     Consider a total of 100 datasets, where filtering criteria result in 50 
> matching datasets.
>     With a page size of 10, the current summary API only provides labels and 
> keywords for those 10 datasets at a time.
>     This makes it difficult for users to further refine the 50 matching 
> datasets based on labels and keywords.
>     With /enhancedsummary, users will receive a consolidated list of labels 
> and keywords across all 50 matching datasets, allowing for seamless filtering 
> without pagination constraints.
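
For illustration, the behaviour described above (label and keyword counts computed over all matching datasets, then pagination applied to the list only) can be sketched as follows. This is not the code from the patch: the class `EnhancedSummarySketch`, its nested `Dataset` and `Page` types, and the method `enhancedSummary` are hypothetical names chosen for this example, and the in-memory list stands in for whatever the store actually queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class EnhancedSummarySketch {
    /** Hypothetical stand-in for one dataset summary; not the RangerGds model class. */
    static final class Dataset {
        final String       name;
        final List<String> labels;
        final List<String> keywords;

        Dataset(String name, List<String> labels, List<String> keywords) {
            this.name     = name;
            this.labels   = labels;
            this.keywords = keywords;
        }
    }

    /** Hypothetical response: one page of datasets plus counts over all matches. */
    static final class Page {
        final List<Dataset>        list;
        final int                  totalCount;
        final Map<String, Integer> labelCounts;
        final Map<String, Integer> keywordCounts;

        Page(List<Dataset> list, int totalCount,
             Map<String, Integer> labelCounts, Map<String, Integer> keywordCounts) {
            this.list          = list;
            this.totalCount    = totalCount;
            this.labelCounts   = labelCounts;
            this.keywordCounts = keywordCounts;
        }
    }

    /** 'matching' is assumed to already hold every dataset that passed the search criteria. */
    static Page enhancedSummary(List<Dataset> matching, int startIndex, int pageSize) {
        Map<String, Integer> labelCounts   = new TreeMap<>();
        Map<String, Integer> keywordCounts = new TreeMap<>();

        // Aggregate over all matching datasets, i.e. before pagination.
        // HashSet ensures a dataset is counted once per label/keyword.
        for (Dataset ds : matching) {
            for (String label : new HashSet<>(ds.labels)) {
                labelCounts.merge(label, 1, Integer::sum);
            }
            for (String keyword : new HashSet<>(ds.keywords)) {
                keywordCounts.merge(keyword, 1, Integer::sum);
            }
        }

        // Only the dataset list is cut down to the requested page.
        int from = Math.max(0, Math.min(startIndex, matching.size()));
        int to   = (int) Math.min((long) from + Math.max(0, pageSize), matching.size());

        return new Page(new ArrayList<>(matching.subList(from, to)), matching.size(),
                        labelCounts, keywordCounts);
    }

    public static void main(String[] args) {
        List<Dataset> matching = Arrays.asList(
                new Dataset("ds-1", Arrays.asList("testLabel1"), Arrays.asList("testKW1")),
                new Dataset("ds-2", Arrays.asList("testLabel1", "testLabel2"), Arrays.asList("testKW1")),
                new Dataset("ds-3", Arrays.asList("testLabel2"), Arrays.asList("testKW2")));

        Page page = enhancedSummary(matching, 0, 1);

        System.out.println("datasets on page: " + page.list.size());
        System.out.println("labelCounts:      " + page.labelCounts);
        System.out.println("keywordCounts:    " + page.keywordCounts);
    }
}
```

With a page size of 1 the returned list holds a single dataset, while the two count maps still reflect all three matching datasets, which is the property the /enhancedsummary discussion relies on.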

Radhika - thank you for adding details. Sounds good.


- Madhan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75347/#review227200
-----------------------------------------------------------


On Feb. 6, 2025, 5:47 p.m., Radhika Kundam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/75347/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2025, 5:47 p.m.)
> 
> 
> Review request for ranger, Madhan Neethiraj and Ramesh Mani.
> 
> 
> Bugs: RANGER-5111
>     https://issues.apache.org/jira/browse/RANGER-5111
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> The Summary API should support distinct filtering capabilities that include 
> all unique labels and keywords associated with the datashares returned in the 
> response. Additionally, the API should provide the count of datashares linked 
> to each specific label and keyword. This enhancement ensures that the data 
> can be effectively utilized for advanced filtering on the UI.
> 
> 
> Diffs
> -----
> 
>   agents-common/src/main/java/org/apache/ranger/plugin/model/RangerGds.java 260ebc0a8 
>   security-admin/src/main/java/org/apache/ranger/biz/GdsDBStore.java 7916f0818 
>   security-admin/src/main/java/org/apache/ranger/rest/GdsREST.java 0d3ef3d76 
> 
> 
> Diff: https://reviews.apache.org/r/75347/diff/7/
> 
> 
> Testing
> -------
> 
> Tested locally.
> 
> Summary view with additionalInfo(Labels & Keywords) of DataShares in GDS:
> ------------------------------------------------------------------------
> Request:
> -------
> curl -X GET -u <username>:<pwd> '<ranger url>/service/gds/dataset/enhancedsummary?pageSize=1'
> 
> Response: 
> --------
> The response consists of datasetSummary entries along with additionalInfo.
> additionalInfo: a map of the relevant labels and keywords to their dataset counts
> 
> {
>     "startIndex": 0,
>     "pageSize": 1,
>     "totalCount": 10,
>     "resultSize": 1,
>     "sortType": "asc",
>     "sortBy": "datasetId",
>     "queryTimeMS": 1738828193935,
>     "list": [
>         {
>             "id": 9,
>             "guid": "883f33a0-0919-4150-a749-38dead04411d",
>             "isEnabled": true,
>             "createdBy": "Admin",
>             "updatedBy": "Admin",
>             "createTime": 1738200913000,
>             "updateTime": 1738200913000,
>             "version": 1,
>             "name": "ds-1",
>             "description": "test dataset validity",
>             "permissionForCaller": "ADMIN",
>             "principalsCount": {
>                 "ROLE": 0,
>                 "USER": 0,
>                 "GROUP": 0
>             },
>             "aclPrincipalsCount": {
>                 "ROLE": 0,
>                 "USER": 1,
>                 "GROUP": 0
>             },
>             "projectsCount": 0,
>             "totalResourceCount": 0,
>             "validitySchedule": {
>                 "startTime": "2025/01/29 00:00:00",
>                 "endTime": "2025/02/15 00:00:00",
>                 "timeZone": "Pacific/Pitcairn"
>             },
>             "labels": [
>                 "testLabel1"
>             ],
>             "keywords": [
>                 "testKW1"
>             ]
>         }
>     ],
>     "additionalInfo": {
>         "keywordCounts": {
>             "testKW4": 2,
>             "testKW5": 2,
>             "testKW2": 1,
>             "testKW3": 1,
>             "testKW1": 3
>         },
>         "labelCounts": {
>             "testLabel4": 2,
>             "testLabel5": 2,
>             "testLabel2": 1,
>             "testLabel3": 1,
>             "testLabel1": 3
>         }
>     },
>     "listSize": 1
> }
> 
> 
> Thanks,
> 
> Radhika Kundam
> 
>
