> On Jan. 28, 2025, 8:12 a.m., Madhan Neethiraj wrote:
> > agents-common/src/main/java/org/apache/ranger/plugin/model/RangerGds.java
> > Lines 880 (patched)
> > <https://reviews.apache.org/r/75347/diff/2/?file=2297494#file2297494line880>
> >
> >     `filters` doesn't seem to be the appropriate name for this field.
> >     Looking at its contents in "Testing Done" section, it contains counts of
> >     labels and keywords. Are these counts specific to datasets included in
> >     `datasetSummary` field? Or are these counts across all datasets?
> >
> >     I suggest to consider separate APIs to get summary of labels and
> >     keywords - like:
> >
> >       service/gds/labels/summary
> >       service/gds/keywords/summary
>
> Radhika Kundam wrote:
>     Regarding the name 'filters': Yes I updated the patch with additionalInfo.
>     Regarding Counts in filters section: These counts are for only for the
>     datasets matches to the search criteria of that particular summary request
>     but before applying the pagination.
>     This jira is raised to see all the labels and keywords for only datasets
>     which matches to search criteria, having separate apis to get labels and
>     keywords doesn't solve the purpose.
>
> Madhan Neethiraj wrote:
>     Radhika - do dataset details in the response include labels and keywords
>     added to each dataset?
>     - if yes, what is the need for aggregating them in additionalInfo?
>     - if not, can dataset details be updated to include labels and keywords?

Currently, the summary API response is limited by pagination, returning only a
subset of datasets per page. While each dataset summary includes labels and
keywords, this restriction prevents retrieving a complete list of matching
labels and keywords across all datasets.

Challenges:
1. Labels and keywords are only available for datasets on the current page.
2. A full list of labels and keywords across all matching datasets is not
   accessible, making it difficult to filter datasets effectively.

Solution:
To overcome this limitation, a new endpoint, /enhancedsummary, will be
introduced. This enhanced API will return dataset summaries along with
aggregated labels and keywords, ensuring users have access to the full set of
matching values regardless of pagination.

Note: The existing /summary API will remain unchanged.

Example Scenario:
Consider a total of 100 datasets, where filtering criteria result in 50
matching datasets. With a page size of 10, the current summary API only
provides labels and keywords for those 10 datasets at a time. This makes it
difficult for users to further refine the 50 matching datasets based on labels
and keywords.

With /enhancedsummary, users will receive a consolidated list of labels and
keywords across all 50 matching datasets, allowing for seamless filtering
without pagination constraints.


- Radhika


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75347/#review227200
-----------------------------------------------------------


On Jan. 30, 2025, 7:15 p.m., Radhika Kundam wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/75347/
> -----------------------------------------------------------
>
> (Updated Jan. 30, 2025, 7:15 p.m.)
>
>
> Review request for ranger, Madhan Neethiraj and Ramesh Mani.
>
>
> Bugs: RANGER-5111
>     https://issues.apache.org/jira/browse/RANGER-5111
>
>
> Repository: ranger
>
>
> Description
> -------
>
> The Summary API should support distinct filtering capabilities that include
> all unique labels and keywords associated with the datashares returned in the
> response. Additionally, the API should provide the count of datashares linked
> to each specific label and keyword. This enhancement ensures that the data
> can be effectively utilized for advanced filtering on the UI.
>
>
> Diffs
> -----
>
>   agents-common/src/main/java/org/apache/ranger/plugin/model/RangerGds.java 260ebc0a8
>   security-admin/src/main/java/org/apache/ranger/biz/GdsDBStore.java 7916f0818
>   security-admin/src/main/java/org/apache/ranger/rest/GdsREST.java 0d3ef3d76
>
>
> Diff: https://reviews.apache.org/r/75347/diff/4/
>
>
> Testing
> -------
>
> Tested locally.
>
> Summary view with additionalInfo(Labels & Keywords) of DataShares in GDS:
> ------------------------------------------------------------------------
> Request:
> -------
> curl -X GET -u <username>:<pwd> '<ranger url>/service/gds/dataset/enhancedsummary?pageSize=1'
>
> Response:
> --------
> Response consists of datasetSummary and additionalInfo
> datasetSummary: paginated list of datasets
> additionalInfo: map with relevant labels and keywords with dataset counts
>
> {
>   "datasetSummary": {
>     "startIndex": 0,
>     "pageSize": 1,
>     "totalCount": 8,
>     "resultSize": 1,
>     "sortType": "asc",
>     "sortBy": "datasetId",
>     "queryTimeMS": 1737664148576,
>     "list": [
>       {
>         "id": 1,
>         "guid": "87662f8e-57af-40e3-8c92-45c108d474ac",
>         "isEnabled": true,
>         "createdBy": "Admin",
>         "updatedBy": "Admin",
>         "createTime": 1736362927000,
>         "updateTime": 1736362927000,
>         "version": 1,
>         "name": "dataset-1",
>         "permissionForCaller": "ADMIN",
>         "principalsCount": {
>           "USER": 1,
>           "GROUP": 1,
>           "ROLE": 0
>         },
>         "aclPrincipalsCount": {
>           "USER": 1,
>           "GROUP": 0,
>           "ROLE": 0
>         },
>         "projectsCount": 0,
>         "totalResourceCount": 4,
>         "dataShares": [
>           {
>             "id": 1,
>             "guid": "d4596038-122d-476f-a5e5-55937e87e011",
>             "isEnabled": true,
>             "createdBy": "Admin",
>             "updatedBy": "Admin",
>             "createTime": 1736362834000,
>             "updateTime": 1736362834000,
>             "version": 1,
>             "dataShareId": 1,
>             "dataShareName": "datashare-1",
>             "serviceId": 6,
>             "serviceName": "cm_hive",
>             "zoneName": " ",
>             "resourceCount": 4,
>             "shareStatus": "ACTIVE",
>             "approver": "admin"
>           }
>         ]
>       }
>     ],
>     "listSize": 1
>   },
>   "additionalInfo": {
>     "aggregatedKeywords": {
>       "kw1": 2,
>       "testKW1": 1,
>       "kw2": 2,
>       "kw21": 3,
>       "kw11": 3
>     },
>     "aggregatedLabels": {
>       "abc1": 2,
>       "test2": 2,
>       "test21": 3,
>       "abc111": 3,
>       "testLabel1": 1
>     }
>   }
> }
>
>
> Thanks,
>
> Radhika Kundam
>
>
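
For readers following the thread, the "aggregate over all matching datasets, then paginate" behaviour discussed above can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch: the class EnhancedSummarySketch, the Dataset holder and the methods countDatasetsPerValue and page are hypothetical names, and it assumes a label or keyword repeated on one dataset is counted once for that dataset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the RANGER-5111 patch.
class EnhancedSummarySketch {
    // Minimal stand-in for a dataset summary: only the fields used here.
    static final class Dataset {
        final String       name;
        final List<String> labels;
        final List<String> keywords;

        Dataset(String name, List<String> labels, List<String> keywords) {
            this.name     = name;
            this.labels   = labels;
            this.keywords = keywords;
        }
    }

    // Counts, for each value, how many of the matching datasets carry it.
    // Assumption: duplicates within one dataset are counted once.
    static Map<String, Integer> countDatasetsPerValue(List<Dataset> matching,
                                                      Function<Dataset, List<String>> getter) {
        Map<String, Integer> counts = new TreeMap<>();

        for (Dataset dataset : matching) {
            for (String value : new HashSet<>(getter.apply(dataset))) {
                counts.merge(value, 1, Integer::sum);
            }
        }

        return counts;
    }

    // Applies pagination after aggregation, so the counts above are unaffected by it.
    static List<Dataset> page(List<Dataset> matching, int startIndex, int pageSize) {
        int from = Math.min(Math.max(startIndex, 0), matching.size());
        int to   = Math.min(from + Math.max(pageSize, 0), matching.size());

        return new ArrayList<>(matching.subList(from, to));
    }

    public static void main(String[] args) {
        // Datasets that already passed the search criteria (illustrative data).
        List<Dataset> matching = Arrays.asList(
                new Dataset("dataset-1", Arrays.asList("abc1", "test2"), Arrays.asList("kw1")),
                new Dataset("dataset-2", Arrays.asList("abc1"), Arrays.asList("kw1", "kw2")),
                new Dataset("dataset-3", Arrays.asList("test2"), Arrays.asList("kw2")));

        // Aggregates cover all three matches, while the page holds only one dataset.
        System.out.println("aggregatedLabels   = " + countDatasetsPerValue(matching, d -> d.labels));
        System.out.println("aggregatedKeywords = " + countDatasetsPerValue(matching, d -> d.keywords));
        System.out.println("page size          = " + page(matching, 0, 1).size());
    }
}
```

The point of the sketch is the ordering: the counts are computed from the full list of matches and only afterwards is the page cut, which is why the aggregated values do not change with pageSize.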