[ https://issues.apache.org/jira/browse/RANGER-4959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anand Nadar updated RANGER-4959:
--------------------------------
    Description:
Below are metrics collected for the importservicetags API while creating tags and tag-resource associations.
|Duration|10 min|
|Successful requests|49|
|Number of tags for each resource|6|
|Number of columns|100|
|Resource-tag mappings per request|6*100 = 600|
|Total resource-tag mappings|49*600 = 29400 records in db|
|Rate per min|2940|

Projecting this rate onto a larger catalog:
 * Number of tables = 200k
 * Number of columns per table = 500
 * Avg tags on each column = 6
 * Total resource-tag mappings = 200000 * 500 * 6 = 600,000,000
 * Time at this rate with 10 threads = 600,000,000/2940 = 204081 min ≈ 141.7 days

The numbers above were measured for the importservicetags API with 2 GB of heap memory.

To improve this performance, we propose the following changes.
 # Remove usage of the x_tag table.
 # In the x_tag_resource_map table, the association should be between the tag def id and the resource id.
 # The x_tag_resource_map table will have 2 new columns: "tagAttrs" to store the tag attributes, and "type", which will be the name of the tagDef.
 # tags_text, which is stored in the x_service_resource table, can hold the data below to reduce its size.
{code:java}
[{"id":1069576,"isEnabled":true,"name":"TAG1","attributes":{"restricted1":"true"}}]
{code}
id, isEnabled, name: these values will come from x_tag_def.
attributes: these will be retrieved from the x_tag_resource_map table for that particular resource.

The tag owner case is not being handled here.

ImportserviceTags flow
 - Create tagDef if not exists
 - Create service resource if not exists
 - Create the tag def and resource association with tag attributes
 - Refresh the tags_text in x_service_resource (this can be handled in a separate thread)

The download JSON structure will be maintained to minimise plugin-side changes.
The importservicetags JSON input will remain the same.
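The four-step flow and the compact tags_text format described above can be illustrated with a small in-memory sketch. This is not Ranger code: `TagStore`, `import_tag`, and `refresh_tags_text` are hypothetical names, plain dicts stand in for the x_tag_def, x_service_resource, and x_tag_resource_map tables, ids come from a simple counter, and the proposed "type" column is left out for brevity.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TagStore:
    """Hypothetical in-memory stand-in for the tables named in the proposal."""
    tag_defs: dict = field(default_factory=dict)      # x_tag_def: name -> {"id", "isEnabled", "name"}
    resources: dict = field(default_factory=dict)     # x_service_resource: resource key -> resource id
    resource_map: dict = field(default_factory=dict)  # x_tag_resource_map: resource id -> {tag def id: tagAttrs}
    tags_text: dict = field(default_factory=dict)     # x_service_resource.tags_text: resource id -> JSON
    next_id: int = 1

    def new_id(self) -> int:
        allocated, self.next_id = self.next_id, self.next_id + 1
        return allocated


def refresh_tags_text(store, resource_id):
    # id, isEnabled, name come from the tag def; attributes come from the mapping row.
    defs_by_id = {d["id"]: d for d in store.tag_defs.values()}
    entries = [
        {**defs_by_id[tag_def_id], "attributes": attrs}
        for tag_def_id, attrs in store.resource_map.get(resource_id, {}).items()
    ]
    store.tags_text[resource_id] = json.dumps(entries, separators=(",", ":"))


def import_tag(store, resource_key, tag_name, attrs):
    # 1. Create the tag def if it does not exist.
    tag_def = store.tag_defs.get(tag_name)
    if tag_def is None:
        tag_def = {"id": store.new_id(), "isEnabled": True, "name": tag_name}
        store.tag_defs[tag_name] = tag_def
    # 2. Create the service resource if it does not exist.
    resource_id = store.resources.get(resource_key)
    if resource_id is None:
        resource_id = store.new_id()
        store.resources[resource_key] = resource_id
    # 3. Associate tag def and resource, keeping the attributes on the mapping row.
    store.resource_map.setdefault(resource_id, {})[tag_def["id"]] = dict(attrs)
    # 4. Refresh tags_text (the proposal notes this could run in a separate thread).
    refresh_tags_text(store, resource_id)
    return resource_id


store = TagStore()
rid = import_tag(store, "db1.table1.col1", "TAG1", {"restricted1": "true"})
print(store.tags_text[rid])
# [{"id":1,"isEnabled":true,"name":"TAG1","attributes":{"restricted1":"true"}}]
```

Because no per-tag row is created, re-importing the same tag on the same resource only overwrites the attributes on the existing mapping entry.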
Tag delta and tag dedup will be affected and need to be handled accordingly.

cc: [~madhan] [~avadhavkar]

  was:
Below are metrics collected for the importservicetags API while creating tags and tag-resource associations.
|Duration|10 min|
|Successful requests|49|
|Number of tags for each resource|6|
|Number of columns|100|
|Resource-tag mappings per request|6*100 = 600|
|Total resource-tag mappings|49*600 = 29400 records in db|
|Rate per min|2940|

Projecting this rate onto a larger catalog:
 * Number of tables = 200k
 * Number of columns per table = 500
 * Avg tags on each column = 6
 * Total resource-tag mappings = 200000 * 500 * 6 = 600,000,000
 * Time at this rate with 10 threads = 600,000,000/2940 = 204081 min ≈ 141.7 days

The numbers above were measured for the importservicetags API with 2 GB of heap memory.

To improve this performance, we propose the following changes.
 # Remove usage of the x_tag table.
 # In the x_tag_resource_map table, the association should be between the tag def id and the resource id.
 # The x_tag_resource_map table will have a new column "tagAttrs" to store the tag attributes.
 # tags_text, which is stored in the x_service_resource table, can hold the data below to reduce its size.
{code:java}
[{"id":1069576,"isEnabled":true,"name":"TAG1","attributes":{"restricted1":"true"}}]
{code}
id, isEnabled, name: these values will come from x_tag_def.
attributes: these will be retrieved from the x_tag_resource_map table for that particular resource.

The tag owner case is not being handled here.

ImportserviceTags flow
 - Create tagDef if not exists
 - Create service resource if not exists
 - Create the tag def and resource association with tag attributes
 - Refresh the tags_text in x_service_resource (this can be handled in a separate thread)

The download JSON structure will be maintained to minimise plugin-side changes.
The importservicetags JSON input will remain the same.
Tag delta and tag dedup will be affected and need to be handled accordingly.

cc: [~madhan] [~avadhavkar]


> [Ranger-Admin] Remove use of x_tag table
> ----------------------------------------
>
>                 Key: RANGER-4959
>                 URL: https://issues.apache.org/jira/browse/RANGER-4959
>             Project: Ranger
>          Issue Type: Improvement
>          Components: admin
>            Reporter: Anand Nadar
>            Assignee: Anand Nadar
>            Priority: Major
>
> Below are metrics collected for the importservicetags API while creating tags and tag-resource associations.
> |Duration|10 min|
> |Successful requests|49|
> |Number of tags for each resource|6|
> |Number of columns|100|
> |Resource-tag mappings per request|6*100 = 600|
> |Total resource-tag mappings|49*600 = 29400 records in db|
> |Rate per min|2940|
>
> Projecting this rate onto a larger catalog:
>  * Number of tables = 200k
>  * Number of columns per table = 500
>  * Avg tags on each column = 6
>  * Total resource-tag mappings = 200000 * 500 * 6 = 600,000,000
>  * Time at this rate with 10 threads = 600,000,000/2940 = 204081 min ≈ 141.7 days
>
> The numbers above were measured for the importservicetags API with 2 GB of heap memory.
>
> To improve this performance, we propose the following changes.
>  # Remove usage of the x_tag table.
>  # In the x_tag_resource_map table, the association should be between the tag def id and the resource id.
>  # The x_tag_resource_map table will have 2 new columns: "tagAttrs" to store the tag attributes, and "type", which will be the name of the tagDef.
>  # tags_text, which is stored in the x_service_resource table, can hold the data below to reduce its size.
> {code:java}
> [{"id":1069576,"isEnabled":true,"name":"TAG1","attributes":{"restricted1":"true"}}]
> {code}
> id, isEnabled, name: these values will come from x_tag_def.
> attributes: these will be retrieved from the x_tag_resource_map table for that particular resource.
>
> The tag owner case is not being handled here.
>
> ImportserviceTags flow
>  - Create tagDef if not exists
>  - Create service resource if not exists
>  - Create the tag def and resource association with tag attributes
>  - Refresh the tags_text in x_service_resource (this can be handled in a separate thread)
>
> The download JSON structure will be maintained to minimise plugin-side changes.
> The importservicetags JSON input will remain the same.
> Tag delta and tag dedup will be affected and need to be handled accordingly.
>
> cc: [~madhan] [~avadhavkar]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)