[ https://issues.apache.org/jira/browse/RANGER-4959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anand Nadar updated RANGER-4959:
--------------------------------
    Description:
Below are the metrics collected for the importservicetags API when creating tags and tag-resource associations.
|Duration|10 min|
|Successful requests|49|
|Number of tags for each resource|6|
|Number of columns|100|
|Tag-resource mappings per request|6 * 100 = 600|
|Total tag-resource mappings|49 * 600 = 29,400 records in DB|
|Rate per min|2,940|

Extrapolating to a larger deployment:
 * Number of tables = 200k
 * Number of columns per table = 500
 * Avg tags on each column = 6
 * Total tag-resource mappings = 200,000 * 500 * 6 = 600,000,000
 * Time at the measured rate with 10 threads = 600,000,000 / 2,940 ≈ 204,081 min ≈ 141.72 days

The above is the performance of the importservicetags API with 2 GB of heap memory.

To improve this performance, we are planning the following changes:
 # Remove usage of the x_tag table.
 # In the x_tag_resource_map table, the association should be between the tag def id and the resource id.
 # The x_tag_resource_map table will have a new column "tagAttrs" to store the tag attributes.
 # tags_text, which is stored in the x_service_resource table, can hold the data below to reduce its size.
{code:java}
[{"id":1069576,"isEnabled":true,"name":"TAG1","attributes":{"restricted1":"true"}}]
{code}
id, isEnabled, name - These values will come from x_tag_def.
attributes - These will be retrieved from the x_tag_resource_map table for that particular resource.

The tag owner case is not being handled here.

cc: [~madhan] [~avadhavkar]

> [Ranger-Admin] Remove use of x_tag table and tags_text from x_service_resource table which has duplicate data
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: RANGER-4959
>                 URL: https://issues.apache.org/jira/browse/RANGER-4959
>             Project: Ranger
>          Issue Type: Improvement
>          Components: admin
>            Reporter: Anand Nadar
>            Assignee: Anand Nadar
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
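The throughput estimate in the description can be reproduced with a few lines of arithmetic. This is only a sketch: the constant names are illustrative, and reading "Number of columns = 100" as columns per request is an assumption based on the 49 * 600 figure.

```python
# Measured figures from the description (names are illustrative).
DURATION_MIN = 10
SUCCESSFUL_REQUESTS = 49
TAGS_PER_RESOURCE = 6
COLUMNS_PER_REQUEST = 100  # assumption: the 100 columns are per request

mappings_per_request = TAGS_PER_RESOURCE * COLUMNS_PER_REQUEST
mappings_written = SUCCESSFUL_REQUESTS * mappings_per_request
rate_per_min = mappings_written / DURATION_MIN

# Target deployment size from the description.
TABLES = 200_000
COLUMNS_PER_TABLE = 500
target_mappings = TABLES * COLUMNS_PER_TABLE * TAGS_PER_RESOURCE

# Time needed if the measured rate stayed constant.
minutes_needed = target_mappings / rate_per_min
days_needed = minutes_needed / (60 * 24)

print(f"rate: {rate_per_min:.0f} mappings/min")
print(f"target: {target_mappings:,} mappings")
print(f"time: {minutes_needed:,.0f} min = {days_needed:.2f} days")
```

The projection assumes the rate stays flat as the tables grow, which is probably optimistic for an import path that is already this slow.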
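As a rough illustration of the proposed layout, the sketch below assembles a tags_text value for one resource from tag definitions and tag-resource mappings. The class names, field names, and the choice to store tagAttrs as JSON text are assumptions made for the example; they are not taken from the Ranger code base.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TagDef:
    """Hypothetical stand-in for a row of x_tag_def."""
    id: int
    name: str
    is_enabled: bool


@dataclass(frozen=True)
class TagResourceMap:
    """Hypothetical stand-in for a row of x_tag_resource_map after the change."""
    tag_def_id: int
    resource_id: int
    tag_attrs: str  # assumption: the new tagAttrs column holds JSON text


def build_tags_text(resource_id: int,
                    tag_defs: Dict[int, TagDef],
                    mappings: List[TagResourceMap]) -> str:
    """Build the compact tags_text JSON for a single resource."""
    entries = []
    for mapping in mappings:
        if mapping.resource_id != resource_id:
            continue
        tag_def = tag_defs[mapping.tag_def_id]
        entries.append({
            "id": tag_def.id,                 # from the tag definition
            "isEnabled": tag_def.is_enabled,  # from the tag definition
            "name": tag_def.name,             # from the tag definition
            "attributes": json.loads(mapping.tag_attrs),  # from the mapping row
        })
    return json.dumps(entries, separators=(",", ":"))


tag_defs = {1069576: TagDef(id=1069576, name="TAG1", is_enabled=True)}
mappings = [
    TagResourceMap(tag_def_id=1069576, resource_id=42,
                   tag_attrs='{"restricted1":"true"}'),
    TagResourceMap(tag_def_id=1069576, resource_id=43, tag_attrs="{}"),
]
tags_text = build_tags_text(42, tag_defs, mappings)
print(tags_text)
```

With this shape no per-tag row is needed: the fields shared by every use of a tag come from its definition, and only the attributes vary per resource.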