Madhan Neethiraj created RANGER-4741:
----------------------------------------

             Summary: Hive plugin optimization to avoid excessive metastore API calls
                 Key: RANGER-4741
                 URL: https://issues.apache.org/jira/browse/RANGER-4741
             Project: Ranger
          Issue Type: Improvement
          Components: plugins
            Reporter: Madhan Neethiraj
            Assignee: Madhan Neethiraj


Authorizing access to tables with a large number of columns can take a long time, as shown below. A query on a table with 4000 columns takes about 100 seconds.

{noformat}
0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
...
No rows selected (98.674 seconds)

0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
...
No rows selected (10.4 seconds)
{noformat}

For each column referenced in the query, the Ranger Hive authorizer calls the metastore API to obtain the owner of the table. Calling the metastore API only once per table can significantly reduce the time taken to authorize queries.

Here is the time taken to query the same tables with the Ranger Hive authorizer optimized to call the metastore API only once per table referenced in the query:

{noformat}
0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
...
No rows selected (1.328 seconds)

0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
...
No rows selected (0.194 seconds)
{noformat}
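
The once-per-table lookup described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual Ranger change: the class name TableOwnerCache, its methods, and the ownerLookup function (which stands in for the real metastore call) are all hypothetical. The cache is assumed to live for the duration of a single authorization request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Ranger: remembers the owner of each table
// for the duration of one authorization request, so the (expensive) lookup
// runs once per table instead of once per referenced column.
final class TableOwnerCache {
    // Stand-in for the metastore call; receives "db.table", returns the owner.
    private final Function<String, String> ownerLookup;
    private final Map<String, String> ownersByTable = new HashMap<>();
    private int lookupCount = 0;

    TableOwnerCache(Function<String, String> ownerLookup) {
        this.ownerLookup = ownerLookup;
    }

    // Returns the cached owner, calling ownerLookup only on the first request
    // for a given table. containsKey is used so a null owner is cached too.
    String ownerOf(String dbName, String tableName) {
        String key = dbName + "." + tableName;
        if (!ownersByTable.containsKey(key)) {
            lookupCount++;
            ownersByTable.put(key, ownerLookup.apply(key));
        }
        return ownersByTable.get(key);
    }

    // Number of times the underlying lookup was actually invoked.
    int lookupCount() {
        return lookupCount;
    }
}
```

With a cache like this, authorizing 4000 columns of one table would call ownerOf 4000 times but reach the underlying lookup only once. Scoping the cache to a single request avoids serving a stale owner if the table's ownership changes between queries.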