Madhan Neethiraj created RANGER-4741:
----------------------------------------
Summary: Hive plugin optimization to avoid excessive metastore API calls
Key: RANGER-4741
URL: https://issues.apache.org/jira/browse/RANGER-4741
Project: Ranger
Issue Type: Improvement
Components: plugins
Reporter: Madhan Neethiraj
Assignee: Madhan Neethiraj
Authorizing access to tables with a large number of columns can take a long
time, as shown below. For a table with 4000 columns, the query takes about 100
seconds.
{noformat}
0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
...
No rows selected (98.674 seconds)
0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
...
No rows selected (10.4 seconds)
{noformat}
For each column referenced in the query, the Ranger Hive authorizer calls the
metastore API to obtain the owner of the table. Calling the metastore API only
once per table can significantly reduce the time taken to authorize queries.
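The idea can be illustrated with a minimal sketch that memoizes the owner
lookup for the duration of one authorization request. This is not the actual
patch: TableOwnerCache, ownerLookup, and ownerOf are hypothetical names, and
the Function passed to the constructor is only a stand-in for the real
metastore call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-request cache of table owners (illustration only). */
final class TableOwnerCache {
    // Stand-in for the metastore call; the real plugin API is not shown here.
    private final Function<String, String> ownerLookup;
    private final Map<String, String> ownersByTable = new HashMap<>();
    private int lookupCount = 0;

    TableOwnerCache(Function<String, String> ownerLookup) {
        this.ownerLookup = ownerLookup;
    }

    /** Returns the owner, invoking the lookup at most once per table name. */
    String ownerOf(String qualifiedTableName) {
        // containsKey is used so that a null owner is cached as well.
        if (!ownersByTable.containsKey(qualifiedTableName)) {
            lookupCount++;
            ownersByTable.put(qualifiedTableName,
                              ownerLookup.apply(qualifiedTableName));
        }
        return ownersByTable.get(qualifiedTableName);
    }

    /** Number of times the underlying lookup was actually invoked. */
    int lookupCount() {
        return lookupCount;
    }
}

final class TableOwnerCacheDemo {
    public static void main(String[] args) {
        // The lambda simulates a metastore that always answers "hive".
        TableOwnerCache cache = new TableOwnerCache(table -> "hive");

        // Simulate authorizing 4000 columns of the same table.
        for (int column = 0; column < 4000; column++) {
            cache.ownerOf("default.large_tbl_4000");
        }
        System.out.println("lookups: " + cache.lookupCount()); // prints "lookups: 1"
    }
}
```

In this sketch the cache lives only as long as one request, so a change of
table ownership is picked up by the next query.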
Here is the time taken to query the same tables with the Ranger Hive authorizer
optimized to call the metastore API only once per table referenced in the query:
{noformat}
0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
...
No rows selected (1.328 seconds)
0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
...
No rows selected (0.194 seconds)
{noformat}