[
https://issues.apache.org/jira/browse/RANGER-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Madhan Neethiraj updated RANGER-4741:
-------------------------------------
Attachment: RANGER-4741.patch
> Hive plugin optimization to avoid excessive metastore API calls
> ---------------------------------------------------------------
>
> Key: RANGER-4741
> URL: https://issues.apache.org/jira/browse/RANGER-4741
> Project: Ranger
> Issue Type: Improvement
> Components: plugins
> Reporter: Madhan Neethiraj
> Assignee: Madhan Neethiraj
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: RANGER-4741.patch
>
>
> Authorizing access to tables with a large number of columns can take a long
> time, as shown below. A query on a table with 4000 columns takes about
> 100 seconds.
> {noformat}
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
> ...
> No rows selected (98.674 seconds)
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
> ...
> No rows selected (10.4 seconds)
> {noformat}
>
> For each column referenced in the query, the Ranger Hive authorizer calls
> the metastore API to obtain the owner of the table. Calling the metastore
> API once per table instead can significantly reduce the time taken to
> authorize queries.
> Here is the time taken to query the same tables with the Ranger Hive
> authorizer optimized to call the metastore API only once per table referenced
> in the query:
> {noformat}
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
> ...
> No rows selected (1.328 seconds)
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
> ...
> No rows selected (0.194 seconds)
> {noformat}
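The once-per-table lookup described above can be sketched as follows. This is a minimal illustration, not the code in RANGER-4741.patch: the class `TableOwnerCache`, its methods, and the `ownerLookup` function (a stand-in for the metastore call) are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: remember the table owner for the duration of one
// authorization request, so the expensive lookup runs once per table
// instead of once per column.
public class TableOwnerCache {
    // Stand-in for the metastore API call (an assumption, not Ranger's API).
    private final Function<String, String> ownerLookup;
    private final Map<String, String> ownersByTable = new HashMap<>();
    private int lookupCount = 0;

    public TableOwnerCache(Function<String, String> ownerLookup) {
        this.ownerLookup = ownerLookup;
    }

    // Returns the cached owner, calling ownerLookup only on the first
    // request for a given table name.
    public String getOwner(String tableName) {
        return ownersByTable.computeIfAbsent(tableName, name -> {
            lookupCount++;
            return ownerLookup.apply(name);
        });
    }

    // Number of times the underlying lookup was actually invoked.
    public int getLookupCount() {
        return lookupCount;
    }

    public static void main(String[] args) {
        TableOwnerCache cache = new TableOwnerCache(table -> "owner_of_" + table);
        // Simulate authorizing 4000 columns of the same table.
        for (int column = 0; column < 4000; column++) {
            cache.getOwner("default.large_tbl_4000");
        }
        System.out.println("owner lookups: " + cache.getLookupCount());
    }
}
```

In this sketch the cache is meant to be created per query, so an ownership change in the metastore is picked up by the next query rather than being served from stale state.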