[ 
https://issues.apache.org/jira/browse/RANGER-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Neethiraj updated RANGER-4741:
-------------------------------------
    Attachment: RANGER-4741.patch

> Hive plugin optimization to avoid excessive metastore API calls
> ---------------------------------------------------------------
>
>                 Key: RANGER-4741
>                 URL: https://issues.apache.org/jira/browse/RANGER-4741
>             Project: Ranger
>          Issue Type: Improvement
>          Components: plugins
>            Reporter: Madhan Neethiraj
>            Assignee: Madhan Neethiraj
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: RANGER-4741.patch
>
>
> Authorizing access to tables with a large number of columns can take a long 
> time, as shown below. A query on a table with 4000 columns takes about 
> 100 seconds.
> {noformat}
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
> ...
> No rows selected (98.674 seconds)
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
> ...
> No rows selected (10.4 seconds)
> {noformat}
>  
> For each column referenced in the query, the Ranger Hive authorizer calls the 
> metastore API to obtain the owner of the table. Calling the metastore API 
> only once per table can significantly reduce the time taken to authorize 
> queries.
> Here is the time taken to query the same tables with the Ranger Hive 
> authorizer optimized to call the metastore API only once per table referenced 
> in the query:
> {noformat}
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_4000;
> ...
> No rows selected (1.328 seconds)
> 0: jdbc:hive2://localhost:10000> SELECT * FROM large_tbl_1000;
> ...
> No rows selected (0.194 seconds)
> {noformat}
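
The once-per-table lookup described above can be illustrated with a small memoizing cache. This is only a sketch of the general technique, not the code in RANGER-4741.patch: the class name TableOwnerCache, its methods, and the Function standing in for the metastore call are all hypothetical, and the cache is assumed to live for a single authorization request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: resolves a table's owner at most once per
 * authorization request, however many columns of that table are checked.
 */
class TableOwnerCache {
    // Stand-in for the (expensive) metastore call; an assumption for illustration.
    private final Function<String, String> ownerLookup;

    // Owner already resolved during this request, keyed by "db.table".
    private final Map<String, String> ownerByTable = new HashMap<>();

    // Number of times the stand-in lookup was actually invoked.
    private int lookupCount = 0;

    TableOwnerCache(Function<String, String> ownerLookup) {
        this.ownerLookup = ownerLookup;
    }

    /** Returns the owner, invoking the lookup only on the first request per table. */
    String getOwner(String qualifiedTableName) {
        // Note: computeIfAbsent stores nothing when the lookup returns null,
        // so a table with no owner would be looked up again on the next call.
        return ownerByTable.computeIfAbsent(qualifiedTableName, name -> {
            lookupCount++;
            return ownerLookup.apply(name);
        });
    }

    int getLookupCount() {
        return lookupCount;
    }
}
```

With a cache like this, checking each of the 4000 columns of large_tbl_4000 would call getOwner("default.large_tbl_4000") 4000 times but invoke the underlying lookup only once; a second table in the same query adds one more lookup.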



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
