[ 
https://issues.apache.org/jira/browse/IMPALA-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9002 started by Quanlong Huang.
----------------------------------------------
> Add flag to only check SELECT privilege in GET_TABLES
> -----------------------------------------------------
>
>                 Key: IMPALA-9002
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9002
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Security
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> In Frontend.doGetTableNames(), if authorization is enabled, we only return
> tables on which the current user has ANY privilege:
> {code:java}
>   private List<String> doGetTableNames(String dbName, PatternMatcher matcher,
>       User user) throws ImpalaException {
>     FeCatalog catalog = getCatalog();
>     List<String> tblNames = catalog.getTableNames(dbName, matcher);
>     if (authzFactory_.getAuthorizationConfig().isEnabled()) {
>       Iterator<String> iter = tblNames.iterator();
>       while (iter.hasNext()) {
>         ......
>         PrivilegeRequest privilegeRequest = new PrivilegeRequestBuilder(
>             authzFactory_.getAuthorizableFactory())
>             // The ANY privilege is required here.
>             .any().onAnyColumn(dbName, tblName, tableOwner).build();
>         if (!authzChecker_.get().hasAccess(user, privilegeRequest)) {
>           iter.remove();
>         }
>       }
>     }
>     return tblNames;
>   } {code}
> In the Sentry integration, checking the ANY privilege checks each implied
> privilege in turn, i.e. ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT,
> REFRESH, until one is permitted. In the worst case, where the current user
> has no privilege on a table, we need to perform 8 checks on that table.
> {code:java}
> public enum Privilege {
>   ...
>   static {
>     ...
>     ANY.implied_ = EnumSet.of(ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT,
>         REFRESH); {code}
> GET_TABLES performance is poor when there are thousands of tables. It's
> reasonable to only return tables on which the current user has the SELECT
> privilege. Checking only the SELECT privilege can make GET_TABLES up to 8
> times faster. In my experiment on impala-2.12-cdh5.16.2 with 40k tables,
> GET_TABLES originally takes 16s when the current user only has privileges
> on 6 tables. With this change, the time drops to 2s.
> We can add a flag to only check the SELECT privilege for table visibility.
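
The cost argument above can be illustrated with a small, self-contained sketch. It is a toy model, not Impala's Frontend or authorization code: the class name, the selectOnly parameter (standing in for the proposed flag, whose real name is not given here), and the flat grant map are all assumptions made for illustration. It counts single-privilege lookups when filtering 40,000 tables of which 6 carry a SELECT grant.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of table filtering; not Impala's Frontend or authorization classes.
class TableVisibilitySketch {
  // The privileges that ANY expands to, in the order given in the description.
  enum Privilege { ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH }

  private final Map<String, Set<Privilege>> grants;  // table name -> granted privileges
  long checks = 0;  // number of single-privilege lookups performed so far

  TableVisibilitySketch(Map<String, Set<Privilege>> grants) {
    this.grants = grants;
  }

  // Stand-in for one lookup against the authorization provider (hypothetical).
  // Simplification: grants are flat, so ALL does not imply SELECT here.
  private boolean hasPrivilege(String table, Privilege p) {
    checks++;
    return grants.getOrDefault(table, Collections.emptySet()).contains(p);
  }

  // selectOnly plays the role of the proposed flag (the name is an assumption).
  List<String> visibleTables(List<String> tables, boolean selectOnly) {
    Set<Privilege> toCheck = selectOnly
        ? EnumSet.of(Privilege.SELECT)
        : EnumSet.allOf(Privilege.class);
    List<String> visible = new ArrayList<>();
    for (String table : tables) {
      for (Privilege p : toCheck) {
        if (hasPrivilege(table, p)) {  // stop at the first permitted privilege
          visible.add(table);
          break;
        }
      }
    }
    return visible;
  }

  public static void main(String[] args) {
    List<String> tables = new ArrayList<>();
    Map<String, Set<Privilege>> grants = new HashMap<>();
    for (int i = 0; i < 40_000; i++) tables.add("tbl" + i);
    for (int i = 0; i < 6; i++) grants.put("tbl" + i, EnumSet.of(Privilege.SELECT));

    TableVisibilitySketch anyMode = new TableVisibilitySketch(grants);
    TableVisibilitySketch selectMode = new TableVisibilitySketch(grants);
    System.out.println(anyMode.visibleTables(tables, false).size() + " tables, "
        + anyMode.checks + " checks with ANY");
    System.out.println(selectMode.visibleTables(tables, true).size() + " tables, "
        + selectMode.checks + " checks with SELECT only");
  }
}
```

In this model both modes return the same 6 tables, but the ANY mode performs 319,994 lookups against 40,000 for the SELECT-only mode, roughly the 8x ratio reported above. The trade-off is that a table on which the user holds only a non-SELECT grant (for example INSERT) is hidden in SELECT-only mode, which is presumably why the change is proposed behind a flag.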



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
