[
https://issues.apache.org/jira/browse/IMPALA-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on IMPALA-9002 started by Quanlong Huang.
----------------------------------------------
> Add flag to only check SELECT privilege in GET_TABLES
> -----------------------------------------------------
>
> Key: IMPALA-9002
> URL: https://issues.apache.org/jira/browse/IMPALA-9002
> Project: IMPALA
> Issue Type: Improvement
> Components: Security
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Major
>
> In Frontend.doGetTableNames(), if authorization is enabled, we only return
> tables on which the current user has ANY privilege:
> {code:java}
> private List<String> doGetTableNames(String dbName, PatternMatcher matcher,
>     User user) throws ImpalaException {
>   FeCatalog catalog = getCatalog();
>   List<String> tblNames = catalog.getTableNames(dbName, matcher);
>   if (authzFactory_.getAuthorizationConfig().isEnabled()) {
>     Iterator<String> iter = tblNames.iterator();
>     while (iter.hasNext()) {
>       ......
>       // Requires the ANY privilege here.
>       PrivilegeRequest privilegeRequest = new PrivilegeRequestBuilder(
>           authzFactory_.getAuthorizableFactory())
>           .any().onAnyColumn(dbName, tblName, tableOwner).build();
>       if (!authzChecker_.get().hasAccess(user, privilegeRequest)) {
>         iter.remove();
>       }
>     }
>   }
>   return tblNames;
> }
> {code}
> In the Sentry integration, checking the ANY privilege checks every possible
> privilege, i.e. ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH,
> until one is permitted. In the worst case, where the current user doesn't have
> any privilege on a table, we need to perform 8 checks on that table.
> {code:java}
> public enum Privilege {
>   ...
>   static {
>     ...
>     ANY.implied_ = EnumSet.of(ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT,
>         REFRESH); {code}
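To make the worst-case count concrete, here is a minimal, self-contained sketch of the "try each implied privilege until one is permitted" loop. It is not Impala or Sentry code: the class `AnyPrivilegeCost`, the method `checksNeeded`, and the simplified `Privilege` enum are hypothetical stand-ins, and grants are matched literally (a real authorizer would also let ALL imply SELECT).

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Impala.
public class AnyPrivilegeCost {
  // Simplified stand-in listing only the privileges named in the issue.
  enum Privilege { ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH }

  // Mirrors the ANY.implied_ set quoted above; EnumSet iterates in declaration order.
  static final EnumSet<Privilege> ANY_IMPLIED = EnumSet.allOf(Privilege.class);

  // Counts how many individual checks run before one is permitted or all have failed.
  static int checksNeeded(Set<Privilege> granted, Set<Privilege> candidates) {
    int checks = 0;
    for (Privilege p : candidates) {
      checks++;
      if (granted.contains(p)) {
        break; // Stop at the first permitted privilege.
      }
    }
    return checks;
  }

  public static void main(String[] args) {
    Set<Privilege> none = EnumSet.noneOf(Privilege.class);
    // No privileges at all: every candidate is tried.
    System.out.println(checksNeeded(none, ANY_IMPLIED));                    // prints 8
    // Checking SELECT alone costs a single check.
    System.out.println(checksNeeded(none, EnumSet.of(Privilege.SELECT)));   // prints 1
  }
}
```

Under these assumptions, a user with no privileges on a table costs 8 checks with ANY and 1 check with SELECT only, which is where the factor of 8 comes from.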
> GET_TABLES performance is poor when there are thousands of tables. It's
> reasonable to only return tables on which the current user has the SELECT
> privilege. Checking only the SELECT privilege can make GET_TABLES up to 8
> times faster. In my experiment on impala-2.12-cdh5.16.2 with 40k tables,
> GET_TABLES originally takes 16s when the current user has privileges on only 6
> tables. With this change, the time drops to 2s.
> We can add a flag to only check the SELECT privilege for table visibility.
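One possible shape for such a flag, as a rough sketch only. The class `TableVisibilityFilter`, the method `visibleTables`, and the boolean `selectOnly` are hypothetical names for illustration; the issue does not specify the flag's name or where it would be read, and the grant lookup here is a plain in-memory map rather than a real authorization check.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the actual Impala change.
public class TableVisibilityFilter {
  enum Privilege { ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH }

  // Returns the tables the user may see. When selectOnly is true, only SELECT
  // is checked; otherwise every privilege is tried until one is granted.
  static List<String> visibleTables(List<String> tblNames,
      Map<String, Set<Privilege>> grants, boolean selectOnly) {
    Set<Privilege> required = selectOnly
        ? EnumSet.of(Privilege.SELECT)
        : EnumSet.allOf(Privilege.class);
    List<String> visible = new ArrayList<>();
    for (String tbl : tblNames) {
      Set<Privilege> granted =
          grants.getOrDefault(tbl, EnumSet.noneOf(Privilege.class));
      for (Privilege p : required) {
        if (granted.contains(p)) {
          visible.add(tbl);
          break; // One matching privilege is enough for visibility.
        }
      }
    }
    return visible;
  }
}
```

In this sketch grants are matched literally, so with `selectOnly` set a table on which the user holds only INSERT is no longer listed. A real implementation would need to decide whether that change in visibility is acceptable, which is presumably why the behavior would sit behind a flag.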