[ https://issues.apache.org/jira/browse/IMPALA-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004373#comment-17004373 ]
ASF subversion and git services commented on IMPALA-9195:
---------------------------------------------------------
Commit 05dfb208ff32d434399cb18219425467b0f9b2a9 in impala's branch
refs/heads/master from xuzhou
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=05dfb20 ]
IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'
If Sentry authorization is enabled, 'show tables/databases' can take a long
time for users who belong to multiple group policies. The cause appears to be
that ResourceAuthorizationProvider.hasAccess performs poorly for users with
complex group policies; IMPALA-9242 is intended to address that underlying
problem.
This patch adds a config option, 'num_check_authorization_threads', that
accelerates 'show tables/databases' by checking access with multiple threads.
The option applies only when authorization is enabled. A value of 1 disables
multi-threaded access checking. However, a small value larger than 1 may
limit the parallelism of FE requests when authorization is checked under high
concurrency. The value must be in the range 1 to 128. The default value of
'num_check_authorization_threads' is 1.
Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Reviewed-on: http://gerrit.cloudera.org:8080/14846
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
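The approach described in the commit message can be illustrated with a minimal
sketch. This is not the code from the patch: the class ParallelAccessFilter,
the method filterVisible, and the hasAccess predicate are hypothetical names
standing in for the frontend's per-table authorization check, and the sketch
assumes a thread pool created per call, which may differ from how Impala
manages its threads. Only the 1 to 128 range and the meaning of the value 1
are taken from the commit message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/** Hypothetical sketch of parallel access checking; not Impala's Frontend code. */
class ParallelAccessFilter {
  // Range stated in the commit message for the thread-count option.
  static final int MIN_THREADS = 1;
  static final int MAX_THREADS = 128;

  /**
   * Returns the names for which hasAccess is true, preserving input order.
   * hasAccess is an assumed stand-in for the expensive authorization check.
   */
  static List<String> filterVisible(
      List<String> names, Predicate<String> hasAccess, int numThreads) {
    if (numThreads < MIN_THREADS || numThreads > MAX_THREADS) {
      throw new IllegalArgumentException(
          "numThreads must be in the range 1 to 128, got " + numThreads);
    }
    List<String> visible = new ArrayList<>();
    if (numThreads == 1) {
      // A value of 1 means no multi-threading: check each name in turn.
      for (String name : names) {
        if (hasAccess.test(name)) visible.add(name);
      }
      return visible;
    }
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      // Submit one check per name; the futures stay in input order.
      List<Future<Boolean>> results = new ArrayList<>();
      for (String name : names) {
        Callable<Boolean> task = () -> hasAccess.test(name);
        results.add(pool.submit(task));
      }
      // Collect results in the same order the names were given.
      for (int i = 0; i < names.size(); i++) {
        if (results.get(i).get()) visible.add(names.get(i));
      }
      return visible;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while checking access", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("access check failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

Calling ParallelAccessFilter.filterVisible(names, check, 16) would run up to 16
checks at once, which is the kind of setup the reporter describes trying. The
sketch only helps when the check itself is safe to call from several threads.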
> Using multithreaded execution to accelerate ‘show tables/databases’
> -------------------------------------------------------------------
>
> Key: IMPALA-9195
> URL: https://issues.apache.org/jira/browse/IMPALA-9195
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: xuzhou
> Assignee: xuzhou
> Priority: Critical
>
> Impala version: 2.12
> Using Sentry for authorization
> When a user with multiple group policies (group policies may be nested)
> executes 'show tables/databases', the latency is very long. In my case,
> the database has 910 tables, and the user waited 65.886 seconds to get 160
> tables.
> I studied the code and found that Frontend.getTableNames executes:
>   for table in tables:
>     for action in actions (all actions defined in DBModelAction):
>       ResourceAuthorizationProvider.hasAccess
> It seems that 'hasAccess' is responsible for the bad performance when
> checking users with complex group policies.
> I tried using 16 threads in getTableNames, and it took 4.752 seconds in my
> case.
> The code seems to be the same when using the Sentry service in the latest
> Impala. I'm not sure whether any improvement has been made in the latest
> Sentry service, as I failed to migrate from file-based Sentry authorization
> to the Sentry service. I see that Ranger is supported in the latest Impala;
> does Ranger have a similar problem?
> It seems 'show tables/databases' can benefit from multithreaded execution
> when using Sentry. Is it reasonable to support such operations in the query
> option MT_DOP?
>
>