[ 
https://issues.apache.org/jira/browse/IMPALA-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004373#comment-17004373
 ] 

ASF subversion and git services commented on IMPALA-9195:
---------------------------------------------------------

Commit 05dfb208ff32d434399cb18219425467b0f9b2a9 in impala's branch 
refs/heads/master from xuzhou
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=05dfb20 ]

IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'

If Sentry authorization is enabled, users with multiple group policies
may wait a long time for the result of 'show tables/databases'. It seems
that ResourceAuthorizationProvider.hasAccess performs poorly for users
with complex group policies; IMPALA-9242 will address that underlying
problem.

This patch provides a config option 'num_check_authorization_threads' to
accelerate 'show tables/databases' by using multithreading. This configuration
is applicable only when authorization is enabled. A value of 1 disables
multithreaded execution for checking access. However, a small value larger
than 1 may limit the parallelism of FE requests when authorization is checked
under high concurrency. The value must be in the range of 1 to 128. The default
value of 'num_check_authorization_threads' is 1.
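
The idea can be sketched as follows. This is a minimal illustration in Python,
not the patch itself (the Impala frontend is written in Java). The names
filter_accessible, has_access and num_threads are hypothetical; only the
1 to 128 range and the "1 means serial" behavior come from the message above.

```python
from concurrent.futures import ThreadPoolExecutor

# Range stated in the commit message for the thread-count option.
MIN_THREADS, MAX_THREADS = 1, 128


def filter_accessible(names, has_access, num_threads=1):
    """Return the names for which has_access(name) is true, in input order.

    has_access is a hypothetical stand-in for the per-object authorization
    check; it is an assumption of this sketch, not an Impala API.
    """
    if not MIN_THREADS <= num_threads <= MAX_THREADS:
        raise ValueError("num_threads must be in the range 1 to 128")
    names = list(names)
    if num_threads == 1:
        # A value of 1 disables multithreading: check each name serially.
        return [name for name in names if has_access(name)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in input order, so the listing stays stable.
        allowed = list(pool.map(has_access, names))
    return [name for name, ok in zip(names, allowed) if ok]


if __name__ == "__main__":
    tables = ["t%d" % i for i in range(20)]
    print(filter_accessible(tables, lambda n: n.endswith("0"), num_threads=4))
```

The sketch only shows the structure (validate the thread count, fall back to a
serial loop, otherwise fan the checks out to a fixed pool). In Python the
threads help only if the check waits on I/O, so no timing conclusions should
be drawn from it.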

Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Reviewed-on: http://gerrit.cloudera.org:8080/14846
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Using multithreaded execution to accelerate ‘show tables/databases’
> -------------------------------------------------------------------
>
>                 Key: IMPALA-9195
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9195
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: xuzhou
>            Assignee: xuzhou
>            Priority: Critical
>
> Impala version: 2.12
> Using Sentry for authorization
> When a user with multiple group policies (group policies may be nested)
> executes 'show tables/databases', the latency is very long. In my case,
> the database has 910 tables, and the user waited 65.886 seconds to get 160
> tables.
> I studied the code and found that Frontend.getTableNames does the following:
> for table in tables:
>     for action in actions:  # all actions defined in DBModelAction
>         ResourceAuthorizationProvider.hasAccess(...)
> It seems that 'hasAccess' is responsible for the poor performance when
> checking users with complex group policies.
> I tried using 16 threads in getTableNames, and it took 4.752 seconds in my
> case.
> The code seems to be the same when using the Sentry service in the latest
> Impala. I'm not sure whether any improvement has been made in the latest
> Sentry service, as I failed to migrate file-based Sentry authorization to
> the Sentry service. I see that Ranger is supported in the latest Impala;
> does Ranger have a similar problem?
> It seems 'show tables/databases' can benefit from multithreaded execution
> when using Sentry. Is it reasonable to support such operations in the query
> option MT_DOP?
>     
>         
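
As a rough check on the figures quoted in the issue, the speedup and
per-thread efficiency can be worked out directly. The variable names are
illustrative, and the inputs are the reporter's single measurements, so the
result is only an estimate.

```python
serial_seconds = 65.886    # reported latency without threading
threaded_seconds = 4.752   # reported latency with 16 threads
num_threads = 16

speedup = serial_seconds / threaded_seconds
efficiency = speedup / num_threads  # fraction of ideal linear scaling

print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.0%}")
```

That works out to a speedup of about 13.9x, or roughly 87% of ideal linear
scaling for 16 threads.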



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
