xuzhou created IMPALA-9195:
------------------------------

             Summary: Using multithreaded execution to accelerate ‘show 
tables/databases’
                 Key: IMPALA-9195
                 URL: https://issues.apache.org/jira/browse/IMPALA-9195
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: xuzhou


Impala version: 2.12

Sentry is used for authorization.

When a user who belongs to multiple group policies (which may be nested) runs 
'show tables' or 'show databases', the latency is very high. In my case, the 
database has 910 tables, and the user waited 65.886 seconds to get 160 tables. 
 

I studied the code and found that Frontend.getTableNames effectively does the 
following:

for table in tables:
    for action in actions:    (all actions defined in DBModelAction)
        ResourceAuthorizationProvider.hasAccess

'hasAccess' appears to be responsible for the poor performance when checking 
users with complex group policies. 

I tried using 16 threads in getTableNames, and the same request then took 
4.752 seconds in my case.
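
The change described above can be sketched roughly as follows. This is only an illustration of the approach, not the actual Impala patch: the class ParallelVisibilityFilter, the method filterVisible, and the isVisible predicate are hypothetical names, with isVisible standing in for the per-table loop over actions that calls hasAccess. It assumes the authorization check is thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical helper, not part of Impala: runs one visibility check per
// table name on a fixed-size thread pool and keeps the names that pass.
final class ParallelVisibilityFilter {

  // isVisible is an assumed stand-in for the authorization check
  // (the loop over actions calling hasAccess); it must be thread-safe.
  static List<String> filterVisible(List<String> names,
      Predicate<String> isVisible, int numThreads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      // Submit one task per name; futures are stored in input order.
      List<Future<Boolean>> results = new ArrayList<>(names.size());
      for (String name : names) {
        Callable<Boolean> check = () -> isVisible.test(name);
        results.add(pool.submit(check));
      }
      // Collect in input order so the output ordering is unchanged.
      List<String> visible = new ArrayList<>();
      for (int i = 0; i < names.size(); i++) {
        if (results.get(i).get()) {
          visible.add(names.get(i));
        }
      }
      return visible;
    } finally {
      pool.shutdown();
    }
  }
}
```

With numThreads set to 16 this mirrors the experiment above. Creating a pool per call keeps the sketch short; a real change would more likely reuse a shared pool.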

The code appears to be the same in the latest Impala when using the Sentry 
service. I'm not sure whether any improvement has been made in the latest 
Sentry service, as I failed to migrate from file-based Sentry authorization to 
the Sentry service. I see that Ranger is supported in the latest Impala; does 
Ranger have a similar problem? 

It seems that 'show tables/databases' can benefit from multithreaded execution 
when using Sentry. Is it reasonable to support such operations through the 
MT_DOP query option?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
