Stephen Harrison created AIRFLOW-6729:
-----------------------------------------

             Summary: LDAP Authorisation very slow on large AD
                 Key: AIRFLOW-6729
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6729
             Project: Apache Airflow
          Issue Type: Bug
          Components: authentication
    Affects Versions: 1.10.2
         Environment: Centos 7.6, Python 3.6, AWS EC2 t2.medium
            Reporter: Stephen Harrison


The current code in ldap_auth.py is very slow on my organisation's AD forest. 
The LDAP query used by this code, when tested using ldapsearch on the 
command line, takes around 30 seconds to execute. This considerably affects the 
perceived performance of the UI, as the LDAP authorisation code appears to be 
executed on every server round trip.

I think the main problem lies in the function group_contains_user, which 
searches for the group and then checks whether the user exists within the 
returned group. Because groups and users are stored in different trees of AD, 
this query has no choice but to use their common root as the search base, and 
hence in our case takes a long time.

I have modified the above function locally to add the user query at the end of 
the search for the group, i.e.:

{{def group_contains_user(conn, search_base, group_filter, user_name_attr, username):}}
{{    search_filter = '(&({0})({1}={2}))'.format(group_filter, user_name_attr, username)}}

the original search filter was:

{{    search_filter = '(&({0}))'.format(group_filter)}}

I have not changed the function signature, but the search now returns the 
answer in a fraction of a second, e.g. 0.1 seconds when using ldapsearch to 
perform the same query.
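
To make the difference between the two filters concrete, here is a small 
sketch that renders both for one set of inputs. The helper names 
original_filter and narrowed_filter, the group DN, the attribute 
sAMAccountName and the username jdoe are all made up for illustration; they 
are not taken from my configuration or from ldap_auth.py.

```python
def original_filter(group_filter):
    # Filter as built by the existing code: matches on the group filter only.
    return '(&({0}))'.format(group_filter)


def narrowed_filter(group_filter, user_name_attr, username):
    # Filter as built by the local modification: the group filter AND the user.
    return '(&({0})({1}={2}))'.format(group_filter, user_name_attr, username)


# Hypothetical example values, for illustration only.
group_filter = 'memberOf=CN=airflow-users,OU=Groups,DC=example,DC=com'

print(original_filter(group_filter))
# (&(memberOf=CN=airflow-users,OU=Groups,DC=example,DC=com))

print(narrowed_filter(group_filter, 'sAMAccountName', 'jdoe'))
# (&(memberOf=CN=airflow-users,OU=Groups,DC=example,DC=com)(sAMAccountName=jdoe))
```

With the second filter the directory server can restrict the result to the 
one user being checked rather than returning every entry that matches the 
group filter, which is presumably why it answers so much faster.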

Note that this also makes the loop at the end of the function mostly redundant, 
as a check that an entry has been returned would be sufficient.
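
A rough sketch of what the function could look like without the loop is 
below. It is not a tested patch. It assumes conn behaves like an ldap3 
Connection using a synchronous strategy, i.e. that search() returns a truthy 
value only when at least one entry matched and that the results are exposed as 
conn.entries. Logging and the native() string wrapping used in the real module 
are left out for brevity.

```python
def group_contains_user(conn, search_base, group_filter, user_name_attr, username):
    # Combine the group filter and the user lookup into a single LDAP query.
    search_filter = '(&({0})({1}={2}))'.format(
        group_filter, user_name_attr, username)

    # Assumption: search() is truthy only if the server returned an entry.
    found = conn.search(search_base, search_filter,
                        attributes=[user_name_attr])

    # No loop over the entries: any returned entry means the user matched.
    return bool(found) and len(conn.entries) > 0
```

Because username comes from the login form, a real patch would probably also 
want to escape it before it is placed in the filter, for example with 
escape_filter_chars from ldap3.utils.conv.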



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
