Stephen Harrison created AIRFLOW-6729:
-----------------------------------------
Summary: LDAP Authorisation very slow on large AD
Key: AIRFLOW-6729
URL: https://issues.apache.org/jira/browse/AIRFLOW-6729
Project: Apache Airflow
Issue Type: Bug
Components: authentication
Affects Versions: 1.10.2
Environment: Centos 7.6, Python 3.6, AWS EC2 t2.medium
Reporter: Stephen Harrison
The current code in ldap_auth.py is very slow against my organisation's AD forest.
The LDAP query used by this code, when tested using ldapsearch on the
command line, takes around 30 seconds to execute. This considerably degrades the
perceived performance of the UI, as the LDAP authorisation code appears to be
executed on every server round trip.
I think the main problem lies in the function group_contains_user, which
searches for the group and then checks whether the user exists within the
returned group. Because groups and users are stored in different trees of AD,
the query has no choice but to use their common root as the search base, and
hence in our case takes a long time.
I have modified the above function locally to add the user query at the end of
the search for the group, i.e.:
{{def group_contains_user(conn, search_base, group_filter, user_name_attr, username):}}
{{    search_filter = '(&({0})({1}={2}))'.format(group_filter, user_name_attr, username)}}
The original search filter was:
{{    search_filter = '(&({0}))'.format(group_filter)}}
I have not changed the function signature, but the search now returns the answer
in a fraction of a second, e.g. 0.1 seconds using ldapsearch to perform the same
query.
Note that this also makes the loop at the end of the function mostly redundant, as a
check that an entry has been returned would be sufficient.
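A minimal sketch of the simplified function described above, for illustration only. It assumes conn is an ldap3 Connection using a synchronous strategy, where search() returns a truthy value when at least one entry matches. The helper build_group_user_filter is a hypothetical name introduced here and is not part of ldap_auth.py; logging and error handling from the real module are omitted.

```python
def build_group_user_filter(group_filter, user_name_attr, username):
    """Combine the group filter with a clause that matches the user.

    Hypothetical helper: it only performs the string formatting
    proposed in this issue.
    """
    return '(&({0})({1}={2}))'.format(group_filter, user_name_attr, username)


def group_contains_user(conn, search_base, group_filter, user_name_attr, username):
    """Return True if the directory holds a group entry that lists the user.

    The signature is unchanged from the description above. Because the
    user clause is part of the filter, the server does the matching and
    no loop over the returned entries is needed.
    """
    search_filter = build_group_user_filter(group_filter, user_name_attr, username)
    # Assumption: conn.search() is truthy when at least one entry is found.
    return bool(conn.search(search_base, search_filter,
                            attributes=[user_name_attr]))
```

With a group filter such as cn=airflow-users, an attribute of member and a username of jdoe (all made-up values), the filter sent to the server would be (&(cn=airflow-users)(member=jdoe)).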
--
This message was sent by Atlassian Jira
(v8.3.4#803005)