alejandro-rivera opened a new pull request, #39935:
URL: https://github.com/apache/airflow/pull/39935

   ## Description
   
   This PR fixes a bug that makes `AirflowSecurityManagerV2` leave some 
transactions in the `idle in transaction` state.
   
   The problem is due to the combined usage of the `@cached_property` and 
`@provide_session` decorators on top of the `_auth_manager_is_authorized_map` 
method: the `@cached_property` decorator makes the `create_session` context 
manager get consumed and not be available to wrap the nested functions.
   
   The fix presented here just makes sure each of the nested functions has a 
clean session to work with, which gets properly committed and closed.
   
   ## Impact
   
   This problem affects views that perform fine-grained access checks using 
resource ID (for example, editing variables, editing connections, etc).
   
   Apart from the negative impact of having orphaned transactions in the `idle 
in transaction` state, this problem also may lead to 
`sqlalchemy.exc.PendingRollbackError` exceptions whilst trying to edit a 
resource.
   
   The sqlalchemy pool (using the default settings) seems to just chug along 
reusing the connections that have orphaned transactions, until they get 
invalidated. Then, if the `Session` object attached to the security manager's 
nested functions gets to use one of those invalidated connections, an exception 
like the following exception is thrown:
   ```
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", 
line 2529, in wsgi_app
       response = self.full_dispatch_request()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", 
line 1825, in full_dispatch_request
       rv = self.handle_user_exception(e)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", 
line 1823, in full_dispatch_request
       rv = self.dispatch_request()
            ^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", 
line 1799, in dispatch_request
       return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/airflow/www/decorators.py", 
line 159, in wrapper
       return f(*args, **kwargs)
              ^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/airflow/www/auth.py", line 
110, in wraps
       if permission_str in self.base_permissions and 
self.appbuilder.sm.has_access(
                                                      
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/airflow/www/security_manager.py",
 line 142, in has_access
       return is_authorized_method(action_name, resource_pk, user)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/airflow/www/security_manager.py",
 line 318, in <lambda>
       details=VariableDetails(key=get_variable_key(resource_pk)),
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/airflow/www/security_manager.py",
 line 227, in get_variable_key
       variable = session.scalar(select(Variable).where(Variable.id == 
resource_pk).limit(1))
                  
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", 
line 1747, in scalar
       return self.execute(
              ^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", 
line 1717, in execute
       result = conn._execute_20(statement, params or {}, execution_options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", 
line 1710, in _execute_20
       return meth(self, args_10style, kwargs_10style, execution_options)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", 
line 334, in _execute_on_connection
       return connection._execute_clauseelement(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", 
line 1577, in _execute_clauseelement
       ret = self._execute_context(
             ^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", 
line 1808, in _execute_context
       conn = self._revalidate_connection()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", 
line 650, in _revalidate_connection
       self._invalid_transaction()
     File 
"/home/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", 
line 622, in _invalid_transaction
       raise exc.PendingRollbackError(
   sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid 
transaction is rolled back. (Background on this error at: 
https://sqlalche.me/e/14/8s2b)
   ```
   
   Tweaking the sqlalchemy pool settings and using pgbouncer also seem to play 
a role in mitigating this issue. In any case, there are already some users 
reporting errors editing resources. See 
https://github.com/apache/airflow/issues/39581.
   
   I'm not familiar with the connection invalidation mechanics within 
sqlalchemy pools, but that's definitely a component in generating errors like 
the above.
   
   I stumbled upon this problem in a Postgres-based environment after upgrading 
to Airflow 2.9.1 (Python 3.11). I don't know if other DB backends are equally 
affected.
   
   ## Reproducing
   
   1. Spin up an Airflow environment using the official docker-compose file
   2. Make the adjustments to have access to the postgres container from your 
local machine
   3. Monitor the `idle in transaction` transactions:
   ```sh
   watch -n 1 'psql postgresql://airflow:airflow@postgres-container/airflow -P 
pager=off -c "select * from pg_stat_activity where state = '"'idle in 
transaction'"'"'
   ```
   4. Add various resources (variables, connections, etc) through the web UI
   5. Edit the above resources in the web UI
   6. Observe the `idle in transaction` monitor command show orphaned 
transactions that won't go away...until the webserver is restarted
   
   The problem goes away by applying the code in this PR.
   
   <!--
    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at
   
      http://www.apache.org/licenses/LICENSE-2.0
   
    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.
    -->
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of an existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   
   
   <!-- Please keep an empty line above the dashes. -->
   ---
   **^ Add meaningful description above**
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to