coolacid opened a new issue #16271:
URL: https://github.com/apache/airflow/issues/16271


   **Apache Airflow version**: apache/airflow:2.0.2-python3.6
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl 
version`): None
   
   **Environment**: Docker
   
   - **Cloud provider or hardware configuration**: Desktop machine, docker 
instance
   - **OS** (e.g. from /etc/os-release): 
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   Dangling database connection within the celery executor. 
   
   ```
   mysql_1      | 2021-06-04T16:02:07.342024Z 3016 [Note] Got an error reading 
communication packets
   ```
   and 
   
   ```
   mysql_1      | 2021-06-04T02:20:19.124785Z 487 [Note] Aborted connection 487 
to db: 'airflow' user: 'root' host: '10.10.4.6' (Got an error reading 
communication packets)
   ```
   
   **What you expected to happen**:
   
   All connections closed. 
   
   **How to reproduce it**:
   
   Very simple Docker-compose setup, monitor mysql log, restart the celery 
worker, warning/error displays in MySQL logs. 
   
   **Anything else we need to know**:
   
   First, I added a debug message (`log.debug(engine.pool.status())`) to the 
`dispose_orm()` function in airflow.settings within the `if engine` block, this 
shows us the state of pools during the atexit ORM disposal. The result of this 
shows the following:
   
   ```
   af_worker    | [2021-06-04 15:42:03,284: DEBUG/MainProcess] Disposing DB 
connection pool (PID 7)
   af_worker    | [2021-06-04 15:42:03,285: WARNING/MainProcess] Pool size: 5  
Connections in pool: 1 Current Overflow: -4 Current Checked out connections: 0
   ```
   This shows that after the `Session.remove()` call, there still exists one 
connection in the pool. Calling `engine.dopose()` does not release this 
connection since it's currently "active", it just unlinks it from the engine 
pool. 
   
   To investigate further, we can look at the active process list within MySQL, 
here we see one connection still coming from our worker docker container: 
(Note, removed other connections not related)
   
   ```
   mysql> show processlist;
   
+------+------+-----------------+---------+---------+------+----------+------------------+
   | Id   | User | Host            | db      | Command | Time | State    | Info 
            |
   
+------+------+-----------------+---------+---------+------+----------+------------------+
   | 3017 | root | 10.10.4.6:40242 | airflow | Sleep   |   12 |          | NULL 
            |
   ```
   
   We can see what queries are run by any connection by enabling the 
`general_log` in mysql with `SET GLOBAL general_log = 'ON';`
   
   This yields the following query: 
   
   ```
   2021-06-04T16:02:09.215732Z      3017 Connect   [email protected] on airflow 
using TCP/IP
   2021-06-04T16:02:09.215907Z      3017 Query     set autocommit=0
   2021-06-04T16:02:09.216057Z      3017 Query     SET NAMES utf8mb4
   2021-06-04T16:02:09.216276Z      3017 Query     SHOW VARIABLES LIKE 
'sql_mode'
   2021-06-04T16:02:09.217518Z      3017 Query     SHOW VARIABLES LIKE 
'lower_case_table_names'
   2021-06-04T16:02:09.218483Z      3017 Query     SELECT VERSION()
   2021-06-04T16:02:09.218672Z      3017 Query     SELECT DATABASE()
   2021-06-04T16:02:09.218833Z      3017 Query     SELECT 
@@transaction_isolation
   2021-06-04T16:02:09.219010Z      3017 Query     show collation where 
`Charset` = 'utf8mb4' and `Collation` = 'utf8mb4_bin'
   2021-06-04T16:02:09.220013Z      3017 Query     SELECT CAST('test plain 
returns' AS CHAR(60)) AS anon_1
   2021-06-04T16:02:09.220295Z      3017 Query     SELECT CAST('test unicode 
returns' AS CHAR(60)) AS anon_1
   2021-06-04T16:02:09.220652Z      3017 Query     SELECT CAST('test collated 
returns' AS CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin AS anon_1
   2021-06-04T16:02:09.220778Z      3017 Query     rollback
   2021-06-04T16:02:09.220941Z      3017 Query     SET NAMES utf8mb4
   2021-06-04T16:02:09.221058Z      3017 Query     SET time_zone = '+00:00'
   2021-06-04T16:02:09.222799Z      3017 Query     INSERT INTO log (dttm, 
dag_id, task_id, event, execution_date, owner, extra) VALUES ('2021-06-04 
16:02:09.210482', NULL, NULL, 'cli_worker', NULL, 'airflow', '{\"host_name\": 
\"469064a5afb7\", \"full_command\": \"[\'/home/airflow/.local/bin/airflow\', 
\'celery\', \'worker\']\"}')
   2021-06-04T16:02:09.223199Z      3017 Query     commit
   2021-06-04T16:02:09.226052Z      3017 Query     rollback
   ```
   
   This log insert matches with pcaps I previously captured showing the log 
insert connection without a matching `Quit` MySQL command. 
   
   My suspicion is, the addition of this log is not releasing the DB session. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to