[ https://issues.apache.org/jira/browse/AIRFLOW-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066365#comment-17066365 ]
Nguyen Lam Phuc commented on AIRFLOW-6912:
------------------------------------------

[~ash] Is there a way for us to share the DAG with you securely? Via email, maybe?

> Airflow unable to run concurrent ssh tasks (up to 27) in a single dag
> ---------------------------------------------------------------------
>
>                 Key: AIRFLOW-6912
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6912
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executors, scheduler
>   Affects Versions: 1.10.9
>         Environment: Kubernetes 1.13
>            Reporter: Nguyen Lam Phuc
>            Priority: Major
>         Attachments: version_1_10_7-working.png, version_1_10_9_break.png
>
>
> *Current working Airflow version:* 1.10.7
> *Environment:* Kubernetes 1.13, Helm chart 5.2.4
> *Airflow version that breaks:* 1.10.9
> *Description:*
> * We have a list of _ssh_operators_ tasks in a dag that need to be executed
> in parallel (as shown in the screenshot), and everything works fine on
> Airflow version _1.10.7_.
> * We tried to update Airflow to version _1.10.9_, and the tasks fail in
> random order and in varying numbers (as shown in the screenshot).
> * Here are some of the errors that we collected:
> ** (psycopg2.OperationalError) FATAL: remaining connection slots are
> reserved for non-replication superuser connections
> ** Executor reports task instance <TaskInstance: <dag_name>.<task_name>
> 2020-02-20 12:20:00+00:00 [queued]> finished (failed) although the task says
> its queued. Was the task killed externally?
> ** <TaskInstance: <dag_name>.<task_name> 2020-02-20 08:20:00+00:00
> [running]> detected as zombie
> ** (psycopg2.OperationalError) FATAL: remaining connection slots are
> reserved for non-replication superuser connections
> *Actions taken:*
> * We suspected that there were not enough database connections, so we
> increased the _AIRFLOW__CORE__SQL_ALCHEMY_POOL_SIZE_ value from 5 to 50, but
> the problem still persists.
> * We reverted to version _1.10.7_ and everything works as normal.
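
The pool-size change described under "Actions taken" can be sanity-checked with a rough upper-bound estimate. The sketch below is only illustrative. It assumes that every connecting process keeps its own SQLAlchemy pool, that the 27 concurrent tasks each run in a separate process, and that PostgreSQL uses its default `max_connections` (100) and `superuser_reserved_connections` (3). None of this is confirmed for the deployment in this issue, it does not claim to describe how Airflow 1.10.9 manages connections, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of worst-case PostgreSQL connection usage.
# ASSUMPTIONS (not taken from the issue): one SQLAlchemy pool per process,
# one process per concurrent task, PostgreSQL defaults for connection limits.


def estimate_peak_connections(processes: int, pool_size: int, max_overflow: int = 0) -> int:
    """Upper bound: every process fills its own pool (plus any overflow)."""
    return processes * (pool_size + max_overflow)


def usable_slots(max_connections: int, superuser_reserved: int) -> int:
    """Slots available to ordinary (non-superuser) clients."""
    return max_connections - superuser_reserved


if __name__ == "__main__":
    concurrent_tasks = 27  # from the issue title
    limit = usable_slots(max_connections=100, superuser_reserved=3)  # assumed defaults

    for pool_size in (5, 50):  # values mentioned under "Actions taken"
        peak = estimate_peak_connections(concurrent_tasks, pool_size)
        status = "exceeds" if peak > limit else "fits within"
        print(f"pool_size={pool_size}: worst case {peak} connections, {status} {limit} usable slots")
```

Under these assumptions, raising the pool size from 5 to 50 raises the worst case from 135 to 1350 connections against 97 usable slots, which would be consistent with the "remaining connection slots are reserved" error persisting after the change. It says nothing about why 1.10.7 works and 1.10.9 does not.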
-- This message was sent by Atlassian Jira (v8.3.4#803005)