[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error
[ https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708512#comment-16708512 ]

Yuvaraj commented on AIRFLOW-3405:
----------------------------------

We are in the process of upgrading to 1.10.1. Will post an update on whether this is fixed in the latest version.

> Task instance fail intermittently due to MySQL error
> ----------------------------------------------------
>
>                 Key: AIRFLOW-3405
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3405
>             Project: Apache Airflow
>          Issue Type: Improvement
>         Environment: MySQL, Redhat Linux
>            Reporter: Yuvaraj
>            Priority: Major
>              Labels: performance, usability
>
> DAGs are failing intermittently with the error below.
> OperationalError: (_mysql_exceptions.OperationalError) (1040, 'Too many connections')
> [2018-11-25 12:24:16,952] - Heartbeat time limited exceeded!
> We have max_connections defined as 2000 in the DB.
> Below are the settings in the cfg.
> sql_alchemy_pool_size = 1980
> sql_alchemy_pool_recycle = 3600
> As per our DBA, the Airflow scheduler keeps opening connections to the database.
> These connections are mostly idle and get reset whenever the scheduler
> restarts, but with max_connections at 2000 and the scheduler holding on to 1600 of
> these, other apps trying to connect might start running out of connections.
> How do we remediate these idle connections? What are the optimal values
> for these configs and for max_connections on the DB? Consider that we need
> to build a large environment serving 500+ definitions with 1+ runs per
> day. Need suggestions...

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
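One way to put a number on the idle connections the DBA describes is to tally the output of MySQL's `SHOW PROCESSLIST`, where a `Command` value of `Sleep` marks a connection waiting for its client to send a statement. The sketch below is only an illustration: the function name `count_idle`, the `min_idle_seconds` parameter, and the sample rows are hypothetical and not taken from the issue, and the rows are assumed to have been fetched already as dictionaries keyed by the standard column names.

```python
from collections import Counter


def count_idle(rows, min_idle_seconds=0):
    """Count sleeping (idle) connections per MySQL user.

    `rows` is assumed to be SHOW PROCESSLIST output already fetched as
    dicts with at least the "User", "Command" and "Time" columns.
    """
    idle = Counter()
    for row in rows:
        # "Sleep" means the server is waiting for the client to send a statement.
        if row["Command"] == "Sleep" and row["Time"] >= min_idle_seconds:
            idle[row["User"]] += 1
    return idle


# Made-up rows for illustration only; real data would come from the database.
sample_rows = [
    {"Id": 1, "User": "airflow", "Command": "Sleep", "Time": 3400},
    {"Id": 2, "User": "airflow", "Command": "Sleep", "Time": 120},
    {"Id": 3, "User": "airflow", "Command": "Query", "Time": 0},
    {"Id": 4, "User": "report_app", "Command": "Sleep", "Time": 15},
]

print(count_idle(sample_rows))
print(count_idle(sample_rows, min_idle_seconds=60))
```

Raising `min_idle_seconds` separates connections that are briefly between queries from those that have been parked in a pool for a long time.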
[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error
[ https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701772#comment-16701772 ]

Ash Berlin-Taylor commented on AIRFLOW-3405:
--------------------------------------------

1.10.0 - https://issues.apache.org/jira/browse/AIRFLOW-1559 was the issue I was thinking of
[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error
[ https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701769#comment-16701769 ]

Ash Berlin-Taylor commented on AIRFLOW-3405:
--------------------------------------------

The first thing I would suggest trying is 1.10.1 if you can - I think there was some work done in 1.9.0 or 1.10.0 to reduce the number of connections workers used.
[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error
[ https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701411#comment-16701411 ]

Yuvaraj commented on AIRFLOW-3405:
----------------------------------

Version 1.8.1
[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error
[ https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700466#comment-16700466 ]

Ash Berlin-Taylor commented on AIRFLOW-3405:
--------------------------------------------

If you are up for trying the bleeding edge version of Airflow (warning! there may be bugs in there!) then this PR[1] may help by reducing the number of pool slots you need - it should be possible to run with a much much smaller SQLA pool size in theory.

[1]: https://github.com/apache/incubator-airflow/pull/4234
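The case for a much smaller SQLA pool size can be checked with rough arithmetic. The sketch below assumes that each Airflow process (scheduler, worker, webserver worker) holds its own SQLAlchemy pool, since pools are not shared across processes, and that SQLAlchemy's default `max_overflow` of 10 applies. The function names, the process count of 20, and the 200 connections reserved for other applications are hypothetical values chosen for illustration; only `max_connections = 2000` and `sql_alchemy_pool_size = 1980` come from the issue.

```python
def worst_case_connections(processes, pool_size, max_overflow=10):
    """Upper bound on connections if every process fills its own pool."""
    return processes * (pool_size + max_overflow)


def largest_safe_pool_size(max_connections, processes, reserved=0, max_overflow=10):
    """Largest per-process pool_size that keeps the worst case under the limit."""
    per_process = (max_connections - reserved) // processes
    return max(per_process - max_overflow, 0)


# Values reported in the issue.
MAX_CONNECTIONS = 2000
POOL_SIZE = 1980

# A single process may already approach the server limit ...
print(worst_case_connections(1, POOL_SIZE))
# ... and two processes can exceed it.
print(worst_case_connections(2, POOL_SIZE))

# Hypothetical deployment: 20 Airflow processes, 200 connections left for other apps.
print(largest_safe_pool_size(MAX_CONNECTIONS, processes=20, reserved=200))
```

Under these assumptions a per-process pool in the tens, not the thousands, is what keeps the total below `max_connections`. The actual process count depends on the executor and worker concurrency in use.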
[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error
[ https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700460#comment-16700460 ]

Ash Berlin-Taylor commented on AIRFLOW-3405:
--------------------------------------------

That is a huge number of connections and is something we'd want to fix. What version of Airflow are you running on?