[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error

2018-12-04 Thread Yuvaraj (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708512#comment-16708512
 ] 

Yuvaraj commented on AIRFLOW-3405:
--

We are in the process of upgrading to 1.10.1. I will post an update on whether 
the latest version fixes this.

> Task instance fail intermittently due to MySQL error
> 
>
> Key: AIRFLOW-3405
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3405
> Project: Apache Airflow
>  Issue Type: Improvement
> Environment: MySQL, Redhat Linux
>Reporter: Yuvaraj
>Priority: Major
>  Labels: performance, usability
>
> DAGs are failing intermittently with the error below.
> OperationalError: (_mysql_exceptions.OperationalError) (1040, 'Too many 
> connections')
> [2018-11-25 12:24:16,952] - Heartbeat time limited exceeded!
> max_connections is set to 2000 in the DB.
> These are the settings in the cfg:
> sql_alchemy_pool_size = 1980
> sql_alchemy_pool_recycle = 3600
> According to our DBA, the Airflow scheduler keeps opening connections to the
> database. These connections are mostly idle, and they are reset whenever the
> scheduler restarts. With max_connections at 2000 and the scheduler holding on
> to 1600 of them, other apps trying to connect might run out of connections.
> How do we remediate these idle connections? What are the optimal values for
> these configs and for max_connections on the DB? Consider that we need to
> build a large environment serving 500+ definitions with 1+ runs per day.
> Suggestions are welcome.
>  
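
To make the numbers in the report concrete, here is a rough Python sketch of the worst-case arithmetic. It assumes that each Airflow process keeps its own SQLAlchemy pool and that the pool may grow by SQLAlchemy's default max_overflow of 10; neither assumption is confirmed in this thread. The process counts are hypothetical and only illustrate how quickly a pool size of 1980 can exceed max_connections = 2000.

```python
import configparser

# Settings quoted in the report; MAX_CONNECTIONS is the MySQL server limit.
CFG_TEXT = """
[core]
sql_alchemy_pool_size = 1980
sql_alchemy_pool_recycle = 3600
"""
MAX_CONNECTIONS = 2000


def worst_case_connections(pool_size, processes, max_overflow=10):
    """Upper bound on open connections if every process fills its own pool.

    Assumption: one pool per process, each allowed pool_size + max_overflow.
    """
    return processes * (pool_size + max_overflow)


parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)
pool_size = parser.getint("core", "sql_alchemy_pool_size")

# Hypothetical process counts, for illustration only.
for processes in (1, 2, 4):
    demand = worst_case_connections(pool_size, processes)
    status = "over" if demand > MAX_CONNECTIONS else "within"
    print(f"{processes} process(es): up to {demand} connections ({status} the limit)")
```

Under these assumptions a single process already sits close to the server limit, and a second process holding a full pool would go past it.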



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error

2018-11-28 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701772#comment-16701772
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3405:


It was 1.10.0: https://issues.apache.org/jira/browse/AIRFLOW-1559 was the issue 
I was thinking of.



[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error

2018-11-28 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701769#comment-16701769
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3405:


The first thing I would suggest trying is 1.10.1 if you can - I think there was 
some work done in 1.9.0 or 1.10.0 to reduce the number of connections workers 
used.



[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error

2018-11-27 Thread Yuvaraj (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701411#comment-16701411
 ] 

Yuvaraj commented on AIRFLOW-3405:
--

Version 1.8.1



[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error

2018-11-27 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700466#comment-16700466
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3405:


If you are up for trying the bleeding edge version of Airflow (warning! there 
may be bugs in there!) then this PR[1] may help by reducing the number of pool 
slots you need - it should be possible to run with a much much smaller SQLA 
pool size in theory.

[1]: https://github.com/apache/incubator-airflow/pull/4234
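
As a rough illustration of what a much smaller pool size could look like, the Python sketch below works backwards from the database limit to a per-process pool size. It is not taken from the PR. It assumes one SQLAlchemy pool per Airflow process and SQLAlchemy's default max_overflow of 10. The function name, the number of connections reserved for other apps, and the process count are all hypothetical.

```python
def per_process_pool_size(max_connections, reserved_for_others, processes,
                          max_overflow=10):
    """Largest pool size that keeps all processes inside the connection budget.

    Assumption: every process may open pool_size + max_overflow connections.
    Returns 0 when the budget cannot cover even the overflow allowance.
    """
    if processes < 1:
        raise ValueError("processes must be at least 1")
    budget = max_connections - reserved_for_others
    size = budget // processes - max_overflow
    return max(size, 0)


# max_connections = 2000 comes from the report; 400 reserved connections
# and 40 Airflow processes are made-up numbers for illustration.
suggested = per_process_pool_size(2000, reserved_for_others=400, processes=40)
print(f"sql_alchemy_pool_size = {suggested}")
```

The point of sizing this way is that the total is bounded by processes times (pool size plus overflow), so the per-process value has to shrink as the number of processes grows.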



[jira] [Commented] (AIRFLOW-3405) Task instance fail intermittently due to MySQL error

2018-11-27 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700460#comment-16700460
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3405:


That is a huge number of connections and is something we'd want to fix.

What version of Airflow are you running on?
