While this wasn't with Airflow, I had similar connection failures with Redshift on very long queries (6+ hours) when the client was not in EC2. When I moved the client into EC2, those problems went away.
-- JCL

On Thu, 8 Sep 2016 10:45:58 -0700, David Klosowski <[email protected]> wrote:

> We're running Airflow using Docker + ECS with EMR and Redshift and have
> not noticed any issues with the PostgresOperator yet. We don't have
> anything running in production with Redshift, but have evaluated a few
> DAGs using it. I haven't noticed any issues with the operator. I would
> also look at the query in question in the Redshift query console to
> check its status. You can also check in with AWS support to see if they
> have any guidance.
>
> Cheers,
> David
>
> On Thu, Sep 8, 2016 at 8:22 AM, Maxime Beauchemin <[email protected]> wrote:
>
> > My intuition is that it comes from some Postgres setting, or, less
> > likely but possible, an environmental setting (OS, network, ...) on
> > your end. Clearly the stack trace you shared says that the server
> > closed the connection during the cursor's execute method. I bet you
> > can reproduce this with a vanilla query using the same Python driver
> > outside of Airflow, and most likely with any other API in other
> > languages.
> >
> > Good luck,
> >
> > Max
> >
> > On Fri, Sep 2, 2016 at 12:58 AM, Arthur Purvis <[email protected]> wrote:
> >
> > > Hi -
> > >
> > > I am experiencing the following situation while evaluating Airflow;
> > > wondering if this rings any bells with anyone:
> > >
> > > 1. *airflow 1.7.1.2* is running in a *docker* container on Amazon *ECS*
> > > 2. *dagrun_timeout* = timedelta(hours=24) for this DAG
> > > 3. running Redshift queries with the PostgresOperator works fine
> > >
> > > *except when that query takes over two hours*, and then the query is
> > > *cut off* *exactly at the two hour mark* with this message:
> > >
> > > [2016-09-02 00:32:14,307] {models.py:1286} ERROR - server closed the connection unexpectedly
> > > This probably means the server terminated abnormally
> > > before or while processing the request.
> > > Traceback (most recent call last):
> > >   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1245, in run
> > >     result = task_copy.execute(context=context)
> > >   File "/usr/local/lib/python2.7/dist-packages/airflow/operators/postgres_operator.py", line 39, in execute
> > >     self.hook.run(self.sql, self.autocommit, parameters=self.parameters)
> > >   File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 132, in run
> > >     cur.execute(s)
> > > OperationalError: server closed the connection unexpectedly
> > > This probably means the server terminated abnormally
> > > before or while processing the request.
> > >
> > > We are 100% sure that this account is able to run long queries in
> > > Redshift, because we run these same queries from outside Airflow on
> > > a regular AWS EC2 instance (not a Docker container) without issue.
> > >
> > > The *exactly* two hours says to me that there is some setting
> > > somewhere - either in Airflow itself, in the PostgresOperator, or in
> > > the Docker TCP settings (or somewhere else) - that is going awry,
> > > but I cannot sort out what it could be. It doesn't seem to be an
> > > issue with Celery, because other jobs that poll are running for 6+
> > > hours.
> > >
> > > If this rings any bells for anyone I would love to hear about it.
> > >
> > > Thanks for reading.
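Max's suggestion above - reproducing with a vanilla query using the same Python driver outside of Airflow - might look like the psycopg2 sketch below. The host and credentials are placeholders, and the keepalive settings are a guess, not a confirmed fix: the exact two-hour cutoff happens to match Linux's default `tcp_keepalive_time` of 7200 seconds, so asking libpq to send keepalive probes much sooner is a cheap thing to try. `keepalives`, `keepalives_idle`, `keepalives_interval`, and `keepalives_count` are standard libpq connection options that psycopg2 passes through.

```python
def redshift_keepalive_params(idle=60, interval=30, count=5):
    """libpq TCP keepalive options, passed through psycopg2.connect():
    start probing after `idle` seconds of silence (instead of the Linux
    default of 7200 s), re-probe every `interval` seconds, and give up
    after `count` missed probes."""
    return {
        "keepalives": 1,
        "keepalives_idle": idle,
        "keepalives_interval": interval,
        "keepalives_count": count,
    }


def run_outside_airflow(sql):
    """Run one query against Redshift with the same driver Airflow's
    PostgresOperator uses, but with aggressive TCP keepalives."""
    import psycopg2  # imported lazily so the helper above needs no driver

    conn = psycopg2.connect(
        host="my-cluster.example.redshift.amazonaws.com",  # placeholder
        port=5439,
        dbname="dev",         # placeholder
        user="awsuser",       # placeholder
        password="<redacted>",
        **redshift_keepalive_params(),
    )
    try:
        with conn.cursor() as cur:
            cur.execute(sql)  # substitute the real 2+ hour query here
    finally:
        conn.close()
```

If a long query survives past two hours with this but dies without it, the culprit is almost certainly an idle-connection timeout in the network path (Docker NAT, a load balancer, or similar) rather than Airflow. The same idea can also be applied at the OS level inside the container, e.g. by lowering `net.ipv4.tcp_keepalive_time` via sysctl, though whether that sysctl is per-container depends on the kernel version.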
