mistercrunch opened a new pull request #4251: Using a NullPool for external 
connections by default
URL: https://github.com/apache/incubator-superset/pull/4251
 
 
   Currently, even though calls to `get_sqla_engine` are memoized, engines
   are still short-lived because they are attached to a `models.Database`
   ORM object. Every engine created through this method is scoped to a
   single web request.
   
   Because these SQLAlchemy engine objects are short-lived, any connection
   pool attached to them is also short-lived and mostly useless.
   I think it's pretty rare for connections to be reused within the context
   of a single view or Celery worker task.
   
   We've noticed on Redshift that Superset was leaving many connections
   open (hundreds). This is probably due to a combination of the current
   process not garbage-collecting connections properly and perhaps the
   absence of a connection timeout on the Redshift side of things. It
   could also be related to the web request timeouts we experience
   (enforced by gunicorn): killing the process may not give SQLAlchemy a
   chance to clean up connections when those timeouts occur (which this PR
   may not help fix...).
   
   For all these reasons, using NullPool for external connections seems
   like the right thing to do (but not for our connection to the metadata
   db!).
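
As a rough illustration of the idea (not the actual diff in this PR), an engine can be told to skip pooling by passing `poolclass=NullPool` to `create_engine`. The helper name `make_external_engine` and the in-memory SQLite URL are assumptions made for this sketch; in Superset the URL would come from the `models.Database` object.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import NullPool


def make_external_engine(url):
    """Hypothetical helper: build an engine that never pools connections."""
    # NullPool opens a fresh DBAPI connection on every checkout and closes
    # it as soon as it is released, so nothing lingers after the request.
    return create_engine(url, poolclass=NullPool)


# In-memory SQLite stands in here for an external database such as Redshift.
engine = make_external_engine("sqlite://")

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()
# Leaving the block closes the underlying connection instead of pooling it.
```

The engine for the metadata db would simply be created without `poolclass=NullPool`, so it keeps SQLAlchemy's default pooling.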
   
   Opening the PR for conversation. Putting this change into our staging
   environment today to run some tests.
