[GitHub] Mogball commented on issue #3895: [druid] Fixing issue 3894 multi-processing w/ Gunicorn

2017-11-17 Thread GitBox
Mogball commented on issue #3895: [druid] Fixing issue 3894  multi-processing 
w/ Gunicorn
URL: 
https://github.com/apache/incubator-superset/pull/3895#issuecomment-345418884
 
 
   `latest_metadata` doesn't make any calls to SQLAlchemy?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Mogball commented on issue #3895: [druid] Fixing issue 3894 multi-processing w/ Gunicorn

2017-11-17 Thread GitBox
Mogball commented on issue #3895: [druid] Fixing issue 3894  multi-processing 
w/ Gunicorn
URL: 
https://github.com/apache/incubator-superset/pull/3895#issuecomment-345406215
 
 
   The speedup is more than just 4x (or whatever the number of available 
threads is). Most of the time is spent waiting for Druid to respond, which means 
that the metadata requests are issued more or less simultaneously for each 
datasource and then processed as the responses start coming back. This cuts a 
hard refresh of all datasources from roughly 40 seconds to roughly 3 seconds.
   
   This should be thread-safe, since during refresh there is no interaction 
outside of the datasource object or with the Superset backend.
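   
   A minimal sketch of the timing argument above, assuming each metadata 
request is a blocking call that spends nearly all of its time waiting. 
`fetch_metadata` is a hypothetical stand-in that sleeps instead of calling 
Druid, and the names, latency and worker count are illustrative only; this is 
not the code in the PR.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_metadata(datasource_name, latency=0.05):
    """Hypothetical stand-in for a blocking metadata request to Druid."""
    time.sleep(latency)  # simulates time spent waiting on the network
    return datasource_name, {"columns": []}


def refresh_sequential(names):
    # One request at a time: total time is roughly len(names) * latency.
    return dict(fetch_metadata(name) for name in names)


def refresh_threaded(names, max_workers=8):
    # Requests are issued together; threads mostly sit waiting on I/O,
    # so total time is close to the latency of a single request.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch_metadata, names))


names = ["datasource_%d" % i for i in range(8)]

start = time.perf_counter()
sequential = refresh_sequential(names)
sequential_secs = time.perf_counter() - start

start = time.perf_counter()
threaded = refresh_threaded(names)
threaded_secs = time.perf_counter() - start

print("sequential: %.2fs, threaded: %.2fs" % (sequential_secs, threaded_secs))
```

   Each worker only touches its own return value here, which mirrors the 
thread-safety assumption stated above; shared state would need extra care.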




[GitHub] Mogball commented on issue #3895: [druid] Fixing issue 3894 multi-processing w/ Gunicorn

2017-11-16 Thread GitBox
Mogball commented on issue #3895: [druid] Fixing issue 3894  multi-processing 
w/ Gunicorn
URL: 
https://github.com/apache/incubator-superset/pull/3895#issuecomment-345142406
 
 
   Looks okay to me. I'm pretty sure it should be thread-safe.
   
   `refresh_async` was the name of the function when it was supposed to be 
actually asynchronous, but I never changed the name afterwards (i.e. it's a bit 
of a misnomer).




[GitHub] Mogball commented on issue #3895: [druid] Fixing issue 3894 multi-processing w/ Gunicorn

2017-11-16 Thread GitBox
Mogball commented on issue #3895: [druid] Fixing issue 3894  multi-processing 
w/ Gunicorn
URL: 
https://github.com/apache/incubator-superset/pull/3895#issuecomment-345141488
 
 
   Multithreaded execution was introduced because refreshing a cluster with 
many datasources took... a long time. Metadata queries used to be issued 
sequentially; issuing them from multiple threads makes refreshing lots of 
datasources much faster.

