Pinimo opened a new issue #9597:
URL: https://github.com/apache/incubator-superset/issues/9597


   In short: the **cache warm-up tasks launched by the Celery workers all silently fail**. They perform `GET`s on the main server's URL without providing the required authentication, yet dashboards cannot be loaded without being logged in.
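   To illustrate the mechanism, here is a self-contained sketch (not Superset code: the toy server below merely mimics the `/explore` → `/login` redirect described in this issue). A redirect-following HTTP client ends up reporting a 200, even though it never reached the chart it was supposed to warm up:

   ```python
   import http.server
   import threading
   import urllib.request

   class Handler(http.server.BaseHTTPRequestHandler):
       """Toy server: /explore redirects to /login/, /login/ answers 200."""

       def do_GET(self):
           if self.path.startswith("/explore"):
               self.send_response(302)
               self.send_header("Location", "/login/")
               self.end_headers()
           else:
               self.send_response(200)
               self.end_headers()
               self.wfile.write(b"login page")

       def log_message(self, *args):
           pass  # silence per-request logging

   server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
   threading.Thread(target=server.serve_forever, daemon=True).start()
   base = f"http://127.0.0.1:{server.server_port}"

   # urllib follows the 302 transparently, so the warm-up "succeeds" with a
   # 200 -- on the login page, while the chart cache was never touched.
   resp = urllib.request.urlopen(f"{base}/explore?slice_id=30")
   print(resp.status, resp.geturl())  # 200, final URL ends in /login/

   server.shutdown()
   ```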
   
   
   **Related bugs:**
   
   - the unit tests on this feature miss the error
   - the documentation should mention that the Celery worker needs the `--beat` flag to pick up CeleryBeat schedules (cf. the `docker-compose.yml` configuration)
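   For reference, a sketch of the worker invocation with the embedded scheduler enabled (the Celery application path is an assumption and may differ between Superset versions; check your `docker-compose.yml`):

   ```shell
   # Run the Celery worker together with the CeleryBeat scheduler.
   # Without --beat (short form: -B), the scheduled cache warm-up tasks never fire.
   celery worker --app=superset.tasks.celery_app:app --beat
   ```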
   
   At stake: long dashboard load times for our users, or outdated dashboards.
   
   ### Expected results
   
   When the Celery worker logs this (notice `'errors': []`):
   
       superset-worker_1  | [2020-04-20 13:05:00,299: INFO/ForkPoolWorker-3]
                            Task cache-warmup[73c09754-4dcb-4674-9ac2-087b04b6e209]
                            succeeded in 0.1351924880000297s:
                            {'success': [
                                'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2031%7D',
                                'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2032%7D',
                                'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2033%7D'],
                            'errors': []}
   
   
   ... we would expect to have something (more or less) like this in the 
Superset server logs:
   ```
   superset_1         | 172.20.0.6 - - [2020-04-20 13:05:00,049] "POST /superset/explore_json/?form_data=%7B%22slice_id%22%3A HTTP/1.1"
                        200 738 "http://superset:8088/superset/dashboard/1/"; "python-urllib2"
   ```
   
   Of course, we also hope to see a bunch of cached entries in Redis, and dashboards that load lightning-fast.
   
   ### Actual results
   
   But we get these logs instead, which show a 302 redirect to the login page, followed by a 200 on the login page itself. This redirect is interpreted as a success by the tests.
   
   ```
   superset_1         | 172.20.0.6 - - [20/Apr/2020 08:12:00] "GET /superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D HTTP/1.1"
                        302 -
   superset_1         | INFO:werkzeug:172.20.0.6 - - [20/Apr/2020 08:12:00] "GET /superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D HTTP/1.1"
                        302 -
   superset_1         | 172.20.0.6 - - [20/Apr/2020 08:12:00] "GET /login/?next=http%3A%2F%2Fsuperset%3A8088%2Fsuperset%2Fexplore%2F%3Fform_data%3D%257B%2522slice_id%2522%253A%252030%257D HTTP/1.1"
                        200 -
   ```
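   One way the warm-up task could catch this failure mode is to perform its `GET`s with redirects disabled and inspect the response instead of blindly following it. Below is a hypothetical helper (not Superset code; the name `is_login_redirect` is mine) that would classify a 302 pointing at the login page as an error rather than a success:

   ```python
   from typing import Optional
   from urllib.parse import urlparse

   def is_login_redirect(status: int, location: Optional[str]) -> bool:
       """Return True when a response is an HTTP redirect to the login page."""
       if status not in (301, 302, 303, 307, 308) or not location:
           return False
       return urlparse(location).path.rstrip("/").endswith("/login")

   # The first log line above would then count as an error, not a success:
   print(is_login_redirect(302, "/login/?next=http%3A%2F%2Fsuperset%3A8088%2F"))  # True
   print(is_login_redirect(200, None))                                            # False
   ```

   The warm-up task would then report the URL under `'errors'` whenever this returns `True`, instead of listing it under `'success'`.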
   
   (I added a few line breaks for readability.)
   
   In Redis, here is the only stored key:
   ```
   $ docker-compose exec redis redis-cli
   127.0.0.1:6379> KEYS *
   1) "_kombu.binding.celery"
   ```
   
   Finally, the dashboards are slow to load their data on the first visit.
   
   #### Screenshots
   
   None
   
   #### How to reproduce the bug
   
   I had to patch the master branch to reproduce this. In particular, I must admit it was not very clear to me whether the config was read from `docker/pythonpath_dev/superset_config.py` or from `superset/config.py`. So I adapted `superset/config.py` and copied it over to the `pythonpath` one (which appears to be read by the Celery worker, but not by the server).
   
   Anyway, this reproduces the bug:
   
   <ol>
   <li><code>$ docker system prune --all</code> to remove all dangling images, exited containers and volumes.</li>
   <li><code>$ git checkout master && git pull origin master</code></li>
   <li><code>$ wget -O configs.patch https://gist.githubusercontent.com/Pinimo/c339ea828974d2141423b6ae64192aa4/raw/f02d49c0bca8acce879936d2650719eb4b0dd6d8/0001-bug-Patch-master-to-reproduce-sweetly-the-cache-warm.patch && git apply configs.patch</code><br>This applies patches to master so the scenario plays out neatly; in particular it adds the <code>--beat</code> flag and schedules a cache warm-up task on all dashboards every minute.</li>
   <li><code>$ docker-compose up -d</code></li>
   <li>Wait for the containers to be built and come up.</li>
   <li><code>$ docker-compose logs superset-worker | grep cache-warmup</code></li>
   <li><code>$ docker-compose logs superset | grep slice</code></li>
   <li><code>$ docker-compose exec redis redis-cli</code>, then type <code>KEYS *</code></li>
   </ol>
   
   ### Environment
   
   (please complete the following information):
   
   - superset version: 0.36.0
   - python version: dockerized
   - node.js version: dockerized
   - npm version: dockerized
   
   ### Checklist
   
   - [x] I have checked the superset logs for python stacktraces and included them here as text if there are any.
   - [x] I have reproduced the issue with at least the latest released version of superset.
   - [x] I have checked the issue tracker for the same issue and haven't found a similar one.

