Pinimo opened a new issue #9597:
URL: https://github.com/apache/incubator-superset/issues/9597
In short: the **cache warm-up tasks launched by Celery workers all silently
fail**. They perform `GET`s on the main server's URL without providing the
required authentication, but dashboards cannot be loaded without being
logged in.
**Related bugs:**
- unit tests on this feature miss the error
- the documentation should mention that the Celery worker must be started with
the `--beat` flag for Celery beat schedules to run (cf. the
`docker-compose.yml` configuration)
At stake: long dashboard load times for our users, or outdated dashboards.
### Expected results
When the Celery worker logs this (notice `'errors': []`):
```
superset-worker_1 | [2020-04-20 13:05:00,299: INFO/ForkPoolWorker-3]
Task cache-warmup[73c09754-4dcb-4674-9ac2-087b04b6e209]
succeeded in 0.1351924880000297s:
{'success': [
'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2031%7D',
'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2032%7D',
'http://superset:8088/superset/explore/?form_data=%7B%22slice_id%22%3A%2033%7D'],
'errors': []}
```
... we would expect to have something (more or less) like this in the
Superset server logs:
```
superset_1 | 172.20.0.6 - - [2020-04-20 13:05:00,049] "POST
/superset/explore_json/?form_data=%7B%22slice_id%22%3A HTTP/1.1"
200 738 "http://superset:8088/superset/dashboard/1/"
"python-urllib2"
```
Of course, we would also expect a number of cached entries in Redis, and
dashboards to load almost instantly.
### Actual results
But we get these logs instead, which show there is a 302 redirect to the
login page, followed by a 200 on the login page. This redirect is interpreted
as a success by the tests.
```
superset_1 | 172.20.0.6 - - [20/Apr/2020 08:12:00] "GET
/superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D HTTP/1.1"
302 -
superset_1 | INFO:werkzeug:172.20.0.6 - - [20/Apr/2020 08:12:00]
"GET /superset/explore/?form_data=%7B%22slice_id%22%3A%2030%7D HTTP/1.1"
302 -
superset_1 | 172.20.0.6 - - [20/Apr/2020 08:12:00] "GET
/login/?next=http%3A%2F%2Fsuperset%3A8088%2Fsuperset%2Fexplore%2F%3Fform_data%3D%257B%2522slice_id%2522%253A%252030%257D
HTTP/1.1"
200 -
```
(I added a few line breaks.)
In Redis, this is the only stored key:
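To illustrate why the redirect goes unnoticed: a client that follows redirects ends up with a 200 from the login page, so checking the status code alone is not enough. Below is a minimal sketch of a stricter check, assuming the warm-up fetches URLs with `urllib` (the `python-urllib2` user agent above suggests something similar). The helper names `is_login_redirect` and `fetch_checked` are hypothetical and not part of Superset.

```python
import urllib.request
from urllib.parse import urlparse


def is_login_redirect(requested_url: str, final_url: str) -> bool:
    """Return True if a request for a non-login page ended up on /login/."""
    requested_path = urlparse(requested_url).path.rstrip("/")
    final_path = urlparse(final_url).path.rstrip("/")
    return final_path == "/login" and requested_path != "/login"


def fetch_checked(url: str, timeout: float = 10.0) -> int:
    """Fetch a URL and fail loudly if we were bounced to the login page.

    urllib follows the 302 transparently, so the status seen here is the
    200 of the login page; geturl() gives the URL that was finally served.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        if is_login_redirect(url, response.geturl()):
            raise PermissionError(f"redirected to login page: {url}")
        return response.status
```

With a check like this, the URLs in the worker log would be reported under `'errors'` instead of `'success'`, and a unit test asserting on the final URL would catch the problem.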
```
$ docker-compose exec redis redis-cli
127.0.0.1:6379> KEYS *
1) "_kombu.binding.celery"
```
Finally, the dashboards are still slow to load their data on first access.
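The Redis check can also be scripted. The following is only a sketch: it assumes the third-party `redis` (redis-py) package is installed and that the Redis container is reachable as `redis:6379`; the helper names are hypothetical. Keys starting with `_kombu.` belong to the Celery broker, so they are filtered out before deciding whether anything was cached.

```python
from typing import Iterable, List


def cache_keys(keys: Iterable[str], broker_prefix: str = "_kombu.") -> List[str]:
    """Drop Celery/kombu broker keys and return the remaining keys, sorted."""
    return sorted(k for k in keys if not k.startswith(broker_prefix))


def fetch_keys(host: str = "redis", port: int = 6379) -> List[str]:
    """List all keys in Redis (host and port are assumptions)."""
    import redis  # third-party redis-py, imported lazily

    client = redis.Redis(host=host, port=port, decode_responses=True)
    return list(client.scan_iter())
```

An empty result from `cache_keys(fetch_keys())` after a warm-up run would indicate that nothing was written to the cache.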
#### Screenshots
None
#### How to reproduce the bug
I had to patch the master branch to get this to work. In particular, it was
not clear to me whether the config is read from
`docker/pythonpath_dev/superset_config.py` or from `superset/config.py`. So I
adapted `superset/config.py` and copied it over to the `pythonpath` one
(which appears to be read by the Celery worker, but not by the server).
Anyway, this reproduces the bug:
<ol>
<li><code>$ docker system prune --all</code> to remove all unused images,
stopped containers and networks (add <code>--volumes</code> to also remove
unused volumes).</li>
<li><code>$ git checkout master && git pull origin master</code></li>
<li><code>$ wget -O configs.patch
https://gist.githubusercontent.com/Pinimo/c339ea828974d2141423b6ae64192aa4/raw/f02d49c0bca8acce879936d2650719eb4b0dd6d8/0001-bug-Patch-master-to-reproduce-sweetly-the-cache-warm.patch
&& git apply configs.patch</code><br>This will apply patches to master to make
the scenario work out neatly, in particular add the <code>--beat</code> flag
and specify a cache warmup task on all dashboards every minute.</li>
<li><code>$ docker-compose up -d</code></li>
<li>Wait for the containers to be built and up.</li>
<li><code>$ docker-compose logs superset-worker | grep
cache-warmup</code></li>
<li><code>$ docker-compose logs superset | grep slice</code></li>
<li><code>$ docker-compose exec redis redis-cli</code> then type <code>KEYS
*</code></li>
</ol>
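For readers who do not want to open the patch, the schedule it adds is roughly of the following shape. This is a hedged sketch rather than the patch contents: the task name `cache-warmup` comes from the worker log above, while the entry name, the `dummy` strategy (assumed to warm up every chart) and the 60-second interval are assumptions, and broker and result-backend settings are omitted.

```python
# Hypothetical excerpt of a superset_config.py (not the actual patch).
class CeleryConfig:
    CELERYBEAT_SCHEDULE = {
        # Entry name is arbitrary.
        "cache-warmup-every-minute": {
            "task": "cache-warmup",  # task name as seen in the worker log
            "schedule": 60.0,  # seconds between runs
            "kwargs": {"strategy_name": "dummy"},  # assumed strategy name
        },
    }


CELERY_CONFIG = CeleryConfig
```

The schedule is only picked up if the worker is started with the `--beat` flag, as noted under "Related bugs".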
### Environment
- superset version: 0.36.0
- python version: dockerized
- node.js version: dockerized
- npm version: dockerized
### Checklist
- [x] I have checked the superset logs for python stacktraces and included
it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version
of superset.
- [x] I have checked the issue tracker for the same issue and I haven't
found one similar.