I think this isn't right. A user reported that their CSV export today was only 1k rows, which confused me because I thought the limit for CSVs was much higher, around 1M. I went and read the code thinking "hmm, I thought there were two different settings for this..." but couldn't find evidence of that, since this PR had changed it.
So in `async` mode we should apply the `SQL_MAX_ROW` limit (1M by default) when sending results to the results backend, so that the `csv/` endpoint can return up to that many rows in async mode. The `results/` endpoint in async mode (reading from the results backend) should read only the first `DISPLAY_SQL_MAX_ROW` rows and return those, to prevent the UI from crashing. I think this was all working prior to this PR. In non-async mode, we should apply `DISPLAY_SQL_MAX_ROW` on the `sql_json/` endpoint, and apply no limit on the `csv/` endpoint. I think this was not working before: we would just return `SQL_MAX_ROW` rows and crash the browser. This didn't happen much at Lyft and Airbnb since we run most big databases in async mode. Sorry, this is all a bit confusing and clearly needs to be documented better. We also need to let the user know what is happening here, and maybe allow them to change their limits, though we shouldn't let them crash their browser too easily. @timifasubaa @villebro ^^^ [ Full content available at: https://github.com/apache/incubator-superset/pull/5392 ]
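To summarize the intended behavior, here is a minimal sketch of the limit matrix described above. The constants mirror the `SQL_MAX_ROW` and `DISPLAY_SQL_MAX_ROW` config keys, but the `effective_row_limit` helper and its signature are hypothetical, purely illustrative, not actual Superset code:

```python
from typing import Optional

# Config values (defaults assumed for illustration):
SQL_MAX_ROW = 1_000_000      # cap on rows sent to the results backend
DISPLAY_SQL_MAX_ROW = 1_000  # cap on rows returned to the browser UI


def effective_row_limit(endpoint: str, is_async: bool) -> Optional[int]:
    """Hypothetical helper: row limit to apply, or None for unlimited."""
    if is_async:
        if endpoint == "results/":
            # Read back only the first rows from the results backend,
            # so the UI doesn't choke on a huge payload.
            return DISPLAY_SQL_MAX_ROW
        # The query itself (persisted to the results backend) and the
        # csv/ download are both capped at SQL_MAX_ROW.
        return SQL_MAX_ROW
    # Non-async (sync) mode:
    if endpoint == "sql_json/":
        return DISPLAY_SQL_MAX_ROW  # protect the browser
    if endpoint == "csv/":
        return None                 # no limit on CSV downloads
    return SQL_MAX_ROW
```

Under these assumptions, the async `csv/` path gets up to 1M rows while the sync `csv/` path is unlimited, and the UI-facing paths (`results/` async, `sql_json/` sync) are always capped at `DISPLAY_SQL_MAX_ROW`.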
