GitHub user dosubot[bot] added a comment to the discussion: How to Handle Large Data Exports (10M+ Rows) in Superset?
To handle large data exports in Superset efficiently, especially for datasets over 10 million rows, consider the following strategies:

1. **Use Celery for Asynchronous Processing**: Superset can use Celery as a background worker to run large data exports asynchronously. The setup requires Celery workers, a message broker such as Redis or RabbitMQ, and a results backend. Long-running queries then execute outside the main application instead of blocking it, which helps with large data exports <sup>[[1]](https://github.com/apache/superset/blob/master/docs/docs/configuration/async-queries-celery.mdx)</sup><sup>[[2]](https://github.com/apache/superset/blob/master/superset/async_events/async_query_manager.py)</sup>.

2. **Optimize CSV Downloads with S3**: If you're using Presto DB, configure it to automatically persist query results to S3. As described in the linked issue, enable the `DownloadCSVFromS3` and `ShowDefaultCSVOptions` feature flags, and set the necessary environment variables (`SUPERSET_REGION`, `SUPERSET_WORKGROUP`, `SUPERSET_ATHENA_DB`). This approach uses the native S3 functionality of Athena (which is Presto-based) to return a presigned S3 URL, reducing download time for large CSV files <sup>[[3]](https://github.com/apache/superset/issues/31482)</sup>.

3. **Limitations and Constraints**: Superset's current export functionality does not support pagination for CSV exports, so the entire dataset is exported at once. This can cause performance issues with large datasets <sup>[[4]](https://github.com/apache/superset/blob/master/superset-frontend/CHANGELOG.md)</sup>.

These strategies can help manage memory usage and improve performance during large data exports in Superset.

GitHub link: https://github.com/apache/superset/discussions/33530#discussioncomment-13202543

----
This is an automatically sent email for notifications@superset.apache.org.
To unsubscribe, please send an email to: notifications-unsubscr...@superset.apache.org
For additional commands, e-mail: notifications-h...@superset.apache.org
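
For the Celery strategy described in the answer, a minimal sketch of the relevant part of `superset_config.py` might look like the following. It assumes a Redis instance on `localhost:6379` acting as both message broker and Celery result backend; the `REDIS_URL` name and address are illustrative, not taken from the discussion, and the values should be adapted to your deployment.

```python
# superset_config.py (sketch): Celery settings for asynchronous queries.
# Assumption: Redis on localhost:6379 is the broker and the Celery result backend.
REDIS_URL = "redis://localhost:6379/0"  # hypothetical address


class CeleryConfig:
    # Where workers pick up tasks and where Celery stores task state.
    broker_url = REDIS_URL
    result_backend = REDIS_URL
    # Modules whose Celery tasks the workers should load.
    imports = ("superset.sql_lab",)
    # Acknowledge a task after it finishes rather than when it starts.
    task_acks_late = True
    # Each worker process reserves one task at a time, so a long export
    # does not hold other tasks in that worker's local queue.
    worker_prefetch_multiplier = 1


# Superset reads its Celery settings from this name.
CELERY_CONFIG = CeleryConfig
```

Superset also needs a `RESULTS_BACKEND` setting to store query results; it is left out here to keep the sketch free of third-party imports. The async queries documentation linked as [1] covers that setting and how to start the workers.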