cryptoe commented on issue #17709:
URL: https://github.com/apache/druid/issues/17709#issuecomment-2654237317
> Getting the result through the Druid endpoint
`druid/v2/sql/statements/{queryId}/results?page={page}&resultFormat=csv` is
sequential and very time-consuming (in one test it took 30 min to fetch 1 GB
of query results using a curl HTTPS call running in the same AWS region as
the S3 bucket).
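The page-by-page fetch the quote describes can be sketched roughly as below. This is a minimal illustration, not the reporter's actual script: the base URL, query id, and the `results_page_url` helper are assumptions, and only the endpoint shape follows the statements API path mentioned in the quote.

```python
from urllib.parse import urlencode

def results_page_url(base_url: str, query_id: str, page: int) -> str:
    """Build the URL for one page of statement results in CSV format.

    Illustrative helper; base_url and query_id come from the caller.
    """
    params = urlencode({"page": page, "resultFormat": "csv"})
    return f"{base_url}/druid/v2/sql/statements/{query_id}/results?{params}"

# Each page has to be requested in order, one HTTP call at a time -- this
# sequential loop is the slow pattern the quote is measuring (sketch only):
#
#   import urllib.request
#   page = 0
#   while True:
#       with urllib.request.urlopen(results_page_url(base, qid, page)) as resp:
#           body = resp.read()
#       if not body:
#           break
#       out.write(body)
#       page += 1
```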
If you could share a flamegraph of the broker while this query is running, I
can debug the cause of the slowness.
Regarding 2: the processing engine distributes its work across many tasks.
Doing what you describe would effectively require a single task in the final
stage, since ordering (among other things) must be preserved across tasks,
and that is not a scalable design.
For your use case, if you want a single output file, add a LIMIT to the
query. That forces the final stage to run as a single worker, which produces
a single CSV file on S3.
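A minimal sketch of that workaround, assuming the request is POSTed to the `/druid/v2/sql/statements` endpoint: the `statement_payload` helper, the sample SQL, and the exact payload shape are illustrative assumptions, not confirmed details from this thread.

```python
import json

def statement_payload(sql: str, limit: int) -> str:
    """Wrap a SQL query with an outer LIMIT as a JSON request body.

    Appending the LIMIT is what collapses the final stage to a single
    worker, so the export lands in one CSV file instead of many.
    """
    limited_sql = f"SELECT * FROM ({sql}) AS q LIMIT {limit}"
    return json.dumps({"query": limited_sql})

# Hypothetical usage; POST this body to <broker>/druid/v2/sql/statements:
payload = statement_payload("SELECT channel, cnt FROM wikipedia_rollup", 1_000_000)
```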
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]