DanielLeens commented on issue #10923: URL: https://github.com/apache/seatunnel/issues/10923#issuecomment-4523790045
Thanks for the concrete follow-up. With this new detail, this is no longer just a capability question; it is a real feature request.

At the moment, the built-in `Http` source is still designed around text / JSON payloads:

- the response body is materialized as `String`
- the schema path does not expose a raw binary payload contract
- there is no built-in filename / content-type / chunk metadata model to pass downstream

So supporting PDFs, images, videos, ZIPs, Office documents, and large-file stream transfer to a file sink would require more than a small connector tweak. It needs a clearer binary response contract and explicit design decisions around:

1. how filename metadata is propagated (for example from `Content-Disposition`)
2. whether the first phase is batch only, streaming only, or both
3. whether chunking / block transfer is part of the first version, or deferred

This looks reasonable to track as an enhancement, but I would strongly suggest narrowing the first scope. A practical MVP would be something like:

1. batch-first raw binary download
2. one binary payload field plus filename / content-type metadata
3. no chunked resume / multi-part transfer in the first round

If you want to continue with this direction, keeping the issue focused around that first MVP will make it much easier for the community to evaluate and implement.
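To make the MVP shape easier to discuss, here is a minimal sketch of what "one binary payload field plus filename / content-type metadata" could look like. Everything in it is an assumption for illustration: `BinaryHttpPayload`, `filenameFrom`, and `of` are hypothetical names, not existing SeaTunnel APIs, and the `Content-Disposition` parsing only covers the plain `filename=` parameter (the RFC 5987 `filename*=` form is deliberately ignored).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only: this type does not exist in SeaTunnel today.
final class BinaryHttpPayload {
    // Matches filename="x" or filename=x; does not match the filename*= form.
    private static final Pattern FILENAME = Pattern.compile(
            "(?i)(?:^|;)\\s*filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))");

    final byte[] body;        // raw response bytes, never decoded to String
    final String filename;    // taken from Content-Disposition, else a fallback
    final String contentType; // Content-Type header value, else a generic default

    private BinaryHttpPayload(byte[] body, String filename, String contentType) {
        this.body = body;
        this.filename = filename;
        this.contentType = contentType;
    }

    // Extracts the filename parameter from a Content-Disposition header value.
    static Optional<String> filenameFrom(String contentDisposition) {
        if (contentDisposition == null) {
            return Optional.empty();
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return Optional.empty();
        }
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Keep only the last path segment so a server cannot inject directories.
        int sep = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        name = name.substring(sep + 1);
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    // Builds one batch-mode payload row from the body and two header values.
    static BinaryHttpPayload of(byte[] body, String contentDisposition,
                                String contentType, String fallbackName) {
        String name = filenameFrom(contentDisposition).orElse(fallbackName);
        String type = contentType != null ? contentType : "application/octet-stream";
        return new BinaryHttpPayload(body, name, type);
    }
}
```

A caller would build one such object per response, for example `BinaryHttpPayload.of(bytes, dispositionHeader, typeHeader, "download.bin")`, and hand it to a file sink. Holding the whole body in a single `byte[]` matches the batch-first scope; chunked or streamed transfer would need a different contract and is left out here on purpose.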
