MoMingMq commented on issue #10923: URL: https://github.com/apache/seatunnel/issues/10923#issuecomment-4515043820
It would be best to support all file types. You could start with common formats such as images (png/jpg), videos (mp4/avi), compressed archives, Word documents, PDFs, and PPTs, and output them as file streams. These can then be connected to a sink, such as the file sink, to save the files. Both batch and streaming modes should be supported.

One thing to consider is large-file handling, for example by chunking: read the stream in chunks and write each chunk straight to the output. Otherwise large files, such as a 10 GB file, are prone to OOM. You could also add parameters for the block size and for whether to enable block transfer, with the latter defaulting to false.

For example, for the request GET /file/down, the response carries the file name in its Content-Disposition header: attachment; filename=demo.pdf. You can take that filename and save the file downstream under the same name in the specified directory.
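To make the suggestion concrete, here is a minimal sketch in Python of the two pieces described above: taking the file name from a Content-Disposition header and copying a stream in fixed-size chunks so a large file is never held in memory at once. It is an illustration only, not SeaTunnel's connector API. The names `DEFAULT_CHUNK_SIZE`, `filename_from_content_disposition`, `copy_in_chunks`, and `download`, the 8 MiB default, and the `download.bin` fallback are all assumptions made for this example.

```python
import os
import urllib.request
from email.message import Message

# Hypothetical default block size (8 MiB); the comment only asks that it be configurable.
DEFAULT_CHUNK_SIZE = 8 * 1024 * 1024


def filename_from_content_disposition(header_value, fallback="download.bin"):
    """Return the filename parameter of a Content-Disposition header, or fallback."""
    if not header_value:
        return fallback
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()
    if not name:
        return fallback
    # Keep only the last path component so the header cannot point outside the target directory.
    return os.path.basename(name.replace("\\", "/")) or fallback


def copy_in_chunks(src, dst, chunk_size=DEFAULT_CHUNK_SIZE):
    """Copy a binary stream to dst in chunks of at most chunk_size bytes; return bytes copied."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download(url, target_dir, chunk_size=DEFAULT_CHUNK_SIZE):
    """Stream url into target_dir, naming the file from the response header."""
    with urllib.request.urlopen(url) as resp:
        name = filename_from_content_disposition(resp.headers.get("Content-Disposition"))
        path = os.path.join(target_dir, name)
        with open(path, "wb") as out:
            copy_in_chunks(resp, out, chunk_size)
    return path
```

With this shape, peak memory is bounded by the chunk size rather than the file size, which is the point of the block-size parameter. A real connector would also need to decide how chunks map to rows or splits, which this sketch leaves out.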
