kbendick edited a comment on pull request #1573: URL: https://github.com/apache/iceberg/pull/1573#issuecomment-706620901
> @kbendick For the multipart upload/transfer manager, I do think we should move toward something like that, but I'm trying to keep this initial implementation somewhat simple so it doesn't become a huge commit.
>
> We actually used to use the transfer manager, but the last time I looked, it required writing out the full file and then initiating the upload. What we ended up doing was building a progressive upload so that it would split up the stream and upload in parts as the stream was being written to.
>
> I think some of these optimizations should be done as follow-ups.

I'm on board with doing these optimizations as follow-ups 👍. And admittedly, I've only used the transfer manager for downloads. You're right, it appears it still requires you to write the full file out. My experience with a transfer-manager-like resource for concurrent multipart uploads from streams is limited to Python.

I'm definitely good with keeping things simple to start.
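
For readers unfamiliar with the "progressive upload" idea described in the quoted comment, here is a minimal Python sketch of the general technique: buffer incoming writes and hand off each part as soon as it is full, instead of writing the whole file first. This is not the implementation from the PR. The class name `ProgressiveUploadStream`, the `upload_part` callback, and the `part_size` parameter are all hypothetical names chosen for illustration, and the callback stands in for whatever multipart-upload call a real client would make.

```python
from typing import Callable


class ProgressiveUploadStream:
    """Illustrative only: splits a written stream into fixed-size parts.

    `upload_part` is a hypothetical callback taking (part_number, data);
    a real client would issue a multipart-upload request there.
    """

    def __init__(self, upload_part: Callable[[int, bytes], None], part_size: int):
        if part_size <= 0:
            raise ValueError("part_size must be positive")
        self._upload_part = upload_part
        self._part_size = part_size
        self._buffer = bytearray()
        self._next_part = 1  # assumption: parts are numbered from 1
        self._closed = False

    def write(self, data: bytes) -> int:
        if self._closed:
            raise ValueError("write to closed stream")
        self._buffer.extend(data)
        # Hand off every complete part as soon as enough bytes have arrived.
        while len(self._buffer) >= self._part_size:
            self._flush_part(self._part_size)
        return len(data)

    def _flush_part(self, size: int) -> None:
        part = bytes(self._buffer[:size])
        del self._buffer[:size]
        self._upload_part(self._next_part, part)
        self._next_part += 1

    def close(self) -> None:
        # The final part may be shorter than part_size.
        if not self._closed:
            if self._buffer:
                self._flush_part(len(self._buffer))
            self._closed = True
```

The sketch uploads parts sequentially on the writer's thread. A concurrent variant would submit each part to a worker pool from `_flush_part`, which is the kind of optimization the thread proposes deferring to a follow-up.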
