rdblue commented on PR #4792:
URL: https://github.com/apache/iceberg/pull/4792#issuecomment-1153296951

   @samredai, to review this, I started looking more into the boto3 API. It 
looks like the API that you're using isn't a streaming API, which is what we 
typically want so that we can avoid things like buffering whole files in memory 
before writing them with a single PUT. When I went looking more into how to use 
boto3 for streaming reads and streaming writes, I quickly ran into 
`smart_open`, which appears to do everything that we want.
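
As a rough illustration of the streaming pattern described above (a minimal sketch, not the code in this PR): the copy loop holds only one chunk in memory at a time. `stream_copy`, `upload_streaming`, and `CHUNK_SIZE` are hypothetical names chosen for this example. The sketch assumes `smart_open.open` is given an `s3://` URI and, optionally, a boto3 client via `transport_params`.

```python
import io
from typing import BinaryIO

# Hypothetical chunk size for this sketch; tune to the workload.
CHUNK_SIZE = 8 * 1024 * 1024


def stream_copy(src: BinaryIO, dst: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst one chunk at a time and return the bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def upload_streaming(local_path: str, s3_uri: str, client=None) -> int:
    """Stream a local file to S3 without reading it fully into memory.

    Assumes smart_open is installed; `client` is an optional boto3 S3 client.
    """
    # Imported lazily so stream_copy stays usable without smart_open.
    from smart_open import open as s3_open

    params = {"client": client} if client is not None else None
    with open(local_path, "rb") as src, \
            s3_open(s3_uri, "wb", transport_params=params) as dst:
        return stream_copy(src, dst)
```

Because `stream_copy` only relies on `read` and `write`, the same loop can be exercised against in-memory buffers such as `io.BytesIO` without touching S3.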
   
   I think you had an S3FileIO that used smart_open before. Is there a reason 
not to use that to wrap boto3 now? I think we would be able to avoid 
maintaining a lot of this code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

