danielcweeks commented on a change in pull request #1767:
URL: https://github.com/apache/iceberg/pull/1767#discussion_r523788964



##########
File path: aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputStream.java
##########
@@ -69,14 +132,54 @@ public void flush() throws IOException {
 
   @Override
   public void write(int b) throws IOException {
+    if (stream.getCount() >= multiPartSize) {
+      newStream();
+      uploadParts();
+    }
+
     stream.write(b);
     pos += 1;
   }
 
   @Override
   public void write(byte[] b, int off, int len) throws IOException {
-    stream.write(b, off, len);
+    int remaining = len;
+    int relativeOffset = off;
+
+    // Write the remainder of the part size to the staging file
+    // and continue to write new staging files if the write is
+    // larger than the part size.
+    while (stream.getCount() + remaining > multiPartSize) {
+      int writeSize = multiPartSize - (int) stream.getCount();

Review comment:
       I feel like this is really unlikely to be an issue. The loop is mostly there 
for completeness: to hit a second iteration, a single write would have to be at 
least 2x the multipart size, and since this is internally wrapped in a buffered 
output stream, that would require the part size to be less than 4K (less than 
half the output buffer size).
   
   I think you're right that it wouldn't start uploading parts until after the 
loop, but I just don't feel like it's worth the extra complexity.
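
   For context, here is a minimal sketch of how the rest of the loop likely plays 
out. The hunk above cuts off right after `writeSize` is computed, so the loop body, 
the `newStream()` / `uploadParts()` placement, and the trailing write below are 
assumptions inferred from the excerpt and the single-byte `write(int)` above, not 
the PR's actual code:

   ```java
   // Sketch only, assuming the excerpted fields/helpers (stream, pos,
   // multiPartSize, newStream(), uploadParts()) exist as shown in the diff.
   @Override
   public void write(byte[] b, int off, int len) throws IOException {
     int remaining = len;
     int relativeOffset = off;

     // Fill the current staging file up to the part boundary, rolling to a
     // fresh staging file each time the boundary is crossed. A second
     // iteration only happens when a single write spans more than one full
     // part, which is the "2x the multipart size" case discussed above.
     while (stream.getCount() + remaining > multiPartSize) {
       int writeSize = multiPartSize - (int) stream.getCount();
       stream.write(b, relativeOffset, writeSize);
       remaining -= writeSize;
       relativeOffset += writeSize;
       newStream();  // assumed helper, mirroring the single-byte write() above
     }

     // The leftover bytes fit under the part size, so they stay in the
     // current staging file. Per the discussion above, uploads of completed
     // parts would only be kicked off here, after the loop.
     stream.write(b, relativeOffset, remaining);
     pos += len;
     uploadParts();  // assumed placement, matching the comment thread
   }
   ```

   Moving `uploadParts()` inside the loop would let a very large write start 
uploading earlier, which is exactly the extra complexity the comment argues 
isn't worth it; the sketch keeps it after the loop to match the simpler approach.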



