xuang7 commented on code in PR #4136:
URL: https://github.com/apache/texera/pull/4136#discussion_r2658733492
##########
common/workflow-core/src/main/scala/org/apache/texera/service/util/S3StorageClient.scala:
##########
@@ -259,4 +259,59 @@ object S3StorageClient {
DeleteObjectRequest.builder().bucket(bucketName).key(objectKey).build()
)
}
+
+ /**
+ * Uploads a single part for an in-progress S3 multipart upload.
+ *
+ * This method wraps the AWS SDK v2 {@code UploadPart} API:
+ * it builds an {@link
software.amazon.awssdk.services.s3.model.UploadPartRequest}
+ * and streams the part payload via a {@link
software.amazon.awssdk.core.sync.RequestBody}.
+ *
+ * Payload handling:
+ * - If {@code contentLength} is provided, the payload is streamed
directly from {@code inputStream}
+ * using {@code RequestBody.fromInputStream(inputStream, len)}.
+ * - If {@code contentLength} is {@code None}, the entire {@code
inputStream} is read into memory
+ * ({@code readAllBytes}) and uploaded using {@code
RequestBody.fromBytes(bytes)}.
+ * This is convenient but can be memory-expensive for large parts;
prefer providing a known length.
+ *
+ * Notes:
+ * - {@code partNumber} must be in the valid S3 range (typically
1..10,000).
+ * - The caller is responsible for closing {@code inputStream}.
+ * - This method is synchronous and will block the calling thread until
the upload completes.
+ *
+ * @param bucket S3 bucket name.
+ * @param key Object key (path) being uploaded.
+ * @param uploadId Multipart upload identifier returned by
CreateMultipartUpload.
+ * @param partNumber 1-based part number for this upload.
+ * @param inputStream Stream containing the bytes for this part.
+ * @param contentLength Optional size (in bytes) of this part; provide it
to avoid buffering in memory.
Review Comment:
Makes sense, thank you for explaining. Since multipart upload requires
contentLength, it may not be necessary for current use case, but I agree
keeping it optional makes the method more general for potential future use. We
could also get others input.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]