vttranlina commented on PR #1981:
URL: https://github.com/apache/james-project/pull/1981#issuecomment-2380396460

   Recently, I had the chance to investigate this issue again, and I have tracked down the cause.
   
   Starting from Minio `RELEASE.2024-01-31T20-20-33Z`, the following happens when we perform a PUT Object request against a bucket that does not exist:
   
   The problem is that, for large objects (e.g., 12MB in our test), Minio returns a response with a `Connection: close` header, which is absent for smaller objects.
   In both scenarios, the response body is the same: HTTP 404 with `<Code>NoSuchBucket</Code>`.
   
   This header causes the AWS S3 SDK to throw an `SdkClientException` (message: "Unable to execute HTTP request: The connection was closed during the request"), which we don't handle -> the test case fails.
   
   Without the `Connection: close` header, the library throws a 
`NoSuchBucketException`, which we handle by creating the bucket (via 
`createBucketOnRetry`) -> the putObject then succeeds.
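
   For context, the current handling is roughly this shape (a simplified sketch of the `createBucketOnRetry` idea, not the exact James code; the class and method names are illustrative):

   ```java
   import reactor.core.publisher.Mono;
   import reactor.util.retry.Retry;
   import software.amazon.awssdk.core.async.AsyncRequestBody;
   import software.amazon.awssdk.services.s3.S3AsyncClient;
   import software.amazon.awssdk.services.s3.model.NoSuchBucketException;
   import software.amazon.awssdk.services.s3.model.PutObjectRequest;

   class BucketOnDemandPut {
       private final S3AsyncClient client;

       BucketOnDemandPut(S3AsyncClient client) {
           this.client = client;
       }

       // Put an object; on NoSuchBucketException, create the bucket once and retry the put.
       Mono<Void> putWithBucketRetry(String bucket, String key, byte[] payload) {
           return Mono.fromFuture(() -> client.putObject(
                   PutObjectRequest.builder().bucket(bucket).key(key).build(),
                   AsyncRequestBody.fromBytes(payload)))
               .retryWhen(Retry.max(1)
                   // the async SDK can wrap the error, so check the cause as well
                   .filter(t -> t instanceof NoSuchBucketException
                       || t.getCause() instanceof NoSuchBucketException)
                   .doBeforeRetryAsync(signal -> Mono.fromFuture(() ->
                           client.createBucket(b -> b.bucket(bucket)))
                       .then()))
               .then();
       }
   }
   ```

   With the `Connection: close` response, the error surfaces as `SdkClientException` instead, so the filter never matches and the failure propagates to the test.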
   
   
   This is why Rene mentioned: "Also if before the two ByteSource big blob saves you do a save on a smaller blob, then it's green."
   The smaller save fails with a plain `NoSuchBucketException`, the bucket gets created on retry, and the big saves then find it.
   I agree this won't happen in production since we create the bucket upfront.
   
   I'm not sure whether this is a bug or a feature of Minio (I'm asking them on their Slack channel and awaiting a response).
   
   ___________________
   
   Here are some of my thoughts around this issue:
   
   1. Modify the current logic to handle both `NoSuchBucketException` and `SdkClientException` (a widened retry filter is sketched after this list)
   
   // this approach doesn't sound convincing: it would create the bucket on any `SdkClientException`, not just a missing-bucket one
   
   2. Override the test class `S3MinioTest` to create the bucket before running the test logic.
   
   3. Use the S3 multipart upload API for large objects (see the sketch after this list): https://www.baeldung.com/aws-s3-multipart-upload
   
   4. Experiment with using `S3CrtAsyncClient` instead of `S3AsyncClient` (builder sketch after this list). I quickly researched CRT (the AWS Common Runtime), a client developed by AWS, and based on the promotion it sounds very promising for handling large object uploads.
   
   Regarding points (3) and (4), I'm unsure if these changes are worth it, 
considering our object size limit is 20-30MB.
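
   For (1), the change would essentially be widening the retry filter shown in the earlier sketch, something like:

   ```java
   import reactor.util.retry.Retry;
   import software.amazon.awssdk.core.exception.SdkClientException;
   import software.amazon.awssdk.services.s3.model.NoSuchBucketException;

   // Widened filter for the same retryWhen/doBeforeRetryAsync pipeline sketched earlier.
   // The doubt: SdkClientException covers many transport failures that have nothing
   // to do with a missing bucket, so this would create buckets on false positives too.
   Retry createBucketOnPutFailure = Retry.max(1)
       .filter(t -> t instanceof NoSuchBucketException
           || t.getCause() instanceof NoSuchBucketException
           || t instanceof SdkClientException);
   ```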
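
   For (3), a minimal low-level multipart sketch against the plain `S3AsyncClient` (illustrative only; note S3/Minio require every part except the last to be at least 5 MiB):

   ```java
   import java.util.List;

   import software.amazon.awssdk.core.async.AsyncRequestBody;
   import software.amazon.awssdk.services.s3.S3AsyncClient;
   import software.amazon.awssdk.services.s3.model.CompletedMultipartUpload;
   import software.amazon.awssdk.services.s3.model.CompletedPart;

   class MultipartSketch {
       // Upload a payload split into two parts under an existing bucket.
       static void upload(S3AsyncClient client, String bucket, String key,
                          byte[] part1, byte[] part2) {
           String uploadId = client.createMultipartUpload(b -> b.bucket(bucket).key(key))
               .join().uploadId();

           CompletedPart p1 = uploadPart(client, bucket, key, uploadId, 1, part1);
           CompletedPart p2 = uploadPart(client, bucket, key, uploadId, 2, part2);

           client.completeMultipartUpload(b -> b.bucket(bucket).key(key)
                   .uploadId(uploadId)
                   .multipartUpload(CompletedMultipartUpload.builder()
                       .parts(List.of(p1, p2)).build()))
               .join();
       }

       private static CompletedPart uploadPart(S3AsyncClient client, String bucket,
                                               String key, String uploadId,
                                               int partNumber, byte[] data) {
           String eTag = client.uploadPart(
                   b -> b.bucket(bucket).key(key).uploadId(uploadId).partNumber(partNumber),
                   AsyncRequestBody.fromBytes(data))
               .join().eTag();
           return CompletedPart.builder().partNumber(partNumber).eTag(eTag).build();
       }
   }
   ```

   Note this only changes how the bytes are sent; a put against a missing bucket would still fail, just at `createMultipartUpload`.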
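
   For (4), the CRT client is a drop-in implementation of the same `S3AsyncClient` interface, built via `crtBuilder()` (it needs the `aws-crt` dependency on the classpath and performs parallel multipart transfers internally). A sketch, with placeholder credentials and endpoint for a local Minio:

   ```java
   import java.net.URI;

   import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
   import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
   import software.amazon.awssdk.regions.Region;
   import software.amazon.awssdk.services.s3.S3AsyncClient;

   class CrtClientFactory {
       // CRT-based client pointed at a local Minio (values below are placeholders).
       static S3AsyncClient localMinioClient() {
           return S3AsyncClient.crtBuilder()
               .credentialsProvider(StaticCredentialsProvider.create(
                   AwsBasicCredentials.create("accessKey", "secretKey")))
               .region(Region.US_EAST_1)
               .endpointOverride(URI.create("http://127.0.0.1:9000"))
               .forcePathStyle(true)                     // Minio-friendly addressing
               .minimumPartSizeInBytes(8L * 1024 * 1024) // threshold for splitting puts
               .targetThroughputInGbps(1.0)
               .build();
       }
   }
   ```

   Whether the CRT transport reacts differently to Minio's `Connection: close` response would need to be verified experimentally.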

