vttranlina commented on PR #1981: URL: https://github.com/apache/james-project/pull/1981#issuecomment-2380396460
Recently I had the chance to investigate this issue again, and I have found the cause.

Starting from Minio `RELEASE.2024-01-31T20-20-33Z`, when we perform a PUT Object request and the bucket does not exist, the response depends on the object size. For large objects (e.g., 12MB in our test), Minio returns a response with a `Connection: close` header, which is absent for smaller objects. In both cases the response is otherwise the same: `HTTP 404, <Code>NoSuchBucket</Code>`.

This header causes the AWS SDK S3 client to throw an `SdkClientException` (message: "Unable to execute HTTP request: The connection was closed during the request"), which we do not handle, so the test case fails. Without the `Connection: close` header, the library throws a `NoSuchBucketException`, which we handle by creating the bucket (via `createBucketOnRetry`), and the putObject then succeeds.

This is why Rene mentioned: "Also if before the two ByteSource big blob saves you do a save on a smaller blob, then it's green."

I agree this won't happen in production since we create the bucket upfront. I'm not sure whether this is a bug or a feature of Minio (I'm asking them on their Slack channel and awaiting a response).

___________________

Here are some of my thoughts around this issue:

1. Modify the current logic to handle both `NoSuchBucketException` and `SdkClientException`. This approach doesn't sound convincing to me (creating a bucket on `SdkClientException`).
2. Override the test class S3MinioTest to create the bucket before running the logic.
3. Use the S3 multipart upload API for large objects. https://www.baeldung.com/aws-s3-multipart-upload
4. Experiment with using `S3CrtAsyncClient` instead of `S3AsyncClient`. I quickly researched "CRT", which is a client developed by AWS, and based on how it is promoted, it sounds very promising for handling large object uploads.

Regarding points (3) and (4), I'm unsure if these changes are worth it, considering our object size limit is 20-30MB.
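To make the two failure paths easier to follow, here is a minimal, stdlib-only sketch of the behaviour described above. It does not use the AWS SDK or the James code: `FakeStore`, `save`, the stand-in exception classes and the 10MB threshold are all hypothetical names and values chosen for illustration (the only size known from the test is that 12MB triggers the problem).

```java
import java.util.HashSet;
import java.util.Set;

final class CreateBucketOnRetrySketch {

    // Stand-ins for the SDK exceptions named above; NOT the real AWS SDK classes.
    static class NoSuchBucketException extends RuntimeException {
        NoSuchBucketException(String message) { super(message); }
    }

    static class SdkClientException extends RuntimeException {
        SdkClientException(String message) { super(message); }
    }

    // Hypothetical in-memory store imitating the observed Minio behaviour.
    static class FakeStore {
        // Arbitrary assumption: the real size threshold is not known.
        static final int LARGE_OBJECT_BYTES = 10 * 1024 * 1024;
        private final Set<String> buckets = new HashSet<>();

        void createBucket(String bucket) {
            buckets.add(bucket);
        }

        void putObject(String bucket, int sizeInBytes) {
            if (buckets.contains(bucket)) {
                return; // bucket exists, upload succeeds
            }
            if (sizeInBytes >= LARGE_OBJECT_BYTES) {
                // Large object + missing bucket: connection closed, client-side error.
                throw new SdkClientException(
                    "Unable to execute HTTP request: The connection was closed during the request");
            }
            // Small object + missing bucket: the 404 is surfaced as NoSuchBucket.
            throw new NoSuchBucketException("NoSuchBucket");
        }
    }

    // Simplified version of the retry idea: only NoSuchBucketException
    // triggers bucket creation; anything else propagates to the caller.
    static String save(FakeStore store, String bucket, int sizeInBytes) {
        try {
            store.putObject(bucket, sizeInBytes);
            return "saved";
        } catch (NoSuchBucketException e) {
            store.createBucket(bucket);
            store.putObject(bucket, sizeInBytes);
            return "saved after creating bucket";
        }
    }

    public static void main(String[] args) {
        int small = 1024;
        int large = 12 * 1024 * 1024;

        // Large blob first: the unhandled exception escapes.
        try {
            save(new FakeStore(), "test-bucket", large);
        } catch (SdkClientException e) {
            System.out.println("large first failed: " + e.getMessage());
        }

        // Small blob first: the bucket gets created, so the large save works.
        FakeStore store = new FakeStore();
        System.out.println(save(store, "test-bucket", small));
        System.out.println(save(store, "test-bucket", large));
    }
}
```

Under these assumptions, the sketch reproduces why a small save before the large ones turns the test green: the small save takes the `NoSuchBucketException` path and creates the bucket, so the later large save never hits the missing-bucket case.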