steveloughran commented on PR #7396: URL: https://github.com/apache/hadoop/pull/7396#issuecomment-2665905814
Thanks, these are really interesting results.

## performance difference

Yes, there is some with large files, but with all the tests ongoing it's hard to isolate. Can you

* do a full release build, move it out of the source tree and set it up with credentials somehow (or run in a VM with some)
* download the latest cloudstore jar (https://github.com/steveloughran/cloudstore) and use its `bandwidth` command? https://github.com/steveloughran/cloudstore/blob/main/src/main/site/bandwidth.md

something like

```
bin/hadoop jar $CLOUDSTORE bandwidth -block 256M -csv results.csv 10G $BUCKET/testfile
```

And repeat for checksums on/off (there's a `-xmlfile` option taking a path to an XML file of extra settings; a sample file is at the end of this comment), then share those CSVs?

Incidentally, presumably the checksum is calculated during the upload. We queue blocks for upload, and if the checksum could be calculated at the time the queueing takes place, it might be more efficient: the thread doing the upload would not be held up, only the worker thread of the application (there's a toy sketch of the idea at the end of this comment).

## delete throttling

```
[ERROR] Errors:
[ERROR]   ILoadTestS3ABulkDeleteThrottling.test_020_DeleteThrottling
[INFO]   Run 1: PASS
[ERROR]   Run 2: software.amazon.awssdk.services.s3.model.S3Exception: Please reduce your request rate. (Service: S3, Status Code: 200, Request ID: DP0XV0P28HHBMMB0, Extended Request ID: QS0YF7VpXRMDpgIsuVCZCavni+uFNTnsCA0pylJxoqXx9DdsGQot698AaQncMHPIO4qs0Fgce8AVRHL6i4V6Hg==)
[INFO]   Run 3: PASS
[INFO]   Run 4: PASS
[INFO]
```

This is very interesting, as it shows we aren't detecting, mapping and handling throttle responses from bulk requests. In the V1 SDK the response was always a 503; now we get a 200 and a text message, which is a pain, as string matching for errors is brittle.

* Would you be able to get a full stack trace of that? It should be in the output files of ILoadTestS3ABulkDeleteThrottling; a `mvn surefire-report:report-only` would generate the full HTML reports, but I'm happy with the raw files.
* Do you know if this error text is frozen?

Our bulk delete API does rate limit, but we don't do that for directory delete (yet), as we never did. Maybe we should revisit that.
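A minimal sketch of what that could look like, using Guava's `RateLimiter` purely as a placeholder (the class and method names here are made up; real code would go through whatever rate limiting hooks the store already exposes):

```java
// Sketch only: throttle the rate at which pages of a directory delete are
// POSTed, the way the bulk delete API already does. Names and the
// permits-per-second value are illustrative, not real Hadoop code.
import java.util.List;

import com.google.common.util.concurrent.RateLimiter;

public class PagedDeleteLimiter {

  // e.g. 5 bulk-delete pages/second; would need tuning per store.
  private final RateLimiter deleteRate = RateLimiter.create(5.0);

  void deletePage(List<String> keys) {
    deleteRate.acquire();   // block until a permit is available
    // ... issue the DeleteObjects request for this page of keys ...
  }
}
```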
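And on the detection side, roughly what mapping the new response would need, as a hypothetical helper (nothing like this exists in the codebase today); the string match is exactly the brittleness I'm worried about, hence the question about whether the message text is frozen:

```java
// Hypothetical helper, not existing Hadoop code: maps both the old 503 and
// the new "200 + error text" throttle responses to a single retryable signal.
import software.amazon.awssdk.services.s3.model.S3Exception;

public final class ThrottleDetection {

  // Brittle by design: if AWS ever rewords the message, this stops matching.
  private static final String THROTTLE_TEXT = "reduce your request rate";

  static boolean isThrottleResponse(S3Exception e) {
    if (e.statusCode() == 503) {
      return true;                              // classic V1-SDK-era throttle
    }
    String message = e.awsErrorDetails() != null
        ? e.awsErrorDetails().errorMessage()
        : e.getMessage();
    return message != null
        && message.toLowerCase().contains(THROTTLE_TEXT);
  }
}
```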
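Back on the checksum-at-queue-time idea from the first section, a toy sketch of the threading I have in mind; this is not the actual S3A block output stream code, and `CRC32C` is just a stand-in for whatever checksum the PR computes:

```java
// Toy sketch: compute the checksum on the application's worker thread at
// queue time, so the upload pool threads never pay for it.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.zip.CRC32C;

public class QueueTimeChecksum {

  private final ExecutorService uploadPool = Executors.newFixedThreadPool(4);

  /** Called on the application thread: checksum the block here, then queue. */
  void queueBlock(byte[] block) {
    CRC32C crc = new CRC32C();
    crc.update(block, 0, block.length);
    long checksum = crc.getValue();          // computed before queueing
    uploadPool.submit(() -> upload(block, checksum));
  }

  private void upload(byte[] block, long checksum) {
    // the upload thread just attaches the precomputed checksum to the request
  }
}
```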
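PS: for the `-xmlfile` runs, the settings file only needs the one option. The property name below is a placeholder; substitute whichever option this PR actually introduces:

```xml
<configuration>
  <property>
    <!-- placeholder name: use the checksum option this PR defines -->
    <name>fs.s3a.checksum.enabled</name>
    <value>true</value>
  </property>
</configuration>
```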