yihua commented on code in PR #13836:
URL: https://github.com/apache/hudi/pull/13836#discussion_r2331661737
##########
hudi-aws/src/main/java/org/apache/hudi/aws/transaction/lock/S3StorageLockClient.java:
##########
@@ -267,6 +268,52 @@ private static S3Client createS3Client(Region region, long timeoutSecs, Properti
.region(region).build();
}
+ @Override
+ public Option<String> readObject(String filePath, boolean checkExistsFirst) {
+ try {
+ // Parse the file path to get bucket and key
+ URI uri = new URI(filePath);
+ String bucket = uri.getHost();
+ String key = uri.getPath().replaceFirst("/", "");
+
+ if (checkExistsFirst) {
+ // First check if the file exists (lightweight HEAD request)
+ try {
+ s3Client.headObject(HeadObjectRequest.builder()
+ .bucket(bucket)
+ .key(key)
+ .build());
+ } catch (S3Exception e) {
+ if (e.statusCode() == NOT_FOUND_ERROR_CODE) {
+ // File doesn't exist - this is the common case for optional configs
+ logger.debug("JSON config file not found: {}", filePath);
+ return Option.empty();
+ }
+ throw e; // Re-throw other errors
+ }
+ }
Review Comment:
Got it. If there is a performance improvement without affecting correctness,
we can keep this. I'm curious about the case where the object does not exist
(the common case you mentioned): there is no content to read for
`getObjectAsBytes`, so does `getObjectAsBytes` still have higher latency than
`headObject`?
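
To make the question concrete, here is a rough sketch of the request counts for
each path. It does not use the AWS SDK: `CountingStore`, `NotFound`, and
`requestsFor` are hypothetical stand-ins, and it assumes a missing key on the
GET path is also mapped to an empty result (that part of the method is not
visible in the hunk above). It only counts round trips and says nothing about
the latency of each call.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for an object store; NOT the AWS SDK or the PR's code.
final class ReadCostSketch {
  /** Thrown by the fake store for a missing key, mimicking a 404 response. */
  static final class NotFound extends RuntimeException {}

  /** Fake store that counts every request it serves. */
  static final class CountingStore {
    private final Set<String> keys;
    int requests = 0;

    CountingStore(Set<String> keys) {
      this.keys = keys;
    }

    // Metadata-only existence check (plays the role of a HEAD request).
    void head(String key) {
      requests++;
      if (!keys.contains(key)) {
        throw new NotFound();
      }
    }

    // Full read (plays the role of a GET request).
    String get(String key) {
      requests++;
      if (!keys.contains(key)) {
        throw new NotFound();
      }
      return "{}";
    }
  }

  // Assumption: a missing key yields an empty result on both paths.
  static Optional<String> readObject(CountingStore store, String key, boolean checkExistsFirst) {
    try {
      if (checkExistsFirst) {
        store.head(key);
      }
      return Optional.of(store.get(key));
    } catch (NotFound e) {
      return Optional.empty();
    }
  }

  // Number of requests issued for one read under the given conditions.
  static int requestsFor(boolean exists, boolean checkExistsFirst) {
    CountingStore store = new CountingStore(exists ? Set.of("cfg.json") : Set.<String>of());
    readObject(store, "cfg.json", checkExistsFirst);
    return store.requests;
  }

  public static void main(String[] args) {
    System.out.println("missing, HEAD first: " + requestsFor(false, true));
    System.out.println("missing, GET only:   " + requestsFor(false, false));
    System.out.println("present, HEAD first: " + requestsFor(true, true));
    System.out.println("present, GET only:   " + requestsFor(true, false));
  }
}
```

In this model a missing object costs one request either way, while an existing
object costs two requests with the pre-check and one without. So the pre-check
only pays off if a 404 on a HEAD is measurably cheaper than a 404 on a GET.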