gianm commented on a change in pull request #8903: S3 input source
URL: https://github.com/apache/incubator-druid/pull/8903#discussion_r348869424
##########
File path:
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3Utils.java
##########
@@ -168,6 +244,26 @@ public S3ObjectSummary next()
};
}
+
+ /**
+ * Create an {@link URI} from the given {@link S3ObjectSummary}. The result URI is composed as below.
+ *
+ * <pre>
+ * {@code s3://{BUCKET_NAME}/{OBJECT_KEY}}
+ * </pre>
+ */
+ public static URI summaryToUri(S3ObjectSummary object)
+ {
+ final String originalAuthority = object.getBucketName();
+ final String originalPath = object.getKey();
+ final String authority = originalAuthority.endsWith("/") ?
+ originalAuthority.substring(0, originalAuthority.length() - 1) :
+ originalAuthority;
+ final String path = originalPath.startsWith("/") ? originalPath.substring(1) : originalPath;
+
+ return URI.create(StringUtils.format("s3://%s/%s", authority, path));
Review comment:
This is bad, because it won't encode special characters in `path`. Imagine the
`path` has a `?` in it: everything after the `?` would be parsed as the query
string, so pulling the key back out of the URI later won't work. The path needs
to be URI-encoded. The tricky characters are `/` (which you _don't_ want to
encode) and `?`, `#`, and others (which you do).
`StringUtils.urlEncode` might help you here.
Alternatively, don't use URIs internally; use bucket/key pairs instead.
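
To make the encoding suggestion concrete, here is a rough sketch of one way it could look. It uses only `java.net.URLEncoder` from the JDK rather than `StringUtils.urlEncode`, and the class and method names (`S3UriSketch`, `encodeKey`, `toUri`, `keyFromUri`) are hypothetical, not anything in the PR or in Druid. It also assumes the bucket name needs no escaping and does not strip leading or trailing slashes the way `summaryToUri` does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the PR or of Druid.
final class S3UriSketch
{
  private S3UriSketch()
  {
  }

  // Percent-encode each '/'-separated segment of the key, leaving the separators alone.
  static String encodeKey(String key)
  {
    final String[] segments = key.split("/", -1); // -1 keeps trailing empty segments
    final StringBuilder encoded = new StringBuilder();
    for (int i = 0; i < segments.length; i++) {
      if (i > 0) {
        encoded.append('/');
      }
      encoded.append(encodeSegment(segments[i]));
    }
    return encoded.toString();
  }

  private static String encodeSegment(String segment)
  {
    try {
      // URLEncoder does form encoding, which writes a space as '+'. A literal '+'
      // is emitted as %2B, so any '+' left in the output stands for a space.
      return URLEncoder.encode(segment, StandardCharsets.UTF_8.name()).replace("+", "%20");
    }
    catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  // Assumes the bucket name contains no characters that need escaping.
  static URI toUri(String bucket, String key)
  {
    return URI.create("s3://" + bucket + "/" + encodeKey(key));
  }

  static String keyFromUri(URI uri)
  {
    // getPath() returns the decoded path with a leading '/'.
    return uri.getPath().substring(1);
  }
}
```

With something like this, `toUri("my-bucket", "dir/file?v=1")` keeps the `?` inside the path as `%3F`, and `keyFromUri` gives back the original key. With the unencoded format string, `getPath()` on the same input would stop at `/dir/file`.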