jackye1995 commented on code in PR #7128:
URL: https://github.com/apache/iceberg/pull/7128#discussion_r1142662715
##########
core/src/main/java/org/apache/iceberg/LocationProviders.java:
##########
@@ -167,5 +167,14 @@ private static String pathContext(String tableLocation) {
return resolvedContext;
}
+
+ private static String computeHash(String fileName) {
+ Preconditions.checkState(fileName != null, "fileName cannot be null");
+ byte[] messageDigest =
+ HASH_FUNC.hashBytes(fileName.getBytes(StandardCharsets.UTF_8)).asBytes();
+ String hash = Base64.getUrlEncoder().encodeToString(messageDigest);
+
+ return hash.substring(0, 8);
Review Comment:
Looking at the javadoc, the hash function always produces 32 bits. Base64
encodes 6 bits per char, so we only get 5 full chars out of it, plus a partial
char and padding. And from what @danielcweeks suggested before, 5 chars is
enough entropy, so we can take the last 30 bits and always encode a 5-char
prefix from them. Is this understanding right?
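
To make the arithmetic concrete, here is a minimal sketch, assuming the hash is already available as a 32-bit `int`. The class `HashPrefixSketch` and its methods are hypothetical names for illustration, not code from this PR. It shows that 4 bytes encode to 6 data chars plus `==` (8 chars in total, so `substring(0, 8)` returns the padding too), and one possible way to map the low 30 bits to exactly 5 chars.

```java
import java.util.Base64;

// Hypothetical helper for illustration only; not part of LocationProviders.
final class HashPrefixSketch {
  // URL-safe Base64 alphabet (RFC 4648, section 5).
  private static final String BASE64_URL_ALPHABET =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

  private HashPrefixSketch() {}

  // Encodes all 32 bits (big-endian) with the URL-safe encoder, padding included.
  static String encodeAllBits(int hash) {
    byte[] bytes = {
      (byte) (hash >>> 24), (byte) (hash >>> 16), (byte) (hash >>> 8), (byte) hash
    };
    return Base64.getUrlEncoder().encodeToString(bytes);
  }

  // Keeps the low 30 bits and maps each 6-bit group to one character.
  static String fiveCharPrefix(int hash) {
    int bits = hash & 0x3FFFFFFF;
    StringBuilder sb = new StringBuilder(5);
    for (int shift = 24; shift >= 0; shift -= 6) {
      sb.append(BASE64_URL_ALPHABET.charAt((bits >>> shift) & 0x3F));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    int hash = 0xDEADBEEF; // arbitrary sample value
    System.out.println(encodeAllBits(hash));
    System.out.println(fiveCharPrefix(hash));
  }
}
```

With this shape the prefix has a fixed length of 5 and never contains `=`, at the cost of dropping the top 2 bits of the hash.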
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]