s-sanjay commented on a change in pull request #1350: [HUDI-629]: Replace
Guava's Hashing with an equivalent in NumericUtils.java
URL: https://github.com/apache/incubator-hudi/pull/1350#discussion_r392642073
##########
File path:
hudi-common/src/main/java/org/apache/hudi/common/util/NumericUtils.java
##########
@@ -31,4 +38,27 @@ public static String humanReadableByteCount(double bytes) {
String pre = "KMGTPE".charAt(exp - 1) + "";
return String.format("%.1f %sB", bytes / Math.pow(1024, exp), pre);
}
+
+ public static long getMessageDigestHash(final String algorithmName, final String string) {
+ MessageDigest md;
+ try {
+ md = MessageDigest.getInstance(algorithmName);
+ } catch (NoSuchAlgorithmException e) {
+ throw new HoodieException(e);
+ }
+ return asLong(Objects.requireNonNull(md).digest(string.getBytes(StandardCharsets.UTF_8)));
+ }
+
+ public static long asLong(byte[] bytes) {
+ ValidationUtils.checkState(bytes.length >= 8, "HashCode#asLong() requires >= 8 bytes.");
+ return padToLong(bytes);
+ }
+
+ public static long padToLong(byte[] bytes) {
+ long retVal = (bytes[0] & 0xFF);
+ for (int i = 1; i < Math.min(bytes.length, 8); i++) {
Review comment:
I wonder whether we could make this private instead of public and test it
through the asLong method.
Also, would it help readability if we unrolled the for loop like this?
```
// Mask each byte to avoid sign extension and widen to long before shifting.
byte[] padded = Arrays.copyOf(bytes, 8);
long retVal = padded[0] & 0xFFL;
retVal |= (padded[1] & 0xFFL) << 8;
retVal |= (padded[2] & 0xFFL) << 16;
retVal |= (padded[3] & 0xFFL) << 24;
retVal |= (padded[4] & 0xFFL) << 32;
retVal |= (padded[5] & 0xFFL) << 40;
retVal |= (padded[6] & 0xFFL) << 48;
retVal |= (padded[7] & 0xFFL) << 56;
```
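For comparison, here is a minimal, self-contained sketch that puts the loop form and an unrolled form side by side and checks that they agree. The class name `PadToLongSketch` and both method names are hypothetical. The loop body is cut off in the quoted diff, so the body used here is an assumption: a little-endian fold in the style of Guava's `HashCode`. Each byte is masked with `0xFFL` because a negative byte would otherwise sign-extend, and an unmasked byte is promoted only to `int`, where shift distances of 32 or more wrap around.

```java
import java.util.Arrays;

class PadToLongSketch {

  // Loop form. The body is an assumption (the quoted diff is truncated):
  // fold up to the first 8 bytes into a long, least significant byte first.
  static long padToLongLoop(byte[] bytes) {
    long retVal = (bytes[0] & 0xFF);
    for (int i = 1; i < Math.min(bytes.length, 8); i++) {
      retVal |= (bytes[i] & 0xFFL) << (i * 8);
    }
    return retVal;
  }

  // Unrolled form. Arrays.copyOf zero-pads short inputs and truncates
  // long ones to exactly 8 bytes, so every index below is valid.
  static long padToLongUnrolled(byte[] bytes) {
    byte[] padded = Arrays.copyOf(bytes, 8);
    long retVal = padded[0] & 0xFFL;
    retVal |= (padded[1] & 0xFFL) << 8;
    retVal |= (padded[2] & 0xFFL) << 16;
    retVal |= (padded[3] & 0xFFL) << 24;
    retVal |= (padded[4] & 0xFFL) << 32;
    retVal |= (padded[5] & 0xFFL) << 40;
    retVal |= (padded[6] & 0xFFL) << 48;
    retVal |= (padded[7] & 0xFFL) << 56;
    return retVal;
  }

  public static void main(String[] args) {
    // 0x80 and 0xFF are negative as Java bytes, which exercises the masking.
    byte[] sample = {(byte) 0x80, 0x01, (byte) 0xFF};
    System.out.println(Long.toHexString(padToLongLoop(sample)));           // prints "ff0180"
    System.out.println(padToLongLoop(sample) == padToLongUnrolled(sample)); // prints "true"
  }
}
```

One difference to keep in mind: the unrolled form accepts an empty array and returns 0, while the loop form reads `bytes[0]` unconditionally and would throw on empty input.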