difin commented on code in PR #5792:
URL: https://github.com/apache/hive/pull/5792#discussion_r2069608953


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java:
##########
@@ -243,14 +247,49 @@ public static long getDeleteFilePosition(Record rec) {
     return rec.get(DELETE_FILE_META_COLS.get(MetadataColumns.ROW_POSITION), Long.class);
   }
 
+  private static long hashObjectArray(Object[] values) {
+    Hasher hasher = Hashing.murmur3_128().newHasher();
+
+    for (Object val : values) {
+      if (val == null) {
+        // Unique constant for null
+        hasher.putInt(0xDEADBEEF);
+      } else if (val instanceof Integer) {
+        hasher.putInt((Integer) val);
+      } else if (val instanceof Long) {
+        hasher.putLong((Long) val);
+      } else if (val instanceof String) {
+        hasher.putString((String) val, StandardCharsets.UTF_8);
+      } else if (val instanceof Boolean) {
+        hasher.putBoolean((Boolean) val);
+      } else if (val instanceof Short) {
+        hasher.putShort((Short) val);
+      } else if (val instanceof Byte) {
+        hasher.putByte((Byte) val);
+      } else if (val instanceof Character) {
+        hasher.putChar((Character) val);
+      } else if (val instanceof Double) {
+        hasher.putDouble((Double) val);
+      } else if (val instanceof Float) {
+        hasher.putFloat((Float) val);
+      } else {
+        // Fallback to a hash derived from the object's hashCode()
+        hasher.putLong(Objects.hash(val));
+      }
+    }
+
+    HashCode hashCode = hasher.hash();
+    return hashCode.asLong();
+  }
+
   public static long computeHash(StructLike struct) {

Review Comment:
   I reimplemented this using Iceberg's code, as you suggested. It works, but 
only when combined with the fix that avoids collisions between null and 0. 
   
   Iceberg hashes structs with a recursive function, and the base case of that 
recursion is `Objects::hashCode`. That has the same issue we are trying to fix 
in hive-iceberg: it returns 0 for null, which is also the hash of an Integer 0.
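
   To make the collision concrete, here is a minimal, standalone sketch. It is 
not the code from this PR or from Iceberg: the class name, both method names, 
and the combining formula are hypothetical, chosen only to show how a base case 
of `Objects.hashCode` maps null and 0 to the same value, and how a dedicated 
null constant (the diff above uses `0xDEADBEEF`) separates them.

```java
import java.util.Objects;

// Hypothetical illustration; not the PR's or Iceberg's implementation.
class NullZeroCollision {
  // Assumed sentinel for null, mirroring the constant used in the diff above.
  private static final int NULL_SENTINEL = 0xDEADBEEF;

  // Combines element hashes using Objects.hashCode as the per-element hash.
  // Objects.hashCode(null) is 0, the same as Integer.valueOf(0).hashCode().
  public static int naiveHash(Object[] values) {
    int result = 1;
    for (Object v : values) {
      result = 31 * result + Objects.hashCode(v);
    }
    return result;
  }

  // Same combination, but null contributes a dedicated constant instead of 0.
  // Caveat: a non-null value whose hashCode equals the sentinel still collides.
  public static int nullAwareHash(Object[] values) {
    int result = 1;
    for (Object v : values) {
      int h = (v == null) ? NULL_SENTINEL : v.hashCode();
      result = 31 * result + h;
    }
    return result;
  }

  public static void main(String[] args) {
    Object[] withNull = {null, 1};
    Object[] withZero = {0, 1};

    // The naive version cannot tell the two rows apart.
    System.out.println(naiveHash(withNull) == naiveHash(withZero));
    // The null-aware version can.
    System.out.println(nullAwareHash(withNull) == nullAwareHash(withZero));
  }
}
```

   The sentinel only moves null away from 0; it does not make null unique, so a 
stronger scheme would also need to tag each value with its type or presence.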
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

