onebox-li commented on code in PR #2086:
URL: 
https://github.com/apache/incubator-celeborn/pull/2086#discussion_r1392094713


##########
common/src/main/scala/org/apache/celeborn/common/meta/WorkerInfo.scala:
##########
@@ -232,8 +235,21 @@ class WorkerInfo(
   }
 
   override def hashCode(): Int = {
-    val state = Seq(host, rpcPort, pushPort, fetchPort, replicatePort)
-    state.map(_.hashCode()).foldLeft(0)((a, b) => 31 * a + b)
+    var h = hash
+    if (h == 0 || isZeroHash) {
+      val state = Array(host, rpcPort, pushPort, fetchPort, replicatePort)
+      var i = 0
+      while (i < state.length) {
+        h = 31 * h + state(i).hashCode()
+        i = i + 1
+      }
+      if (h == 0) {
+        isZeroHash = true
+      } else {
+        hash = h
+      }
+    }
+    h

Review Comment:
   I have tested WorkerInfo#hashCode() with and without cache. 
   ```
   hashCode() with cache 1 cost 3665 ns
   hashCode() with cache 2 cost 472 ns
   hashCode() with cache 3 cost 390 ns
   
   hashCode() without cache 1 cost 3389 ns
   hashCode() without cache 2 cost 1872 ns
   hashCode() without cache 3 cost 1915 ns
   ```
   Since the timings are in nanoseconds, the absolute improvement is small, 
but the cost of keeping the cache also seems small. Maybe we can retain the 
cache?
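
   For anyone who wants to reproduce this kind of measurement, here is a 
   minimal sketch. It is not the code that produced the numbers above. 
   `Endpoint`, `uncachedHash`, `HashBench` and `timeNs` are hypothetical 
   names, the host and port values are arbitrary, and it assumes the 1/2/3 
   labels denote consecutive calls on the same instance. Single-shot 
   `System.nanoTime()` timings are heavily affected by JIT warm-up, so a 
   harness such as JMH would likely give more reliable figures.
   ```scala
   // Hypothetical stand-in for WorkerInfo, keeping only the five fields
   // that take part in the hash in the hunk above.
   final class Endpoint(
       val host: String,
       val rpcPort: Int,
       val pushPort: Int,
       val fetchPort: Int,
       val replicatePort: Int) {

     private var hash = 0
     private var isZeroHash = false

     // The pre-PR computation: fold the field hashes with multiplier 31.
     def uncachedHash(): Int =
       Seq[Any](host, rpcPort, pushPort, fetchPort, replicatePort)
         .map(_.hashCode())
         .foldLeft(0)((a, b) => 31 * a + b)

     // Cached variant. The guard mirrors java.lang.String#hashCode:
     // recompute only while nothing is cached and the hash is not known
     // to be zero.
     override def hashCode(): Int = {
       var h = hash
       if (h == 0 && !isZeroHash) {
         h = uncachedHash()
         if (h == 0) isZeroHash = true else hash = h
       }
       h
     }
   }

   object HashBench {
     // Wall-clock time of a single evaluation of `body`, in nanoseconds.
     def timeNs(body: => Int): Long = {
       val start = System.nanoTime()
       body
       System.nanoTime() - start
     }

     def main(args: Array[String]): Unit = {
       val w = new Endpoint("host-1", 9097, 9092, 9093, 9094)
       for (i <- 1 to 3)
         println(s"hashCode() with cache $i cost ${timeNs(w.hashCode())} ns")
       println()
       for (i <- 1 to 3)
         println(s"hashCode() without cache $i cost ${timeNs(w.uncachedHash())} ns")
     }
   }
   ```
   Note that the guard in this sketch is `h == 0 && !isZeroHash`. With 
   `h == 0 || isZeroHash`, as in the hunk, a hash that happens to be zero 
   would be recomputed on every call, since `hash` stays 0 in that case.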


