onebox-li commented on code in PR #2086:
URL: https://github.com/apache/incubator-celeborn/pull/2086#discussion_r1392094713
##########
common/src/main/scala/org/apache/celeborn/common/meta/WorkerInfo.scala:
##########
@@ -232,8 +235,21 @@ class WorkerInfo(
   }
   override def hashCode(): Int = {
-    val state = Seq(host, rpcPort, pushPort, fetchPort, replicatePort)
-    state.map(_.hashCode()).foldLeft(0)((a, b) => 31 * a + b)
+    var h = hash
+    if (h == 0 || isZeroHash) {
+      val state = Array(host, rpcPort, pushPort, fetchPort, replicatePort)
+      var i = 0
+      while (i < state.length) {
+        h = 31 * h + state(i).hashCode()
+        i = i + 1
+      }
+      if (h == 0) {
+        isZeroHash = true
+      } else {
+        hash = h
+      }
+    }
+    h
Review Comment:
I have tested `WorkerInfo#hashCode()` with and without the hash cache:
```
hashCode() with cache 1 cost 3665 ns
hashCode() with cache 2 cost 472 ns
hashCode() with cache 3 cost 390 ns
hashCode() without cache 1 cost 3389 ns
hashCode() without cache 2 cost 1872 ns
hashCode() without cache 3 cost 1915 ns
```
Since the times are in nanoseconds, the absolute improvement is modest, but
the overhead of keeping the cache also looks negligible. Maybe we can retain the cache?
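For reference, the caching pattern in the diff mirrors `java.lang.String.hashCode()`: compute the hash lazily, cache any non-zero result, and set a flag when the hash is legitimately zero so it is not recomputed on every call. Below is a minimal, self-contained Scala sketch of that pattern; the `Endpoint` class and its `cachedHash`/`zeroHash` fields are illustrative names, not the actual `WorkerInfo` members, and the guard here uses `h == 0 && !zeroHash` so the zero-hash case short-circuits:

```scala
// Illustrative sketch of the lazy hash-caching pattern (not WorkerInfo itself).
class Endpoint(val host: String, val rpcPort: Int, val pushPort: Int) {
  @volatile private var cachedHash: Int = 0      // 0 means "not computed yet"
  @volatile private var zeroHash: Boolean = false // true if the real hash is 0

  override def hashCode(): Int = {
    var h = cachedHash
    // Recompute only when nothing is cached AND the hash is not known to be 0.
    if (h == 0 && !zeroHash) {
      val state = Array[Any](host, rpcPort, pushPort)
      var i = 0
      while (i < state.length) {
        h = 31 * h + state(i).hashCode()
        i += 1
      }
      if (h == 0) zeroHash = true else cachedHash = h
    }
    h
  }
}
```

A benign race on the two fields is acceptable here because recomputation is idempotent; this is the same reasoning `String.hashCode()` relies on.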
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]