[ https://issues.apache.org/jira/browse/SPARK-49491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878686#comment-17878686 ]
Yuming Wang edited comment on SPARK-49491 at 9/3/24 2:46 AM:
-------------------------------------------------------------
LongMap vs HashMap
{code:scala}
import scala.util.Random
import org.apache.spark.benchmark.Benchmark

val size = 4000
val map1 = new collection.mutable.HashMap[Long, Object]()
val map2 = new collection.mutable.LongMap[Object]()

// Pre-populate both maps with the keys 0 until size.
Range(0, size).foreach { id =>
  map1.put(id, new Object())
  map2.put(id, new Object())
}

// Random lookup keys in [0, size + 10). The 10 keys >= size are not
// pre-populated, so the first lookup of each one inserts a new entry.
val keys = Range(1, size * 20000).map { _ =>
  new Random().nextInt(size + 10)
}

val benchmark = new Benchmark("Benchmark Map", size, minNumIters = 30)
benchmark.addCase("HashMap") { _ =>
  keys.foreach { key => map1.getOrElseUpdate(key, new Object()) }
}
benchmark.addCase("LongMap") { _ =>
  keys.foreach { key => map2.getOrElseUpdate(key, new Object()) }
}
benchmark.run()
{code}
2.12.18:
{noformat}
OpenJDK 64-Bit Server VM 1.8.0_382-b05 on Mac OS X 14.5
Apple M2 Max
Benchmark Map:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
HashMap                                            1870           1888          29          0.0      467420.1       1.0X
LongMap                                             629            634           4          0.0      157347.9       3.0X
{noformat}
2.13.8:
{noformat}
OpenJDK 64-Bit Server VM 1.8.0_382-b05 on Mac OS X 14.5
Apple M2 Max
Benchmark Map:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
HashMap                                             735            759          52          0.0      183763.2       1.0X
LongMap                                             570            575           4          0.0      142464.2       1.3X
{noformat}
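For reading the tables above, here is a small sketch of how the Per Row(ns) and Relative columns appear to follow from Best Time(ms). The formulas are inferred from the printed numbers, and the object and method names (BenchmarkColumns, perRowNs, relative) are hypothetical. Note that Per Row divides by the `size` argument (4000), not by the roughly 80 million lookups each iteration performs, so the last figure printed below is closer to a per-lookup cost.

```scala
object BenchmarkColumns {
  // Nanoseconds per "row", where rows is the count handed to the benchmark.
  def perRowNs(bestMs: Double, rows: Long): Double = bestMs * 1e6 / rows

  // Speedup of a case relative to the first (baseline) case.
  def relative(baselineBestMs: Double, caseBestMs: Double): Double =
    baselineBestMs / caseBestMs

  def main(args: Array[String]): Unit = {
    val rows = 4000L                 // `size` in the benchmark script
    val lookups = 4000L * 20000 - 1  // length of Range(1, size * 20000)
    // Best Time(ms) copied from the tables: (Scala version, HashMap, LongMap).
    val results = Seq(("2.12.18", 1870.0, 629.0), ("2.13.8", 735.0, 570.0))
    results.foreach { case (version, hashMapMs, longMapMs) =>
      println(f"$version: LongMap is ${relative(hashMapMs, longMapMs)}%.1fX, " +
        f"HashMap ${perRowNs(hashMapMs, rows)}%.1f ns per row, " +
        f"${perRowNs(hashMapMs, lookups)}%.1f ns per lookup")
    }
  }
}
```

The inputs are the rounded Best Time values, so the results can differ from the table in the last digits.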
> Replace HashMap with LongMap or AnyRefMap
> -----------------------------------------
>
> Key: SPARK-49491
> URL: https://issues.apache.org/jira/browse/SPARK-49491
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Yuming Wang
> Priority: Major
>
> JDK 1.8:
> {noformat}
> OpenJDK 64-Bit Server VM 1.8.0_382-b05 on Mac OS X 14.5
> Apple M2 Max
> Benchmark Map:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------------------------------
> HashMap                                            2028           2063          98          0.0      506933.0       1.0X
> AnyRefMap                                          1901           1936          19          0.0      475346.3       1.1X
> {noformat}
> Java 17:
> {noformat}
> OpenJDK 64-Bit Server VM 17.0.7+7-LTS on Mac OS X 14.5
> Apple M2 Max
> Benchmark Map:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------------------------------
> HashMap                                            1575           1615          47          0.0      393832.3       1.0X
> AnyRefMap                                          1495           1502           5          0.0      373664.5       1.1X
> {noformat}
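The replacement named in the issue title depends on the specialized map behaving like HashMap for the calls being swapped. Below is a minimal sketch, assuming only `put`, `getOrElseUpdate` and `size` are used at the call site. It replays the same key sequence against a HashMap[Long, String] and a LongMap[String] and checks that both return the same values. MapSwapCheck and run are hypothetical names, not Spark code, and String values stand in for whatever the real maps hold.

```scala
import scala.collection.mutable

object MapSwapCheck {
  // Replays one key sequence against both map types and returns their final sizes.
  def run(size: Int, keys: Seq[Int]): (Int, Int) = {
    val hashMap = new mutable.HashMap[Long, String]()
    val longMap = new mutable.LongMap[String]()

    // Pre-populate both maps with the keys 0 until size.
    (0 until size).foreach { id =>
      hashMap.put(id.toLong, s"v$id")
      longMap.put(id.toLong, s"v$id")
    }

    // Same lookup-or-insert call on both maps; the returned values must agree.
    keys.foreach { k =>
      val fromHashMap = hashMap.getOrElseUpdate(k.toLong, s"v$k")
      val fromLongMap = longMap.getOrElseUpdate(k.toLong, s"v$k")
      assert(fromHashMap == fromLongMap, s"maps disagree on key $k")
    }
    (hashMap.size, longMap.size)
  }

  def main(args: Array[String]): Unit = {
    // Keys 5 and 13 are absent initially, so each is inserted once.
    val (hashMapSize, longMapSize) = run(4, Seq(0, 3, 5, 5, 13))
    println(s"HashMap size: $hashMapSize, LongMap size: $longMapSize")
  }
}
```

A check like this only covers behavior, not speed; the timing comparison still needs a benchmark such as the one in the comment.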