GitHub user ConeyLiu commented on the issue:
https://github.com/apache/spark/pull/22371
Thanks @felixcheung, @srowen, @cloud-fan for your time. There is only one
instance of `IndexShuffleBlockResolver` per executor, and the synchronization
is there to keep the update safe when different attempts of the same task
update at the same time. For most tasks the synchronization is unnecessary,
and the update itself is very simple.
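
As a rough illustration of the contention described above, here is a minimal sketch. It is not the code in this PR or in Spark: `SketchResolver`, `commitCoarse`, `commitPerKey`, and `winner` are hypothetical names, and the per-key variant is just one possible way to narrow the lock, assumed here for the sake of the example.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for a per-executor resolver; NOT Spark's actual class.
class SketchResolver {
  // (shuffleId, mapId) -> id of the attempt that committed first
  private val committed = new ConcurrentHashMap[(Int, Int), Integer]()

  // Coarse variant: one lock on the whole resolver, so unrelated
  // map tasks also wait on each other here.
  def commitCoarse(shuffleId: Int, mapId: Int, attemptId: Int): Boolean = synchronized {
    val key = (shuffleId, mapId)
    if (committed.containsKey(key)) {
      false // another attempt of the same task already committed
    } else {
      committed.put(key, Int.box(attemptId))
      true
    }
  }

  // Narrower variant (an assumption for illustration): only attempts
  // of the same map output contend, via an atomic putIfAbsent.
  def commitPerKey(shuffleId: Int, mapId: Int, attemptId: Int): Boolean =
    committed.putIfAbsent((shuffleId, mapId), Int.box(attemptId)) == null

  // Which attempt won for a given map output, if any.
  def winner(shuffleId: Int, mapId: Int): Option[Int] =
    Option(committed.get((shuffleId, mapId))).map(_.intValue)
}
```

In both variants the first attempt to commit a given map output wins and later attempts are rejected. The difference is only in how many unrelated tasks share the lock.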
I have tested this locally; the results are as follows. I admit that this
change brings little improvement for complex tasks, but it does not cause
any performance degradation.
`./spark-shell --master local[20] --driver-memory 40g`
`spark.range(0, 10000000, 1, 100).repartition(200).count()`
before:

map | reduce
---- | ---
2s | 0.4s
0.8s | 0.2s
0.7s | 0.2s

after:

map | reduce
---- | ---
0.8s | 0.2s
0.5s | 0.4s
0.5s | 0.2s