SalvadorRomo opened a new pull request, #2547:
URL: https://github.com/apache/uniffle/pull/2547
Title: [#2544][Bug] NPE about StatisticsCodec
### What changes were proposed in this pull request?
Convert the `List<codeCost>` into a synchronized list.
### Why are the changes needed?
This bug occurs in a concurrent environment. When Spark workers run with
`RssShuffleManager` and the `spark.rss.client.io.compression.statisticsEnabled`
property enabled, they log their compression statistics when they finish.
Because the class was not designed for concurrent access, reading the
`List<codeCost>` inside the `codec.statistics()` method runs into a race
condition.
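
As a rough illustration of the idea (not the actual patch), the sketch below wraps an `ArrayList` with `Collections.synchronizedList` so that concurrent writers cannot corrupt it. With a plain `ArrayList`, concurrent `add` calls can lose updates and leave `null` slots, which is one plausible way a later read hits an NPE. The names `StatisticsSketch`, `CodecCost`, `record`, and `totalMillis` are hypothetical stand-ins and are not taken from the Uniffle code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a codec statistics holder; NOT the real Uniffle class.
public class StatisticsSketch {

  // Hypothetical record of the cost of one compression call.
  static final class CodecCost {
    final long durationMillis;

    CodecCost(long durationMillis) {
      this.durationMillis = durationMillis;
    }
  }

  // The wrapper makes each individual call (add, size, get) atomic.
  private final List<CodecCost> costs =
      Collections.synchronizedList(new ArrayList<>());

  // Safe to call from many threads at once.
  public void record(long durationMillis) {
    costs.add(new CodecCost(durationMillis));
  }

  public int count() {
    return costs.size();
  }

  // Iteration is not atomic by itself: the Javadoc for synchronizedList
  // requires holding the list's monitor while traversing it.
  public long totalMillis() {
    synchronized (costs) {
      long total = 0;
      for (CodecCost cost : costs) {
        total += cost.durationMillis;
      }
      return total;
    }
  }

  // Drives `threads` writers that each record `perThread` entries of 1 ms.
  public static StatisticsSketch runConcurrently(int threads, int perThread) {
    StatisticsSketch stats = new StatisticsSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.submit(() -> {
        for (int i = 0; i < perThread; i++) {
          stats.record(1L);
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("writers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return stats;
  }

  public static void main(String[] args) {
    StatisticsSketch stats = runConcurrently(8, 10_000);
    System.out.println(stats.count() + " entries, total " + stats.totalMillis() + " ms");
  }
}
```

The wrapper keeps the change small, at the cost of a lock on every access. If reads are frequent and writes are hot, a concurrent collection might be a better fit, but that is a design choice outside what this description states.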
Fix: #2544
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
This issue was difficult to reproduce in a local environment. Instead, after
applying the patch, I made sure the application continues to work as usual
with these steps:
1- deploy the application based on `./deploy/docker/read.me`
2- when starting the spark-shell, include the
`spark.rss.client.io.compression.statisticsEnabled` property as follows:
```
docker exec -it rss-spark-master-1 /opt/spark/bin/spark-shell \
--master spark://rss-spark-master-1:7077 \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
--conf spark.rss.coordinator.quorum=rss-coordinator-1:19999,rss-coordinator-2:19999 \
--conf spark.rss.storage.type=MEMORY_LOCALFILE \
--conf spark.speculation=true \
--conf spark.rss.client.io.compression.statisticsEnabled=true
```
3- run multiple Spark Scala jobs
4- when they finish, look on each worker for the logs in
`/opt/spark/work/...`
5- look for the entries in the file that successfully log the statistics.