I am facing a very tricky issue here. I have a treeReduce task whose reduce function returns a very large object, a Map[Int, Array[Double]]. Each reduce step inserts or updates entries in the map or updates the arrays. My problem is that this map can become very large: currently it is about 500 MB in serialized size. The performance of the entire reduce is incredibly slow. While my reduce function takes only about 10 seconds to execute, Spark's shuffle subsystem takes very long, and the task returns only after about 100-300 seconds. This happens even with just 2 nodes with 2 worker cores each, so all Spark has to do is send the 500 MB over the network (both machines are connected via Gigabit Ethernet), which should take only a couple of seconds.
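For context, here is a minimal Scala sketch of the pattern I mean. All names (`partials`, `merge`, the element-wise combine) are hypothetical stand-ins for my actual job; the point is only the shape: a commutative merge over large mutable maps, handed to `treeReduce`.

```scala
import scala.collection.mutable

object TreeReduceSketch {
  type Partial = mutable.Map[Int, Array[Double]]

  // Merge one partial map into another: new keys are inserted,
  // existing arrays are combined element-wise (illustrative choice).
  def merge(a: Partial, b: Partial): Partial = {
    for ((k, arr) <- b) a.get(k) match {
      case Some(existing) =>
        var i = 0
        while (i < existing.length) { existing(i) += arr(i); i += 1 }
      case None => a(k) = arr
    }
    a // each merge step hands back the (eventually ~500 MB) accumulated map
  }

  // On the RDD of partial maps; depth controls how many levels of
  // executor-side aggregation happen before the result reaches the driver.
  // val result: Partial = partials.treeReduce(merge, depth = 2)
}
```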
It is also interesting to note that if I choose "nio" as the block transfer service, the speed is very good: only a couple of seconds, as expected. But I just discovered that "nio" support is deprecated. So how can I get good performance with netty for such a usage scenario: large objects, treeReduce, and not very many nodes?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Shuffle-performance-tuning-How-to-tune-netty-tp25433.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
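For reference, this is roughly what I am experimenting with in spark-defaults.conf. The values are starting points I am guessing at, not a tuned answer:

```
# Legacy setting that selects the old NIO transfer service; useful only
# as a diagnostic comparison since nio is deprecated:
spark.shuffle.blockTransferService      nio

# Netty-side knobs that look relevant for few nodes / very large blocks:
spark.reducer.maxSizeInFlight           96m
spark.shuffle.io.numConnectionsPerPeer  2
```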