Regression of external shuffle service spark 2.3 vs spark 2.2

igor.berman Mon, 19 Nov 2018 04:15:51 -0800

Hi,
Any input on the issue below would be welcome.
We are running the external shuffle service on a Mesos cluster (1.5.1).


After upgrading our production workload to Spark 2.3 we started to see OOM
failures of the external shuffle services (running on each node).

Has anybody experienced the same problem?
Any pointer to the relevant code would be helpful (I know that work was done
on the external shuffle service in 2.3, but from reading the PRs I can't
pinpoint which change is causing these OOMs).

Unfortunately there is no test case to reproduce this, and even with 2.3 the
OOM failures start only after 2+ days of production load.

Igor




