I found an old JIRA referring to the same issue: https://issues.apache.org/jira/browse/SPARK-5421
Thanks
Best Regards

On Sun, Sep 6, 2015 at 8:53 PM, Madhu <ma...@madhu.com> wrote:
> I'm not sure if this has been discussed already; if so, please point me to
> the thread and/or related JIRA.
>
> I have been running with about 1TB volume on a 20-node D2 cluster (255
> GiB/node). I have uniformly distributed data, so skew is not a problem.
>
> I found that default settings (or wrong settings) for driver and executor
> memory caused out-of-memory exceptions during shuffle (subtractByKey, to
> be exact). This was not easy to track down, for me at least.
>
> Once I bumped the driver up to 12G and executors to 10G with 300 executors
> and 3000 partitions, shuffle worked quite well (12 minutes for
> subtractByKey). I'm sure there are more improvements to be made, but it's
> a lot better than heap space exceptions!
>
> From my reading, the shuffle OOM problem is in ExternalAppendOnlyMap or a
> similar disk-backed collection. I have some familiarity with that code
> based on previous work with external sorting.
>
> Is it possible to detect misconfiguration that leads to these OOMs and
> produce more meaningful error messages? I think that would really help
> users who might not understand all the inner workings and configuration of
> Spark (myself included). As it is, heap space issues are a challenge and
> do not present Spark in a positive light.
>
> I can help with that effort if someone is willing to point me to the
> precise location of memory pressure during shuffle.
>
> Thanks!
>
> --
> Madhu
> https://www.linkedin.com/in/msiddalingaiah
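For reference, the settings Madhu describes map onto standard spark-submit options roughly as follows. This is only a sketch: the jar name is a placeholder, the values are specific to his workload, and `spark.default.parallelism` is one of several ways to control the RDD shuffle partition count.

```shell
# Sketch of the configuration described in the quoted message.
# "your-app.jar" is a placeholder; tune values to your own workload.
spark-submit \
  --driver-memory 12g \
  --executor-memory 10g \
  --num-executors 300 \
  --conf spark.default.parallelism=3000 \
  your-app.jar
```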