Akhil,

Ah, very good point. I guess "SET spark.sql.shuffle.partitions=1024" should do it.
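Concretely, something like this should work (untested sketch; sqlContext here is an existing SQLContext, and 1024 is just the value from above):

  // via SQL, as above
  sqlContext.sql("SET spark.sql.shuffle.partitions=1024")

  // or the programmatic equivalent
  sqlContext.setConf("spark.sql.shuffle.partitions", "1024")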
Alex

On Sun, Jan 18, 2015 at 10:29 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> It's the executor memory (spark.executor.memory) which you can set while
> creating the Spark context. By default it uses 60% of the executor memory
> for storage (spark.storage.memoryFraction, default 0.6). Now, to show some
> memory usage, you need to cache (persist) the RDD. Regarding the OOM
> exception, you can increase the level of parallelism (and you can also
> increase the number of partitions depending on your data size) and it
> should be fine.
>
> Thanks
> Best Regards
>
> On Mon, Jan 19, 2015 at 11:36 AM, Alessandro Baretta <alexbare...@gmail.com> wrote:
>
>> All,
>>
>> I'm getting out of memory exceptions in Spark SQL GROUP BY queries. I have
>> plenty of RAM, so I should be able to brute-force my way through, but I
>> can't quite figure out which memory option affects which process.
>>
>> My current memory configuration is the following:
>> export SPARK_WORKER_MEMORY=83971m
>> export SPARK_DAEMON_MEMORY=15744m
>>
>> What does each of these config options do exactly?
>>
>> Also, how come the executors page of the web UI shows no memory usage:
>>
>> 0.0 B / 42.4 GB
>>
>> And where does 42.4 GB come from?
>>
>> Alex
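A minimal sketch of the advice quoted above, assuming a Spark 1.x standalone deployment; the 8g executor heap and the input path are illustrative placeholders, not values from this thread:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext
  import org.apache.spark.storage.StorageLevel

  // Executor heap is set when the context is created (spark.executor.memory);
  // 8g is a placeholder value.
  val conf = new SparkConf()
    .setAppName("groupby-oom")
    .set("spark.executor.memory", "8g")
  val sc = new SparkContext(conf)

  val sqlContext = new SQLContext(sc)
  // More reduce-side partitions for the GROUP BY shuffle
  sqlContext.setConf("spark.sql.shuffle.partitions", "1024")

  // Storage memory only shows up on the executors page once something is cached
  val data = sc.textFile("hdfs:///path/to/input")   // placeholder path
    .repartition(1024)                              // raise the level of parallelism
  data.persist(StorageLevel.MEMORY_ONLY)
  data.count()                                      // materializes the cache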