I'm new to Spark. I want to try out a few simple examples from the Spark shell. However, I'm not sure how to configure it so that I can make maximum use of the memory on my workers.
On average I have around 48 GB of RAM on each node of my cluster, and I have around 10 nodes. Based on the documentation, I found memory-related configuration in two places:

1. SPARK_WORKER_MEMORY in $SPARK_INSTALL_DIR/dist/conf/spark-env.sh — "Total amount of memory to allow Spark applications to use on the machine, e.g. 1000m, 2g (default: total memory minus 1 GB); note that each application's *individual* memory is configured using its spark.executor.memory property."

2. The spark.executor.memory JVM property (default: 512m) — "Amount of memory to use per executor process, in the same format as JVM memory strings (e.g. 512m, 2g)."

http://spark.incubator.apache.org/docs/latest/configuration.html#system-properties

In my case I want to use the maximum memory possible on each node. My understanding is that I don't have to change SPARK_WORKER_MEMORY, and that I will have to increase spark.executor.memory to something big (e.g., 24g or 32g). Is this correct? If yes, what is the correct way of setting this property if I just want to use the spark-shell?

Thanks.
-Soumya
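For reference, this is roughly what I imagine the two settings would look like; the values here (40g, 32g) are just hypothetical placeholders, and I'm not sure whether SPARK_JAVA_OPTS is the right way to pass the property to the shell:

```shell
# 1. Per-machine cap, in $SPARK_INSTALL_DIR/dist/conf/spark-env.sh
#    (hypothetical value: leaving some headroom out of the 48 GB per node)
export SPARK_WORKER_MEMORY=40g

# 2. Per-executor memory, passed as a JVM system property before
#    launching the shell (hypothetical value)
export SPARK_JAVA_OPTS="-Dspark.executor.memory=32g"
./spark-shell
```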
