Hi,

I could not configure Spark for in-memory shuffle, so for now I am using a Linux memory-backed directory (tmpfs) as Spark's working directory, which makes everything fast.

---- On Wed, 17 Oct 2018 16:41:32 +0330 thomas lavocat <thomas.lavo...@univ-grenoble-alpes.fr> wrote ----

Hi everyone,
The possibility of in-memory shuffling is discussed in this pull request: https://github.com/apache/spark/pull/5403. That was in 2015. In 2016, the paper "Scaling Spark on HPC Systems" says that Spark still shuffles using disks.

I would like to know:
What is the current state of in-memory shuffling? Is it implemented in production? Does the current shuffle still use disks to work? Is it possible to somehow do it in RAM only?

Regards,
Thomas
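As a rough illustration of the tmpfs workaround mentioned in the reply at the top of this thread, here is a minimal sketch in Python. It assumes the scratch directory is set through Spark's `spark.local.dir` property, and that the directory lives under a tmpfs mount such as `/dev/shm`. The path `/dev/shm/spark-scratch`, the job name `your_job.py`, and the helper names are hypothetical and do not come from the thread. Note that a cluster manager may override `spark.local.dir` (for example via `SPARK_LOCAL_DIRS` in standalone mode or `LOCAL_DIRS` on YARN), and that shuffle data still goes through the file API here; it is only backed by memory instead of a disk.

```python
def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Fields are: device, mount point, filesystem type, options, ...
            entries.append((fields[1], fields[2]))
    return entries


def is_on_tmpfs(path, mounts):
    """True if the longest mount point containing `path` is a tmpfs."""
    best = None
    normalized = path.rstrip("/") + "/"
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if normalized.startswith(prefix):
            # Prefer the deepest mount; on ties, later entries shadow earlier ones.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best is not None and best[1] == "tmpfs"


def spark_submit_args(local_dir):
    """Build spark-submit arguments that point Spark's scratch space at local_dir."""
    return ["--conf", f"spark.local.dir={local_dir}"]


# Sample mount table for illustration; on a real Linux host, read /proc/mounts.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
"""

scratch_dir = "/dev/shm/spark-scratch"  # hypothetical scratch directory
mounts = parse_mounts(SAMPLE_MOUNTS)

if is_on_tmpfs(scratch_dir, mounts):
    command = ["spark-submit"] + spark_submit_args(scratch_dir) + ["your_job.py"]
    print(" ".join(command))
else:
    print(f"{scratch_dir} is not on tmpfs; shuffle files would hit a disk")
```

Since tmpfs contents count against RAM (and may be swapped out under memory pressure), the tmpfs mount presumably needs to be sized with the expected shuffle volume in mind.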