> I would guess in such shuffles the bottleneck is serializing the data
> rather than raw IO, so I'm not sure explicitly buffering the data in the
> JVM process would yield a large improvement.

Good guess! It is very hard to beat the performance of retrieving shuffle
outputs from the OS buffer cache, and very easy to actually make things
slower while trying to do so.

On Thu, Jun 11, 2015 at 4:33 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Hey Corey,
>
> Yes, when shuffles are smaller than the memory available to the OS, most
> often the outputs never get stored to disk. I believe the same holds
> for the YARN shuffle service, because the write path is actually the
> same, i.e. we don't fsync the writes and force them to disk. I would
> guess in such shuffles the bottleneck is serializing the data rather
> than raw IO, so I'm not sure explicitly buffering the data in the JVM
> process would yield a large improvement.
>
> Writing shuffle to an explicitly pinned memory filesystem is also
> possible (per Davies' suggestion), but it's brittle because the job
> will fail if shuffle output exceeds memory.
>
> - Patrick
>
> On Wed, Jun 10, 2015 at 9:50 PM, Davies Liu <dav...@databricks.com> wrote:
> > If you have enough memory, you can put the temporary work directory in
> > tmpfs (an in-memory file system).
> >
> > On Wed, Jun 10, 2015 at 8:43 PM, Corey Nolet <cjno...@gmail.com> wrote:
> >> Ok so it is the case that small shuffles can be done without hitting any
> >> disk. Is this the same case for the aux shuffle service in YARN? Can that
> >> be done without hitting disk?
> >>
> >> On Wed, Jun 10, 2015 at 9:17 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>>
> >>> In many cases the shuffle will actually hit the OS buffer cache and
> >>> not ever touch spinning disk if it is a size that is less than memory
> >>> on the machine.
> >>>
> >>> - Patrick
> >>>
> >>> On Wed, Jun 10, 2015 at 5:06 PM, Corey Nolet <cjno...@gmail.com> wrote:
> >>> > So with this... to help my understanding of Spark under the hood-
> >>> >
> >>> > Is this statement correct "When data needs to pass between multiple
> >>> > JVMs, a shuffle will always hit disk"?
> >>> >
> >>> > On Wed, Jun 10, 2015 at 10:11 AM, Josh Rosen <rosenvi...@gmail.com> wrote:
> >>> >>
> >>> >> There's a discussion of this at
> >>> >> https://github.com/apache/spark/pull/5403
> >>> >>
> >>> >> On Wed, Jun 10, 2015 at 7:08 AM, Corey Nolet <cjno...@gmail.com> wrote:
> >>> >>>
> >>> >>> Is it possible to configure Spark to do all of its shuffling FULLY in
> >>> >>> memory (given that I have enough memory to store all the data)?
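
For anyone who wants to poke at the two ideas in this thread outside of Spark, here is a rough Python sketch (not Spark code, just an illustration). `write_block` shows what "we don't fsync the writes" means in practice: without `os.fsync` the bytes are only handed to the OS, which usually keeps them in the buffer cache. `fs_type_for` checks whether a candidate scratch directory would land on tmpfs, given text in Linux `/proc/mounts` format. Both function names are made up for this example, and the assumption that the "temporary work directory" is the one set through `spark.local.dir` is mine; the thread does not name the setting.

```python
import os
import posixpath


def write_block(path, payload, force_to_disk=False):
    """Write bytes to `path`, optionally forcing them to the storage device.

    With force_to_disk=False the data is only handed to the OS, which is
    free to keep it in the buffer cache and write it back later (or never,
    if the file is deleted first). With force_to_disk=True we call fsync
    and block until the OS reports the data as written to the device.
    """
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()  # move bytes from the process buffer to the OS
        if force_to_disk:
            os.fsync(f.fileno())  # ask the OS to push the bytes to the device
    return len(payload)


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount containing `path`.

    `mounts_text` uses the /proc/mounts layout, one mount per line:
    "<device> <mount point> <fs type> <options> ...".
    The longest mount point that is a path prefix of `path` wins.
    Returns None if no mount point matches.
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # Compare with a trailing slash so /dev/shm does not match /dev/shmoo.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


if __name__ == "__main__":
    # Linux only: report where a hypothetical scratch directory would live.
    with open("/proc/mounts") as mounts:
        print(fs_type_for("/dev/shm/spark-local", mounts.read()))
```

If the second function reports `tmpfs` for your scratch directory, you are in the "explicitly pinned memory filesystem" setup Patrick describes, with the caveat he gives: the job fails once shuffle output exceeds what that filesystem can hold. With an ordinary disk-backed directory and no fsync, small shuffles will most often be served from the buffer cache anyway.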