Spark on HDFS vs. Lustre vs. other file systems - formal research and performance evaluation

Edmon Begoli Fri, 13 Mar 2015 15:10:54 -0700

All,

Does anyone have a reference to a publication, or to informal sources
(blogs, notes), showing the
performance of Spark on HDFS vs. other shared file systems (Lustre, etc.)
or other file systems (NFS)?


I need this for formal performance research.

We are currently researching this on a very specific, boutique
machine, and we are seeing some controversial results.

For the purposes of a literature survey and general comparison, I would
like to see the findings that others have had. I know that conventional
wisdom holds that Spark and HDFS should work best together because of
data-locality awareness.

Thank you,
*Edmon Begoli, PhD*
Chief Data Officer
Joint Institute for Computational Sciences (JICS)
ebeg...@tennessee.edu
https://www.linkedin.com/in/ebegoli
