All,

Does anyone have a reference to a publication, or to other, informal
sources (blogs, notes), comparing the performance of Spark on HDFS vs.
other shared file systems (Lustre, etc.) or other file systems (NFS)?

I need this for formal performance research.

We are currently conducting research into this on a very specific, boutique
machine, and we are seeing some controversial results.

For the purposes of a literature survey and general comparison, I would
like to see the findings that others have had. I know the general wisdom is
that Spark and HDFS should work best together because of data locality
awareness.

Thank you,
*Edmon Begoli, PhD*
Chief Data Officer
Joint Institute for Computational Sciences (JICS)
ebeg...@tennessee.edu
https://www.linkedin.com/in/ebegoli
