Latency is much higher for S3 (if that matters), and with HDFS you get data locality (tasks can run on the nodes that hold the blocks), which can boost your app's performance.
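If you want to measure the difference on your own data, a rough timing harness along these lines may help. This is only a sketch: the helper names `time_read` and `compare` are made up for illustration, and the paths, bucket name, and `s3a://` scheme in the commented usage are placeholders that depend on your cluster and Hadoop version.

```python
import statistics
import time


def time_read(read_fn, runs=3):
    """Call read_fn() `runs` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        read_fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(readers, runs=3):
    """Map each label in `readers` (label -> zero-arg callable) to its median time."""
    return {label: time_read(fn, runs) for label, fn in readers.items()}


# Hypothetical usage with PySpark (placeholder paths, adjust to your setup):
#
# from pyspark import SparkContext
# sc = SparkContext(appName="s3-vs-hdfs")
# results = compare({
#     "hdfs": lambda: sc.textFile("hdfs:///data/logs/*.txt").count(),
#     "s3": lambda: sc.textFile("s3a://my-bucket/logs/*.txt").count(),
# })
# print(results)
```

Calling `count()` forces a full read of the files, so the timing covers the actual I/O rather than just building the RDD. Numbers from a few runs like this are only indicative, since caching, file sizes, and cluster load all affect them.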
I did some light experimenting on this. See my presentation for some benchmark numbers, starting at slide 34:
http://www.slideshare.net/sujee/hadoop-to-sparkv2

Cheers,
Sujee Maniyam (http://sujee.net | http://www.linkedin.com/in/sujeemaniyam)
teaching Spark <http://elephantscale.com/training/spark-for-developers/?utm_source=mailinglist&utm_medium=email&utm_campaign=signature>

On Wed, Jul 8, 2015 at 11:35 PM, Brandon White <bwwintheho...@gmail.com> wrote:
> Are there any significant performance differences between reading text
> files from S3 and hdfs?