Ensuring data locality when opening files

Daniel Haviv Mon, 09 Mar 2015 00:47:55 -0700

Hi,
We wrote a Spark Streaming app that receives the names of files on HDFS from
Kafka and opens them using Hadoop's libraries.
The problem with this method is that it doesn't take advantage of data
locality: any worker might open any file, regardless of which nodes hold
that file's blocks.
I can't open the files using sparkContext because it's only available in
the driver.


Is there a way I could open files at runtime and benefit from data locality?

Thanks,
Daniel
