Neeraj Mahajan wrote:
I read from Hadoop docs that the task scheduler tries to execute the task
closer to the data. Can this functionality be applied without using HDFS?
How?

Yes. You can subclass LocalFileSystem and override getFileCacheHints() to return the host (or hosts) where the file is known to be local.

http://lucene.apache.org/hadoop/api/org/apache/hadoop/fs/FileSystem.html#getFileCacheHints(org.apache.hadoop.fs.Path,%20long,%20long)
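
A rough sketch of what such an override might look like follows. It is not taken from Hadoop itself. To keep it compilable without Hadoop on the classpath, LocalFileSystemStandIn stands in for org.apache.hadoop.fs.LocalFileSystem and a plain String stands in for Path. HostPinnedFileSystem, pin(), and the host names are hypothetical. The String[][] return shape (one array of host names per portion of the requested range) is an assumption about the API of that era, so check it against the javadoc linked above.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.hadoop.fs.LocalFileSystem so this sketch compiles
// on its own. Only the one method of interest is modeled, and its exact
// signature is an assumption.
class LocalFileSystemStandIn {
    // Default hint: everything is on the local machine.
    String[][] getFileCacheHints(String path, long start, long len) {
        return new String[][] { { "localhost" } };
    }
}

// Hypothetical subclass that reports the host known to hold each file.
class HostPinnedFileSystem extends LocalFileSystemStandIn {
    private final Map<String, String> hostByPath = new HashMap<>();

    // Record that the file at `path` lives on the local disk of `host`.
    void pin(String path, String host) {
        hostByPath.put(path, host);
    }

    @Override
    String[][] getFileCacheHints(String path, long start, long len) {
        String host = hostByPath.get(path);
        if (host == null) {
            // Unknown file: fall back to the default hint.
            return super.getFileCacheHints(path, start, len);
        }
        // Report the whole requested range as residing on a single host.
        return new String[][] { { host } };
    }
}
```

In a real job the path-to-host mapping would have to come from whatever places the files on the nodes, and the subclass would need to be registered as the file system implementation in the job configuration.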

Doug
