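In the driver:
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Path (or glob pattern) of the file(s) to add to the cache
    Path cachefile = new Path("path/to/file");
    FileStatus[] list = fs.globStatus(cachefile);
    // globStatus can return null when a non-glob path does not exist
    if (list != null) {
        for (FileStatus status : list) {
            DistributedCache.addCacheFile(status.getPath().toUri(), conf);
        }
    }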
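In setup:
    public void setup(Context context) throws IOException {
        Configuration conf = context.getConfiguration();
        FileSystem fs = FileSystem.get(conf);
        // getCacheFiles returns the URIs that were registered in the driver
        URI[] cacheFiles = DistributedCache.getCacheFiles(conf);
        if (cacheFiles == null || cacheFiles.length == 0) {
            throw new IOException("No files found in the distributed cache");
        }
        Path getPath = new Path(cacheFiles[0].getPath());
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bf = new BufferedReader(
                new InputStreamReader(fs.open(getPath)))) {
            String setupData;
            while ((setupData = bf.readLine()) != null) {
                System.out.println("Setup Line in reducer " + setupData);
            }
        }
    }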
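If you add more than one file to the cache, relying on cacheFiles[0] gets fragile. Here is a rough sketch of picking the entry by file name instead. It uses only java.net.URI; the class name CacheFileLookup and the method findByName are made up for illustration and are not part of the Hadoop API.

```java
import java.net.URI;

public class CacheFileLookup {
    /**
     * Returns the first cache URI whose last path segment equals fileName,
     * or null when nothing matches (or when the array itself is null).
     * Hypothetical helper, not a Hadoop API.
     */
    public static URI findByName(URI[] cacheFiles, String fileName) {
        if (cacheFiles == null) {
            return null;
        }
        for (URI uri : cacheFiles) {
            String path = uri.getPath();
            if (path == null) {
                continue; // opaque URIs have no path component
            }
            // Everything after the last '/' is treated as the file name
            String last = path.substring(path.lastIndexOf('/') + 1);
            if (last.equals(fileName)) {
                return uri;
            }
        }
        return null;
    }
}
```

In setup you would pass it the array returned by DistributedCache.getCacheFiles(conf) and build the Path from the returned URI, after checking for null.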
Hope this link helps:
http://unmeshasreeveni.blogspot.in/2014/10/how-to-load-file-in-distributedcache-in.html
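One more thing that may be relevant, although I am guessing because your mapper code is not shown: getLocalCacheFiles returns paths on the local disk of the task node, so opening them through the default (HDFS) FileSystem can fail with FileNotFoundException even though the file is there. Such a path has to be read with the local file system, for example FileSystem.getLocal(conf) or plain Java I/O. A minimal sketch of the plain Java variant follows; the class name LocalCacheReader and the method readLines are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LocalCacheReader {
    /**
     * Reads every line of a file on the local file system.
     * localPath is assumed to be a local path string, such as the string
     * form of a path returned by getLocalCacheFiles (hypothetical usage).
     */
    public static List<String> readLines(String localPath) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader when the block exits
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get(localPath), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```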
On Mon, Dec 22, 2014 at 2:58 PM, Marko Dinic <[email protected]>
wrote:
> Hello Hadoopers,
>
> I'm getting this exception in Hadoop while trying to read a file that was
> added to the distributed cache, and the strange thing is that the file exists
> at the given location.
>
> java.io.FileNotFoundException: File does not exist:
> /tmp/hadoop-pera/mapred/local/taskTracker/distcache/-1517670662102870873_-1918892372_1898431787/localhost/work/output/temporalcentroids/centroids-iteration0-noOfClusters2/part-r-00000
>
> I'm adding the file before starting my job using
>
> DistributedCache.addCacheFile(URI.create(args[2]), job.getConfiguration());
>
> And I'm trying to read the file from the setup method in my mapper using
>
> DistributedCache.getLocalCacheFiles(conf);
>
> As I said, I can confirm that the file is on the local system, but the
> exception is thrown.
>
> I'm running the job in pseudo-distributed mode, on one computer.
>
> Any ideas?
>
> Thanks
>
--
*Thanks & Regards *
*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Centre for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/