Hello,
I am trying to ship some config data out to my mappers through the
distributed cache. So far I am only testing in local mode, and I can't get
it to work. This is the command I run:
~/hadoop/hadoop-0.20.204.0/bin/hadoop jar the_jar.jar com.bar.ApplyMappings \
    -files data/config_stuff.txt#config_stuff.txt input_dir output_dir
My configure function:
@Override
public void configure(JobConf conf) {
    try {
        Scanner s = new Scanner(new File("config_stuff.txt"));
        System.out.println(s.nextLine());
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
I thought the distributed cache was supposed to symlink the file into my
working directory. To test, I also tried the raw local path:
@Override
public void configure(JobConf conf) {
    try {
        Scanner s = new Scanner(
            new File(DistributedCache.getLocalCacheFiles(conf)[0].toString()));
        System.out.println(s.nextLine());
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
This gives:
java.io.FileNotFoundException: /tmp/hadoop-jvincent/mapred/local/archive/-3183172400467095803_-988649232_1047835448/file/Users/jvincent/projects/foo_project/data/config_stuff.txt (No such file or directory)
So the file is being registered with the local "distributed cache", but it
is not actually getting copied to that path or anywhere else useful. Is
there something else I need to set up in my JobConf? Or any other ideas? If
I pass a bogus path to the -files flag I get an exception, so the file is at
least being found initially.
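
In case it helps anyone reproduce this, here is a rough sketch of a check
that could be called from configure to see what is actually sitting in the
task's working directory. It uses only java.io and java.util; the class and
method names (CacheFileCheck, describe, listNames) are made up for
illustration and are not part of Hadoop.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not a Hadoop class.
class CacheFileCheck {

    // Summarizes a path: where it resolves to, whether it exists, and its size.
    static String describe(File f) {
        return f.getAbsolutePath()
            + " exists=" + f.exists()
            + " length=" + f.length();
    }

    // Returns the sorted entry names of dir, or an empty list if dir
    // does not exist or cannot be listed.
    static List<String> listNames(File dir) {
        List<String> out = new ArrayList<String>();
        String[] names = dir.list();
        if (names != null) {
            out.addAll(Arrays.asList(names));
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // The working directory is where the symlink would be expected.
        File cwd = new File(System.getProperty("user.dir"));
        System.out.println("working dir: " + cwd.getAbsolutePath());
        System.out.println("entries: " + listNames(cwd));
        System.out.println(describe(new File("config_stuff.txt")));
    }
}
```

Calling CacheFileCheck.describe on both "config_stuff.txt" and the path
returned by getLocalCacheFiles would show which of the two, if either,
exists from the mapper's point of view.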
Thanks!
Justin