Hi Pi,
I have a JIRA issue open on this: PIG-102
It needs feedback from the community on whether it should be a
configuration property or a high-level command.
Craig
Benjamin Reed wrote:
This uses the FileLocalizer. All file references are routed through the
FileLocalizer. When we are running in MAPREDUCE mode and a file reference
starts with file:, we copy it to a temporary file in HDFS before we start
the job and use that temporary file as the input or output of the
MapReduce job.
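A minimal sketch of that localization step, using plain java.nio.file so it runs without a Hadoop cluster (the method name localizeIfLocal and the staging directory are my own illustration; the real FileLocalizer copies to a temp path through Hadoop's FileSystem API, not the local disk):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Arrays;

public class LocalizeSketch {
    // If the reference uses the file: scheme, copy the file into a staging
    // directory (standing in for a temp location on HDFS) and return the
    // staged path; otherwise return the reference unchanged.
    static String localizeIfLocal(String ref, Path stagingDir) throws IOException {
        if (!ref.startsWith("file:")) {
            return ref; // already a cluster-visible path; use as-is
        }
        Path src = Paths.get(ref.substring("file:".length()));
        Files.createDirectories(stagingDir);
        Path dst = stagingDir.resolve(src.getFileName());
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        return dst.toString();
    }

    public static void main(String[] args) throws IOException {
        // Local input file, as in the TestMapReduce example below.
        Path tmp = Files.createTempFile("test", ".txt");
        Files.write(tmp, Arrays.asList("0\t0", "1\t1"));
        Path staging = Files.createTempDirectory("staging");

        String localized = localizeIfLocal("file:" + tmp, staging);
        System.out.println(localized.startsWith(staging.toString())); // true
        System.out.println(Files.readAllLines(Paths.get(localized)).size()); // 2
    }
}
```

The job then reads the staged copy, which is why a file created on the submitting machine is still visible to the MapReduce tasks.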
ben
On Thursday 06 March 2008 04:07:41 pi song wrote:
Dear pig-dev mailing list,
I just want to understand this bit quickly. Below is the code from
TestMapReduce.java. As you can see, the temp file is created on the local
machine, but I don't understand how Hadoop MapReduce picks up the file
from the local file system rather than from HDFS.
PigServer pig = new PigServer(MAPREDUCE);
File tmpFile = File.createTempFile("test", ".txt");
PrintStream ps = new PrintStream(new FileOutputStream(tmpFile));
for (int i = 0; i < 10; i++) {
    ps.println(i + "\t" + i);
}
ps.close();
String query = "foreach (load 'file:" + tmpFile + "') generate $0,$1;";
System.out.println(query);
pig.registerQuery("asdf_id = " + query);
try {
    pig.deleteFile("frog");
} catch (Exception e) {}
Cheers,
Pi