This goes through the FileLocalizer: all file references are sent through it.
If we are running in MAPREDUCE mode and a file reference starts with
file:, we copy it to a temp file in HDFS before we start the job and use that
temp file as the input or output of the MapReduce job.
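A rough sketch of that localization rule, in case it helps. This is not Pig's actual FileLocalizer code; the class and method names are made up, and a plain temp directory stands in for HDFS so the logic is easy to follow:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the rule described above: in MAPREDUCE
// mode, a reference starting with "file:" is copied to a temp location
// (a local staging dir here stands in for HDFS) and the copy is used as
// the job's input; any other reference is assumed to already be in HDFS.
public class LocalizeSketch {
    static String localize(String ref, Path stagingDir) throws IOException {
        if (!ref.startsWith("file:")) {
            return ref; // already an HDFS path, use it as-is
        }
        Path local = Paths.get(ref.substring("file:".length()));
        Path staged = stagingDir.resolve("tmp-" + local.getFileName());
        Files.copy(local, staged, StandardCopyOption.REPLACE_EXISTING);
        return staged.toString(); // the temp copy becomes the job input
    }

    public static void main(String[] args) throws IOException {
        Path staging = Files.createTempDirectory("staging"); // stand-in for HDFS
        Path input = Files.createTempFile("test", ".txt");
        Files.writeString(input, "0\t0\n1\t1\n");

        // A file: reference gets copied into the staging area...
        String jobInput = localize("file:" + input, staging);
        System.out.println(jobInput.startsWith(staging.toString()));

        // ...while a non-file: reference is passed through untouched.
        System.out.println(localize("/user/pig/data.txt", staging));
    }
}
```

So in the test below, 'file:' + tmpFile is enough for the local temp file to be shipped into HDFS before the job runs.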
ben
On Thursday 06 March 2008 04:07:41 pi song wrote:
> Dear pig-dev mailing list,
>
> I just want to understand this bit quickly. Below is the code from
> TestMapReduce.java. As you can see, the temp file is created on the local
> machine, but I don't understand how Hadoop MapReduce picks up the file
> from the local file system rather than HDFS?
>
> PigServer pig = new PigServer(MAPREDUCE);
> File tmpFile = File.createTempFile("test", ".txt");
> PrintStream ps = new PrintStream(new FileOutputStream(tmpFile));
> for (int i = 0; i < 10; i++) {
>     ps.println(i + "\t" + i);
> }
> ps.close();
> String query = "foreach (load 'file:" + tmpFile + "') generate $0,$1;";
> System.out.println(query);
> pig.registerQuery("asdf_id = " + query);
> try {
>     pig.deleteFile("frog");
> } catch (Exception e) {}
>
> Cheers,
> Pi