Hi Pi,
I have a JIRA issue open on this: PIG-102
It needs feedback from the community on whether it should be a
configuration property or a high-level command.
Craig
Benjamin Reed wrote:
This uses the FileLocalizer. All file references are routed through the
FileLocalizer. When we are running in MAPREDUCE mode and a file reference
starts with file:, we copy it to a temporary file in HDFS before we start
the job and use that temporary file as the input or output of the
MapReduce job.
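A minimal sketch of that localization step, using plain java.nio.file so it runs without a Hadoop cluster (the method name localizeIfLocal and the staging directory are my own illustration; the real FileLocalizer copies to a temp path through Hadoop's FileSystem API, not the local disk):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Arrays;

public class LocalizeSketch {
    // If the reference uses the file: scheme, copy the file into a staging
    // directory (standing in for a temp location on HDFS) and return the
    // staged path; otherwise return the reference unchanged.
    static String localizeIfLocal(String ref, Path stagingDir) throws IOException {
        if (!ref.startsWith("file:")) {
            return ref; // already a cluster-visible path; use as-is
        }
        Path src = Paths.get(ref.substring("file:".length()));
        Files.createDirectories(stagingDir);
        Path dst = stagingDir.resolve(src.getFileName());
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        return dst.toString();
    }

    public static void main(String[] args) throws IOException {
        // Local input file, as in the TestMapReduce example below.
        Path tmp = Files.createTempFile("test", ".txt");
        Files.write(tmp, Arrays.asList("0\t0", "1\t1"));
        Path staging = Files.createTempDirectory("staging");

        String localized = localizeIfLocal("file:" + tmp, staging);
        System.out.println(localized.startsWith(staging.toString())); // true
        System.out.println(Files.readAllLines(Paths.get(localized)).size()); // 2
    }
}
```

The job then reads the staged copy, which is why a file created on the submitting machine is still visible to the MapReduce tasks.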
ben
On Thursday 06 March 2008 04:07:41 pi song wrote:
Dear pig-dev mailing list,
I just want to understand this bit quickly. Below is the code from
TestMapReduce.java. As you can see, the temp file is created on the local
machine, but I don't understand how Hadoop MapReduce picks up the file
from the local file system rather than from HDFS.
PigServer pig = new PigServer(MAPREDUCE);
File tmpFile = File.createTempFile("test", ".txt");
PrintStream ps = new PrintStream(new FileOutputStream(tmpFile));
for (int i = 0; i < 10; i++) {
    ps.println(i + "\t" + i);
}
ps.close();
String query = "foreach (load 'file:" + tmpFile + "') generate $0,$1;";
System.out.println(query);
pig.registerQuery("asdf_id = " + query);
try {
    pig.deleteFile("frog");
} catch (Exception e) {}
Cheers,
Pi