Hi Pi,

I have a JIRA on this issue: PIG-102
It needs feedback from the community whether it should be a configuration property or a high-level command.

Craig

Benjamin Reed wrote:
This uses the FileLocalizer. All file references are sent through the FileLocalizer. If we are doing MAPREDUCE and a file reference starts with file:, we copy it to a temp file in HDFS before we start the job and use that temp file as the input or output of the map reduce job.
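A minimal sketch of the decision Ben describes, assuming a hypothetical helper (this is not Pig's actual FileLocalizer code, and the temp-path naming is made up): in MAPREDUCE mode a reference starting with "file:" is copied into a temp location in HDFS, and that temp path is what the job actually reads or writes.

```java
// Sketch only, not the real FileLocalizer. "localize" and the
// "/tmp/pig-localized" prefix are illustrative assumptions.
public class LocalizeSketch {
    // Returns the path the map-reduce job should actually use.
    static String localize(String ref, boolean mapReduceMode) {
        if (mapReduceMode && ref.startsWith("file:")) {
            String local = ref.substring("file:".length());
            // Real code would copy the local file into HDFS here,
            // e.g. via Hadoop's FileSystem.copyFromLocalFile(src, dst),
            // before the job starts.
            return "/tmp/pig-localized" + local; // stand-in HDFS temp path
        }
        return ref; // already an HDFS path, or running locally: use as-is
    }

    public static void main(String[] args) {
        System.out.println(localize("file:/home/user/data.txt", true));
        System.out.println(localize("/user/pig/data.txt", true));
    }
}
```

This is why the test below can hand a local temp file to a MAPREDUCE-mode PigServer: the "file:" prefix tells Pig to stage the file into HDFS for the job.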

ben

On Thursday 06 March 2008 04:07:41 pi song wrote:
Dear pig-dev mailing list,

I just want to understand this bit quickly. Below is the code from
TestMapReduce.java. As you can see, the temp file is created on the local
machine, but I don't understand how Hadoop MapReduce picks up the file from
the local file system rather than from HDFS.

        PigServer pig = new PigServer(MAPREDUCE);
        File tmpFile = File.createTempFile("test", ".txt");
        PrintStream ps = new PrintStream(new FileOutputStream(tmpFile));
        for(int i = 0; i < 10; i++) {
            ps.println(i+"\t"+i);
        }
        ps.close();
        String query = "foreach (load 'file:"+tmpFile+"') generate $0,$1;";
        System.out.println(query);
        pig.registerQuery("asdf_id = " + query);
        try {
            pig.deleteFile("frog");
        } catch(Exception e) {}

Cheers,
Pi
