[ 
https://issues.apache.org/jira/browse/PIG-619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703941#action_12703941
 ] 

Viraj Bhat commented on PIG-619:
--------------------------------

So when will the multi-store query optimization be committed/merged into the 
main branch (where it becomes the default way multi-store queries run)?
Viraj

> Dumping empty results produces "Unable to get results for 
> /tmp/temp-1964806069/tmp256878619  org.apache.pig.builtin.BinStorage" message
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-619
>                 URL: https://issues.apache.org/jira/browse/PIG-619
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>         Environment: Hadoop 18, Multi-node hadoop installation
>            Reporter: Viraj Bhat
>            Assignee: Alan Gates
>         Attachments: mydata.txt, tmpfileload.pig
>
>
> The following Pig script stores empty filter results in the 
> 'emptyfilteredlogs' HDFS directory. It later reloads this data from the empty 
> HDFS directory for additional grouping and counting. The script succeeds on a 
> single-node Hadoop installation with the following message, as the alias 
> COUNT_EMPTYFILTERED_LOGS contains no data.
> ==============================================================================================================
> 2009-01-13 21:47:08,988 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> ==============================================================================================================
> But on a multi-node Hadoop installation, the script fails with the following 
> error:
> ==============================================================================================================
> 2009-01-13 13:48:34,602 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> java.io.IOException: Unable to open iterator for alias: 
> COUNT_EMPTYFILTERED_LOGS [Unable to get results for 
> /tmp/temp-1964806069/tmp256878619:org.apache.pig.builtin.BinStorage]
>         at 
> org.apache.pig.backend.hadoop.executionengine.HJob.getResults(HJob.java:74)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:408)
>         at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
>         at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
>         at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:306)
> Caused by: org.apache.pig.backend.executionengine.ExecException: Unable to 
> get results for 
> /tmp/temp-1964806069/tmp256878619:org.apache.pig.builtin.BinStorage
>         ... 7 more
> Caused by: java.io.IOException: /tmp/temp-1964806069/tmp256878619 does not 
> exist
>         at 
> org.apache.pig.impl.io.FileLocalizer.openDFSFile(FileLocalizer.java:188)
>         at org.apache.pig.impl.io.FileLocalizer.open(FileLocalizer.java:291)
>         at 
> org.apache.pig.backend.hadoop.executionengine.HJob.getResults(HJob.java:69)
>         ... 6 more
> ==============================================================================================================
> {code}
> RAW_LOGS = load 'mydata.txt' as (url:chararray, numvisits:int);
> RAW_LOGS = limit RAW_LOGS 2;
> FILTERED_LOGS = filter RAW_LOGS by numvisits < 0;
> store FILTERED_LOGS into 'emptyfilteredlogs' using PigStorage();
> EMPTY_FILTERED_LOGS = load 'emptyfilteredlogs' as (url:chararray, 
> numvisits:int);
> GROUP_EMPTYFILTERED_LOGS = group EMPTY_FILTERED_LOGS by numvisits;
> COUNT_EMPTYFILTERED_LOGS = foreach GROUP_EMPTYFILTERED_LOGS generate
>                              group, COUNT(EMPTY_FILTERED_LOGS);
> explain COUNT_EMPTYFILTERED_LOGS;
> dump COUNT_EMPTYFILTERED_LOGS;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
