[jira] [Resolved] (MAPREDUCE-3619) Change streaming code to use new mapreduce api.

Liyin Liang (Resolved) (JIRA) Mon, 09 Jan 2012 06:45:43 -0800

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Liyin Liang resolved MAPREDUCE-3619.
------------------------------------

    Resolution: Duplicate
    
> Change streaming code to use new mapreduce api.
> -----------------------------------------------
>
>                 Key: MAPREDUCE-3619
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3619
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming, mrv2
>    Affects Versions: 0.23.1
>            Reporter: Liyin Liang
>
> If we run a streaming job with the following Python script as the mapper or 
> reducer, the job throws a NullPointerException.
> {code:}
> #!/usr/bin/python
> import sys,os
> class MyTask:
>   def __init__(self, file=sys.stdin):
>     self.file = file
>     print >>sys.stderr, "reporter:counter:spam,disp_flag_record,0"
>     print >>sys.stderr, "reporter:counter:spam,spam_record,0"
>   def process(self):
>     while True:
>       line = self.file.readline()
>       if not line:
>         break;
>       print line
> if __name__ == "__main__":
>   task = MyTask()
>   task.process()
> {code}
> Here is the NPE-related log:
> 2011-12-22 14:14:06,310 WARN org.apache.hadoop.streaming.PipeMapRed: 
> java.lang.NullPointerException
>       at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.incrCounter(PipeMapRed.java:502)
>       at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:444)
> This happens because the script's "print >>sys.stderr" statements cause 
> reporter.incrCounter() to be invoked during PipeMapper|PipeReducer.configure(), 
> and the reporter is not available inside configure(). 
> To fix this problem, we should change the streaming code to use the new API. 
> Then we can call context.getCounter() in the Mapper|Reducer.setup() function.
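
The ordering problem described in the issue can be modeled in a few lines of Python. This is a toy sketch, not Hadoop code: the classes Counters, OldApiTask and NewApiTask and the helper parse_counter_line are hypothetical stand-ins, and the only detail taken from the report is the "reporter:counter:<group>,<counter>,<amount>" line format printed by the script. The sketch assumes that the old-style task has no reporter during configure(), while the new-style task is handed a usable context in setup().

```python
# Toy model of the lifecycle ordering described in the issue; not Hadoop code.
# Every class and function name below is a hypothetical stand-in.


class Counters:
    """Collects counter increments, keyed by (group, name)."""

    def __init__(self):
        self.values = {}

    def incr(self, group, name, amount):
        key = (group, name)
        self.values[key] = self.values.get(key, 0) + amount


def parse_counter_line(line):
    """Parse 'reporter:counter:<group>,<name>,<amount>'; None if no match."""
    prefix = "reporter:counter:"
    if not line.startswith(prefix):
        return None
    group, name, amount = line[len(prefix):].strip().split(",")
    return group, name, int(amount)


class OldApiTask:
    """Old-style lifecycle: configure() runs before any reporter exists."""

    def __init__(self):
        self.reporter = None  # assumed to be set only after configure()

    def configure(self, stderr_lines):
        for line in stderr_lines:
            parsed = parse_counter_line(line)
            if parsed:
                # self.reporter is still None here, so this raises
                # AttributeError, the Python analogue of the reported NPE.
                self.reporter.incr(*parsed)


class NewApiTask:
    """New-style lifecycle: setup() already receives a usable context."""

    def setup(self, context, stderr_lines):
        for line in stderr_lines:
            parsed = parse_counter_line(line)
            if parsed:
                context.incr(*parsed)
```

Feeding both tasks the two counter lines from the script shows the difference: OldApiTask().configure(lines) fails because no reporter exists yet, while NewApiTask().setup(Counters(), lines) records both counters.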

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
