[ https://issues.apache.org/jira/browse/MAPREDUCE-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Liyin Liang resolved MAPREDUCE-3619.
------------------------------------
    Resolution: Duplicate

> Change streaming code to use new mapreduce api.
> -----------------------------------------------
>
>                 Key: MAPREDUCE-3619
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3619
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming, mrv2
>    Affects Versions: 0.23.1
>            Reporter: Liyin Liang
>
> If we run a streaming job with the following Python script as the mapper or
> reducer, the job throws a NullPointerException.
> {code:}
> #!/usr/bin/python
> import sys,os
> class MyTask:
>     def __init__(self, file=sys.stdin):
>         self.file = file
>         print >>sys.stderr, "reporter:counter:spam,disp_flag_record,0"
>         print >>sys.stderr, "reporter:counter:spam,spam_record,0"
>     def process(self):
>         while True:
>             line = self.file.readline()
>             if not line:
>                 break;
>             print line
> if __name__ == "__main__":
>     task = MyTask()
>     task.process()
> {code}
> Here is the NPE-related log:
> 2011-12-22 14:14:06,310 WARN org.apache.hadoop.streaming.PipeMapRed: java.lang.NullPointerException
>     at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.incrCounter(PipeMapRed.java:502)
>     at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:444)
> This happens because the script's "print >>sys.stderr" statements cause
> reporter.incrCounter() to be invoked during PipeMapper|PipeReducer.configure(),
> but no reporter is available in configure().
> To fix this problem, we should change the streaming code to use the new API.
> Then we can call context.getCounter() in the Mapper|Reducer.setup() function.
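
Until streaming is ported to the new API, a script-side workaround may be possible. The following is only a minimal sketch, not the fix proposed in the issue. It assumes (this is not verified in the report) that the reporter has been set by the time the first input record reaches the script, so it delays the two counter lines until then. The names process, main and COUNTER_LINES are hypothetical. It is written for Python 3, writes each record through unchanged rather than adding a newline as "print line" does, and emits no counters at all when the input is empty.

```python
import sys

# Counter updates in the streaming stderr protocol, copied from the report.
COUNTER_LINES = (
    "reporter:counter:spam,disp_flag_record,0",
    "reporter:counter:spam,spam_record,0",
)


def process(infile, outfile, errfile):
    """Copy records through; emit counter lines only after the first record.

    Assumption: once a record has arrived, the framework has a reporter.
    Returns True if the counter lines were emitted.
    """
    counters_sent = False
    for line in infile:
        if not counters_sent:
            for counter in COUNTER_LINES:
                print(counter, file=errfile)
            errfile.flush()
            counters_sent = True
        outfile.write(line)
    return counters_sent


def main():
    # Entry point for use as a streaming mapper or reducer.
    process(sys.stdin, sys.stdout, sys.stderr)
```

To use it as a mapper or reducer, call main() under the usual if __name__ == "__main__": guard.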