-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/547/#review383
-----------------------------------------------------------

trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java
<https://reviews.apache.org/r/547/#comment725>

    Referencing PigMapReduce.sJobContext may cause a race condition in local
    Pig jobs, similar to what is described in PIG-1831. Should a similar fix
    be applied, with the context in PigMapReduce held in thread-local storage?


- Adam


On 2011-05-19 16:27:22, Adam Warrington wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/547/
> -----------------------------------------------------------
>
> (Updated 2011-05-19 16:27:22)
>
>
> Review request for pig.
>
>
> Summary
> -------
>
> This is a patch for PIG-1702, which describes an issue where the task output
> logs for PIG streaming jobs contain null input-split information. The
> ability to query the input-split information through the JobConf went away
> with the new MR API. We must now obtain a reference to the underlying
> FileSplit and query it for that information.
>
>
> Diffs
> -----
>
>   trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java 1088692
>
> Diff: https://reviews.apache.org/r/547/diff
>
>
> Testing
> -------
>
> To test this, I wrote a very simple Python script to pass data through using
> PIG. After checking the task logs of the completed task, the stderr logs now
> contain valid input split information. Below are the scripts and test data
> used.
>
> ### PIG commands run ###
> DEFINE testpy `test.py` SHIP ('test.py');
> raw_records = LOAD '/test.txt2';
> T1 = STREAM raw_records THROUGH testpy;
> dump T1;
>
> ### test.py ###
> #!/usr/bin/python
> import sys
>
> cnt = 0
> for line in sys.stdin:
>     print line.strip() + " " + str(cnt)
>     cnt += 1
>
> ### contents of /test.txt on hdfs ###
> one line
> two line
> three line
> four line
>
>
> Thanks,
>
> Adam
>
>
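
To illustrate the thread-local question raised in the review comment at the top, here is a minimal sketch of the general pattern. It is not the PIG-1831 patch and does not use Pig or Hadoop classes: ContextHolder, its methods, and the "split-N" strings are hypothetical stand-ins, with a plain String taking the place of the job context that PigMapReduce.sJobContext holds. It only shows why per-thread storage avoids one task thread seeing another thread's context when several tasks share a JVM.

```java
import java.util.concurrent.CountDownLatch;

public class ContextHolder {
    // Hypothetical stand-in for a shared static context field.
    // With ThreadLocal, each thread reads back only the value it set itself.
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    public static void set(String ctx) { CONTEXT.set(ctx); }
    public static String get() { return CONTEXT.get(); }
    public static void clear() { CONTEXT.remove(); }

    // Runs two "tasks" on separate threads in one JVM and
    // returns the context value each task observed.
    public static String[] runTwoTasks() throws InterruptedException {
        final String[] seen = new String[2];
        final CountDownLatch bothSet = new CountDownLatch(2);
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                set("split-" + id);
                bothSet.countDown();
                try {
                    // Wait until the other thread has set its context too,
                    // so a plain static field would already be overwritten.
                    bothSet.await();
                    seen[id] = get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    clear(); // avoid leaking the value if the thread is reused
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] seen = runTwoTasks();
        System.out.println(seen[0] + " " + seen[1]);
    }
}
```

If CONTEXT were an ordinary static String, both threads could read whichever value was written last, which is the kind of race the comment is asking about.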
