Hi,
I use Hadoop 0.20.2 on Debian 6.0 (squeeze) nodes.
I have 2 problems with my streaming jobs:
1) I start the job like so:
hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
        -file /proj/Search/wall/experiment/ \
        -mapper './nolog.sh mapper' \
        -reducer './nolog.sh reducer' \
        -input sim-input -output sim-output

nolog.sh is just a simple wrapper for my Python program.
It calls build-models.py with --mapper or --reducer, depending on which 
argument it got, and it filters out any bogus logging output using grep.
It looks like this:

#!/bin/sh
python $(dirname $0)/build-models.py --$1 | egrep -v 'INFO|DEBUG|WARN'

build-models.py is a python 2 program containing all mapper/reducer/etc logic, 
it has the executable flag set for owner/group/other.
(I even added `chmod +x` on it in nolog.sh to be really sure)

The problems:
When build-models.py starts with the shebang "#!/usr/bin/python" or 
"#!/usr/bin/env python" (I would expect the latter to work for sure?),
and nolog.sh runs it directly as $(dirname $0)/build-models.py,
I get this error:
/tmp/hadoop-dplaetin/mapred/local/taskTracker/jobcache/job_201103311017_0008/attempt_201103311017_0008_m_000000_0/work/././nolog.sh: 9: /tmp/hadoop-dplaetin/mapred/local/taskTracker/jobcache/job_201103311017_0008/attempt_201103311017_0008_m_000000_0/work/././build-models.py: Permission denied
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1


So, despite not understanding why it's needed (Python is installed correctly, 
the executable flags are set, etc.), I can "solve" this by using the invocation 
in nolog.sh as shown above (`python <scriptname>`).
If you invoke a Python program like that, you should be able to remove the 
shebang as well, because it's not needed (I verified this manually).
However, when I run it in Hadoop without the shebang, it tries to execute the 
Python file as a bash script and yields a bunch of "command not found" errors.
What is going on? Why can't I just execute the file and rely on the shebang? 
And if I pass the file as an argument to the python program, why is the shebang 
still needed?
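
One way to narrow this down might be to log what the task actually sees in its working directory. The following is only a minimal diagnostic sketch: `describe_script` is a hypothetical helper (not part of Hadoop or of build-models.py) that reports the permission bits, whether the current process may execute the file, and the first line of the file.

```python
import os
import stat


def describe_script(path):
    """Report permission bits, an exec check and the first line of a script.

    Hypothetical diagnostic helper, named here for illustration only.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    with open(path) as handle:
        first_line = handle.readline().rstrip("\n")
    return {
        "mode": oct(mode),                        # permission bits in octal
        "owner_exec": bool(mode & stat.S_IXUSR),  # is the owner execute bit set?
        "can_exec": os.access(path, os.X_OK),     # may this process execute it?
        "first_line": first_line,                 # the shebang, if there is one
    }
```

Writing the result for build-models.py to stderr from inside the task should make it show up in the task logs, which would at least show whether the executable bit and the shebang are still what I expect once the file has been shipped to the node.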


2) The second problem is somewhat related: my mappers jump to "100% 
completion" right away, but they take about an hour to complete. So I see them 
sitting in 'RUNNING' at 100% completion for an hour before they really finish.
This is probably an issue with how stdin is read, as Python buffers its input 
by default (see
http://stackoverflow.com/questions/3670323/setting-smaller-buffer-size-for-sys-stdin
).
In my code I iterate over stdin with `for line in sys.stdin:`, so I 
process line by line, but apparently Python reads the entire stdin right away. 
My HDFS block size is 20KiB (which, according to the thread above, happens to be 
pretty much the size of the Python buffer).

Now, why is this related? -> Because I can invoke python in a different way to 
keep it from doing the buffering.
apparently using the -u flag should do the trick, or setting the environment 
variable PYTHONUNBUFFERED to a nonempty string.
However:
- putting `python -u` in nolog.sh doesn't do it, why?
- neither does putting `export PYTHONUNBUFFERED=true` in nolog.sh before the 
invocation, why?
- in the build-models.py shebang:
  putting `/usr/bin/env python -u` or `/usr/bin/env 'python -u'` gives:
  /usr/bin/env: python -u: No such file or directory, why?
I did find a working variant, that is, I can use this shebang:
`#!/usr/bin/env PYTHONUNBUFFERED=true python2`, however since I use the same 
file for multiple things, this made i/o for a bunch of other things way too 
slow, so I tried solving this in the python code (as per the tip in the above 
link), but to no avail. (I know, my final question is a bit less related)
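
A likely explanation for the `/usr/bin/env: python -u: No such file or directory` error is that Linux does not split the shebang line into separate words: everything after the interpreter path is handed over as one single argument. The sketch below only imitates that behaviour under this assumption; `shebang_argv` is a hypothetical helper, not a real API.

```python
def shebang_argv(shebang_line, script_path):
    """Approximate the argv built for a script that has a shebang line.

    Assumption: Linux behaviour, where the text after the interpreter
    path is passed on as ONE argument. Hypothetical helper.
    """
    if not shebang_line.startswith("#!"):
        raise ValueError("not a shebang line")
    parts = shebang_line[2:].strip().split(None, 1)
    if not parts:
        raise ValueError("no interpreter given")
    argv = [parts[0]]          # the interpreter path
    if len(parts) == 2:
        argv.append(parts[1])  # the rest of the line, NOT split any further
    argv.append(script_path)   # the script itself comes last
    return argv


print(shebang_argv("#!/usr/bin/env python -u", "./build-models.py"))
# prints ['/usr/bin/env', 'python -u', './build-models.py']
```

Under that assumption, env is asked to run a program literally named "python -u", which would match the error message.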

So I tried remapping sys.stdin (before iterating over it) with these two attempts
(see http://docs.python.org/library/os.html#os.fdopen ):
newin = os.fdopen(sys.stdin.fileno(), 'r', 100)  # should make the buffer size roughly 100 bytes
newin = os.fdopen(sys.stdin.fileno(), 'r', 1)    # should make Python buffer line by line

However, neither of those worked.
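
One alternative that may be worth trying: the Python 2 documentation for `-u` notes that file iteration (`for line in sys.stdin`) uses an internal read-ahead buffer that the option does not influence, and suggests calling `readline()` in a loop instead. A minimal sketch of that approach follows; `iter_lines` is a hypothetical helper, and the sketch does not settle whether this changes the progress that Hadoop reports.

```python
import io
import sys


def iter_lines(stream):
    """Yield lines one at a time using readline() instead of file iteration."""
    while True:
        line = stream.readline()
        if not line:  # an empty string means end of input
            break
        yield line


# Stand-in for sys.stdin so the sketch runs on its own; in the mapper the
# loop would be: for line in iter_lines(sys.stdin): ...
demo = io.StringIO(u"first\nsecond\n")
for line in iter_lines(demo):
    sys.stdout.write(line)
```

The only change to the mapper would be the loop header; the per-line processing stays as it is.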

Any help/input is welcome.
I'm usually pretty good at figuring out these kinds of invocation issues, 
but this one blows my mind :/

Dieter
