On Wed, Oct 22, 2008 at 18:55, Steve Gao [EMAIL PROTECTED] wrote:
I am using Hadoop Streaming. The input are multiple files.
Is there a way to get the current filename in mapper?
Streaming map tasks should have a map_input_file environment
variable set to the path of the current input file (streaming exports
jobconf properties as environment variables with dots replaced by
underscores, so map.input.file becomes map_input_file).
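A minimal sketch of a shell mapper body that uses this variable; the function name and the tab-separated output format are illustrative assumptions, not from the thread:

```shell
# Hadoop Streaming exports the current split's path in the
# map_input_file environment variable (from the map.input.file
# job property). This hypothetical mapper prefixes each input
# record with that path so the source file survives into the output.
map_with_filename() {
  while IFS= read -r line; do
    printf '%s\t%s\n' "${map_input_file:-unknown}" "$line"
  done
}
```

Shipped as an executable script and passed via -mapper, each output record then carries the name of the file it came from.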
On Fri, Jun 27, 2008 at 08:57, Chris Anderson [EMAIL PROTECTED] wrote:
The problem is that when there are a large number of map tasks to
complete, Hadoop doesn't seem to obey the map.tasks.maximum. Instead,
it is spawning 8 map tasks per tasktracker (even when I change the
Thanks for the quick response! I see this feature is in trunk and not
available in the last stable release. Anyway, I will try it from
trunk, and will also check whether it catches segmentation faults.
Rick Cox wrote:
Try -jobconf stream.non.zero.exit.status.is.failure=true.
That will tell streaming that a non-zero exit is a task failure. To
turn that into an immediate whole job failure, I think configuring 0
task retries (mapred.map.max.attempts=1 and
mapred.reduce.max.attempts=1) will be sufficient.
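Putting those flags together, a sketch of a streaming invocation that fails the whole job on the first non-zero task exit; the jar path, input/output paths, and script names are hypothetical:

```shell
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
  -input  /logs/input \
  -output /logs/output \
  -mapper mapper.sh \
  -reducer reducer.sh \
  -jobconf stream.non.zero.exit.status.is.failure=true \
  -jobconf mapred.map.max.attempts=1 \
  -jobconf mapred.reduce.max.attempts=1
```

With max attempts set to 1, the first failed attempt is also the last, so the task failure immediately becomes a job failure.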
rick
On Tue, Apr 8, 2008 at 12:36 PM, Ian Tegebo [EMAIL PROTECTED] wrote:
My original question was about specifying MaxMapTaskFailuresPercent
as a job conf parameter on the command line for streaming jobs. Is
there a conf setting like the following?
mapred.taskfailure.percent
The job
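For what it's worth, the property behind the JobConf setMaxMapTaskFailuresPercent setter in Hadoop of that era was, to my knowledge, mapred.max.map.failures.percent (verify the name against your release); on a streaming command line it would be passed like any other jobconf value:

```shell
# Tolerate up to 20% failed map tasks before failing the job
# (property name assumed from setMaxMapTaskFailuresPercent; check
# your Hadoop version's mapred-default.xml).
-jobconf mapred.max.map.failures.percent=20
```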