Pig/Zebra fails without a proper error message when the number of tasks 
exceeds mapred.jobtracker.maxtasks.per.job
----------------------------------------------------------------------------------------------------------

                 Key: PIG-1377
                 URL: https://issues.apache.org/jira/browse/PIG-1377
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.6.0, 0.7.0
            Reporter: Viraj Bhat


I have a Zebra script that generates a very large number of mappers, around 
400K. mapred.jobtracker.maxtasks.per.job is currently set to 200K, so the job 
fails in the initialization phase. Because no proper error message is 
reported, it is very hard to find the cause.

We need a way to report the right error message to users. Unfortunately, for 
Pig to get this error from the backend, the MapReduce JIRA 
https://issues.apache.org/jira/browse/MAPREDUCE-1049 needs to be fixed first.
{code}

-- Sorted format
%set default_parallel 100;
raw = load '/user/viraj/generated/raw/zebra-sorted/20100203'
                USING org.apache.hadoop.zebra.pig.TableLoader('', 'sorted')
                as (id,
                        timestamp,
                        code,
                        ip,
                        host,
                        reference,
                        type,
                        flag,
                        params : map[]
                );
describe raw;
user_events = filter raw by id == 'viraj';
describe user_events;
dump user_events;
sorted_events = order user_events by id, timestamp;
dump sorted_events;
store sorted_events into 'finalresult';
{code}
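
The failure described above comes down to one comparison: the job's total task 
count (around 400K mappers here) against the configured limit (200K). As an 
illustration only, a check of that shape with a descriptive message might look 
like the Python sketch below. The function name check_task_limit and its 
parameters are hypothetical; this is not the actual JobTracker or Pig code. The 
sketch assumes that reduce tasks count toward the limit and that a non-positive 
limit means "no limit".

```python
def check_task_limit(num_map_tasks, num_reduce_tasks, max_tasks_per_job):
    """Return the total task count, or raise a descriptive error.

    Hypothetical helper for illustration. Assumptions: map and reduce
    tasks both count toward the limit, and a non-positive limit
    disables the check.
    """
    total = num_map_tasks + num_reduce_tasks
    if max_tasks_per_job > 0 and total > max_tasks_per_job:
        # Name the property and both numbers so the user can see the cause.
        raise ValueError(
            f"Job needs {total} tasks ({num_map_tasks} map, "
            f"{num_reduce_tasks} reduce), which exceeds "
            f"mapred.jobtracker.maxtasks.per.job={max_tasks_per_job}"
        )
    return total
```

With the numbers from this report, and taking 100 reducers from the 
default_parallel setting as a rough stand-in, check_task_limit(400000, 100, 
200000) raises an error that names the property and both counts, which is the 
kind of message a user would need to diagnose the failure.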


        
