[
https://issues.apache.org/jira/browse/PIG-3913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993148#comment-13993148
]
Aniket Mokashi commented on PIG-3913:
-------------------------------------
Looking at this, the main difference seems to be that in hadoop 1, jobcontrol.Job
does not expose an api to get the mapreduce/mapred.Job object from it (in fact it
does not even store it). In hadoop 2, jobcontrol.Job is just a wrapper around
ControlledJob, which stores a pointer to the real mapreduce.Job. In hadoop 1, for
querying stats, we should get the jobClient of the job and ask it for the
runningJob given the JobID of the job. In hadoop 2, we should directly get the
real job using getJob of jobcontrol.Job and query it for stats.
I will construct the necessary shims to hide this behind interfaces.
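A rough sketch of what such shims could look like. It uses small stand-in
classes rather than the real Hadoop types so that it compiles on its own, and
every name in it (JobStatsShim, Hadoop1Shim, Hadoop2Shim, the *V1/*V2 classes)
is hypothetical and not taken from the attached patches. The stand-in methods
only mirror the calls described above: jobClient plus JobID lookup on hadoop 1,
getJob on hadoop 2.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class StatsShimSketch {
    private StatsShimSketch() {}

    /** Hypothetical interface that hides the hadoop 1 / hadoop 2 difference. */
    interface JobStatsShim<J> {
        Map<String, Long> getCounters(J controlJob);
    }

    // ---- Stand-ins for the hadoop 1 side (assumed shapes, not real classes) ----

    /** Stand-in for a running job handle that can report counters. */
    static final class RunningJobV1 {
        private final Map<String, Long> counters;
        RunningJobV1(Map<String, Long> counters) { this.counters = counters; }
        Map<String, Long> getCounters() { return counters; }
    }

    /** Stand-in for a job client that looks up running jobs by id. */
    static final class JobClientV1 {
        private final Map<String, RunningJobV1> jobsById = new HashMap<>();
        void register(String jobId, RunningJobV1 job) { jobsById.put(jobId, job); }
        RunningJobV1 getJob(String jobId) { return jobsById.get(jobId); }
    }

    /** Stand-in for a hadoop 1 jobcontrol.Job: no reference to the real job. */
    static final class ControlJobV1 {
        private final JobClientV1 client;
        private final String assignedJobId;
        ControlJobV1(JobClientV1 client, String assignedJobId) {
            this.client = client;
            this.assignedJobId = assignedJobId;
        }
        JobClientV1 getJobClient() { return client; }
        String getAssignedJobID() { return assignedJobId; }
    }

    // ---- Stand-ins for the hadoop 2 side (assumed shapes, not real classes) ----

    /** Stand-in for the real mapreduce job, which reports counters itself. */
    static final class RealJobV2 {
        private final Map<String, Long> counters;
        RealJobV2(Map<String, Long> counters) { this.counters = counters; }
        Map<String, Long> getCounters() { return counters; }
    }

    /** Stand-in for a hadoop 2 jobcontrol.Job: keeps a pointer to the real job. */
    static final class ControlJobV2 {
        private final RealJobV2 realJob;
        ControlJobV2(RealJobV2 realJob) { this.realJob = realJob; }
        RealJobV2 getJob() { return realJob; }
    }

    // ---- The two shims ----

    /** hadoop 1 path: go through the job's own client and its job id. */
    static final class Hadoop1Shim implements JobStatsShim<ControlJobV1> {
        @Override
        public Map<String, Long> getCounters(ControlJobV1 controlJob) {
            RunningJobV1 running =
                controlJob.getJobClient().getJob(controlJob.getAssignedJobID());
            // An unknown job id yields no counters instead of a NullPointerException.
            return running == null
                ? Collections.<String, Long>emptyMap()
                : running.getCounters();
        }
    }

    /** hadoop 2 path: ask the wrapped real job directly. */
    static final class Hadoop2Shim implements JobStatsShim<ControlJobV2> {
        @Override
        public Map<String, Long> getCounters(ControlJobV2 controlJob) {
            return controlJob.getJob().getCounters();
        }
    }
}
```

With something along these lines, the stats code would only see JobStatsShim,
and the version-specific shim would be picked when Pig is built against
hadoop 1 or hadoop 2.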
> Pig should use job's jobClient wherever possible (fixes local mode counters)
> ----------------------------------------------------------------------------
>
> Key: PIG-3913
> URL: https://issues.apache.org/jira/browse/PIG-3913
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Aniket Mokashi
> Assignee: Gera Shegalov
> Fix For: 0.13.0
>
> Attachments: PIG-3913-1.patch, PIG-3913-MRv2.v01.patch
>
>
> MapReduceLauncher initializes a statsJobClient to poll counter information of
> jobs. This works fine in mapreduce mode, but it reports incorrect information
> in local (auto-local) mode. Pig code should try to use
> org.apache.hadoop.mapred.jobcontrol.Job's getJobClient api to get a handle to
> the jobClient wherever possible. statsJobClient (and every place its references
> are passed) should be deprecated.