Has anyone seen this issue before? Separately, in which cases does the JobClient.getJob API return null? We call this API to get counter info. Is there another way to do so?
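To make the getJob question concrete: one possible stopgap, assuming the null is transient (nothing in this thread confirms that), would be to retry only the lookup. The sketch below is a generic helper, not anything from Hadoop or Oozie; the names NullRetry and retryUntilNonNull are made up for illustration, and the Hadoop call appears only in a comment.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Hadoop or Oozie): repeats a lookup
// that may return null, pausing between attempts.
final class NullRetry {
    private NullRetry() {}

    /**
     * Calls lookup up to maxAttempts times and returns the first non-null
     * result. Returns null if every attempt returned null.
     */
    static <T> T retryUntilNonNull(Callable<T> lookup, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = lookup.call();
            if (result != null) {
                return result;
            }
            // Sleep only if another attempt will follow.
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        return null;
    }

    // Assumed usage with the old mapred API (jobClient and jobId come from the caller):
    //   RunningJob job = NullRetry.retryUntilNonNull(() -> jobClient.getJob(jobId), 5, 2000L);
    //   if (job != null) { Counters counters = job.getCounters(); }
}
```

This only hides the symptom. If getJob returns null because the JobTracker no longer knows the job, retrying will not help, and the attempt count and sleep interval above are arbitrary.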
For the Oozie issue as well, it seems to be the same root cause, but I am not sure how to resolve it. For now, we are retrying the entire workflow whenever an MR action in it fails with this error code.

On Wed, Feb 6, 2013 at 4:25 PM, Siva Subramanian <[email protected]> wrote:

> Hey,
>
> We have an Oozie workflow that triggers multiple MR actions. The workflow
> failed last week when one of the MR actions failed with -
>
> MR002: Unknown hadoop job [job_201301292116_65447] associated with action [0000161-130130163537664-oozie-oozi-W@topic_mr_1016227728]. Failing this action!
>
> The corresponding MR job did exist and it had run successfully. That was
> the first time we had seen this error and we are still trying to reproduce
> it. Moreover, there is nothing in the logs (Oozie, Hadoop) to indicate the
> root cause. I found this link when researching this error, but I don't
> know what the resolution steps are -
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-dev/201203.mbox/%3c92308400.32028.1332178177590.javamail.tom...@hel.zones.apache.org%3E
>
> One thing we have observed in the past that might help resolve this
> issue is that JobClient.getJob(job id) was returning null for some MR
> jobs. We still see this randomly but haven't resolved it yet.
>
> We are using oozie 2.3.2-cdh3u3 & Hadoop 0.20.2-cdh3u3.
>
> Appreciate any help...
>
> Thanks,
> Siva
>

--
*Siva Subramanian*
*Engineering Analytics*
