Thanks Ryota! That seems to be the issue. I could verify from the job tracker logs that the job had in fact been moved out of the cache.
Maybe this question should have been directed to the map-reduce user group, but I wanted to check nevertheless. Do you know how to retrieve the information from the completed job status store? I am assuming it is not straightforward.

Thanks again,
Siva

On Mon, Feb 11, 2013 at 2:17 PM, Ryota Egashira <[email protected]> wrote:

> Siva
>
> This might be a map-reduce issue, MAPREDUCE:2535, MAPREDUCE:2470 related?
> I didn't hit this myself, but just looked at a similar case where null is
> returned if one directly calls JobClient.getJob(JobID) on a job that has
> been retired and is moved from RetireJobCache and CompletedJobStatusStore.
>
> Hope it's relevant.
>
> Thanks
> Ryota
>
> On 2/11/13 1:50 PM, "Siva Subramanian" <[email protected]> wrote:
>
> >Has anyone seen this issue before?
> >
> >Separately, what are the cases in which the JobClient.getJob api returns
> >null? We call this api to get counter info. Is there another way to do so?
> >
> >For the oozie issue as well, it seems to be the same root cause, but I am
> >not sure how to resolve it. For now, we are retrying the entire workflow
> >whenever an MR action in it fails with this error code.
> >
> >On Wed, Feb 6, 2013 at 4:25 PM, Siva Subramanian <[email protected]> wrote:
> >
> >> Hey,
> >>
> >> We have an oozie workflow that triggers multiple MR actions. The workflow
> >> failed last week when one of the MR actions failed with -
> >>
> >> MR002: Unknown hadoop job [job_201301292116_65447] associated with action
> >> [0000161-130130163537664-oozie-oozi-W@topic_mr_1016227728]. Failing this
> >> action!
> >>
> >> The corresponding MR job did exist and it had run successfully. That was
> >> the first time we had seen this error and we are still trying to
> >> reproduce it. Moreover, there is nothing in the logs (oozie, hadoop) to
> >> indicate the root cause. I found this link when researching this error,
> >> but I don't know what the resolution steps are -
> >>
> >> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-dev/201203.mbox/%3c92308400.32028.1332178177590.javamail.tom...@hel.zones.apache.org%3E
> >>
> >> One thing that we have observed in the past that might help resolve this
> >> issue is that JobClient.getJob(job id) was returning null for some MR
> >> jobs. We still see this randomly but haven't resolved it yet.
> >>
> >> We are using oozie 2.3.2-cdh3u3 & Hadoop 0.20.2-cdh3u3.
> >>
> >> Appreciate any help...
> >>
> >> Thanks,
> >> Siva
> >
> >--
> >*Siva Subramanian*
> >*Engineering Analytics*

--
*Siva Subramanian*
*Engineering Analytics*
