[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870997#action_12870997
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1505:
----------------------------------------------------

bq. All of the o.a.h.mapreduce.Job constructors that don't require the caller 
to have already created and supplied a Cluster are deprecated. 
Dick, I did not understand your comment above. The Job constructors are 
deprecated in favor of the static getInstance methods, as discussed in [comment1 
|https://issues.apache.org/jira/browse/MAPREDUCE-777?focusedCommentId=12746014&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12746014]
 and [comment2 
|https://issues.apache.org/jira/browse/MAPREDUCE-777?focusedCommentId=12755973&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12755973].
If the user passes a Cluster handle, it is fine to initialize it in the 
constructor, so the current constructors and getInstance methods look fine. 
Only when the user does not pass a Cluster handle do we need to create it lazily. 

We can add the following method to Job.java, which creates the Cluster lazily:
{code}
public static Job getInstance(Configuration conf)
{code}

We will also have to change the deprecated constructors to create the Cluster 
handle lazily.
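
To make the idea concrete, here is a minimal sketch of lazy Cluster creation. It is only an illustration under stated assumptions, not the patch itself: Configuration and Cluster below are simplified stand-ins for the real Hadoop classes, and the connectionsOpened counter, the private getCluster() helper, and the clusterState() method are hypothetical names introduced for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class LazyClusterSketch {

    /** Simplified stand-in for org.apache.hadoop.conf.Configuration. */
    static class Configuration {
        private final Map<String, String> props = new HashMap<>();
        void set(String key, String value) { props.put(key, value); }
        String get(String key) { return props.get(key); }
    }

    /** Simplified stand-in for o.a.h.mapreduce.Cluster. */
    static class Cluster {
        // Hypothetical counter: lets us observe when a "connection" is made.
        static int connectionsOpened = 0;

        Cluster(Configuration conf) {
            // The real class would set up the rpc client to the jobtracker here.
            connectionsOpened++;
        }

        String getJobTrackerState() { return "RUNNING"; }
    }

    /** Sketch of a Job that defers Cluster creation until it is needed. */
    static class Job {
        private final Configuration conf;
        private Cluster cluster; // stays null until a cluster call is made

        private Job(Configuration conf, Cluster cluster) {
            this.conf = conf;
            this.cluster = cluster;
        }

        /** No Cluster supplied: do not create one yet. */
        static Job getInstance(Configuration conf) {
            return new Job(conf, null);
        }

        /** Caller supplied a Cluster: keep using it as-is. */
        static Job getInstance(Cluster cluster, Configuration conf) {
            return new Job(conf, cluster);
        }

        /** Configuration-only access never touches the Cluster. */
        Configuration getConfiguration() { return conf; }

        /** Hypothetical helper: creates the Cluster on first use only. */
        private synchronized Cluster getCluster() {
            if (cluster == null) {
                cluster = new Cluster(conf);
            }
            return cluster;
        }

        /** Hypothetical method standing in for any call that needs the jobtracker. */
        String clusterState() { return getCluster().getJobTrackerState(); }
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("mapred.job.name", "example");

        Job job = Job.getInstance(conf);
        System.out.println("name = " + job.getConfiguration().get("mapred.job.name"));
        System.out.println("connections after config access: " + Cluster.connectionsOpened);

        job.clusterState();
        System.out.println("connections after cluster call: " + Cluster.connectionsOpened);
    }
}
```

With this shape, a task that only reads the configuration through the Job never triggers a connection, and the deprecated constructors could delegate to the same lazy path.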

Thoughts?

> Cluster class should create the rpc client only when needed
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-1505
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1505
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.20.2
>            Reporter: Devaraj Das
>            Assignee: Dick King
>             Fix For: 0.22.0
>
>         Attachments: mapreduce-1505--2010-05-19.patch, 
> MAPREDUCE-1505_yhadoop20.patch, MAPREDUCE-1505_yhadoop20_9.patch
>
>
> It will be good to have the org.apache.hadoop.mapreduce.Cluster create the 
> rpc client object only when needed (when a call to the jobtracker is actually 
> required). org.apache.hadoop.mapreduce.Job constructs the Cluster object 
> internally and in many cases the application that created the Job object 
> really wants to look at the configuration only. It'd help to not have these 
> connections to the jobtracker, especially when Job is used in the tasks (e.g., 
> Pig calls mapreduce.FileInputFormat.setInputPath in the tasks, and that 
> requires a Job object to be passed).
> In Hadoop 20, the Job object internally creates the JobClient object, and the 
> same argument applies there too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
