[ 
https://issues.apache.org/jira/browse/HADOOP-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HADOOP-3722:
----------------------------------

      Issue Type: Improvement  (was: New Feature)
    Release Note: 
This issue:
1. Changed StreamJob (of streaming) and Submitter (of pipes) to implement Tool 
and Configurable. Streaming and Submitter now accept the GenericOptionsParser 
arguments:
  -fs, -jt, -conf, -D, -libjars, -files, -archives

2. Deprecated -jobconf, -cacheArchive, -dfs, and -additionalconfspec in 
streaming and pipes (where applicable) in favor of the generic options. The 
deprecated options still work but print a warning, and they may be removed 
in a later release.

3. Removed from streaming:
 -config : it is not documented anywhere.
 -mapred.job.tracker : it sets the wrong property, so it is not currently used.
 -cluster : passing -cluster gives an "Unexpected -cluster while 
processing" error, so it is not currently used.

    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed, 
Incompatible change])

Added a release note.
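
To illustrate the behaviour the release note describes (a generic -D option 
accepted alongside a deprecated -jobconf that still works but warns), here is 
a minimal, self-contained sketch. It does not use Hadoop classes and is not 
the actual GenericOptionsParser or StreamJob code; the class OptionSketch, 
its fields, and the warning text are hypothetical, and the property names are 
only illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT Hadoop's GenericOptionsParser or StreamJob.
final class OptionSketch {
    final Map<String, String> conf = new LinkedHashMap<>(); // parsed key=value pairs
    final List<String> warnings = new ArrayList<>();        // deprecation warnings
    final List<String> remaining = new ArrayList<>();       // tool-specific arguments

    static OptionSketch parse(String[] args) {
        OptionSketch r = new OptionSketch();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            if (a.equals("-D") && i + 1 < args.length) {
                r.put(args[++i]);                 // generic form: -D key=value
            } else if (a.startsWith("-D") && a.length() > 2) {
                r.put(a.substring(2));            // generic form: -Dkey=value
            } else if (a.equals("-jobconf") && i + 1 < args.length) {
                // Deprecated form: still honoured, but a warning is recorded.
                r.warnings.add("-jobconf is deprecated; use -D instead");
                r.put(args[++i]);
            } else {
                r.remaining.add(a);               // left for the tool itself
            }
        }
        return r;
    }

    private void put(String kv) {
        int eq = kv.indexOf('=');
        if (eq <= 0) {
            throw new IllegalArgumentException("expected key=value, got: " + kv);
        }
        conf.put(kv.substring(0, eq), kv.substring(eq + 1));
    }

    public static void main(String[] args) {
        OptionSketch r = parse(args);
        r.warnings.forEach(w -> System.err.println("WARN: " + w));
        System.out.println("conf=" + r.conf + " remaining=" + r.remaining);
    }
}
```

In this sketch both spellings end up in the same map, which is the point of 
the change: a single place where per-job overrides are collected, with the 
old option kept only as a warned-about alias.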

> Provide a unified way to pass jobconf options from bin/hadoop
> -------------------------------------------------------------
>
>                 Key: HADOOP-3722
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3722
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.19.0
>            Reporter: Matei Zaharia
>            Assignee: Enis Soztutar
>            Priority: Minor
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3722.patch, jobconfoptions_v1.patch, 
> jobconfoptions_v2.patch
>
>
> Often when running a job it is useful to override some jobconf parameters 
> from jobconf.xml for that particular job - for example, setting the job 
> priority, setting the number of reduce tasks, setting the HDFS replication 
> level, etc. Currently the Hadoop examples, streaming, pipes, etc. take these 
> extra jobconf parameters in different ways: the examples in 
> hadoop-examples.jar use -Dkey=value, streaming uses -jobconf key=value, and 
> pipes uses -jobconf key1=value1,key2=value2,etc. Things would be simpler if 
> bin/hadoop could take the jobconf parameters itself, so that you could run 
> for example bin/hadoop -Dkey=value jar [whatever] as well as bin/hadoop 
> -Dkey=value pipes [whatever]. This is especially useful when an organization 
> needs to require users to use a particular property, e.g. the name of a queue 
> to use for scheduling in HADOOP-3445. Otherwise, users may confuse one way of 
> passing parameters with another and may not notice that they forgot to 
> include certain properties.
> I propose adding support in bin/hadoop for jobconf options to be specified 
> with -C key=value. This would have the effect of setting 
> hadoop.jobconf.key=value in Java's system properties. The Configuration class 
> would then be modified to read any system properties that begin with 
> hadoop.jobconf and override the values in hadoop-site.xml.
> I can write a patch for this pretty quickly if the design is sound. If 
> there's a better way of specifying jobconf parameters uniformly across Hadoop 
> commands, let me know.
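
The override mechanism proposed in the description above (system properties 
beginning with hadoop.jobconf taking precedence over site values) can be 
sketched as follows. This shows the proposal as worded, not what was 
committed; the release note describes a GenericOptionsParser-based change 
instead. The class JobConfOverrideSketch and the method applyOverrides are 
hypothetical, and a plain Map stands in for the Configuration class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of the proposal; not Hadoop's Configuration class.
final class JobConfOverrideSketch {
    // Prefix assumed from the description: -C key=value sets hadoop.jobconf.key
    static final String PREFIX = "hadoop.jobconf.";

    // Returns the site values with any prefixed properties layered on top.
    static Map<String, String> applyOverrides(Map<String, String> siteValues,
                                              Properties props) {
        Map<String, String> merged = new LinkedHashMap<>(siteValues);
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                // Strip the prefix so the override targets the real key.
                merged.put(name.substring(PREFIX.length()), props.getProperty(name));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> site = new LinkedHashMap<>();
        site.put("mapred.reduce.tasks", "1"); // illustrative site value
        // Run with e.g. -Dhadoop.jobconf.mapred.reduce.tasks=8 to see an override.
        System.out.println(applyOverrides(site, System.getProperties()));
    }
}
```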

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
