[
https://issues.apache.org/jira/browse/HADOOP-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Enis Soztutar updated HADOOP-3722:
----------------------------------
Issue Type: Improvement (was: New Feature)
Release Note:
This issue:
1. Changed StreamJob (of streaming) and Submitter (of pipes) to implement Tool
and Configurable. Streaming and pipes now accept the GenericOptionsParser
arguments -fs, -jt, -conf, -D, -libjars, -files and -archives (see the
sketch below).
2. Deprecated -jobconf, -cacheArchive, -cacheFile, -dfs and
-additionalconfspec in streaming and pipes (where applicable) in favor of the
generic options. The deprecated options still work but issue a warning; they
may be removed in a following release.
3. Removed the following options from streaming:
-config: it is not documented anywhere.
-mapred.job.tracker: it sets the wrong property, so it is effectively unused.
-cluster: passing -cluster gives an "Unexpected -cluster while processing"
error, so it is effectively unused.
Hadoop Flags: [Incompatible change, Reviewed] (was: [Reviewed,
Incompatible change])
Added a release note.
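For illustration, here is a minimal sketch of the Tool/ToolRunner pattern that
StreamJob and Submitter now follow. The class name MyJob is hypothetical and
this is not the actual streaming or pipes code; it only shows how ToolRunner
(which uses GenericOptionsParser internally) consumes the generic options and
applies them to the Configuration before run() is invoked:

{code}
// Hypothetical driver illustrating the Tool pattern; not the actual
// StreamJob or Submitter source.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {

  public int run(String[] args) throws Exception {
    // By the time run() is called, ToolRunner has already parsed the
    // generic options (-fs, -jt, -conf, -D, -libjars, -files, -archives)
    // and applied them to the Configuration returned by getConf().
    Configuration conf = getConf();
    System.out.println("mapred.reduce.tasks = "
        + conf.get("mapred.reduce.tasks"));
    // ... build and submit the job from conf here ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new MyJob(), args));
  }
}
{code}

With this in place the same syntax works across commands, for example
(the jar name here is illustrative):
bin/hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=2 \
  -input in -output out -mapper /bin/cat -reducer /bin/wc
Generic options are given before the command-specific options.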
> Provide a unified way to pass jobconf options from bin/hadoop
> -------------------------------------------------------------
>
> Key: HADOOP-3722
> URL: https://issues.apache.org/jira/browse/HADOOP-3722
> Project: Hadoop Core
> Issue Type: Improvement
> Components: conf
> Affects Versions: 0.19.0
> Reporter: Matei Zaharia
> Assignee: Enis Soztutar
> Priority: Minor
> Fix For: 0.19.0
>
> Attachments: HADOOP-3722.patch, jobconfoptions_v1.patch,
> jobconfoptions_v2.patch
>
>
> Often when running a job it is useful to override some jobconf parameters
> from jobconf.xml for that particular job - for example, setting the job
> priority, setting the number of reduce tasks, setting the HDFS replication
> level, etc. Currently the Hadoop examples, streaming, pipes, etc. take these
> extra jobconf parameters in different ways: the examples in
> hadoop-examples.jar use -Dkey=value, streaming uses -jobconf key=value, and
> pipes uses -jobconf key1=value1,key2=value2,etc. Things would be simpler if
> bin/hadoop could take the jobconf parameters itself, so that you could run,
> for example, bin/hadoop -Dkey=value jar [whatever] as well as bin/hadoop
> -Dkey=value pipes [whatever]. This is especially useful when an organization
> needs to require users to set a particular property, e.g. the name of a queue
> to use for scheduling in HADOOP-3445. Otherwise, users may confuse one way of
> passing parameters with another and may not notice that they forgot to
> include certain properties.
> I propose adding support in bin/hadoop for jobconf options to be specified
> with -C key=value. This would have the effect of setting
> hadoop.jobconf.key=value in Java's system properties. The Configuration class
> would then be modified to read any system properties that begin with
> hadoop.jobconf and override the values in hadoop-site.xml.
> I can write a patch for this pretty quickly if the design is sound. If
> there's a better way of specifying jobconf parameters uniformly across Hadoop
> commands, let me know.
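For reference, a minimal sketch of the override mechanism proposed above
(copying system properties prefixed with hadoop.jobconf. into a
Configuration). This is hypothetical code, not what was committed; the
committed change takes the GenericOptionsParser/Tool route described in the
release note instead:

{code}
// Hypothetical sketch of the proposed mechanism: any Java system property
// named hadoop.jobconf.<key> overrides <key> in the Configuration after
// hadoop-site.xml has been loaded.
import java.util.Map;
import java.util.Properties;
import org.apache.hadoop.conf.Configuration;

public class JobconfSystemPropertyOverrides {

  private static final String PREFIX = "hadoop.jobconf.";

  public static void apply(Configuration conf) {
    Properties props = System.getProperties();
    for (Map.Entry<Object, Object> entry : props.entrySet()) {
      String name = (String) entry.getKey();
      if (name.startsWith(PREFIX)) {
        // e.g. bin/hadoop -C mapred.reduce.tasks=5 would set the system
        // property hadoop.jobconf.mapred.reduce.tasks=5, which becomes
        // mapred.reduce.tasks=5 here.
        conf.set(name.substring(PREFIX.length()),
                 (String) entry.getValue());
      }
    }
  }
}
{code}

Calling such a hook from Configuration (as the description suggests) would
let bin/hadoop -C key=value work uniformly across commands; the shipped patch
reaches the same goal through the generic -D option instead.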
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.