[
https://issues.apache.org/jira/browse/HADOOP-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597988#action_12597988
]
Allen Wittenauer commented on HADOOP-3287:
------------------------------------------
But what if I'm in a heterogeneous network such that some machines have eight
cores and others have two cores? The TaskTracker config will play a part
there, correct?
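For reference, the per-node slot counts the comment alludes to are set in each TaskTracker's own hadoop-site.xml; an illustrative sketch (values chosen here as an example, not from the issue) might be:

```xml
<!-- hadoop-site.xml on an eight-core node (illustrative values) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```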
> Being able to set default job configuration values on the jobtracker
> --------------------------------------------------------------------
>
> Key: HADOOP-3287
> URL: https://issues.apache.org/jira/browse/HADOOP-3287
> Project: Hadoop Core
> Issue Type: Bug
> Components: conf, mapred
> Environment: all
> Reporter: Alejandro Abdelnur
> Priority: Critical
>
> The jobtracker hadoop-site.xml carries custom configuration for the cluster,
> and the 'final' flag allows fixing a value so that any override by a client
> submitting a job is ignored.
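As an illustration (the property and value here are hypothetical examples, not taken from the issue), a 'final' entry in the jobtracker hadoop-site.xml looks like:

```xml
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
  <!-- 'final' makes this value win over anything set in the client jobconf -->
  <final>true</final>
</property>
```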
> There are several properties for which a cluster may want to set some default
> values (different from the ones in the hadoop-default.xml), for example:
> * enabling/disabling compression
> * type of compression, record/block
> * number of task retries
> * block replication factor
> * job priority
> * tasks JVM options
> The cluster default values should apply to submitted jobs when the job
> submitter does not care about those values. When the job submitter cares, it
> should include its preferred values. Using the final flag in the jobtracker
> hadoop-site.xml locks the value, ignoring the value set in the client
> jobconf.
> Currently the only way of doing this is to distribute the jobtracker
> hadoop-site.xml to all clients and make sure they use it when creating the
> job configuration.
> There are situations where this is not practical:
> * In a shared cluster with several clients submitting jobs. It requires
> redistributing the hadoop-site.xml to all clients.
> * In a cluster where jobs are dispatched by a web application. It
> requires rebundling and redeploying the webapp.
> The current behavior happens because the jobconf, when serialized to be sent
> to the jobtracker, includes all the values found in the hadoop-default.xml
> bundled with the Hadoop JAR file. On the jobtracker side, all those values
> override all but the 'final' properties of the jobtracker hadoop-site.xml.
> According to the javadocs of Configuration.write(OutputStream), this should
> not happen: 'Writes non-default properties in this configuration.'
> If the javadocs are taken as the proper behavior, this is a bug in the
> current implementation, and it could easily be fixed by not writing default
> values on write.
> This is a generalization of the problem mentioned in HADOOP-3171.
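The proposed fix — serialize only the properties the client explicitly set, not the bundled defaults — can be sketched in plain Java. This is an illustrative standalone sketch using java.util.Properties, not Hadoop's actual Configuration class; the class name NonDefaultWriter and the sample property names are assumptions for the example:

```java
import java.util.Map;
import java.util.Properties;

// Sketch: keep only the entries whose values differ from the defaults,
// so serializing a jobconf does not clobber cluster-side defaults.
public class NonDefaultWriter {
    public static Properties nonDefaults(Properties defaults, Properties conf) {
        Properties out = new Properties();
        for (Map.Entry<Object, Object> e : conf.entrySet()) {
            String key = (String) e.getKey();
            String value = (String) e.getValue();
            // keep the entry only if the client explicitly overrode the default
            if (!value.equals(defaults.getProperty(key))) {
                out.setProperty(key, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("mapred.output.compress", "false");
        defaults.setProperty("mapred.map.max.attempts", "4");

        Properties jobconf = new Properties();
        jobconf.setProperty("mapred.output.compress", "false"); // same as default
        jobconf.setProperty("mapred.map.max.attempts", "6");    // client override

        // Only the explicit override would be sent to the jobtracker.
        System.out.println(nonDefaults(defaults, jobconf));
    }
}
```

With this filtering, the jobtracker's own (non-final) hadoop-site.xml values survive for every property the client left untouched.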
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.