[ 
https://issues.apache.org/jira/browse/HADOOP-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700963#action_12700963
 ] 

Topher ZiCornell commented on HADOOP-5708:
------------------------------------------

> The defaults come from the jar file, and the jar must currently have the same 
> version of the code in the cluster, so in practice we overwrite things with 
> the same values.

I think you might be making the assumption that the defaults you package with 
the jar and ship out are the only defaults that could possibly be loaded.  
That's not true.

The defaults are loaded from the classpath.  There are many ways defaults can 
be introduced for specific environments.  HOD itself took advantage of this by 
writing a hadoop-site.xml file in a directory, which then gets added by the 
hadoop script to the front of the classpath so that it's the first instance of 
that file encountered.  Extending that example a bit, HOD pulls _its_ 
defaults from a default configuration directory, which may or may not be what 
was packaged with Hadoop.  

In short, the product team doesn't (and shouldn't) need to be aware of what the 
operations team is setting as the defaults.

> anything that should not be overridden should be declared final in the 
> cluster's configuration, and otherwise the user's configuration, including 
> defaults, should be observed, no?

Actually, that's not the issue.  You're looking at the scenario where I 
hand-craft my job XML files with only the settings I want to set.

Let me clarify a bit:  I'm lazy.  I make my computer do that work for me.  It 
builds the job for me (well, for my team, but never mind that).  If I write that 
job's Configuration out, it includes all the settings of whatever the defaults 
are on the computer I'm currently on.  When that XML then gets loaded, all 
those defaults are treated as if they are user overrides, when in fact they are 
not.

In a nutshell: There is currently no way to write an XML file containing just 
my settings so that it can be loaded in again.
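
To make the round trip concrete, here is a minimal plain-Java sketch.  It does 
not use Hadoop's Configuration class at all: the RoundTripSketch class, its 
defaults and overlay maps, and the sample values are hypothetical stand-ins 
for "whatever defaults this machine loaded" and "what I explicitly set".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a configuration; not Hadoop's Configuration class. */
final class RoundTripSketch {
    /** Values picked up from whatever default files this machine loaded. */
    private final Map<String, String> defaults = new LinkedHashMap<>();
    /** Values set explicitly by the user (the "overlay"). */
    private final Map<String, String> overlay = new LinkedHashMap<>();

    public RoundTripSketch(Map<String, String> machineDefaults) {
        defaults.putAll(machineDefaults);
    }

    public void set(String key, String value) {
        overlay.put(key, value);
    }

    /** Explicit settings win; otherwise fall back to this machine's defaults. */
    public String get(String key) {
        return overlay.containsKey(key) ? overlay.get(key) : defaults.get(key);
    }

    /** Models the current behavior: defaults and overlay are flattened together. */
    public Map<String, String> writeAll() {
        Map<String, String> out = new LinkedHashMap<>(defaults);
        out.putAll(overlay);
        return out;
    }

    /** Models the requested behavior: only explicitly set values are written. */
    public Map<String, String> writeOverlay() {
        return new LinkedHashMap<>(overlay);
    }

    /** Models loading a written file: every entry is treated as a user override. */
    public void load(Map<String, String> written) {
        overlay.putAll(written);
    }

    public static void main(String[] args) {
        // The job is built on a machine whose defaults differ from the grid's.
        // The values 100 and 400 are illustrative only.
        Map<String, String> laptopDefaults = new LinkedHashMap<>();
        laptopDefaults.put("io.sort.mb", "100");
        RoundTripSketch job = new RoundTripSketch(laptopDefaults);
        job.set("mapred.job.name", "nightly");

        Map<String, String> gridDefaults = new LinkedHashMap<>();
        gridDefaults.put("io.sort.mb", "400");

        // Full write: the laptop default arrives on the grid as an override.
        RoundTripSketch full = new RoundTripSketch(gridDefaults);
        full.load(job.writeAll());
        System.out.println(full.get("io.sort.mb"));   // prints 100

        // Overlay-only write: the grid's own default is left alone.
        RoundTripSketch slim = new RoundTripSketch(gridDefaults);
        slim.load(job.writeOverlay());
        System.out.println(slim.get("io.sort.mb"));   // prints 400
    }
}
```

With the full write, the grid sees io.sort.mb=100 as if I had asked for it.  
With the overlay-only write, the grid keeps its own 400 and only the job name 
travels with the job.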

.  Topher


> Configuration should provide a way to write only properties that have been set
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-5708
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5708
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.19.1
>            Reporter: Topher ZiCornell
>            Priority: Minor
>
> The Configuration.write and .writeXml methods always output all properties, 
> whether they came from a default source, a loaded resource file, or an 
> "overlay" set call.  There should be a way to write only the properties that 
> were set, leaving out the properties that came from a default source.
> Why?  Suppose I build a configuration on a machine that is not associated 
> with a grid, write it out to XML, then try to load it on a grid gateway.  The 
> configuration would contain all of the defaults picked up from my non-grid 
> machine, and would completely overwrite all the defaults on that grid.
> I propose to add methods to write out only the overlay values in Object and 
> XML formats.
> I see two options for implementing this:
> 1) Either completely new methods could be crafted (writeOverlay(DataOutput) 
> and writeOverlayXml(OutputStream)), or 
> 2) The existing write() and writeXml() methods could be adjusted to take an 
> additional parameter indicating whether the full properties or overlay 
> properties should be written.  (Of course, the existing write() and 
> writeXml() methods would remain, defaulting to the current behavior.)
> Option 1 has less impact on existing code.  Option 2 is a cleaner 
> implementation with less code duplication involved.  I would much prefer to 
> do option 2.
> Oh, and in case it's not clear, I'm offering to make this change and submit 
> it.
> Thoughts?
> .  Topher
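
A rough sketch of the shape option 2 could take, written in plain Java.  This 
is not the Hadoop Configuration source: the OverlayWriteSketch class, the 
overlayOnly parameter name, and the setDefault and toXml helpers are 
assumptions for illustration, and the output is the java.util.Properties XML 
format rather than Hadoop's configuration format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical model of option 2; not the real Configuration class. */
final class OverlayWriteSketch {
    private final Properties defaults = new Properties();
    private final Properties overlay = new Properties();

    /** Stands in for a value loaded from a default resource. */
    public void setDefault(String key, String value) {
        defaults.setProperty(key, value);
    }

    /** Stands in for an explicit set call. */
    public void set(String key, String value) {
        overlay.setProperty(key, value);
    }

    /** Existing signature is kept and defaults to the current behavior. */
    public void writeXml(OutputStream out) throws IOException {
        writeXml(out, false);
    }

    /** New overload: when overlayOnly is true, default values are left out. */
    public void writeXml(OutputStream out, boolean overlayOnly) throws IOException {
        Properties toWrite = new Properties();
        if (!overlayOnly) {
            toWrite.putAll(defaults);
        }
        toWrite.putAll(overlay);          // explicit settings win over defaults
        toWrite.storeToXML(out, null);    // null means no comment element
    }

    /** Convenience helper for inspecting the output as a string. */
    public String toXml(boolean overlayOnly) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            writeXml(buf, overlayOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        OverlayWriteSketch conf = new OverlayWriteSketch();
        conf.setDefault("io.sort.mb", "100");
        conf.set("mapred.job.name", "nightly");
        System.out.println(conf.toXml(true));   // contains only mapred.job.name
    }
}
```

Existing callers of the one-argument method see no change, and both paths 
share a single implementation, which is the duplication argument for option 2.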

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
