Hey Chris,

The dfs.replication param is an exception to the <final> config feature. A
client using the FileSystem API can pass in any short value it wants as the
replication factor. This bypasses the configuration entirely, and since
replication is a per-file attribute, the dfs.replication property is only
read on the client side anyway.
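For illustration, here is a minimal sketch of such a client (the path, the
factor of 10 and the file contents are made up for the example):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationOverride {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Ask for a replication of 10 at create time, no matter what
        // dfs.replication in hdfs-site.xml says (final or not).
        Path p = new Path("/tmp/replication-demo.txt");
        FSDataOutputStream out = fs.create(p, true,
            conf.getInt("io.file.buffer.size", 4096),
            (short) 10,                 // replication chosen by the client
            fs.getDefaultBlockSize());
        out.writeBytes("hello");
        out.close();

        // An existing file's replication can be changed the same way.
        fs.setReplication(p, (short) 10);
      }
    }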
The right way for an administrator to enforce a "max" replication value at
create/setrep level would be to set dfs.replication.max to a desired value
at the NameNode and restart it (see the snippet after the quoted thread
below).

On Tue, Oct 16, 2012 at 12:48 AM, Chris Nauroth <[email protected]> wrote:
> Hello Patai,
>
> Has your configuration file change been copied to all nodes in the cluster?
>
> Are there applications connecting from outside of the cluster? If so, then
> those clients could have separate configuration files or code setting
> dfs.replication (and other configuration properties). These would not be
> limited by final declarations in the cluster's configuration files.
> <final>true</final> controls configuration file resource loading, but it
> does not necessarily block different nodes or different applications from
> running with completely different configurations.
>
> Hope this helps,
> --Chris
>
>
> On Mon, Oct 15, 2012 at 12:01 PM, Patai Sangbutsarakum
> <[email protected]> wrote:
>>
>> Hi Hadoopers,
>>
>> I have
>>
>> <property>
>>   <name>dfs.replication</name>
>>   <value>2</value>
>>   <final>true</final>
>> </property>
>>
>> set in hdfs-site.xml in our staging environment cluster. The staging
>> cluster runs code that will later be deployed to production, and that
>> code tries to set dfs.replication to 3, 10, 50, or values other than
>> 2; numbers the developers thought would fit the production environment.
>>
>> Even though I have marked dfs.replication final in the staging cluster,
>> every time I run fsck there it still reports under-replicated blocks.
>> I thought the final keyword meant the value in the job config would not
>> be honored, but that doesn't seem to be the case when I run fsck.
>>
>> I am on cdh3u4.
>>
>> Please suggest.
>> Patai
>
>

--
Harsh J
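For example, an hdfs-site.xml entry on the NameNode like the following
(the value of 2 matches the staging setup above, but is illustrative)
would cap replication cluster-wide:

    <property>
      <name>dfs.replication.max</name>
      <value>2</value>
    </property>

Once the NameNode is restarted with this in place, create and setrep
requests asking for more replicas than the cap are rejected on the server
side, so clients cannot bypass it the way they can bypass dfs.replication.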
