On Wed, Oct 13, 2010 at 10:09 AM, Henning Blohm
<[email protected]> wrote:
> Hi,
>
> I used
>
> Configuration cfg = new Configuration();
> cfg.addResource(InputStream)
>
> on org.apache.hadoop.conf.Configuration to create a Hadoop configuration.
> That configuration is
> used to construct HBaseConfiguration objects several times later on:
>
> HBaseConfiguration hbcfg = new HBaseConfiguration(cfg);
>
In TRUNK HBaseConfiguration is deprecated. Do you see the same phenomenon
when you do HBaseConfiguration.create(cfg)?
See below...
> The second time, this leads to an error, as the passed-in configuration
> object is asked to load its properties again, trying to read from the
> input stream once more (the stream is memorized!!).
>
> The fact that the properties of the passed-in configuration object have
> been reset is somewhat surprising: when the HBase configuration object
> constructed above is used in HBaseAdmin, it will eventually load JobConf,
> which has a static initializer that calls
>
> Configuration.addDefaultResource("mapred-default.xml");
>
> which in turn goes through Configuration.REGISTRY, which holds on to any
> not yet collected Configuration objects and (hold your breath!) forces
> them to reload their configuration by calling reloadConfiguration, which
> (now we are there) sets properties=null.
>
> Did anybody follow that....
>
> It seems there are some surprising side effects in hadoop/hbase
> configuration handling.
>
> Wouldn't it be better to have the default resources (programmatically)
> defined once in Configuration
> and not (even think about) touch already instantiated config objects
> later on?
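[The failure mode described above can be reproduced with plain JDK classes: an InputStream can only be consumed once, so any forced reload that re-reads a memorized stream comes up empty. Hadoop's Configuration parses the stream as XML and therefore fails outright on the second read; java.util.Properties just silently reads nothing, which makes the one-shot behavior easy to see. The class and method names below are made up for illustration, not Hadoop API.]

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class OneShotStreamDemo {
    // Load the same InputStream twice, analogous to reloadConfiguration()
    // re-reading a memorized stream resource.
    public static int loadTwice() {
        InputStream in = new ByteArrayInputStream(
                "hbase.rootdir=/tmp/hbase\n".getBytes());
        try {
            Properties first = new Properties();
            first.load(in);        // consumes the stream entirely
            Properties second = new Properties();
            second.load(in);       // stream is at EOF: nothing is read
            return second.size();  // the "reload" silently loses everything
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("properties seen on second load: " + loadTwice());
    }
}
```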
>
Yes. That sounds completely reasonable.
See if HBaseConfiguration.create(cfg) gives you the same issue. If
so, then it's just the way Hadoop Configuration currently works. I haven't
spent time on it in a while, but I remember getting into interesting
scenarios loading properties. If you can confine the issue some, let's
file an issue up in hadoop common?
Thanks Henning,
St.Ack
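[Until something like the change proposed above lands in Configuration, one client-side mitigation is to buffer the stream's bytes up front and hand every load a fresh stream, so a forced reload can re-read the resource. This is a hypothetical JDK-only sketch (BufferedResourceDemo and freshStream are invented names, not Hadoop API); with Hadoop itself the equivalent would be to avoid the InputStream overload of addResource, e.g. by passing a Path or resource name instead.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class BufferedResourceDemo {
    // Copy the stream's bytes once; hand out a fresh InputStream per load.
    private final byte[] buffered;

    public BufferedResourceDemo(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        buffered = out.toByteArray();
    }

    public InputStream freshStream() {
        // Re-readable any number of times.
        return new ByteArrayInputStream(buffered);
    }

    public static int loadTwiceBuffered() {
        try {
            BufferedResourceDemo res = new BufferedResourceDemo(
                    new ByteArrayInputStream("hbase.rootdir=/tmp/hbase\n".getBytes()));
            Properties first = new Properties();
            first.load(res.freshStream());
            Properties second = new Properties();
            second.load(res.freshStream());  // fresh stream: the reload sees the data
            return second.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("properties seen on second load: " + loadTwiceBuffered());
    }
}
```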