I am running Pig on a custom Hadoop implementation, but it does not respect
parameters set in mapred-site.xml.

Looking into the code, I found that the following two files differ slightly
from stock Hadoop in that some patches are not present:

hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
and
src/mapred/org/apache/hadoop/mapred/JobConf.java

Given the constraint that I cannot modify these files, what change should I
make within Pig so that it recognizes the mapred-site.xml parameters?

I pulled in PIG-3135 and PIG-3145, which make changes to
org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java.

But the parameters in mapred-site.xml are still not being picked up. Remote
debugging in Eclipse, with breakpoints set in the file above, shows the
following:

HExecutionEngine.java - jc = new JobConf()

1st call, the JobConf() constructor:
    Configuration.get(String)
        Configuration.getProps() --> if properties == null, properties = new
        Properties(); loadResources(properties, resources, ...)

JobConf static initializer:
    Configuration.addDefaultResource("mapred-site.xml")

2nd call in HExecutionEngine: jc.addResource("mapred-site.xml")
    Configuration.get
        val = getProps().getProperty(name);
        if (val == null) val =
            <custom_hadoop_impl_configuration_object>.getDefault(name);

3rd call in HExecutionEngine: recomputeProperties(jc, properties)
    --> clears the properties that were just added, so they get loaded all
    over again.
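
For reference, a minimal standalone probe along these lines (the key name is
just an example, not one of my actual settings) should show whether a bare
JobConf picks up mapred-site.xml at all, independent of Pig:

    // Minimal probe: does a bare JobConf see values from mapred-site.xml?
    import org.apache.hadoop.mapred.JobConf;

    public class MapredSiteProbe {
        public static void main(String[] args) {
            // JobConf's static initializer calls
            // Configuration.addDefaultResource("mapred-site.xml"), so the
            // file should be loaded if it is visible on the classpath.
            JobConf jc = new JobConf();

            // Prints null if getProps() never loaded mapred-site.xml, in
            // which case the custom getDefault(name) fallback wins.
            System.out.println(jc.get("mapred.job.tracker"));
        }
    }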


What do I need to do to make sure the getProps().getProperty(name) call is
not null, so that the mapred-site.xml values are not overridden by the
defaults in the custom Hadoop implementation?
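
For example, would something along these lines inside Pig be the right
direction, or is there a cleaner way? (The path and key below are just
placeholders.)

    // Possible workaround sketch: read mapred-site.xml from an explicit
    // location and copy the resolved entries into a java.util.Properties,
    // so that getProps().getProperty(name) is non-null before the custom
    // getDefault(name) fallback is consulted.
    import java.util.Map;
    import java.util.Properties;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class ForceMapredSite {
        public static void main(String[] args) {
            JobConf jc = new JobConf();
            // Add the file by absolute path instead of relying on the
            // classpath-based default resource lookup.
            jc.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));

            // Copy every resolved entry into a Properties object, e.g. the
            // one that later gets passed to recomputeProperties(jc, properties).
            Properties props = new Properties();
            for (Map.Entry<String, String> e : jc) {
                props.setProperty(e.getKey(), e.getValue());
            }
            System.out.println(props.getProperty("mapred.job.tracker"));
        }
    }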

Thanks,
Suhas.
