PIG-3135, which you referred to earlier, lets you do that, but it is part of 0.12/trunk. You should be able to apply that patch and rebuild Pig to use it.
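For example, with that patch applied, a PigServer-based program can hand the Hadoop settings to Pig directly. This is only a minimal sketch assuming a patched build; the class name, cluster addresses, and load/store paths below are placeholders, not values from your setup:

```java
import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class ProgrammaticConfs {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();

        // With PIG-3135, this flag tells Pig to skip the classpath lookup for
        // hadoop-site.xml/core-site.xml and use the properties supplied here.
        // (Note the property name really is spelled "overriden".)
        props.setProperty("pig.use.overriden.hadoop.configs", "true");

        // Hadoop settings passed programmatically (placeholder values):
        props.setProperty("fs.default.name", "hdfs://namenode:8020");
        props.setProperty("mapred.job.tracker", "jobtracker:8021");

        PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
        pigServer.registerQuery("A = LOAD 'in' AS (line:chararray);");
        pigServer.store("A", "out");
    }
}
```

Since the configs now come from the Properties object, nothing on the classpath (or missing from it) should matter for these keys.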
Here are the notes on how you can use this feature:

########### Override hadoop configs programmatically #################
#
# By default, Pig expects hadoop configs (hadoop-site.xml and
# core-site.xml) to be present on the classpath. There are cases when
# these configs need to be passed programmatically, such as while
# using the PigServer API. In such cases, you can override hadoop
# configs by setting the property "pig.use.overriden.hadoop.configs".
#
# When this property is set to true, Pig ignores looking for hadoop
# configs in the classpath and instead picks them up from the
# Properties/Configuration object passed to it.
#
# pig.use.overriden.hadoop.configs=true
#
######################################################################

On Tue, Aug 6, 2013 at 7:44 PM, Suhas Satish <[email protected]> wrote:

> The parameter values in the default file are all marked as public static
> final. That explains why they were not being overridden by *site.xml.
>
> Cheers,
> Suhas.
>
> On Tue, Aug 6, 2013 at 5:32 PM, Suhas Satish <[email protected]> wrote:
>
> > None of the parameters in mapred-site.xml are respected. They are being
> > overridden by default configurations in the following file:
> > hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
> >
> >   public String get(String name) {
> >     String val = getProps().getProperty(name);
> >     if (val == null) {
> >       val = CustomConf.getDefault(name);
> >     }
> >     return substituteVars(val);
> >   }
> >
> > Running via Java APIs, not via the grunt shell.
> > mapred-site.xml exists on the classpath.
> >
> > If I have to make a change to add the resource mapred-site.xml in Apache
> > Pig's org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java,
> > the following changes aren't enough:
> >
> >   private static final String MAPRED_SITE = "mapred-site.xml";
> >   jc.addResource(MAPRED_SITE);
> >   recomputeProperties(jc, properties);
> >
> > My question is, how should I add a new configuration resource file
> > myconfig-site.xml to Pig and get Pig to use it without making changes to
> > the hadoop layer (Configuration.java or JobConf.java)?
> >
> > Cheers,
> > Suhas.
> >
> > On Tue, Aug 6, 2013 at 1:32 PM, Prashant Kommireddi
> > <[email protected]> wrote:
> >
> >> Can you tell us how exactly you are running the pig script? Is your
> >> mapred-site.xml on the classpath? Are you trying to run this via grunt
> >> or Java APIs?
> >>
> >> On Tue, Aug 6, 2013 at 1:16 PM, Suhas Satish <[email protected]>
> >> wrote:
> >>
> >> > I am running pig on a custom hadoop implementation, but it doesn't
> >> > respect params in mapred-site.xml.
> >> >
> >> > Looking into the code, I find that the following 2 files are slightly
> >> > different from stock hadoop in that some patches are not present:
> >> >
> >> > hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
> >> > and
> >> > src/mapred/org/apache/hadoop/mapred/JobConf.java
> >> >
> >> > Given the constraint that I cannot modify these files, what change
> >> > should I make within pig to recognize mapred-site.xml parameters?
> >> >
> >> > I pulled in PIG-3135 and PIG-3145, which make changes to
> >> > org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
> >> >
> >> > But the params in mapred-site.xml are still not getting recognized.
> >> > Upon remote eclipse debugging with breakpoints in the file above,
> >> > this is what I found:
> >> >
> >> > HExecutionEngine.java: jc = new JobConf() calls:
> >> >
> >> > 1st call, on the JobConf() constructor:
> >> >   Configuration.get(String)
> >> >   Configuration.getProps() --> if properties == null, properties =
> >> >   new Properties(); loadResources(properties, resources...);
> >> >
> >> > JobConf static constructor:
> >> >   Configuration.addDefaultResource(mapred-site.xml)
> >> >
> >> > HExecutionEngine: 2nd call, jc.addResource("mapred-site.xml"):
> >> >   Configuration.get
> >> >   val = getProps().getProperty(name)
> >> >   if (val == null) val =
> >> >   <custom_hadoop_impl_configuration_object>.getDefault(name);
> >> >
> >> > HExecutionEngine: 3rd call, recomputeProperties(jc, properties) -->
> >> > clearing properties which were added, so they get reloaded again.
> >> >
> >> > What do I need to do to make sure the getProps().getProperty call is
> >> > not null, so that the mapred-site.xml values are not overridden by
> >> > defaults in the custom hadoop implementation?
> >> >
> >> > Thanks,
> >> > Suhas.
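To make the failure mode in that trace concrete: the quoted get(String) only falls back to the custom defaults when the loaded properties have no value for the key, so if the site file never makes it into the Properties object, the compile-time defaults always win. Here is a small self-contained simulation of that lookup order; ConfigLookupDemo, getDefault, and the timeout values are hypothetical stand-ins for the custom hadoop implementation, not real Hadoop classes:

```java
import java.util.Properties;

public class ConfigLookupDemo {

    // Stands in for the custom hadoop implementation's compile-time defaults
    // (the "public static final" values that were shadowing *site.xml).
    static String getDefault(String name) {
        if ("mapred.task.timeout".equals(name)) {
            return "600000"; // hypothetical built-in default
        }
        return null;
    }

    // Mirrors the quoted Configuration.get(): loaded properties first,
    // then fall back to the hard-coded defaults.
    static String get(Properties props, String name) {
        String val = props.getProperty(name);
        if (val == null) {
            val = getDefault(name);
        }
        return val;
    }

    public static void main(String[] args) {
        // Case 1: mapred-site.xml never got loaded into props (the reported
        // bug, e.g. after recomputeProperties cleared them): default wins.
        Properties empty = new Properties();
        System.out.println(get(empty, "mapred.task.timeout"));

        // Case 2: the site value is loaded into props up front, which is
        // what PIG-3135's override mode arranges via the Properties object
        // passed to PigServer: the site value wins.
        Properties loaded = new Properties();
        loaded.setProperty("mapred.task.timeout", "1200000");
        System.out.println(get(loaded, "mapred.task.timeout"));
    }
}
```

So the fix is not to change the fallback, but to ensure your values are already in the Properties object before the lookup happens, which is exactly what the override flag above does.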
