[ https://issues.apache.org/jira/browse/PIG-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851489#action_12851489 ]
Pradeep Kamath commented on PIG-1338: ------------------------------------- I haven't done a full review but had a comment on one of the changes which is pretty important: {noformat} Index: src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java =================================================================== --- src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java (revision 928370) +++ src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java (working copy) @@ -30,7 +30,9 @@ public static Configuration toConfiguration(Properties properties) { assert properties != null; - final Configuration config = new Configuration(); + final Configuration config = new Configuration(false); + config.addResource("core-default.xml"); + config.addResource("mapred-default.xml"); final Enumeration<Object> iter = properties.keys(); while (iter.hasMoreElements()) { final String key = (String) iter.nextElement() {noformat} Looking at the Configuration class's implementation I found the following code: {noformat} static{ //print deprecation warning if hadoop-site.xml is found in classpath ClassLoader cL = Thread.currentThread().getContextClassLoader(); if (cL == null) { cL = Configuration.class.getClassLoader(); } if(cL.getResource("hadoop-site.xml")!=null) { LOG.warn("DEPRECATED: hadoop-site.xml found in the classpath. " + "Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, " + "mapred-site.xml and hdfs-site.xml to override properties of " + "core-default.xml, mapred-default.xml and hdfs-default.xml " + "respectively"); } addDefaultResource("core-default.xml"); addDefaultResource("core-site.xml"); } private void loadResources(Properties properties, ArrayList resources, boolean quiet) { if(loadDefaults) { for (String resource : defaultResources) { loadResource(properties, resource, quiet); } //support the hadoop-site.xml as a deprecated case if(getResource("hadoop-site.xml")!=null) { loadResource(properties, "hadoop-site.xml", quiet); } } for (Object resource : resources) { loadResource(properties, resource, quiet); } } {noformat} There are two questions related to the code in Configuration Vs the change in this patch: 1) In the patch, core-default.xml and mapred-default.xml are added as resources while in Configuration core-default.xml and core-site.xml are added by default 2) In the patch, hadoop-site.xml is not considered while in Configuration, it is - so if a hadoop 20.x cluster is installed with hadoop-site.xml configured and without the other .xml files (like core-default.xml etc.) then pig would not get the cluster config information right? > Pig should exclude hadoop conf in local mode > -------------------------------------------- > > Key: PIG-1338 > URL: https://issues.apache.org/jira/browse/PIG-1338 > Project: Pig > Issue Type: Improvement > Components: impl > Affects Versions: 0.7.0 > Reporter: Daniel Dai > Assignee: Daniel Dai > Attachments: PIG-1338-1.patch, PIG-1338-2.patch > > > Currently, the behavior for hadoop conf look up is: > * in local mode, if there is hadoop conf, bail out; if there is no hadoop > conf, launch local mode > * in hadoop mode, if there is hadoop conf, use this conf to launch Pig; if > no, still launch without warning, but many functionality will go wrong > We should bring it to a more intuitive way, which is: > * in local mode, always launch Pig in local mode > * in hadoop mode, if there is hadoop conf, use this conf to launch Pig; if > no, bail out with a meaningful message -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.