Regarding the ability to include other files in configuration files: http://issues.apache.org/jira/browse/HADOOP-4944
I'm seeing apparently differing behavior that may make sense to someone. Say I have my core-site.xml as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- various normal <property> tags here... -->
  <xi:include href="include-site.xml"/>
</configuration>

include-site.xml contains only multiple <property> elements, in the same form as entries in core-site.xml (or hdfs-site or mapred-site).

Using the normal Apache Hadoop 0.20.1 installation (on an OS X box), that works fine. Using a current Cloudera 0.20.1 install (on CentOS), it fails with the following error on any command-line operation, such as "hadoop fs -ls":

[Fatal Error] include-site.xml:6:2: The markup in the document following the root element must be well-formed.
[Fatal Error] core-site.xml:8:40: Error attempting to parse XML file (href='include-site.xml').
09/12/28 18:09:05 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException: Error attempting to parse XML file (href='include-site.xml').
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Error attempting to parse XML file (href='include-site.xml').
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1266)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1125)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1064)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:447)
        at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:627)
        at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:290)
        at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:375)
        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:138)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:59)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1880)
Caused by: org.xml.sax.SAXParseException: Error attempting to parse XML file (href='include-site.xml').
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1174)
        ... 11 more

Line 6 in include-site.xml (as referenced at the top of the error) is the line with the second <property> element, and line 8 of core-site.xml is the one with the include.

If I put the property entries in multiple include files, one property each, and include each file separately, it works.

Thanks for any insight,
Derek
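
P.S. The first error reads like a parser complaining about a file with more than one top-level element, which is what include-site.xml is, and that would also fit the one-property-per-file workaround. Below is a minimal sketch in Python that checks that reading outside Hadoop. It uses Python's own XML parser, not the Xerces parser from the stack trace, and the property names and values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical include file contents: two sibling <property> elements
# with no single element wrapping them.
MULTI_ROOT = """<property><name>a</name><value>1</value></property>
<property><name>b</name><value>2</value></property>"""

# Hypothetical include file holding exactly one <property> element.
ONE_PROPERTY = "<property><name>a</name><value>1</value></property>"

# The same two properties wrapped in a single root element.
SINGLE_ROOT = "<configuration>" + MULTI_ROOT + "</configuration>"


def is_well_formed(text):
    """Return True if text parses as a standalone XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for label, text in [("multi-root", MULTI_ROOT),
                        ("one property", ONE_PROPERTY),
                        ("single root", SINGLE_ROOT)]:
        print(label, "well-formed:", is_well_formed(text))
```

This only tests whether each string is a well-formed document on its own. It does not show why the two installs behave differently, and it does not test whether Hadoop accepts a wrapper element inside an included file.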
