Reloading configuration when using inputstream resources results in
org.xml.sax.SAXParseException
-------------------------------------------------------------------------------------------------
Key: HADOOP-7614
URL: https://issues.apache.org/jira/browse/HADOOP-7614
Project: Hadoop Common
Issue Type: Bug
Components: conf
Affects Versions: 0.21.0
Reporter: Ferdy
Priority: Minor
When using an inputstream as a resource for configuration, reloading this
configuration will throw the following exception:
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1576)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1445)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1381)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:569)
...
Caused by: org.xml.sax.SAXParseException: Premature end of file.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1504)
... 4 more
To reproduce, see the following test code:
Configuration conf = new Configuration();
ByteArrayInputStream bais = new ByteArrayInputStream("<configuration></configuration>".getBytes());
conf.addResource(bais);
System.out.println(conf.get("blah")); // first read consumes the stream
// Adding any named resource triggers a reload; it doesn't matter which one.
conf.addResource("core-site.xml");
System.out.println(conf.get("blah")); // the reload re-reads bais and throws
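The underlying cause can be shown without Hadoop. The sketch below is only an illustration: it assumes the stream ends up in DocumentBuilder.parse, as the stack trace suggests, and the class name StreamReparseDemo is made up for this example. Once the first parse has consumed the stream, a second parse of the same stream has nothing left to read.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not part of Hadoop.
final class StreamReparseDemo {

    /** Parses the same stream twice; returns true if the second parse fails. */
    static boolean secondParseFails() throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputStream in = new ByteArrayInputStream(
            "<configuration></configuration>".getBytes(StandardCharsets.UTF_8));

        builder.parse(in); // first parse reads the stream to its end

        try {
            builder.parse(in); // the stream is already exhausted
            return false;
        } catch (SAXParseException e) {
            return true; // the parser sees an empty document
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("second parse fails: " + secondParseFails());
    }
}
```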
Allowing inputstream resources is flexible, but in cases such as this it can
lead to difficult-to-debug problems.
What do you think is the best solution? We could:
A) reset the inputstream after it is read instead of closing it (but what to do
when the stream does not support marking?)
B) leave it up to the client (for example make sure you implement close() so
that it resets the stream)
C) when reading the inputstream for the first time, cache or wrap the contents
somehow so that it can be read multiple times (let's at least document it)
D) remove inputstream method altogether
E) something else?
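As a rough illustration of option C, the contents could be buffered once and replayed on every reload. This is a minimal sketch, not a proposal for the actual Configuration internals: CachedStreamResource and its open() method are hypothetical names, and it assumes the resource is small enough to hold in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Hadoop: reads a stream once and
// hands out a fresh stream over the cached bytes on every call.
final class CachedStreamResource {
    private final byte[] contents;

    CachedStreamResource(InputStream in) throws IOException {
        this.contents = readAll(in);
    }

    /** Drains the stream into a byte array and closes it. */
    static byte[] readAll(InputStream in) throws IOException {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        }
    }

    /** Returns a new, unread stream over the cached contents. */
    InputStream open() {
        return new ByteArrayInputStream(contents);
    }
}
```

With something like this, each reload would call open() instead of reusing the original stream, at the cost of keeping the bytes in memory for the lifetime of the configuration.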
For now I have attached a patch for solution A.
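The attached patch is not reproduced here. Purely as an illustration of the idea behind option A, a mark/reset approach might look like the sketch below. ResettableRead and readAndReset are hypothetical names, and throwing an exception when marking is unsupported is just one possible answer to the open question above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not taken from the attached patch.
final class ResettableRead {

    /** Reads the whole stream, then rewinds it so it can be read again. */
    static byte[] readAndReset(InputStream in) throws IOException {
        if (!in.markSupported()) {
            // Assumption: fail loudly rather than silently break later reloads.
            throw new IOException("stream does not support mark/reset");
        }
        in.mark(Integer.MAX_VALUE); // remember the current position

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }

        in.reset(); // rewind instead of closing
        return out.toByteArray();
    }
}
```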