This is an automated email from the ASF dual-hosted git repository.
slawrence pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/daffodil.git
The following commit(s) were added to refs/heads/main by this push:
new 3f31018d3 Add DaffodilUnparseContentHandler.finish() function to SAX
unparse API
3f31018d3 is described below
commit 3f31018d3a5756736c8408a173df31f97595bb43
Author: Steve Lawrence <[email protected]>
AuthorDate: Mon Oct 13 07:57:28 2025 -0400
Add DaffodilUnparseContentHandler.finish() function to SAX unparse API
When unparsing using the SAX API, if an XMLReader ends the parse early
(usually due to invalid XML), then it stops sending events to the
DaffodilUnparseContentHandler. This can lead to dangling threads since the
DaffodilUnparseContentHandler and SAXInfosetInputter are coroutines,
which use threads behind the scenes.
To fix this, we add a new finish() method to the
DaffodilUnparseContentHandler which should be called by SAX unparse API
users after the XMLReader parse completes, regardless of success. For
example,
    val contentHandler = dp.newContentHandlerInstance(...)
    xmlReader.setContentHandler(contentHandler)
    try {
      xmlReader.parse(...)
    } catch (...) {
      ...
    } finally {
      contentHandler.finish()
    }
The new finish function cleans up the coroutines by setting the next
event to null, which indicates to the SAXInfosetInputter coroutine that
no more events are coming, and leads to ending the unparse and the
coroutine. This now also means getUnparseResult will always return a
result, even if the XMLReader sees invalid XML, which makes our API a
bit more consistent.
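To illustrate why the finally block matters, here is a minimal sketch
that uses only the JDK's SAX classes. RecordingHandler is a hypothetical
stand-in for the DaffodilUnparseContentHandler (it is not Daffodil code),
and its finish() only records that it was called. The SAX ContentHandler
documentation says clients should make no assumptions about whether
endDocument() is invoked after a fatal error, so a handler cannot rely on
that event for cleanup:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class FinishSketch {

  // Hypothetical stand-in for DaffodilUnparseContentHandler; not Daffodil code.
  static class RecordingHandler extends DefaultHandler {
    boolean sawEndDocument = false;
    boolean finished = false;

    @Override
    public void endDocument() {
      sawEndDocument = true;
    }

    // Plays the role of finish(): cleanup that must run no matter how parse() ended.
    void finish() {
      finished = true;
    }
  }

  // Parses the given XML text and always calls finish(), even on invalid XML.
  static RecordingHandler run(String xml) throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true);
    XMLReader xmlReader = factory.newSAXParser().getXMLReader();
    RecordingHandler handler = new RecordingHandler();
    xmlReader.setContentHandler(handler);
    try {
      xmlReader.parse(new InputSource(new StringReader(xml)));
    } catch (SAXException e) {
      // The XMLReader gave up (for example, malformed XML). No further
      // events will be delivered to the handler after this point.
    } finally {
      handler.finish();
    }
    return handler;
  }

  public static void main(String[] args) throws Exception {
    RecordingHandler ok = run("<a><b/></a>");
    RecordingHandler bad = run("<a><b></a>"); // mismatched tags
    System.out.println("well-formed: endDocument=" + ok.sawEndDocument
        + " finished=" + ok.finished);
    System.out.println("malformed:   endDocument=" + bad.sawEndDocument
        + " finished=" + bad.finished);
  }
}
```

Running main shows whether the JDK's bundled parser delivered
endDocument() for the malformed input. Either way, finish() ran because
it sits in the finally block.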
Note that this uncovered a case where the InfosetInputter.initialize()
method could throw an UnparseError if there was no StartDocument event,
which was not caught correctly. Fixing this requires moving the
initialize call inside the try/catch block, which is after the UState is
created. This means the UState can no longer assert that the
InfosetInputter is initialized, but this isn't necessary for UState
creation to succeed, as long as the infoset inputter is initialized
shortly thereafter.
- Change DaffodilUnhandledSAXException from a SAXException to a
RuntimeException. This exception generally indicates a bug and so we
should not require that it be caught. Instead, it should be handled
just like any other RuntimeException.
Deprecation/Compatibility:
A new DaffodilUnparseContentHandler.finish() method is added for users
of the SAX unparse API. This should be called after the
XMLReader.parse() method ends, whether it returns normally or throws, to
ensure all internal state is cleaned up and an unparse result is
available. For example:
    DaffodilUnparseContentHandler contentHandler =
        dp.newContentHandlerInstance(...);
    xmlReader.setContentHandler(contentHandler);
    try {
      xmlReader.parse(...);
    } catch (...) {
      ...
    } finally {
      contentHandler.finish();
    }
    UnparseResult ur = contentHandler.getUnparseResult();
Deprecation/Compatibility:
The DaffodilUnhandledSAXException, which can be thrown when unparsing
using the SAX API and usually indicates a Daffodil bug, is changed from
a SAXException to a RuntimeException. It is no longer a checked
exception and does not need to be caught. It should usually be handled
just like any other unchecked RuntimeException.
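As a rough illustration of what this change means for callers, the
sketch below contrasts a checked SAXException subclass with an unchecked
RuntimeException subclass. Both exception classes and the before()/after()
methods are hypothetical stand-ins invented for this example; they are
not the Daffodil classes:

```java
import org.xml.sax.SAXException;

public class UncheckedSketch {

  // Hypothetical stand-in for the old, checked form of the exception.
  static class CheckedSketchException extends SAXException {
    CheckedSketchException(String msg) {
      super(msg);
    }
  }

  // Hypothetical stand-in for the new, unchecked form of the exception.
  static class UncheckedSketchException extends RuntimeException {
    UncheckedSketchException(String msg) {
      super(msg);
    }
  }

  // Checked: the compiler requires the throws clause and a handler in callers.
  static void before() throws CheckedSketchException {
    throw new CheckedSketchException("old style");
  }

  // Unchecked: no throws clause is needed and callers may let it propagate.
  static void after() {
    throw new UncheckedSketchException("new style");
  }

  static String describe() {
    StringBuilder sb = new StringBuilder();
    try {
      before();
    } catch (SAXException e) {
      sb.append("checked: ").append(e.getMessage());
    }
    try {
      after();
    } catch (RuntimeException e) {
      // Catching here is optional; shown only to make the outcome visible.
      sb.append("; unchecked: ").append(e.getMessage());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}
```

One consequence for existing code: a catch clause written for
SAXException no longer matches the unchecked form, so any handling has to
target RuntimeException (or the specific class) instead.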
DAFFODIL-2768
---
.../api/DaffodilUnparseContentHandler.java | 22 +++++---
.../exceptions/DaffodilUnhandledSAXException.java | 2 +-
.../java/org/apache/daffodil/api/package-info.java | 64 ++++++++++++----------
.../runtime1/infoset/SAXInfosetInputter.scala | 11 +++-
.../DaffodilUnparseContentHandlerImpl.scala | 30 ++++++++--
.../runtime1/processors/DataProcessor.scala | 2 +-
.../runtime1/processors/unparsers/UState.scala | 1 -
.../java/org/apache/daffodil/jexample/TestAPI.java | 9 ++-
.../core/processor/TestSAXParseUnparseAPI.scala | 4 ++
.../core/processor/TestSAXUnparseAPI.scala | 29 ++++++++--
.../org/apache/daffodil/sexample/TestAPI.scala | 5 +-
.../processor/tdml/DaffodilTDMLDFDLProcessor.scala | 4 +-
.../section00/general/infosetIgnorableContent.xml | 27 +++++++++
.../section00/general/testUnparserGeneral.tdml | 6 ++
.../section00/general/TestUnparserGeneral.scala | 2 +
15 files changed, 160 insertions(+), 58 deletions(-)
diff --git
a/daffodil-core/src/main/java/org/apache/daffodil/api/DaffodilUnparseContentHandler.java
b/daffodil-core/src/main/java/org/apache/daffodil/api/DaffodilUnparseContentHandler.java
index 7ee45e1f2..41b585b3e 100644
---
a/daffodil-core/src/main/java/org/apache/daffodil/api/DaffodilUnparseContentHandler.java
+++
b/daffodil-core/src/main/java/org/apache/daffodil/api/DaffodilUnparseContentHandler.java
@@ -17,7 +17,6 @@
package org.apache.daffodil.api;
-import org.apache.daffodil.api.exceptions.DaffodilUnhandledSAXException;
import org.apache.daffodil.api.exceptions.DaffodilUnparseErrorSAXException;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
@@ -28,26 +27,35 @@ import org.xml.sax.Locator;
*/
public interface DaffodilUnparseContentHandler extends ContentHandler {
/**
- * Returns the result of the SAX unparse containing diagnostic information.
In the case of an
- * DaffodilUnhandledSAXException, this will return null.
+ * Returns the result of the SAX unparse containing diagnostic information.
The {@link finish()}
+ * method should be called prior to calling this function.
+ *
+ * If the XMLReader parse method throws a DaffodilUnhandledSAXException,
which generally indicates
+ * a bug, this will return null.
*
* @return result of the SAX unparse containing diagnostic information
*/
UnparseResult getUnparseResult();
+ /**
+ * Ensure calls to {@link getUnparseResult()} return a value and clean up
internal state. This
+ * should be called after XMLReader parsing has ended, even if the XMLReader
throws an exception.
+ */
+ void finish();
+
void setDocumentLocator(Locator locator);
- void startDocument() throws DaffodilUnparseErrorSAXException,
DaffodilUnhandledSAXException;
+ void startDocument() throws DaffodilUnparseErrorSAXException;
- void endDocument() throws DaffodilUnparseErrorSAXException,
DaffodilUnhandledSAXException;
+ void endDocument() throws DaffodilUnparseErrorSAXException;
void startPrefixMapping(String prefix, String uri);
void endPrefixMapping(String prefix);
- void startElement(String uri, String localName, String qName, Attributes
attributes) throws DaffodilUnparseErrorSAXException,
DaffodilUnhandledSAXException;
+ void startElement(String uri, String localName, String qName, Attributes
attributes) throws DaffodilUnparseErrorSAXException;
- void endElement(String uri, String localName, String qName) throws
DaffodilUnparseErrorSAXException, DaffodilUnhandledSAXException;
+ void endElement(String uri, String localName, String qName) throws
DaffodilUnparseErrorSAXException;
void characters(char[] ch, int start, int length);
diff --git
a/daffodil-core/src/main/java/org/apache/daffodil/api/exceptions/DaffodilUnhandledSAXException.java
b/daffodil-core/src/main/java/org/apache/daffodil/api/exceptions/DaffodilUnhandledSAXException.java
index 60ee3acd6..6c4ea48ec 100644
---
a/daffodil-core/src/main/java/org/apache/daffodil/api/exceptions/DaffodilUnhandledSAXException.java
+++
b/daffodil-core/src/main/java/org/apache/daffodil/api/exceptions/DaffodilUnhandledSAXException.java
@@ -26,7 +26,7 @@ import org.xml.sax.SAXException;
* the {@code DaffodilUnparseContentHandler.getUnparseResult} returns null.
This most
* likely represents a bug in Daffodil.
*/
-public class DaffodilUnhandledSAXException extends SAXException {
+public class DaffodilUnhandledSAXException extends RuntimeException {
/**
* constructor for error message only
*
diff --git
a/daffodil-core/src/main/java/org/apache/daffodil/api/package-info.java
b/daffodil-core/src/main/java/org/apache/daffodil/api/package-info.java
index 0d7527535..e22ba4fda 100644
--- a/daffodil-core/src/main/java/org/apache/daffodil/api/package-info.java
+++ b/daffodil-core/src/main/java/org/apache/daffodil/api/package-info.java
@@ -250,41 +250,47 @@
* data ought to be written to. Any XMLReader implementation is permissible,
as long as they have
* XML Namespace support.
*
- * <pre>
- * {@code
- * ByteArrayInputStream is = new ByteArrayInputStream(data);
- * ByteArrayOutputStream os = new ByteArrayOutputStream();
- * WritableByteChannel wbc = java.nio.channels.Channels.newChannel(os);
- * DaffodilUnparseContentHandler unparseContentHandler =
dp.newContentHandlerInstance(wbc);
- * try {
- * XMLReader xmlReader =
SAXParserFactory.newInstance().newSAXParser().getXMLReader();
- * xmlReader.setContentHandler(unparseContentHandler)
- * xmlReader.parse(is)
- * } catch (ParserConfigurationException | SAXException e) {
- * ...
- * } catch (DaffodilUnparseErrorSAXException | DaffodilUnhandledSAXException
e) {
- * ...
- * }
- * }
- * </pre>
- *
- * The call to the XMLReader.parse method must be wrapped in a try/catch, as
+ * The call to the XMLReader.parse method must be wrapped in a try/catch, as
the
* {@link org.apache.daffodil.api.DaffodilUnparseContentHandler} relies on
throwing an exception to
- * end processing in the case of any errors/failures.
- * There are two kinds of errors to expect
- * {@link
org.apache.daffodil.api.exceptions.DaffodilUnparseErrorSAXException}, for the
case when the
- * {@link org.apache.daffodil.api.UnparseResult#isError()} is true, and
- * {@link org.apache.daffodil.api.exceptions.DaffodilUnhandledSAXException},
for any other errors.
- *
- * In the case of an {@link
org.apache.daffodil.api.exceptions.DaffodilUnhandledSAXException},
- * {@link
org.apache.daffodil.api.DaffodilUnparseContentHandler#getUnparseResult()} will
return null.
+ * end processing in the case of any errors/failures. There are two kinds of
exceptions it could throw:
+ * <ul>
+ * <li>{@link
org.apache.daffodil.api.exceptions.DaffodilUnparseErrorSAXException} - thrown
when a
+ * processing error is encountered while unparsing. In this case,
+ * {@link org.apache.daffodil.api.UnparseResult#isError()} is true.
+ * </li>
+ * <li>{@link
org.apache.daffodil.api.exceptions.DaffodilUnhandledSAXException} - usually
indicates
+ * a bug in Daffodil. This is an unchecked RuntimeException and should usually
not be explicitly
+ * caught--it should be handled like one would handle any other
RuntimeException. Note that if this
+ * is thrown
+ * {@link
org.apache.daffodil.api.DaffodilUnparseContentHandler#getUnparseResult()}
returns null.
+ * </li>
+ * </ul>
+ *
+ * After the XMLReader parse has completed, either successfully or via a
thrown exception, the
+ * {@link org.apache.daffodil.api.DaffodilUnparseContentHandler#finish()}
method should be called--this
+ * allows content handler state to be cleaned up and ensures the
+ * {@link
org.apache.daffodil.api.DaffodilUnparseContentHandler#getUnparseResult()}
returns an
+ * {@link org.apache.daffodil.api.UnparseResult} even in cases where the
XMLReader encounters an
+ * error. For example:
*
* <pre>
* {@code
+ * InputSource input = ...
+ * OutputStream output = ...
+ * WritableByteChannel wbc = Channels.newChannel(output);
+ * DaffodilUnparseContentHandler unparseContentHandler =
dp.newContentHandlerInstance(wbc);
+ *
+ * XMLReader xmlReader = ...
+ * xmlReader.setContentHandler(unparseContentHandler)
* try {
- * xmlReader.parse(new InputSource(is));
- * } catch (DaffodilUnparseErrorSAXException | DaffodilUnhandledSAXException
e) {
+ * xmlReader.parse(input);
+ * } catch (DaffodilUnparseErrorSAXException e) {
+ * // generally can be ignored, use getUnparseResult() instead
+ * } catch (SAXException e) {
+ * // non-Daffodil related exceptions created by the XMLReader, for example
invalid XML
* ...
+ * } finally {
+ * unparseContentHandler.finish();
* }
* UnparseResult ur = unparseContentHandler.getUnparseResult();
* }
diff --git
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/SAXInfosetInputter.scala
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/SAXInfosetInputter.scala
index 1c1d4c890..f5719b599 100644
---
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/SAXInfosetInputter.scala
+++
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/infoset/SAXInfosetInputter.scala
@@ -126,9 +126,14 @@ class SAXInfosetInputter(
}
override def hasNext(): Boolean = {
- // If we haven't reached an EndDocument event yet, there must be more
- // events on their way, even if we don't know for sure yet.
- currentEvent.eventType.get ne EndDocument
+ // If we haven't reached an EndDocument event yet, there must be more
events on their way,
+ // even if we don't know for sure yet. The exception to this is when the
XMLReader ends a
+ // parse without reaching the end of the document (e.g. invalid XML). When
this happens, the
+ // DaffodilUnparseContentHandler.finish() method sets the current event to
null, and resume
+ // this coroutine. This indicates there are no more events and hasNext()
returns false--this
+ // causes us to end the unparse, end this coroutine, and return an
UnparseResult back to the
+ // main coroutine.
+ currentEvent != null && (currentEvent.eventType.get ne EndDocument)
}
override def next(): Unit = {
diff --git
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DaffodilUnparseContentHandlerImpl.scala
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DaffodilUnparseContentHandlerImpl.scala
index 610452c6f..4a13d5a07 100644
---
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DaffodilUnparseContentHandlerImpl.scala
+++
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DaffodilUnparseContentHandlerImpl.scala
@@ -180,9 +180,13 @@ class DaffodilUnparseContentHandlerImpl(dp:
DFDL.DataProcessor, output: DFDL.Out
* coroutine is still unparsing), or if an unexpected exception occurred that
* prevented unparse from completing. Otherwise returns the UnparseResult.
*/
- def getUnparseResult: DFDL.UnparseResult = unparseResult.orNull
+ def getUnparseResult: DFDL.UnparseResult =
+ if (maybeUnparseResultOrException.isDefined)
+ maybeUnparseResultOrException.get.toOption.orNull
+ else
+ null
- private var unparseResult: Maybe[DFDL.UnparseResult] = Nope
+ private var maybeUnparseResultOrException: Maybe[Either[Exception,
DFDL.UnparseResult]] = Nope
/**
* Enable support for converting relative URIs in xs:anyURI blobs to
absolute URIs
@@ -385,7 +389,7 @@ class DaffodilUnparseContentHandlerImpl(dp:
DFDL.DataProcessor, output: DFDL.Out
// We completely fill up the batched events buffer or we have reached the
// EndDocument event. Send everything we've batched to be unparsed by the
// SAXInfosetInputter subroutine.
- val maybeUnparseResultOrException = this.resume(inputter,
batchedInfosetEvents)
+ maybeUnparseResultOrException = this.resume(inputter,
batchedInfosetEvents)
if (maybeUnparseResultOrException.isEmpty) {
// no error and not finished, the SAXInfosetInputter just needs more
events. The
@@ -401,7 +405,6 @@ class DaffodilUnparseContentHandlerImpl(dp:
DFDL.DataProcessor, output: DFDL.Out
// getUnparseResult() function. If the unparse failed, we also
throw the
// UnparseResult as a SAX Exception since that is generally what
SAX API
// users expect on failure
- unparseResult = One(ur)
if (ur.isError) {
throw new DaffodilUnparseErrorSAXException(ur)
}
@@ -409,7 +412,8 @@ class DaffodilUnparseContentHandlerImpl(dp:
DFDL.DataProcessor, output: DFDL.Out
// $COVERAGE-OFF$
case Left(e) => {
// unparse threw an unexpected exception, this is likely a bug. We
don't
- // have an UnparseResult so just rethrow the exception as a
SAXException.
+ // have an UnparseResult so just rethrow the exception as an
unchecked
+ // RuntimeException
throw new DaffodilUnhandledSAXException(e)
}
// $COVERAGE-ON$
@@ -472,4 +476,20 @@ class DaffodilUnparseContentHandlerImpl(dp:
DFDL.DataProcessor, output: DFDL.Out
// do nothing
}
+ def finish(): Unit = {
+ if (maybeUnparseResultOrException.isEmpty) {
+ // The XMLReader could potentially stop sending events to this
ContentHandler (e.g.
+ // invalid XML).This leaves the SAXInfosetInputter coroutine active
waiting for more
+ // events that it will never get, resulting in a dangling thread and
without an
+ // UnparseResult. The user of this content handler should call this
finish function, which
+ // allows us to let the coroutine finish and give us an UnparseResult.
We do this by
+ // setting the first infoset event to null (i.e. the next one the
SAXInfosetInputter will
+ // see)nulling out and resume the SAXInfosetInputter coroutine. The
coroutine should see
+ // this null event and tell the unparse that there are no more events,
which should
+ // eventually lead to the coroutine calling resumeFinal with a failed
UnparseResult.
+ batchedInfosetEvents(0) = null
+ maybeUnparseResultOrException = this.resume(inputter,
batchedInfosetEvents)
+ Assert.invariant(maybeUnparseResultOrException.isDefined)
+ }
+ }
}
diff --git
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DataProcessor.scala
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DataProcessor.scala
index e21f02fb8..fa1a7243b 100644
---
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DataProcessor.scala
+++
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/DataProcessor.scala
@@ -441,11 +441,11 @@ class DataProcessor(
def unparse(actualInputter: api.infoset.InfosetInputter, outStream:
java.io.OutputStream) = {
val inputter = new InfosetInputter(actualInputter)
- inputter.initialize(ssrd.elementRuntimeData, tunables)
val unparserState =
UState.createInitialUState(outStream, this, inputter, areDebugging)
val res =
try {
+ inputter.initialize(ssrd.elementRuntimeData, tunables)
if (areDebugging) {
Assert.invariant(optDebugger.isDefined)
addEventHandler(debugger)
diff --git
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/unparsers/UState.scala
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/unparsers/UState.scala
index fb0f627cd..d64a1710a 100644
---
a/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/unparsers/UState.scala
+++
b/daffodil-core/src/main/scala/org/apache/daffodil/runtime1/processors/unparsers/UState.scala
@@ -753,7 +753,6 @@ object UState {
inputter: InfosetInputter,
areDebugging: Boolean
): UStateMain = {
- Assert.invariant(inputter.isInitialized)
/**
* This is a full deep copy as variableMap is mutable. Reusing
diff --git
a/daffodil-core/src/test/java/org/apache/daffodil/jexample/TestAPI.java
b/daffodil-core/src/test/java/org/apache/daffodil/jexample/TestAPI.java
index 212f05b3f..58082e470 100644
--- a/daffodil-core/src/test/java/org/apache/daffodil/jexample/TestAPI.java
+++ b/daffodil-core/src/test/java/org/apache/daffodil/jexample/TestAPI.java
@@ -55,7 +55,6 @@ import
org.apache.daffodil.api.validation.ValidatorNotRegisteredException;
import org.apache.daffodil.japi.SAXErrorHandlerForAPITest;
import org.apache.daffodil.api.UnparseResult;
import org.apache.daffodil.api.ProcessorFactory;
-import org.apache.daffodil.api.exceptions.DaffodilUnhandledSAXException;
import org.apache.daffodil.api.exceptions.DaffodilUnparseErrorSAXException;
import org.apache.daffodil.api.exceptions.ExternalVariableException;
import org.apache.daffodil.api.infoset.InfosetInputter;
@@ -954,6 +953,8 @@ public class TestAPI {
unparseXMLReader.parse(new org.xml.sax.InputSource(is));
} catch (javax.xml.parsers.ParserConfigurationException |
org.xml.sax.SAXException e) {
fail("Error: " + e);
+ } finally {
+ unparseContentHandler.finish();
}
UnparseResult saxUr = unparseContentHandler.getUnparseResult();
@@ -1071,10 +1072,12 @@ public class TestAPI {
unparseXMLReader.setFeature(SAX_NAMESPACE_PREFIXES_FEATURE, true);
// kickstart unparse
unparseXMLReader.parse(new org.xml.sax.InputSource(fis));
- } catch (DaffodilUnparseErrorSAXException | DaffodilUnhandledSAXException
ignored) {
- // do nothing; UnparseError is handled below while we don't expect
Unhandled in this test
+ } catch (DaffodilUnparseErrorSAXException e) {
+ // do nothing; UnparseError is handled below
} catch (javax.xml.parsers.ParserConfigurationException |
org.xml.sax.SAXException e) {
fail("Error: " + e);
+ } finally {
+ unparseContentHandler.finish();
}
UnparseResult res = unparseContentHandler.getUnparseResult();
diff --git
a/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXParseUnparseAPI.scala
b/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXParseUnparseAPI.scala
index 569d13481..8530dd7ed 100644
---
a/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXParseUnparseAPI.scala
+++
b/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXParseUnparseAPI.scala
@@ -64,6 +64,7 @@ class TestSAXParseUnparseAPI {
val baisUnparse = new ByteArrayInputStream(baosParse.toByteArray)
val inputSourceUnparse = new InputSource(baisUnparse)
unparseXMLReader.parse(inputSourceUnparse)
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
val unparsedData = baosUnparse.toString
assertTrue(!ur.isError)
@@ -91,6 +92,7 @@ class TestSAXParseUnparseAPI {
val baisUnparse = new ByteArrayInputStream(parsedData.toString.getBytes)
val inputSourceUnparse = new InputSource(baisUnparse)
unparseXMLReader.parse(inputSourceUnparse)
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
val unparsedData = baosUnparse.toString
assertTrue(!ur.isError)
@@ -110,6 +112,7 @@ class TestSAXParseUnparseAPI {
val baisUnparse = new ByteArrayInputStream(testInfosetString.getBytes)
val inputSourceUnparse = new InputSource(baisUnparse)
unparseXMLReader.parse(inputSourceUnparse)
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
val unparsedData = baosUnparse.toString
assertTrue(!ur.isError)
@@ -143,6 +146,7 @@ class TestSAXParseUnparseAPI {
val baisUnparse = new ByteArrayInputStream(testInfosetString.getBytes)
val inputSourceUnparse = new InputSource(baisUnparse)
unparseXMLReader.parse(inputSourceUnparse)
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
val unparsedData = baosUnparse.toString
assertTrue(!ur.isError)
diff --git
a/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXUnparseAPI.scala
b/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXUnparseAPI.scala
index e2e98c2a6..7c5c41a7c 100644
---
a/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXUnparseAPI.scala
+++
b/daffodil-core/src/test/scala/org/apache/daffodil/core/processor/TestSAXUnparseAPI.scala
@@ -51,6 +51,7 @@ class TestSAXUnparseAPI {
xmlReader.setFeature(XMLUtils.SAX_NAMESPACE_PREFIXES_FEATURE, true)
val bai = new ByteArrayInputStream(testInfosetString.getBytes)
xmlReader.parse(new InputSource(bai))
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
assertTrue(!ur.isError)
assertEquals(testData, bao.toString)
@@ -82,6 +83,7 @@ class TestSAXUnparseAPI {
xmlReader.setFeature(XMLUtils.SAX_NAMESPACE_PREFIXES_FEATURE, false)
val bai = new ByteArrayInputStream(testInfosetString.getBytes)
xmlReader.parse(new InputSource(bai))
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
assertTrue(!ur.isError)
assertEquals(testData, bao.toString)
@@ -100,6 +102,7 @@ class TestSAXUnparseAPI {
xmlReader.setFeature(XMLUtils.SAX_NAMESPACE_PREFIXES_FEATURE, true)
val bai = new ByteArrayInputStream(testInfosetString.getBytes)
xmlReader.parse(new InputSource(bai))
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
assertTrue(!ur.isError)
assertEquals(testData, bao.toString)
@@ -122,6 +125,7 @@ class TestSAXUnparseAPI {
<p:list xmlns:p="http://example.com"
ignored="attr"><p:w>9</p:w><p:w>1</p:w><p:w>0</p:w></p:list>
val bai = new ByteArrayInputStream(infoset.toString.getBytes)
xmlReader.parse(new InputSource(bai))
+ unparseContentHandler.finish()
val ur = unparseContentHandler.getUnparseResult
assertTrue(!ur.isError)
assertEquals(testData, bao.toString)
@@ -151,12 +155,21 @@ class TestSAXUnparseAPI {
"""
val bai = new ByteArrayInputStream(xmlWithDocType.getBytes)
val e = intercept[SAXParseException] {
- xmlReader.parse(new InputSource(bai))
+ try {
+ xmlReader.parse(new InputSource(bai))
+ } finally {
+ unparseContentHandler.finish()
+ }
}
- // should be null since unparse never completed
- assertEquals(null, unparseContentHandler.getUnparseResult)
val m = e.getMessage()
assertTrue(m.contains("DOCTYPE is disallowed"))
+ // the XMLReader never finished the parse because it found invalid XML.
Because we still
+ // called unparseContentHandler.finish(), we should get an unparse result
+ val ur = unparseContentHandler.getUnparseResult
+ assertTrue(ur.isError)
+ assertTrue(
+ ur.getDiagnostics.get(0).getMessage.contains("does not start with
StartDocument")
+ )
}
/**
@@ -179,6 +192,9 @@ class TestSAXUnparseAPI {
assertTrue(m.contains("Mixed content"))
assertTrue(m.contains("prior to start"))
assertTrue(m.contains("{http://example.com}w"))
+ unparseContentHandler.finish()
+ val ur = unparseContentHandler.getUnparseResult()
+ assertTrue(ur.isError)
}
/**
@@ -201,6 +217,9 @@ class TestSAXUnparseAPI {
assertTrue(m.contains("Mixed content"))
assertTrue(m.contains("prior to end"))
assertTrue(m.contains("{http://example.com}list"))
+ unparseContentHandler.finish()
+ val ur = unparseContentHandler.getUnparseResult
+ assertTrue(ur.isError)
}
@Test def testDaffodilUnhandledSAXException_creation_bothMessageAndCause():
Unit = {
@@ -222,8 +241,8 @@ class TestSAXUnparseAPI {
val expectedException = new IllegalArgumentException("Illegal Argument
Message")
val actualException = new DaffodilUnhandledSAXException(expectedException)
// when the detailMessage is null as is the case when no message is passed
in,
- // getMessage returns the detailMessage from the embedded exception
- assertEquals(expectedException.getMessage, actualException.getMessage)
+ // getMessage returns null
+ assertNull(actualException.getMessage)
assertEquals(expectedException, actualException.getCause)
}
@Test def
testDaffodilUnhandledSAXException_creation_onlyCauseNoCauseMessage(): Unit = {
diff --git
a/daffodil-core/src/test/scala/org/apache/daffodil/sexample/TestAPI.scala
b/daffodil-core/src/test/scala/org/apache/daffodil/sexample/TestAPI.scala
index 74df45b96..6ebc42253 100644
--- a/daffodil-core/src/test/scala/org/apache/daffodil/sexample/TestAPI.scala
+++ b/daffodil-core/src/test/scala/org/apache/daffodil/sexample/TestAPI.scala
@@ -37,7 +37,6 @@ import org.apache.daffodil.api.Daffodil
import org.apache.daffodil.api.DaffodilParseXMLReader
import org.apache.daffodil.api.DataProcessor
import org.apache.daffodil.api.ParseResult
-import org.apache.daffodil.api.exceptions.DaffodilUnhandledSAXException
import org.apache.daffodil.api.exceptions.DaffodilUnparseErrorSAXException
import org.apache.daffodil.api.exceptions.ExternalVariableException
import org.apache.daffodil.api.infoset.XMLTextEscapeStyle
@@ -956,6 +955,7 @@ class TestAPI {
// kickstart unparse
unparseXMLReader.parse(new org.xml.sax.InputSource(is))
+ unparseContentHandler.finish()
val saxUr = unparseContentHandler.getUnparseResult
wbc.close()
@@ -1047,7 +1047,8 @@ class TestAPI {
unparseXMLReader.parse(new org.xml.sax.InputSource(is))
} catch {
case _: DaffodilUnparseErrorSAXException => // do nothing; handled below
- case _: DaffodilUnhandledSAXException => // do nothing; we don't expect
this in this test
+ } finally {
+ unparseContentHandler.finish()
}
val res = unparseContentHandler.getUnparseResult
diff --git
a/daffodil-tdml-processor/src/main/scala/org/apache/daffodil/processor/tdml/DaffodilTDMLDFDLProcessor.scala
b/daffodil-tdml-processor/src/main/scala/org/apache/daffodil/processor/tdml/DaffodilTDMLDFDLProcessor.scala
index 7256f88c0..696a1ab8e 100644
---
a/daffodil-tdml-processor/src/main/scala/org/apache/daffodil/processor/tdml/DaffodilTDMLDFDLProcessor.scala
+++
b/daffodil-tdml-processor/src/main/scala/org/apache/daffodil/processor/tdml/DaffodilTDMLDFDLProcessor.scala
@@ -350,10 +350,12 @@ class DaffodilTDMLDFDLProcessor private[tdml] (
xmlReader.parse(new InputSource(saxInputStream))
} catch {
case e: DaffodilUnhandledSAXException =>
- // In the case of an unexpected errors, catch and throw as
TDMLException
+ // In the case of an unexpected error, which indicates a bug, catch
and throw as TDMLException
throw TDMLException("Unexpected error during SAX Unparse:" + e, None)
case _: DaffodilUnparseErrorSAXException =>
// do nothing as unparseResult and its diagnostics will be handled
below
+ } finally {
+ unparseContentHandler.finish()
}
val actualSAX =
unparseContentHandler.getUnparseResult.asInstanceOf[UnparseResult]
diff --git
a/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/infosetIgnorableContent.xml
b/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/infosetIgnorableContent.xml
new file mode 100644
index 000000000..0be0a4629
--- /dev/null
+++
b/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/infosetIgnorableContent.xml
@@ -0,0 +1,27 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<!-- an XML comment is allowed -->
+
+<?processingInstruction isAllowed="true"?>
+
+<ex:e2 xmlns:ex="http://example.com">a</ex:e2>
+
+<!-- an XML comment is allowed -->
+
+<?processingInstruction isAllowed="true"?>
diff --git
a/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/testUnparserGeneral.tdml
b/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/testUnparserGeneral.tdml
index c5fd2ac32..c8a226f4f 100644
---
a/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/testUnparserGeneral.tdml
+++
b/daffodil-test/src/test/resources/org/apache/daffodil/section00/general/testUnparserGeneral.tdml
@@ -987,5 +987,11 @@
</tdml:infoset>
</tdml:unparserTestCase>
+ <tdml:unparserTestCase name="unparseIgnorableContent" root="e2"
model="fixedLengthStrings">
+ <tdml:infoset>
+ <tdml:dfdlInfoset
type="file">infosetIgnorableContent.xml</tdml:dfdlInfoset>
+ </tdml:infoset>
+ <tdml:document>a</tdml:document>
+ </tdml:unparserTestCase>
</tdml:testSuite>
diff --git
a/daffodil-test/src/test/scala/org/apache/daffodil/section00/general/TestUnparserGeneral.scala
b/daffodil-test/src/test/scala/org/apache/daffodil/section00/general/TestUnparserGeneral.scala
index 2eb9d8865..93b272578 100644
---
a/daffodil-test/src/test/scala/org/apache/daffodil/section00/general/TestUnparserGeneral.scala
+++
b/daffodil-test/src/test/scala/org/apache/daffodil/section00/general/TestUnparserGeneral.scala
@@ -77,4 +77,6 @@ class TestUnparserGeneral extends TdmlTests {
// DFDL-1589
@Test def emptyOutputNewLine1 = test
+
+ @Test def unparseIgnorableContent = test
}