tuxji commented on code in PR #821:
URL: https://github.com/apache/daffodil/pull/821#discussion_r1052488841
##########
daffodil-sapi/src/main/scala/org/apache/daffodil/sapi/Daffodil.scala:
##########
@@ -86,6 +86,14 @@ object ValidationMode extends Enumeration {
}
}
+/**
+ * XMLOutputStyle determines how to wrap values of elements of type xs:string
+ */
+object XMLOutputStyle extends Enumeration {
+ type XMLOutputStyle = Value
+ val Standard, PrettyPrintSafe = Value
+}
+
Review Comment:
I concur with Steve about moving XMLTextEscapeStyle to the infoset package
(one directory below), and I also want to mention that Scala 3 is deprecating
scala.Enumeration in favor of enum Type { NAME1, NAME2, ... }, while Enumeratum
is recommended for Scala 2.X
(https://stackoverflow.com/questions/1898932/case-objects-vs-enumerations-in-scala).
##########
daffodil-japi/src/main/scala/org/apache/daffodil/japi/packageprivate/Utils.scala:
##########
@@ -42,6 +44,17 @@ private[japi] object ValidationConversions {
}
}
+private[japi] object XMLOutputStyleConversions {
+
+ def modeToScala(mode: XMLOutputStyle): SXMLOutputStyle.Type = {
Review Comment:
Agreed, styleToScala makes more sense, or simply say toScala(xmlOutputStyle).
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/XMLTextInfosetOutputter.scala:
##########
@@ -160,7 +163,13 @@ class XMLTextInfosetOutputter private (writer:
java.io.Writer, pretty: Boolean)
if (simple.erd.runtimeProperties.get(XMLTextInfoset.stringAsXml) ==
"true") {
writeStringAsXml(simpleVal)
} else {
- writer.write(scala.xml.Utility.escape(remapped(simpleVal)))
+ val simpleValEscaped = xmlOutputStyle match {
+ case XMLOutputStyle.PrettyPrintSafe => {
+ "<![CDATA[%s]]>".format(remapped(simpleVal).replaceAll("]]>",
"]]]]><![CDATA[>"))
+ }
+ case XMLOutputStyle.Standard =>
scala.xml.Utility.escape(remapped(simpleVal))
Review Comment:
I like XMLTextEscapeStyle myself. We seem to be using XML strings and XML
text pretty interchangeably, but I think we should use XML string only for any
sequence of XML tags and context mixed together and use XML text only for any
textual content between XML tags, which I believe this PR is describing. The
XML standard calls it textual content, or text for short.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/XMLTextInfosetOutputter.scala:
##########
@@ -31,12 +31,15 @@ import org.apache.daffodil.util.Indentable
* @param writer The writer to write the XML text to
* @param pretty Whether or to enable pretty printing. Set to true, XML
* elements are indented and newlines are inserted.
+ * @param xmlOutputStyle Determine whether to wrap values of elements of type
+ * xs:string in CDATA chunks in order to preserve whitespace.
*/
-class XMLTextInfosetOutputter private (writer: java.io.Writer, pretty: Boolean)
+class XMLTextInfosetOutputter private (writer: java.io.Writer, pretty:
Boolean, xmlOutputStyle: XMLOutputStyle.Type)
extends InfosetOutputter with Indentable with XMLInfosetOutputter {
- def this(os: java.io.OutputStream, pretty: Boolean) = {
- this(new java.io.OutputStreamWriter(os, StandardCharsets.UTF_8), pretty)
+ def this(os: java.io.OutputStream, pretty: Boolean, xmlOutputStyle:
XMLOutputStyle.Type = XMLOutputStyle.Standard) = {
+ this(new java.io.OutputStreamWriter(os, StandardCharsets.UTF_8), pretty,
xmlOutputStyle)
+ Assert.usage(xmlOutputStyle != null, "Null is not a valid value for
parameter xmlOutputStyle")
Review Comment:
We can be sure Scala code won't be calling the constructor with a null
xmlOutputStyle because the Scala compiler wouldn't let such a call compile.
Only Java code can call a Daffodil API with a null, but we already would catch
that case before it gets here because our corresponding Java constructor calls
XMLOutputStyleConversions.modeToScala(xmlOutputStyle) to convert the enum and
it'll throw an error immediately. So remove the Assert.usage here, it'll never
get a null anyway.
I did some googling on validating API constructor/function arguments and
here are some take-aways. Putting lines like Assert.usage(), require(), etc.,
throw errors (exceptions), which is generally frowned upon. Recommendations
say to use types which prevent or encapsulate the errors and put validation
code into companion objects' methods if still necessary.
##########
daffodil-japi/src/main/scala/org/apache/daffodil/japi/packageprivate/Utils.scala:
##########
@@ -42,6 +44,17 @@ private[japi] object ValidationConversions {
}
}
+private[japi] object XMLOutputStyleConversions {
+
+ def modeToScala(mode: XMLOutputStyle): SXMLOutputStyle.Type = {
+ val smode: SXMLOutputStyle.Type = mode match {
+ case XMLOutputStyle.Standard => SXMLOutputStyle.Standard
+ case XMLOutputStyle.PrettyPrintSafe => SXMLOutputStyle.PrettyPrintSafe
Review Comment:
Can say sxmlOutputStyle instead of smode. Also, keep in mind that Java code
can pass null here and we will get a NoMatchException. If we need a clearer
error message (we should add a test for this case if we haven't already), add a
case _ which throws a better message.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/XMLTextInfosetOutputter.scala:
##########
@@ -25,18 +25,23 @@ import org.apache.daffodil.dpath.NodeInfo
import org.apache.daffodil.exceptions.Assert
import org.apache.daffodil.util.Indentable
+import org.apache.daffodil.api.XMLOutputStyle
+
+class NoMatchException(s: String) extends Exception(s) {}
+
/**
* Writes the infoset to a java.io.Writer as XML text.
*
* @param writer The writer to write the XML text to
* @param pretty Whether or to enable pretty printing. Set to true, XML
* elements are indented and newlines are inserted.
*/
-class XMLTextInfosetOutputter private (writer: java.io.Writer, pretty: Boolean)
+class XMLTextInfosetOutputter private (writer: java.io.Writer, pretty:
Boolean, xmlOutputStyle: XMLOutputStyle.Type)
Review Comment:
No, we don't need a null check, see below.
##########
daffodil-sapi/src/main/scala/org/apache/daffodil/sapi/infoset/Infoset.scala:
##########
@@ -248,8 +251,14 @@ class XMLTextInfosetOutputter private (outputter:
SXMLTextInfosetOutputter)
* @param pretty enable or disable pretty printing. Pretty printing will only
* insert indentation and newlines where it will not affect the
* content of the XML.
+ * @param xmlOutputStyle determine whether to wrap values of elements of type
+ * xs:string in CDATA chunks in order to preserve
+ * whitespace.
*/
- def this(os: java.io.OutputStream, pretty: Boolean) = this(new
SXMLTextInfosetOutputter(os, pretty))
+ def this(os: java.io.OutputStream, pretty: Boolean,
+ xmlOutputStyle: XMLOutputStyle.Value = XMLOutputStyle.Standard) = {
+ this(new SXMLTextInfosetOutputter(os, pretty,
XMLOutputStyleConversions.modeToScala(xmlOutputStyle)))
Review Comment:
The enum conversion function will detect null as well although Scala code
would have to do something unusual to pass null anyway.
##########
daffodil-sapi/src/test/scala/org/apache/daffodil/example/TestScalaAPI.scala:
##########
@@ -1127,44 +1129,101 @@ class TestScalaAPI {
}
- @Test
- def testScalaAPI25(): Unit = {
- // Demonstrates the use of a custom InfosetInputter/Outputter
-
- val expectedData = "42"
- val expectedEvents = Array(
- TestInfosetEvent.startDocument(),
- TestInfosetEvent.startComplex("e1", "http://example.com"),
- TestInfosetEvent.startSimple("e2", "http://example.com", expectedData),
- TestInfosetEvent.endSimple("e2", "http://example.com"),
- TestInfosetEvent.endComplex("e1", "http://example.com"),
- TestInfosetEvent.endDocument()
- )
-
- val c = Daffodil.compiler()
-
- val schemaFile = getResource("/test/sapi/mySchema1.dfdl.xsd")
- val pf = c.compileFile(schemaFile)
- val dp = pf.onPath("/")
-
- val file = getResource("/test/sapi/myData.dat")
- val fis = new java.io.FileInputStream(file)
- val dis = new InputSourceDataInputStream(fis)
- val outputter = new TestInfosetOutputter()
- val pr = dp.parse(dis, outputter)
-
- assertFalse(pr.isError())
- assertArrayEquals(
- expectedEvents.asInstanceOf[Array[Object]],
- outputter.events.toArray.asInstanceOf[Array[Object]]
- )
-
- val bos = new java.io.ByteArrayOutputStream()
- val wbc = java.nio.channels.Channels.newChannel(bos)
- val inputter = new TestInfosetInputter(expectedEvents: _*)
-
- val ur = dp.unparse(inputter, wbc)
- assertFalse(ur.isError)
- assertEquals(expectedData, bos.toString())
+ @Test
+ def testScalaAPI25(): Unit = {
+ // Demonstrates the use of a custom InfosetInputter/Outputter
+
+ val expectedData = "42"
+ val expectedEvents = Array(
+ TestInfosetEvent.startDocument(),
+ TestInfosetEvent.startComplex("e1", "http://example.com"),
+ TestInfosetEvent.startSimple("e2", "http://example.com", expectedData),
+ TestInfosetEvent.endSimple("e2", "http://example.com"),
+ TestInfosetEvent.endComplex("e1", "http://example.com"),
+ TestInfosetEvent.endDocument()
+ )
+
+ val c = Daffodil.compiler()
+
+ val schemaFile = getResource("/test/sapi/mySchema1.dfdl.xsd")
+ val pf = c.compileFile(schemaFile)
+ val dp = pf.onPath("/")
+
+ val file = getResource("/test/sapi/myData.dat")
+ val fis = new java.io.FileInputStream(file)
+ val dis = new InputSourceDataInputStream(fis)
+ val outputter = new TestInfosetOutputter()
+ val pr = dp.parse(dis, outputter)
+
+ assertFalse(pr.isError())
+ assertArrayEquals(
+ expectedEvents.asInstanceOf[Array[Object]],
+ outputter.events.toArray.asInstanceOf[Array[Object]]
+ )
+
+ val bos = new java.io.ByteArrayOutputStream()
+ val wbc = java.nio.channels.Channels.newChannel(bos)
+ val inputter = new TestInfosetInputter(expectedEvents: _*)
+
+ val ur = dp.unparse(inputter, wbc)
+ assertFalse(ur.isError)
+ assertEquals(expectedData, bos.toString())
+ }
+
+ @Test
+ def testScalaAPICDATA1(): Unit = {
+ val expected = "<![CDATA[NO_WHITESPACE_AT_ALL]]>"
+ val data = "NO_WHITESPACE_AT_ALL$"
+ val schemaType = "string"
+ doXMLOutputStyleTest(expected, data, schemaType)
+ }
+
+ @Test
+ def testScalaAPICDATA2(): Unit = {
+ val expected = "<![CDATA[ 'some' stuff here  and
]]]]><![CDATA[> even]]>"
+ val data = " 'some' stuff here  and ]]> even$"
+ val schemaType = "string"
+ doXMLOutputStyleTest(expected, data, schemaType)
+ }
+
+ @Test
+ def testScalaAPICDATA3(): Unit = {
+ val expected = "6.892"
+ val data = "6.892"
+ val schemaType = "float"
+ doXMLOutputStyleTest(expected, data, schemaType)
+ }
+
+ @Test
+ def testScalaAPICDATA4(): Unit = {
+ val expected = "<![CDATA[this contains a CRLF\nline ending]]>"
+ val data = "this contains a CRLF\r\nline ending$"
+ val schemaType = "string"
+ doXMLOutputStyleTest(expected, data, schemaType)
+ }
+
+ def doXMLOutputStyleTest(expect: String, data: String, schemaType: String):
Unit = {
+ val c = Daffodil.compiler()
+ val schemaFile = schemaType match {
+ case "string" => getResource("/test/sapi/mySchemaCDATAString.dfdl.xsd")
+ case "float" => getResource("/test/sapi/mySchemaCDATAFloat.dfdl.xsd")
}
+ val pf = c.compileFile(schemaFile)
+ var dp = pf.onPath("/")
+
+ val is = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8))
+ val input = new InputSourceDataInputStream(is)
+ val bosDP = new ByteArrayOutputStream()
+ val outputter = new XMLTextInfosetOutputter(bosDP, true,
XMLOutputStyle.PrettyPrintSafe)
+ val res = dp.parse(input, outputter)
+ val err = res.isError()
+
+ val infosetDPString = bosDP.toString()
+ val start = infosetDPString.indexOf(".com\">") + 6
+ val end = infosetDPString.indexOf("</tns")
+ val value = infosetDPString.substring(start, end)
+
+ assertFalse(err)
+ assertEquals(expect, value)
+ }
}
Review Comment:
Yes, let's add JAPI tests too.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]