mbeckerle commented on code in PR #821:
URL: https://github.com/apache/daffodil/pull/821#discussion_r942582315


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/debugger/InteractiveDebugger.scala:
##########
@@ -987,10 +987,10 @@ class InteractiveDebugger(runner: 
InteractiveDebuggerRunner, eCompilers: Express
             debugPrintln(_)
           }
           res match {
-            case ie: InfosetElement => debugPrettyPrintXML(ie)
+            case ie: InfosetElement => debugPrettyPrintXML(ie, 
state.tunable.xmlOutputStyle)

Review Comment:
   Should not be a string. Should be an enum.  Copy technique from 
   ```
   object UnqualifiedPathStepPolicy extends Enum[UnqualifiedPathStepPolicy] {
   ```
   which is another tunable that is an Enum. 
   
   This will require a pervasive replace of all points where you have 
```xmlOutputStyle: String``` replacing it with ```xmlOutputStyle: 
XMLOutputStyle```, and even in the Java API you have ```String 
xmlOutputStyle``` to fix. 
   



##########
daffodil-sapi/src/test/scala/org/apache/daffodil/example/TestScalaAPI.scala:
##########
@@ -1167,4 +1168,80 @@ class TestScalaAPI {
       assertFalse(ur.isError)
       assertEquals(expectedData, bos.toString())
     }
+
+    @Test
+    def testScalaAPICDATA1(): Unit = {
+      val c = 
Daffodil.compiler().withTunable("xmlOutputStyle","prettyPrintSafe")

Review Comment:
   Factor out duplicate code here into a reused method in this test class. 
These tests all seem near identical. Just the expected result string and data 
file name are different. 
   Each test should look like:
   ```
   doXMLOutputStyleTest("<![CDATA[NO_WHITESPACE_AT_ALL]]>", 
"NO_WHITESPACE_AT_ALL")
   ````



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/JDOMInfosetOutputter.scala:
##########
@@ -48,14 +48,57 @@ class JDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
 
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+            if(xmlOutputStyle == "prettyPrintSafe"){
+              val newData = s.replaceAll(">","&gt;").replace("\\r\\n", 
"&#xE00D;")

Review Comment:
   CRLFs and CR should already be replaced above in the remapped(....) call. So 
you don't need to do this replacement of "\\r\\n" at all. 
   
   You should not be replacing all ">" with "&gt;" because using CDATA 
brackets, the vast majority of "<" and ">" just get encapsulated by CDATA and 
can stay as is. 
   
   The only ">" you should replace with "&gt;" is the final ">" of "]]>" which 
is the terminator of a CDATA region. That you have to replace. So I think this 
wants to be ```replaceAll("]]>", "]]&gt;")``` 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/JDOMInfosetOutputter.scala:
##########
@@ -48,14 +48,57 @@ class JDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
 
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+            if(xmlOutputStyle == "prettyPrintSafe"){
+              val newData = s.replaceAll(">","&gt;").replace("\\r\\n", 
"&#xE00D;")
+              val readyForCDATA = newData.replaceAll("0x","&#x")

Review Comment:
   What's this "0x" stuff? Don't think this should be modified. 



##########
daffodil-sapi/src/test/resources/test/sapi/mySchemaCDATA1.dfdl.xsd:
##########
@@ -0,0 +1,32 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+      http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<schema xmlns="http://www.w3.org/2001/XMLSchema";
+  targetNamespace="http://example.com"; 
xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/";
+  xmlns:xsd="http://www.w3.org/2001/XMLSchema";
+  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; 
xmlns:tns="http://example.com";>
+
+  <annotation>
+    <appinfo source="http://www.ogf.org/dfdl/";>
+      <dfdl:format ref="tns:GeneralFormat" />
+    </appinfo>
+  </annotation>
+
+  <include 
schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"/>
+
+  <element name="foo" type="xsd:string" dfdl:lengthKind="explicit" 
dfdl:length="20"/>

Review Comment:
   If you use dfdl:lengthKind='delimited' dfdl:terminator="$" then you can just 
add "$" to the end of your test data, and then reuse the same schema for all 
the tests. You won't need this different schema per test. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/JDOMInfosetOutputter.scala:
##########
@@ -48,14 +48,57 @@ class JDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
 
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+            if(xmlOutputStyle == "prettyPrintSafe"){
+              val newData = s.replaceAll(">","&gt;").replace("\\r\\n", 
"&#xE00D;")
+              val readyForCDATA = newData.replaceAll("0x","&#x")
+              //Figure out what mode you have to be in
+              if(readyForCDATA(0) == '&') {
+                charEntMode = true
+              } else {
+                charEntMode = false
+                correctFormat.append(cdataIntro)
+              }
+
+              //Traverse over the string surrounding correct areas with CDATA 
info
+              for(c <- readyForCDATA) {
+                if(charEntMode) {
+                  correctFormat.append(c)
+                  if(c == ';'){
+                    correctFormat.append(cdataIntro)
+                    charEntMode = false
+                  }
+                } else {
+                  if(c == '&'){
+                    correctFormat.append(cdataOutro)
+                    charEntMode = true
+                  }
+                  correctFormat.append(c)
+                }
+              }
+
+              //You are done with the string. If you are still a non
+                   //char ent then close and finish up.

Review Comment:
   What if the string ends with "abcd&gt" i.e., no ";" at end of entity?
   You will create "<![CDATA[abcd]]>&gt" when really it should have been 
"<![CDATA[abcd&gt]]>". 
   I think your loop needs to grab whole char entities in a sub-loop, and only 
decide to end the CDATA with the cdataOutro and then set down the entity IF the 
";" is found. If no ";" is found and end of string is hit, then you just 
include the characters inside the CDATA.
   
   Add a test for "abcd&gt" case.



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/JDOMInfosetOutputter.scala:
##########
@@ -48,14 +48,57 @@ class JDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
 
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+            if(xmlOutputStyle == "prettyPrintSafe"){
+              val newData = s.replaceAll(">","&gt;").replace("\\r\\n", 
"&#xE00D;")
+              val readyForCDATA = newData.replaceAll("0x","&#x")
+              //Figure out what mode you have to be in
+              if(readyForCDATA(0) == '&') {
+                charEntMode = true
+              } else {
+                charEntMode = false
+                correctFormat.append(cdataIntro)
+              }
+
+              //Traverse over the string surrounding correct areas with CDATA 
info
+              for(c <- readyForCDATA) {
+                if(charEntMode) {
+                  correctFormat.append(c)
+                  if(c == ';'){
+                    correctFormat.append(cdataIntro)
+                    charEntMode = false
+                  }
+                } else {
+                  if(c == '&'){
+                    correctFormat.append(cdataOutro)
+                    charEntMode = true
+                  }
+                  correctFormat.append(c)
+                }
+              }
+
+              //You are done with the string. If you are still a non

Review Comment:
   indentation off



##########
daffodil-sapi/src/test/scala/org/apache/daffodil/example/TestScalaAPI.scala:
##########
@@ -1167,4 +1168,80 @@ class TestScalaAPI {
       assertFalse(ur.isError)
       assertEquals(expectedData, bos.toString())
     }
+
+    @Test
+    def testScalaAPICDATA1(): Unit = {
+      val c = 
Daffodil.compiler().withTunable("xmlOutputStyle","prettyPrintSafe")
+
+      val schemaFile = getResource("/test/sapi/mySchemaCDATA1.dfdl.xsd")
+      val pf = c.compileFile(schemaFile)
+      var dp = pf.onPath("/")
+      dp = reserializeDataProcessor(dp)
+
+      val file = getResource("/test/sapi/myDataCDATA1.dat")

Review Comment:
   You should get rid of these  ".dat" files, and just use literal strings in 
these tests. 
   You want to use this:
   ```
   new ByteArrayInputStream(dataString.getBytes("ascii"))
   ```



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/W3CDOMInfosetOutputter.scala:
##########
@@ -56,14 +56,56 @@ class W3CDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
 
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
+
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+          if(xmlOutputStyle == "prettyPrintSafe"){

Review Comment:
   All this logic is duplicated across JDOM, ScalaXML and W3CDOM infoset 
outputters. 
   Factor out so it can be shared. You may want to put a final method on 
InfosetOutputter for this, or make a scala object just for this, called perhaps 
XMLPrettyPrintSafe to hold the method. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/JDOMInfosetOutputter.scala:
##########
@@ -48,14 +48,57 @@ class JDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
 
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+            if(xmlOutputStyle == "prettyPrintSafe"){
+              val newData = s.replaceAll(">","&gt;").replace("\\r\\n", 
"&#xE00D;")
+              val readyForCDATA = newData.replaceAll("0x","&#x")
+              //Figure out what mode you have to be in
+              if(readyForCDATA(0) == '&') {
+                charEntMode = true
+              } else {
+                charEntMode = false
+                correctFormat.append(cdataIntro)
+              }
+
+              //Traverse over the string surrounding correct areas with CDATA 
info
+              for(c <- readyForCDATA) {
+                if(charEntMode) {
+                  correctFormat.append(c)
+                  if(c == ';'){
+                    correctFormat.append(cdataIntro)
+                    charEntMode = false
+                  }
+                } else {
+                  if(c == '&'){
+                    correctFormat.append(cdataOutro)
+                    charEntMode = true
+                  }
+                  correctFormat.append(c)
+                }
+              }
+
+              //You are done with the string. If you are still a non
+                   //char ent then close and finish up.
+              if(!charEntMode){
+                correctFormat.append(cdataOutro)
+              }
+
+              correctFormat.toString()
+            }else{

Review Comment:
   
   Suggest you pull the entire set of lines for the stringWithPrettyPrintSafe 
situation out to a private method. 
   (I mean lines 65 to 98 roughly)
   
   Then this if-then-else nest would fit on a screen and it would be clearer 
what end brace goes with what condition. 
   
   Also "}else{" should have spaces like "} else {"
   



##########
daffodil-propgen/src/main/resources/org/apache/daffodil/xsd/dafext.xsd:
##########
@@ -559,6 +559,17 @@
             </xs:restriction>
           </xs:simpleType>
         </xs:element>
+        <xs:element name="xmlOutputStyle" type="xs:string" default="none" 
minOccurs="0">
+          <xs:annotation>
+            <xs:documentation>
+              values is a whitespace separated list of tokens drawn from this 
set.
+              Values are:
+              - none: (Current behavior - ok if data is not being pretty 
printed, or will not be re-read in, or if whitespace is fungible in the actual 
data format),

Review Comment:
   Remove "Current behavior" as that is dated as soon as this gets merged. "Use 
if data is ..."
   
   Also add that this behavior applies only to type xs:string or types derived 
from it. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/JDOMInfosetOutputter.scala:
##########
@@ -48,14 +48,57 @@ class JDOMInfosetOutputter extends InfosetOutputter
     true
   }
 
-  def startSimple(diSimple: DISimple): Boolean = {
+  def startSimple(diSimple: DISimple, xmlOutputStyle: String): Boolean = {
 
     val elem = createElement(diSimple)
+    val correctFormat = new StringBuilder("");
+    val cdataIntro = "<![CDATA["
+    val cdataOutro = "]]>"
+    var charEntMode: Boolean = false
 
     if (diSimple.hasValue) {
       val text =
         if (diSimple.erd.optPrimType.get.isInstanceOf[NodeInfo.String.Kind]) {
-          remapped(diSimple.dataValueAsString)
+          val s = remapped(diSimple.dataValueAsString)
+
+            if(xmlOutputStyle == "prettyPrintSafe"){
+              val newData = s.replaceAll(">","&gt;").replace("\\r\\n", 
"&#xE00D;")
+              val readyForCDATA = newData.replaceAll("0x","&#x")
+              //Figure out what mode you have to be in
+              if(readyForCDATA(0) == '&') {
+                charEntMode = true

Review Comment:
   CodeCov says you need a test for this case. I'm not sure how that's 
possible, but it seems to be. 
   
   Add test that has to hit this like with "&#x31;" in it. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to