mbeckerle commented on a change in pull request #478:
URL: https://github.com/apache/daffodil/pull/478#discussion_r588612770



##########
File path: daffodil-lib/src/main/scala/org/apache/daffodil/xml/QNameBase.scala
##########
@@ -385,8 +387,8 @@ final case class LocalDeclQName(prefix: Option[String], 
local: String, namespace
 /**
  * QName for a global declaration or definition (of element, type, group, 
format, etc.)
  */
-final case class GlobalQName(prefix: Option[String], local: String, namespace: 
NS)
-  extends NamedQName(prefix, local, namespace) {
+final case class GlobalQName(lexicalPrefix: Option[String], local: String, 
namespace: NS)

Review comment:
       This feels improperly named to me. When you define something that is 
named, you never specify a prefix. The namespace is the target namespace of the 
schema, and there is no prefix specified. The target namespace doesn't even 
have to have *any* prefix defined. Other schemas that import/include this 
schema *may* supply a prefix most likely they do, but they could define it to 
be the default namespace and have no other means of access in a RefQName to one 
of the items in this schema. 
   
   Only RefQNames have lexicalPrefix. GlobalQNames never have a prefix, or if 
they do, it's because we're choosing one of the prefixes for the target 
namespace that we prefer. 
   
   So perhaps this paremter to GlobalQName should be called preferredTnsPrefix ?
   
   When we create a local element decl, the namespace is either no-namespace, 
or the TNS depending on the form of the element (or elementFormDefault of the 
schema, if the element has no form attribute, which is typical. I've never 
actually seen the form attribute used, and I'm not sure we support the form 
attribute. We do support elementFormDefault on the xs:schema however.) Again 
there is no explicit prefix. So I think the constructor for NamedQName should 
call it's prefix parameter preferredTnsPrefix. 

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilParseOutputStreamContentHandler.scala
##########
@@ -66,13 +88,74 @@ class DaffodilParseOutputStreamContentHandler(out: 
OutputStream, pretty: Boolean
 
   override def startPrefixMapping(prefix: String, uri: String): Unit = {
     val _prefix = if (prefix == "") null else prefix
-    prefixMapping = NamespaceBinding(_prefix, uri, prefixMapping)
+    globalPrefixMapping = NamespaceBinding(_prefix, uri, globalPrefixMapping)

Review comment:
       As we call this startPrefixMapping repeatedly, the globalPrefixMapping 
var is no longer really the global one is it? 
   Or is this only called at the top level before we start handling elements?
   
   I expected this to be named currentPrefixMapping. 

##########
File path: daffodil-lib/src/main/scala/org/apache/daffodil/xml/XMLUtils.scala
##########
@@ -812,6 +812,8 @@ object XMLUtils {
     actual: Node,
     ignoreProcInstr: Boolean = true,
     checkPrefixes: Boolean = false,
+    // checkNamespaces isn't order specific; it only checks that they have the 
same (element-wise)

Review comment:
       What does "element wise" namespaces mean? Maybe just say same set of 
namespace bindings (unordered). 

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilParseOutputStreamContentHandler.scala
##########
@@ -24,15 +24,37 @@ import scala.xml.NamespaceBinding
 
 import org.apache.daffodil.infoset.XMLInfosetOutputter
 import org.apache.daffodil.util.Indentable
+import org.apache.daffodil.util.MStackOf
 import org.apache.daffodil.util.MStackOfBoolean
+import org.apache.daffodil.util.Maybe
+import org.apache.daffodil.util.Maybe.Nope
+import org.apache.daffodil.xml.XMLUtils
 import org.xml.sax.Attributes
 import org.xml.sax.ContentHandler
 import org.xml.sax.Locator
 
 class DaffodilParseOutputStreamContentHandler(out: OutputStream, pretty: 
Boolean = false)

Review comment:
       Whole class Scala doc please. This is a SAX content handler for when you 
want to write textualized XML to a java output stream from SAX events created 
by Daffodil's parser. 
   
   Question. Why does this have Daffodil in its name. Isn't this a generic 
thing that woudl work with any source of SAX events? If that's the case, isn't 
there one of these in Xerces, or Apache Commons XML or something? 

##########
File path: daffodil-lib/src/main/scala/org/apache/daffodil/xml/QNameBase.scala
##########
@@ -425,8 +427,8 @@ final case class RefQName(prefix: Option[String], local: 
String, namespace: NS)
     }
   }
 
-  def toStepQName = StepQName(prefix, local, namespace)
-  def toGlobalQName = GlobalQName(prefix, local, namespace)
+  def toStepQName = StepQName(lexicalPrefix, local, namespace)
+  def toGlobalQName = GlobalQName(lexicalPrefix, local, namespace)

Review comment:
       Worth a coment here that for GlobalQNames, comparison of a RefQName to a 
GlobalQName to see if they match ignores the prefix. The only use of this 
lexicalPrefix parameter should be in diagnostic messages. 
   
   There's an argument that this should always be supplied as None, not the 
lexicalPrefix. But most schemas don't play subtle namespace games, so that may 
lead to diagnostic messages that are more verbose (using namespaces) when the 
prefix would have been more meaningful to the user. So I think keep the 
lexicalPrefix here anyway. 

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilParseOutputStreamContentHandler.scala
##########
@@ -66,13 +88,74 @@ class DaffodilParseOutputStreamContentHandler(out: 
OutputStream, pretty: Boolean
 
   override def startPrefixMapping(prefix: String, uri: String): Unit = {
     val _prefix = if (prefix == "") null else prefix
-    prefixMapping = NamespaceBinding(_prefix, uri, prefixMapping)
+    globalPrefixMapping = NamespaceBinding(_prefix, uri, globalPrefixMapping)
+    currentElementPrefixMapping = NamespaceBinding(_prefix, uri, 
currentElementPrefixMapping)
   }
 
   override def endPrefixMapping(prefix: String): Unit = {
     // do nothing
   }
 
+  /**
+   * Uses Attributes, which is passed in to the startElement callback, to 
extract prefix mappings
+   * and other attributes and return a tuple of an updated globalPrefixMapping 
and a Seq of attribute=val
+   * String pairings
+   */
+  def processAttributes(atts: Attributes, prefixMapping: NamespaceBinding): 
(NamespaceBinding, Seq[String]) = {
+    var pm = prefixMapping
+    var attrPairings: Seq[String] = Seq()
+    var i = 0
+    var mappingsList: Seq[(String, String)] = Seq()
+    while (i < atts.getLength) {
+      val qName = atts.getQName(i)
+      val attrVal =  atts.getValue(i)
+      if (qName.nonEmpty) { //if qName is populated; as in when 
namespacePrefixes == true
+        if (qName.startsWith("xmlns:") || qName == "xmlns")  { // describing 
namespace mapping
+          // get prefix
+          val pre = if (qName.startsWith("xmlns:")) {
+            qName.substring(6)
+          } else {
+            null
+          }
+          // we make this call to check if the prefix already exists. If it 
doesn't exist, we get the
+          // npe so we can add it, but if it does exist, nothing happens and 
it doesn't get re-added
+          // and we instead proceed to the next item in Attributes
+          val prefixUri = XMLUtils.getURIOption(pm, pre)
+          prefixUri match {
+            case Some(_) => // already exists; do nothing
+            case None => // add it
+              mappingsList +:= (pre, attrVal)
+          }
+        } else {
+          // regular attribute
+          attrPairings +:= s" $qName=" + "\"" + s"${attrVal}" + "\""

Review comment:
       Can this happen? DFDL doesn't have attributes. Coverage says this isn't 
tested. Can this even happen? 
   Maybe this is Assert.invariantFailed("DFDL Infoset XML shouldn't have 
attributes.")
   
   Maybe we need an example with both ns bindings on an element and 
xsi:type="xs:string" or xsi:nil="true" so that this will be covered?

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/SAXInfosetOutputter.scala
##########
@@ -144,29 +149,25 @@ class SAXInfosetOutputter(xmlReader: 
DFDL.DaffodilParseXMLReader)
   }
 
   private def doStartPrefixMapping(diElem: DIElement, contentHandler: 
ContentHandler): Unit = {
-    val nsbStart = diElem.erd.minimizedScope
-    val nsbEnd = if (diElem.isRoot) {
-      scala.xml.TopScope
-    } else {
-      diElem.diParent.erd.minimizedScope
-    }
+    val (nsbStart: NamespaceBinding, nsbEnd: NamespaceBinding) = 
getNsbStartAndEnd(diElem)
     var n = nsbStart
+    var mappingsList: Seq[(String, String)] = Seq()
     while (n != nsbEnd && n != null && n != scala.xml.TopScope) {
       val prefix = if (n.prefix == null) "" else n.prefix
       val uri = if (n.uri == null) "" else n.uri
-      contentHandler.startPrefixMapping(prefix, uri)
+      // we generate a list here by prepending so can build the prefixMapping 
in the
+      // same order as it is in minimizedScope when we call 
startPrefixMapping; we do this because
+      // getPrefix in the contentHandler is order dependent
+      mappingsList +:= (prefix, uri)
       n = n.parent
     }
+    mappingsList.foreach{ case (prefix, uri) =>
+      contentHandler.startPrefixMapping(prefix, uri)
+    }
   }
 
   private def doEndPrefixMapping(diElem: DIElement, contentHandler: 
ContentHandler): Unit = {
-    val nsbStart = diElem.erd.minimizedScope
-    val nsbEnd = if (diElem.isRoot) {
-      scala.xml.TopScope
-    } else {
-      diElem.diParent.erd
-        .minimizedScope
-    }
+    val (nsbStart: NamespaceBinding, nsbEnd: NamespaceBinding) = 
getNsbStartAndEnd(diElem)
     var n = nsbStart

Review comment:
       I think this code is depending on iterating down nsbStart's bindings 
eventually gets you to nsbEnd. I.e., nsbEnd is the tail of the nsbStart chain. 
   
   I am not sure what == and != mean on NamespaceBinding objects. But they 
might be doing elaborate comparisons to see if they define equivalent scopes 
(all same bindings, just different order). 
   
   I think your while could be while ( (n ne nsbEnd && .... i.e., using pointer 
equality here. This may actually be more correct than some sort of 
binding-scope equality, given what your algorthm is trying to achieve here 
which is to one by one remove the bindings that were added to the front. 
   

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/infoset/SAXInfosetOutputter.scala
##########
@@ -175,26 +176,78 @@ class SAXInfosetOutputter(xmlReader: 
DFDL.DaffodilParseXMLReader)
     }
   }
 
-  private def doStartElement(diElem: DIElement, contentHandler: 
ContentHandler): Unit = {
-    val (ns: String, elemName: String, qName: String) = 
getNameSpaceElemNameAndQName(diElem)
-    doStartPrefixMapping(diElem, contentHandler)
+  /**
+   * Add the prefixes and uris from the element's NamespaceBinding to 
Attributes,
+   * when namespacePrefixes feature is true
+   */
+  private def doAttributesPrefixMapping(diElem: DIElement, attrs: 
AttributesImpl): AttributesImpl = {
+    val (nsbStart: NamespaceBinding, nsbEnd: NamespaceBinding) = 
getNsbStartAndEnd(diElem)
+    var n = nsbStart
+    while (n != nsbEnd && n != null && n != scala.xml.TopScope) {
+      val prefix = if (n.prefix == null) "xmlns" else s"xmlns:${n.prefix}"
+      val uri = if (n.uri == null) "" else n.uri
+      // uri and localname are always empty for NamespaceBinding attributes
+      attrs.addAttribute("", "", prefix, "CDATA", uri)
+      n = n.parent
+    }
+    attrs
+  }
 
-    val attrs = if (isNilled(diElem)) {
-      createNilAttribute()
+  private def getNsbStartAndEnd(diElem: DIElement) = {

Review comment:
       Is there any requirement about sharing between nsbStart and nsbEnd here?
   Seems to me we're depending on these being somehow one an extension of the 
other. Is that the case?
   
   then we should state that in a comment, and maybe check it as an 
Assert.invariant. 
   
   Alternatively, where minimized scope is computed, the invariant that each 
element's minimized scope is an extension of its parents minimized scope could 
be checked there. (maybe it is checked there?)
   
   I think your code is depending on this for correctness, so just let's be 
explicit about this sharing. 
   
   What I'm wondering about is if you compare two NamespaceBinding objects with 
==, does it check for "equivalence" of the namespace bindings? I.e, are they 
the same list, then true else sort both sets of bindings, etc. etc.  
Algorrithmically here, I think you are treating NamespaceBindings as linked 
chains, and when comparing them I think you basically always want eq or ne, not 
== or !=. 
   
   The Scala XML library really made a big mistake with these things. Namespace 
bindings should have been an ordinary List of Binding objects where a binding 
is just a single prefix + NS pair. So many idioms would have worked out better 
that way. Ordinary mapping, finding, etc. Instead we have these collection like 
things that are not standard scala collections. 

##########
File path: daffodil-lib/src/main/scala/org/apache/daffodil/xml/QNameBase.scala
##########
@@ -224,8 +224,7 @@ trait QNameBase extends Serializable {
    * must show up in diagnostic messages. Mistakes by having the wrong prefix
    * or by omitting one, are very common.
    */
-  def prefix: Option[String]
-  def prefixOrNull: String = prefix.orNull
+  def lexicalPrefix: Option[String]

Review comment:
       I think your scaladoc isn't complete here.
   
   I think you mean by lexical prefix, the prefix that was taken off of a QName 
reference to a named thing. 
   A QName is a:b where Some("a") is the prefix, or it's just b, in which case 
None is the lexicalPrefix.
   
   One more ambiguity to clarify.
   If a schema defines several prefixes for a namespace, which will be the 
lexicalPrefix?
   
   I've seen lots of things like
   ```
   targetNamespace="http://pcap.com";
   xmlns:pcap="http://pcap.com";
   xmlns:tns="http://pcap.com";
   xmlns="http://pcap.com";
   ```
   So the default namespace is the same as tns is the same as pcap. So a QName 
reference given this default such as
   ```
   <xs:element ref="foobar"/>
   ```
   will be to the pcap.com URL, but what will be the lexical prefix in this 
case? None, or Some("pcap") or Some("tns"), or undetermined, but one of those?
   

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilParseXMLReader.scala
##########
@@ -43,35 +42,28 @@ class DaffodilParseXMLReader(dp: DataProcessor) extends 
DFDL.DaffodilParseXMLRea
   private var errorHandler: ErrorHandler = _
   private var dtdHandler: DTDHandler = _
   private var entityResolver: EntityResolver = _
-  var saxParseResultPropertyValue: ParseResult = _
-  var saxBlobDirectoryPropertyValue: Path = 
Paths.get(System.getProperty("java.io.tmpdir"))
-  var saxBlobPrefixPropertyValue: String = "daffodil-sax-"
-  var saxBlobSuffixPropertyValue: String = ".blob"
-
-  private val featureMap = mutable.Map[String, Boolean](
-    XMLUtils.SAX_NAMESPACES_FEATURE -> false,
-    XMLUtils.SAX_NAMESPACE_PREFIXES_FEATURE -> false
-  )
+  private var saxParseResultPropertyValue: ParseResult = _

Review comment:
       Class needs scaladoc. What is a DaffodilParseXMLReader? It reads data, 
creates SAX events from it, and calls the supplied content handler with the 
events yes?
   
   I am somehow now thinking this class is used only for testing, because it 
constructs and uses (when parse is called) a SAXInfosetOutputter which we use 
to test unparsing.  Is that right? 
   
   I'm sort of losing track of what is test rig and what is core SAX algorithm. 
If this used for testing only, can it live in src/test ?

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilParseOutputStreamContentHandler.scala
##########
@@ -66,13 +88,74 @@ class DaffodilParseOutputStreamContentHandler(out: 
OutputStream, pretty: Boolean
 
   override def startPrefixMapping(prefix: String, uri: String): Unit = {
     val _prefix = if (prefix == "") null else prefix
-    prefixMapping = NamespaceBinding(_prefix, uri, prefixMapping)
+    globalPrefixMapping = NamespaceBinding(_prefix, uri, globalPrefixMapping)
+    currentElementPrefixMapping = NamespaceBinding(_prefix, uri, 
currentElementPrefixMapping)
   }
 
   override def endPrefixMapping(prefix: String): Unit = {
     // do nothing
   }
 
+  /**
+   * Uses Attributes, which is passed in to the startElement callback, to 
extract prefix mappings
+   * and other attributes and return a tuple of an updated globalPrefixMapping 
and a Seq of attribute=val
+   * String pairings
+   */
+  def processAttributes(atts: Attributes, prefixMapping: NamespaceBinding): 
(NamespaceBinding, Seq[String]) = {
+    var pm = prefixMapping
+    var attrPairings: Seq[String] = Seq()
+    var i = 0
+    var mappingsList: Seq[(String, String)] = Seq()
+    while (i < atts.getLength) {
+      val qName = atts.getQName(i)
+      val attrVal =  atts.getValue(i)
+      if (qName.nonEmpty) { //if qName is populated; as in when 
namespacePrefixes == true
+        if (qName.startsWith("xmlns:") || qName == "xmlns")  { // describing 
namespace mapping
+          // get prefix
+          val pre = if (qName.startsWith("xmlns:")) {
+            qName.substring(6)
+          } else {
+            null
+          }
+          // we make this call to check if the prefix already exists. If it 
doesn't exist, we get the
+          // npe so we can add it, but if it does exist, nothing happens and 
it doesn't get re-added

Review comment:
       Type npe here does not mean null pointer exception it is supposed to be 
None I think. 
   Unless this really does throw an NPE which is a really bad API, and we 
should change it. 

##########
File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/DaffodilParseOutputStreamContentHandler.scala
##########
@@ -116,9 +247,16 @@ class DaffodilParseOutputStreamContentHandler(out: 
OutputStream, pretty: Boolean
       }
     }
     writer.write("</")
-    writer.write(qName)
+    outputTagName(uri, localName, qName)
     writer.write(">")
     outputNewlineStack.pop()
+
+    if (!globalPrefixMappingContextStack.isEmpty) 
globalPrefixMappingContextStack.pop
+    if (globalPrefixMappingContextStack.isEmpty) {
+      globalPrefixMapping = null
+    } else {
+      globalPrefixMapping = globalPrefixMappingContextStack.top
+    }

Review comment:
       Ok. I didn't see a single thing in that class that has anything to do 
with DFDL/Daffodil. It's a generic SAX even to text content handler. Can't we 
just use one and not have to maintain this code?
   
   If not, what did I miss?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to