mbeckerle commented on code in PR #1195:
URL: https://github.com/apache/daffodil/pull/1195#discussion_r1541102710


##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/dfa/Runtime.scala:
##########
@@ -133,6 +134,17 @@ trait DFAField extends DFA {
   final override def run(r: Registers): Unit = runLoop(r, DFA.EndOfData, 
StateKind.EndOfData)
 }
 
+object DFADelimiter {
+  private val controlOrWhitespace = "\\p{C}|\\p{Z}".r
+
+  private def containsCtrlOrWS(s: String) = 
controlOrWhitespace.findFirstMatchIn(s).isDefined
+
+  def strForDiagnostic(s: String) =
+    if (containsCtrlOrWS(s))

Review Comment:
   My thought was not about performance. This is only invoked when we're about 
to cause an error though as a PE it could get backtracked, so performance is an 
issue. This was mostly to avoid showing a second version of the characters 
which convert whitespace and control characters to visible things but not 
annoying the user with that if they aren't present. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/dfa/Runtime.scala:
##########
@@ -147,6 +159,9 @@ trait DFADelimiter extends DFA {
   final val isES = lookingFor == "%ES;"
 
   def unparseValue: String
+
+  def strForDiagnostic = DFADelimiter.strForDiagnostic(lookingFor)

Review Comment:
   Control characters like NUL and Ctrl-B, STX/ETX, STT, EOT, ESC, DEL, etc. 
are also used as delimiters quite a bit. 
   
   ALso, I think you can do this: dfdl:terminator="  	" which is a 
space and a tab character. XML allows it. I am not sure DFDL won't normalize 
that away and reject the definition, but either way you can have other 
quasi-whitespace control chars that would be confusing to display. 



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/DelimiterParsers.scala:
##########
@@ -73,6 +76,35 @@ class DelimiterTextParser(
     foundLocalDFAIndex >= 0
   }
 
+  private def localDelimiters(state: PState): Seq[DFADelimiter] = {
+    var foundLocalDFAIndex = -1
+    val localIndexStart = state.mpstate.delimitersLocalIndexStack.top
+    val inScopeDelimiters = state.mpstate.delimiters
+    val res = inScopeDelimiters.slice(localIndexStart, 
inScopeDelimiters.length)
+    res
+  }
+
+  private def didNotFindExpectedDelimiter(foundDelimiter: ParseResult, start: 
PState): Unit = {
+    val delimString = foundDelimiter.matchedDelimiterValue.get
+    val localDelims = localDelimiters(start)
+    val foundDFA = foundDelimiter.matchedDFAs(0)
+    PE(
+      start,
+      s"""Found enclosing delimiter: %s during scan for local delimiter(s): %s.
+         | The expected delimiter(s) were: %s.
+         | The enclosing delimiter was from %s.
+         |""".stripMargin,
+      foundDFA.strForDiagnostic,
+      localDelims.map { d => d.strForDiagnostic }.mkString(", "),
+      localDelims

Review Comment:
   I was thinking getting what was not found and what was found instead onto 
the first line of the message would be helpful with other details on later 
lines. 
   



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/DelimiterParsers.scala:
##########
@@ -73,6 +76,35 @@ class DelimiterTextParser(
     foundLocalDFAIndex >= 0
   }
 
+  private def localDelimiters(state: PState): Seq[DFADelimiter] = {
+    var foundLocalDFAIndex = -1
+    val localIndexStart = state.mpstate.delimitersLocalIndexStack.top
+    val inScopeDelimiters = state.mpstate.delimiters
+    val res = inScopeDelimiters.slice(localIndexStart, 
inScopeDelimiters.length)
+    res
+  }
+
+  private def didNotFindExpectedDelimiter(foundDelimiter: ParseResult, start: 
PState): Unit = {
+    val delimString = foundDelimiter.matchedDelimiterValue.get
+    val localDelims = localDelimiters(start)
+    val foundDFA = foundDelimiter.matchedDFAs(0)
+    PE(
+      start,
+      s"""Found enclosing delimiter: %s during scan for local delimiter(s): %s.
+         | The expected delimiter(s) were: %s.
+         | The enclosing delimiter was from %s.
+         |""".stripMargin,
+      foundDFA.strForDiagnostic,
+      localDelims.map { d => d.strForDiagnostic }.mkString(", "),
+      localDelims
+        .map { d =>
+          s"  ${d.delimType.toString} ${d.strForDiagnostic} from location 
${d.location}."
+        }
+        .mkString("\n", "\n", ""),
+      foundDFA.location,

Review Comment:
   Ah. I was wondering why I wasn't getting file/line num info. I though it 
might just be due to TDML tests with embedded schema.
   
   Will fix. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to