stevedlawrence commented on a change in pull request #253: Support terminators 
and initiators containing %ES;
URL: https://github.com/apache/incubator-daffodil/pull/253#discussion_r298582978
 
 

 ##########
 File path: 
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/DelimiterParsers.scala
 ##########
 @@ -74,57 +74,20 @@ class DelimiterTextParser(
     return false
   }
 
-  private def findES(delimIter: DelimiterIterator): Maybe[DFADelimiter] = {
-    delimIter.reset()
-    while (delimIter.hasNext()) {
-      val d = delimIter.next()
-      if (d.lookingFor == "%ES;") {
-        return One(d)
-      }
-    }
-    Nope
-  }
-
-  private def findLocalES(state: PState): Maybe[DFADelimiter] = {
-    val delimIter = new LocalTypedDelimiterIterator(delimiterType, 
state.mpstate.delimiters, state.mpstate.delimitersLocalIndexStack)
-    findES(delimIter)
-  }
+  override def parse(start: PState): Unit = {
 
-  private def findRemoteES(state: PState): Maybe[DFADelimiter] = {
-    val delimIter = new RemoteTypedDelimiterIterator(delimiterType, 
state.mpstate.delimiters, state.mpstate.delimitersLocalIndexStack)
-    findES(delimIter)
-  }
+    val maybeDelimIter =
+      if (delimiterType == DelimiterTextType.Terminator && !isDelimited) {
+        Maybe(new LocalTypedDelimiterIterator(delimiterType, 
start.mpstate.delimiters, start.mpstate.delimitersLocalIndexStack.top))
+      } else if (delimiterType == DelimiterTextType.Initiator || 
!start.delimitedParseResult.isDefined) {
+        Maybe(new 
RemoteTerminatingMarkupAndLocalTypedDelimiterIterator(delimiterType, 
start.mpstate.delimiters, start.mpstate.delimitersLocalIndexStack.top))
+      } else {
+        Nope
+      }
 
-  override def parse(start: PState): Unit = {
     val foundDelimiter =
-      if (delimiterType == DelimiterTextType.Initiator || 
!start.delimitedParseResult.isDefined) {
-        //
-        // We are going to scan for delimiter text.
-        //
-        // FIXME: This is incorrect. It grabs all local delimiters even if 
we're last in a group so the
 
 Review comment:
   Although I've confirmed that this bug does still exist, I think this comment 
is wrong, or at least it's in the wrong place. The purpose of DelimiterParser 
to to just consume delimiters, and it appears to do scana/consume the correct  
delimiters correctly. For example, if I take the above schema and make all the 
lengths explicit, I've confirmed that it scans for what you would expect--"foo" 
scans only for a semicolon, "bar" scans for both the semicolon and comma, and 
"baz" scans for just the comma (no scanning is required for "quux").
   
   However, I have confirmed that things fail when lengths are delimited. In 
fact, no actual scanning even happens in DelmiterParser in that case. All that 
scanning is done in StringDelimitedParser, which caches the result and then 
DelimiterParser just determines if i found an out of scope delimiter. So 
there's definitely a bug in StringDelimitedParser scanning for the wrong 
delimters. There might still be a bug somewhere in this parser, but that's 
where things start to go off the rails. And it's possible something is broken 
with how we build the delimiter stack.
   
   So I think the issues are fundamental and spread out, and not just contained 
in this parser. So I'm not sure these comment belongs here any more, and I 
rather just have this information in a bug where we'll see it and keep track of 
it.
   
   I'll add these two test (explicit lengths and delimited length scanning) and 
open new bug that has all the relevant information.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to