stevedlawrence commented on pull request #481:
URL: https://github.com/apache/daffodil/pull/481#issuecomment-800316354


   If we look at Cookers.scala, we see these lines:
   ```scala
   object TerminatorCookerNoES extends DelimiterCookerNoES("terminator")
   
   object SeparatorCooker extends DelimiterCookerNoES("separator")
   ```
   So ``TerminatorCookerNoES`` (which is used for terminators when 
lengthKind="delimited") and ``SeparatorCooker`` both use 
``DelimiterCookerNoES``. But if we look at the specification, terminators and 
separators actually have slightly different restrictions:
   
   **dfdl:separator**
   > * DFDL Character Classes NL, WSP, WSP+, and WSP* are allowed.
   > * The WSP* entity cannot appear on its own as one of the string literals 
in the list when determining the length of a component by scanning for 
delimiters.
   
   (Note: I'm not sure why dfdl:separator allows ES on its own if WSP* isn't 
allowed. I think that's a bug in the spec and ES is just missing?)
   
   **dfdl:terminator**
   > * DFDL Character Classes NL, WSP, WSP+, WSP*, and ES are allowed.
   > * Neither the ES entity nor the WSP* entity may appear on their own as one 
of the string literals in the list when the parser is determining the length of 
a component by scanning for delimiters.
   
   So separator needs a cooker that always disallows ES and only disallows WSP* 
(and probably ES too) when it's alone (what you have now).
   
   But terminator needs a cooker that allows all character classes (including 
ES), and disallows both WSP* and ES when they're alone.
   
   So I think we just need two delimiter Cookers. Maybe the one for separator 
is called ``DelimiterCookerNoES`` (what you have now, but with what I think is 
the missing ES restriction added), and the other is called 
``DelimiterCookerNoSoleES``, which is the old ``DelimiterCookerNoES`` (no 
restriction on character classes; the only restriction is on what may appear 
alone).
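   
   To make the difference between the two concrete, here is a rough, 
standalone sketch of the two rule sets. It is not the real Cookers.scala API: 
``DelimiterRulesSketch``, ``Rule``, ``check``, ``noES`` and ``noSoleES`` are 
names I made up for illustration, and it ignores ``%%`` escaping and all other 
entity kinds.
   ```scala
   // Hypothetical sketch only; none of these names exist in Daffodil.
   object DelimiterRulesSketch {
     // Character class entities named in the spec text quoted above.
     val allClasses: Set[String] = Set("NL", "WSP", "WSP+", "WSP*", "ES")

     // allowed:  classes that may appear anywhere in a string literal.
     // notAlone: classes that may not be the entire string literal.
     final case class Rule(propName: String, allowed: Set[String], notAlone: Set[String]) {
       // Matches entities such as %NL; %WSP+; %WSP*; %ES;
       private val entity = "%([A-Za-z]+[+*]?);".r

       def check(literal: String): Either[String, String] = {
         val used = entity.findAllMatchIn(literal)
           .map(_.group(1))
           .filter(allClasses.contains)
           .toList
         // Some(c) only when the literal is exactly one character class entity.
         val sole = used match {
           case c :: Nil if literal == s"%$c;" => Some(c)
           case _ => None
         }
         used.find(c => !allowed.contains(c)) match {
           case Some(c) => Left(s"$propName: $c is not allowed")
           case None =>
             sole.filter(notAlone.contains) match {
               case Some(c) => Left(s"$propName: $c cannot appear on its own")
               case None => Right(literal)
             }
         }
       }
     }

     // Separator: ES never allowed; WSP* (and probably ES) not allowed alone.
     val noES = Rule("separator", allClasses - "ES", Set("WSP*", "ES"))

     // Terminator with lengthKind="delimited": everything allowed,
     // but WSP* and ES not allowed alone.
     val noSoleES = Rule("terminator", allClasses, Set("WSP*", "ES"))
   }
   ```
   With this sketch, ``noES.check(";%ES;")`` is rejected while 
``noSoleES.check(";%ES;")`` is accepted, and both reject a bare ``%ES;`` or 
``%WSP*;``.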
   
   Note, I wonder if it's also a bug that separator doesn't allow ES? I feel 
like terminator and separator should have the same restrictions when 
length kind is delimited? They are both terminating markup in that case, and I 
would expect them to behave the same. Maybe the original DelimiterCookerNoES 
is the right thing?
   
   Going to need @mbeckerle to weigh in on this.
    


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

