mbeckerle commented on a change in pull request #570:
URL: https://github.com/apache/daffodil/pull/570#discussion_r644056891
##########
File path:
daffodil-core/src/main/scala/org/apache/daffodil/dsom/RuntimePropertyMixins.scala
##########
@@ -654,6 +655,36 @@ trait SequenceRuntimeValuedPropertiesMixin
}
}
+ def checkDelimiterEscapeConflict(childTerm: Term): Unit = {
+ if (childTerm.optionEscapeScheme.isDefined &&
+ (childTerm.optionEscapeScheme.get.escapeKind ==
EscapeKind.EscapeCharacter) &&
+ (hasTerminator || hasSeparator)) {
+ val ecEv = childTerm.optionEscapeScheme.get.escapeCharacterEv
+ if (ecEv.isConstant) {
+ val ec = ecEv.constValue
+ var delims = Set[String]()
+
+ val termEv = this.terminatorParseEv
+ if (termEv.isConstant)
+ delims = delims ++ termEv.constValue.map { _.lookingFor }
+
+ val sepEv = this.separatorParseEv
+ if (sepEv.isConstant)
+ delims = delims ++ sepEv.constValue.map { _.lookingFor }
+
+ if (delims.exists { _.startsWith(ec) } )
+ SDE("The dfdl:terminator and dfdl:separator may not begin with the
dfdl:escapeCharacter: '%s'.", ec)
+
+ /*val maybeEecEv =
childTerm.optionEscapeScheme.get.optionEscapeEscapeCharacterEv
+ if (maybeEecEv.isDefined && maybeEecEv.get.isConstant) {
+ val eec = maybeEecEv.get.constValue
+ if (delims.exists { _.startsWith(eec) } )
+ SDE("The dfdl:terminator and dfdl:separator may not begin with the
dfdl:escapeEscapeCharacter: '%s'.", eec)
Review comment:
Two points:
(1) allowing the EEC to be the same as the start of a delimiter enables
what? What obscure format is possible that isn't otherwise - The only example I
can contrive is that a format that uses "/" separators could be used as an
encapsulation around a format that uses "/" as the EEC. So it's a composition
property. I have never seen anything like that, but it may exist somewhere, or
a need for it could arise.
(2) allowing these sorts of overlaps usually just makes bugs or oversights
hard for users to spot. We're not doing anybody a favor by allowing this if
there is in fact no actual use case.
The usual decision on these sorts of things is to disallow, knowing we could
relax the restriction later. This is conservative in that it preserves freedom
to change in the future.
The only issue in this case is we're adding this restriction late.
I suggest we add the restriction with the understanding that we might have
to back this out, should compatibility issues arise. If those do, helping users
reformulate their schemas so as to side-step the problem would be good to try
first.
We do have a task, I started on it, but set it aside long ago, to run the
Daffodil test suite against IBM DFDL so as to narrow down on test cases that
demonstrate differences in behavior. Probably we should pick and choose areas
of it to cross test instead of trying to do all of it. Just delimiter and
escape related tests is one such area. That ticket is DAFFODIL-2028. The last
commit on the currently parked branch for that says 140 tests fail from
daffodil-test. The issues blocking are mostly date/time simple type-related
stuff, so it's possible the tests for separators and escapes will cross test
now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]