stevedlawrence commented on code in PR #799:
URL: https://github.com/apache/daffodil/pull/799#discussion_r892813211
##########
daffodil-tdml-lib/src/main/scala/org/apache/daffodil/tdml/TDMLRunner.scala:
##########
@@ -2364,6 +2358,53 @@ sealed abstract class DocumentPart(part: Node, parent:
Document) {
}
}.trim.toUpperCase()
+ private lazy val doubleForwardPattern = "//.*".r
+ private lazy val openClosePattern = "/[*](.|\n)*?[*]/".r
+
+ //Accessor for doubleForwardPattern regex
+ def getDoubleForwardPattern() : scala.util.matching.Regex = {
+ return doubleForwardPattern
+ }
+
+ //Accessor for openClosePattern regex
+ def getOpenClosePattern() : scala.util.matching.Regex = {
+ return openClosePattern
+ }
+
+ /*
+ * Allow "//" and "/* */" (inline) to act as comments.
+ * Any valid XML characters not explicitly allowed are also considered
comments and are removed.
+ */
+ def canonicalizeData(validCharactersSet : String, userData : String) :
String = {
+ val noWarnCharsSet = "|()[]."
+ var doWarning: Boolean = false
+ Assert.invariant(!userData.contains('\r')) // \r should not exist in
userData
+
+ //Remove the comments (//) and (/* */)
+ val noCommentsForward = getDoubleForwardPattern().replaceAllIn(userData,"")
+ val noCommentsBothFormats =
getOpenClosePattern().replaceAllIn(noCommentsForward,"")
+
+ //Throw exception if /* or */ still found. This means user input was not
formatted correctly.
+ if(noCommentsBothFormats.contains("/*") ||
noCommentsBothFormats.contains("*/")){
+ throw TDMLException("Improper formatting of /* */ style comment", None)
+ }
+
+ //Split on new line and white spaces - concatenate after
+ val noCommentsString = noCommentsBothFormats.split("\\s").mkString
+
+ //Check value of the characters, if invalid character found create log and
skip over it
+ val validData = noCommentsString.map{ch =>
+ if(validCharactersSet.contains(ch)) ch
+ else { if(!noWarnCharsSet.contains(ch)) doWarning=true; ""}
+ }
Review Comment:
Fine with me, I don't feel strongly about it either. Only suggestion with
the existing stuff is maybe to better match scala style? Add some spaces where
needed, and move the empty string to another line instead of using a semiolon.
Might also switch to filter instead of map?
```scala
val validData = noCommentsString.filter { ch =>
if (validCharactersSet.contains(ch)) true
else {
if (!noWarnCharsSet.contains(ch)) doWarning = true
false
}
}
```
I missed the empty string at the end of the line when first looking at this
and it took me a second to figure out how it worked.
Note, this is just style suggestions, I don't feel that strongly about it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]