stevedlawrence commented on code in PR #911:
URL: https://github.com/apache/daffodil/pull/911#discussion_r1068154928


##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -170,13 +279,23 @@ trait ConvertTextNumberMixin {
   final protected def pattern = e.textNumberPattern
 
   /**
-   * Checks a pattern for suitability with V (virtual decimal point) in the 
pattern
-   * in the context of the corresponding textNumberRep. Computes number of 
digits to right of the V.
+   * Analogous to the property dfdl:binaryDecimalVirtualPoint
+   *
+   * Value is 0 if there is no virtual decimal point.
+   *
+   * Value is the number of digits to the right of the 'V' for V patterns.
    *
-   * SDE if pattern has a syntax error.
+   * For P patterns, the value is the number of P characters when
+   * the P chars are on the right of the digits, or it is negated if the
+   * P chars are on the left of the digits.
    *
-   * @return the number of digits to the right of the V character. Must be 1 
or greater.
+   * Examples:
    *
+   * "000.00" returns 2, so text "12345" yields 123.45

Review Comment:
   Should this be `000V00`?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -80,14 +99,72 @@ object TextNumberPatternUtils {
      // don't forget the ^ and $ (start of data, end of data) because we want
      // the match to consume all the characters, starting at the beginning.
     val vPattern=
-     
s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}(?:;${negPrefix}${negBeforeV}V${negAfterV}${negSuffix})?$$"""
+     s"""^${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}""" ++
+       s"""(?:;${negPrefix}${negBeforeV}(?:V${negAfterV}${negSuffix})?)?$$"""

Review Comment:
   I know this is just a long-line change, but I missed in the V PR. If there 
is a negative pattern, them the `V`, after V, and suffix combined are optional? 
So for example, this pattern is valid `+00V00;-00`. Is this intentional?
   
   My reading of the spec, it seems the positive and negative patterns should 
be the exact same pattern, just with the entire negative pattern being 
optional. Almost something like
   ```scala
   val subpattern = s"""${posPrefix}${posBeforeV}V${posAfterV}${posSuffix}"""
   val vPattern = s"""^${subpattern}(?:;${subpattern})?$$"""
   ```
   
   I noticed in the P pattern you don't allow that optionality.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -275,17 +410,67 @@ case class ConvertTextStandardNumberPrim(e: ElementBase)
   extends Terminal(e, true)
   with ConvertTextNumberMixin {
 
+
   final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
-    val r = TextNumberPatternUtils.vregexStandard
-    r.findFirstMatchIn(patternStripped) match {
-      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
-      case None =>
-        e.SDE(
-          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal 
point).
-             | Other than the sign indicators, it can contain only
-             | '#', then digits 0-9 then 'V' then digits 0-9.
-             | The positive part of the dfdl:textNumberPattern is 
mandatory.""".stripMargin('|'),
+    if (hasV) {
+      val r = TextNumberPatternUtils.vRegexStandard
+      r.findFirstMatchIn(patternNoQuoted) match {
+        case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+        case None =>
+          e.SDE(
+            s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal 
point).
+               | Other than the sign indicators, it can contain only
+               | '#', then digits 0-9 then 'V' then digits 0-9.
+               | The positive part of the dfdl:textNumberPattern is 
mandatory.""".stripMargin('|'),
+            pattern)
+      }

Review Comment:
   Just trying to think about edge cases, assume a pattern is invalid and it 
has both `V` and `P`, e.g. `P00V00`. In this case both `hasV` and `hasP` will 
be true, and we'll fall into this `hasV` block, the regex will fail and we'll 
say it's not a valid V pattern? Seems like a reasonable way to error, just 
double checking that I'm understanding this all right.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -275,17 +410,67 @@ case class ConvertTextStandardNumberPrim(e: ElementBase)
   extends Terminal(e, true)
   with ConvertTextNumberMixin {
 
+
   final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
-    val r = TextNumberPatternUtils.vregexStandard
-    r.findFirstMatchIn(patternStripped) match {
-      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
-      case None =>
-        e.SDE(
-          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal 
point).
-             | Other than the sign indicators, it can contain only
-             | '#', then digits 0-9 then 'V' then digits 0-9.
-             | The positive part of the dfdl:textNumberPattern is 
mandatory.""".stripMargin('|'),
+    if (hasV) {
+      val r = TextNumberPatternUtils.vRegexStandard
+      r.findFirstMatchIn(patternNoQuoted) match {
+        case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+        case None =>
+          e.SDE(
+            s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal 
point).
+               | Other than the sign indicators, it can contain only
+               | '#', then digits 0-9 then 'V' then digits 0-9.
+               | The positive part of the dfdl:textNumberPattern is 
mandatory.""".stripMargin('|'),
+            pattern)
+      }
+    } else if (hasP) {
+      val rr = TextNumberPatternUtils.pOnRightRegexStandard
+      val rl = TextNumberPatternUtils.pOnLeftRegexStandard
+      val rightMatch = rr.findFirstMatchIn(patternNoQuoted)
+      val leftMatch = rl.findFirstMatchIn(patternNoQuoted)
+      (leftMatch, rightMatch) match {
+        case (None, None) => e.SDE(
+        """The dfdl:textNumberPattern '%s' contains 'P' (virtual decimal point 
positioners).
+            |However, it did not match the allowed syntax which allows the 
sign indicator
+            |plus digits on only one side of the P symbols.""".stripMargin,
           pattern)
+        case (Some(rl(_, ps, digits, _, negPre, negPs, negDigits, _)), None) 
=> {
+          checkPPPMatchSyntax(ps, digits, negPre, negPs, negDigits)
+          ps.length + digits.length
+        }
+        case (None, Some(rr(_, digits, ps, _, negPre, negDigits, negPs, _))) 
=> {
+          checkPPPMatchSyntax(ps, digits, negPre, negPs, negDigits)
+          - ps.length // negate value.
+        }
+        case _ => Assert.invariantFailed("Should not match both left P and 
right P regular expressions.")
+      }
+    } else {
+      0 // there is no V nor P, so there is no virtual decimal point scaling 
involved.
+    }
+  }
+
+  /**
+   * Check for common errors in the pattern matches for 'P' patterns.
+   *
+   * @param ps        match string for positive pattern P characters
+   * @param digits    match string for positive pattern digits (and sharps)
+   * @param negPre    negative prefix match (null if there is no negative 
pattern at all)
+   * @param negPs     match string for negative pattern P character (or null)
+   * @param negDigits match string for negative pattern P digits (and sharps)
+   */
+  private def checkPPPMatchSyntax(ps: String, digits: String, negPre: String, 
negPs: String, negDigits: String): Unit = {
+    Assert.invariant(ps.length >= 1)
+    digits.length >= 1

Review Comment:
   Was this intended to be an `Assert(digits.length >= 1)`. I'm surprises Scala 
didn't give a compiler warning about this line not doing anything. Maybe it 
doesn't know if `.length` has side effects or not?



##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -89,21 +92,52 @@ trait TextDecimalVirtualPointMixin {
   }
   // $COVERAGE-ON$
 
+
+  /**
+   * Always creates an integer from a JFloat, JDouble, or JBigDecimal
+   * by scaling the argument by the virtual decimal point scaling factor.
+   *
+   * If the argument is some other JNumber, i.e., not a JFloat, JDouble, or 
JBigDecimal
+   * it can only be an integer type. This method this returns the argument 
value

Review Comment:
   Should be "This method returns"



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -186,41 +305,57 @@ trait ConvertTextNumberMixin {
    * but cannot be used as an operational pattern given that
    * potentially lots of things have been removed from it.
    */
-  final protected lazy val patternStripped = {
+  final protected lazy val patternNoQuoted = {
     // note: tick == apos == ' == single quote.
     // First remove entirely all escaped ticks ie., ''
     val noEscapedTicksRegex = """''""".r
     val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
     // Next remove all tick-escaped characters entirely
     val noQuotedRegex = """'[^']+'""".r
-    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    val res = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
     // the remaining string contains only pattern special characters
     // and regular non-pattern characters
-    patternNoQuoted
+    res
   }
 
-  protected final lazy val hasV = patternStripped.contains("V")
-  protected final lazy val hasP = patternStripped.contains("P")
+  protected final lazy val hasV = patternNoQuoted.contains("V")
+  protected final lazy val hasP = patternNoQuoted.contains("P")
 
   /**
-   * analogous to the property dfdl:binaryDecimalVirtualPoint
+   * Analogous to the property dfdl:binaryDecimalVirtualPoint
    *
    * Value is 0 if there is no virtual decimal point.
-   * Value is the number of digits to the right of the 'V'
+   *
+   * Value is the number of digits to the right of the 'V' for V patterns.
+   *
+   * For P patterns, the value is the number of P characters when
+   * the P chars are on the right of the digits, or it is negated if the
+   * P chars are on the left of the digits.
+   *
+   * Examples:
+   *
+   * "000.00" returns 2, so text "12345" yields 123.45

Review Comment:
   Same question here, should this be `000V00`?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -186,41 +305,57 @@ trait ConvertTextNumberMixin {
    * but cannot be used as an operational pattern given that
    * potentially lots of things have been removed from it.
    */
-  final protected lazy val patternStripped = {
+  final protected lazy val patternNoQuoted = {
     // note: tick == apos == ' == single quote.
     // First remove entirely all escaped ticks ie., ''
     val noEscapedTicksRegex = """''""".r
     val patternNoEscapedTicks = noEscapedTicksRegex.replaceAllIn(pattern, "")
     // Next remove all tick-escaped characters entirely
     val noQuotedRegex = """'[^']+'""".r
-    val patternNoQuoted = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
+    val res = noQuotedRegex.replaceAllIn(patternNoEscapedTicks, "")
     // the remaining string contains only pattern special characters
     // and regular non-pattern characters
-    patternNoQuoted
+    res
   }
 
-  protected final lazy val hasV = patternStripped.contains("V")
-  protected final lazy val hasP = patternStripped.contains("P")
+  protected final lazy val hasV = patternNoQuoted.contains("V")
+  protected final lazy val hasP = patternNoQuoted.contains("P")
 
   /**
-   * analogous to the property dfdl:binaryDecimalVirtualPoint
+   * Analogous to the property dfdl:binaryDecimalVirtualPoint
    *
    * Value is 0 if there is no virtual decimal point.
-   * Value is the number of digits to the right of the 'V'
+   *
+   * Value is the number of digits to the right of the 'V' for V patterns.
+   *
+   * For P patterns, the value is the number of P characters when
+   * the P chars are on the right of the digits, or it is negated if the
+   * P chars are on the left of the digits.
+   *
+   * Examples:
+   *
+   * "000.00" returns 2, so text "12345" yields 123.45
+   *
+   * "PP000" returns 2, so text "123" yields 0.00123
+   *
+   * "000PP" returns -2 so text "123" yields 12300.0
    */
   final lazy val textDecimalVirtualPoint: Int = {
-    if (!hasV && !hasP) 0 // no virtual point
-    else {
-      if (hasP) {
-        e.notYetImplemented("textNumberPattern with P symbol")
-      }
-      Assert.invariant(hasV)
+    lazy val virtualPoint = textDecimalVirtualPointFromPattern
+    if (hasV) {
       //
       // check for things incompatible with "V"
       //
-      val virtualPoint = textDecimalVirtualPointFromPattern
       Assert.invariant(virtualPoint >= 1) // if this fails the regex is broken.
       virtualPoint
+    } else if (hasP) {
+      //
+      // check for things incompatible with "V"

Review Comment:
   Should be `incompatiable with "P"` instead of "V"?



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -114,21 +191,66 @@ object TextNumberPatternUtils {
    * Checks a pattern for suitability with V (virtual decimal point) in the 
pattern
    * in the context of zoned textNumberRep.
    *
-   * This is broken out separately for unit testing purposes.
+   * This method is here in this object for unit testing purposes.
    *
    * @param patternStripped the dfdl:textNumberPattern pattern - with all 
quoting removed.
    * @return None if the pattern is illegal syntax for use with V (virtual 
decimal point)
    *         otherwise Some(N) where N is the number of digits to the right of 
the V character.
    */
-  private[primitives] def textDecimalVirtualPointForZoned(patternStripped: 
String): Option[Int] = {
-    val r = TextNumberPatternUtils.vregexZoned
+  def textNumber_V_DecimalVirtualPointForZoned(patternStripped: String): 
Option[Int] = {

Review Comment:
   Since this is no longer `private[primitives]`, does it make sense to move 
the regex match logic to the `textDecimalVirtualPointFromPattern` val in 
`PrimivitesZoned.scala`? That would make the different 
`textDecimalVirtualPointFromPattern` implementations look more similar instead 
of one calling this function and the other having the logic inline.



##########
daffodil-core/src/main/scala/org/apache/daffodil/grammar/primitives/PrimitivesTextNumber.scala:
##########
@@ -275,17 +410,67 @@ case class ConvertTextStandardNumberPrim(e: ElementBase)
   extends Terminal(e, true)
   with ConvertTextNumberMixin {
 
+
   final override protected lazy val textDecimalVirtualPointFromPattern: Int = {
-    val r = TextNumberPatternUtils.vregexStandard
-    r.findFirstMatchIn(patternStripped) match {
-      case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
-      case None =>
-        e.SDE(
-          s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal 
point).
-             | Other than the sign indicators, it can contain only
-             | '#', then digits 0-9 then 'V' then digits 0-9.
-             | The positive part of the dfdl:textNumberPattern is 
mandatory.""".stripMargin('|'),
+    if (hasV) {
+      val r = TextNumberPatternUtils.vRegexStandard
+      r.findFirstMatchIn(patternNoQuoted) match {
+        case Some(r(_, _, afterV, _, _, _, _, _)) => afterV.length
+        case None =>
+          e.SDE(
+            s"""The dfdl:textNumberPattern '%s' contains 'V' (virtual decimal 
point).
+               | Other than the sign indicators, it can contain only
+               | '#', then digits 0-9 then 'V' then digits 0-9.
+               | The positive part of the dfdl:textNumberPattern is 
mandatory.""".stripMargin('|'),
+            pattern)
+      }
+    } else if (hasP) {
+      val rr = TextNumberPatternUtils.pOnRightRegexStandard
+      val rl = TextNumberPatternUtils.pOnLeftRegexStandard
+      val rightMatch = rr.findFirstMatchIn(patternNoQuoted)
+      val leftMatch = rl.findFirstMatchIn(patternNoQuoted)
+      (leftMatch, rightMatch) match {
+        case (None, None) => e.SDE(
+        """The dfdl:textNumberPattern '%s' contains 'P' (virtual decimal point 
positioners).
+            |However, it did not match the allowed syntax which allows the 
sign indicator
+            |plus digits on only one side of the P symbols.""".stripMargin,
           pattern)
+        case (Some(rl(_, ps, digits, _, negPre, negPs, negDigits, _)), None) 
=> {
+          checkPPPMatchSyntax(ps, digits, negPre, negPs, negDigits)
+          ps.length + digits.length
+        }
+        case (None, Some(rr(_, digits, ps, _, negPre, negDigits, negPs, _))) 
=> {
+          checkPPPMatchSyntax(ps, digits, negPre, negPs, negDigits)
+          - ps.length // negate value.
+        }
+        case _ => Assert.invariantFailed("Should not match both left P and 
right P regular expressions.")
+      }
+    } else {
+      0 // there is no V nor P, so there is no virtual decimal point scaling 
involved.
+    }
+  }
+
+  /**
+   * Check for common errors in the pattern matches for 'P' patterns.
+   *
+   * @param ps        match string for positive pattern P characters
+   * @param digits    match string for positive pattern digits (and sharps)
+   * @param negPre    negative prefix match (null if there is no negative 
pattern at all)
+   * @param negPs     match string for negative pattern P character (or null)
+   * @param negDigits match string for negative pattern P digits (and sharps)
+   */
+  private def checkPPPMatchSyntax(ps: String, digits: String, negPre: String, 
negPs: String, negDigits: String): Unit = {
+    Assert.invariant(ps.length >= 1)
+    digits.length >= 1
+    if (negPre ne null) {
+      e.schemaDefinitionUnless(ps == negPs,
+        """In the dfdl:textNumberPattern '%s' the P Symbols do not match
+          | between positive and negative patterns""".stripMargin,
+        pattern)
+      e.schemaDefinitionUnless(digits == negDigits,
+        """In the dfdl:textNumberPattern '%s' the '#' and digit symbols do not 
match
+          | between positive and negative patterns""".stripMargin,
+        pattern)
     }

Review Comment:
   Looking back at the spec I saw this line:
   
   > If there is an explicit negative subpattern, it serves only to specify the 
negative prefix and suffix; the number of digits, minimal digits, and other 
characteristics are ignored in the negative subpattern
   
   Which gives me two questions:
   1. Should we allow the negative and positive P's to be different? Seems like 
it's probably a user doing something they didn't intend if that's the case so 
maybe it's is worth an SDE as you have. Or at the very least this should be a 
warning.
   2. Similarly, should the V and P regex patterns be much more lax about what 
they allow in the negative part since it's ignored? Again, if they don't match 
it probably an accident and an SDE seems very reasonable, but maybe there are 
schemas that will just want to keep the textNumberPattern as simple as possible 
and do something like `+000V0000;-0` so that they can specific the virtual 
decimal point in the postive pattern, and then just the prefix/suffix in the 
negative pattern and not worry about them having to match.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to