[
https://issues.apache.org/jira/browse/DAFFODIL-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mike Beckerle updated DAFFODIL-2870:
------------------------------------
Labels: beginner (was: )
> textNumberPattern negative part should get warning if it specifies ignored
> pattern characters
> ---------------------------------------------------------------------------------------------
>
> Key: DAFFODIL-2870
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2870
> Project: Daffodil
> Issue Type: Improvement
> Components: Front End
> Affects Versions: 3.6.0
> Reporter: Mike Beckerle
> Priority: Major
> Labels: beginner
>
> The dfdl:textNumberPattern property is an ICU number format pattern.
> It has a positive part and an optional negative part, separated by a ";".
> ICU documents that the negative part is used only to define the negative sign
> indication. So for example "#;-#" means a hyphen is used as a minus sign. The
> pattern "#;(#)" means negative values are surrounded by parentheses. Other
> pattern characters can be present, but are ignored except for indicating
> where these sign indicator characters are relative to the digits of the
> number.
> So for example "00000;-#" means -12 would be formatted as -00012, because the
> number of digits is taken from the positive pattern. This pattern means the
> same:
> "00000;-######00000000"
> The only significance of the negative pattern "-######00000000" is to
> indicate that the hyphen/minus appears before the digits. In fact any
> number specifier like "##,###,##0.00###" in a negative pattern is ignored and
> really should just be written as a single "0" or "#" character. Other things
> like the ICU pad character specifier, if they appear in the negative pattern,
> are ignored as well, even though they could be useful there.
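The behavior described above can be observed with a small sketch. Note the assumption: this uses the JDK's java.text.DecimalFormat as a stand-in, not ICU and not Daffodil itself. The JDK class documents the same rule (an explicit negative subpattern only supplies the negative prefix and suffix), but ICU may differ in other details. The object name NegativePatternDemo and the sample patterns are illustrative only.

```scala
import java.text.{DecimalFormat, DecimalFormatSymbols}
import java.util.Locale

object NegativePatternDemo {
  // Assumption: JDK DecimalFormat stands in for ICU's number formatter here.
  // Fixed US symbols keep the minus sign and digits independent of the
  // default locale of the machine running this.
  private val symbols = DecimalFormatSymbols.getInstance(Locale.US)

  /** Formats `value` using `pattern` (positive part, ';', negative part). */
  def format(pattern: String, value: Long): String =
    new DecimalFormat(pattern, symbols).format(value)

  def main(args: Array[String]): Unit = {
    // Both patterns share the positive part "00000". The digit specifiers
    // in the second negative part are not used for formatting.
    val terse = format("00000;-#", -12L)
    val verbose = format("00000;-######00000000", -12L)
    println(s"terse=$terse verbose=$verbose same=${terse == verbose}")

    // Parentheses as the negative indication: prefix "(" and suffix ")".
    println(format("#;(#)", -12L))
  }
}
```

The point of the comparison is that the two negative parts differ only in characters that are never consulted, which is exactly the case a warning would flag.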
> The fact that these are allowed, yet ignored, is unintuitive, misleading, and
> error prone, because users are not going to realize almost everything about
> the negative pattern gets ignored.
> Daffodil should warn if the negative pattern contains anything other than
> a prefix, a single "#" or "0" character, and a suffix.
> The warning message should say the negative pattern is only used to specify
> the prefix and suffix used to indicate negative values. Everything else is
> ignored. Ideally we should parse the negative pattern syntax and point out
> all the ignored parts.
> This warning should be suppressable via the usual WarnID mechanism.
> We should consider having a tunable or property which if set escalates this
> warning to a SchemaDefinitionError.
> Honestly I think the only meaningful negative patterns are probably:
> * "-#"
> * "(#)"
> * "#-"
> With minor variations which insert spaces such as:
> * "- #" (a space was added after the sign)
> * "( # )" (spaces between digits and parens)
> * "# -" (a space before the trailing sign)
> Here's an example of a complex dfdl:textNumberPattern which makes the point
> that the negative pattern is just a kind of trivial tail end.
> ```
> dfdl:textNumberPattern="+ *x#, ###,##0.00;- #"
> ```
> The only contribution of the negative part of that pattern is that "- "
> (hyphen and space) is used as the prefix for negative values. The rest all
> comes from the positive pattern.
> The value -1234.5 would unparse as `"- xxxx1,234.50"`
> If instead the user writes:
> ```
> dfdl:textNumberPattern="+ *x#, ###,##0.00;- *x#,###,##0.00"
> ```
> The warning should be issued and state that the negative part of this pattern
> is mostly ignored. Only the `"- "` is significant, and the negative part
> could be just `"- #"`, so the whole pattern can be shortened to
> `"+ *x#, ###,##0.00;- #"`.
> This requires a simple parse of the negative pattern to identify the
> significant parts, but this is quite easy. (Look up Scala regex pattern
> matching.)
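The simple parse suggested above might look roughly like the following. This is a minimal sketch under stated assumptions, not Daffodil code: the object NegativePatternCheck and its check method are hypothetical names, the set of characters treated as the number specifier is a simplification of ICU's pattern syntax, and quoted literals (for example a quoted ';' or a letter E inside a literal prefix) are not handled. Hooking the result into Daffodil's WarnID mechanism is left out entirely.

```scala
object NegativePatternCheck {
  // Simplified shape of a negative part: prefix, number specifier, suffix.
  // Assumption: the specifier is any run of '#', digits, ',', '.', '@', 'E'.
  private val Shape = """([^#0-9,.@E]*)([#0-9,.@E]+)([^#0-9,.@E]*)""".r

  // ICU pad specifier: '*' followed by the pad character.
  private val Pad = """\*.""".r

  private def stripPad(affix: String): String = Pad.replaceAllIn(affix, "")

  /** Returns Some(warning text) if the negative part of the pattern contains
    * ignored characters, or None if there is nothing to warn about.
    */
  def check(textNumberPattern: String): Option[String] =
    textNumberPattern.split(";", 2) match {
      case Array(_, negative) =>
        negative match {
          case Shape(prefix, core, suffix) =>
            val minimalCore = core == "#" || core == "0"
            val noPad = stripPad(prefix) == prefix && stripPad(suffix) == suffix
            if (minimalCore && noPad) None
            else {
              // Keep only the parts that matter: prefix, one '#', suffix.
              val simplified = stripPad(prefix) + "#" + stripPad(suffix)
              Some(
                s"""The negative part "$negative" is only used for its prefix and suffix; """ +
                  s"""everything else is ignored. It could be written as "$simplified"."""
              )
            }
          case _ =>
            Some(s"""The negative part "$negative" could not be analyzed.""")
        }
      case _ => None // no negative part at all
    }
}
```

With this sketch, check("+ *x#, ###,##0.00;- #") yields None, while check("+ *x#, ###,##0.00;- *x#,###,##0.00") yields a message suggesting "- #" as the negative part. Returning the message as an Option keeps the decision of warning versus SchemaDefinitionError with the caller.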
--
This message was sent by Atlassian Jira
(v8.20.10#820010)