stevedlawrence commented on code in PR #1138:
URL: https://github.com/apache/daffodil/pull/1138#discussion_r1447647917
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/EvTextNumber.scala:
##########
@@ -186,6 +186,31 @@ class TextNumberFormatEv(
}
}
+ // If textNumberPattern specifies a pad character before the number
pattern and without a
+ // positive prefix, then ICU defaults to a pad position of
PAD_BEFORE_PREFIX with no way to
+ // change it with just the pattern. This is reasonable for most cases,
like when the pad
+ // character is a space. However, if the pad character in
textNumberPattern is '0', then
+ // negative numbers are padded with a '0' before the negative sign. For
example, a pattern
+ // of "*0####0" unparses -123 to "0-123". This is very unlikely to be what
the user wants
+ // with this pattern.
+ //
+ // So in this very specific case, we change the pad position to
PAD_AFTER_PREFIX so the zero
+ // pad character appears after the negative sign, e.g. "-0123". Note that
the check for
+ // format width > 0 is used to test if padding is enabled at all--there is
no specific API
+ // for this, and this is how ICU determines to add padding or not.
+ //
+ // If a user really wants '0' characters to the left of the negative sign,
they can use
+ // textPadKind/textTrimKind and textNumberPadCharacter to uses Daffodils
padding logic
+ // instead of ICUs.
+ if (
+ df.getFormatWidth > 0 &&
+ df.getPadCharacter == '0' &&
Review Comment:
Good points. Though I'm not sure we can remove the '0' check, if we did that
I think we couldn't format numbers with padding to the left of the number like
` (12.34)`.
I wonder if maybe the simplest solution is a new dfdl extension property?
Something like `dfdlx:textNumberPatternPadPosition`, with values
`beforePrefix`, `afterPrefix`, `beforeSuffix`, `afterSuffix`, and `fromPattern`
as the default for current behavior? Then we can easily support those other
padding styles without having to guess what the user wanted. And it keeps all
the existing ICU behavior (as weird as it is) the same.
There might be confusion with `dfdlx:textNumberPadCharacter`, which works
different and is applied after ICU formats a number to a string, but that feels
like less of a big deal?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]