tuxji commented on code in PR #982:
URL: https://github.com/apache/daffodil/pull/982#discussion_r1129828685
##########
daffodil-test/src/test/resources/org/apache/daffodil/section13/text_number_props/TextNumberProps.tdml:
##########
@@ -4039,16 +4081,13 @@
<tdml:documentPart
type="text"><![CDATA[NOTANUMBER]]]></tdml:documentPart>
</tdml:document>
- <tdml:warnings>
- <tdml:warning>DFDL property was ignored</tdml:warning>
- <tdml:warning>textStandardNaNRep</tdml:warning>
- </tdml:warnings>
+ <tdml:warnings />
Review Comment:
Can we just delete the empty tdml:warnings element without changing any
semantics?
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/processors/parsers/ConvertTextStandardNumberParser.scala:
##########
@@ -171,47 +185,91 @@ case class ConvertTextStandardNumberParser(
case Some(_) => primNumeric.fromNumber(0)
case None => {
val df = textNumberFormatEv.evaluate(start)
- val strCheckPolicy = if (df.isParseStrict) str else str.trim
+ val strToParse = if (df.isParseStrict) str else str.trim
val pos = new ParsePosition(0)
- val icuNum: Number = df.parse(strCheckPolicy, pos) match {
+ val icuNum: JNumber = df.parse(strToParse, pos) match {
case null => {
- PE(
- start,
- "Unable to parse %s from text: %s",
- context.optPrimType.get.globalQName,
- str,
- )
- return
+ val infNaN: JDouble =
+ if (df.isDecimalPatternMatchRequired) {
+ // ICU failed to parse. But there is a bug in ICU4J
(ICU-22303) that if there is
+ // a decimal in the pattern and we've set that decimal to be
required (due to
+ // strict mode), then it will fail to parse Inf/NaN
representations. As a
+ // workaround, we clone the DecimalFormat, disable requiring
the decimal, and
+ // reparse. We only accept successful Inf/NaN parses
though--everything else is
+ // considered a parse error since it meant the decimal point
was missing or
+ // wasn't either inf/nan or a valid number. If ICU fixes this
bug, we should
+ // remove this infNan variable and its use, as it is likely
pretty expensive to
+ // clone, change a setting, and reparse. Fortunately, it is
only in the error
+ // case of strict parsing so should be rare.
+ pos.setIndex(0)
+ val newDF = df.clone().asInstanceOf[DecimalFormat]
+ newDF.setDecimalPatternMatchRequired(false)
+ newDF.parse(strToParse, pos) match {
+ case d: JDouble => {
+ Assert.invariant(d.isNaN || d.isInfinite)
+ d
+ }
+ case _ => null
+ }
+ } else {
+ null
+ }
+
+ if (infNaN != null) {
+ infNaN
+ } else {
+ PE(
+ start,
+ "Unable to parse %s from text: %s",
+ context.optPrimType.get.globalQName,
+ str,
+ )
+ return
+ }
}
- case d: JDouble if primNumeric.isInteger => {
- // If ICU returns a Double when only integers are expected, it
means the
- // string must have been NaN, -Infinity, or Infinity using the
locales
- // default symbols. There does not seem to be a way to disable
this even
- // with setParseBigDecimal to false and setParseIntegerOnly set to
true.
- // So just create the same PE as if it failed to parse it, which
is what
- // we really want ICU to do
+ case d: JDouble => {
+ // ICU returns a Double only if it parsed NaN, Infinity, or
-Infinity. We will later
+ // pass this value in primNumber.fromNumber, which will fail if
the primitive type
+ // does not allow NaN/Infinity
Assert.invariant(d.isNaN || d.isInfinite)
- PE(
- start,
- "Unable to parse %s from text: %s",
- context.optPrimType.get.globalQName,
- str,
- )
- return
+ d
}
case bd: ICUBigDecimal => {
- // sometimes ICU will return their own custom BigDecimal, even if
the
- // value could be represented as a BigInteger. We only want Java
types,
- // so detect this and convert it to the appropriate type
+ // ICU will return their own custom BigDecimal if the value cannot
fit in a Long and
+ // isn't infinity/NaN. We only want Java types, so detect this and
convert it to the
Review Comment:
Asking only for curiosity. Why does ICU return a custom BigDecimal instead
of a BigDecimal? Why does the difference matter to Daffodil, that is, why
isn't the custom BigDecimal compatible enough?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]