[GitHub] [daffodil] stevedlawrence opened a new pull request, #982: Ensure all primitives use textNumberPattern and infinity/NaN correctly

via GitHub Wed, 08 Mar 2023 08:51:18 -0800


stevedlawrence opened a new pull request, #982:
URL: https://github.com/apache/daffodil/pull/982


   Currently, if the primitive type is an integer, text number parsing 
disallows decimal points, even if the pattern contains one. Instead, when 
parsing integers, we should allow decimals as long as the fractional part is 
zero. And when unparsing, we should output a decimal point with a zero 
fractional part according to the pattern.
   
   This changes the behavior so integer parsing uses the same DecimalFormat 
configuration as non-integer parsing (i.e. decimals are allowed), but we throw 
a parse error if the fractional part that was parsed is non-zero. This also 
means that unparsing integers now outputs decimal points according to the 
pattern.
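   
   The integer rule above can be sketched as follows. This is a minimal 
illustration in Python, not Daffodil's Scala/ICU implementation: ParseError, 
parse_text_integer, and unparse_text_integer are hypothetical names, and the 
pattern is reduced to a count of fraction digits.

```python
from decimal import Decimal, InvalidOperation


class ParseError(Exception):
    """Hypothetical stand-in for a Daffodil parse error."""


def parse_text_integer(text: str) -> int:
    """Parse text as a decimal number, then require a zero fractional part."""
    try:
        value = Decimal(text)
    except InvalidOperation as exc:
        raise ParseError(f"not a number: {text!r}") from exc
    if not value.is_finite():
        # Infinity and NaN cannot be represented by an integer type.
        raise ParseError(f"not a finite number: {text!r}")
    if value != value.to_integral_value():
        raise ParseError(f"non-zero fractional part: {text!r}")
    return int(value)


def unparse_text_integer(value: int, fraction_digits: int) -> str:
    """Emit the zero fraction digits that a pattern such as '#0.0' asks for."""
    if fraction_digits == 0:
        return str(value)
    return f"{value}.{'0' * fraction_digits}"
```

   With this sketch, parse_text_integer("12.0") yields 12, while "12.5" raises 
ParseError, and unparse_text_integer(12, 1) yields "12.0".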
   
   Additionally, if textNumberCheckPolicy is strict, we set ICU's 
setDecimalPatternMatchRequired to true so that decimal points in the data are 
allowed or disallowed depending on whether the pattern has a decimal point. 
Note that lax parsing always allows decimal points regardless of the pattern. 
For this reason, we now always require the grouping/decimal separator DFDL 
properties in lax mode.
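   
   As a rough illustration of the strict versus lax rule described above, the 
following Python sketch models only the accept/reject decision. It does not 
call ICU; decimal_point_accepted is a hypothetical helper, and it assumes that 
an unquoted "." in the pattern always marks the decimal point.

```python
def decimal_point_accepted(pattern: str, data: str, strict: bool,
                           decimal_separator: str = ".") -> bool:
    """Return True if the decimal separator usage in data is acceptable."""
    if not strict:
        # Lax mode: a decimal separator is accepted whatever the pattern says.
        return True
    # Strict mode: the data must contain a separator exactly when the
    # pattern contains a decimal point.
    # Assumption: "." in the pattern is never quoted literal text.
    return ("." in pattern) == (decimal_separator in data)
```

   Under these assumptions, pattern "#0" rejects "1.5" in strict mode but 
accepts it in lax mode, which is why lax mode needs the separator properties 
to be defined even when the pattern has no decimal point.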
   
   One bug was discovered in ICU (ICU-22303): if the decimal point is required 
because strict mode is enabled, ICU never parses the infinity/NaN 
representation. A workaround is added to manually check for these 
representations until this bug is fixed. ICU unit tests are also added that 
should fail once ICU fixes this bug, signaling that the workaround can be 
removed.
   
   Make sure we always specify infinity and NaN representations from the DFDL 
schema for all primitives, not just for xs:double/xs:float. There is no way to 
disable infinity/NaN ICU parsing, so if we do not specify these values ICU 
just uses the locale values, which could lead to unwanted locale-specific 
behavior. Relatedly, this modifies NodeInfo types so that fromNumber fails for 
types that do not support infinity/NaN (i.e. everything except Double/Float) 
and creates a parse error.
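   
   A minimal Python sketch of the idea: check the schema-specified infinity 
and NaN representations explicitly, then reject them for types that cannot 
hold them. The names here (ParseError, from_number, parse_number, the type 
strings, and the "-" prefix for negative infinity) are assumptions made for 
illustration and are not Daffodil's NodeInfo API.

```python
import math
from decimal import Decimal, InvalidOperation


class ParseError(Exception):
    """Hypothetical stand-in for a Daffodil parse error."""


# Assumption: only these primitive types can hold infinity or NaN.
SUPPORTS_INF_NAN = {"double", "float"}


def from_number(value, prim_type: str):
    """Convert a parsed number to the target type, rejecting inf/NaN
    for types that cannot represent them."""
    if prim_type in SUPPORTS_INF_NAN:
        return float(value)
    if isinstance(value, float) and (math.isinf(value) or math.isnan(value)):
        raise ParseError(f"{prim_type} does not support infinity/NaN")
    return value


def parse_number(text: str, prim_type: str, inf_rep: str, nan_rep: str):
    """Match the schema's inf/NaN representations before numeric parsing."""
    if text == nan_rep:
        return from_number(math.nan, prim_type)
    if text == inf_rep:
        return from_number(math.inf, prim_type)
    if text == "-" + inf_rep:  # assumed spelling of negative infinity
        return from_number(-math.inf, prim_type)
    try:
        value = Decimal(text)
    except InvalidOperation as exc:
        raise ParseError(f"not a number: {text!r}") from exc
    if not value.is_finite():
        # Spellings such as "Infinity" are built into Decimal but are not
        # the schema's representations, so they are rejected here.
        raise ParseError(f"unexpected inf/NaN spelling: {text!r}")
    return from_number(value, prim_type)
```

   With inf_rep="INF" and nan_rep="NaN", the text "INF" parses for "double" 
but raises ParseError for "int", and the built-in spelling "Infinity" is 
rejected for every type because the schema did not declare it.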
   
   Modifies virtual decimal logic so that it correctly handles numbers that do 
not fit in a Long (should work) or that contain decimal points (should be a 
parse error).
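   
   The two virtual decimal cases can be sketched as below, assuming that a 
virtual decimal point means the text holds only digits and the pattern 
supplies the number of implied fraction digits. ParseError and 
apply_virtual_decimal are hypothetical names, and Python's unbounded integers 
stand in for whatever wide integer type the real code uses.

```python
from decimal import Decimal


class ParseError(Exception):
    """Hypothetical stand-in for a Daffodil parse error."""


def apply_virtual_decimal(digits: str, scale: int) -> Decimal:
    """Interpret digit-only text as a number with `scale` implied
    fraction digits."""
    if "." in digits:
        # The decimal point is virtual, so a real one in the data is an error.
        raise ParseError(f"decimal point not allowed: {digits!r}")
    sign = -1 if digits.startswith("-") else 1
    body = digits[1:] if digits[:1] in ("+", "-") else digits
    if not (body.isascii() and body.isdigit()):
        raise ParseError(f"not a digit string: {digits!r}")
    # Python ints are unbounded, so values wider than 64 bits are fine.
    unscaled = sign * int(body)
    # Building the Decimal from a string is exact at any length.
    return Decimal(f"{unscaled}E{-scale}")
```

   Here apply_virtual_decimal("12345", 2) gives Decimal("123.45"), a 21-digit 
input is scaled without overflow or rounding, and "123.45" raises ParseError.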
   
   Tests are updated so that if they want to differentiate between int and 
decimal depending on whether a decimal point exists in the data, they must 
specify a pattern with or without a decimal point and enable strict mode. Lax 
mode allows a decimal point regardless of type, so it cannot differentiate the 
types.
   
   DAFFODIL-2158


