In hindsight, the strict/lax features for numbers and calendars have been 
problematic for DFDL.

I expect "lax" as a feature is eventually going to be deprecated in DFDL or 
made very non-portable and squishy in specification of what it actually does to 
the point where it's impossible to rely on from release-to-release.

So if you have a one-time data conversion to do, then maybe it's ok, because 
you're going to convert data, then throw away the schema. But if you are 
building a DFDL schema that you want to have be viable longer term, you would 
avoid this "lax" stuff.

So my general suggestion is to switch to "strict".

________________________________
From: Steve Lawrence <slawre...@apache.org>
Sent: Monday, May 18, 2020 12:04 PM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: Re: Does Daffodil support calendarCheckPolicy="lax"?

We do support the dfdl:calendarCheckPolicy property. For example, if
your data was "1970-11-32", that successfully parses as December 2,
1970. So aspects of the lax behavior do appear to work. Unfortunately,
it does seem that the part of the lax check policy that ignores leading
and trailing whitespace does not work.

The DFDL specification of lax parsing was designed to match the behavior of
ICU, which is what we use for parsing date/time. But it seems ICU
doesn't actually have the behavior that the DFDL spec describes.

We have actually run into a similar issue with ICU and lax parsing of
text numbers, where ICU actually changed the lax behavior between
versions. I believe it was decided by ICU that they don't really have a
strong definition of what "lax" parsing entails--that it's essentially
best effort and so it may change.

Because of this, I believe the DFDL-WG discussed just making "lax"
something like an implementation defined best effort, and so one may or
may not be able to rely on certain behaviors when using lax parsing. I'm
not sure what the result of that discussion was, or whether this is a
missing Daffodil feature.

In this specific case, the workaround is to use the textTrimKind,
textCalendarPadCharacter, and textCalendarJustification properties to
have Daffodil strip off the whitespace before the date is parsed.
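
A minimal sketch of that workaround, applied to the Birthday element from
the original question. The three added property values are assumptions
(space as the pad character, and "center" justification so that padding is
trimmed from both sides) and have not been tested against Daffodil:

```xml
<!-- Sketch only: the last three properties are the additions; their
     values are assumptions chosen for space padding on both sides. -->
<xs:element name="Birthday" type="xs:date"
    dfdl:calendarCheckPolicy="lax"
    dfdl:calendarFirstDayOfWeek="Sunday"
    dfdl:calendarDaysInFirstWeek="5"
    dfdl:calendarTimeZone="UTC+6"
    dfdl:calendarPatternKind="implicit"
    dfdl:calendarLanguage="en"
    dfdl:textTrimKind="padChar"
    dfdl:textCalendarPadCharacter="%SP;"
    dfdl:textCalendarJustification="center"
/>
```

With the trimming done by these properties, the whitespace handling no
longer depends on the check policy, so the same element should also work
with calendarCheckPolicy="strict".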


On 5/18/20 11:30 AM, Dr. Roger L Costello wrote:
> Hi Folks,
>
> It is my understanding that if calendarCheckPolicy="lax" then any leading 
> and/or trailing whitespace around a date value is ignored. That is not the 
> behavior that I am getting. I get this error message:
>
> Parse Error: Convert to Date (for xs:date): Failed to parse '  1970-11-06  ' 
> at character 13.
>
> Here's my DFDL schema:
>
> <xs:element name="SimpleDataFormat">
>     <xs:complexType>
>         <xs:sequence>
>             <xs:element name="Birthday" type="xs:date"
>                 dfdl:calendarCheckPolicy="lax"
>                 dfdl:calendarFirstDayOfWeek="Sunday"
>                 dfdl:calendarDaysInFirstWeek="5"
>                 dfdl:calendarTimeZone="UTC+6"
>                 dfdl:calendarPatternKind="implicit"
>                 dfdl:calendarLanguage="en"
>             />
>         </xs:sequence>
>     </xs:complexType>
> </xs:element>
>
> Why am I getting the above error message?
>
> /Roger
>
