[ 
https://issues.apache.org/jira/browse/DAFFODIL-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guichard Desrosiers updated DAFFODIL-3084:
------------------------------------------
    Description: 
Daffodil's calendar conversion and comparison code reads the ICU 
{{EXTENDED_YEAR}} field for the year value. {{EXTENDED_YEAR}} uses astronomical 
(proleptic) numbering (1 BCE = 0, 2 BCE = -1, etc.), whereas XSD 1.0 year 
numbering matches ICU's {{YEAR}} scheme, where 1 BCE = -0001, 2 BCE = -0002, 
and there is no year zero. Because the two conventions are offset by one for 
all non-positive years, every BCE date Daffodil produces is wrong: it is off by 
one year. Astronomical year 0 is rendered as the lexically illegal {{0000}}.
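
The offset between the two numbering schemes can be sketched in a few lines. This is only an illustration of the arithmetic described above, not Daffodil or ICU code; the function names {{extended_to_xsd10_year}} and {{xsd10_lexical_year}} are hypothetical.

```python
def extended_to_xsd10_year(extended_year: int) -> int:
    """Map an astronomical (EXTENDED_YEAR-style) year to XSD 1.0 numbering."""
    # Positive years are identical in both schemes.
    if extended_year > 0:
        return extended_year
    # Non-positive years shift by one: 0 -> -1 (1 BCE), -1 -> -2 (2 BCE).
    return extended_year - 1


def xsd10_lexical_year(xsd_year: int) -> str:
    """Format an XSD 1.0 year as four or more digits with an optional minus sign."""
    if xsd_year == 0:
        # XSD 1.0 has no year zero, so 0000 must never be produced.
        raise ValueError("XSD 1.0 has no year zero")
    sign = "-" if xsd_year < 0 else ""
    return f"{sign}{abs(xsd_year):04d}"


# Astronomical years 1, 0, -1 correspond to 1 CE, 1 BCE, 2 BCE.
for ext in (1, 0, -1):
    print(ext, "->", xsd10_lexical_year(extended_to_xsd10_year(ext)))
```

Formatting the astronomical value directly, without the shift, is what yields {{0000}} for 1 BCE and {{-0001}} for 2 BCE.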

 

*Proposed Fix:*

Change all uses of {{Calendar.EXTENDED_YEAR}} to {{Calendar.YEAR}} across the 
calendar conversion and comparison code, so the lexical year matches XSD 1.0 
numbering. As a consequence, year 0 is unrepresentable ({{YEAR}} minimum is 
1), which matches XSD 1.0 (no year zero) and structurally guarantees Daffodil 
never emits {{0000}}.

Under the lax calendar check policy, ICU does not reject a {{0000}} / year-0 
input; it normalizes it to {{0001}}. This is acceptable: lax is intentionally 
permissive, and the key guarantee, that Daffodil never emits {{0000}} in the 
infoset, still holds. Such input simply cannot round-trip, since the original 
{{0000}} lexical form is not reproduced.
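
The round-trip consequence can be shown with a toy model. The sketch below does not call ICU; it only assumes the normalization stated above (year 0 becomes year 1 under lax), and {{lax_parse_then_unparse}} and {{round_trips}} are hypothetical names.

```python
def lax_parse_then_unparse(lexical_year: str) -> str:
    """Model a lax parse of a lexical year followed by an unparse."""
    year = int(lexical_year)
    if year == 0:
        # Assumption: stand-in for the lax normalization described in the
        # ticket (0000 becomes 0001); this is not an ICU call.
        year = 1
    sign = "-" if year < 0 else ""
    return f"{sign}{abs(year):04d}"


def round_trips(lexical_year: str) -> bool:
    """True if the unparsed form equals the original lexical form."""
    return lax_parse_then_unparse(lexical_year) == lexical_year


for lexical in ("0000", "0001", "-0001"):
    print(lexical, "->", lax_parse_then_unparse(lexical), round_trips(lexical))
```

Only the {{0000}} input fails to round-trip in this model; valid XSD 1.0 years are unaffected.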

  was:
Daffodil's calendar conversion and comparison code reads the ICU 
{{EXTENDED_YEAR}} field for the year value. {{EXTENDED_YEAR}} uses astronomical 
(proleptic) numbering (1 BCE = 0, 2 BCE = -1, etc.), whereas XSD 1.0 year 
numbering matches ICU's {{YEAR}} scheme, where 1 BCE = -0001, 2 BCE = -0002, 
and there is no year zero. Because the two conventions are offset by one for 
all non-positive years, every BCE date Daffodil produces is wrong: it is off by 
one year. Astronomical year 0 is rendered as the lexically illegal {{0000}}.

 

*Proposed Fix:*

Change all uses of {{Calendar.EXTENDED_YEAR}} to {{Calendar.YEAR}} across the 
calendar conversion _and_ comparison code, so the lexical year matches XSD 1.0 
numbering. As a consequence, year 0 is unrepresentable ({{YEAR}} minimum is 
1), which matches XSD 1.0 (no year zero) and structurally guarantees Daffodil 
never emits {{0000}}. Once the code reads {{YEAR}}, ICU under lax rejects a 
{{0000}} / year-0 lexical value outright. Daffodil should surface this as a 
*parse error* (a clean processing-error diagnostic), not an uncaught exception; 
i.e., catch ICU's rejection in the calendar parse path and convert it to a 
normal Daffodil parse diagnostic.


>  Calendar code uses ICU EXTENDED_YEAR instead of YEAR, producing year values 
> that don't conform to the XSD 1.0 spec
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: DAFFODIL-3084
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-3084
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Back End
>            Reporter: Guichard Desrosiers
>            Assignee: Guichard Desrosiers
>            Priority: Major
>
> Daffodil's calendar conversion and comparison code reads the ICU 
> {{EXTENDED_YEAR}} field for the year value. {{EXTENDED_YEAR}} uses 
> astronomical (proleptic) numbering (1 BCE = 0, 2 BCE = -1, etc.), whereas 
> XSD 1.0 year numbering matches ICU's {{YEAR}} scheme, where 1 BCE = -0001, 2 
> BCE = -0002, and there is no year zero. Because the two conventions are 
> offset by one for all non-positive years, every BCE date Daffodil produces 
> is wrong: it is off by one year. Astronomical year 0 is rendered as the 
> lexically illegal {{0000}}.
>  
> *Proposed Fix:*
> Change all uses of {{Calendar.EXTENDED_YEAR}} to {{Calendar.YEAR}} across the 
> calendar conversion and comparison code, so the lexical year matches XSD 1.0 
> numbering. As a consequence, year 0 is unrepresentable ({{YEAR}} minimum 
> is 1), which matches XSD 1.0 (no year zero) and structurally guarantees 
> Daffodil never emits {{0000}}.
> Under the lax calendar check policy, ICU does not reject a {{0000}} / year-0 
> input; it normalizes it to {{0001}}. This is acceptable: lax is 
> intentionally permissive, and the key guarantee, that Daffodil never emits 
> {{0000}} in the infoset, still holds. Such input simply cannot round-trip, 
> since the original {{0000}} lexical form is not reproduced.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
