[ 
https://issues.apache.org/jira/browse/DAFFODIL-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220760#comment-17220760
 ] 

Steve Lawrence commented on DAFFODIL-2408:
------------------------------------------

Looked a bit and I think the issue is related to how the schema handles line 
wrapping. Essentially, it doesn't, and just uses hacks like
{code}
separator="%NL;%SP;"
{code}

so it treats wrapping as separators. This is fragile, and sometimes doesn't 
work. For example, an iCalendar line looks something like this:

{code}
property-name;key=value;key=value:property-value
{code}

So a property is defined by its property name, followed by optional key-value 
pairs separated by semicolons, followed by a colon and the property value. The 
problem occurs when the property key/value pairs get so long that we end up 
splitting onto a new line in the middle of a key. For example:

{code}
property-name;key=value;key=value;key=value;key=value;ke
  y=value:property-value
{code}

The schema doesn't allow for a newline inside a value; it just treats the data 
as an array of values split on the newline/space. Normally this is fine because 
there are rarely a bunch of key/value property pairs and so we rarely split in 
a key. In these files there are. This results in an invalid parse, and cascades 
the backtracking all the way back to the beginning where we get a basically 
empty infoset.
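
To make the failure mode concrete, here is a rough Python sketch of what treating newline+space as a separator does to a folded line. This is not how the schema or Daffodil is implemented. The sample data and the helper names (naive_split, parse_params) are made up for illustration, and the fold is assumed to be a newline followed by a single space.

```python
# Hypothetical sample data: one logical property, folded in the middle of a key.
folded = "property-name;key1=a;key2=b;ke\n y3=c:property-value"


def naive_split(line: str) -> list[str]:
    """Mimic treating newline+space as a separator between values."""
    return line.split("\n ")


def parse_params(fragment: str) -> dict[str, str]:
    """Parse ';'-separated key=value pairs, failing on anything else."""
    params = {}
    for piece in fragment.split(";")[1:]:  # skip the property name
        if "=" not in piece:
            raise ValueError(f"not a key=value pair: {piece!r}")
        key, value = piece.split("=", 1)
        params[key] = value
    return params


fragments = naive_split(folded)
# fragments[0] now ends with the dangling half-key "ke", so
# parse_params(fragments[0]) raises ValueError.
```

When the fold happens to land between two complete pairs the split is harmless, which is why the hack works most of the time.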

The offending line is something like:
{code}
ATTENDEE;CN="Surname, First Name (Long stuff here)";ROLE=OPT-PARTICIPANT;R
        SVP=TRUE:mailto:[email protected]
{code}
In this case, we end up splitting in the "RSVP" key, which causes things to break.

So I agree that this is a schema issue, but unrelated to ATTACH and BASE64. 
Really, this schema needs to be updated to use the layering feature that 
handles this style of line wrapping. Daffodil has a special 
LineFoldedTransformer layer for exactly this purpose, so that should resolve 
this issue and clean up the schema quite a bit, so it can remove all the 
%NL;%SP; hacks.
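
For reference, the unfolding step itself is small. RFC 5545 folds a long content line by inserting a line break followed by a single space or tab, and unfolding removes exactly that sequence before any parsing happens. The Python sketch below only illustrates the idea and is not the Daffodil layer's implementation. The sample data and function names are hypothetical, and the property parser assumes no quoted parameter values containing ':' or ';'.

```python
import re


def unfold(text: str) -> str:
    """Remove each line break that is immediately followed by one space or tab."""
    return re.sub(r"\r?\n[ \t]", "", text)


def parse_property(line: str) -> tuple[str, dict[str, str], str]:
    """Split 'name;key=value;...:value' into its parts.

    Assumption: no quoted parameter values containing ':' or ';'.
    """
    head, value = line.split(":", 1)
    name, *pairs = head.split(";")
    params = dict(pair.split("=", 1) for pair in pairs)
    return name, params, value


# Hypothetical sample data: a key folded across two physical lines.
folded = "property-name;key1=a;key2=b;ke\r\n y3=c:property-value\r\n"
logical = unfold(folded)
name, params, value = parse_property(logical.strip())
```

Because the fold is removed before the property is parsed, it no longer matters where in the line it occurs.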

I don't get an abort on 2.7.0 when run with --trace. Whatever the issue was 
with 2.3.0 has been resolved. Marking this issue as resolved since this appears 
to be a schema issue with iCalendar and not with Daffodil.

> iCalendar schema + data causes Daffodil to abort
> ------------------------------------------------
>
>                 Key: DAFFODIL-2408
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2408
>             Project: Daffodil
>          Issue Type: Bug
>    Affects Versions: 2.3.0, 2.7.0
>            Reporter: Mike Beckerle
>            Assignee: Shashi Ramaka
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: dfdl_2_3_ics_excep.zip
>
>
> A user is using Daffodil 2.3.0 with the iCalendar DFDL schema, and has 
> provided an ".ics" file.
> These cause Daffodil 2.3.0 to fail with invariant failed.
> I have verified that these also cause problems (but not the same one) on 
> Daffodil 2.7.0 with the latest iCalendar schema I have access to. 
> On 2.7.0 Daffodil this aborts when run with a trace, and without a trace it 
> parses producing an empty iCalendar element, without error. It doesn't even 
> complain about not consuming the data. This is of course nonsense. There's 
> plenty of data, and an empty icalendar element isn't valid. It has a required 
> vcalendar child element. (You have to turn validation off to get the empty 
> icalendar result, otherwise you get validation errors)
> Contact [~mbeckerle] for the associated files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
