[ 
https://issues.apache.org/jira/browse/TIKA-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Meier updated TIKA-2602:
--------------------------------
    Attachment: VERSION_Test

> iCalendar not properly recognized as text/calendar
> --------------------------------------------------
>
>                 Key: TIKA-2602
>                 URL: https://issues.apache.org/jira/browse/TIKA-2602
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Andreas Meier
>            Priority: Major
>         Attachments: VERSION_Test
>
>
> At the moment the detection of text/calendar is covered by the following 
> mime-type element:
> {code:xml}
>   <mime-type type="text/calendar">
>     <magic priority="50">
>       <match value="BEGIN:VCALENDAR" type="string" offset="0">
>         <match value="VERSION:2.0" type="string" offset="15:30"/>
>       </match>
>     </magic>
>     <glob pattern="*.ics"/>
>     <glob pattern="*.ifb"/>
>     <sub-class-of type="text/plain"/>
>   </mime-type>
> {code}
> This recognition fails if VERSION:2.0 is not the first property after 
> BEGIN:VCALENDAR.
> That is not always the case (see 
> [https://tools.ietf.org/html/rfc5545|https://tools.ietf.org/html/rfc5545], 
> section 3.6, "Calendar Components"), so recognition may fail for calendar 
> objects that start with PRODID or another property.
> Section 4, "iCalendar Object Examples", shows some of these cases:
> {code}
>        BEGIN:VCALENDAR
>        PRODID:-//xyz Corp//NONSGML PDA Calendar Version 1.0//EN
>        VERSION:2.0
>        BEGIN:VEVENT
>        DTSTAMP:19960704T120000Z
>        UID:[email protected]
>        ORGANIZER:mailto:[email protected]
>        DTSTART:19960918T143000Z
>        DTEND:19960920T220000Z
>        STATUS:CONFIRMED
>        CATEGORIES:CONFERENCE
>        SUMMARY:Networld+Interop Conference
>        DESCRIPTION:Networld+Interop Conference
>          and Exhibit\nAtlanta World Congress Center\n
>         Atlanta\, Georgia
>        END:VEVENT
>        END:VCALENDAR
> {code}
> or
> {code}
>        BEGIN:VCALENDAR
>        METHOD:xyz
>        VERSION:2.0
>        PRODID:-//ABC Corporation//NONSGML My Product//EN
>        BEGIN:VEVENT
>        DTSTAMP:19970324T120000Z
>        SEQUENCE:0
>        UID:[email protected]
>        ORGANIZER:mailto:[email protected]
>        ATTENDEE;RSVP=TRUE:mailto:[email protected]
>        DTSTART:19970324T123000Z
>        DTEND:19970324T210000Z
>        CATEGORIES:MEETING,PROJECT
>        CLASS:PUBLIC
>        SUMMARY:Calendaring Interoperability Planning Meeting
>        DESCRIPTION:Discuss how we can test c&s interoperability\n
>         using iCalendar and other IETF standards.
>        LOCATION:LDB Lobby
>        ATTACH;FMTTYPE=application/postscript:ftp://example.com/pub/
>         conf/bkgrnd.ps
>        END:VEVENT
>        END:VCALENDAR
> {code}
> I suggest either 
> a) widening the offset of the VERSION match from 15:30 to 15:200 or something 
> similar (not a good approach, since we don't know how long the PRODID might be), 
> or
> b) adding sub-matches for CALSCALE, PRODID and METHOD. (This might still not 
> cover everything, since there are also x-prop and iana-prop properties. So far 
> I can only confirm PRODID and METHOD as the first property after 
> BEGIN:VCALENDAR.)
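> A minimal sketch of option b), untested. It assumes that sibling sub-matches 
> under one parent match are treated as alternatives, and it reuses the existing 
> 15:30 offset range for the new PRODID, METHOD and CALSCALE matches (the trailing 
> colon in each value is also my assumption):
> {code:xml}
>   <mime-type type="text/calendar">
>     <magic priority="50">
>       <match value="BEGIN:VCALENDAR" type="string" offset="0">
>         <!-- existing rule: VERSION directly after BEGIN:VCALENDAR -->
>         <match value="VERSION:2.0" type="string" offset="15:30"/>
>         <!-- assumed additions: other properties that may come first -->
>         <match value="PRODID:" type="string" offset="15:30"/>
>         <match value="METHOD:" type="string" offset="15:30"/>
>         <match value="CALSCALE:" type="string" offset="15:30"/>
>       </match>
>     </magic>
>     <glob pattern="*.ics"/>
>     <glob pattern="*.ifb"/>
>     <sub-class-of type="text/plain"/>
>   </mime-type>
> {code}
> With these alternatives the VERSION value is no longer checked when another 
> property comes first, so the match is looser than the current rule.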
> Regards
> Andreas



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
