[ 
https://issues.apache.org/jira/browse/TIKA-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029720#comment-13029720
 ] 

Nick Burch commented on TIKA-655:
---------------------------------

In r1100039, I've moved the iWork detection logic from ZipContainerDetector 
to IWorkPackageParser, and made it detect in a similar way to OfficeParser.

I then put the content handler selection logic into IWorkPackageParser and 
removed IWorkParser (which claimed to be a regular parser but in fact only 
worked when called from IWorkPackageParser). The result is that the Tika app 
can now parse iWork files, and the unit tests still pass.
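
For anyone following along, here is a rough, standalone sketch of the kind of 
package-level detection described above. It is not the code from r1100039. It 
uses only the JDK, and the class name, the content entry names and the 
namespace-to-type table are assumptions made for illustration. The idea is to 
scan the zip for a known content entry, then pick the type from the namespace 
of that entry's root element.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch class, not part of Tika.
class IWorkDetectSketch {
    // Assumed names of the zip entries that hold the main document XML.
    static final Set<String> CONTENT_ENTRIES =
            Set.of("index.apxl", "index.xml", "presentation.apxl");

    // Assumed mapping from root element namespace to media type.
    static final Map<String, String> TYPES = Map.of(
            "http://developer.apple.com/namespaces/keynote2", "application/vnd.apple.keynote",
            "http://developer.apple.com/namespaces/ls", "application/vnd.apple.numbers",
            "http://developer.apple.com/namespaces/sl", "application/vnd.apple.pages");

    // Returns the guessed media type, or null if the zip does not look like iWork.
    static String detect(InputStream in) throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities in untrusted input.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!CONTENT_ENTRIES.contains(entry.getName())) {
                    continue;
                }
                // Read only as far as the root element of the content entry.
                XMLStreamReader xml = factory.createXMLStreamReader(zip);
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT) {
                        String ns = xml.getNamespaceURI();
                        return ns == null ? null : TYPES.get(ns);
                    }
                }
            }
        }
        return null;
    }

    // Builds a one-entry zip in memory so the sketch can be tried without a real file.
    static byte[] zipOf(String entryName, String content) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = zipOf("index.apxl",
                "<key:presentation xmlns:key=\"http://developer.apple.com/namespaces/keynote2\"/>");
        System.out.println(detect(new ByteArrayInputStream(sample)));
    }
}
```

The point of detecting at the package level is that the specific iWork type 
is only reported once the container has been opened, so whichever parser is 
registered for that type receives the whole zip rather than a single part.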


> IWorkPackageParser / IWorkParser not registering properly
> ---------------------------------------------------------
>
>                 Key: TIKA-655
>                 URL: https://issues.apache.org/jira/browse/TIKA-655
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>             Fix For: 1.0
>
>
> If you try to use AutoDetectParser to handle an iWork document, it'll fail 
> with:
>  org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is 
> not allowed in prolog.
>       at 
> com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
> However, IWorkPackageParser works fine. It seems that IWorkParser needs just 
> the individual zip part, but is registered as the handler for the individual 
> mime types, so it breaks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
