Jukka -

That looks good for comprehensive testing, but how do you feel about
TIKA-75?  It would have multiple uses, and it is consistent with our
goal of providing MIME detection services.

- Keith


JIRA [EMAIL PROTECTED] wrote:
> 
> 
>      [ https://issues.apache.org/jira/browse/TIKA-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Jukka Zitting updated TIKA-76:
> ------------------------------
> 
>     Attachment: TIKA-76.patch
> 
> Instead of making real copies of the documents, we could always just feed
> an incorrect file name with the original resource stream.
> 
> See the attached patch for an example of how this could work with
> AutoDetectParserTest. The patch uses the AutoDetectParser on all the
> current test documents in the following configurations:
> 
>     * correct name and type hints
>     * correct name but no type hint
>     * correct name but incorrect type hint
>     * incorrect type and no name hint
>     * correct type but no name hint
>     * correct type but incorrect name hint
>     * incorrect name and no type hint
>     * incorrect name and type hints
>     * no name or type hints
> 
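
The nine configurations listed above are every pairing of three name hint
states (correct, incorrect, none) with three type hint states. A minimal
sketch of enumerating them follows. The class, enum, and method names are
hypothetical and are not taken from the attached patch or from Tika; a real
test would still have to turn each pairing into the actual hints passed to
the parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Tika or TIKA-76.patch.
class HintCombinations {

    /** The three states a name or type hint can be in for one test document. */
    enum Hint { CORRECT, INCORRECT, NONE }

    /** One test configuration: a name hint state paired with a type hint state. */
    static final class Config {
        final Hint nameHint;
        final Hint typeHint;

        Config(Hint nameHint, Hint typeHint) {
            this.nameHint = nameHint;
            this.typeHint = typeHint;
        }

        @Override
        public String toString() {
            return "name=" + nameHint + ", type=" + typeHint;
        }
    }

    /** Builds every pairing of name hint and type hint: 3 x 3 = 9 configurations. */
    static List<Config> all() {
        List<Config> configs = new ArrayList<>();
        for (Hint nameHint : Hint.values()) {
            for (Hint typeHint : Hint.values()) {
                configs.add(new Config(nameHint, typeHint));
            }
        }
        return configs;
    }

    public static void main(String[] args) {
        // Print each configuration a test loop would run a document through.
        for (Config config : all()) {
            System.out.println(config);
        }
    }
}
```

Generating the pairings in a loop, rather than writing each case by hand,
makes it harder to leave one of the combinations out.
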
> It seems we currently need MIME magic tests for Excel, PowerPoint, RTF,
> plain text, Word, and XML.
> 
>> Need to add test documents with wrong extensions.
>> -------------------------------------------------
>>
>>                 Key: TIKA-76
>>                 URL: https://issues.apache.org/jira/browse/TIKA-76
>>             Project: Tika
>>          Issue Type: Improvement
>>          Components: general
>>    Affects Versions: 0.1-incubator
>>            Reporter: Keith R. Bennett
>>             Fix For: 0.1-incubator
>>
>>         Attachments: TIKA-76.patch
>>
>>
>> We need to add test documents with misleading extensions to verify that
>> MIME type detection from the file header takes precedence over
>> detection from the file name.
>> I suggest copying existing files such as:
>> cp testHTML.html testReallyHTML.doc
> 
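
To make the precedence described in the issue concrete, here is a minimal
sketch of header-first detection with a file name fallback. It is not Tika's
implementation: the class name, the two magic prefixes, and the small
extension table are assumptions chosen only to illustrate the behaviour that
a test document with a misleading extension is meant to verify.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical detector for illustration; not Tika's MIME detection code.
class MagicFirstDetector {

    /** Returns true if data begins with the given prefix bytes. */
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Checks the header bytes first and uses the file name only as a fallback. */
    static String detect(byte[] header, String fileName) {
        // Header magic wins over whatever the file name claims.
        if (startsWith(header, "%PDF-".getBytes(StandardCharsets.US_ASCII))) {
            return "application/pdf";
        }
        if (startsWith(header, "{\\rtf".getBytes(StandardCharsets.US_ASCII))) {
            return "application/rtf";
        }
        // No magic matched: fall back on the extension, if a name was given.
        if (fileName != null) {
            String lower = fileName.toLowerCase(Locale.ROOT);
            if (lower.endsWith(".pdf")) {
                return "application/pdf";
            }
            if (lower.endsWith(".rtf")) {
                return "application/rtf";
            }
            if (lower.endsWith(".doc")) {
                return "application/msword";
            }
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] pdfHeader = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        // A PDF stream with a misleading .doc name, like the copies suggested above.
        System.out.println(detect(pdfHeader, "testReallyPDF.doc"));
    }
}
```
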
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/-jira--Created%3A-%28TIKA-76%29-Need-to-add-test-documents-with-wrong-extensions.-tf4643422.html#a13264427
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
