Jukka -

That looks good for comprehensive testing, but how do you feel about TIKA-75? This would have multiple uses, including being consistent with our goal of providing MIME detection services.
- Keith

JIRA [EMAIL PROTECTED] wrote:
>
>      [ https://issues.apache.org/jira/browse/TIKA-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Jukka Zitting updated TIKA-76:
> ------------------------------
>
>     Attachment: TIKA-76.patch
>
> Instead of making real copies of the documents, we could always just feed
> an incorrect file name with the original resource stream.
>
> See the attached patch for an example of how this could work with
> AutoDetectParserTest. The patch uses the AutoDetectParser on all the
> current test documents in the following configurations:
>
>   * correct name and type hints
>   * correct name but no type hint
>   * correct name but incorrect type hint
>   * incorrect type and no name hint
>   * correct type but no name hint
>   * correct type but incorrect name hint
>   * incorrect name and no type hint
>   * incorrect name and type hints
>   * no name or type hints
>
> It seems we currently need MIME magic tests for Excel, PowerPoint, RTF,
> plain text, word, and XML.
>
>> Need to add test documents with wrong extensions.
>> -------------------------------------------------
>>
>>                 Key: TIKA-76
>>                 URL: https://issues.apache.org/jira/browse/TIKA-76
>>             Project: Tika
>>          Issue Type: Improvement
>>          Components: general
>>    Affects Versions: 0.1-incubator
>>            Reporter: Keith R. Bennett
>>             Fix For: 0.1-incubator
>>
>>         Attachments: TIKA-76.patch
>>
>>
>> We need to add test documents with misleading extensions to verify that
>> the file header MIME type determination is taking precedence over the
>> file name approach.
>> I suggest copying existing files such as:
>> cp testHTML.html testReallyHTML.doc
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
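
For reference, the nine configurations listed in the quoted message are just every pairing of three states (correct, incorrect, none) for the name hint and the type hint. Below is a rough, self-contained Java sketch of how such a matrix could be generated. It is not the code from TIKA-76.patch and does not call any Tika API; the class name HintMatrix, the resolve helper, and the sample MIME types are made up for illustration. Only the two file names come from the issue description.

```java
import java.util.ArrayList;
import java.util.List;

public class HintMatrix {

    /** State of a single hint passed alongside the document stream. */
    public enum Hint { CORRECT, INCORRECT, NONE }

    /** One test configuration: a name hint state paired with a type hint state. */
    public static final class Config {
        public final Hint name;
        public final Hint type;

        Config(Hint name, Hint type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public String toString() {
            return "name=" + name + ", type=" + type;
        }
    }

    /** Builds every (name, type) pairing: 3 x 3 = 9 configurations. */
    public static List<Config> allConfigs() {
        List<Config> configs = new ArrayList<>();
        for (Hint name : Hint.values()) {
            for (Hint type : Hint.values()) {
                configs.add(new Config(name, type));
            }
        }
        return configs;
    }

    /** Picks the value to hand to the parser; null means "no hint supplied". */
    public static String resolve(Hint hint, String correct, String incorrect) {
        switch (hint) {
            case CORRECT:   return correct;
            case INCORRECT: return incorrect;
            default:        return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical hint values for a single test document.
        String realName = "testHTML.html";
        String wrongName = "testReallyHTML.doc";
        String realType = "text/html";
        String wrongType = "application/msword";

        for (Config c : allConfigs()) {
            String nameHint = resolve(c.name, realName, wrongName);
            String typeHint = resolve(c.type, realType, wrongType);
            // A real test would open the original resource stream here, parse it
            // with these hints, and check that the detected type is still realType.
            System.out.println(c + " -> name hint " + nameHint + ", type hint " + typeHint);
        }
    }
}
```

The point of generating the pairs instead of writing nine test methods by hand is that the same resource stream gets reused for each case, so no renamed copies of the test documents are needed.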
--
View this message in context: http://www.nabble.com/-jira--Created%3A-%28TIKA-76%29-Need-to-add-test-documents-with-wrong-extensions.-tf4643422.html#a13264427
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
