[ https://issues.apache.org/jira/browse/TIKA-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann resolved TIKA-366.
------------------------------------

    Resolution: Fixed

- fixed in r901033

> Increase buffer size for mime type sniffing
> -------------------------------------------
>
>                 Key: TIKA-366
>                 URL: https://issues.apache.org/jira/browse/TIKA-366
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 0.5
>         Environment: My local MacBook pro laptop.
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>             Fix For: 0.6
>
>
> While working on TIKA-357 to address a similar problem for charset detection,
> I found an issue with mime identification having to do with the same general
> problem. Tika right now only deals with the first MimeTypes#getMinLength()
> bytes of a magic header to do the sniffing of mime type. With the example
> file attached from Ken Krugler, it's clear that the current min length size
> of 4 * 1024 bytes isn't enough. Extending it to 8K (8 * 1024 bytes) addresses
> this issue and seems to open up more opportunity for mime detection at little
> overhead cost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
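
For readers of the archive, the effect described in the issue can be illustrated with a small standalone sketch. This is not Tika's code: the class SniffWindowDemo and its methods are hypothetical, and it assumes (as a simplification) that a magic pattern may sit at any offset inside the sniffing window. Only the two window sizes, 4 * 1024 and 8 * 1024 bytes, come from the issue text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Tika.
final class SniffWindowDemo {

    /** Reads up to limit bytes from the head of the stream, then rewinds it. */
    static byte[] readPrefix(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(limit);
        try {
            byte[] buf = new byte[limit];
            int total = 0;
            while (total < limit) {
                int n = in.read(buf, total, limit - total);
                if (n == -1) {
                    break; // stream is shorter than the window
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            in.reset(); // leave the stream positioned where the caller had it
        }
    }

    /** Returns true if magic occurs anywhere inside prefix. */
    static boolean containsMagic(byte[] prefix, byte[] magic) {
        outer:
        for (int i = 0; i + magic.length <= prefix.length; i++) {
            for (int j = 0; j < magic.length; j++) {
                if (prefix[i + j] != magic[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    /** Sniffs only the first windowSize bytes of data for magic. */
    static boolean sniff(byte[] data, byte[] magic, int windowSize) {
        try {
            InputStream in = new ByteArrayInputStream(data);
            return containsMagic(readPrefix(in, windowSize), magic);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a document whose identifying bytes start at offset 5000, sniff(data, magic, 4 * 1024) misses them while sniff(data, magic, 8 * 1024) finds them. The cost of the larger window is at most 4 KB of extra buffering per detection.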