[ https://issues.apache.org/jira/browse/TIKA-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gannon McGibbon updated TIKA-3421:
----------------------------------
    Attachment: news3.txt
                news2.txt
                news.txt

> Obsoleted mime types
> --------------------
>
>                 Key: TIKA-3421
>                 URL: https://issues.apache.org/jira/browse/TIKA-3421
>             Project: Tika
>          Issue Type: Improvement
>          Components: core
>            Reporter: Gannon McGibbon
>            Priority: Minor
>         Attachments: news.txt, news2.txt, news3.txt
>
>
> We're currently using Tika's `tika-mimetypes.xml` to detect mime types for
> files based on extensions and magic in
> [rails/marcel|https://github.com/rails/marcel]. I'm wondering what Tika's
> stance is on retiring deprecated/obsolete mime types.
> In [an issue|https://github.com/rails/marcel/issues/4], someone reported that
> the [message/news|https://www.iana.org/assignments/media-types/message/news]
> type was matching on any text file beginning with "Article". While this magic
> rule seems a little aggressive to me, the type has also been [deprecated by
> the IANA|https://www.iana.org/assignments/media-types/message/news] since
> 2009. Does Tika ever plan on removing support for deprecated types such as
> this? If not, I can change our XML parsing rules to reject certain types.
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
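
For illustration only, the fallback mentioned at the end of the issue description (rejecting certain types while parsing the XML) might look roughly like the sketch below. It is written in Python, not taken from marcel or Tika. The names `OBSOLETE_TYPES`, `load_mime_types`, and `SAMPLE` are hypothetical, and the sample document is a simplified stand-in for `tika-mimetypes.xml` that assumes un-namespaced `<mime-type type="...">` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical denylist. message/news is the only type named in the issue.
OBSOLETE_TYPES = {"message/news"}


def load_mime_types(xml_text, rejected=OBSOLETE_TYPES):
    """Return {type name: <mime-type> element}, skipping rejected types."""
    root = ET.fromstring(xml_text)
    kept = {}
    for node in root.iter("mime-type"):
        name = node.get("type")
        if name in rejected:
            # Drop the whole definition, including its magic and glob rules.
            continue
        kept[name] = node
    return kept


# Simplified stand-in for tika-mimetypes.xml (assumed shape, not the real file).
SAMPLE = """<mime-info>
  <mime-type type="message/news">
    <magic priority="50">
      <match value="Article" type="string" offset="0"/>
    </magic>
  </mime-type>
  <mime-type type="text/plain">
    <glob pattern="*.txt"/>
  </mime-type>
</mime-info>"""

if __name__ == "__main__":
    print(sorted(load_mime_types(SAMPLE)))  # ['text/plain']
```

Filtering at load time means the "Article" magic rule is never registered, so a plain text file starting with that word would fall through to whatever other rules match it.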