[ https://issues.apache.org/jira/browse/TIKA-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-384.
--------------------------------

    Resolution: Invalid
      Assignee: Jukka Zitting

This is how the type detection is supposed to work. The text/css type is
essentially a more accurate subtype of text/plain, and the added filename
information allows the detection code to return the more accurate type as
a result to the caller.

> incorrect mime type detection when Metadata.RESOURCE_NAME_KEY set
> -----------------------------------------------------------------
>
>                 Key: TIKA-384
>                 URL: https://issues.apache.org/jira/browse/TIKA-384
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>    Affects Versions: 0.6
>         Environment: Java: 1.6.0_17; Java HotSpot(TM) Client VM 14.3-b01
>                      System: Windows XP version 5.1 running on x86; Cp1252; en_GB (nb)
>            Reporter: Jim Kay
>            Assignee: Jukka Zitting
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> When Metadata.RESOURCE_NAME_KEY is set as in:
>     metadata.set(Metadata.RESOURCE_NAME_KEY, f.getCanonicalPath())
> the incorrect mime type is set.
>
> I was trying to add .csv files as a type by editing the xml mime types. When
> I ran a .csv file (and for comparison a .css file) through TikaGUI they were
> both passed successfully as text.
>
> In my AutoDetectParser example I had set the RESOURCE_NAME_KEY to
> f.getCanonicalPath() (this code was copied - I don't know what it does). In
> this example .css and .csv were NOT identified as text/plain.
>
> The issue is in MimeTypes with the following code:
>
>     String resourceName = metadata.get(Metadata.RESOURCE_NAME_KEY);
>     if (resourceName != null) {
>         String name = null;
>         ...
>         ...
>         if (name != null) {
>             MimeType hint = getMimeType(name);
>             if (hint.isDescendantOf(type)) {
>                 type = hint;
>             }
>         }
>     }
>
> If the RESOURCE_NAME_KEY is not null then the code ultimately resets type to
> hint, however hint is text/css. So the correct identification of type as
> text/plain is overwritten.

-- 
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
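
The refinement rule described in the resolution can be illustrated in isolation. What follows is a minimal, self-contained Java sketch, not Tika's implementation: the class TypeRefinementSketch, its two lookup tables, and the sample type hierarchy are assumptions made for illustration. Only the isDescendantOf check mirrors the snippet quoted in the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the refinement step quoted in the report; not Tika code.
class TypeRefinementSketch {
    // Assumed hierarchy: child type -> supertype.
    private static final Map<String, String> PARENT = new HashMap<>();
    // Assumed name hints: file extension -> type.
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();

    static {
        PARENT.put("text/css", "text/plain");
        PARENT.put("text/plain", "application/octet-stream");
        PARENT.put("image/png", "application/octet-stream");
        BY_EXTENSION.put("css", "text/css");
        BY_EXTENSION.put("png", "image/png");
    }

    // True if 'type' sits strictly below 'ancestor' in the assumed hierarchy.
    static boolean isDescendantOf(String type, String ancestor) {
        String current = PARENT.get(type);
        while (current != null) {
            if (current.equals(ancestor)) {
                return true;
            }
            current = PARENT.get(current);
        }
        return false;
    }

    // Returns the name-based hint only when it specializes the content-based type;
    // otherwise the content-based type is kept unchanged.
    static String refine(String detected, String resourceName) {
        int dot = resourceName.lastIndexOf('.');
        if (dot < 0) {
            return detected;
        }
        String hint = BY_EXTENSION.get(resourceName.substring(dot + 1).toLowerCase());
        if (hint != null && isDescendantOf(hint, detected)) {
            return hint;
        }
        return detected;
    }

    public static void main(String[] args) {
        System.out.println(refine("text/plain", "style.css"));   // text/css
        System.out.println(refine("text/plain", "picture.png")); // text/plain
        System.out.println(refine("text/plain", "notes"));       // text/plain
    }
}
```

In this model a .css name refines text/plain to text/css because text/css is declared a subtype of text/plain, while a .png name on plain-text content is ignored because image/png is not a descendant of text/plain. The name hint can therefore narrow the detected type but never contradict it, which is the behaviour the resolution describes as intended.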