[ https://issues.apache.org/jira/browse/NUTCH-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576003#action_12576003 ]
Andrzej Bialecki commented on NUTCH-618:
-----------------------------------------

I noticed also another problem: o.a.n.u.MimeUtil doesn't use ObjectCache, so it instantiates MimeTypes over and over again. It should do this once for a given Configuration, and then use ObjectCache to store this object.

> Tika error "Media type alias already exists"
> --------------------------------------------
>
>                 Key: NUTCH-618
>                 URL: https://issues.apache.org/jira/browse/NUTCH-618
>             Project: Nutch
>          Issue Type: Bug
>          Components: mime_type_detector
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki
>
> After the upgrade to the latest Tika jar we see a lot of errors like this:
>
> 2008-03-06 08:07:20,659 WARN org.apache.tika.mime.MimeTypesReader: Invalid media type alias: text/xml
> org.apache.tika.mime.MimeTypeException: Media type alias already exists: text/xml
>         at org.apache.tika.mime.MimeTypes.addAlias(MimeTypes.java:312)
>         at org.apache.tika.mime.MimeType.addAlias(MimeType.java:238)
>         at org.apache.tika.mime.MimeTypesReader.readMimeType(MimeTypesReader.java:168)
>         at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:138)
>         at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:121)
>         at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:56)
>         at org.apache.nutch.util.MimeUtil.<init>(MimeUtil.java:58)
>         at org.apache.nutch.protocol.Content.<init>(Content.java:85)
>         at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
>         at org.apache.nutch.fetcher.Fetcher2$FetcherThread.run(Fetcher2.java:523)
>
> This is caused most likely by the duplicate tika-mimetypes.xml file - one
> copy is embedded inside the Tika jar, the other is found in Nutch conf/
> directory. The one inside the jar seems to be more recent, so I propose to
> simply remove the one we have in conf.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
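The caching suggested in the comment above (build MimeTypes once per Configuration, then store and reuse it) could look roughly like the sketch below. It is only an illustration of the pattern and does not use the real Nutch or Tika classes: `Config`, `MimeRegistry`, `PerConfigCache` and `mimeTypesFor` are hypothetical stand-ins for Configuration, MimeTypes, ObjectCache and the lookup inside MimeUtil, so that the example compiles on its own.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class MimeTypesCacheSketch {

    /** Hypothetical stand-in for a Hadoop/Nutch Configuration object. */
    static final class Config {}

    /** Hypothetical stand-in for the MimeTypes registry, which is costly to build. */
    static final class MimeRegistry {
        static final AtomicInteger INSTANCES_BUILT = new AtomicInteger();

        MimeRegistry() {
            // In the real code this is where tika-mimetypes.xml would be parsed.
            INSTANCES_BUILT.incrementAndGet();
        }
    }

    /** Hypothetical per-configuration cache illustrating the idea behind ObjectCache. */
    static final class PerConfigCache {
        // Weak keys: a cache disappears once its configuration is no longer referenced.
        private static final Map<Config, PerConfigCache> CACHES = new WeakHashMap<>();

        private final Map<String, Object> objects = new ConcurrentHashMap<>();

        static synchronized PerConfigCache get(Config conf) {
            return CACHES.computeIfAbsent(conf, c -> new PerConfigCache());
        }

        @SuppressWarnings("unchecked")
        <T> T getOrCreate(String key, Supplier<T> factory) {
            // computeIfAbsent runs the factory at most once per key.
            return (T) objects.computeIfAbsent(key, k -> factory.get());
        }
    }

    /** Returns the registry for this configuration, building it on first use only. */
    static MimeRegistry mimeTypesFor(Config conf) {
        return PerConfigCache.get(conf)
                .getOrCreate(MimeRegistry.class.getName(), MimeRegistry::new);
    }

    public static void main(String[] args) {
        Config conf = new Config();
        MimeRegistry first = mimeTypesFor(conf);
        MimeRegistry second = mimeTypesFor(conf);
        System.out.println(first == second);                    // prints true
        System.out.println(MimeRegistry.INSTANCES_BUILT.get()); // prints 1
    }
}
```

With this shape, every caller that shares a configuration gets the same registry instance, so the mime type definitions would be parsed once per configuration instead of once per constructed object.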