[
https://issues.apache.org/jira/browse/CONNECTORS-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Massiera updated CONNECTORS-1459:
----------------------------------------
Attachment: CONNECTORS-1459.patch
> Tika service wrong Content-Type
> -------------------------------
>
> Key: CONNECTORS-1459
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1459
> Project: ManifoldCF
> Issue Type: Bug
> Components: Tika service connector
> Affects Versions: ManifoldCF 2.8.1
> Reporter: Julien Massiera
> Priority: Minor
> Fix For: ManifoldCF 2.8.1
>
> Attachments: CONNECTORS-1459.patch
>
>
> I noticed that the standard behaviour of the Tika extractor connector is to
> replace the existing "Content-Type" metadata with the one it finds. This
> behaviour is not implemented in the Tika service connector, which just adds a
> new metadata entry instead of replacing the existing one. The consequence is
> that two values are available for the "Content-Type" metadata, but only the
> first one is kept by the connector (which could also be considered a bug;
> this is the case for both the Tika extractor connector and the Tika service
> connector).
> So, depending on the source connector, the resulting "Content-Type" may be
> wrong, for example if the originally provided one is
> "application/octet-stream".
> I will provide a patch for this bug.
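
To illustrate the difference described above, here is a minimal sketch of "add" versus "replace" semantics on a multi-valued metadata field. The class MetadataSketch and its methods are hypothetical and written only for this illustration; they are not ManifoldCF's API and not the content of the attached patch. The first() method models a consumer that keeps only the first value of a field, as the report says the connector does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a document's multi-valued metadata.
// This is NOT ManifoldCF's actual API; all names are assumptions.
class MetadataSketch {
    private final Map<String, List<String>> fields = new LinkedHashMap<>();

    // "Add" semantics: append a value and keep any existing ones.
    void add(String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // "Replace" semantics: discard existing values, keep only the new one.
    void replace(String name, String value) {
        List<String> values = new ArrayList<>();
        values.add(value);
        fields.put(name, values);
    }

    // Models a consumer that keeps only the first value of a field.
    String first(String name) {
        List<String> values = fields.get(name);
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }

    // Returns a copy of all values currently stored for a field.
    List<String> all(String name) {
        return new ArrayList<>(fields.getOrDefault(name, new ArrayList<>()));
    }
}
```

With add(), a source-provided "application/octet-stream" stays in first position and hides the type detected later; with replace(), only the detected type remains.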