Julien Massiera created CONNECTORS-1459:
-------------------------------------------
Summary: Tika service wrong Content-Type
Key: CONNECTORS-1459
URL: https://issues.apache.org/jira/browse/CONNECTORS-1459
Project: ManifoldCF
Issue Type: Bug
Components: Tika service connector
Affects Versions: ManifoldCF 2.8.1
Reporter: Julien Massiera
Priority: Minor
I noticed that the standard behaviour of the Tika extractor connector is to
replace the existing "Content-Type" metadata with the one it finds. This
behaviour is not implemented in the Tika service connector, which just adds a
new metadata entry instead of replacing the existing one. The consequence is
that two values are available for the "Content-Type" metadata, but only the
first one is kept by the connector (which could also be considered a bug;
this is the case for both the Tika extractor connector and the Tika service
connector).
So, depending on the source connector, the resulting "Content-Type" may be
wrong, for example if the originally provided one is
"application/octet-stream".
I will provide a patch for this bug.
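
To illustrate the difference between adding and replacing, here is a minimal
sketch in plain Java. It does not use the ManifoldCF or Tika APIs: the
metadata map, the class name and the helper methods (addValue, replaceValue,
firstValue) are hypothetical stand-ins, and "application/pdf" is only an
assumed example of a type that Tika might detect.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContentTypeMetadataSketch {

    // Hypothetical "add" semantics: append a value to a multi-valued field.
    static void addValue(Map<String, List<String>> metadata, String name, String value) {
        metadata.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Hypothetical "replace" semantics: drop any existing values for the field.
    static void replaceValue(Map<String, List<String>> metadata, String name, String value) {
        List<String> values = new ArrayList<>();
        values.add(value);
        metadata.put(name, values);
    }

    // Models a consumer that keeps only the first value of a field.
    static String firstValue(Map<String, List<String>> metadata, String name) {
        List<String> values = metadata.get(name);
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }

    public static void main(String[] args) {
        // Value supplied by the source connector.
        Map<String, List<String>> added = new LinkedHashMap<>();
        addValue(added, "Content-Type", "application/octet-stream");
        // Adding the detected type leaves the original value in first position.
        addValue(added, "Content-Type", "application/pdf");
        System.out.println(firstValue(added, "Content-Type")); // application/octet-stream

        Map<String, List<String>> replaced = new LinkedHashMap<>();
        addValue(replaced, "Content-Type", "application/octet-stream");
        // Replacing discards the original value, so the detected type wins.
        replaceValue(replaced, "Content-Type", "application/pdf");
        System.out.println(firstValue(replaced, "Content-Type")); // application/pdf
    }
}
```

With "add" semantics the stale value stays first and is the one that
survives; with "replace" semantics only the detected value remains.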
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)