[ https://issues.apache.org/jira/browse/CONNECTORS-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Schuch updated CONNECTORS-1571:
--------------------------------------
    Affects Version/s: ManifoldCF 2.10

> Web Crawler Connector checks different MIME type than it is sending down the pipeline
> -------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1571
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1571
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Web connector
>    Affects Versions: ManifoldCF 2.10
>            Reporter: Markus Schuch
>            Priority: Minor
>
> The Web Crawler Connector extracts the MIME type from the request
> Content-Type header.
> It then strips any {{charset=whatever_encoding}} parameter and asks the
> pipeline whether the resulting MIME type (without the charset) should be
> ingested, via {{activities.checkMimeTypeIndexable(contentType);}}.
> When sending the actual {{RepositoryDocument}}, however, it sets the full
> MIME type (with the charset) in the document. This is not a major bug, but a
> small inconsistency, since the HttpPoster of the Solr Output Connector
> performs a "hard" check of the MIME type again, which can have a different
> outcome than the preceding check activity.
> I think this was introduced or (better) revealed with CONNECTORS-1482.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
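
The inconsistency described in the issue can be illustrated with a minimal, self-contained sketch. This is not ManifoldCF code: the class MimeTypeCheckSketch, the helpers stripParameters and strictCheck, and the ACCEPTED allowlist are all hypothetical names made up for the illustration. The sketch assumes the later check is an exact string match against a list of accepted MIME types, which may not be how HttpPoster actually decides.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; none of these names come from ManifoldCF.
final class MimeTypeCheckSketch {

    // Assumed allowlist standing in for an output connector's accepted MIME types.
    static final Set<String> ACCEPTED =
        new HashSet<>(Arrays.asList("text/html", "text/plain"));

    // Drops parameters such as "; charset=UTF-8" and lower-cases the media type.
    static String stripParameters(String contentType) {
        int semicolon = contentType.indexOf(';');
        String base = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // Exact-match check that performs no normalization of its input.
    static boolean strictCheck(String mimeType) {
        return ACCEPTED.contains(mimeType);
    }

    public static void main(String[] args) {
        String header = "text/html; charset=UTF-8";

        // Value used for the early check: charset removed.
        System.out.println(strictCheck(stripParameters(header))); // prints true

        // Value placed in the document: full header, charset included.
        System.out.println(strictCheck(header)); // prints false
    }
}
```

Under these assumptions the same document passes the first check and fails the second, which is the mismatch the report describes. Using one value for both the check and the document (either stripped or full) would remove the discrepancy in this sketch.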