[ 
https://issues.apache.org/jira/browse/CONNECTORS-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320655#comment-16320655
 ] 

Markus Schuch commented on CONNECTORS-1482:
-------------------------------------------

{quote}
First, you can only exclude mime types if you are using the extracting update 
handler
{quote}
Why is that so? As I understand it, in the SolrJ case the binary content has to be 
extracted by a {{DocTransformer}} or something else, but the upstream 
repository connectors could still decide not to send the document to the 
pipeline at all, couldn't they?

> Mime type exclusion and document length exclusion in Solr output connector 
> don't apparently work
> ------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1482
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1482
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 2.9
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.10
>
>         Attachments: problem_documents_connector.png, 
> problem_documents_connector_solr.png, 
> problem_documents_connector_solr_stream_size.png
>
>
> See attached images.  Setting exclusions apparently does not prevent 
> documents with that mime type from being included.  This may be because of 
> regexp characters etc but it needs to be researched and documented at least.  
> Also, the length limitation doesn't seem to be working either.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
