[ 
https://issues.apache.org/jira/browse/CONNECTORS-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007388#comment-14007388
 ] 

Karl Wright commented on CONNECTORS-943:
----------------------------------------

Were you indexing to Solr?

Newer versions of the SharePoint connector look up a mime type based on the URL 
extension.  That mime type is then checked against the list of mime types you 
configured for your Solr connection.  You *can* specify "." in that list, which 
means "match documents with no mime type", but .aspx pages are, it seems to me, 
likely always HTML, so mapping the extension to an HTML mime type is the better 
solution.
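
To make the two options concrete, here is a rough sketch of the check as I 
described it above.  It is not the ManifoldCF code: the table contents and the 
function names (lookup_mime_type, is_indexable) are made up for illustration, 
and only the "." convention comes from the description above.

```python
import posixpath

# Hypothetical extension-to-mime-type table. The real mapping lives inside
# ManifoldCF and is not reproduced here.
EXTENSION_TO_MIME = {
    "html": "text/html",
    "htm": "text/html",
    "pdf": "application/pdf",
}


def lookup_mime_type(url, mapping):
    """Return the mime type for the url's extension, or None if unmapped."""
    _, ext = posixpath.splitext(url)
    return mapping.get(ext.lstrip(".").lower())


def is_indexable(mime_type, allowed):
    """Check a mime type against the output connection's allowed list.

    A "." entry in the list is taken to mean "accept documents that have
    no mime type", as described in the comment above.
    """
    if mime_type is None:
        return "." in allowed
    return mime_type.lower() in allowed


url = "/IT/Pages//CC-CSS-Safari-Test-5.aspx"
allowed = {"text/html", "application/pdf"}

# No entry for "aspx", so the lookup yields None and the document is skipped.
before = is_indexable(lookup_mime_type(url, EXTENSION_TO_MIME), allowed)

# Option 1: add "." to the allowed list on the output connection.
with_dot = is_indexable(lookup_mime_type(url, EXTENSION_TO_MIME), allowed | {"."})

# Option 2: map aspx to an HTML mime type in the extension table.
patched = dict(EXTENSION_TO_MIME, aspx="text/html")
with_mapping = is_indexable(lookup_mime_type(url, patched), allowed)
```

With these assumptions, option 1 lets every document without a mime type 
through, while option 2 only affects .aspx, which is why I prefer it.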


> .aspx files not indexed by Sharepoint connector due to missing mime type in 
> core
> --------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-943
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-943
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework core, SharePoint connector
>    Affects Versions: ManifoldCF 1.6
>            Reporter: Michael Wilken
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.6.1, ManifoldCF 1.7
>
>         Attachments: CONNECTORS-943.patch
>
>
> Our Sharepoint installs contain many pages ending with the .aspx file 
> extension.  The Sharepoint Connector crawls these files without any issue, 
> but fails to send them to the output connector because the .aspx extension 
> isn't in the extension-to-mime-type mapping.  Turning the log level up to 
> DEBUG shows the issue
> {code}
> DEBUG 2014-05-22 11:10:06,486 (Worker thread '17') - SharePoint: Skipping 
> document '/IT/Pages//CC-CSS-Safari-Test-5.aspx' because output connector says 
> mime type 'null' is not indexable
> DEBUG 2014-05-22 11:10:07,328 (Worker thread '32') - SharePoint: Skipping 
> document '/IT/Pages//CC-CSS-Test-Safari-3.aspx' because output connector says 
> mime type 'null' is not indexable
> DEBUG 2014-05-22 11:10:07,588 (Worker thread '26') - SharePoint: Skipping 
> document '/IT/Pages//CC-CSS-Test-7-Safari.aspx' because output connector says 
> mime type 'null' is not indexable
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
