[
https://issues.apache.org/jira/browse/CONNECTORS-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007388#comment-14007388
]
Karl Wright commented on CONNECTORS-943:
----------------------------------------
Were you indexing to Solr?
Newer versions of the SharePoint connector look up a MIME type based on the URL
extension. That MIME type is then checked against the list of MIME types you
configured for your Solr connection. You *can* specify "." in that list, which
means "match documents with no MIME type", but .aspx pages are most likely
always HTML, so treating them as HTML is the better solution.
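
To make the lookup described above concrete, here is a minimal Python sketch.
It is not the connector's actual code (ManifoldCF is written in Java). The
mapping table, the function names, and the handling of "." as "no mime type"
are assumptions made for illustration only.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical extension-to-MIME-type table; the real mapping lives in the
# ManifoldCF core and is not reproduced here.
EXTENSION_TO_MIME = {
    "html": "text/html",
    "htm": "text/html",
    "pdf": "application/pdf",
    "txt": "text/plain",
}


def lookup_mime_type(url, mapping=EXTENSION_TO_MIME):
    """Return the MIME type for the URL's extension, or None if unmapped."""
    path = urlsplit(url).path  # ignore any query string or fragment
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return mapping.get(extension)


def is_indexable(mime_type, allowed_mime_types):
    """Check a looked-up MIME type against the output connection's list.

    Assumption for this sketch: an entry of "." in the list stands for
    "documents with no mime type", as described in the comment above.
    """
    if mime_type is None:
        return "." in allowed_mime_types
    return mime_type.lower() in {m.lower() for m in allowed_mime_types}


if __name__ == "__main__":
    allowed = ["text/html", "application/pdf"]
    mime = lookup_mime_type("/IT/Pages//CC-CSS-Safari-Test-5.aspx")
    # .aspx is not in the table, so the lookup yields None and the
    # document is rejected unless "." is in the allowed list.
    print(mime, is_indexable(mime, allowed))  # prints: None False
```

Under these assumptions, adding an `"aspx": "text/html"` entry to the table
makes the same document pass the check without having to allow documents that
have no MIME type at all.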
> .aspx files not indexed by Sharepoint connector due to missing mime type in
> core
> --------------------------------------------------------------------------------
>
> Key: CONNECTORS-943
> URL: https://issues.apache.org/jira/browse/CONNECTORS-943
> Project: ManifoldCF
> Issue Type: Bug
> Components: Framework core, SharePoint connector
> Affects Versions: ManifoldCF 1.6
> Reporter: Michael Wilken
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.6.1, ManifoldCF 1.7
>
> Attachments: CONNECTORS-943.patch
>
>
> Our SharePoint installs contain many pages ending with the .aspx file
> extension. The SharePoint connector crawls these files without any issue,
> but fails to send them to the output connector because the .aspx extension
> isn't in the extension-to-mime-type mapping. Turning the log level up to
> DEBUG shows the issue:
> {code}
> DEBUG 2014-05-22 11:10:06,486 (Worker thread '17') - SharePoint: Skipping
> document '/IT/Pages//CC-CSS-Safari-Test-5.aspx' because output connector says
> mime type 'null' is not indexable
> DEBUG 2014-05-22 11:10:07,328 (Worker thread '32') - SharePoint: Skipping
> document '/IT/Pages//CC-CSS-Test-Safari-3.aspx' because output connector says
> mime type 'null' is not indexable
> DEBUG 2014-05-22 11:10:07,588 (Worker thread '26') - SharePoint: Skipping
> document '/IT/Pages//CC-CSS-Test-7-Safari.aspx' because output connector says
> mime type 'null' is not indexable
> {code}
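
For anyone wanting to list the affected documents from such a log, a small
Python sketch follows. The pattern is modeled only on the DEBUG lines quoted
above and assumes each log entry sits on a single line (the entries above are
wrapped by the mail formatting). The function name is hypothetical.

```python
import re

# Pattern modeled on the DEBUG lines quoted above, with each entry
# unwrapped onto one line. Other ManifoldCF versions may word it differently.
SKIP_PATTERN = re.compile(
    r"SharePoint: Skipping document '(?P<path>[^']+)' because output "
    r"connector says mime type '(?P<mime>[^']*)' is not indexable"
)


def skipped_documents(log_lines):
    """Yield (path, mime_type) for every skipped-document log line."""
    for line in log_lines:
        match = SKIP_PATTERN.search(line)
        if match:
            yield match.group("path"), match.group("mime")


if __name__ == "__main__":
    sample = [
        "DEBUG 2014-05-22 11:10:06,486 (Worker thread '17') - SharePoint: "
        "Skipping document '/IT/Pages//CC-CSS-Safari-Test-5.aspx' because "
        "output connector says mime type 'null' is not indexable",
    ]
    for path, mime in skipped_documents(sample):
        print(path, mime)
```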