[
https://issues.apache.org/jira/browse/CONNECTORS-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282223#comment-13282223
]
Karl Wright edited comment on CONNECTORS-477 at 5/24/12 6:43 AM:
-----------------------------------------------------------------
A warning: Trying to crawl SharePoint using the Web Connector is not going to
work, for quite a number of reasons. Microsoft is notorious for ignoring web
standards in its URLs and has done this in multiple ways in SharePoint. Also,
there is no way to accurately "discover" SharePoint documents using the Web
Connector. So it is not a good idea to try to make the Web Connector into a
replacement for the SharePoint connector.
> Support for full-width space against url
> ----------------------------------------
>
> Key: CONNECTORS-477
> URL: https://issues.apache.org/jira/browse/CONNECTORS-477
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Web connector
> Reporter: Shinichiro Abe
> Assignee: Shinichiro Abe
> Priority: Minor
> Fix For: ManifoldCF 0.6
>
> Attachments: CONNECTORS-477.patch
>
>
> When a URL includes a full-width space (U+3000), MCF cannot ingest the
> corresponding document.
> For example:
> 1. In the file name:
> http://server/site1/Shared%20Documents/test/aaa bbb.txt
> 2. In the path:
> http://localhost/aaa bbb/aaa.txt
> MCF's log says:
> {noformat}
> WEB: Can't use url '/site1/Shared%20Documents/test/aaa bbb.txt' because it is
> badly formed: Illegal character in path at index 34:
> /site1/Shared%20Documents/test/aaa bbb.txt
> {noformat}
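
The log message above has the shape of a java.net.URISyntaxException, so the
failure can probably be reproduced outside MCF. The sketch below assumes the
URL is parsed with java.net.URI, which rejects any character for which
Character.isSpaceChar is true, including U+3000. The class and method names
(FullWidthSpaceDemo, encodeSpaces, illegalIndex) are hypothetical, and the
percent-encoding shown is only one possible workaround, not necessarily what
CONNECTORS-477.patch does.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of ManifoldCF.
final class FullWidthSpaceDemo {

    /** Percent-encodes every Unicode space character (U+0020, U+3000, ...) as UTF-8 bytes. */
    static String encodeSpaces(String rawUrl) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rawUrl.length(); ) {
            int cp = rawUrl.codePointAt(i);
            if (Character.isSpaceChar(cp)) {
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    sb.append(String.format("%%%02X", b & 0xFF));
                }
            } else {
                // Everything else, including existing escapes such as %20, is kept as-is.
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    /** Returns the index of the first character java.net.URI rejects, or -1 if the URL parses. */
    static int illegalIndex(String url) {
        try {
            new URI(url);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String raw = "/site1/Shared%20Documents/test/aaa\u3000bbb.txt";
        System.out.println(illegalIndex(raw));   // 34, matching the index in the log
        String fixed = encodeSpaces(raw);
        System.out.println(fixed);               // /site1/Shared%20Documents/test/aaa%E3%80%80bbb.txt
        System.out.println(illegalIndex(fixed)); // -1, the encoded URL parses
    }
}
```

Whether a given server resolves the %E3%80%80 form to the same document depends
on how that server decodes request paths, so this should be checked against the
target site before relying on it.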