I have now tried upgrading the plugin on SharePoint and rebuilding the connector from trunk, but it throws this exception:

ERROR 2012-09-06 20:20:00,993 (Worker thread '41') - Exception tossed: Internal error: Relative path 'Library Custom/actions-article-v2.pdf' was expected to start with '/my/personal/administrator/demosite'
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Internal error: Relative path 'Library Custom/actions-article-v2.pdf' was expected to start with '/my/personal/administrator/demosite'
    at org.apache.manifoldcf.crawler.connectors.sharepoint.SPSProxyHelper.getChildren(SPSProxyHelper.java:655)
    at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1303)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:561)

It seems that there is an issue with the URL again.

Hope this helps.

Piergiorgio

2012/9/6 Karl Wright <[email protected]>

> I checked in a fix for the pagination issue in
> integration/sharepoint-2010/trunk. Care to build the plugin, deploy
> it, and let me know if it works now?
>
> Karl
>
>
> On Thu, Sep 6, 2012 at 1:26 PM, Karl Wright <[email protected]> wrote:
> > I conclude that the plugin is not handling paging properly - there's
> > no other explanation. So I am canceling the vote and will try to
> > check in a fix.
> >
> > Karl
> >
> > On Thu, Sep 6, 2012 at 1:11 PM, Karl Wright <[email protected]> wrote:
> >> It looks like two problems here. First, it looks like Solr is
> >> throwing a 500 error for at least one of the documents in your set.
> >>
> >> However, the fact that you only get 1000 documents indexed also shows
> >> that the code is still broken in some way. I will check into whether
> >> this looks like a problem in the connector or in the plugin.
> >>
> >> Karl
> >>
> >> On Thu, Sep 6, 2012 at 1:06 PM, Ahmet Arslan <[email protected]> wrote:
> >>> Hi Karl,
> >>>
> >>> With a document library that 7,888 items, I setup a crawl with mcf-trunk. Sometimes I get this exception : Error: Repeated service interruptions - failure processing document: Ingestion HTTP error code 500
> >>>
> >>> If i don't get exception only 1000 docs are indexed.
> >>>
> >>> ERROR 2012-09-06 19:55:13,587 (Worker thread '30') - Exception tossed: Repeated service interruptions - failure processing document: Ingestion HTTP error code 500
> >>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: Ingestion HTTP error code 500
> >>>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> >>> Caused by: org.apache.manifoldcf.core.interfaces.ManifoldCFException: Ingestion HTTP error code 500
> >>>     at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:1386)
> >>>
> >>> Ahmet
> >>> --- On Wed, 9/5/12, Karl Wright <[email protected]> wrote:
> >>>
> >>>> From: Karl Wright <[email protected]>
> >>>> Subject: [VOTE] Release Apache ManifoldCF SharePoint 2010 plugin 0.1 RC0
> >>>> To: "dev" <[email protected]>
> >>>> Date: Wednesday, September 5, 2012, 11:52 PM
> >>>> Vote +1 if you think the Apache
> >>>> ManifoldCF SharePoint 2010 plugin 0.1
> >>>> RC0 is ready for release.
> >>>>
> >>>> The release artifact can be found at:
> >>>> http://people.apache.org/~kwright/apache-manifoldcf-sharepoint-2010-plugin-0.1
> >>>> .
> >>>>
> >>>> There is also a release tag at:
> >>>> https://svn.apache.org/repos/asf/manifoldcf/integration/sharepoint-2010/tags/release-0.1-RC0
> >>>>
> >>>> Karl
> >>>>
>
> --
> Piergiorgio Lucidi
> http://www.open4dev.com
