Hi Radek,

I've created ticket CONNECTORS-1025 to track this issue and attached a patch. I've also committed the fix to trunk and to the dev_1x branch.

Thanks,
Karl

On Mon, Sep 8, 2014 at 7:05 PM, Radek Sklenicka <[email protected]> wrote:

> Hello,
>
> I’m seeing the following issue when crawling SharePoint 2013.
>
> Manifold job gets terminated with an error when trying to fetch files that
> are 'blocked' in SharePoint 2013.
>
> This can happen when files of certain types are uploaded into SP and then
> the file type (e.g. exe, dll, sp1) is added into the list of blocked file
> types.
>
> We tried excluding the blocked file types in the Paths rules, but we got
> the same error.
>
> Would it be possible to get Manifold skipping the files that are blocked
> by SP setup and just log warnings/errors rather than completely abort the
> job?
>
> Thanks,
>
> Radek
>
> ERROR 2014-09-08 11:52:50,005 (Worker thread '5') - Exception tossed: Error fetching document 'http://sp2013/sites/demo/test/blocked%20files/tmp.ps1': 415
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Error fetching document 'http://sp2013/sites/demo/test/blocked%20files/tmp.ps1': 415
>     at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.fetchAndIndexFile(SharePointRepository.java:1915)
>     at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1774)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:677)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:670)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:649)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:402)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:380)
