[
https://issues.apache.org/jira/browse/CONNECTORS-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420230#comment-13420230
]
Karl Wright commented on CONNECTORS-492:
----------------------------------------
I checked in a modified WSDL and a corresponding new MCPermissions method into
the SharePoint 2010 Plugin trunk. I based this on the Lists "ListItems"
operation, adding arguments for pagination.
My concern with the approach so far is that pagination in the web service
won't fix the whole problem: it limits the memory the connector needs, but it
does not limit the size of the queries run on the SharePoint server side. The
ContentIterator paradigm, as described at
http://msdn.microsoft.com/en-us/library/ff798376, is the right idea, but it is
structured as a callback, which does not coexist well with SOAP, since SOAP is
a synchronous protocol. So my thought (so far) is to use the content iterator
approach but pass in a starting row and row count, so that although the entire
content iterator is scanned on each SOAP request, only a subsection is
recorded for the response. This is inefficient, and grows ever more
inefficient as the size of the library grows, but at least it is memory
constrained on both sides. As long as we make the row count large enough (e.g.
20,000), it is unlikely that multiple requests will be needed. What do you
think?
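
To make the proposal concrete, here is a minimal Java sketch of the idea, not
the checked-in plugin code. Every name in it (PagedScanSketch, iterateAll,
getPage, fetchAll) is hypothetical, and an in-memory list stands in for the
SharePoint library. The "server" walks the whole callback-style iterator on
each request but records only the rows in the requested window; the "client"
keeps asking for the next window until it gets a short page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PagedScanSketch {
    /** Hypothetical stand-in for a callback-style iterator over a library. */
    static void iterateAll(List<String> library, Consumer<String> callback) {
        for (String item : library) {
            callback.accept(item);
        }
    }

    /**
     * "Server" side: scans every item on each request, but records only the
     * items whose index falls in [startRow, startRow + rowCount).
     */
    static List<String> getPage(List<String> library, int startRow, int rowCount) {
        List<String> page = new ArrayList<>();
        int[] index = {0}; // mutable counter usable from inside the lambda
        iterateAll(library, item -> {
            int i = index[0]++;
            if (i >= startRow && i - startRow < rowCount) {
                page.add(item);
            }
        });
        return page;
    }

    /** "Client" side: request pages until a short page signals the end. */
    static List<String> fetchAll(List<String> library, int rowCount) {
        List<String> all = new ArrayList<>();
        int startRow = 0;
        while (true) {
            List<String> page = getPage(library, startRow, rowCount);
            all.addAll(page);
            if (page.size() < rowCount) {
                break; // fewer rows than requested: nothing left to fetch
            }
            startRow += rowCount;
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> library = new ArrayList<>();
        for (int i = 0; i < 45; i++) {
            library.add("doc" + i);
        }
        // 45 items with a row count of 20 takes three requests: 20 + 20 + 5.
        System.out.println(fetchAll(library, 20).size()); // prints 45
    }
}
```

Each request in this sketch holds at most rowCount items in memory, but every
request still scans the full library, so the total work grows with the number
of requests. That is why a large row count keeps the cost tolerable.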
> SharePoint connector on SP2010 throws exception when there are too many
> documents in a library
> ----------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-492
> URL: https://issues.apache.org/jira/browse/CONNECTORS-492
> Project: ManifoldCF
> Issue Type: Bug
> Components: SharePoint connector
> Affects Versions: ManifoldCF 0.7
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 0.7
>
>
> When a library contains more documents than the list view threshold set by
> the administrator, no documents in the library are crawled. Instead, the
> following exception is thrown:
> {code}
> DEBUG 2012-07-16 23:58:04,036 (Worker thread '19') - Mapping Exception to AxisFault
> AxisFault
>  faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server
>  faultSubcode:
>  faultString: Exception of type 'Microsoft.SharePoint.SoapServer.SoapServerException' was thrown.
>  faultActor:
>  faultNode:
>  faultDetail:
>     {http://schemas.microsoft.com/sharepoint/soap/}errorstring:The attempted operation is prohibited because it exceeds the list view threshold enforced by the administrator.
>     {http://schemas.microsoft.com/sharepoint/soap/}errorcode:0x80070024
> Exception of type 'Microsoft.SharePoint.SoapServer.SoapServerException' was thrown.
>     at org.apache.axis.message.SOAPFaultBuilder.createFault(SOAPFaultBuilder.java:222)
>     at org.apache.axis.message.SOAPFaultBuilder.endElement(SOAPFaultBuilder.java:129)
>     at org.apache.axis.encoding.DeserializationContext.endElement(DeserializationContext.java:1087)
>     at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
>     at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source)
>     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
>     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>     at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>     at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>     at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
>     at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
>     at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
>     at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696)
>     at org.apache.axis.Message.getSOAPEnvelope(Message.java:435)
>     at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnderstandChecker.java:62)
>     at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
>     at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
>     at org.apache.axis.client.Call.invoke(Call.java:2767)
>     at org.apache.axis.client.Call.invoke(Call.java:2443)
>     at org.apache.axis.client.Call.invoke(Call.java:2366)
>     at org.apache.axis.client.Call.invoke(Call.java:1812)
>     at com.microsoft.schemas.sharepoint.soap.ListsSoapStub.getListItems(ListsSoapStub.java:1841)
>     at org.apache.manifoldcf.crawler.connectors.sharepoint.SPSProxyHelper.getDocuments(SPSProxyHelper.java:629)
>     at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:909)
>     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:561)
> DEBUG
> {code}