Re: SharePoint: Error closing connection to file

2012-08-15 Thread Ahmet Arslan
 I just did a check-in which should fix the NPE.

Hi Karl,

Not fully tested, but I think this commit fixed the issue. I ran a few 
crawls without problems. Thanks for it.

I also posted about the weird icon that pops up on the Solr user ML, with 
images attached: http://search-lucene.com/m/IeWzIc11mS

Ahmet


Re: SharePoint: Error closing connection to file

2012-08-14 Thread Ahmet Arslan

Hi Karl,

Somehow those scanned PDF files do not throw an exception.
I tried sending them using curl:

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=@ticaret_sicil_gazetesi.pdf"

No exception in the Solr logs, and the file is indexed. But when I do this, a 
Java coffee-cup icon appears in the Dock. I don't know what this is. I will 
investigate further on the Tika/Solr side.

Thanks for your support on this.

Anyway, I still sometimes get:
Got an unknown remote exception accessing site - axis fault = Server.userException, detail = java.net.UnknownHostException: null

I see the following entries in manifoldcf.log:

 
 WARN 2012-08-14 17:39:41,099 (Thread-10418) - Cookie rejected: $Version=0; http%3A%2F%2Fiknowtest%2FDiscovery=WorkspaceSiteName=SUtOb3c=WorkspaceSiteUrl=aHR0cDovL2lrbm93dGVzdA==WorkspaceSiteTime=MjAxMi0wOC0xNFQxNDozOTo0MQ==; $Path=/_vti_bin/Discovery.asmx. Illegal path attribute /_vti_bin/Discovery.asmx. Path of origin: /Pages/denemeIkGeneralPage0712-6740.aspx
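
Note on the rejected cookie above: the three values after WorkspaceSiteName=, WorkspaceSiteUrl= and WorkspaceSiteTime= look like Base64 strings. A minimal Java sketch to decode them, assuming they are plain Base64-encoded UTF-8 text (the class and method names are made up for illustration, and java.util.Base64 needs Java 8 or later):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class CookieFieldDecoder {
    // Decodes one Base64 token into text, assuming UTF-8 content.
    public static String decode(String token) {
        return new String(Base64.getDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Tokens copied from the "Cookie rejected" log entry.
        System.out.println("WorkspaceSiteName = " + decode("SUtOb3c="));
        System.out.println("WorkspaceSiteUrl  = " + decode("aHR0cDovL2lrbm93dGVzdA=="));
        System.out.println("WorkspaceSiteTime = " + decode("MjAxMi0wOC0xNFQxNDozOTo0MQ=="));
    }
}
```

Under that assumption the tokens decode to IKNow, http://iknowtest and 2012-08-14T14:39:41, so the cookie seems to carry only the site name, site URL and a timestamp. The warning itself complains about the cookie's path attribute, not about these values.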


FATAL 2012-08-14 17:55:55,096 (Startup thread) - Error tossed: null
java.lang.NullPointerException
    at org.apache.manifoldcf.crawler.interfaces.QueueTracker$PriorityKey.hashCode(QueueTracker.java:726)
    at java.util.HashMap.get(HashMap.java:300)
    at org.apache.manifoldcf.crawler.interfaces.QueueTracker.calculatePriority(QueueTracker.java:518)
    at org.apache.manifoldcf.crawler.system.SeedingActivity.writeSeedDocuments(SeedingActivity.java:225)
    at org.apache.manifoldcf.crawler.system.SeedingActivity.doneSeeding(SeedingActivity.java:165)
    at org.apache.manifoldcf.crawler.system.StartupThread.run(StartupThread.java:181)
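
Note on the trace above: the NPE is thrown from PriorityKey.hashCode while HashMap.get is hashing the key, which typically happens when hashCode dereferences a field that is null. The real PriorityKey source and the actual fix are not shown in this thread, so the following is only a hypothetical Java sketch (the BinKey class and its field are invented) of a map key that tolerates a null field:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class NullSafeKeyDemo {
    // Hypothetical key class; NOT the real QueueTracker.PriorityKey.
    static final class BinKey {
        private final String binName; // may be null in this sketch

        BinKey(String binName) {
            this.binName = binName;
        }

        @Override
        public int hashCode() {
            // Objects.hashCode returns 0 for null instead of throwing.
            return Objects.hashCode(binName);
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof BinKey)) return false;
            // Objects.equals treats two nulls as equal.
            return Objects.equals(binName, ((BinKey) other).binName);
        }
    }

    public static void main(String[] args) {
        Map<BinKey, Double> priorities = new HashMap<>();
        priorities.put(new BinKey(null), 1.0);
        // A lookup with a null field does not throw from hashCode().
        System.out.println(priorities.get(new BinKey(null)));
    }
}
```

Whether a null should be tolerated like this or rejected earlier depends on the caller; the sketch only shows the defensive variant.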

--- On Tue, 8/14/12, Karl Wright daddy...@gmail.com wrote:

 From: Karl Wright daddy...@gmail.com
 Subject: Re: SharePoint: Error closing connection to file
 To: dev@manifoldcf.apache.org
 Date: Tuesday, August 14, 2012, 9:32 AM

 I've committed a fix to how the WorkerThread handles service
 interruptions.  This should eliminate the unexpected value
 exception.  Could you confirm that it does?
 
 After that, I believe you will have to look at your Tika setup on Solr
 to figure out how to avoid having PDFs blow up the pipeline.  You
 should confirm first that Tika is indeed throwing an exception when a
 PDF is sent to it, of course, and that Solr is closing the http
 connection under those conditions.
 
 Thanks,
 Karl
 

Re: SharePoint: Error closing connection to file

2012-08-14 Thread Karl Wright
I just did a check-in which should fix the NPE.

The other exception is a warning; the crawler should retry the
document when that happens, so I would not get excited unless the job
aborts.

Karl
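
Note: to illustrate what "retry the document" can look like in general, here is a small Java sketch of a bounded retry around an I/O operation. It is not ManifoldCF code: the interface, class and method names are invented for the example, and the crawler's own retry handling is not shown in this thread.

```java
import java.io.IOException;

public class RetryDemo {
    // Hypothetical unit of work that may fail with an I/O error.
    interface IoTask<T> {
        T run() throws IOException;
    }

    // Runs the task up to maxAttempts times, pausing between failed attempts.
    static <T> T withRetry(IoTask<T> task, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.run();
            } catch (IOException e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated fetch: fails twice with "Connection reset", then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection reset");
            }
            return "fetched";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With a bounded number of attempts, a transient "Connection reset" is absorbed, while a persistent failure still surfaces as an exception that the caller can treat as an abort.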



SharePoint: Error closing connection to file

2012-08-13 Thread Ahmet Arslan
Hello,

Using the Solr output connector and the SP2010 repository connector, I am 
indexing a document library named Documents. This library has some scanned PDF 
documents. The very first crawl indexes all 91 docs.
When I hit "Re-ingest all associated documents" and start a second crawl, I get:
Error: Unexpected jobqueue status - record id 1344907007021, expecting active 
status, saw 3

When I look at 
http://iknowtest/Documents/ik_docs/vize_evraklari/ticaret_sicil_gazetesi.pdf, 
it is an image (scanned) PDF.

Here is the stack trace:

 WARN 2012-08-14 05:13:22,068 (Worker thread '39') - SharePoint: Error closing connection to file 'http://iknowtest/Documents/ik_docs/vize_evraklari/ticaret_sicil_gazetesi.pdf': Connection reset
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:113)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.commons.httpclient.ContentLengthInputStream.read(Unknown Source)
    at org.apache.commons.httpclient.ContentLengthInputStream.read(Unknown Source)
    at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(Unknown Source)
    at org.apache.commons.httpclient.ContentLengthInputStream.close(Unknown Source)
    at java.io.FilterInputStream.close(FilterInputStream.java:155)
    at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(Unknown Source)
    at org.apache.commons.httpclient.AutoCloseInputStream.close(Unknown Source)
    at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1457)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:549)
DEBUG 2012-08-14 05:13:22,072 (Worker thread '42') - SharePoint: Path attribute name is null
 WARN 2012-08-14 05:13:22,081 (Worker thread '39') - SharePoint: IOException thrown: Connection reset
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at org.apache.commons.httpclient.ContentLengthInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.commons.httpclient.AutoCloseInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at org.apache.commons.httpclient.AutoCloseInputStream.read(Unknown Source)
    at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:1447)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:549)
 WARN 2012-08-14 05:13:22,186 (Worker thread '39') - Service interruption reported for job 1344906886879 connection 'SP2010': SharePoint is down attempting to read 'http://iknowtest/Documents/ik_docs/vize_evraklari/ticaret_sicil_gazetesi.pdf', retrying: Connection reset
ERROR 2012-08-14 05:13:22,230 (Worker thread '39') - Exception tossed: Unexpected jobqueue status - record id 1344907007021, expecting active status, saw 3
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1344907007021, expecting active status, saw 3
    at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:711)
    at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2435)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:745)


Re: SharePoint: Error closing connection to file

2012-08-13 Thread Ahmet Arslan

Also, after this, when I hit "View Repository Connection Status" I get:

Got an unknown remote exception accessing site - axis fault = 
Server.userException, detail = java.net.UnknownHostException: null

After I restart MCF, I get "Connection status: Connection working" on the 
View Repository Connection Status page.




Re: SharePoint: Error closing connection to file

2012-08-13 Thread Karl Wright
There are two different issues here.  The first one is that you are
having a connection close on you; not sure the reason why, but could
potentially be caused by a Tika exception in Solr.  The second is that
the refactored WorkerThread code I checked in Sunday might have a bug
in handling exceptions of this kind.

I'll have a look at these and get back to you shortly.

Karl

On Mon, Aug 13, 2012 at 10:28 PM, Ahmet Arslan iori...@yahoo.com wrote:
 If I modify my Path Rules to index only *.doc and *.docx files, I can 
 re-index over and over without restarting anything. Everything works fine.
 It seems that there is a problem with non text extractable files.

 /Documents/*.doc     file    include
 /Documents/*.docx    file    include
