On Wed, Aug 1, 2012 at 5:48 AM, Shinichiro Abe <[email protected]> wrote:
> Hi Karl,
>
> I still have a problem.
> I reduced maximum number of connections into 2.
> I rebooted the file server, not domain controller.
> When I configured the paths[1], the log said no error
> and ShareDrive connector crawled the files successfully.
> When I made the path's config default(matching * ),
> the log said "all pipe instances are busy" error.
> Both of path's config pointed the same location.
>
> Also when this error occurred, watching the log of ingest,
> HttpPoster was waiting for response stream
> and couldn't get response from Solr,
> and threw SocketTimeoutException.
> I increased jcifs.smb.client.responseTimeout
> but still threw the exception.
> On Solr, Jetty threw SocketException(socket write error).
> I'm working on checking Solr logs.
> Solr may do something wrong when running /update/extract.
>

If Solr threw the exception this sounds likely.

> Do you know something like this?
> Does path's matching config affect those errors?
>
> [1]Paths Tab:
> Include directory(s) matching /01*
>

This should have nothing to do with socket exceptions, except possibly
that the crawler winds up trying to read a file that isn't actually a
file but is something else, like a named pipe or something. This
typically doesn't happen if the server is a Windows machine but if it
is a Samba server I could imagine something like that happening.

Karl

> P.S.
> Thank you for fix CONNECTORS-494.
> I checked trunk code, worked well.
>
> Thank you,
> Shinichiro Abe
>
> On 2012/07/24, at 22:13, Karl Wright wrote:
>
>> Hi Abe-san,
>>
>> Did you figure out what the problem was?
>>
>> Karl
>>
>> On Thu, Jul 19, 2012 at 5:52 AM, Karl Wright <[email protected]> wrote:
>>> Hi Abe-san,
>>>
>>> Sometimes what looks like a server error can actually be due to the
>>> domain controller. I wonder if the domain controller needs to be
>>> rebooted?
>>>
>>> Karl
>>>
>>> On Thu, Jul 19, 2012 at 5:12 AM, Shinichiro Abe
>>> <[email protected]> wrote:
>>>> Hi Karl,
>>>> Thank you for the reply.
>>>> I tried to reduce maximum number of connections from 10
>>>> to 5, but didn't avoid busy error. I'll try to reduce more.
>>>> Thank you.
>>>> Shinichiro Abe
>>>>
>>>> On 2012/07/19, at 15:55, Karl Wright wrote:
>>>>
>>>>> Hi Abe-san,
>>>>>
>>>>> The "all pipe instances are busy" error is coming from the Windows
>>>>> server you are trying to crawl. I don't know what is happening there
>>>>> but here are some possibilities:
>>>>>
>>>>> (1) The Windows server is just overloaded; you can try reducing the
>>>>> maximum number of connections to 2 or 3 to see if that helps.
>>>>> (2) The Windows server needs rebooting.
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>> On Wed, Jul 18, 2012 at 10:09 PM, Shinichiro Abe
>>>>> <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I use windows shares connector and ran a job.
>>>>>> The job was aborted without done normally and the job's status said:
>>>>>> Error: Repeated service interruptions - failure processing document:
>>>>>> Read timed out
>>>>>>
>>>>>> Why was the job aborted? I use ManifoldCF 0.5.1 and the latest version's
>>>>>> jcifs.jar.
>>>>>> Is the crawled server busy? I think the server MCF is installed seems
>>>>>> not to be busy,
>>>>>> the other servers in which MCF will crawls seem to be busy.
>>>>>> How can I run the job without error? What's wrong?
>>>>>>
>>>>>>
>>>>>> the logs of connector:
>>>>>>
>>>>>> WARN 2012-07-12 16:28:52,648 (Worker thread '19') - JCIFS: Possibly
>>>>>> transient exception detected on attempt 1 while getting share security:
>>>>>> All pipe instances are busy.
>>>>>> at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
>>>>>> at jcifs.smb.SmbTransport.send(SmbTransport.java:663)
>>>>>> ..
>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: Possibly
>>>>>> transient exception detected on attempt 3 while getting share security:
>>>>>> All pipe instances are busy.
>>>>>> ..
>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: 'Busy'
>>>>>> response when getting document version for
>>>>>> smb://XX.XX.XX.XX/D$/abcde/1234/123456789/e123456789a.pdf: retrying...
>>>>>> ..
>>>>>> WARN 2012-07-12 16:36:37,585 (Worker thread '19') - Pre-ingest service
>>>>>> interruption reported for job 1342076182624 connection 'Windows shares':
>>>>>> Timeout or other service interruption: All pipe instances are busy.
>>>>>> ..
>>>>>> WARN 2012-07-12 19:14:30,335 (Worker thread '19') - Service interruption
>>>>>> reported for job 1342076182624 connection 'Windows shares': Ingestion
>>>>>> API socket timeout exception waiting for response code: Read timed out;
>>>>>> ingestion will be retried again later
>>>>>> ..
>>>>>> WARN 2012-07-12 20:43:50,210 (Worker thread '19') - Service interruption
>>>>>> reported for job 1342076182624 connection 'Windows shares': Ingestion
>>>>>> API socket timeout exception waiting for response code: Read timed out;
>>>>>> ingestion will be retried again later
>>>>>> ..
>>>>>> ERROR 2012-07-12 20:43:50,210 (Worker thread '19') - Exception tossed:
>>>>>> Repeated service interruptions - failure processing document: Read timed
>>>>>> out
>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated
>>>>>> service interruptions - failure processing document: Read timed out
>>>>>> at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:606)
>>>>>> Caused by: java.net.SocketTimeoutException: Read timed out
>>>>>> at java.net.SocketInputStream.socketRead0(Native Method)
>>>>>> at java.net.SocketInputStream.read(Unknown Source)
>>>>>> at java.net.SocketInputStream.read(Unknown Source)
>>>>>> at org.apache.manifoldcf.agents.output.solr.HttpPoster.readLine(HttpPoster.java:571)
>>>>>> at org.apache.manifoldcf.agents.output.solr.HttpPoster.getResponse(HttpPoster.java:598)
>>>>>>
>>>>>> Thanks in advance,
>>>>>> Shinichiro Abe
