Your disk is not writable for some reason, and that's interfering with ManifoldCF 2.8 locking.
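One quick way to confirm this is to run a small probe under the same Windows account that runs the ManifoldCF processes. The following is only a minimal sketch: the class name `SynchDirCheck` and the method `probe` are hypothetical, and it does not reproduce ManifoldCF's own locking code. It just checks whether a file can be created, written, and deleted in the directory you pass in (for example, the synch directory from the log message).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of ManifoldCF.
public class SynchDirCheck {

    /** Returns null if a file could be created, written, and deleted in dir; otherwise a short failure description. */
    public static String probe(Path dir) {
        if (!Files.isDirectory(dir)) {
            return "not a directory: " + dir;
        }
        Path probeFile = null;
        try {
            // Create a uniquely named file so we never touch real lock files.
            probeFile = Files.createTempFile(dir, "probe-", ".lock");
            Files.write(probeFile, new byte[] {1});
            return null;
        } catch (IOException | SecurityException e) {
            // On Windows, "Access is denied" typically surfaces here as AccessDeniedException.
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        } finally {
            if (probeFile != null) {
                try {
                    Files.deleteIfExists(probeFile);
                } catch (IOException ignored) {
                    // Best-effort cleanup only.
                }
            }
        }
    }

    public static void main(String[] args) {
        // Pass the synch directory as the first argument; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        String problem = probe(dir);
        System.out.println(problem == null
                ? "OK: " + dir + " is writable"
                : "FAILED: " + problem);
    }
}
```

If this reports a failure for the synch directory, the cause is likely file permissions, an antivirus or backup tool holding the files, or a full disk, rather than anything in the connectors.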
I would suggest two things:

(1) Use ZooKeeper for sync instead of file-based sync.
(2) Check whether you still get failures after that.

Thanks,
Karl

On Wed, Aug 30, 2017 at 9:37 AM, Beelz Ryuzaki <[email protected]> wrote:

> Hi Mr Karl,
>
> Thank you Mr Karl for your quick response. I have looked into the
> ManifoldCF log file and extracted the following warnings:
>
> - Attempt to set file lock 'D:\xxxx\apache_manifoldcf-2.
> 8\multiprocess-file-example\.\.\synch
> area\569\352\lock-_POOLTARGET_OUTPUTCONNECTORPOOL_ES
> (Lowercase) Synapses.lock' failed : Access is denied.
>
> - Couldn't write to lock file; disk may be full. Shutting down process;
> locks may be left dangling. You must cleanup before restarting.
>
> "ES (lowercase) synapses" is the Elasticsearch output connection.
> Moreover, the job uses Tika to extract metadata and a file system as a
> repository connection. During the job, I don't extract the content of the
> documents. I was wondering whether the issue comes from Elasticsearch?
>
> Othman.
>
> On Wed, 30 Aug 2017 at 14:08, Karl Wright <[email protected]> wrote:
>
>> Hi Othman,
>>
>> ManifoldCF aborts a job if there's an error that looks like it might go
>> away on retry, but does not. It can be either on the repository side or on
>> the output side. If you look at the Simple History in the UI, or at the
>> manifoldcf.log file, you should be able to get a better sense of what went
>> wrong. Without further information, I can't say any more.
>>
>> Thanks,
>> Karl
>>
>> On Wed, Aug 30, 2017 at 5:33 AM, Beelz Ryuzaki <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> I'm Othman Belhaj, a software engineer from Société Générale in France.
>>> I'm currently using your recent version, ManifoldCF 2.8. I'm working on
>>> an internal search engine. For this reason, I'm using ManifoldCF to
>>> index documents on Windows shares. I encountered a serious problem while
>>> crawling 35K documents. Most of the time, when ManifoldCF starts crawling
>>> a large document (19 MB, for example), it ends the job with the following
>>> error: repeated service interruptions - failure processing document :
>>> software caused connection abort: socket write error.
>>> Can you give me some tips on how to solve this problem, please?
>>>
>>> I use PostgreSQL 9.3.x and Elasticsearch 2.1.0.
>>> I'm looking forward to your response.
>>>
>>> Best regards,
>>>
>>> Othman BELHAJ
>>>
>>
>>
