I'm using the file-based example, and I reproduced all the changes you told
me to make there. I'll try to install zookeeper and use the zookeeper
example. Will I need to do any configuration in order to run the zookeeper
example?

Othman.

On Thu, 31 Aug 2017 at 17:46, Karl Wright <[email protected]> wrote:

> Are you using the zookeeper example, or the file-based example?
>
> If these jars have all been moved, and the options.env includes them, then
> I have to conclude that Apache POI's pom.xml is incorrect too.  It will
> take a while to figure out what's missing that poi-ooxml.jar needs that is
> not listed.
>
> Karl
>
>
> On Thu, Aug 31, 2017 at 11:39 AM, Beelz Ryuzaki <[email protected]>
> wrote:
>
>> All the dependencies you mentioned have already been added in the
>> options.env.win file in the multiprocess-file-example repository.
>>
>> On Thu, 31 Aug 2017 at 17:33, Beelz Ryuzaki <[email protected]> wrote:
>>
>>> Yes, I added it in the options.env.win file. Should it be the one in the
>>> multiprocess-zk-example document or multiprocess-file-example ?
>>>
>>> On Thu, 31 Aug 2017 at 17:30, Karl Wright <[email protected]> wrote:
>>>
>>>> It's not related at all to elasticsearch.
>>>> Karl
>>>>
>>>>
>>>> On Thu, Aug 31, 2017 at 11:26 AM, Beelz Ryuzaki <[email protected]>
>>>> wrote:
>>>>
>>>>> Could it be a problem with elasticsearch's version? I'm actually using
>>>>> 2.1.0, which is pretty old for this new version of ManifoldCF.
>>>>>
>>>>> Othman.
>>>>>
>>>>> On Thu, 31 Aug 2017 at 17:23, Beelz Ryuzaki <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I moved back both the jars you mentioned and a different error is
>>>>>> showing. You will find the stack trace attached.
>>>>>>
>>>>>> Thanks,
>>>>>> Othman
>>>>>>
>>>>>> On Thu, 31 Aug 2017 at 17:09, Karl Wright <[email protected]> wrote:
>>>>>>
>>>>>>> I've looked at the dependencies; you should not have moved
>>>>>>> poi-3.15.jar.  Please move that back, and commons-collections4-4.1.jar 
>>>>>>> too.
>>>>>>>
>>>>>>> You *will* need to move curvesapi-1.04.jar though.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Karl
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 31, 2017 at 11:04 AM, Karl Wright <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> If you include poi.jar, then all dependencies of poi.jar must also
>>>>>>>> be included.  This would mean that curvesapi-1.04.jar and
>>>>>>>> commons-collections4-4.1.jar should also be included.
>>>>>>>>
>>>>>>>> Karl
>>>>>>>>
>>>>>>>> On Thu, Aug 31, 2017 at 10:23 AM, Beelz Ryuzaki <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Karl,
>>>>>>>>>
>>>>>>>>> I added the two jars that you mentioned and another one:
>>>>>>>>> poi-3.15.jar. Unfortunately, there is another error showing. This
>>>>>>>>> time, it concerns Excel files. You will find the stack trace attached.
>>>>>>>>>
>>>>>>>>> Othman.
>>>>>>>>>
>>>>>>>>> On Thu, 31 Aug 2017 at 15:32, Karl Wright <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Othman,
>>>>>>>>>>
>>>>>>>>>> Yes, this shows that the jar we moved calls back into another
>>>>>>>>>> jar, which will also need to be moved.  *That* jar has yet another
>>>>>>>>>> dependency too.
>>>>>>>>>>
>>>>>>>>>> The list of jars is thus extended to include:
>>>>>>>>>>
>>>>>>>>>> poi-ooxml-3.15.jar
>>>>>>>>>> dom4j-1.6.1.jar
>>>>>>>>>>
>>>>>>>>>> Karl
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 31, 2017 at 9:25 AM, Beelz Ryuzaki <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> You will find attached the stack trace. My apologies for the bad
>>>>>>>>>>> quality of the image; I'm doing my best to send you the stack trace,
>>>>>>>>>>> as I don't have the right to send documents outside the company.
>>>>>>>>>>>
>>>>>>>>>>> Thank you for your time,
>>>>>>>>>>>
>>>>>>>>>>> Othman
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:16, Karl Wright <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Once again, I need a stack trace to diagnose what the problem
>>>>>>>>>>>> is.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Karl
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:14 AM, Beelz Ryuzaki <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Oh, actually it didn't solve the problem. I looked into the
>>>>>>>>>>>>> log file and saw the following error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Error tossed : org/apache/poi/POIXMLTypeLoader
>>>>>>>>>>>>> java.lang.NoClassDefFoundError:
>>>>>>>>>>>>> org/apache/poi/POIXMLTypeLoader.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe another jar is missing ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:01, Beelz Ryuzaki <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have tried what you told me to do, and as you expected, the
>>>>>>>>>>>>>> crawling resumed. How about the regular expressions? How can I
>>>>>>>>>>>>>> write complex regular expressions in the job's Paths tab?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you very much for your help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:47, Beelz Ryuzaki <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ok, I will try it right away and let you know if it works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:15, Karl Wright <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Oh, and you also may need to edit your options.env files to
>>>>>>>>>>>>>>>> include them in the classpath for startup.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:53 AM, Karl Wright <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you are amenable, there is another workaround you could
>>>>>>>>>>>>>>>>> try.  Specifically:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (1) Shut down all MCF processes.
>>>>>>>>>>>>>>>>> (2) Move the following two files from connector-common-lib
>>>>>>>>>>>>>>>>> to lib:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> xmlbeans-2.6.0.jar
>>>>>>>>>>>>>>>>> poi-ooxml-schemas-3.15.jar
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (3) Restart everything and see if your crawl resumes.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please let me know what happens.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:33 AM, Karl Wright <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I created a ticket for this: CONNECTORS-1450.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> One simple workaround is to use the external Tika server
>>>>>>>>>>>>>>>>>> transformer rather than the embedded Tika Extractor.  I'm
>>>>>>>>>>>>>>>>>> still looking into why the jar is not being found.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:08 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yes, I'm actually using the latest binary version, and
>>>>>>>>>>>>>>>>>>> my job got stuck on that specific file.
>>>>>>>>>>>>>>>>>>> The job status is still Running. You can see it in the
>>>>>>>>>>>>>>>>>>> attached file. For your information, the job started
>>>>>>>>>>>>>>>>>>> yesterday.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 13:04, Karl Wright <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It looks like a dependency of Apache POI is missing.
>>>>>>>>>>>>>>>>>>>> I think we will need a ticket to address this, if you
>>>>>>>>>>>>>>>>>>>> are indeed using the binary distribution.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 6:57 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm actually using the binary version. For security
>>>>>>>>>>>>>>>>>>>>> reasons, I can't send any files from my computer, so I
>>>>>>>>>>>>>>>>>>>>> copied the stack trace and scanned it with my cellphone.
>>>>>>>>>>>>>>>>>>>>> I hope it will be helpful.
>>>>>>>>>>>>>>>>>>>>> Meanwhile, I have read the documentation about how to
>>>>>>>>>>>>>>>>>>>>> restrict the crawling, and I don't think the '|' works
>>>>>>>>>>>>>>>>>>>>> in the pattern I specified. For instance, I would like
>>>>>>>>>>>>>>>>>>>>> to restrict the crawling for the documents that contain
>>>>>>>>>>>>>>>>>>>>> the word 'sound'. I proceed as follows: *(SON)*. The
>>>>>>>>>>>>>>>>>>>>> document is in capital letters, and I noticed that it
>>>>>>>>>>>>>>>>>>>>> wasn't taken into consideration.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 12:40, Karl Wright <
>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The way you restrict documents with the windows share
>>>>>>>>>>>>>>>>>>>>>> connector is by specifying information on the "Paths"
>>>>>>>>>>>>>>>>>>>>>> tab in jobs that crawl windows shares.  There is
>>>>>>>>>>>>>>>>>>>>>> end-user documentation, both online and distributed
>>>>>>>>>>>>>>>>>>>>>> with all binary distributions, that describes how to do
>>>>>>>>>>>>>>>>>>>>>> this.  Have you found it?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 5:25 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hello Karl,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank you for your response. I will start using
>>>>>>>>>>>>>>>>>>>>>>> zookeeper and I will let you know if it works. I have
>>>>>>>>>>>>>>>>>>>>>>> another question to ask. Actually, I need to apply
>>>>>>>>>>>>>>>>>>>>>>> some filters while crawling: I don't want to crawl
>>>>>>>>>>>>>>>>>>>>>>> some files and some folders. Could you give me an
>>>>>>>>>>>>>>>>>>>>>>> example of how to use the regex? Does the regex allow
>>>>>>>>>>>>>>>>>>>>>>> using /i to ignore case?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 19:53, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi Beelz,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> File-based sync is deprecated because people often
>>>>>>>>>>>>>>>>>>>>>>>> have problems with getting file permissions right,
>>>>>>>>>>>>>>>>>>>>>>>> and they do not understand how to shut processes down
>>>>>>>>>>>>>>>>>>>>>>>> cleanly, and zookeeper is resilient against that.  I
>>>>>>>>>>>>>>>>>>>>>>>> highly recommend using zookeeper sync.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF is engineered to not put files into
>>>>>>>>>>>>>>>>>>>>>>>> memory so you do not need huge amounts of memory.
>>>>>>>>>>>>>>>>>>>>>>>> The default values are more than enough for 35,000
>>>>>>>>>>>>>>>>>>>>>>>> files, which is a pretty small job for ManifoldCF.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 11:58 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually not using zookeeper. I want to know
>>>>>>>>>>>>>>>>>>>>>>>>> how zookeeper is different from file-based sync. I
>>>>>>>>>>>>>>>>>>>>>>>>> also need guidance on how to manage my PC's memory.
>>>>>>>>>>>>>>>>>>>>>>>>> How many GB should I allocate for the start-agent of
>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF? Is 4 GB enough to crawl 35K files?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 16:11, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Your disk is not writable for some reason, and
>>>>>>>>>>>>>>>>>>>>>>>>>> that's interfering with ManifoldCF 2.8 locking.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I would suggest two things:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> (1) Use Zookeeper for sync instead of file-based
>>>>>>>>>>>>>>>>>>>>>>>>>> sync.
>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Have a look if you still get failures after
>>>>>>>>>>>>>>>>>>>>>>>>>> that.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 9:37 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mr Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you Mr Karl for your quick response. I
>>>>>>>>>>>>>>>>>>>>>>>>>>> have looked into the ManifoldCF log file and
>>>>>>>>>>>>>>>>>>>>>>>>>>> extracted the following warnings:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> - Attempt to set file lock
>>>>>>>>>>>>>>>>>>>>>>>>>>> 'D:\xxxx\apache_manifoldcf-2.8\multiprocess-file-example\.\.\synch
>>>>>>>>>>>>>>>>>>>>>>>>>>> area\569\352\lock-_POOLTARGET_OUTPUTCONNECTORPOOL_ES
>>>>>>>>>>>>>>>>>>>>>>>>>>>  (Lowercase)
>>>>>>>>>>>>>>>>>>>>>>>>>>> Synapses.lock' failed : Access is denied.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> - Couldn't write to lock file; disk may be full.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Shutting down process; locks may be left dangling.
>>>>>>>>>>>>>>>>>>>>>>>>>>> You must cleanup before restarting.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> ES (lowercase) synapses being the elasticsearch
>>>>>>>>>>>>>>>>>>>>>>>>>>> output connection. Moreover, the job uses Tika to
>>>>>>>>>>>>>>>>>>>>>>>>>>> extract metadata and a file system as a repository
>>>>>>>>>>>>>>>>>>>>>>>>>>> connection. During the job, I don't extract the
>>>>>>>>>>>>>>>>>>>>>>>>>>> content of the documents. I was wondering if the
>>>>>>>>>>>>>>>>>>>>>>>>>>> issue comes from elasticsearch?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 14:08, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF aborts a job if there's an error
>>>>>>>>>>>>>>>>>>>>>>>>>>>> that looks like it might go away on retry, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>> does not.  It can be either on the repository side
>>>>>>>>>>>>>>>>>>>>>>>>>>>> or on the output side.  If you look at the Simple
>>>>>>>>>>>>>>>>>>>>>>>>>>>> History in the UI, or at the manifoldcf.log file,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> you should be able to get a better sense of what
>>>>>>>>>>>>>>>>>>>>>>>>>>>> went wrong.  Without further information, I can't
>>>>>>>>>>>>>>>>>>>>>>>>>>>> say any more.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 5:33 AM, Beelz Ryuzaki
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm Othman Belhaj, a software engineer from
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Société Générale in France. I'm actually using
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> your recent version of ManifoldCF, 2.8. I'm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> working on an internal search engine. For this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reason, I'm using ManifoldCF in order to index
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> documents on Windows shares. I encountered a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> serious problem while crawling 35K documents.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Most of the time, when ManifoldCF starts crawling
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a large document (19 MB, for example), it ends
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the job with the following error: repeated
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> service interruptions - failure processing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> document : software caused connection abort:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> socket write error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can you give me some tips on how to solve this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> problem, please?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I use PostgreSQL 9.3.x and elasticsearch 2.1.0.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm looking forward to your response.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman BELHAJ
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>
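
For errors like the java.lang.NoClassDefFoundError quoted in this thread, one
way to see which jar (if any) ships the missing class is to scan the lib and
connector-common-lib directories for it. What follows is only a minimal sketch
in Java and is not part of ManifoldCF: the class name JarClassFinder and the
method findJarsContaining are made up for illustration, and the directories to
scan are assumed to be passed on the command line.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper, not a ManifoldCF tool.
public class JarClassFinder {

    // Returns the sorted file names of the jars directly under dir that
    // contain the given class. className may use slashes (as printed in a
    // NoClassDefFoundError) or dots.
    public static List<String> findJarsContaining(Path dir, String className)
            throws IOException {
        String entryName = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getJarEntry(entryName) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: JarClassFinder <class-name> <dir> [<dir> ...]");
            return;
        }
        // Print, per directory, the jars that contain the class.
        for (int i = 1; i < args.length; i++) {
            Path dir = Paths.get(args[i]);
            System.out.println(dir + ": " + findJarsContaining(dir, args[0]));
        }
    }
}
```

With Java 11 or later it could be run from the example directory as
"java JarClassFinder.java org/apache/poi/POIXMLTypeLoader lib connector-common-lib"
(the directory names are the ones mentioned in the thread; adjust the relative
paths to your installation). An empty list for every directory would suggest
the class is not in any jar that was scanned.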

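On the question of using /i to ignore case: the thread does not establish
which pattern syntax the Paths tab accepts, so the following is only an
illustration of how Java's own java.util.regex handles it. Java has no
trailing /i; case-insensitivity is requested with the inline flag (?i) or with
Pattern.CASE_INSENSITIVE, and '|' expresses alternation. The class name
CaseInsensitiveMatch, the method matchesIgnoringCase, and the file names are
invented for this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of case-insensitive matching in java.util.regex.
public class CaseInsensitiveMatch {

    // True when the whole input matches the regex, ignoring letter case.
    public static boolean matchesIgnoringCase(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                .matcher(input)
                .matches();
    }

    public static void main(String[] args) {
        // Inline form: (?i) at the start plays the role of a trailing /i.
        Pattern inline = Pattern.compile("(?i).*(son|sound).*");
        System.out.println(inline.matcher("RAPPORT_SON.xlsx").matches());

        // Flag form: same pattern without (?i), flag passed to compile().
        System.out.println(matchesIgnoringCase(".*(son|sound).*", "Sound-notes.docx"));
        System.out.println(matchesIgnoringCase(".*(son|sound).*", "budget.xlsx"));
    }
}
```

Note that matches() requires the whole string to match, which is why the
pattern is wrapped in leading and trailing ".*".
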