Hi Othman, I will respin a new 2.8.1 (RC1) to address the zookeeper issue.
The failure you are seeing is "NoSuchMethodError". Therefore, the class is being found, but it is the *wrong* class. When you deployed the new release, did you deploy it in a new directory, or did you overwrite the previous deployment? If you overwrote it, you probably have multiple versions of the POI jars.

Karl

On Fri, Sep 1, 2017 at 9:59 AM, Beelz Ryuzaki <[email protected]> wrote:

> Hi Karl,
>
> I have just tried the new release of ManifoldCF. At first, the first job
> ended normally, but in the second I got a new stack trace concerning
> POI. Moreover, runzookeeper.bat doesn't run properly. It shows me the
> stack trace attached.
>
> Ps:
> The second attached file contains the POI stack trace.
>
> Othman.
>
> On Fri, 1 Sep 2017 at 12:21, Karl Wright <[email protected]> wrote:
>
>> Hi Othman,
>>
>> You do not need a new database instance.
>>
>> You can download MCF 2.8.1 RC0 from here:
>>
>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.8.1
>>
>> Karl
>>
>>
>> On Fri, Sep 1, 2017 at 5:42 AM, Beelz Ryuzaki <[email protected]>
>> wrote:
>>
>>> Hi Karl,
>>>
>>> Thank you very much for your help, I'm going to try out the zookeeper
>>> example. Should I initialize a new database? And how can I run the
>>> zookeeper start-agent?
>>>
>>> Othman.
>>>
>>> On Fri, 1 Sep 2017 at 11:37, Karl Wright <[email protected]> wrote:
>>>
>>>> Hi Othman,
>>>>
>>>> These exceptions are now coming from file locking and are due to
>>>> permissions problems. I suggest you go to Zookeeper for file locking.
>>>>
>>>> I am building a 2.8.1 release candidate. When it is available for
>>>> download, I'll send you the URL.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Fri, Sep 1, 2017 at 5:27 AM, Beelz Ryuzaki <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Karl,
>>>>>
>>>>> This morning, I followed the steps you told me to do and I still
>>>>> got stack traces. I have attached the stack traces as well as the content
>>>>> of my lib repo and options.env.
>>>>> I have installed zookeeper and I'm ready to use the zookeeper example.
>>>>> Could you guide me through it? I don't know if I follow the same steps in the
>>>>> file based example, I may not get stack traces.
>>>>>
>>>>> Thanks,
>>>>> Othman
>>>>>
>>>>> On Thu, 31 Aug 2017 at 18:19, Karl Wright <[email protected]> wrote:
>>>>>
>>>>>> Please do the following:
>>>>>>
>>>>>> (0) Shut down all ManifoldCF processes.
>>>>>> (1) Move poi*.jar from connector-common-lib to lib.
>>>>>> (2) Move dom4j*.jar from connector-common-lib to lib.
>>>>>> (3) Move commons-collections4*.jar from connector-common-lib to lib.
>>>>>> (4) Move xmlbeans*.jar from connector-common-lib to lib.
>>>>>> (5) Move curvesapi*.jar from connector-common-lib to lib.
>>>>>> (6) Modify your options.env to include all of the jars you moved.
>>>>>> (7) Start up all ManifoldCF processes.
>>>>>> (8) If you still get stack traces, please send them to me.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 12:12 PM, Beelz Ryuzaki <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Karl,
>>>>>>>
>>>>>>> By 'other place', do you mean the \lib repository? If that is so, then
>>>>>>> I have already tried it and it didn't work.
>>>>>>>
>>>>>>> Othman.
>>>>>>>
>>>>>>> On Thu, 31 Aug 2017 at 18:07, Karl Wright <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Othman,
>>>>>>>>
>>>>>>>> I used the java dependency inspector to see what the issue is and
>>>>>>>> it turns out that poi-ooxml.jar does refer back to poi.jar in the class
>>>>>>>> that is failing. So you will need to move poi-3.15.jar and
>>>>>>>> commons-collections4-4.1.jar to the other place as well.
>>>>>>>>
>>>>>>>> Let's hope that finally fixes this issue.
>>>>>>>>
>>>>>>>> I'm very unhappy about the quality of the POI project code; it is
>>>>>>>> definitely not using reasonable engineering practices, and I will be
>>>>>>>> opening a ticket with them.
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Karl >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Aug 31, 2017 at 11:57 AM, Beelz Ryuzaki < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> I'm using the file based example and all the changes you told me >>>>>>>>> to do. I reproduced them in the file based example. I'll try to >>>>>>>>> install >>>>>>>>> zookeeper and use the zookeeper example. Will I need a configuration >>>>>>>>> to do >>>>>>>>> in order to run the zookeeper example ? >>>>>>>>> >>>>>>>>> Othman. >>>>>>>>> >>>>>>>>> On Thu, 31 Aug 2017 at 17:46, Karl Wright <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Are you using the zookeeper example, or the file-based example? >>>>>>>>>> >>>>>>>>>> If these jars have all been moved, and the options.env includes >>>>>>>>>> them, then I have to conclude that Apache POI's pom.xml is incorrect >>>>>>>>>> too. >>>>>>>>>> It will take a while to figure out what's missing that poi-ooxml.jar >>>>>>>>>> needs >>>>>>>>>> that is not listed. >>>>>>>>>> >>>>>>>>>> Karl >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Aug 31, 2017 at 11:39 AM, Beelz Ryuzaki < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> All the dependencies you mentioned have already been added in >>>>>>>>>>> the options.env.win file in the multiprocess-file-example >>>>>>>>>>> repository. >>>>>>>>>>> >>>>>>>>>>> On Thu, 31 Aug 2017 at 17:33, Beelz Ryuzaki <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Yes, I added it in the options.env.win file. Should it be the >>>>>>>>>>>> one in the multiprocess-zk-example document or >>>>>>>>>>>> multiprocess-file-example ? >>>>>>>>>>>> >>>>>>>>>>>> On Thu, 31 Aug 2017 at 17:30, Karl Wright <[email protected]> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> It's not related at all to elasticsearch. 
>>>>>>>>>>>>> Karl >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Aug 31, 2017 at 11:26 AM, Beelz Ryuzaki < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Could it be a problem of elasticsearch's version ? I'm >>>>>>>>>>>>>> actually using 2.1.0 which is pretty old for this new version of >>>>>>>>>>>>>> ManifoldCF? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Othman. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 17:23, Beelz Ryuzaki < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I moved back both the jars you mentioned and a different is >>>>>>>>>>>>>>> showing. You will find the stack trace attached. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Othman >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 17:09, Karl Wright < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've looked at the dependencies; you should not have moved >>>>>>>>>>>>>>>> poi-3.15.jar. Please move that back, and >>>>>>>>>>>>>>>> commons-collections4-4.1.jar too. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You *will* need to move curvesapi-1.04.jar though. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 11:04 AM, Karl Wright < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If you include poi.jar, then all dependencies of poi.jar >>>>>>>>>>>>>>>>> must also be included. This would mean that >>>>>>>>>>>>>>>>> curvesapi-1.04.jar and >>>>>>>>>>>>>>>>> commons-collections4-4.1.jar should also be included. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 10:23 AM, Beelz Ryuzaki < >>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi Karl, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I added the two jars that you have mentioned and another >>>>>>>>>>>>>>>>>> one : poi-3.15.jar . 
Unfortunately, there is another error >>>>>>>>>>>>>>>>>> showing. This >>>>>>>>>>>>>>>>>> time, it concerns excel files. You will find attached the >>>>>>>>>>>>>>>>>> stack trace. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Othman. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:32, Karl Wright < >>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi Othman, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Yes, this shows that the jar we moved calls back into >>>>>>>>>>>>>>>>>>> another jar, which will also need to be moved. *That* jar >>>>>>>>>>>>>>>>>>> has yet another >>>>>>>>>>>>>>>>>>> dependency too. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The list of jars is thus extended to include: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> poi-ooxml-3.15.jar >>>>>>>>>>>>>>>>>>> dom4j-1.6.1.jar >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:25 AM, Beelz Ryuzaki < >>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> You will find attached the stack trace. My apologies >>>>>>>>>>>>>>>>>>>> for the bad quality of the image, I'm doing my best to >>>>>>>>>>>>>>>>>>>> send you the stack >>>>>>>>>>>>>>>>>>>> trace as I don't have the right to send documents outside >>>>>>>>>>>>>>>>>>>> the company. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thank you for your time, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Othman >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:16, Karl Wright < >>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Once again, I need a stack trace to diagnose what the >>>>>>>>>>>>>>>>>>>>> problem is. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:14 AM, Beelz Ryuzaki < >>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Oh, actually it didn't solve the problem. I looked >>>>>>>>>>>>>>>>>>>>>> into the log file and saw the following error: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Error tossed : org/apache/poi/POIXMLTypeLoader >>>>>>>>>>>>>>>>>>>>>> java.lang.NoClassDefFoundError: org/apache/poi/ >>>>>>>>>>>>>>>>>>>>>> POIXMLTypeLoader. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Maybe another jar is missing ? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Othman. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:01, Beelz Ryuzaki < >>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I have tried what you told me to do, and you >>>>>>>>>>>>>>>>>>>>>>> expected the crawling resumed. How about the regular >>>>>>>>>>>>>>>>>>>>>>> expressions? How can I >>>>>>>>>>>>>>>>>>>>>>> make complex regular expressions in the job's paths tab >>>>>>>>>>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thank you very much for your help. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Othman. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:47, Beelz Ryuzaki < >>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Ok, I will try it right away and let you know if it >>>>>>>>>>>>>>>>>>>>>>>> works. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Othman. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:15, Karl Wright < >>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Oh, and you also may need to edit your options.env >>>>>>>>>>>>>>>>>>>>>>>>> files to include them in the classpath for startup. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:53 AM, Karl Wright < >>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you are amenable, there is another workaround >>>>>>>>>>>>>>>>>>>>>>>>>> you could try. Specifically: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> (1) Shut down all MCF processes. >>>>>>>>>>>>>>>>>>>>>>>>>> (2) Move the following two files from >>>>>>>>>>>>>>>>>>>>>>>>>> connector-common-lib to lib: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> xmlbeans-2.6.0.jar >>>>>>>>>>>>>>>>>>>>>>>>>> poi-ooxml-schemas-3.15.jar >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> (3) Restart everything and see if your crawl >>>>>>>>>>>>>>>>>>>>>>>>>> resumes. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know what happens. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:33 AM, Karl Wright < >>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I created a ticket for this: CONNECTORS-1450. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> One simple workaround is to use the external >>>>>>>>>>>>>>>>>>>>>>>>>>> Tika server transformer rather than the embedded >>>>>>>>>>>>>>>>>>>>>>>>>>> Tika Extractor. I'm still >>>>>>>>>>>>>>>>>>>>>>>>>>> looking into why the jar is not being found. 
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:08 AM, Beelz Ryuzaki < >>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Yes, I'm actually using the latest binary >>>>>>>>>>>>>>>>>>>>>>>>>>>> version, and my job got stuck on that specific >>>>>>>>>>>>>>>>>>>>>>>>>>>> file. >>>>>>>>>>>>>>>>>>>>>>>>>>>> The job status is still Running. You can see it >>>>>>>>>>>>>>>>>>>>>>>>>>>> in the attached file. For your information, the >>>>>>>>>>>>>>>>>>>>>>>>>>>> job started yesterday. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 13:04, Karl Wright < >>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> It looks like a dependency of Apache POI is >>>>>>>>>>>>>>>>>>>>>>>>>>>>> missing. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think we will need a ticket to address this, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> if you are indeed using the binary distribution. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks! >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 6:57 AM, Beelz Ryuzaki >>>>>>>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually using the binary version. For >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> security reasons, I can't send any files from my >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> computer. I have copied >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the stack trace and scanned it with my >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cellphone. I hope it will be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> helpful. 
Meanwhile, I have read the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> documentation about how to restrict the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> crawling and I don't think the '|' works in the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specified. For instance, I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> would like to restrict the crawling for the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> documents that counts the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'sound' word . I proceed as follows: *(SON)* . >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the document is with capital >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> letters and I noticed that it didn't take it >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> into consideration. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 12:40, Karl Wright < >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The way you restrict documents with the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> windows share connector is by specifying >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> information on the "Paths" tab in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jobs that crawl windows shares. There is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> end-user documentation both >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> online and distributed with all binary >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> distributions that describe how to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> do this. Have you found it? 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 5:25 AM, Beelz >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello Karl, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for your response, I will start >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> using zookeeper and I will let you know if it >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> works. I have another >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> question to ask. Actually, I need to make some >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filters while crawling. I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't want to crawl some files and some >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> folders. Could you give me an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example of how to use the regex. Does the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> regex allow to use /i to ignore >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cases ? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 19:53, Karl Wright < >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Beelz, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> File-based sync is deprecated because >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> people often have problems with getting file >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> permissions right, and they do >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not understand how to shut processes down >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cleanly, and zookeeper is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> resilient against that. I highly recommend >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> using zookeeper sync. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF is engineered to not put files >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> into memory so you do not need huge amounts >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of memory. The default values >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are more than enough for 35,000 files, which >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is a pretty small job for >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 11:58 AM, Beelz >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually not using zookeeper. i want >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to know how is zookeeper different from file >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> based sync? I also need a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> guidance on how to manage my pc's memory. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> How many Go should I allocate for >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the start-agent of ManifoldCF? Is 4Go enough >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in order to crawler 35K files ? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 16:11, Karl Wright >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your disk is not writable for some >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reason, and that's interfering with >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF 2.8 locking. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would suggest two things: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) Use Zookeeper for sync instead of >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> file-based sync. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Have a look if you still get >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> failures after that. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 9:37 AM, Beelz >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mr Karl, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you Mr Karl for your quick >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> response. I have looked into the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF log file and extracted the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> following warnings : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Attempt to set file lock >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'D:\xxxx\apache_manifoldcf-2. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 8\multiprocess-file-example\.\.\synch >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> area\569\352\lock-_POOLTARGET_OUTPUTCONNECTORPOOL_ES >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (Lowercase) Synapses.lock' failed : Access >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is denied. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Couldn't write to lock file; disk may >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be full. Shutting down process; locks may >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be left dangling. You must >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cleanup before restarting. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ES (lowercase) synapses being the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch output connection. Moreover, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the job uses Tika to extract >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> metadata and a file system as a repository >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connection. During the job, I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't extract the content of the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> documents. I was wandering if the issue >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> comes from elasticsearch ? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 14:08, Karl >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wright <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF aborts a job if there's an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> error that looks like it might go away on >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> retry, but does not. It can be >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> either on the repository side or on the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> output side. If you look at the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Simple History in the UI, or at the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> manifoldcf.log file, you should be able >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to get a better sense of what went wrong. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Without further information, I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can't say any more. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 5:33 AM, Beelz >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm Othman Belhaj, a software >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> engineer from société générale in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> France. I'm actually using your recent >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version of manifoldCF 2.8 . I'm working >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on an internal search engine. For >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this reason, I'm using manifoldcf in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> order to index documents on windows >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> shares. I encountered a serious problem >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> while crawling 35K documents. Most >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of the time, when manifoldcf start >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> crawling a big sized documents (19Mo for >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example), it ends the job with the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> following error: repeated service >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interruptions - failure processing >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> document : software caused connection >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> abort: socket write error. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can you give me some tips on how to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> solve this problem, please ? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I use PostgreSQL 9.3.x and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch 2.1.0 . 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm looking forward for your response. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman BELHAJ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>> >>>> >>
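P.S. One way to check the "multiple versions of the POI jars" theory from the top of this message is to list every jar under lib and connector-common-lib and flag any artifact that appears with more than one version. The following is only a minimal sketch, not part of ManifoldCF: the helper names `find_duplicate_jars` and `scan_install` are made up for illustration, the `name-version.jar` naming pattern is an assumption, and `poi-3.17.jar` in the comment is a hypothetical leftover, not something taken from your attachments.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: jars are named "<artifact>-<version>.jar", where the version
# starts with a digit (e.g. poi-ooxml-schemas-3.15.jar, dom4j-1.6.1.jar).
JAR_PATTERN = re.compile(r"(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar")


def find_duplicate_jars(jar_names):
    """Return {artifact: [versions]} for artifacts present in more than one version."""
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_PATTERN.fullmatch(name)
        if match:  # names that do not fit the assumed pattern are skipped
            versions[match.group("artifact")].add(match.group("version"))
    return {
        artifact: sorted(found)
        for artifact, found in versions.items()
        if len(found) > 1
    }


def scan_install(install_dir, subdirs=("lib", "connector-common-lib")):
    """Collect jar names from the given subdirectories and report version clashes."""
    names = []
    for subdir in subdirs:
        directory = Path(install_dir) / subdir
        if directory.is_dir():
            names.extend(path.name for path in directory.glob("*.jar"))
    return find_duplicate_jars(names)


# Hypothetical example: a leftover poi-3.17.jar next to poi-3.15.jar would be
# reported as a clash for the "poi" artifact.
# find_duplicate_jars(["poi-3.15.jar", "poi-3.17.jar", "dom4j-1.6.1.jar"])
```

To try it, call `scan_install` with the directory where the release was unpacked. An empty result only means no artifact was found in two different versions under those two directories; it does not rule out other classpath problems, such as the same jar being listed in both places.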
