I wrote a custom ManifoldCF repository connector for an internal document
system, and it shows some strange behaviour that I am not able to explain.
1. When I schedule the job to run on a specific day and at a specific time, the
job runs, but after it shuts down it decides that it is still within the run
window and restarts. This repeats many times; in the end the job runs 15 times
or more. The job history shows no 'job end' event, but it does show all the
'job start' events that took place after the schedule window start time.
Invoking the job manually works fine, i.e. it runs only once. Also, because I
set a maximum run time of 300 minutes, the job ends up in a waiting state after
the interval expires.
Below you can find some of the logs of this particular job.
(Finisher thread) - Marked job 1470044524072 for shutdown
INFO 2017-01-13 03:53:46,848 (Job notification thread) - Found job
1470044524072 in need of notification
INFO 2017-01-13 03:53:51,349 (Job start thread) - Job '1470044524072' is
within run window at 1484276031338 ms. (which starts at 1484258400000 ms. and
goes for 18000000 ms.)
INFO 2017-01-13 03:53:51,356 (Job start thread) - Signalled for job start for
INFO 2017-01-13 03:53:55,479 (Startup thread) - Marked job 1470044524072 for
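To double-check the numbers in the "within run window" log line, here is a minimal sketch of the window arithmetic. The class and method names are made up for illustration, and the half-open comparison (start inclusive, end exclusive) is an assumption, not necessarily how ManifoldCF evaluates the window internally.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative only: hypothetical names, not ManifoldCF code.
class RunWindowCheck {
    /** True when nowMs lies in [windowStartMs, windowStartMs + windowLengthMs). Assumed semantics. */
    static boolean isWithinWindow(long nowMs, long windowStartMs, long windowLengthMs) {
        return nowMs >= windowStartMs && nowMs < windowStartMs + windowLengthMs;
    }

    /** Milliseconds left until the window closes (negative once it has closed). */
    static long remainingMs(long nowMs, long windowStartMs, long windowLengthMs) {
        return windowStartMs + windowLengthMs - nowMs;
    }

    public static void main(String[] args) {
        // Values copied from the "within run window" log line.
        long now = 1484276031338L;
        long start = 1484258400000L;
        long length = 18000000L;

        System.out.println("window start (UTC): " + Instant.ofEpochMilli(start));
        System.out.println("window end   (UTC): " + Instant.ofEpochMilli(start + length));
        System.out.println("window length (min): " + Duration.ofMillis(length).toMinutes());
        System.out.println("within window: " + isWithinWindow(now, start, length));
        System.out.println("remaining (ms): " + remainingMs(now, start, length));
    }
}
```

For the values in the log, the window length works out to 300 minutes, which matches the configured maximum run time, and 368662 ms (about six minutes) of the window were still left when the job was started again.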
Why does the job behave this way, and how can I correct it?
2. In the second scenario I had indexed some documents and wanted to simulate
our internal repository being unavailable. In the current implementation, if
any error occurs while seeding the documents, I do not throw an exception but
instead provide an empty list of documents to be seeded. What happens next is
that ManifoldCF processes the already indexed documents, and in this case the
connector throws ServiceInterruption exceptions, which after 3 unsuccessful
retries make the job stop. However, the clean-up thread of ManifoldCF then
decides that all the documents need to be deleted from the index. I would like
to keep/update the documents, not delete them, which is why I chose a connector
model of ADD_CHANGE. There is only one place where I explicitly invoke
activities.deleteDocument, and that happens only when our document repository
is available and the document cannot be found. This case is exceptional and
will almost never happen in practice, because the document repository never
deletes files.
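The connector logic described above can be sketched roughly as follows. This is a simplified illustration with stand-in types, not the actual connector code and not the ManifoldCF API: `Repository`, `RetryLater`, `Outcome`, `seed` and `process` are all hypothetical names.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Simplified illustration with stand-in types; none of these are ManifoldCF classes.
class ConnectorBehaviourSketch {
    /** Hypothetical client for the internal document system. */
    interface Repository {
        List<String> listDocumentIds() throws Exception;
        boolean exists(String documentId) throws Exception;
    }

    /** Stand-in for the retry exception thrown during processing (ServiceInterruption in ManifoldCF). */
    static class RetryLater extends Exception {
        RetryLater(String message, Throwable cause) { super(message, cause); }
    }

    enum Outcome { INGESTED, DELETED }

    /** Seeding: any error is swallowed and an empty seed list is returned. */
    static List<String> seed(Repository repo) {
        try {
            return repo.listDocumentIds();
        } catch (Exception e) {
            return Collections.emptyList();
        }
    }

    /** Processing: retry when the repository is down; delete only when it is up and the document is missing. */
    static Outcome process(Repository repo, String documentId) throws RetryLater {
        boolean found;
        try {
            found = repo.exists(documentId);
        } catch (Exception e) {
            throw new RetryLater("repository unavailable", e);
        }
        return found ? Outcome.INGESTED : Outcome.DELETED;
    }

    public static void main(String[] args) {
        // Simulated outage: every call to the repository fails.
        Repository down = new Repository() {
            public List<String> listDocumentIds() throws Exception { throw new IOException("down"); }
            public boolean exists(String documentId) throws Exception { throw new IOException("down"); }
        };
        System.out.println("seeded while down: " + seed(down));
        try {
            process(down, "doc-1");
        } catch (RetryLater e) {
            System.out.println("processing while down: " + e.getMessage());
        }
    }
}
```

In this sketch an outage during seeding is indistinguishable from a repository that has no documents, since both yield an empty seed list, while an outage during processing surfaces as an exception.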
Why does the ManifoldCF clean-up thread mark the documents for deletion, even
though my connector never deliberately requests it?