The history is shown below; it does not indicate any error. [image: 12.JPG]
Thanks
Priya

On Tue, Oct 29, 2019 at 5:02 PM Karl Wright <[email protected]> wrote:

> What does the history say about these documents?
> Karl
>
> On Tue, Oct 29, 2019 at 6:53 AM Priya Arora <[email protected]> wrote:
>
>>
>> it may be that (a) they weren't found, or (b) that the document
>> specification in the job changed and they are no longer included in the job.
>>
>> URL's that were deleted are valid URL's(as that does not result in 404
>> or page not found error), and it is not being mentioned in Exclusion tab of
>> job configuration.
>> And the URL's were getting indexed earlier and except for index name in
>> Elasticsearch nothing is changed in Job specification and in other
>> connectors.
>>
>> Thanks
>> Priya
>>
>> On Tue, Oct 29, 2019 at 3:40 PM Karl Wright <[email protected]> wrote:
>>
>>> ManifoldCF is an incremental crawler, which means that on every
>>> (non-continuous) job run it sees which documents it can find and removes
>>> the ones it can't. The history for the documents being deleted should tell
>>> you why they are being deleted -- it may be that (a) they weren't found, or
>>> (b) that the document specification in the job changed and they are no
>>> longer included in the job.
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 29, 2019 at 5:30 AM Priya Arora <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have a query regarding ManifoldCF Job process.I have a job to crawl
>>>> intranet site
>>>> Repository Type:- Web
>>>> Output Connector Type:- Elastic search.
>>>>
>>>> Job have to crawl around4-5 lakhs of total records. I have discarded
>>>> the previous index and created a new index(in Elasticsearch) with proper
>>>> mappings and settings and started the job again after cleaning Database
>>>> even(Database used a PostgreSQL).
>>>> But while the job continues its ingests the records properly but just
>>>> before finishing (some times in between also), it initiates the process of
>>>> Deletions and also it does not index the deleted documents again in index.
>>>>
>>>> Can you please something if I am doing anything wrong? or is this a
>>>> process of manifoldcf if yes , why its not getting ingested again.
>>>>
>>>> Thanks and regards
>>>> Priya
>>>>
>>>>
