The history is shown below; it does not indicate any error.
[image: 12.JPG]

Thanks
Priya

On Tue, Oct 29, 2019 at 5:02 PM Karl Wright <[email protected]> wrote:

> What does the history say about these documents?
> Karl
>
> On Tue, Oct 29, 2019 at 6:53 AM Priya Arora <[email protected]> wrote:
>
>>
>> it may be that (a) they weren't found, or (b) that the document
>> specification in the job changed and they are no longer included in the job.
>>
>> The URLs that were deleted are valid URLs (they do not return a 404 or
>> page-not-found error), and they are not listed in the Exclusion tab of the
>> job configuration.
>> These URLs were being indexed earlier, and apart from the index name in
>> Elasticsearch, nothing has changed in the job specification or in the other
>> connectors.
>>
>> Thanks
>> Priya
>>
>> On Tue, Oct 29, 2019 at 3:40 PM Karl Wright <[email protected]> wrote:
>>
>>> ManifoldCF is an incremental crawler, which means that on every
>>> (non-continuous) job run it sees which documents it can find and removes
>>> the ones it can't.  The history for the documents being deleted should tell
>>> you why they are being deleted -- it may be that (a) they weren't found, or
>>> (b) that the document specification in the job changed and they are no
>>> longer included in the job.
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 29, 2019 at 5:30 AM Priya Arora <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have a query regarding the ManifoldCF job process. I have a job to crawl
>>>> an intranet site:
>>>> Repository Type: Web
>>>> Output Connector Type: Elasticsearch
>>>>
>>>> The job has to crawl around 4-5 lakh records in total. I discarded
>>>> the previous index, created a new index (in Elasticsearch) with proper
>>>> mappings and settings, and started the job again after also cleaning the
>>>> database (the database used is PostgreSQL).
>>>> While the job runs, it ingests the records properly, but just
>>>> before finishing (and sometimes partway through), it starts deleting
>>>> documents, and it does not index the deleted documents again.
>>>>
>>>> Can you please suggest whether I am doing anything wrong? Or is this part
>>>> of how ManifoldCF works? If so, why are the documents not ingested again?
>>>>
>>>> Thanks and regards
>>>> Priya
>>>>
>>>>
