Yes, it's a continuous job.
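
If the check does have to be handled manually in code (the second option in the question quoted below), a minimal sketch might look like the following. It is plain standalone Python, not a ManifoldCF API or setting. The names fetch_status and should_index are hypothetical, and it assumes that a HEAD request is enough to tell whether a page still exists and that only 2xx responses should be indexed.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if no response was received.

    `opener` is injectable so the function can be exercised without a network.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, and similar: no status available.
        return None


def should_index(status):
    """Assumed policy: index only pages that answered with a 2xx status."""
    return status is not None and 200 <= status < 300


if __name__ == "__main__":
    # Placeholder URL; replace with a page from the crawled site.
    url = "https://example.org/"
    status = fetch_status(url)
    print(url, status, "index" if should_index(status) else "skip")
```

With this policy a URL that now returns 404 is skipped, and so is a URL whose server could not be reached at all; whether unreachable pages should be skipped or retried is a separate decision.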

On Tue, Sep 3, 2019 at 11:05 AM Priya Arora <pr...@smartshore.nl> wrote:

> Hi,
> I have a job, myuniversity_intranet, which crawls data from an
> intranet site, and the data has been indexed in an index.
> My question is: does ManifoldCF have functionality to test, before
> indexing, whether a URL still exists?
> For example, my index (say, index name: abc) contains an indexed URL:
> https:myuniversity/reaserch/info (an intranet URL). This URL
> existed earlier but no longer exists, and it now returns status 404.
>
> Question: can ManifoldCF check, before indexing, that a URL's status is
> not 404 (meaning it exists), index the URL only if it really exists, and
> otherwise skip it?
> Can this be set up while configuring the ManifoldCF job,
> or do I have to handle it manually in code?
>
>
> Kind regards
> Priya
>
> On Mon, Sep 2, 2019 at 8:19 PM Karl Wright <daddy...@gmail.com> wrote:
>
>> Hi,
>> You aren't giving me enough information to know why your job isn't
>> rechecking URLs.  Please tell me how your job is configured, specifically
>> whether it's continuous or not.  Thanks.
>>
>> Karl
>>
>>
>> On Mon, Sep 2, 2019 at 4:47 AM Priya Arora <pr...@smartshore.nl> wrote:
>>
>> > Hi,
>> >
>> > I have a question about ManifoldCF. Does it have functionality to
>> > check whether the URL it is crawling actually exists or returns
>> > page not found (404)?
>> > page not found(404).
>> >
>> > I have a requirement in which I am crawling data for a university, and
>> > the job is running continuously. After some time, we found that certain
>> > URLs have been removed from the university site but are still being
>> > indexed.
>> >
>> > Some pages have been marked as status 404.
>> > How can ManifoldCF be automated to check this, so that if a URL
>> > returns 404 (does not exist anymore), it is not indexed?
>> >
>> > Thanks
>> > Priya.
>> >
>>
>
