The document is in a state of 'Processed' and the status is 'Ready for processing'
-----Original Message-----
From: Karl Wright [mailto:[email protected]]
Sent: 17 September 2015 5:28
To: dev
Subject: Re: Potential Issue with pausing jobs

When it is in the state after the job has resumed, can you do a Document Status report and tell me what that says for your document?

Thanks,
Karl

On Thu, Sep 17, 2015 at 12:16 PM, Colreavy, Niall <[email protected]> wrote:

> Hi Karl,
>
> Thanks for that. I think the problem might be more fundamental. When I
> start my job and monitor the simple job history I can see the job doing
> things like:
>
> Run the seed query
> Run the data query
> Run the seed query
> Run the data query
>
> Etc.
>
> It continues to do this indefinitely from what I have observed. As soon as
> I pause and resume the job, all I can see in the simple job history is:
>
> Run the seed query
> Run the seed query
> Run the seed query
>
> It's like it's never going to run the data query again?
>
> Kind Regards,
>
> Niall
>
> -----Original Message-----
> From: Karl Wright [mailto:[email protected]]
> Sent: 17 September 2015 4:53
> To: dev
> Subject: Re: Potential Issue with pausing jobs
>
> Hi Niall,
>
> A continuous job reseeds on a schedule, which you set as part of the job
> setup. For a continuous job, if the document has been crawled, it will be
> recrawled again at a specific time in the future, and if at that time it
> hasn't changed, it will be scheduled for checking again even further out,
> up to a certain limit (also settable within the job).
>
> You can look at the document's schedule, by the way, using the "Document
> Status" report, and it should be pretty clear from that what should happen
> and when.
>
> When you abort the job and restart it, everything is reset, so the document
> will be checked immediately at that point, and relatively frequently for a
> while until the system figures out that the document isn't changing very
> rapidly.
>
> Thanks,
> Karl
>
> On Thu, Sep 17, 2015 at 11:38 AM, Colreavy, Niall <[email protected]> wrote:
>
> > Hi Karl,
> >
> > You'll have to forgive me if my answer is a bit uncertain but I am very
> > new to MCF. Just to clarify, I have a very simple job. For the JDBC
> > connector, I am literally just selecting 1 for the id, 'myurl' for the url
> > and 'mydata' for the data. So there is only ever 1 document being processed.
> >
> > So to answer the questions:
> >
> > 1. There are 0 active documents on the queue.
> > 2. Single process
> > 3. Yes, this is a continuous crawl.
> >
> > Kind Regards,
> >
> > Niall
> >
> > -----Original Message-----
> > From: Karl Wright [mailto:[email protected]]
> > Sent: 17 September 2015 4:27
> > To: dev
> > Subject: Re: Potential Issue with pausing jobs
> >
> > Hi Niall,
> >
> > Pausing and resuming a job should have no effects *other* than
> > reprioritization of the active documents on the queue, which if there are a
> > lot of them, may take some time.
> >
> > So let's ask some basic questions. (1) How many active documents on your
> > queue? (2) What kind of synchronization are you using? Is this single
> > process, or multiprocess? (3) Is this a continuous crawl?
> >
> > >>>>>>
> > And on a side note, what is the difference between pausing a job and
> > aborting a job?
> > <<<<<<
> >
> > I can't fully answer that unless I know the characteristics of your job,
> > especially continuous crawl vs. crawl to completion.
> >
> > Karl
> >
> > On Thu, Sep 17, 2015 at 11:07 AM, Colreavy, Niall <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I am experimenting with pausing a job. The job has a simple JDBC
> > > connection and a null output connection. I was experimenting with pausing
> > > the job and I notice that when I resume the job, and monitor it's progress
> > > in the simple history report, the job never seems to run the data query any
> > > more. I can see that it runs the seed query but it doesn't progress to the
> > > data query. If I abort the job and restart it, it does seem to start
> > > running the data query again.
> > >
> > > Can anyone explain this behaviour? And on a side note, what is the
> > > difference between pausing a job and aborting a job?
> > >
> > > Thanks,
> > >
> > > Niall
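
The rescheduling Karl describes for a continuous job (recheck a crawled document at a future time, push the next check further out if it has not changed, up to a limit, and start over when the job is aborted and restarted) can be illustrated with a small sketch. This is not ManifoldCF code: the names `DocSchedule`, `reschedule` and `reset`, the doubling factor, the interval bounds, and the choice to drop back to the shortest interval when a change is seen are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DocSchedule:
    """Hypothetical per-document recrawl state (not ManifoldCF's data model)."""
    interval: float    # current wait between checks, in seconds
    next_check: float  # time at which the document is next due for checking


def reschedule(doc, now, changed,
               min_interval=60.0, max_interval=3600.0, factor=2.0):
    """Return the new schedule after checking the document at time `now`.

    Assumption: an unchanged document is pushed further out (interval is
    multiplied by `factor`, capped at `max_interval`); a changed document
    goes back to the shortest interval.
    """
    if changed:
        interval = min_interval
    else:
        interval = min(doc.interval * factor, max_interval)
    return DocSchedule(interval=interval, next_check=now + interval)


def reset(now, min_interval=60.0):
    """Model of abort-and-restart: the document is due immediately and
    starts again from the shortest interval."""
    return DocSchedule(interval=min_interval, next_check=now)
```

Under these assumptions, a document that never changes is checked less and less often until the cap is reached, which would look in the history like seed queries continuing on schedule while data queries for that document become rare. The "Document Status" report Karl mentions is the place to see the actual scheduled time.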
