Re: Resume mechanism

Shinichiro Abe Wed, 22 Jun 2011 22:05:11 -0700

Thank you for your reply. I understand now.
I realize that my test scenario was unrealistic.


Shinichiro Abe

On 2011/06/22, at 21:28, Karl Wright wrote:

> It also occurs to me that a good way to test this is to write an
> IDBInterface implementation that wraps another implementation, which
> generates errors when you want it to.  Then you can verify that the
> system behaves as designed under the conditions of database errors
> occurring.
> 
> Karl
> 
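A minimal Java sketch of the wrapping approach described above. ManifoldCF's real IDBInterface declares many more methods than are shown here, so SimpleDb, DbException, InMemoryDb, and FaultInjectingDb are hypothetical stand-ins invented for illustration and are not ManifoldCF classes. A real test would implement the full IDBInterface and delegate every method in the same way.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical checked exception representing a database failure.
class DbException extends Exception {
    DbException(String message) { super(message); }
}

// Hypothetical, heavily simplified stand-in for a database interface.
interface SimpleDb {
    void insert(String table, String key, String value) throws DbException;
    String lookup(String table, String key) throws DbException;
}

// Trivial in-memory implementation used as the wrapped delegate.
class InMemoryDb implements SimpleDb {
    private final Map<String, String> rows = new HashMap<>();

    public void insert(String table, String key, String value) {
        rows.put(table + "/" + key, value);
    }

    public String lookup(String table, String key) {
        return rows.get(table + "/" + key);
    }
}

// Wrapper that forwards every call to the delegate but can fail on demand.
class FaultInjectingDb implements SimpleDb {
    private final SimpleDb delegate;
    private int callsUntilFailure = -1; // negative means "never fail"

    FaultInjectingDb(SimpleDb delegate) { this.delegate = delegate; }

    // Arrange for the n-th subsequent call (1-based) to throw once.
    void failOnCall(int n) { callsUntilFailure = n; }

    private void maybeFail(String op) throws DbException {
        if (callsUntilFailure > 0 && --callsUntilFailure == 0) {
            callsUntilFailure = -1; // fail only once, then behave normally
            throw new DbException("injected failure during " + op);
        }
    }

    public void insert(String table, String key, String value) throws DbException {
        maybeFail("insert");
        delegate.insert(table, key, value);
    }

    public String lookup(String table, String key) throws DbException {
        maybeFail("lookup");
        return delegate.lookup(table, key);
    }
}

class FaultInjectionDemo {
    public static void main(String[] args) throws DbException {
        FaultInjectingDb db = new FaultInjectingDb(new InMemoryDb());
        db.insert("jobqueue", "doc1", "PENDING");
        db.failOnCall(1); // the next call will throw
        try {
            db.insert("jobqueue", "doc2", "PENDING");
        } catch (DbException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("doc1 = " + db.lookup("jobqueue", "doc1"));
        System.out.println("doc2 = " + db.lookup("jobqueue", "doc2"));
    }
}
```

Because the failure is thrown before the delegate is called, the wrapped database never sees the failed operation, which is what a dropped connection would look like to the caller.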
> On Wed, Jun 22, 2011 at 5:10 AM, Karl Wright <[email protected]> wrote:
>> So, to paraphrase, you are concerned about whether the state of the
>> outside world is consistent with the state of ManifoldCF's database in
>> all situations?
>> 
>> If you deliberately corrupt the database manually, you are not
>> simulating a realistic situation, because ManifoldCF is very careful
>> to make sure that the rows in the database match the state of the
>> outside world.  This is done by ensuring that ManifoldCF always
>> updates its database conservatively and in the correct order.
>> What you effectively did in your test was lobotomize
>> it by removing a chunk of its memory, but this could not have happened
>> without your direct manipulation, even with poor communication to the
>> database.  In some cases ManifoldCF relies on the fact that the job
>> will be rerun in order for there to be a complete crawl, but it should
>> never lose track of what it is doing.
>> 
>> For example, take the situation where ManifoldCF discovers a document,
>> then fetches it, then indexes it.  The document is entered in the
>> jobqueue upon discovery, which is an atomic operation that either
>> succeeds or fails.  If this fails, the job is aborted but the parent
>> document is not updated as having been processed either, so it will be
>> retried on the next job run.  The indexing also must complete before
>> the state of the document in the jobqueue is altered, and thus the
>> document will be retried if the indexing fails.  The maintenance of
>> the ingeststatus table uses a two-phase commit as well to be sure that
>> the status of the document in the index is accurately maintained in
>> the table regardless of target system problems or database issues.
>> 
>> Karl
>> 
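The ordering rule described above (finish indexing before changing the document's queue state, so that a failed document is simply retried) can be sketched in Java as follows. SketchQueue, its two states, and the Indexer interface are hypothetical names made up for illustration; they do not reflect ManifoldCF's actual jobqueue schema or classes, and a real implementation would do these updates in database transactions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the target index (for example a search engine).
interface Indexer {
    void index(String docId) throws Exception;
}

// Hypothetical, simplified queue; not ManifoldCF's real jobqueue table.
class SketchQueue {
    enum State { PENDING, COMPLETE }

    private final Map<String, State> rows = new LinkedHashMap<>();

    // Record a discovered document; it stays PENDING until indexing succeeds.
    void discover(String docId) { rows.putIfAbsent(docId, State.PENDING); }

    State stateOf(String docId) { return rows.get(docId); }

    // One pass over the queue: index first, update the state only afterwards.
    // Returns the number of documents newly marked COMPLETE.
    int runOnce(Indexer indexer) {
        int completed = 0;
        for (Map.Entry<String, State> row : rows.entrySet()) {
            if (row.getValue() != State.PENDING) continue;
            try {
                indexer.index(row.getKey());
                row.setValue(State.COMPLETE); // reached only if indexing succeeded
                completed++;
            } catch (Exception e) {
                // Row is left PENDING, so the next run retries this document.
            }
        }
        return completed;
    }
}

class OrderingDemo {
    public static void main(String[] args) {
        SketchQueue queue = new SketchQueue();
        queue.discover("a");
        queue.discover("b");

        boolean[] targetDown = { true };
        Indexer flaky = docId -> {
            if (targetDown[0] && docId.equals("b")) {
                throw new Exception("target unavailable");
            }
        };

        queue.runOnce(flaky);                     // "b" fails and stays PENDING
        System.out.println(queue.stateOf("b"));
        targetDown[0] = false;
        queue.runOnce(flaky);                     // "b" is retried
        System.out.println(queue.stateOf("b"));
    }
}
```

The point of the sketch is only the order of the two steps: because the state change comes last, an interruption between them leaves the document marked as not yet done rather than wrongly marked as done.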
>> On Wed, Jun 22, 2011 at 2:34 AM, Shinichiro Abe
>> <[email protected]> wrote:
>>> Hi.
>>> 
>>> I understand MCF's resilience.
>>> However, is it possible for a) and b) to occur?
>>> I tested the following.
>>> I stopped MCF while it was starting to crawl,
>>> and I manually deleted half of the rows in the jobqueue table.
>>> Then I restarted MCF and it began to crawl again.
>>> As a result, a few documents were neither inserted into the jobqueue
>>> table nor ingested into Solr.
>>> It seemed that the job failed to insert the documents into the queue.
>>> Although this test was an intentional case, I want to check the situation in
>>> which jobqueue data is incomplete and inconsistent.
>>> For example, if PostgreSQL stops suddenly and cannot be reached, could
>>> this situation happen?
>>> Or does the JobManager handle it? If you know, please let me know.
>>> 
>>> Thank you,
>>> Shinichiro Abe
>>> 
>>> On 2011/06/17, at 10:22, Karl Wright wrote:
>>> 
>>>> Hi Shinichiro,
>>>> 
>>>> All of ManifoldCF's state information is in the database, which
>>>> maintains consistency because it is ACID.  You can stop the ManifoldCF
>>>> agents process and start it up again, and the crawl will begin where
>>>> it stopped.  The framework has been very carefully designed to not get
>>>> confused in any way when this is done.  This resilience is in fact one
>>>> of the primary design criteria of ManifoldCF.
>>>> 
>>>> Exactly how crawls are done is covered in ManifoldCF in Action,
>>>> chapters 11 and 12.  I'll send those to you privately.
>>>> 
>>>> Thanks,
>>>> Karl
>>>> 
>>>> On Thu, Jun 16, 2011 at 7:09 PM, Shinichiro Abe
>>>> <[email protected]> wrote:
>>>>> Hi.
>>>>> Please tell me about the resume mechanism.
>>>>> 
>>>>> For example, suppose the following things happen while a job is executing:
>>>>> MCF services stop, Solr shuts down, or repository servers shut down.
>>>>> Because of the shutdown, the job cannot connect to each connector, so it
>>>>> stops ingesting documents.
>>>>> But when the above are recovered, the job resumes ingesting and keeps
>>>>> the crawl consistent.
>>>>> What manages this? Does jobqueue manage this resume mechanism?
>>>>> 
>>>>> If so, are there cases in which the job cannot keep the crawl consistent?
>>>>> E.g. the following cases:
>>>>>  a) PostgreSQL stops before everything is inserted into jobqueue, so
>>>>> jobqueue data is incomplete and inconsistent.
>>>>>  b) Although there are a lot of documents to crawl, MCF stops before
>>>>> inserting them all into jobqueue. As a result, jobqueue data is
>>>>> incomplete and inconsistent.
>>>>>  c) Any other cases.
>>>>> I want to know how likely it is that data becomes inconsistent when a
>>>>> crawl is interrupted partway through.
>>>>> 
>>>>> Also, I would like to read Part 4, MCF architecture, of ManifoldCF in Action.
>>>>> Regards,
>>>>> Shinichiro Abe
>>> 
>>> 
>>
