[jira] [Commented] (CONNECTORS-781) Fault-Tolerant Setup for ManifoldCF Agent.

Karl Wright (JIRA) Sat, 16 Nov 2013 05:26:31 -0800

    [ 
https://issues.apache.org/jira/browse/CONNECTORS-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824467#comment-13824467
 ]


Karl Wright commented on CONNECTORS-781:
----------------------------------------

Three other areas identified:
(1) Document scheduling - the code that assigns document priorities is not 
suitable at the moment for assigning them from multiple agents processes.  
Fixing this will require the creation of a database table to track assigned 
document counts for each bin for the documents in the queue, which will also 
require that bin names be limited in length to 255.  This will also likely have 
performance implications.
(2) Agents process startup resets the queue. The startup maps any documents 
that are in a transient state (e.g. "Active") back to a persistent state.  This 
will not be the right thing to do in a multi-agents-process situation, since 
those documents may still be in the middle of processing by another agents 
process.
(3) Database connection reset does something similar to agents process startup; 
it maps documents in transient states back to persistent states and then tries 
to restart the appropriate groups of threads.
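
To make the concern in (2) and (3) concrete, here is a minimal in-memory 
sketch.  It is not ManifoldCF code: the class, the status names, and the idea 
of recording an owning process id on each queue row are all assumptions made 
for illustration only.  It contrasts a reset that touches every transient 
document with one that is limited to documents claimed by a single process.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from ManifoldCF.
class QueueResetSketch {
    // PENDING stands in for a persistent state, ACTIVE for a transient one.
    enum Status { PENDING, ACTIVE }

    // One queue row: a status plus the (assumed) id of the claiming process.
    static final class Entry {
        Status status = Status.PENDING;
        String owner = null; // null while no process has claimed the document
    }

    private final Map<String, Entry> queue = new LinkedHashMap<>();

    void add(String docId) {
        queue.put(docId, new Entry());
    }

    // Mark a document as being worked on by the given agents process.
    void claim(String docId, String processId) {
        Entry e = queue.get(docId);
        e.status = Status.ACTIVE;
        e.owner = processId;
    }

    Status statusOf(String docId) {
        return queue.get(docId).status;
    }

    // Single-process style reset: every transient row goes back to a
    // persistent state, regardless of which process claimed it.
    int resetAll() {
        int changed = 0;
        for (Entry e : queue.values()) {
            if (e.status == Status.ACTIVE) {
                e.status = Status.PENDING;
                e.owner = null;
                changed++;
            }
        }
        return changed;
    }

    // Assumed alternative: reset only the rows claimed by one process.
    int resetOwnedBy(String processId) {
        int changed = 0;
        for (Entry e : queue.values()) {
            if (e.status == Status.ACTIVE && processId.equals(e.owner)) {
                e.status = Status.PENDING;
                e.owner = null;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        QueueResetSketch q = new QueueResetSketch();
        q.add("doc1");
        q.add("doc2");
        q.claim("doc1", "agentA");
        q.claim("doc2", "agentB");

        // agentB restarts and resets only its own documents.
        System.out.println("scoped reset changed: " + q.resetOwnedBy("agentB"));
        System.out.println("doc1 after scoped reset: " + q.statusOf("doc1"));

        // A global reset also pulls doc1 away from agentA mid-processing.
        System.out.println("global reset changed: " + q.resetAll());
        System.out.println("doc1 after global reset: " + q.statusOf("doc1"));
    }
}
```

In the sketch, the global reset moves doc1 back to a persistent state while 
agentA still believes it is working on it, which is the problem described 
above.  Whether an owner column or some other mechanism is the right fix is 
left open here.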



> Fault-Tolerant Setup for ManifoldCF Agent.
> ------------------------------------------
>
>                 Key: CONNECTORS-781
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-781
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Framework agents process, Framework core, Framework 
> crawler agent
>    Affects Versions: ManifoldCF 1.5
>            Reporter: Swami Rajamohan
>            Assignee: Karl Wright
>              Labels: agents, crawler, fault-tolerance
>             Fix For: ManifoldCF 1.5
>
>
> It should be possible to set up ManifoldCF as a fault-tolerant infrastructure.
> The Agent component of ManifoldCF should support multiple instances of an 
> agent crawling against a single crawl store, to be able to both distribute 
> (share) the crawl load as well as to be able to pick up a request that gets 
> abruptly terminated due to either partitioning of the instance/failure of the 
> instance itself.
> Since there is a proposal to move to a store like Voldemort, it would be nice 
> to be able to have a fault tolerant infrastructure.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
