[
https://issues.apache.org/jira/browse/CONNECTORS-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814845#comment-13814845
]
Karl Wright commented on CONNECTORS-13:
---------------------------------------
Rebased the CONNECTORS-13 branch.
I've looked further at zookeeper constructs; I think that we're basically in
good shape to go forward. My last outstanding concern was how zookeeper deals
with Thread.interrupt(); according to some mail traffic from 2009 it does the
right thing. I just hope it doesn't wait forever on sockets or anything stupid
like that... but we'll see.
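
For reference, the behavior hoped for above can be sketched with plain JDK
primitives. This is only an illustration and does not exercise the zookeeper
client itself: a ReentrantLock stands in for whatever blocking call the client
makes, and the class and method names (InterruptSketch,
blockedWaiterHonorsInterrupt) are made up for this example. The point is that
a thread blocked in an interruptible wait should come back with an
InterruptedException instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in: a JDK lock plays the role of a blocking client call.
public class InterruptSketch {

    /** Returns true if a thread blocked on a lock exits promptly once interrupted. */
    public static boolean blockedWaiterHonorsInterrupt() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        lock.lock(); // the calling thread holds the lock, so the waiter must block
        try {
            Thread waiter = new Thread(() -> {
                started.countDown();
                try {
                    // Blocks until the lock is acquired or the thread is interrupted.
                    lock.lockInterruptibly();
                    lock.unlock();
                } catch (InterruptedException e) {
                    sawInterrupt.set(true); // the "right thing": wake up and bail out
                }
            });
            waiter.start();
            started.await();    // make sure the waiter thread is running
            waiter.interrupt(); // simulate shutdown interrupting a worker thread
            waiter.join(5000);  // bounded wait, so this sketch cannot hang forever
            return sawInterrupt.get() && !waiter.isAlive();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("waiter honored interrupt: " + blockedWaiterHonorsInterrupt());
    }
}
```

A real check would have to substitute an actual zookeeper client call for
lockInterruptibly() and see whether the thread still exits within the join
timeout.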
Most likely, zookeeper will have to be kept alive while any ManifoldCF process
shuts down; otherwise the shutdown behavior is unpredictable.
There is also a question of how to showcase the zookeeper synchronization
approach. We could just modify the multiprocess example - but that would
change existing behavior people may have come to rely on. Or, we could
introduce a new "zookeeper" example, which would require additional scripts
that bring up zookeeper process(es). I'm hoping that, for example purposes, ONE
zookeeper process is sufficient to demonstrate the software; starting 3 is a
burden at that level.
> We should move to eliminate process synchronization via shared file system,
> and use a process/service instead
> -------------------------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-13
> URL: https://issues.apache.org/jira/browse/CONNECTORS-13
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Framework core
> Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.5
>
>
> The current implementation relies on the file system to synchronize activity
> between various LCF processes. This has several downsides: first, it is
> possible to get the file system into a state that is corrupted (by killing
> processes); second, this limits the future ability to spread crawler workload
> over multiple machines.
> It should be reasonably straightforward, and probably more resilient, to
> introduce a "synchronization process", which all other LCF processes talk to
> in order to manage locks, shared data, and other synchronization activities.