Adding new nodes to a cluster

Vidar Ramdal Tue, 14 Sep 2010 02:28:25 -0700

We're setting up a clustered Jackrabbit application.
The application has high traffic, so we're concerned that the Journal
table will become very large. This, in turn, will make setting up new
nodes a time-consuming task, because a new node must replay the
journal to get up to date.


At [1], the concept of the janitor is described, which cleans the
journal table at certain intervals. However, the list of caveats
states that "If the janitor is enabled then you loose the possibility
to easily add cluster nodes. (It is still possible but takes detailed
knowledge of Jackrabbit.)"

What detailed knowledge does this take? Can anyone give me some hints
of what we need to look into?

Also, we're not 100% sure we know what happens when a new node is
added. We understand that the journal needs to be replayed so that the
Lucene index can be updated. But is the Lucene index the only thing
that needs modification when a new node is started?
If so, should this procedure work:
1. Take a complete snapshot (disk image) of one of the live nodes -
including the Lucene index
2. Use the disk image to set up a new node
3. Assign a new, unique cluster node ID to the new node

However, when we tried this procedure, the new node still replayed the
entire journal.
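
To illustrate what we suspect is happening, here is a toy model in
Java. It is not Jackrabbit code. It assumes that the last applied
journal revision is tracked per cluster node ID, so that a node with a
brand-new ID starts from revision 0 even if its index came from a
snapshot. The class JournalReplaySketch and the methods sync and
seedFrom are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a shared journal. Not Jackrabbit code; all names are hypothetical. */
public class JournalReplaySketch {
    // Shared journal: the entry at list index i has revision i + 1.
    private final List<String> entries = new ArrayList<>();
    // Assumption: the last applied revision is recorded per cluster node ID.
    private final Map<String, Long> localRevisions = new HashMap<>();

    public void append(String change) {
        entries.add(change);
    }

    /** Replays all entries the node has not applied yet; returns how many were replayed. */
    public int sync(String nodeId) {
        long from = localRevisions.getOrDefault(nodeId, 0L); // unknown ID starts at 0
        int replayed = entries.size() - (int) from;
        localRevisions.put(nodeId, (long) entries.size());
        return replayed;
    }

    /** Gives a new node the recorded revision of the node its snapshot was taken from. */
    public void seedFrom(String newNodeId, String sourceNodeId) {
        localRevisions.put(newNodeId, localRevisions.getOrDefault(sourceNodeId, 0L));
    }

    public static void main(String[] args) {
        JournalReplaySketch journal = new JournalReplaySketch();
        for (int i = 0; i < 1000; i++) {
            journal.append("change-" + i);
        }
        journal.sync("node1");

        // New ID, no recorded revision: the whole journal is replayed.
        System.out.println("unseeded node2 replayed: " + journal.sync("node2"));

        // New ID seeded with node1's revision: nothing is left to replay.
        journal.seedFrom("node3", "node1");
        System.out.println("seeded node3 replayed: " + journal.sync("node3"));
    }
}
```

If this model is right, the missing step in our procedure would be to
carry over the source node's recorded revision to the new node ID, but
we do not know where or how Jackrabbit stores that value.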

Is there more we need to add to the procedure, so that we can add new
nodes without having to replay all (perhaps a year's worth of) journal
entries?

[1] http://wiki.apache.org/jackrabbit/Clustering#Removing_Old_Revisions

-- 
Vidar S. Ramdal <[email protected]> - http://www.idium.no
Sommerrogata 13-15, N-0255 Oslo, Norway
+ 47 22 00 84 00 / +47 22 00 84 76
Quando omni flunkus moritatus!
