Hi, it's me again.

We are finally getting around to setting up Roller in a clustered
environment, but search is totally busted. Currently we would have an
index on each of the nodes, but the indexes would not be right, since
deletes and modifications would not be shared across the cluster. The
first thing, I guess, is that we are moving from an immediate indexing
strategy to a task-based one. When an entry is deleted, we simply add
it to a deleted_entries table (id, deltime) so that the task can remove
the entry from the index on every node. Next, indexing gets done from
the weblogentry table. For this I had to add a dbmodtime column that
gets set on entry creation or update using database time, not node
time. Now that dbmodtime and deltime are both in database time, a task
on any node can update its index safely.
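
To make the task-based idea concrete, here is a rough sketch of what
the per-node task could do on each run. It is not the actual Roller
code: Row and IndexSyncTask are made-up names, plain lists stand in for
the weblogentry and deleted_entries tables, and a Map stands in for the
local Lucene index. The only point is the high-water mark kept in
database time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical row: an entry id plus a database-clock timestamp (dbmodtime or deltime). */
final class Row {
    final String id;
    final long dbTime;
    Row(String id, long dbTime) { this.id = id; this.dbTime = dbTime; }
}

/** Hypothetical per-node task; the Map is a stand-in for the node's local index. */
class IndexSyncTask {
    private final Map<String, Long> localIndex = new HashMap<>();
    private long lastSync = 0L; // high-water mark, in database time

    /** Applies every change with a timestamp in the window (lastSync, dbNow]. */
    void run(List<Row> weblogentry, List<Row> deletedEntries, long dbNow) {
        // Adds and updates first ...
        for (Row r : weblogentry) {
            if (r.dbTime > lastSync && r.dbTime <= dbNow) {
                localIndex.put(r.id, r.dbTime);
            }
        }
        // ... then deletes, so a delete in the same window always wins.
        for (Row r : deletedEntries) {
            if (r.dbTime > lastSync && r.dbTime <= dbNow) {
                localIndex.remove(r.id);
            }
        }
        lastSync = dbNow; // the next run starts where this one stopped
    }

    boolean contains(String id) {
        return localIndex.containsKey(id);
    }
}
```

Because both timestamps and dbNow come from the database clock, clock
skew between nodes does not matter; each node only has to remember its
own lastSync.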

Now comes the issue of whether or not all nodes should maintain their
own indexes. At first we thought they could, but we are afraid of the
debugging nightmare of working out which index is missing an entry, or
why an index is getting corrupted on a single node. Therefore, we are
looking at having a single master (selected dynamically through task
locking and lease renewals) update the index and place a snapshot of
the index on a shared drive.
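
For the master selection, a minimal sketch of the lease idea, assuming
the lease would really live in a single row of a shared lock table and
that all times are database time. IndexMasterLease and tryAcquire are
made-up names for illustration, not Roller's task-locking API.

```java
/** Hypothetical lease; in practice this state would be one row in a shared lock table. */
class IndexMasterLease {
    private String owner;   // node currently holding the lease, or null if never taken
    private long expiresAt; // lease expiry, in database time (milliseconds)

    /**
     * Acquires the lease if it is free or expired, or renews it if nodeId
     * already holds it. Returns true if nodeId is the master afterwards.
     */
    synchronized boolean tryAcquire(String nodeId, long dbNow, long leaseMillis) {
        boolean free = owner == null || dbNow >= expiresAt;
        if (free || owner.equals(nodeId)) {
            owner = nodeId;
            expiresAt = dbNow + leaseMillis;
            return true;
        }
        return false; // another node holds a lease that has not expired
    }
}
```

Every node would call tryAcquire on each task run. Whoever gets true
does the indexing and writes the snapshot; if the master dies, its
lease runs out and the next caller takes over.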

Finally, we would have a task that periodically reads from the shared
drive and brings that index online. I think this can be more performant,
since we would no longer be serializing access to our index; today we
have to, because adds, deletes and reads all happen on the same index at
the same time. We would have less chance of corruption and, best of all,
we would get rid of the .inconsistent-startup file, which would save us
from rebuilding the entire index all of the time. We can pick up from
any of the previous snapshots we choose to keep around.
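
Roughly, the reader side could look like the sketch below. It assumes
each snapshot is a directory on the shared drive whose name sorts in
creation order, and it only tracks which snapshot is live; actually
opening the Lucene index from that directory is left out.
SnapshotSwitcher and refresh are made-up names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Stream;

/** Hypothetical helper that tracks which snapshot directory is currently live. */
class SnapshotSwitcher {
    private final AtomicReference<Path> live = new AtomicReference<>();

    /** Looks for the newest snapshot directory; returns true if it was brought online. */
    boolean refresh(Path sharedDir) throws IOException {
        Optional<Path> newest;
        try (Stream<Path> dirs = Files.list(sharedDir)) {
            // Assumption: directory names sort in creation order.
            newest = dirs.filter(Files::isDirectory)
                         .max(Comparator.comparing((Path p) -> p.getFileName().toString()));
        }
        if (!newest.isPresent() || newest.get().equals(live.get())) {
            return false; // nothing on the drive yet, or already serving the newest one
        }
        live.set(newest.get()); // searches pick up the new snapshot from here on
        return true;
    }

    /** The snapshot searches should read from, or null before the first refresh. */
    Path current() {
        return live.get();
    }
}
```

Since the live index is only ever read, searches never wait on a
writer. For this to be safe the master would have to finish writing a
snapshot before it becomes visible under its final name, for example by
writing to a temporary directory and renaming it.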

NOTE: Most of this is working so far, but I have plenty of things to
work out and turn into a proposal that you guys would bless. Anyways, I
just thought I'd share what I've been doing in the past weeks (including
reading Lucene documentation).

Happy Thanksgiving.

-Elias
