Thanks Kurt, great answers!

> You wrote:
> One note on all of this: this is NOT how we would like to do recovery
> going forward, we just did not have a solid cluster membership service
> in place that we could use when the mastery/recovery code was written.
> Once we do have a stable mechanism and API (stop/start/finish) to depend
> upon, I would like to rewrite the whole thing for lock-table-based mastery
> and much more sensible recovery.

What is the pedigree of that stop/start/finish API?  Is it the only stable
mechanism you know of to build a more sensible recovery on?

> As it stands, it's a brittle structure
> that has to continually try to detect node failures inline and make
> adjustments as recovery is ongoing, which is no fun.

Not to mention slow, and not obviously terminating.

Regards,

Daniel

_______________________________________________
Ocfs2-devel mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-devel
