On Wed, 28 Jun 2006, Sun Jiang Dong wrote:

> David Lee wrote:
> > I'm trying to get lrmd and lrmadmin running under Solaris. Some parts are
> > OK. But some parts are not working: from some lrmadmin requests, the
> > result (probably failure) is not getting back to lrmadmin, which then
> > waits forever even though lrmd has internally produced a result.
>
> Can you tell more details? For example, which parts is ok or not.
>
> And what (compiling) environment you used? Forte or Gcc? Solaris x86 or
> sparc?

gcc (3.4.5); Solaris-9/sparc

> > [...]
> > 1. Is the lrmd-lrmadmin communication channel documented somewhere (to
> > tell us what ought to happen)?
>
> I'm not sure what information you need. Actually they use unix-socket and
> corresponding files are /var/run/heartbeat/lrm_sock and
> /var/run/heartbeat/lrm_callback_sock.

I was wondering whether the protocol is described somewhere (in similar
manner to RFCs describing SMTP, TCP, etc.). Alternatively, whether the
expected (designed) internal operation of lrmd was documented.

> > 2. In "on_op_done()", something feels strange about "need_notify".
> > Indeed Alan's comments at version 1.125 (April 2005) reinforce my
> > uneasiness. If this is (at heart) a "call and response" thing, then
> > shouldn't notification (response from the server) be attempted every time?
> > (In the case I'm trying, I've inserted some logging statements, and it is
> > following a path which keeps "need_notify" false, even though "lradmin" is
> > sitting there, awaiting an answer.)
>
> Likely you are tracking for the repeating operations, so need_notify keeps
> false. I think it's not related to the issue on Solaris.

The hanging "lrmadmin" was from "STONITHDBasicSanityCheck" where it does
the "lrmadmin -E s1 start 0 0 0".

Note that on a different subthread of this topic, Alan has suggested that
I open a bugzilla report. (And Matt Soffen, another heartbeat developer
using non-Linux, has also just reported a problem.)

So a little documentation about how it is _designed_ to work would help
folk like Matt and me chase problems.

Best wishes.

--

:  David Lee                                I.T. Service          :
:  Senior Systems Programmer                Computer Centre       :
:                                           Durham University     :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham DH1 3LE        :
:  Phone: +44 191 334 2752                  U.K.                  :
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
