On 2008-02-27T20:39:13, Keisuke MORI <[EMAIL PROTECTED]> wrote:

Hi Keisuke-san,

Thanks for your patch and contribution. I have to apologize on behalf
of everyone for the late feedback.

I really appreciate the idea of monitoring processes directly, and
receiving async failure notifications to reduce fail-over times.

I have just discussed this with Dejan and Andrew, and we think that the
best path forward, and alas a prerequisite for inclusion, is to

- Make procd independent of Pacemaker. It should talk only to the RAs
  and the LRM.

- RAs should "sign in" with it for the processes they want monitored,
  instead of listing the processes in the procd configuration section
  (which decouples it further from the CIB). The RAs could write a
  record to /var/run/heartbeat/procd/<resource-id>, for example.
  
  The RAs would add/remove the required processes on start/promote or
  demote/stop. (So procd itself would not need to be master-slave.)

  I'm afraid that having users manually specify process lists in the CIB
  really is not workable - the users will not be able to get this
  right.

- Instead of respawning procd, there should be a resource agent which
  starts/stops (and monitors!) procd. You already have one, but why
  doesn't it go into resources/OCF/?

- procd should talk to the LRM to insert a "fake" failed resource
  action, which would then cause the CRM/PE to handle the resource as
  failed and initiate recovery. (This is not currently possible with the
  LRM client library; you could exec crm_resource -F, which would mean
  you no longer have a build-time dependency on the CRM.)

- This would have the advantage of decoupling procd from pacemaker as
  well as heartbeat. It could be included with the LRM/RA package build,
  and possibly be useful with other cluster managers too.
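
To make the "sign in" idea concrete, here is a minimal sketch of how
the per-resource record could be written and checked. Everything in it
is an assumption for illustration only: the function names, the
PROCD_PATH_MAX limit, and the record format (one PID per line) are
hypothetical and not taken from the patch. The directory is passed in
by the caller and must already exist.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define PROCD_PATH_MAX 512 /* hypothetical limit for this sketch */

/* Build "<dir>/<rsc_id>"; returns 0 on success, -1 if it does not fit. */
static int procd_record_path(char *buf, size_t len,
                             const char *dir, const char *rsc_id)
{
    int n = snprintf(buf, len, "%s/%s", dir, rsc_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Called by an RA on start/promote: append one PID to the record. */
int procd_register(const char *dir, const char *rsc_id, pid_t pid)
{
    char path[PROCD_PATH_MAX];
    FILE *fp;

    if (procd_record_path(path, sizeof(path), dir, rsc_id) < 0)
        return -1;
    fp = fopen(path, "a");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%ld\n", (long)pid);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Called by an RA on demote/stop: drop the whole record. */
int procd_unregister(const char *dir, const char *rsc_id)
{
    char path[PROCD_PATH_MAX];

    if (procd_record_path(path, sizeof(path), dir, rsc_id) < 0)
        return -1;
    return unlink(path);
}

/*
 * Called by the monitor: count registered PIDs that no longer exist.
 * Returns -1 if there is no readable record for the resource.
 */
int procd_count_dead(const char *dir, const char *rsc_id)
{
    char path[PROCD_PATH_MAX];
    FILE *fp;
    long pid;
    int dead = 0;

    if (procd_record_path(path, sizeof(path), dir, rsc_id) < 0)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fscanf(fp, "%ld", &pid) == 1) {
        /* kill(pid, 0) only probes; EPERM still means "exists". */
        if (pid <= 0 || (kill((pid_t)pid, 0) < 0 && errno == ESRCH))
            dead++;
    }
    fclose(fp);
    return dead;
}
```

With something along these lines, the process list lives with the RA
that knows it, and a non-zero result from procd_count_dead() is the
point at which the daemon would report the resource as failed.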

I think all that would help simplify the code.
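
As an illustration of the "exec an external tool instead of linking
against the CRM" point, here is a minimal sketch of a fork/exec helper.
The function name procd_run_helper is hypothetical, and the argument
vector is deliberately left to the caller: only the -F option is named
above, so the sketch does not guess at the remaining crm_resource
arguments.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] (looked up via PATH) with the given arguments and wait
 * for it. Returns the child's exit status, 127 if the exec failed,
 * or -1 if fork/wait failed or the child was killed by a signal.
 */
int procd_run_helper(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

procd would pass "crm_resource" and "-F" as the first elements of argv,
followed by whatever resource arguments that tool requires. Since the
tool is located at run time, nothing from the CRM is needed at build
time.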


> +#define RSCID_LEN      128 /* ref. include/lrm/lrm_api.h */
> +#define MAX_PID_LEN    256 /* ref. lrm/lrmd/lrmd.h */
> +#define MAX_LISTEN_NUM 10 /* ref. lib/clplumbing/ipcsocket.c */

If these values come from other header files, please include those
headers directly rather than redefining them, so that the definitions
cannot diverge.


Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
