Thanks for the comments, everyone. I'll go over the details and create an improved PullRequest with proposal (2).
Best Regards, Kazunori INOUE On Fri, Jul 3, 2020 at 12:09 AM Ken Gaillot <kgail...@redhat.com> wrote: > > On Thu, 2020-07-02 at 18:12 +0900, 井上和徳 wrote: > > Hi all, > > > > We think it is desirable to output the log indicating the start and > > finish of RA execution to syslog on the same node. (End users > > monitoring the syslog are requesting that the output be on the same > > node.) > > > > Currently, the start[1] and finish[2] logs may be output by different > > nodes. > > > > Cluster Summary: > > * Stack: corosync > > * Current DC: r81-2 (version 2.0.4-556cef416) - partition with > > quorum > > * Last updated: Thu Jul 2 12:42:17 2020 > > * Last change: Thu Jul 2 12:42:13 2020 by hacluster via crmd on > > r81-2 > > * 2 nodes configured > > * 3 resource instances configured > > > > Node List: > > * Online: [ r81-1 r81-2 ] > > > > Full List of Resources: > > * dummy1 (ocf::pacemaker:Dummy): Started r81-1 > > * fence1-ipmilan (stonith:fence_ipmilan): Started r81- > > 2 > > * fence2-ipmilan (stonith:fence_ipmilan): Started r81- > > 1 > > > > *1 > > Jul 2 12:42:15 r81-2 pacemaker-controld[18009]: > > notice: Initiating start operation dummy1_start_0 on r81-1 > > *2 > > Jul 2 12:42:15 r81-1 pacemaker-controld[10109]: > > notice: Result of start operation for dummy1 on r81-1: ok > > Some background for readers who might not be familiar: > > The "Initiating" message comes from the Designated Controller (DC) > node, which decides what actions need to be done, then asks the > appropriate nodes to do them. > > The "Result of" message comes from the node executing the action. > > > As a suggestion, > > > > 1) change the following log levels to NOTICE and output the start and > > finish logs to syslog on the node where RA was executed. > > > > Jul 02 12:42:15 r81-1 pacemaker-execd [10106] (log_execute) > > info: executing - rsc:dummy1 action:start call_id:10 > > Jul 02 12:42:15 r81-1 pacemaker-execd [10106] (log_finished) > > info: dummy1 start (call 10, PID 10164) exited with status 0 > > (execution time 91ms, queue time 0ms) > > > > 2) alternatively, change the following log levels to NOTICE and > > output a log indicating the finish at the DC node. > > > > Jul 02 12:42:15 r81-2 pacemaker-controld [18009] > > (process_graph_event) > > info: Transition 2 action 7 (dummy1_start_0 on r81-1) confirmed: ok > > > rc=0 call-id=10 > > It's a tricky balance deciding which logs go to syslog (by default > that's "notice" level and higher) and which go only to the pacemaker > detail log. Pacemaker logs can already be overwhelming, but clustering > is inherently complex and lots of information is needed to understand > problems in depth. > > We want enough information in the syslog for the average user to > understand what the cluster is doing and troubleshoot problems with the > clustered services, while the detail log helps with more complex > problems, especially with the cluster software itself. > > Of your two recommendations, I like (2) better, because it reports the > complete round-trip of the whole execution process. If something goes > wrong on the DC side, (1) might not give any information about it. > > However I'd want to reword the message in (2) to be more user-friendly, > something like: > > (in syslog) > Received result of start operation for dummy1 on r81-1: ok > > (in detail log) > Received result of start operation for dummy1 on r81-1: ok | Transition > 2 action 7 (dummy1_start_0) rc=0 call-id=10 > > > > What do you think about this? (Do you have a better idea?) > > > > Best Regards, > > Kazunori INOUE > -- > Ken Gaillot <kgail...@redhat.com> > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ _______________________________________________ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/