> Hi,
>
> What happened:
> I tried to configure a simple drbd resource following
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457860751296
> I used this simple snippet from the doc:
> configure primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata \
>     op monitor interval=60s
>
> I did it on a live cluster, which is currently in testing. I will never do this
> again. Shadow will be my friend.
>
> The cluster reacted promptly:
> crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params drbd_resource=idcc-devel \
> > op monitor interval=60
> WARNING: prim_drbd_idcc_devel: default timeout 20s for start is smaller than the advised 240
> WARNING: prim_drbd_idcc_devel: default timeout 20s for stop is smaller than the advised 100
> WARNING: prim_drbd_idcc_devel: action monitor not advertised in meta-data, it may not be supported by the RA
>
> As far as I understand so far, I didn't configure start/stop
> operations, so the cluster chooses the default from default-action-timeout.
> It didn't configure the monitor operation, because this is not in the
> meta-data.
>
> The log says:
> Aug 1 14:19:33 ha-idg-1 drbd(prim_drbd_idcc_devel)[11325]: ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset.
>
> ^^^^^^^^^
> Aug 1 14:19:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: Operation prim_drbd_idcc_devel_monitor_0: not configured (node=ha-idg-1, call=73, rc=6, cib-update=37, confirmed=true)
> Aug 1 14:19:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: Operation prim_drbd_idcc_devel_stop_0: not configured (node=ha-idg-1, call=74, rc=6, cib-update=38, confirmed=true)
>
>
> crm_mon said:
> Failed actions:
> prim_drbd_idcc_devel_stop_0 on ha-idg-1 'not configured' (6): call=6967, status=complete, exit-reason='none', last-rc-change='Tue Aug 1 14:28:33 2017', queued=0ms, exec=41ms
> prim_drbd_idcc_devel_monitor_60000 on ha-idg-1 'not configured' (6): call=6968, status=complete, exit-reason='none', last-rc-change='Tue Aug 1 14:28:33 2017', queued=0ms, exec=41ms
> prim_drbd_idcc_devel_stop_0 on ha-idg-2 'not configured' (6): call=6963, status=complete, exit-reason='none', last-rc-change='Tue Aug 1 14:28:33 2017', queued=0ms, exec=40ms
>
> A big problem was that I have a ClusterMon resource running on each node. It
> triggered about 20000 SNMP traps in 193 seconds to my management station, which
> triggered 20000 e-mails ...
> Where does this incredible number of traps come from? Nearly all traps said that
> stop is not configured for the drbd resource. Why complain so often? And
> why did it stop after ~20,000 traps?
> And it complained about the unconfigured monitor operation just 8 times.

OK. I configured the drbd resource wrongly/incompletely, and that caused the trouble.

What I would like to know:
- Where does crm_mon retrieve its information from?
- Why did I get tons of lines in syslog? One message saying that the resource
  isn't configured correctly/completely would be enough. Instead I got thousands
  and thousands of lines saying the same thing.


Bernd

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
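
For reference, a minimal sketch of what a more complete configuration might
look like, based on the crm shell warnings and the RA error quoted above and on
the Clusters from Scratch guide linked there. This is untested on this cluster.
The start/stop timeouts are the advised values from the warnings; the
master/slave resource name ms_drbd_idcc_devel, the shadow CIB name drbd_test
and the monitor intervals are assumptions chosen for illustration only.

```
# Work in a shadow CIB instead of the live cluster (name is hypothetical)
crm(live)# cib new drbd_test

# Primitive with explicit start/stop timeouts and per-role monitors
crm(drbd_test)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd \
    params drbd_resource=idcc-devel \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave \
    op start interval=0 timeout=240 \
    op stop interval=0 timeout=100

# Master/slave wrapper that sets clone-max, which the RA reported as unset
crm(drbd_test)# configure ms ms_drbd_idcc_devel prim_drbd_idcc_devel \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

# Push to the live cluster only after reviewing the result
crm(drbd_test)# cib commit drbd_test
```

The per-role monitor operations are used here on the assumption that the drbd
RA only advertises monitor with role=Master and role=Slave, which would explain
the "action monitor not advertised in meta-data" warning.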
