On February 20, 2020 9:35:07 PM GMT+02:00, Maverick <m...@sapo.pt> wrote:
>
>Manually it starts OK, no problems:
>
>pcs resource debug-start apache --full
>(unpack_config) warning: Blind faith: not fencing unseen nodes
>Operation start for apache (systemd::httpd) returned: 'ok' (0)
>
>
>On 20/02/2020 16:46, Strahil Nikolov wrote:
>> On February 20, 2020 12:49:43 PM GMT+02:00, Maverick <m...@sapo.pt> wrote:
>>>> You really need to debug the start & stop of the resource.
>>>>
>>>> Please try the debug procedure and provide the output:
>>>> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>>
>>> Hi,
>>>
>>> Correct me if I'm wrong, but I think that procedure doesn't work for
>>> systemd class resources; I don't know which OCF script is responsible
>>> for handling systemd class resources.
>>>
>>> Also, the crm command doesn't exist in RHEL/Fedora; I've seen the crm
>>> command only in SUSE.
>>>
>>>
>>>
>>> On 19/02/2020 19:23, Strahil Nikolov wrote:
>>>> On February 19, 2020 7:21:12 PM GMT+02:00, Maverick <m...@sapo.pt> wrote:
>>>>> How is it possible that pacemaker is reporting that it takes 4.2 minutes
>>>>> (254930ms) to execute the start of the httpd systemd unit?
>>>>>
>>>>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
>>>>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
>>>>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
>>>>> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
>>>>> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms
>>>>>
>>>>>
>>>>> Starting manually works fine and fast:
>>>>>
>>>>> # time systemctl start httpd
>>>>> real 0m0.144s
>>>>> user 0m0.005s
>>>>> sys 0m0.008s
>>>>>
>>>>>
>>>>> On 17/02/2020 22:47, Mvrk wrote:
>>>>>> The pacemaker.log is attached. In the log I can see that the cluster
>>>>>> tries to start, the start fails, then it tries to stop, and the stop
>>>>>> also fails.
>>>>>>
>>>>>> One more thing: my cluster was working fine on Fedora 28; I started
>>>>>> having this problem after upgrading to Fedora 31.
>>>>>>
>>>>>> On 17/02/2020 21:30, Ricardo Esteves wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Yes, I also don't understand why it is trying to stop them first.
>>>>>>>
>>>>>>> SELinux is disabled:
>>>>>>>
>>>>>>> # getenforce
>>>>>>> Disabled
>>>>>>>
>>>>>>> All systemd services controlled by the cluster are disabled from
>>>>>>> starting at boot:
>>>>>>>
>>>>>>> # systemctl is-enabled httpd
>>>>>>> disabled
>>>>>>>
>>>>>>> # systemctl is-enabled openvpn-server@01-server
>>>>>>> disabled
>>>>>>>
>>>>>>>
>>>>>>> On 17/02/2020 20:28, Ken Gaillot wrote:
>>>>>>>> On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> When I start my cluster, most of my systemd resources won't start:
>>>>>>>>> Failed Resource Actions:
>>>>>>>>> * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=29ms, exec=197799ms
>>>>>>>>> * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=1805ms, exec=198841ms
>>>>>>>> These show that attempts to stop failed, rather than start.
>>>>>>>>
>>>>>>>>> So every time I reboot my node, I need to start the resources manually
>>>>>>>>> using systemd, for example:
>>>>>>>>>
>>>>>>>>> systemctl start httpd
>>>>>>>>>
>>>>>>>>> and then pcs resource cleanup
>>>>>>>>>
>>>>>>>>> Resources configuration:
>>>>>>>>>
>>>>>>>>> Clone: apache-clone
>>>>>>>>>  Meta Attrs: maintenance=false
>>>>>>>>>  Resource: apache (class=systemd type=httpd)
>>>>>>>>>   Meta Attrs: maintenance=false
>>>>>>>>>   Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
>>>>>>>>>               start interval=0s timeout=100 (apache-start-interval-0s)
>>>>>>>>>               stop interval=0s timeout=100 (apache-stop-interval-0s)
>>>>>>>>>
>>>>>>>>> Resource: openvpn (class=systemd type=openvpn-server@01-server)
>>>>>>>>>  Meta Attrs: maintenance=false
>>>>>>>>>  Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
>>>>>>>>>              start interval=0s timeout=100 (openvpn-start-interval-0s)
>>>>>>>>>              stop interval=0s timeout=100 (openvpn-stop-interval-0s)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Btw, if I try a debug-start / debug-stop, the mentioned resources
>>>>>>>>> start and stop OK.
>>>>>>>> Based on that, my first guess would be SELinux. Check the SELinux logs
>>>>>>>> for denials.
>>>>>>>>
>>>>>>>> Also, make sure your systemd services are not enabled in systemd itself
>>>>>>>> (e.g. via systemctl enable). Clustered systemd services should be
>>>>>>>> managed by the cluster only.
>>>>> _______________________________________________
>>>>> Manage your subscription:
>>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>>
>>>>> ClusterLabs home: https://www.clusterlabs.org/
>>>> You really need to debug the start & stop of the resource.
>>>>
>>>> Please try the debug procedure and provide the output:
>>>> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>> Hi Maverick,
>>
>>
>> you can replace 'crm resource stop' with 'pcs resource disable'.
>> The rest is working, but sadly not for systemd.
>>
>> You can try to:
>> 'pcs resource debug-start <resource> --full'
>> Another approach is to:
>> 1. Copy the service unit file to /etc/systemd/system
>> 2. In the '[Service]' section add this:
>> Environment=SYSTEMD_LOG_LEVEL=debug
>> 3. Reload systemd:
>> systemctl daemon-reload
>> Note: I assume you have downtime for debugging the issue
>> 4. Use 'debug-start --full'
>>
>> Note: Don't forget to remove the debug setting, or your journal will fill up.
>>
>> Best Regards,
>> Strahil Nikolov
Hi Maverick,

According to this thread:
https://lists.clusterlabs.org/pipermail/users/2016-December/021053.html

you have 'startup-fencing' set to false. Check it out - maybe this is the reason.

Best Regards,
Strahil Nikolov
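
For reference, the systemd debugging steps quoted above could be sketched roughly as follows. This is only a sketch under assumptions: it uses a systemd drop-in file rather than copying the whole unit to /etc/systemd/system as the quoted steps describe, the helper name write_debug_dropin and the file name debug.conf are made up for illustration, and the unit and resource names (httpd.service, apache) are taken from the configuration earlier in the thread. The dry run writes into a scratch directory, so nothing on the node changes until the commented commands are run by hand.

```shell
# Hypothetical helper: write a drop-in that sets SYSTEMD_LOG_LEVEL=debug
# for one unit. The second argument overrides the drop-in root (dry runs).
write_debug_dropin() {
    unit="$1"                          # e.g. httpd.service
    root="${2:-/etc/systemd/system}"   # default location for local overrides
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nEnvironment=SYSTEMD_LOG_LEVEL=debug\n' > "$dir/debug.conf"
    echo "$dir/debug.conf"             # report where the file was written
}

# Dry run into a scratch directory; the live system is not touched.
scratch=$(mktemp -d)
dropin=$(write_debug_dropin httpd.service "$scratch")
cat "$dropin"

# On the real node (assumed sequence, as root, during a maintenance window):
#   write_debug_dropin httpd.service
#   systemctl daemon-reload
#   pcs resource debug-start apache --full
#   journalctl -u httpd.service -b
# Afterwards remove the drop-in and reload again, or the journal will fill up:
#   rm /etc/systemd/system/httpd.service.d/debug.conf
#   systemctl daemon-reload
```

A drop-in is used here only because it is easier to remove afterwards than an edited copy of the unit file; copying the unit as described in the quoted steps should have the same effect.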