Re: [ClusterLabs] Occasionally IPaddr2 resource fails to start

2019-10-07 Thread Jan Pokorný
Donat,

On 07/10/19 09:24 -0500, Ken Gaillot wrote:
> If this always happens when the VM is being snapshotted, you can put
> the cluster in maintenance mode (or even unmanage just the IP
> resource) while the snapshotting is happening. I don't know of any
> reason why snapshotting would affect only an IP, though.

it would be interesting if you could share the details, to grow the
shared knowledge and experience in case similar problems are reported
in the future.

In particular, it'd be interesting to hear:

- hypervisor

- VM OS + whether it is plain/oblivious to running virtualized,
  or uses "the optimal arrangement" (e.g., specialized drivers, virtio,
  "guest additions", etc.)

(I think IPaddr2 is iproute2-only, hence in turn, VM OS must be Linux)

Of course, there might be more specific things to look at if anyone
here is an expert with particular hypervisor technology and the way
the networking works with it (no, not me at all).

-- 
Poki



Re: [ClusterLabs] Apache doesn't start under corosync with systemd

2019-10-07 Thread Jan Pokorný
On 07/10/19 10:27 -0500, Ken Gaillot wrote:
> Additionally, your pacemaker configuration is using the apache OCF
> script, so the cluster won't use /etc/init.d/apache2 at all (it invokes
> the httpd binary directly).

See my parallel response.

> Keep in mind that the httpd monitor action requires the status
> module to be enabled -- I assume that's already in place.

(For SUSE, it looks like it used to be injected.)

On-topic, spotted a small discrepancy around that part:
https://github.com/ClusterLabs/resource-agents/pull/1414

-- 
Poki



Re: [ClusterLabs] Unable to resource due to nvpair[@name="target-role"]: No such device or address

2019-10-07 Thread Ken Gaillot
On Mon, 2019-10-07 at 13:34 +, S Sathish S wrote:
> Hi Team,
>  
> I have two below query , we have been using Rhel 6.5 OS Version with
> below clusterlab source code compiled.
>  
> corosync-1.4.10
> pacemaker-1.1.10
> pcs-0.9.90
> resource-agents-3.9.2

Ouch, that's really old. It should still work, but not many people here
will have experience with it.
 
> Query 1 : we have added below resource group as required later we are
> trying to start the resource group , but unable to perform it .
>But while executing RA file with start option ,
> required service is started but pacemaker unable to recognized it
> started .

Are you passing any arguments on the command line when starting the
agent directly? The cluster configuration below doesn't have any, so
that would be the first thing I'd consider.
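
For instance, a rough way to reproduce what the cluster does (paths and
the absence of OCF_RESKEY_* variables are assumptions based on the
configuration below; the resource-agents package also ships an
ocf-tester utility for this kind of check):

  export OCF_ROOT=/usr/lib/ocf
  # the cluster passes parameters as OCF_RESKEY_<name> variables;
  # your configuration defines none, so none are exported here
  /usr/lib/ocf/resource.d/provider/MANAGER_RA start;   echo "start rc=$?"
  /usr/lib/ocf/resource.d/provider/MANAGER_RA monitor; echo "monitor rc=$?"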

>  
> # pcs resource show MANAGER
> Resource: MANAGER (class=ocf provider=provider type=MANAGER_RA)
>   Meta Attrs: priority=100 failure-timeout=120s migration-threshold=5
>   Operations: monitor on-fail=restart interval=10s timeout=120s
> (MANAGER-monitor-interval-10s)
>   start on-fail=restart interval=0s timeout=120s
> (MANAGER-start-timeout-120s-on-fail-restart)
>   stop interval=0s timeout=120s (MANAGER-stop-timeout-
> 120s)
>  
> Starting the below resource
> #pcs resource enable MANAGER
>  
> Below are error we are getting in corosync.log file ,Please suggest
> what will be RCA for below issue.
>  
> cib: info: crm_client_new:   Connecting 0x819e00 for uid=0 gid=0
> pid=18508 id=e5fdaf69-390b-447d-b407-6420ac45148f
> cib: info: cib_process_request:  Completed cib_query
> operation for section 'all': OK (rc=0, origin=local/crm_resource/2,
> version=0.89.1)
> cib: info: cib_process_request:  Completed cib_query
> operation for section //cib/configuration/resources//*[@id="MANAGER
> "]/meta_attributes//nvpair[@name="target-role"]: No such device or
> address (rc=-6, origin=local/crm_resource/3, version=0.89.1)
> cib: info: crm_client_destroy:   Destroying 0 events

"info" level messages aren't errors. You might find /var/log/messages
more helpful in most cases.

There will be two nodes of interest. At any given time, one of the
nodes serves as "DC" -- this node's logs will have "pengine:" entries
showing any actions that are needed (such as starting or stopping a
resource). Then the node that actually runs the resource will have any
logs from the resource agent.

Additionally the "pcs status" command will show if there were any
resource failures.
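
For example (log locations vary by distribution; these are only common
defaults):

  # on the DC, look for the scheduler's decisions
  grep pengine /var/log/messages
  # on the node that runs the resource, look for resource agent output
  grep MANAGER /var/log/messages
  # failed operations are listed near the end of the status output
  pcs status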

> Query 2 : stack we are using classic openais (with plugin) , In that
> start the pacemaker service by default “update-origin” parameter in
> cib.xml update as hostname which pull from get_node_name function
> (uname -n)  instead we need to configure IPADDRESS of the hostname ,
> Is it possible ? we have requirement to perform the same.
>  
>  
> Thanks and Regards,
> S Sathish S

I'm not familiar with what classic openais supported. At the very least
you might consider switching from the plugin to CMAN, which was better
supported on RHEL 6.

At least with corosync 2, I believe it is possible to configure IP
addresses as node names when setting up the cluster, but I'm not sure
there's a good reason to do so. "update-origin" is just a comment
indicating which node made the most recent configuration change, and
isn't used for anything.
-- 
Ken Gaillot 


Re: [ClusterLabs] Apache doesn't start under corosync with systemd

2019-10-07 Thread Ken Gaillot
On Fri, 2019-10-04 at 14:10 +, Reynolds, John F - San Mateo, CA -
Contractor wrote:
> Good morning.
>  
> I’ve just upgraded a two-node active-passive cluster from SLES11 to
> SLES12.  This means that I’ve gone from /etc/init.d scripts to
> systemd services.
>  
> On the SLES11 server, this worked:
>  
> <primitive ... type="apache">
>   <instance_attributes ...>
>     <nvpair ... id="ncoa_apache-instance_attributes-configfile"/>
>   </instance_attributes>
>   <operations>
>     <op ... id="ncoa_apache-monitor-40s"/>
>   </operations>
> </primitive>
>  
> I had to tweak /etc/init.d/apache2 to make sure it only started on
> the active node, but that’s OK.

If pacemaker is managing a resource, the service should not be enabled
to start on boot (regardless of init or systemd). Pacemaker will start
and stop the service as needed according to the cluster configuration.
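
For example, on SLES 12 that would be something like (assuming the unit
name is apache2):

  systemctl disable apache2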

Additionally, your pacemaker configuration is using the apache OCF
script, so the cluster won't use /etc/init.d/apache2 at all (it invokes
the httpd binary directly).

Keep in mind that the httpd monitor action requires the status module
to be enabled -- I assume that's already in place.
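
For reference, enabling it usually amounts to loading mod_status plus a
snippet along these lines (details vary by distribution; the agent's
default status URL points at /server-status on localhost):

  <Location /server-status>
      SetHandler server-status
      Require local
  </Location>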

>  
> On the SLES12 server, the resource is the same:
>  
> <primitive ... type="apache">
>   <instance_attributes ...>
>     <nvpair ... id="ncoa_apache-instance_attributes-configfile"/>
>   </instance_attributes>
>   <operations>
>     <op ... id="ncoa_apache-monitor-40s"/>
>   </operations>
> </primitive>
>  
> and the cluster believes the resource is started:
>  
>  
> eagnmnmep19c1:/var/lib/pacemaker/cib # crm status
> Stack: corosync
> Current DC: eagnmnmep19c0 (version 1.1.16-4.8-77ea74d) - partition
> with quorum
> Last updated: Fri Oct  4 09:02:52 2019
> Last change: Thu Oct  3 10:55:03 2019 by root via crm_resource on
> eagnmnmep19c0
>  
> 2 nodes configured
> 16 resources configured
>  
> Online: [ eagnmnmep19c0 eagnmnmep19c1 ]
>  
> Full list of resources:
>  
> Resource Group: grp_ncoa
>   (edited out for brevity)
>  ncoa_a05shared (ocf::heartbeat:Filesystem):Started
> eagnmnmep19c1
>  IP_56.201.217.146  (ocf::heartbeat:IPaddr2):   Started
> eagnmnmep19c1
>  ncoa_apache(ocf::heartbeat:apache):Started
> eagnmnmep19c1
>  
> eagnmnmep19c1:/var/lib/pacemaker/cib #
>  
>  
> But the httpd daemons aren’t started.  I can start them by hand, but
> that’s not what I need.
>  
> I have gone through the ClusterLabs and SLES docs for setting up
> apache resources, and through this list’s archive; haven’t found my
> answer.   I’m missing something in corosync, apache, or systemd.
>  Please advise.
>  
>  
> John Reynolds, Contractor
> San Mateo Unix
>  
-- 
Ken Gaillot 


Re: [ClusterLabs] Antw: Occasionally IPaddr2 resource fails to start

2019-10-07 Thread Ken Gaillot
On Mon, 2019-10-07 at 14:40 +0300, Donat Zenichev wrote:
> Hello and thank you for your answer!
> 
> So should I just disable "monitor" options at all? In my case  I'd
> better delete the whole "op" row:
> "op monitor interval=20 timeout=60 on-fail=restart"
> 
> am I correct?

Personally I wouldn't delete the monitor -- at most, I'd configure it
with on-fail=ignore. That way you can still see failures in the cluster
status, even if the cluster doesn't react to them.
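
With the crm configuration quoted below, that would roughly mean
changing the op line to:

  op monitor interval=20 timeout=60 on-fail=ignore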

If this always happens when the VM is being snapshotted, you can put
the cluster in maintenance mode (or even unmanage just the IP resource)
while the snapshotting is happening. I don't know of any reason why
snapshotting would affect only an IP, though.
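
Roughly, with crmsh that could look like:

  # before the snapshot
  crm configure property maintenance-mode=true
  #   ...or, to unmanage only the IP resource:
  crm resource unmanage IPSHARED

  # after the snapshot
  crm configure property maintenance-mode=false
  crm resource manage IPSHARED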

Most resource agents send some logs to the system log. If that doesn't
give any clue, you could set OCF_TRACE_RA=1 in the pacemaker
environment to get tons more logs from resource agents.
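
A minimal sketch, assuming the sysconfig layout common on SLES/RHEL
(which file the pacemaker service actually sources is
distribution-dependent):

  echo 'OCF_TRACE_RA=1' >> /etc/sysconfig/pacemaker
  # then restart pacemaker on that node for it to take effect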

> 
> On Mon, Oct 7, 2019 at 2:36 PM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
> > Hi!
> > 
> > I can't remember the exact reason, but probably it was exactly that
> > what made us remove any monitor operation from IPaddr2 (back in
> > 2011). So far no problems doing so ;-)
> > 
> > 
> > Regards,
> > Ulrich
> > P.S.: Of cource it would be nice if the real issue could be found
> > and fixed.
> > 
> > >>> Donat Zenichev  schrieb am 20.09.2019
> > um 14:43 in
> > Nachricht
> >  > >:
> > > Hi there!
> > > 
> > > I've got a tricky case, when my IpAddr2 resource fails to start
> > with
> > > literally no-reason:
> > > "IPSHARED_monitor_2 on my-master-1 'not running' (7):
> > call=11,
> > > status=complete, exitreason='',
> > >last-rc-change='Wed Sep 4 06:08:07 2019', queued=0ms,
> > exec=0ms"
> > > 
> > > Resource IpAddr2 managed to fix itself and continued to work
> > properly
> > > further after that.
> > > 
> > > What I've done after, was setting 'Failure-timeout=900' seconds
> > for my
> > > IpAddr2 resource, to prevent working of
> > > the resource on a node where it fails. I also set the
> > > 'migration-threshold=2' so IpAddr2 can fail only 2 times, and
> > goes to a
> > > Slave side after that. Meanwhile Master gets banned for 900
> > seconds.
> > > 
> > > After 900 seconds cluster tries to start IpAddr2 again at Master,
> > in case
> > > it's ok, fail counter gets cleared.
> > > That's how I avoid appearing of the error I mentioned above.
> > > 
> > > I tried to get so hard, why this can happen, but still no idea on
> > the
> > > count. Any clue how to find a reason?
> > > And another question, can snap-shoting of VM machines have any
> > impact on
> > > such?
> > > 
> > > And my configurations:
> > > ---
> > > node 01: my-master-1
> > > node 02: my-master-2
> > > 
> > > primitive IPSHARED IPaddr2 \
> > > params ip=10.10.10.5 nic=eth0 cidr_netmask=24 \
> > > meta migration-threshold=2 failure-timeout=900 target-
> > role=Started \
> > > op monitor interval=20 timeout=60 on-fail=restart
> > > 
> > > location PREFER_MASTER IPSHARED 100: my-master-1
> > > 
> > > property cib-bootstrap-options: \
> > > have-watchdog=false \
> > > dc-version=1.1.18-2b07d5c5a9 \
> > > cluster-infrastructure=corosync \
> > > cluster-name=wall \
> > > cluster-recheck-interval=5s \
> > > start-failure-is-fatal=false \
> > > stonith-enabled=false \
> > > no-quorum-policy=ignore \
> > > last-lrm-refresh=1554982967
> > > ---
> > > 
> > > Thanks in advance!
> > > 
> > > -- 
> > > -- 
> > > BR, Donat Zenichev
-- 
Ken Gaillot 



[ClusterLabs] Unable to resource due to nvpair[@name="target-role"]: No such device or address

2019-10-07 Thread S Sathish S
Hi Team,

I have the two queries below. We have been using the RHEL 6.5 OS version with
the ClusterLabs components below compiled from source.

corosync-1.4.10
pacemaker-1.1.10
pcs-0.9.90
resource-agents-3.9.2

Query 1: We have added the resource group below as required; later we tried to
start the resource group, but were unable to.
   When executing the RA file directly with the start option, the
required service starts, but Pacemaker does not recognize it as started.

# pcs resource show MANAGER
Resource: MANAGER (class=ocf provider=provider type=MANAGER_RA)
  Meta Attrs: priority=100 failure-timeout=120s migration-threshold=5
  Operations: monitor on-fail=restart interval=10s timeout=120s 
(MANAGER-monitor-interval-10s)
  start on-fail=restart interval=0s timeout=120s 
(MANAGER-start-timeout-120s-on-fail-restart)
  stop interval=0s timeout=120s (MANAGER-stop-timeout-120s)

Starting the below resource
#pcs resource enable MANAGER

Below are the errors we are getting in the corosync.log file. Please suggest
what the RCA for this issue might be.

cib: info: crm_client_new:   Connecting 0x819e00 for uid=0 gid=0 pid=18508 
id=e5fdaf69-390b-447d-b407-6420ac45148f
cib: info: cib_process_request:  Completed cib_query operation for 
section 'all': OK (rc=0, origin=local/crm_resource/2, version=0.89.1)
cib: info: cib_process_request:  Completed cib_query operation for 
section //cib/configuration/resources//*[@id="MANAGER 
"]/meta_attributes//nvpair[@name="target-role"]: No such device or address 
(rc=-6, origin=local/crm_resource/3, version=0.89.1)
cib: info: crm_client_destroy:   Destroying 0 events

Query 2: The stack we are using is classic openais (with plugin). When the
pacemaker service starts, the "update-origin" parameter in cib.xml is by
default updated with the hostname pulled from the get_node_name function
(uname -n); instead we need it to be the IP address of the host. Is that
possible? We have a requirement to do so.


Thanks and Regards,
S Sathish S

Re: [ClusterLabs] Antw: Occasionally IPaddr2 resource fails to start

2019-10-07 Thread Donat Zenichev
Hello and thank you for your answer!

So should I just disable the "monitor" operation entirely? In my case I'd
rather delete the whole "op" row:
"op monitor interval=20 timeout=60 on-fail=restart"

am I correct?

On Mon, Oct 7, 2019 at 2:36 PM Ulrich Windl <
ulrich.wi...@rz.uni-regensburg.de> wrote:

> Hi!
>
> I can't remember the exact reason, but probably it was exactly that what
> made us remove any monitor operation from IPaddr2 (back in 2011). So far no
> problems doing so ;-)
>
>
> Regards,
> Ulrich
> P.S.: Of cource it would be nice if the real issue could be found and
> fixed.
>
> >>> Donat Zenichev  schrieb am 20.09.2019 um
> 14:43 in
> Nachricht
> :
> > Hi there!
> >
> > I've got a tricky case, when my IpAddr2 resource fails to start with
> > literally no-reason:
> > "IPSHARED_monitor_2 on my-master-1 'not running' (7): call=11,
> > status=complete, exitreason='',
> >last-rc-change='Wed Sep 4 06:08:07 2019', queued=0ms, exec=0ms"
> >
> > Resource IpAddr2 managed to fix itself and continued to work properly
> > further after that.
> >
> > What I've done after, was setting 'Failure-timeout=900' seconds for my
> > IpAddr2 resource, to prevent working of
> > the resource on a node where it fails. I also set the
> > 'migration-threshold=2' so IpAddr2 can fail only 2 times, and goes to a
> > Slave side after that. Meanwhile Master gets banned for 900 seconds.
> >
> > After 900 seconds cluster tries to start IpAddr2 again at Master, in case
> > it's ok, fail counter gets cleared.
> > That's how I avoid appearing of the error I mentioned above.
> >
> > I tried to get so hard, why this can happen, but still no idea on the
> > count. Any clue how to find a reason?
> > And another question, can snap-shoting of VM machines have any impact on
> > such?
> >
> > And my configurations:
> > ---
> > node 01: my-master-1
> > node 02: my-master-2
> >
> > primitive IPSHARED IPaddr2 \
> > params ip=10.10.10.5 nic=eth0 cidr_netmask=24 \
> > meta migration-threshold=2 failure-timeout=900 target-role=Started \
> > op monitor interval=20 timeout=60 on-fail=restart
> >
> > location PREFER_MASTER IPSHARED 100: my-master-1
> >
> > property cib-bootstrap-options: \
> > have-watchdog=false \
> > dc-version=1.1.18-2b07d5c5a9 \
> > cluster-infrastructure=corosync \
> > cluster-name=wall \
> > cluster-recheck-interval=5s \
> > start-failure-is-fatal=false \
> > stonith-enabled=false \
> > no-quorum-policy=ignore \
> > last-lrm-refresh=1554982967
> > ---
> >
> > Thanks in advance!
> >
> > --
> > --
> > BR, Donat Zenichev
>
>
>
>
>


-- 

Best regards,
Donat Zenichev

[ClusterLabs] Antw: Occasionally IPaddr2 resource fails to start

2019-10-07 Thread Ulrich Windl
Hi!

I can't remember the exact reason, but probably it was exactly that which made
us remove any monitor operation from IPaddr2 (back in 2011). So far no problems
doing so ;-)


Regards,
Ulrich
P.S.: Of course it would be nice if the real issue could be found and fixed.

>>> Donat Zenichev  wrote on 20.09.2019 at 14:43 in message:
> Hi there!
> 
> I've got a tricky case where my IpAddr2 resource fails to start for
> literally no reason:
> "IPSHARED_monitor_2 on my-master-1 'not running' (7): call=11,
> status=complete, exitreason='',
>last-rc-change='Wed Sep 4 06:08:07 2019', queued=0ms, exec=0ms"
> 
> Resource IpAddr2 managed to fix itself and continued to work properly
> further after that.
> 
> What I've done after, was setting 'Failure-timeout=900' seconds for my
> IpAddr2 resource, to prevent working of
> the resource on a node where it fails. I also set the
> 'migration-threshold=2' so IpAddr2 can fail only 2 times, and goes to a
> Slave side after that. Meanwhile Master gets banned for 900 seconds.
> 
> After 900 seconds cluster tries to start IpAddr2 again at Master, in case
> it's ok, fail counter gets cleared.
> That's how I avoid appearing of the error I mentioned above.
> 
> I tried really hard to figure out why this can happen, but still have no
> idea. Any clue how to find a reason?
> And another question, can snapshotting of VMs have any impact on
> this?
> 
> And my configurations:
> ---
> node 01: my-master-1
> node 02: my-master-2
> 
> primitive IPSHARED IPaddr2 \
> params ip=10.10.10.5 nic=eth0 cidr_netmask=24 \
> meta migration-threshold=2 failure-timeout=900 target-role=Started \
> op monitor interval=20 timeout=60 on-fail=restart
> 
> location PREFER_MASTER IPSHARED 100: my-master-1
> 
> property cib-bootstrap-options: \
> have-watchdog=false \
> dc-version=1.1.18-2b07d5c5a9 \
> cluster-infrastructure=corosync \
> cluster-name=wall \
> cluster-recheck-interval=5s \
> start-failure-is-fatal=false \
> stonith-enabled=false \
> no-quorum-policy=ignore \
> last-lrm-refresh=1554982967
> ---
> 
> Thanks in advance!
> 
> -- 
> -- 
> BR, Donat Zenichev






[ClusterLabs] Antw: Coming in Pacemaker 2.0.3: Year 2038 compatibility

2019-10-07 Thread Ulrich Windl
>>> Ken Gaillot  wrote on 17.09.2019 at 06:38 in message:
> Hi all,
> 
> I wanted to highlight a feature of the next Pacemaker release: it will
> be ready for the Year 2038.
> 
> I'm sure most people reading this are familiar with the problem.
> Representing epoch timestamps (seconds since 1970‑01‑01) as signed 32‑
> bit integers will overflow at 2038‑01‑19 03:14:07 UTC, wreaking havoc
> and bringing about the collapse of civilization (or at least some
> embedded systems).
> 
> Most OSes are ready at the kernel and C library levels on 64‑bit CPU
> architectures. (There's still a lot of work to be done for filesystems
> and applications, and 32‑bit architectures may never be fixed.)

Amazingly, I was asked about 32-bit and 2038 recently, so you may also want to
see https://en.wikipedia.org/wiki/Year_2038_problem and
https://stackoverflow.com/questions/35016003/year-2038-solution-for-embedded-linux-32-bit

> 
> Until now, Pacemaker has not been Y2038‑ready, storing
> timestamps insufficiently in memory and the CIB. This is expected to be
> fully remedied in 2.0.3. So go ahead, set up that rule to put your
> cluster in standby at 10:30 p.m. March 15, 2040. :‑)
> 
> I'm planning to start the 2.0.3 release cycle in about a month.
> 
> This change will not be backported to the 1.1 series, which is expected
> to be end‑of‑life sometime in the next few years.
> ‑‑ 
> Ken Gaillot 
> 




Re: [ClusterLabs] Apache doesn't start under corosync with systemd

2019-10-07 Thread Jan Pokorný
On 04/10/19 14:10 +, Reynolds, John F - San Mateo, CA - Contractor wrote:
> I've just upgraded a two-node active-passive cluster from SLES11 to
> SLES12.  This means that I've gone from /etc/init.d scripts to
> systemd services.
> 
> On the SLES11 server, this worked:
> 
> <primitive ... type="apache">
>   <instance_attributes ...>
>     <nvpair ... id="ncoa_apache-instance_attributes-configfile"/>
>   </instance_attributes>
>   <operations>
>     <op ... id="ncoa_apache-monitor-40s"/>
>   </operations>
> </primitive>
> 
> I had to tweak /etc/init.d/apache2 to make sure it only started on the active 
> node, but that's OK.
> 
> On the SLES12 server, the resource is the same:
> 
> <primitive ... type="apache">
>   <instance_attributes ...>
>     <nvpair ... id="ncoa_apache-instance_attributes-configfile"/>
>   </instance_attributes>
>   <operations>
>     <op ... id="ncoa_apache-monitor-40s"/>
>   </operations>
> </primitive>
> 
> and the cluster believes the resource is started:
> 
> 
> eagnmnmep19c1:/var/lib/pacemaker/cib # crm status
> Stack: corosync
> Current DC: eagnmnmep19c0 (version 1.1.16-4.8-77ea74d) - partition with quorum
> Last updated: Fri Oct  4 09:02:52 2019
> Last change: Thu Oct  3 10:55:03 2019 by root via crm_resource on 
> eagnmnmep19c0
> 
> 2 nodes configured
> 16 resources configured
> 
> Online: [ eagnmnmep19c0 eagnmnmep19c1 ]
> 
> Full list of resources:
> 
> Resource Group: grp_ncoa
>   (edited out for brevity)
>  ncoa_a05shared (ocf::heartbeat:Filesystem):Started eagnmnmep19c1
>  IP_56.201.217.146  (ocf::heartbeat:IPaddr2):   Started eagnmnmep19c1
>  ncoa_apache(ocf::heartbeat:apache):Started eagnmnmep19c1
> 
> eagnmnmep19c1:/var/lib/pacemaker/cib #
> 
> 
> But the httpd daemons aren't started.  I can start them by hand, but
> that's not what I need.
> 
> I have gone through the ClusterLabs and SLES docs for setting up
> apache resources, and through this list's archive; haven't found my
> answer.   I'm missing something in corosync, apache, or systemd.
> Please advise.

I think you've accidentally arranged a trap for yourself to fall into.

What makes me believe so is that you explicitly mentioned you modified
/etc/init.d/apache2.  In an RPM-driven ecosystem (incl. SUSE), that
means the file won't get removed on removal/upgrade of the respective
package.  This is likely what happened on your major system upgrade.

Then, you can follow the logic in the agent itself; see in particular:
https://github.com/ClusterLabs/resource-agents/blob/v4.3.0/heartbeat/apache#L187-L188
That check makes the agent assume it is not in systemd territory and
switches it to using the leftover initscript instead.  That is _not_
guaranteed to work, since at that point you have mismatched versions of
the httpd executables vs. the initscript (the former presumably newer
than the latter), so any evolutionary differences are not accounted for
in the initscript itself (it is stale at that point).

To resolve this, try simply renaming /etc/init.d/apache2 to something
else (or moving it somewhere else, to retain a backup of your
modifications), then unmanage and re-manage the resource; with luck,
the systemd route to getting httpd running will then work out.
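
For example, roughly (crmsh syntax; the backup path is only an
illustration):

  mv /etc/init.d/apache2 /root/apache2.init.bak   # keep your local changes
  crm resource unmanage ncoa_apache
  crm resource manage ncoa_apache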


-- 
Poki

