10.10.2019 18:22, Lentes, Bernd wrote:
> Hi,
> 
> I have a two-node cluster running on SLES 12 SP4.
> I did some testing on it.
> I put one node (ha-idg-2) into standby; the other (ha-idg-1) got fenced a few 
> minutes later because I made a mistake.
> ha-idg-2 was DC. ha-idg-1 booted fresh and I started corosync/pacemaker 
> on it.
> It seems ha-idg-1 didn't find the DC after starting the cluster

Which was likely the reason for the fencing in the first place.

> and a few seconds later elected itself DC, 
> and afterwards fenced ha-idg-2.
> 
> Oct 09 18:04:43 [9550] ha-idg-1 corosync notice  [MAIN  ] Corosync Cluster 
> Engine ('2.3.6'): started and ready to provide service.
> Oct 09 18:04:43 [9550] ha-idg-1 corosync info    [MAIN  ] Corosync built-in 
> features: debug testagents augeas systemd pie relro bindnow
> Oct 09 18:04:43 [9550] ha-idg-1 corosync notice  [TOTEM ] Initializing 
> transport (UDP/IP Multicast).
> Oct 09 18:04:43 [9550] ha-idg-1 corosync notice  [TOTEM ] Initializing 
> transmit/receive security (NSS) crypto: aes256 hash: sha1
> Oct 09 18:04:43 [9550] ha-idg-1 corosync notice  [TOTEM ] The network 
> interface [192.168.100.10] is now up.
> 
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: crm_timer_popped: 
> Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:  warning: do_log:   Input 
> I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: do_state_transition:    
>   State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT 
> cause=C_TIMER_POPPED origin=crm_timer_popped
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: election_check:   
> election-DC won by local node
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: do_log:   Input 
> I_ELECTION_DC received in state S_ELECTION from election_win_cb
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:   notice: do_state_transition:    
>   State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC 
> cause=C_FSA_INTERNAL origin=election_win_cb
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: do_te_control:    
> Registering TE UUID: f302e1d4-a1aa-4a3e-b9dd-71bd17047f82
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: set_graph_functions:    
>   Setting custom graph functions
> Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: do_dc_takeover:   
> Taking over DC status for this partition
> 
> Oct 09 18:05:07 [9564] ha-idg-1    pengine:  warning: stage6:   Scheduling 
> Node ha-idg-2 for STONITH
> Oct 09 18:05:07 [9564] ha-idg-1    pengine:   notice: LogNodeActions:    * 
> Fence (Off) ha-idg-2 'node is unclean'
> 
> Is my understanding correct?
> 
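
To check the sequence in the full log rather than by eye, the crmd state transitions can be pulled out with a small script. This is only a minimal sketch: it assumes the unwrapped log format quoted above (one record per line), and the function name `state_transitions` is made up for illustration, not part of any Pacemaker tool.

```python
import re

# Matches crmd "State transition" records as they appear in the quoted log.
# Assumption: each record is on a single line (the mail client wrapped them above).
TRANSITION_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \[\d+\] (?P<node>\S+)\s+crmd:\s+\w+: "
    r"do_state_transition:\s+State transition (?P<old>\S+) -> (?P<new>\S+) "
    r"\| input=(?P<input>\S+)"
)


def state_transitions(lines):
    """Return (timestamp, node, old_state, new_state, input) for each transition."""
    result = []
    for line in lines:
        m = TRANSITION_RE.search(line)
        if m:
            result.append((m["ts"], m["node"], m["old"], m["new"], m["input"]))
    return result


# Three records taken from the log quoted above, re-joined onto single lines.
sample = [
    "Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: do_state_transition:      "
    "State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT "
    "cause=C_TIMER_POPPED origin=crm_timer_popped",
    "Oct 09 18:05:06 [9565] ha-idg-1       crmd:     info: election_check:   "
    "election-DC won by local node",
    "Oct 09 18:05:06 [9565] ha-idg-1       crmd:   notice: do_state_transition:      "
    "State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC "
    "cause=C_FSA_INTERNAL origin=election_win_cb",
]

for row in state_transitions(sample):
    print(*row)
```

On the quoted excerpt this lists S_PENDING -> S_ELECTION (triggered by I_DC_TIMEOUT) followed by S_ELECTION -> S_INTEGRATION, which matches the order you describe.
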
> 
> In the log of ha-idg-2 I don't find anything for this period:
> 
> Oct 09 17:58:46 [12504] ha-idg-2 stonith-ng:     info: cib_device_update:     
>   Device fence_ilo_ha-idg-2 has been disabled on ha-idg-2: score=-10000
> Oct 09 17:58:51 [12503] ha-idg-2        cib:     info: cib_process_ping:      
>   Reporting our current digest to ha-idg-2: 59c4cfb14defeafbeb3417e222242cd9 
> for 2.9506.36 (0x242b110 0)
> 
> Oct 09 18:00:42 [12508] ha-idg-2       crmd:     info: throttle_send_command: 
>   New throttle mode: 0001 (was 0000)
> Oct 09 18:01:12 [12508] ha-idg-2       crmd:     info: 
> throttle_check_thresholds:       Moderate CPU load detected: 32.220001
> Oct 09 18:01:12 [12508] ha-idg-2       crmd:     info: throttle_send_command: 
>   New throttle mode: 0010 (was 0001)
> Oct 09 18:01:42 [12508] ha-idg-2       crmd:     info: throttle_send_command: 
>   New throttle mode: 0001 (was 0010)
> Oct 09 18:02:42 [12508] ha-idg-2       crmd:     info: throttle_send_command: 
>   New throttle mode: 0000 (was 0001)
> 
> ha-idg-2 was fenced, and after a reboot I started corosync/pacemaker on it 
> again:
> 
> Oct 09 18:29:05 [11795] ha-idg-2 corosync notice  [MAIN  ] Corosync Cluster 
> Engine ('2.3.6'): started and ready to provide service.
> Oct 09 18:29:05 [11795] ha-idg-2 corosync info    [MAIN  ] Corosync built-in 
> features: debug testagents augeas systemd pie relro bindnow
> Oct 09 18:29:05 [11795] ha-idg-2 corosync notice  [TOTEM ] Initializing 
> transport (UDP/IP Multicast).
> Oct 09 18:29:05 [11795] ha-idg-2 corosync notice  [TOTEM ] Initializing 
> transmit/receive security (NSS) crypto: aes256 hash: sha1
> 
> What do the throttle lines mean?
> 
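
As far as I can tell, these lines are crmd reporting how much it is holding back new jobs because of CPU load; the mode is printed as a four-digit hex value. The sketch below decodes them. The mapping table is an assumption: 0010 matches the "Moderate CPU load detected" message in your log, and the other values are from my recollection of the throttle states in the Pacemaker source, so verify them against your version. The function name `describe_throttle` is made up for illustration.

```python
import re

# Assumed mapping of throttle mode values to load levels (verify against
# the Pacemaker source for your version; only 0010 = moderate is confirmed
# by the quoted log itself).
THROTTLE_MODES = {
    0x0000: "none",
    0x0001: "low",
    0x0010: "medium",
    0x0100: "high",
    0x1000: "extreme",
}

MODE_RE = re.compile(r"New throttle mode: ([0-9a-f]{4}) \(was ([0-9a-f]{4})\)")


def describe_throttle(line):
    """Return (old_level, new_level) for a throttle log line, or None."""
    m = MODE_RE.search(line)
    if m is None:
        return None
    new, old = (int(value, 16) for value in m.groups())
    return THROTTLE_MODES.get(old, "unknown"), THROTTLE_MODES.get(new, "unknown")


line = ("Oct 09 18:01:12 [12508] ha-idg-2       crmd:     info: "
        "throttle_send_command:   New throttle mode: 0010 (was 0001)")
print(describe_throttle(line))
```

Read this way, your excerpt shows the load on ha-idg-2 going none -> low -> medium -> low -> none between 18:00:42 and 18:02:42, i.e. a short CPU spike that had already cleared before ha-idg-1 started.
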
> Thanks.
> 
> 
> Bernd
> 

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/