Hi,
On 08/16/2016 01:29 PM, wsl...@126.com wrote:
Thank you for your reply.
The network is fine, the firewall has been stopped, and SELinux has been disabled.
/var/log/pacemaker.log is as follows:
To be honest, this piece of the log is not very useful on its own. Please attach the whole
log file, covering everything from the first time things went wrong until now ;-)
Also, a lot of information is missing, such as the OS distribution, package versions, and
the cluster configuration.
An "hb_report" (man hb_report) would be great if the "cluster-glue" package is
installed.
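For reference, an invocation might look something like the following (the start time and
destination path are placeholders; adjust them to cover the period from when the problem
first appeared, and see the man page for the exact options on your version):

```shell
# Collect logs, cluster configuration, and system information from all nodes,
# starting at the given time; the destination is a placeholder directory name.
hb_report -f 13:00 /tmp/hb_report_node0
# hb_report packs the result into a tarball that can be attached to the list.
```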
Eric
Aug 16 13:11:12 [8980] node0 crmd: info: peer_update_callback:
node0 is now (null)
Aug 16 13:11:12 [8980] node0 crmd: info: crm_get_peer: Node 1
has uuid 1
Aug 16 13:11:12 [8980] node0 crmd: info: crm_update_peer_proc:
cluster_connect_cpg: Node node0[1] - corosync-cpg is now online
Aug 16 13:11:12 [8980] node0 crmd: info: peer_update_callback:
Client node0/peer now has status [online] (DC=<null>)
Aug 16 13:11:12 [8980] node0 crmd: info: init_cs_connection_once:
Connection to 'corosync': established
Aug 16 13:11:12 [8980] node0 crmd: notice: cluster_connect_quorum:
Quorum acquired
Aug 16 13:11:12 [8980] node0 crmd: info: do_ha_control:
Connected to the cluster
Aug 16 13:11:12 [8980] node0 crmd: info: lrmd_ipc_connect:
Connecting to lrmd
Aug 16 13:11:12 [8977] node0 lrmd: info: crm_client_new:
Connecting 0x16025a0 for uid=189 gid=189 pid=8980
id=e317bf62-fb55-46a1-af17-9284234917b8
Aug 16 13:11:12 [8975] node0 cib: info: cib_process_request:
Completed cib_modify operation for section nodes: OK (rc=0,
origin=local/crmd/3, version=0.1.0)
Aug 16 13:11:12 [8980] node0 crmd: info: do_lrm_control: LRM
connection established
Aug 16 13:11:12 [8980] node0 crmd: info: do_started: Delaying start,
no membership data (0000000000100000)
Aug 16 13:11:12 [8975] node0 cib: info: cib_process_request:
Completed cib_query operation for section crm_config: OK (rc=0,
origin=local/crmd/4, version=0.1.0)
Aug 16 13:11:12 [8980] node0 crmd: info: pcmk_quorum_notification:
Membership 8571476093872377196: quorum retained (1)
Aug 16 13:11:12 [8980] node0 crmd: notice: crm_update_peer_state:
pcmk_quorum_notification: Node node0[1] - state is now member (was (null))
Aug 16 13:11:12 [8980] node0 crmd: info: peer_update_callback:
node0 is now member (was (null))
Aug 16 13:11:12 [8980] node0 crmd: notice: crm_update_peer_state:
pcmk_quorum_notification: Node node0[1] - state is now lost (was member)
Aug 16 13:11:12 [8980] node0 crmd: info: peer_update_callback:
node0 is now lost (was member)
Aug 16 13:11:12 [8980] node0 crmd: error: reap_dead_nodes: We're
not part of the cluster anymore
Aug 16 13:11:12 [8975] node0 cib: info: crm_client_new:
Connecting 0x1cbcd80 for uid=0 gid=0 pid=8976
id=9afb683d-8390-408a-a690-fb5447fefd37
Aug 16 13:11:12 [8976] node0 stonith-ng: notice: setup_cib: Watching for
stonith topology changes
Aug 16 13:11:12 [8976] node0 stonith-ng: info: qb_ipcs_us_publish: server
name: stonith-ng
Aug 16 13:11:12 [8976] node0 stonith-ng: info: main: Starting
stonith-ng mainloop
Aug 16 13:11:12 [8976] node0 stonith-ng: info: pcmk_cpg_membership:
Joined[0.0] stonith-ng.1
Aug 16 13:11:12 [8976] node0 stonith-ng: info: pcmk_cpg_membership:
Member[0.0] stonith-ng.1
Aug 16 13:11:12 [8980] node0 crmd: error: do_log: FSA: Input
I_ERROR from reap_dead_nodes() received in state S_STARTING
Aug 16 13:11:12 [8980] node0 crmd: notice: do_state_transition:
State transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=reap_dead_nodes ]
Aug 16 13:11:12 [8980] node0 crmd: warning: do_recover: Fast-tracking
shutdown in response to errors
Aug 16 13:11:12 [8980] node0 crmd: error: do_started: Start
cancelled... S_RECOVERY
Aug 16 13:11:12 [8980] node0 crmd: error: do_log: FSA: Input
I_TERMINATE from do_recover() received in state S_RECOVERY
Aug 16 13:11:12 [8980] node0 crmd: info: do_state_transition:
State transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE
cause=C_FSA_INTERNAL origin=do_recover ]
Aug 16 13:11:12 [8980] node0 crmd: notice: lrm_state_verify_stopped:
Stopped 0 recurring operations at shutdown (0 ops remaining)
Aug 16 13:11:12 [8980] node0 crmd: info: do_lrm_control:
Disconnecting from the LRM
Aug 16 13:11:12 [8980] node0 crmd: info: lrmd_api_disconnect:
Disconnecting from lrmd service
Aug 16 13:11:12 [8980] node0 crmd: info: lrmd_ipc_connection_destroy:
IPC connection destroyed
Aug 16 13:11:12 [8980] node0 crmd: info: lrm_connection_destroy:
LRM Connection disconnected
Aug 16 13:11:12 [8980] node0 crmd: info: lrmd_api_disconnect:
Disconnecting from lrmd service
Aug 16 13:11:12 [8980] node0 crmd: notice: do_lrm_control:
Disconnected from the LRM
Aug 16 13:11:12 [8980] node0 crmd: info: crm_cluster_disconnect:
Disconnecting from cluster infrastructure: corosync
Aug 16 13:11:12 [8980] node0 crmd: notice: terminate_cs_connection:
Disconnecting from Corosync
Aug 16 13:11:12 [8980] node0 crmd: info: crm_cluster_disconnect:
Disconnected from corosync
Aug 16 13:11:12 [8980] node0 crmd: info: do_ha_control:
Disconnected from the cluster
Aug 16 13:11:12 [8977] node0 lrmd: info: crm_client_destroy:
Destroying 0 events
Aug 16 13:11:12 [8980] node0 crmd: info: do_cib_control:
Disconnecting CIB
Aug 16 13:11:12 [8975] node0 cib: info: cib_process_request:
Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/2,
version=0.1.0)
Aug 16 13:11:12 [8976] node0 stonith-ng: info: init_cib_cache_cb:
Updating device list from the cib: init
Aug 16 13:11:12 [8980] node0 crmd: info: crmd_cib_connection_destroy:
Connection to the CIB terminated...
Aug 16 13:11:12 [8980] node0 crmd: info: do_exit: Performing
A_EXIT_0 - gracefully exiting the CRMd
Aug 16 13:11:12 [8980] node0 crmd: info: do_exit: [crmd] stopped
(0)
Aug 16 13:11:12 [8980] node0 crmd: info: crmd_exit: Dropping
I_TERMINATE: [ state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ]
Aug 16 13:11:12 [8980] node0 crmd: info: crmd_quorum_destroy:
connection closed
Aug 16 13:11:12 [8980] node0 crmd: info: crmd_cs_destroy:
connection closed
Aug 16 13:11:12 [8980] node0 crmd: info: crmd_init: 8980 stopped:
OK (0)
Aug 16 13:11:12 [8980] node0 crmd: error: crmd_fast_exit: Could
not recover from internal error
Aug 16 13:11:12 [8980] node0 crmd: info: crm_xml_cleanup:
Cleaning up memory from libxml2
Aug 16 13:11:12 [8975] node0 cib: info: crm_client_destroy:
Destroying 0 events
Aug 16 13:11:12 [8974] node0 pacemakerd: error: pcmk_child_exit: Child
process crmd (8980) exited: Generic Pacemaker error (201)
Aug 16 13:11:12 [8974] node0 pacemakerd: notice: pcmk_process_exit:
Respawning failed child process: crmd
Aug 16 13:11:12 [8974] node0 pacemakerd: info: start_child: Using
uid=189 and group=189 for process crmd
Aug 16 13:11:12 [8974] node0 pacemakerd: info: start_child: Forked
child 8982 for process crmd
Aug 16 13:11:12 [8982] node0 crmd: info: crm_log_init: Changed
active directory to /var/lib/pacemaker/cores/hacluster
Aug 16 13:11:12 [8982] node0 crmd: notice: main: CRM Git
Version: 368c726
Aug 16 13:11:12 [8982] node0 crmd: info: do_log: FSA: Input
I_STARTUP from crmd_init() received in state S_STARTING
Aug 16 13:11:12 [8982] node0 crmd: info: get_cluster_type:
Verifying cluster type: 'corosync'
Aug 16 13:11:12 [8982] node0 crmd: info: get_cluster_type:
Assuming an active 'corosync' cluster
Aug 16 13:11:12 [8975] node0 cib: info: crm_client_new:
Connecting 0x1a8dd80 for uid=189 gid=189 pid=8982
id=5e050be4-f112-4f32-831b-1704ce1872dd
Aug 16 13:11:12 [8982] node0 crmd: info: do_cib_control: CIB
connection established
Aug 16 13:11:12 [8982] node0 crmd: notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Aug 16 13:11:12 [8975] node0 cib: info: cib_process_request:
Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/2,
version=0.1.0)
Aug 16 13:11:12 [8982] node0 crmd: info: crm_get_peer: Created
entry c01a1bd2-9f44-4061-9ea3-95c1df3de86d/0xea3c20 for node (null)/1 (1 total)
Aug 16 13:11:12 [8982] node0 crmd: info: crm_get_peer: Node 1
is now known as node0
Aug 16 13:11:12 [8982] node0 crmd: info: peer_update_callback:
node0 is now (null)
Aug 16 13:11:12 [8982] node0 crmd: info: crm_get_peer: Node 1
has uuid 1
Aug 16 13:11:12 [8982] node0 crmd: info: crm_update_peer_proc:
cluster_connect_cpg: Node node0[1] - corosync-cpg is now online
Aug 16 13:11:12 [8982] node0 crmd: info: peer_update_callback:
Client node0/peer now has status [online] (DC=<null>)
Aug 16 13:11:12 [8982] node0 crmd: info: init_cs_connection_once:
Connection to 'corosync': established
Aug 16 13:11:12 [8982] node0 crmd: notice: cluster_connect_quorum:
Quorum acquired
Aug 16 13:11:12 [8975] node0 cib: info: cib_process_request:
Completed cib_modify operation for section nodes: OK (rc=0,
origin=local/crmd/3, version=0.1.0)
Aug 16 13:11:12 [8982] node0 crmd: info: do_ha_control:
Connected to the cluster
Aug 16 13:11:12 [8982] node0 crmd: info: lrmd_ipc_connect:
Connecting to lrmd
Aug 16 13:11:12 [8977] node0 lrmd: info: crm_client_new:
Connecting 0x16025a0 for uid=189 gid=189 pid=8982
id=c6776327-b798-40f2-8ca4-a4b032cff2eb
Aug 16 13:11:12 [8982] node0 crmd: info: do_lrm_control: LRM
connection established
Aug 16 13:11:12 [8982] node0 crmd: info: do_started: Delaying start,
no membership data (0000000000100000)
Aug 16 13:11:12 [8975] node0 cib: info: cib_process_request:
Completed cib_query operation for section crm_config: OK (rc=0,
origin=local/crmd/4, version=0.1.0)
Aug 16 13:11:12 [8982] node0 crmd: info: pcmk_quorum_notification:
Membership 8571476093872377196: quorum retained (1)
Aug 16 13:11:12 [8982] node0 crmd: notice: crm_update_peer_state:
pcmk_quorum_notification: Node node0[1] - state is now member (was (null))
Aug 16 13:11:12 [8982] node0 crmd: info: peer_update_callback:
node0 is now member (was (null))
Aug 16 13:11:12 [8982] node0 crmd: notice: crm_update_peer_state:
pcmk_quorum_notification: Node node0[1] - state is now lost (was member)
Aug 16 13:11:12 [8982] node0 crmd: info: peer_update_callback:
node0 is now lost (was member)
Aug 16 13:11:12 [8982] node0 crmd: error: reap_dead_nodes: We're
not part of the cluster anymore
Aug 16 13:11:12 [8982] node0 crmd: error: do_log: FSA: Input
I_ERROR from reap_dead_nodes() received in state S_STARTING
Aug 16 13:11:12 [8982] node0 crmd: notice: do_state_transition:
State transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=reap_dead_nodes ]
Aug 16 13:11:12 [8982] node0 crmd: warning: do_recover: Fast-tracking
shutdown in response to errors
Aug 16 13:11:12 [8982] node0 crmd: error: do_started: Start
cancelled... S_RECOVERY
Aug 16 13:11:12 [8982] node0 crmd: error: do_log: FSA: Input
I_TERMINATE from do_recover() received in state S_RECOVERY
Aug 16 13:11:12 [8982] node0 crmd: info: do_state_transition:
State transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE
cause=C_FSA_INTERNAL origin=do_recover ]
Aug 16 13:11:12 [8982] node0 crmd: notice: lrm_state_verify_stopped:
Stopped 0 recurring operations at shutdown (0 ops remaining)
Aug 16 13:11:12 [8982] node0 crmd: info: do_lrm_control:
Disconnecting from the LRM
Aug 16 13:11:12 [8982] node0 crmd: info: lrmd_api_disconnect:
Disconnecting from lrmd service
Aug 16 13:11:12 [8982] node0 crmd: info: lrmd_ipc_connection_destroy:
IPC connection destroyed
Aug 16 13:11:12 [8977] node0 lrmd: info: crm_client_destroy:
Destroying 0 events
Aug 16 13:11:12 [8982] node0 crmd: info: lrm_connection_destroy:
LRM Connection disconnected
Aug 16 13:11:12 [8982] node0 crmd: info: lrmd_api_disconnect:
Disconnecting from lrmd service
Aug 16 13:11:12 [8982] node0 crmd: notice: do_lrm_control:
Disconnected from the LRM
Aug 16 13:11:12 [8982] node0 crmd: info: crm_cluster_disconnect:
Disconnecting from cluster infrastructure: corosync
wsl...@126.com
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org