Re: [Gluster-users] question about sync replicate volume after rebooting one node

songxin Tue, 16 Feb 2016 18:54:07 -0800

Hi,
Thank you for your prompt and detailed reply. I have a few more questions 
about glusterfs. 
Node A's IP is 128.224.162.163.
Node B's IP is 128.224.162.250.
1. After rebooting node B and starting the glusterd service, the glusterd log 
is as below.
...
[2015-12-07 07:54:55.743966] I [MSGID: 101190] 
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 2
[2015-12-07 07:54:55.744026] I [MSGID: 101190] 
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1
[2015-12-07 07:54:55.744280] I [MSGID: 106163] 
[glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: 
using the op-version 30706
[2015-12-07 07:54:55.773606] I [MSGID: 106490] 
[glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: 
Received probe from uuid: b6efd8fc-5eab-49d4-a537-2750de644a44
[2015-12-07 07:54:55.777994] E [MSGID: 101076] 
[common-utils.c:2954:gf_get_hostname_from_ip] 0-common-utils: Could not lookup 
hostname of 128.224.162.163 : Temporary failure in name resolution
[2015-12-07 07:54:55.778290] E [MSGID: 106010] 
[glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of 
Cksums gv0 differ. local cksum = 2492237955, remote cksum = 4087388312 on peer 
128.224.162.163
[2015-12-07 07:54:55.778384] I [MSGID: 106493] 
[glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded 
to 128.224.162.163 (0), ret: 0
[2015-12-07 07:54:55.928774] I [MSGID: 106493] 
[glusterd-rpc-ops.c:480:__glusterd_friend_add_cbk] 0-glusterd: Received RJT 
from uuid: b6efd8fc-5eab-49d4-a537-2750de644a44, host: 128.224.162.163, port: 0
...
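
For reference, the checksum mismatch reported in the log above can be pulled out with a small filter. This is only a sketch: the function name extract_cksums is hypothetical, and it assumes each log message sits on a single line, as it would in the log file itself (the excerpt above is wrapped by the mail client).

```shell
# Hypothetical helper: read glusterd log text on stdin and print
# "<volume> <local cksum> <remote cksum>" for every cksum-mismatch message.
# Assumes one log message per line.
extract_cksums() {
    sed -n 's/.*Cksums \([^ ]*\) differ\. local cksum = \([0-9]*\), remote cksum = \([0-9]*\).*/\1 \2 \3/p'
}
```

It could be used as "extract_cksums < glusterd.log", where glusterd.log is a placeholder for wherever the glusterd log is kept on the node.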
When I run "gluster peer status" on node B, it shows the following.
Number of Peers: 1



Hostname: 128.224.162.163
Uuid: b6efd8fc-5eab-49d4-a537-2750de644a44
State: Peer Rejected (Connected)
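
A quick way to spot this state from a script is to filter the peer status text. The sketch below is a minimal example, not an official tool: the function name rejected_peers is hypothetical, and it assumes the output layout shown above (a "Hostname:" line followed later by a "State:" line for each peer).

```shell
# Hypothetical helper: read "gluster peer status" output on stdin and print
# the hostname of every peer whose state starts with "Peer Rejected".
rejected_peers() {
    awk '/^Hostname: /            { host = $2 }
         /^State: Peer Rejected/  { print host }'
}
```

For example: gluster peer status | rejected_peers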


When I run "gluster volume status" on node A, it shows the following.
 
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 128.224.162.163:/home/wrsadmin/work/t
mp/data/brick/gv0                           49152     0          Y       13019
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       13045
 
Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks


It looks like the glusterfsd service is OK on node A.


Is it because the peer state is Rejected that glusterd didn't start 
glusterfsd? What causes this problem?




2. Is glustershd (the self-heal daemon) the process shown below?
root       497  0.8  0.0 432520 18104 ?        Ssl  08:07   0:00 
/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p 
/var/lib/glusterd/glustershd/run/gluster ..


If it is, I want to know whether glustershd is also the same binary as 
glusterfsd, just like glusterd and glusterfs.


Thanks,
Xin




At 2016-02-16 18:53:03, "Anuradha Talur" <[email protected]> wrote:
>
>
>----- Original Message -----
>> From: "songxin" <[email protected]>
>> To: [email protected]
>> Sent: Tuesday, February 16, 2016 3:59:50 PM
>> Subject: [Gluster-users] question about sync replicate volume after  
>> rebooting one node
>> 
>> Hi,
>> I have a question about how to sync a volume between two bricks after one
>> node is rebooted.
>> 
>> There are two nodes, node A and node B. Node A's IP is 128.124.10.1 and node
>> B's IP is 128.124.10.2.
>> 
>> Operation steps on node A:
>> 1. gluster peer probe 128.124.10.2
>> 2. mkdir -p /data/brick/gv0
>> 3. gluster volume create gv0 replica 2 128.124.10.1:/data/brick/gv0
>> 128.124.10.2:/data/brick/gv1 force
>> 4. gluster volume start gv0
>> 5. mount -t glusterfs 128.124.10.1:/gv0 gluster
>> 
>> Operation steps on node B:
>> 1. mkdir -p /data/brick/gv0
>> 2. mount -t glusterfs 128.124.10.1:/gv0 gluster
>> 
>> After all the steps above, there are several gluster service processes,
>> including glusterd, glusterfs and glusterfsd, running on both node A and
>> node B.
>> I can see these services with the commands "ps aux | grep gluster" and
>> "gluster volume status".
>> 
>> Now I reboot node B. After the reboot, there are no gluster services running
>> on node B.
>> After I run "systemctl start glusterd", only the glusterd service is running
>> on node B, not glusterfs or glusterfsd.
>> Because glusterfs and glusterfsd are not running, I can't run "gluster volume
>> heal gv0 full".
>> 
>> I want to know why glusterd doesn't start glusterfs and glusterfsd.
>
>On starting glusterd, glusterfsd should have started by itself.
>Could you share the glusterd and brick logs (on node B) so that we know why
>glusterfsd didn't start?
>
>Do you still see glusterfsd service running on node A? You can try running 
>"gluster v start <VOLNAME> force"
>on one of the nodes and check if all the brick processes started.
>
>gluster volume status <VOLNAME> should be able to provide you with gluster 
>process status.
>
>On restarting the node, the glusterfs process for the mount won't start by
>itself. You will have to run step 2 on node B again for it.
>
>> How do I restart these services on B node?
>> How do I sync the replicate volume after one node reboot?
>
>Once the glusterfsd process starts on node B too, glustershd (the
>self-heal daemon) for the replicate volume should start healing/syncing
>files that need to be synced. This daemon syncs files periodically.
>
>If you want to trigger a heal explicitly, you can run gluster volume heal 
><VOLNAME> on one of the servers.
>> 
>> Thanks,
>> Xin
>> 
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://www.gluster.org/mailman/listinfo/gluster-users
>
>-- 
>Thanks,
>Anuradha.
