Update: it seems the Active node is finally fixed, but the rsync processes are running from nodeA (so I don't really understand the master notion), and nodeA is the most heavily used node, so its load average becomes dangerously high.

How can I force the geo-replication to be started from a specific node (master)? I still don't understand why I have 3 masters…
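For reference, this is how I am checking which node is doing the actual syncing and what it costs there (a quick sketch; the process-name pattern in the grep is only my assumption about what the geo-rep workers are called):

```bash
# The row marked "Active" in the status output is the brick that actually syncs
gluster volume geo-replication myvol slaveA::myvol status detail

# On the loaded node (nodeA here), look for the geo-rep worker / rsync processes
# and the resulting load average (adjust the grep pattern if the names differ)
ps aux | grep -E '[g]syncd|[r]sync'
uptime
```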
--
Cyril Peponnet

On Feb 2, 2015, at 11:00 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

But now I have a strange issue. After creating the geo-rep session and starting it (from nodeB):

[root@nodeB]# gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeA          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeC          myvol         /export/raid/myvol    slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0

[root@nodeB]# gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeA          myvol         /export/raid/myvol    slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0
nodeC          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0

[root@nodeB]# gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB          myvol         /export/raid/myvol    slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0
nodeA          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeC          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0

So:

1/ Why are there 3 master nodes? nodeB should be the only master node.
2/ Why does the Active role keep moving from one node to another, turn by turn?

Thanks

--
Cyril Peponnet

On Feb 2, 2015, at 10:40 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

For the record, after adding operating-version=2 on every node (A, B, C) AND on the slave node, the commands are working.
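In case it helps, this is roughly what that came down to on each node (a minimal sketch, assuming an RPM install where glusterd keeps its state in /var/lib/glusterd and runs as the glusterd service):

```bash
# Check the op-version advertised by glusterd on every master node and on the slave
grep operating-version /var/lib/glusterd/glusterd.info

# If the line is missing (or lower than 2), add it and restart glusterd so that the
# "copy file" / "execute" system commands used by geo-rep create are accepted
echo "operating-version=2" >> /var/lib/glusterd/glusterd.info
service glusterd restart
```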
--
Cyril Peponnet

On Feb 2, 2015, at 9:46 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

More information here: I updated the state of the peer in the uuid file located in /var/lib/glusterd/peers from state 10 to state 3 (as it is on the other node), and now the node is in the cluster. gluster system:: execute gsec_create now creates a proper file on the master node with every node's key in it.

From there I tried to create my geo-replication session between the master (nodeB) and slaveA:

gluster volume geo-replication myvol slaveA::myvol create push-pem force

From slaveA I got these error messages in the logs:

[2015-02-02 17:19:04.754809] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:19:04.754890] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513547] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:19:07.513632] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513660] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.

On slaveA I have the common pem file transferred into /var/lib/glusterd/geo-replication/ with the keys of my 3 source-site nodes, but /root/.ssh/authorized_keys is not populated with this file.

From the log I saw that there is a call to a script:

/var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh --volname=myvol is_push_pem=1 pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub slave_ip=slaveA

In this script the following is done:

```
scp $pub_file $slave_ip:$pub_file_tmp
ssh $slave_ip "mv $pub_file_tmp $pub_file"
ssh $slave_ip "gluster system:: copy file /geo-replication/common_secret.pem.pub > /dev/null"
ssh $slave_ip "gluster system:: execute add_secret_pub > /dev/null"
```

The first 2 lines pass, the third fails, so the fourth is never executed.

Third command on slaveA:

# gluster system:: copy file /geo-replication/common_secret.pem.pub
One or more nodes do not support the required op version.
# gluster peer status
Number of Peers: 0

From the logs:

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:43:29.242524] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:43:29.242610] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.

For now I have only one node on my remote site. Anyway, since this step only copies the file across all the cluster members, I can live without it.

The fourth command is not working either:

[root@slaveA geo-replication]# gluster system:: execute add_secret_pub
[2015-02-02 17:44:49.123326] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123381] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.123568] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123588] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.306482] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:44:49.307921] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:44:49.308009] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:44:49.308038] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.

==> /var/log/glusterfs/cli.log <==
[2015-02-02 17:44:49.308493] I [input.c:36:cli_batch] 0-: Exiting with: -1

I have only one node… I don't understand the meaning of the error: "One or more nodes do not support the required op version."
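For completeness, the whole hook can also be re-run by hand with tracing to see exactly which of the quoted steps breaks (same script path and arguments as copied from the log above; bash -x simply prints each command as it runs):

```bash
# Re-run the geo-rep create post hook manually, tracing every command, to see
# which of the four quoted steps (scp / mv / copy file / add_secret_pub) fails
bash -x /var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh \
    --volname=myvol is_push_pem=1 \
    pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub \
    slave_ip=slaveA
```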
--
Cyril Peponnet

On Feb 2, 2015, at 8:49 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

Every node is connected:

[root@nodeA geo-replication]# gluster peer status
Number of Peers: 2

Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)

Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)

[root@nodeB ~]# gluster peer status
Number of Peers: 2

Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)

Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer is connected and Accepted (Connected)

[root@nodeC geo-replication]# gluster peer status
Number of Peers: 2

Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer in Cluster (Connected)

Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)

The only difference is the state "Peer is connected and Accepted (Connected)" reported by nodeB about nodeA.

When I execute gluster system:: execute gsec_create from node A or C, the common pem file contains the keys of all 3 nodes. But from nodeB, I only get the keys for nodeB and nodeC. This is unfortunate, as I am trying to launch the geo-replication job from nodeB (the master).

--
Cyril Peponnet

On Feb 2, 2015, at 2:07 AM, Aravinda <avish...@redhat.com> wrote:

Looks like node C is in a disconnected state. Please let us know the output of `gluster peer status` from all the master nodes and slave nodes.

--
regards
Aravinda

On 01/22/2015 12:27 AM, PEPONNET, Cyril N (Cyril) wrote:

So, on the master node of my 3-node setup:

1) gluster system:: execute gsec_create

In /var/lib/glusterd/geo-replication/common_secret.pem.pub I have the pem pub keys from master node A and node B (not node C). On node C I don't have anything in /var/lib/glusterd/geo-replication/ except the gsyncd template config. So here I have an issue.

The only error I see on node C is:

[2015-01-21 18:36:41.179601] E [rpc-clnt.c:208:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x23 sent = 2015-01-21 18:26:33.031937. timeout = 600 for xx.xx.xx.xx:24007

On node A, the cli.log looks like:

[2015-01-21 18:49:49.878905] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-01-21 18:49:49.878947] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-01-21 18:49:49.879085] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-01-21 18:49:49.879095] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-01-21 18:49:49.951835] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2015-01-21 18:49:49.972143] I [input.c:36:cli_batch] 0-: Exiting with: 0

If I run gluster system:: execute gsec_create on node C or node B, the common pem key file contains the pub keys of all 3 nodes. So somehow node A is unable to get the key from node C. Let's try to fix this one before going further.
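Two quick checks I am running on each master node while digging into this (plain shell; the paths are the standard /var/lib/glusterd locations already mentioned in this thread):

```bash
# After 'gluster system:: execute gsec_create', the common file should contain
# one public key per master node (3 expected here); compare the count on A, B and C
wc -l /var/lib/glusterd/geo-replication/common_secret.pem.pub

# Compare the on-disk peer state glusterd keeps for each peer; an entry whose
# state value differs from the same peer's entry on the other nodes is suspect
grep -H '^state' /var/lib/glusterd/peers/*
```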
--
Cyril Peponnet

On Jan 20, 2015, at 9:38 PM, Aravinda <avish...@redhat.com> wrote:

On 01/20/2015 11:01 PM, PEPONNET, Cyril N (Cyril) wrote:

> Hi,
>
> I'm ready for new testing. I deleted the geo-rep session between master and slave and removed the corresponding lines from the authorized_keys file on the slave. I also removed the common secret pem from the slave and from the master. There is only gsyncd_template.conf left in /var/lib/glusterd/geo-replication now.
>
> Here is our setup:
> Site A: gluster, 3 nodes
> Site B: gluster, 1 node (for now; a second will come).
>
> I can issue gluster system:: execute gsec_create. What should I check?

common_secret.pem.pub is created in /var/lib/glusterd/geo-replication/common_secret.pem.pub, which should contain the public keys from all Master nodes (Site A). It should match the contents of /var/lib/glusterd/geo-replication/secret.pem.pub and /var/lib/glusterd/geo-replication/tar_ssh.pem.pub.

> Then gluster volume geo-replication geo_test slave::geo_test create push-pem force (force is needed because the slave vol is smaller than the master vol). What should I check?

Check for any errors in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log for an RPM installation, or in /var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log for a source installation. In case of any errors related to hook execution, run directly the hook command copied from the log. From your previous mail I understand there is some issue while executing the hook script. I will look into the issue in the hook script.

> I want to use change_detector changelog and not rsync, btw.

change_detector is the crawling mechanism. Available options are changelog and xsync (xsync is FS crawl). The sync mechanisms available are rsync and tarssh.

> Can you guide me to set this up, and also help debug why it's not working out of the box? If needed I can get in touch with you through IRC.

Sure. IRC nickname is aravindavk.

> Thanks for your help.
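A small footnote on the options above: once the session exists, they are applied through `config` on the master side. A rough sketch (change_detector is the name used in this discussion; use_tarssh is the knob I would expect for the tarssh sync mechanism in this release family, so please verify the exact name with the bare config command):

```bash
# List the current settings of the geo-rep session
gluster volume geo-replication geo_test slave::geo_test config

# Crawl mechanism: changelog or xsync (FS crawl), as described above
gluster volume geo-replication geo_test slave::geo_test config change_detector changelog

# Sync mechanism: switch from rsync to tar over ssh
# (option name is an assumption for this version; check the config output first)
gluster volume geo-replication geo_test slave::geo_test config use_tarssh true
```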
--
regards
Aravinda
http://aravindavk.in

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users