Update: it seems the Active node is finally fixed, but the rsync processes are running from nodeA (so I don't really understand the master notion), and nodeA is the most heavily used node, so its load average becomes dangerously high.

How can I force the geo-replication to be started from a specific node (master)? I still don't understand why I have 3 masters…
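For reference, this is how I am checking which node is doing the actual syncing and what it costs there (a quick sketch; the process-name pattern in the grep is only my assumption about what the geo-rep workers are called):

```bash
# The row marked "Active" in the status output is the brick that actually syncs
gluster volume geo-replication myvol slaveA::myvol status detail

# On the loaded node (nodeA here), look for the geo-rep worker / rsync processes
# and the resulting load average (adjust the grep pattern if the names differ)
ps aux | grep -E '[g]syncd|[r]sync'
uptime
```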
--
Cyril Peponnet

On Feb 2, 2015, at 11:00 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

But now I have a strange issue. After creating the geo-rep session and starting it (from nodeB):

[root@nodeB]# gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeA          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeC          myvol         /export/raid/myvol    slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0

[root@nodeB]# gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeA          myvol         /export/raid/myvol    slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0
nodeC          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0

[root@nodeB]# gluster vol geo-replication myvol slaveA::myvol status detail

MASTER NODE    MASTER VOL    MASTER BRICK          SLAVE            STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nodeB          myvol         /export/raid/myvol    slaveA::myvol    Active     N/A                  Hybrid Crawl    0              8191             0                0                  0
nodeA          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0
nodeC          myvol         /export/raid/myvol    slaveA::myvol    Passive    N/A                  N/A             0              0                0                0                  0

So:

1/ Why are there 3 master nodes? nodeB should be the only master node.
2/ Why does the Active role keep moving from one node to another, turn by turn?

Thanks

--
Cyril Peponnet

On Feb 2, 2015, at 10:40 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

For the record, after adding operating-version=2 on every node (A, B, C) AND on the slave node, the commands are working.
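In case it helps, this is roughly what that came down to on each node (a minimal sketch, assuming an RPM install where glusterd keeps its state in /var/lib/glusterd and runs as the glusterd service):

```bash
# Check the op-version advertised by glusterd on every master node and on the slave
grep operating-version /var/lib/glusterd/glusterd.info

# If the line is missing (or lower than 2), add it and restart glusterd so that the
# "copy file" / "execute" system commands used by geo-rep create are accepted
echo "operating-version=2" >> /var/lib/glusterd/glusterd.info
service glusterd restart
```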
--
Cyril Peponnet

On Feb 2, 2015, at 9:46 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

More information here: I updated the state of the peer in the uuid file located in /var/lib/glusterd/peers from state 10 to state 3 (as it is on the other node), and now the node is in the cluster. gluster system:: execute gsec_create now creates a proper file on the master node with every node's key in it.

From there I tried to create my geo-replication session between the master (nodeB) and slaveA:

gluster volume geo-replication myvol slaveA::myvol create push-pem force

From slaveA I got these error messages in the logs:

[2015-02-02 17:19:04.754809] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:19:04.754890] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513547] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:19:07.513632] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513660] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.

On slaveA I have the common pem file transferred into /var/lib/glusterd/geo-replication/ with the keys of my 3 source-site nodes, but /root/.ssh/authorized_keys is not populated with this file.

From the log I saw that there is a call to a script:

/var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh --volname=myvol is_push_pem=1 pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub slave_ip=slaveA

In this script the following is done:

```
scp $pub_file $slave_ip:$pub_file_tmp
ssh $slave_ip "mv $pub_file_tmp $pub_file"
ssh $slave_ip "gluster system:: copy file /geo-replication/common_secret.pem.pub > /dev/null"
ssh $slave_ip "gluster system:: execute add_secret_pub > /dev/null"
```

The first 2 lines pass, the third fails, so the fourth is never executed.

Third command on slaveA:

# gluster system:: copy file /geo-replication/common_secret.pem.pub
One or more nodes do not support the required op version.
# gluster peer status
Number of Peers: 0

From the logs:

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:43:29.242524] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:43:29.242610] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.

For now I have only one node on my remote site. Anyway, since this step only copies the file across all the cluster members, I can live without it.

The fourth command is not working either:

[root@slaveA geo-replication]# gluster system:: execute add_secret_pub
[2015-02-02 17:44:49.123326] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123381] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.123568] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123588] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.306482] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now

==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:44:49.307921] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:44:49.308009] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:44:49.308038] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.

==> /var/log/glusterfs/cli.log <==
[2015-02-02 17:44:49.308493] I [input.c:36:cli_batch] 0-: Exiting with: -1

I have only one node… I don't understand the meaning of the error: "One or more nodes do not support the required op version."
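For completeness, the whole hook can also be re-run by hand with tracing to see exactly which of the quoted steps breaks (same script path and arguments as copied from the log above; bash -x simply prints each command as it runs):

```bash
# Re-run the geo-rep create post hook manually, tracing every command, to see
# which of the four quoted steps (scp / mv / copy file / add_secret_pub) fails
bash -x /var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh \
    --volname=myvol is_push_pem=1 \
    pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub \
    slave_ip=slaveA
```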
--
Cyril Peponnet

On Feb 2, 2015, at 8:49 AM, PEPONNET, Cyril N (Cyril) <cyril.pepon...@alcatel-lucent.com> wrote:

Every node is connected:

[root@nodeA geo-replication]# gluster peer status
Number of Peers: 2

Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)

Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)

[root@nodeB ~]# gluster peer status
Number of Peers: 2

Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)

Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer is connected and Accepted (Connected)

[root@nodeC geo-replication]# gluster peer status
Number of Peers: 2

Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer in Cluster (Connected)

Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)

The only difference is the state "Peer is connected and Accepted (Connected)" reported by nodeB about nodeA.

When I execute gluster system:: execute gsec_create from node A or C, the common pem file contains the keys of all 3 nodes. But from nodeB, I only get the keys for nodeB and nodeC. This is unfortunate, as I am trying to launch the geo-replication job from nodeB (the master).

--
Cyril Peponnet

On Feb 2, 2015, at 2:07 AM, Aravinda <avish...@redhat.com> wrote:

Looks like node C is in a disconnected state. Please let us know the output of `gluster peer status` from all the master nodes and slave nodes.

--
regards
Aravinda

On 01/22/2015 12:27 AM, PEPONNET, Cyril N (Cyril) wrote:

So, on the master node of my 3-node setup:

1) gluster system:: execute gsec_create

In /var/lib/glusterd/geo-replication/common_secret.pem.pub I have the pem pub keys from master node A and node B (not node C). On node C I don't have anything in /var/lib/glusterd/geo-replication/ except the gsyncd template config. So here I have an issue.

The only error I see on node C is:

[2015-01-21 18:36:41.179601] E [rpc-clnt.c:208:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x23 sent = 2015-01-21 18:26:33.031937. timeout = 600 for xx.xx.xx.xx:24007

On node A, the cli.log looks like:

[2015-01-21 18:49:49.878905] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-01-21 18:49:49.878947] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-01-21 18:49:49.879085] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-01-21 18:49:49.879095] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-01-21 18:49:49.951835] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2015-01-21 18:49:49.972143] I [input.c:36:cli_batch] 0-: Exiting with: 0

If I run gluster system:: execute gsec_create on node C or node B, the common pem key file contains the pub keys of all 3 nodes. So somehow node A is unable to get the key from node C. Let's try to fix this one before going further.
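Two quick checks I am running on each master node while digging into this (plain shell; the paths are the standard /var/lib/glusterd locations already mentioned in this thread):

```bash
# After 'gluster system:: execute gsec_create', the common file should contain
# one public key per master node (3 expected here); compare the count on A, B and C
wc -l /var/lib/glusterd/geo-replication/common_secret.pem.pub

# Compare the on-disk peer state glusterd keeps for each peer; an entry whose
# state value differs from the same peer's entry on the other nodes is suspect
grep -H '^state' /var/lib/glusterd/peers/*
```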
--
Cyril Peponnet

On Jan 20, 2015, at 9:38 PM, Aravinda <avish...@redhat.com> wrote:

On 01/20/2015 11:01 PM, PEPONNET, Cyril N (Cyril) wrote:

> Hi,
>
> I'm ready for new testing. I deleted the geo-rep session between master and slave and removed the corresponding lines from the authorized_keys file on the slave. I also removed the common secret pem from the slave and from the master. There is only gsyncd_template.conf left in /var/lib/glusterd/geo-replication now.
>
> Here is our setup:
> Site A: gluster, 3 nodes
> Site B: gluster, 1 node (for now; a second will come).
>
> I can issue gluster system:: execute gsec_create. What should I check?

common_secret.pem.pub is created in /var/lib/glusterd/geo-replication/common_secret.pem.pub, which should contain the public keys from all Master nodes (Site A). It should match the contents of /var/lib/glusterd/geo-replication/secret.pem.pub and /var/lib/glusterd/geo-replication/tar_ssh.pem.pub.

> Then gluster volume geo-replication geo_test slave::geo_test create push-pem force (force is needed because the slave vol is smaller than the master vol). What should I check?

Check for any errors in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log for an RPM installation, or in /var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log for a source installation. In case of any errors related to hook execution, run directly the hook command copied from the log. From your previous mail I understand there is some issue while executing the hook script. I will look into the issue in the hook script.

> I want to use change_detector changelog and not rsync, btw.

change_detector is the crawling mechanism. Available options are changelog and xsync (xsync is FS crawl). The sync mechanisms available are rsync and tarssh.

> Can you guide me to set this up, and also help debug why it's not working out of the box? If needed I can get in touch with you through IRC.

Sure. IRC nickname is aravindavk.

> Thanks for your help.
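A small footnote on the options above: once the session exists, they are applied through `config` on the master side. A rough sketch (change_detector is the name used in this discussion; use_tarssh is the knob I would expect for the tarssh sync mechanism in this release family, so please verify the exact name with the bare config command):

```bash
# List the current settings of the geo-rep session
gluster volume geo-replication geo_test slave::geo_test config

# Crawl mechanism: changelog or xsync (FS crawl), as described above
gluster volume geo-replication geo_test slave::geo_test config change_detector changelog

# Sync mechanism: switch from rsync to tar over ssh
# (option name is an assumption for this version; check the config output first)
gluster volume geo-replication geo_test slave::geo_test config use_tarssh true
```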
--
regards
Aravinda
http://aravindavk.in

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users