Re: [Gluster-users] trashcan on dist. repl. volume with geo-replication

2018-03-13 Thread Dietmar Putz

Hi Kotresh,

...another test, this time with the trashcan enabled on the master only. As 
in the test before, it's gfs 3.12.6 on ubuntu 16.04.4.
The geo-replication error appeared again, and disabling the trashcan does not 
change anything.
As in the former test, the error appears when I try to list files in the 
trashcan.
The gfid shown in the log belongs to a directory in the trashcan with just 
one file in it...like in the former test.


[2018-03-13 11:08:30.777489] E [master(/brick1/mvol1):784:log_failures] 
_GMaster: ENTRY FAILED  data=({'uid': 0, 'gfid': 
'71379ee0-c40a-49db-b3ed-9f3145ed409a', 'gid': 0, 'mode': 16877, 
'entry': '.gfid/4f59c068-6c77-40f2-b556-aa761834caf1/dir1', 'op': 
'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
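
(For reference, a sketch of how a gfid from such a log line can be mapped back 
to a path; this assumes it is run directly on one of the brick roots, the gfid 
value is taken from the log entry above, and the exact trashcan path is only a 
placeholder:)

# directories are indexed under <brick>/.glusterfs/<aa>/<bb>/<full-gfid> as
# symlinks whose target reveals the parent gfid and the directory name
root@gl-node1:~# ls -l /brick1/mvol1/.glusterfs/71/37/71379ee0-c40a-49db-b3ed-9f3145ed409a

# the reverse direction: read the gfid xattr of a known path on the brick
root@gl-node1:~# getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/<path-to-dir1>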


Below are the setup, further information and all activities.
Is there anything else I could test or check...?

A general question: is there a recommendation for the use of the trashcan 
feature in geo-replication environments...?
For my use case it's not necessary to activate it on the slave...but does it 
need to be activated on both master and slave?
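
(For reference, features.trash is a per-volume option, so it can be checked and 
toggled independently on the master and the slave volume; a sketch using the 
volume name from this thread:)

root@gl-node1:~# gluster volume get mvol1 features.trash      # current setting on the master
root@gl-node5:~# gluster volume get mvol1 features.trash      # current setting on the slave
root@gl-node5:~# gluster volume set mvol1 features.trash on   # only if it should also be active on the slave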


best regards

Dietmar


master volume :
root@gl-node1:~# gluster volume info mvol1

Volume Name: mvol1
Type: Distributed-Replicate
Volume ID: 7590b6a0-520b-4c51-ad63-3ba5be0ed0df
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gl-node1-int:/brick1/mvol1
Brick2: gl-node2-int:/brick1/mvol1
Brick3: gl-node3-int:/brick1/mvol1
Brick4: gl-node4-int:/brick1/mvol1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.trash-max-filesize: 2GB
features.trash: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
root@gl-node1:~#


slave volume :
root@gl-node5:~# gluster volume info mvol1

Volume Name: mvol1
Type: Distributed-Replicate
Volume ID: aba4e057-7374-4a62-bcd7-c1c6f71e691b
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gl-node5-int:/brick1/mvol1
Brick2: gl-node6-int:/brick1/mvol1
Brick3: gl-node7-int:/brick1/mvol1
Brick4: gl-node8-int:/brick1/mvol1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
root@gl-node5:~#

root@gl-node1:~# gluster volume geo-replication mvol1 
gl-node5-int::mvol1 config

special_sync_mode: partial
state_socket_unencoded: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.socket
gluster_log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no 
-i /var/lib/glusterd/geo-replication/secret.pem

ignore_deletes: false
change_detector: changelog
gluster_command_dir: /usr/sbin/
state_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status

remote_gsyncd: /nonexistent/gsyncd
log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.log
changelog_log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log

socketdir: /var/run/gluster
working_dir: 
/var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
state_detail_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status

use_meta_volume: true
ssh_command_tar: ssh -oPasswordAuthentication=no 
-oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
pid_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
georep_session_working_dir: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/

access_mount: true
gluster_params: aux-gfid-mount acl
root@gl-node1:~#

root@gl-node1:~# gluster volume geo-replication mvol1 
gl-node5-int::mvol1 status


MASTER NODE     MASTER VOL    MASTER BRICK     SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------
gl-node1-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node5-int    Active     Changelog Crawl    2018-03-13 09:43:46
gl-node4-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node8-int    Active     Changelog Crawl    2018-03-13 09:43:47
gl-node2-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node6-int    Passive    N/A                N/A
gl-node3-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node7-int    Passive    N/A                N/A

root@gl-node1:~#

volumes are locally mounted as :

gl-node1:/mvol1 20G   65M   20G 1% /m_vol
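
(For reference, a fuse mount like the one above would typically be created with 
something along the lines of the following; the mount point /m_vol is taken 
from the df output above:)

root@gl-node1:~# mount -t glusterfs gl-node1:/mvol1 /m_vol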

Re: [Gluster-users] trashcan on dist. repl. volume with geo-replication

2018-03-12 Thread Kotresh Hiremath Ravishankar
Hi Dietmar,

I am trying to understand the problem and have a few questions.

1. Is the trashcan enabled only on the master volume?
2. Did the 'rm -rf' done on the master volume get synced to the slave?
3. If the trashcan is disabled, does the issue go away?
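
(A sketch of how these could be checked, using the volume and path names from 
this thread; the slave-side mount point is a placeholder:)

# 1. trash setting, to be run on a master node and on a slave node
gluster volume get mvol1 features.trash
# 2. whether the 'rm -rf' was propagated to the slave
ls -la /<slave-mount>/test1/b1
# 3. compare behaviour with the trashcan turned off on the master
gluster volume set mvol1 features.trash off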

The geo-rep error just says that it failed to create the directory
"Oracle_VM_VirtualBox_Extension" on the slave.
Usually this would be because of a gfid mismatch, but I don't see that in your
case. So I am a little more interested in the present state of the geo-rep: is
it still throwing the same errors and still failing to sync the same directory?
If so, does the parent 'test1/b1' exist on the slave?
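
(A gfid mismatch could also be ruled out by hand by comparing the trusted.gfid 
xattr of the directory on a master brick and on the corresponding slave brick; 
a sketch, where the slave-side path is a placeholder since it depends on where 
geo-rep created the entry:)

getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1                       # on a master brick
getfattr -n trusted.gfid -e hex /brick1/mvol1/<corresponding-path-on-slave-brick>      # on a slave brick, if present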

And doing an ls on the trashcan should not affect geo-rep. Is there an easy
reproducer for this?


Thanks,
Kotresh HR

On Mon, Mar 12, 2018 at 10:13 PM, Dietmar Putz wrote:

> Hello,
>
> in regard to
> https://bugzilla.redhat.com/show_bug.cgi?id=1434066
> I have run into another issue when using the trashcan feature on a
> dist. repl. volume running geo-replication (gfs 3.12.6 on ubuntu 16.04.4),
> e.g. removing an entire directory with subfolders :
> tron@gl-node1:/myvol-1/test1/b1$ rm -rf *
>
> afterwards, listing files in the trashcan :
> tron@gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/
>
> leads to an outage of the geo-replication.
> error on master-01 and master-02 :
>
> [2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl]
> _GMaster: slave's time stime=(1520861818, 0)
> [2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures]
> _GMaster: ENTRY FAILEDdata=({'uid': 0, 'gfid':
> 'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 'entry':
> '.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension',
> 'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
> [2018-03-12 13:37:14.835911] E 
> [syncdutils(/brick1/mvol1):299:log_raise_exception]
> : The above directory failed to sync. Please fix it to proceed further.
>
>
> both gfids of the directories as shown in the log :
> brick1/mvol1/.trashcan/test1/b1 0x5531bd64ac50462b943ec0bf1c52f52c
> brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension
> 0xc38f75e3194a4d22909450ac8f8756e7
>
> the directory shown contains just one file, which is stored on gl-node3 and
> gl-node4, while node1 and node2 are in geo-replication error.
> since the filesize limitation of the trashcan is obsolete, I'm really
> interested in using the trashcan feature, but I'm concerned it will interrupt
> the geo-replication entirely.
> has anybody else been faced with this situation...any hints,
> workarounds... ?
>
> best regards
> Dietmar Putz
>
>
> root@gl-node1:~/tmp# gluster volume info mvol1
>
> Volume Name: mvol1
> Type: Distributed-Replicate
> Volume ID: a1c74931-568c-4f40-8573-dd344553e557
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gl-node1-int:/brick1/mvol1
> Brick2: gl-node2-int:/brick1/mvol1
> Brick3: gl-node3-int:/brick1/mvol1
> Brick4: gl-node4-int:/brick1/mvol1
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> geo-replication.indexing: on
> features.trash-max-filesize: 2GB
> features.trash: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> root@gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1
> gl-node5-int::mvol1 config
> special_sync_mode: partial
> gluster_log_file: /var/log/glusterfs/geo-replica
> tion/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%
> 2F%2F127.0.0.1%3Amvol1.gluster.log
> ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
> /var/lib/glusterd/geo-replication/secret.pem
> change_detector: changelog
> use_meta_volume: true
> session_owner: a1c74931-568c-4f40-8573-dd344553e557
> state_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
> monitor.status
> gluster_params: aux-gfid-mount acl
> remote_gsyncd: /nonexistent/gsyncd
> working_dir: /var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.
> 178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
> state_detail_file: /var/lib/glusterd/geo-replicat
> ion/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.
> 178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status
> gluster_command_dir: /usr/sbin/
> pid_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
> monitor.pid
> georep_session_working_dir: /var/lib/glusterd/geo-replicat
> ion/mvol1_gl-node5-int_mvol1/
> ssh_command_tar: ssh -oPasswordAuthentication=no
> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replicat
> ion/tar_ssh.pem
> master.stime_xattr_name: trusted.glusterfs.a1c74931-568
> c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
> changelog_log_file: /var/log/glusterfs/geo-replica
> tion/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%
> 2F%2F127.0.0.1%3Amvol1-changes.log
> socketdir: /var/run/gluster
> volume_id: a1c74931-568c-4f40-8573-dd344553e557
> ignore_deletes: false
> 

[Gluster-users] trashcan on dist. repl. volume with geo-replication

2018-03-12 Thread Dietmar Putz

Hello,

in regard to
https://bugzilla.redhat.com/show_bug.cgi?id=1434066
I have run into another issue when using the trashcan feature on a 
dist. repl. volume running geo-replication (gfs 3.12.6 on ubuntu 16.04.4),

e.g. removing an entire directory with subfolders :
tron@gl-node1:/myvol-1/test1/b1$ rm -rf *

afterwards, listing files in the trashcan :
tron@gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/

leads to an outage of the geo-replication.
error on master-01 and master-02 :

[2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl] 
_GMaster: slave's time stime=(1520861818, 0)
[2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures] 
_GMaster: ENTRY FAILED    data=({'uid': 0, 'gfid': 
'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 
'entry': 
'.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension', 
'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
[2018-03-12 13:37:14.835911] E 
[syncdutils(/brick1/mvol1):299:log_raise_exception] : The above 
directory failed to sync. Please fix it to proceed further.



both gfids of the directories as shown in the log :
brick1/mvol1/.trashcan/test1/b1 0x5531bd64ac50462b943ec0bf1c52f52c
brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension 
0xc38f75e3194a4d22909450ac8f8756e7
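
(For reference, gfid values in this hex form can be read directly on a brick 
with getfattr, e.g. for the two paths listed above:)

getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1
getfattr -n trusted.gfid -e hex /brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension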


the directory shown contains just one file, which is stored on gl-node3 
and gl-node4, while node1 and node2 are in geo-replication error.
since the filesize limitation of the trashcan is obsolete, I'm really 
interested in using the trashcan feature, but I'm concerned it will 
interrupt the geo-replication entirely.
has anybody else been faced with this situation...any hints, 
workarounds... ?
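
(Not a verified fix for this particular trashcan problem, just the generic way 
a faulted session is usually bounced once the offending entry has been dealt 
with on the slave side; a sketch using the session names from this thread:)

root@gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 stop
# ...inspect/fix the directory reported in the log on the slave volume...
root@gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 start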


best regards
Dietmar Putz


root@gl-node1:~/tmp# gluster volume info mvol1

Volume Name: mvol1
Type: Distributed-Replicate
Volume ID: a1c74931-568c-4f40-8573-dd344553e557
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gl-node1-int:/brick1/mvol1
Brick2: gl-node2-int:/brick1/mvol1
Brick3: gl-node3-int:/brick1/mvol1
Brick4: gl-node4-int:/brick1/mvol1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
features.trash-max-filesize: 2GB
features.trash: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

root@gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1 
gl-node5-int::mvol1 config

special_sync_mode: partial
gluster_log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no 
-i /var/lib/glusterd/geo-replication/secret.pem

change_detector: changelog
use_meta_volume: true
session_owner: a1c74931-568c-4f40-8573-dd344553e557
state_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status

gluster_params: aux-gfid-mount acl
remote_gsyncd: /nonexistent/gsyncd
working_dir: 
/var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
state_detail_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status

gluster_command_dir: /usr/sbin/
pid_file: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
georep_session_working_dir: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
ssh_command_tar: ssh -oPasswordAuthentication=no 
-oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
master.stime_xattr_name: 
trusted.glusterfs.a1c74931-568c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
changelog_log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log

socketdir: /var/run/gluster
volume_id: a1c74931-568c-4f40-8573-dd344553e557
ignore_deletes: false
state_socket_unencoded: 
/var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.socket
log_file: 
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.log

access_mount: true
root@gl-node1:/myvol-1/test1#

--

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users