[Gluster-users] Rebalance + VM corruption - current status and request for feedback

2017-05-16 Thread Krutika Dhananjay
Hi,

In the past couple of weeks, we've sent the following fixes concerning VM
corruption upon doing rebalance -
https://review.gluster.org/#/q/status:merged+project:glusterfs+branch:master+topic:bug-1440051

These fixes are very much part of the latest 3.10.2 release.

Satheesaran at Red Hat has also verified that they work, and he is not
seeing the corruption issues anymore.

I'd like to hear feedback from the users themselves on these fixes (on your
test environments to begin with) before even changing the status of the bug
to CLOSED.

Although 3.10.2 has a patch that prevents rebalance sub-commands from being
executed on sharded volumes, you can override the check by using the
'force' option.

For example,

# gluster volume rebalance myvol start force
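
Once started, progress can be checked with the usual status sub-command
(using the same placeholder volume name as above):

# gluster volume rebalance myvol status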

Very much looking forward to hearing from you all.

Thanks,
Krutika
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 120k context switches on GlusterFS nodes

2017-05-16 Thread Ravishankar N

On 05/16/2017 11:13 PM, mabi wrote:
Today I even saw up to 400k context switches for around 30 minutes on
my two-node replica... Does anyone else see such high context switches
on their GlusterFS nodes?


I am wondering what is "normal" and if I should be worried...

 Original Message 
Subject: 120k context switches on GlusterFS nodes
Local Time: May 11, 2017 9:18 PM
UTC Time: May 11, 2017 7:18 PM
From: m...@protonmail.ch
To: Gluster Users 

Hi,

Today I noticed that for around 50 minutes my two GlusterFS 3.8.11
nodes had a very high number of context switches, around 120k.
Usually the average is more like 1k-2k. So I checked what was
happening, and there were simply more users accessing (downloading)
their files at the same time. These are directories with typical
cloud files, meaning files of all sizes ranging from a few kB to a
few MB, and a lot of them, of course.


I have never seen such a high number of context switches in my entire
life, so I wanted to ask whether this is normal or to be expected. I do
not find any signs of errors or warnings in any log files.




What context switches are you referring to (syscall context switches on
the bricks?), and how did you measure this?
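
(For reference, one common way to watch the system-wide context-switch
rate is the standard procps/sysstat tooling; a minimal sketch, assuming
the numbers quoted above are system-wide counters:)

# context switches per second ("cs" column), sampled every 2 seconds
vmstat 2
# cswch/s per interval, 5 samples
sar -w 2 5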

-Ravi
My volume is a replicated volume on two nodes with ZFS as the
underlying filesystem, and the volume is mounted using FUSE on the
client (the cloud server). On that cloud server the glusterfs process
was using quite a lot of system CPU, but that server (a VM) only has 2
vCPUs, so maybe I should increase the number of vCPUs...


Any ideas or recommendations?



Regards,
M.




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 120k context switches on GlusterFS nodes

2017-05-16 Thread mabi
Today I even saw up to 400k context switches for around 30 minutes on my
two-node replica... Does anyone else see such high context switches on
their GlusterFS nodes?

I am wondering what is "normal" and if I should be worried...

 Original Message 
Subject: 120k context switches on GlusterFS nodes
Local Time: May 11, 2017 9:18 PM
UTC Time: May 11, 2017 7:18 PM
From: m...@protonmail.ch
To: Gluster Users 

Hi,

Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a
very high number of context switches, around 120k. Usually the average is more
like 1k-2k. So I checked what was happening, and there were simply more users
accessing (downloading) their files at the same time. These are directories
with typical cloud files, meaning files of all sizes ranging from a few kB to a
few MB, and a lot of them, of course.

I have never seen such a high number of context switches in my entire life, so
I wanted to ask whether this is normal or to be expected. I do not find any
signs of errors or warnings in any log files.

My volume is a replicated volume on two nodes with ZFS as the underlying
filesystem, and the volume is mounted using FUSE on the client (the cloud
server). On that cloud server the glusterfs process was using quite a lot of
system CPU, but that server (a VM) only has 2 vCPUs, so maybe I should increase
the number of vCPUs...

Any ideas or recommendations?

Regards,
M.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2017-05-16 Thread Joe Julian

On 04/13/17 23:50, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N wrote:


Hi Pat,

I'm assuming you are using the gluster native (fuse) mount. If it
helps, you could try mounting it via gluster NFS (gnfs) and then
see if there is an improvement in speed. Fuse mounts are slower
than gnfs mounts, but you get the benefit of avoiding a single
point of failure. (Unlike fuse mounts, if the gluster node
hosting the gnfs server goes down, all mounts done using that
node will fail.) For fuse mounts, you could try tweaking the
write-behind xlator settings to see if it helps; see the
performance.write-behind and performance.write-behind-window-size
options in `gluster volume set help`. Of course, even for gnfs
mounts, you can achieve fail-over by using CTDB.
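
As a rough sketch (the volume name, server and mount point below are
placeholders, and the window size is only an example value, not a
recommendation):

# tune the write-behind xlator on the volume
gluster volume set myvol performance.write-behind on
gluster volume set myvol performance.write-behind-window-size 4MB

# mount via gluster NFS (NFSv3) instead of fuse, for comparison
mount -t nfs -o vers=3 server1:/myvol /mnt/myvol-nfs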


Ravi,
  Do you have any data that suggests fuse mounts are slower than 
gNFS servers?


Pat,
  I see that I am late to the thread, but do you happen to have 
"profile info" of the workload?




I have done actual testing. For directory ops, NFS is faster due to the 
default cache settings in the kernel. For raw throughput, or ops on an 
open file, fuse is faster.


I have yet to test this but I expect with the newer caching features in 
3.8+, even directory op performance should be similar to nfs and more 
accurate.


You can follow 
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/ 
to get the information.
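
In short, the profiling workflow from that guide looks roughly like
this (the volume name is a placeholder):

gluster volume profile myvol start
# ... run the workload (e.g. the dd tests) ...
gluster volume profile myvol info
gluster volume profile myvol stop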



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk
when compared to writing to an NFS disk. Specifically when using
dd (data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or
anything else. The hardware is (literally) the same:

  * one server with 70 hard disks and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spare

Some additional information and more test results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3
3108 [Invader] (rev 02)



Create the file to /gdata (gluster)
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, 546 MB/s

Create the file to /home (ext4)
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, 1.5 GB/s - 3 times as fast


Copy from /gdata to /gdata (gluster to gluster)
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, 10.4 MB/s - realllyyy slooowww


Copy from /gdata to /gdata, 2nd time (gluster to gluster)
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, 11.3 MB/s - realllyyy slooowww again


Copy from /home to /home (ext4 to ext4)
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, 297 MB/s - 30 times as fast


Copy from /home to /home (ext4 to ext4)
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, 251 MB/s - 30 times as fast


As a test, can we copy data directly to the xfs mountpoint
(/mnt/brick1) and bypass gluster?
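
Something along these lines might serve as that test; the scratch
filename is just an example, and since writing stray files into a live
brick directory is normally avoided, it would only be a temporary test
file that gets removed right away:

[root@mseas-data2 ~]# dd if=/dev/zero of=/mnt/brick1/ddtest.tmp bs=1M count=1000 conv=fdatasync
[root@mseas-data2 ~]# rm /mnt/brick1/ddtest.tmp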


Any help you could give us would be appreciated.

Thanks

-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:pha...@mit.edu 

Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] Slow write times to gluster disk

2017-05-16 Thread Joe Julian

On 05/10/17 14:18, Pat Haley wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks, I tried the dd 
test writing to 
/.glusterfs/. The results 
without oflag=sync were 1.6 Gb/s (faster than gluster but not as fast 
as I was expecting given the 1.2 Gb/s to the no-gluster area w/ fewer 
disks).


Pat



Is that true for every disk? If you're choosing the same filename every 
time for your dd test, you're likely only doing that test against one 
disk. If that disk is slow, you would get the same results every time 
despite other disks performing normally.
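
To spread the test across filenames (and therefore potentially across
bricks, since DHT places files by name), a loop like this could be
used; the path, name pattern and sizes are only placeholders:

for i in 1 2 3 4 5 6 7 8; do
    dd if=/dev/zero of=/gdata/ddtest_$i bs=1M count=1024 conv=fdatasync
done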




On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley wrote:



Hi Pranith,

Not entirely sure (this isn't my area of expertise). I'll run
your answer by some other people who are more familiar with this.

I am also uncertain about how to interpret the results when we
also add the dd tests writing to the /home area (no gluster,
still on the same machine)

  * dd test without oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount:  390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount:  200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4 disks while each
brick of the gluster area is a RAID-6 of 32 disks, I would
naively expect the writes to the gluster area to be roughly 8x
faster than to the non-gluster.


I think a better test is to try and write to a file using nfs without
any gluster to a location that is not inside the brick but some other
location that is on the same disk(s). If you are mounting the partition
as the brick, then we can write to a file inside the .glusterfs
directory, something like
/.glusterfs/.
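
As a sketch, assuming the brick is mounted at /mnt/brick1 (a
placeholder for the actual brick mount point) and using a throwaway
filename:

dd if=/dev/zero of=/mnt/brick1/.glusterfs/ddtest.tmp bs=1M count=1000 conv=fdatasync
rm /mnt/brick1/.glusterfs/ddtest.tmp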



I still think we have a speed issue, I can't tell if fuse vs nfs
is part of the problem.


I got interested in this post because I read that fuse speed is lower
than nfs speed, which is counter-intuitive to my understanding, so I
wanted clarification. Now that I have my clarification, where fuse
outperformed nfs without sync, we can resume testing as described
above and try to find what the issue is. Based on your email ID I am
guessing you are in Boston and I am in Bangalore, so if you are okay
with this debugging taking multiple days because of the timezones, I
will be happy to help. Please be a bit patient with me; I am under a
release crunch, but I am very curious about the problem you posted.


  Was there anything useful in the profiles?


Unfortunately the profiles didn't help me much. I think we are
collecting the profiles from an active volume, so they contain a lot
of information that does not pertain to dd, which makes it difficult
to isolate dd's contribution. So I went through your post again and
found something I hadn't paid much attention to earlier, i.e.
oflag=sync, so I did my own tests on my setup with FUSE and sent that
reply.



Pat



On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:

Okay good. At least this validates my doubts. Handling O_SYNC in
gluster NFS and fuse is a bit different.
When an application opens a file with O_SYNC on a fuse mount, each
write syscall has to be written to disk as part of that syscall,
whereas in the case of NFS there is no concept of open. NFS performs
the write through a handle saying it needs to be a synchronous write,
so the write() syscall is performed first and then an fsync() is
performed; a write on an fd with O_SYNC therefore becomes write+fsync.
My guess is that when multiple threads do this write+fsync() operation
on the same file, multiple writes are batched together to be written
to disk, so the throughput on the disk increases.

Does it answer your doubts?

On Wed, May 10, 2017 at 9:35 PM, Pat Haley wrote:


Without the oflag=sync and only a single test of each, the
FUSE is going faster than NFS:

FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096
bs=1048576 of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s


NFS
mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576
of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s



On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:

Could you let me know the speed without oflag=sync on both
the mounts? No need to collect profiles.

On Wed, May 10, 2017 at 9:17 PM, Pat Haley wrote:


Here is what I see 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-16 Thread Pat Haley


Hi Pranith,

Sorry for the delay.  I never received your reply (but I did receive
Ben Turner's follow-up to it).  So we tried to create a gluster
volume under /home using different variations of


gluster volume create test-volume mseas-data2:/home/gbrick_test_1 
mseas-data2:/home/gbrick_test_2 transport tcp


However we keep getting errors of the form

Wrong brick type: transport, use :

Any thoughts on what we're doing wrong?
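
For what it's worth, the CLI expects the transport keyword (like any
replica/stripe counts) before the brick list, so a variation along
these lines may be what it is asking for; treat it as a sketch rather
than a confirmed fix:

gluster volume create test-volume transport tcp \
    mseas-data2:/home/gbrick_test_1 mseas-data2:/home/gbrick_test_2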

Also, do you have a list of the tests we should be running once we get
this volume created?  Given the time-zone difference it might help if we
can run a small battery of tests and post the results, rather than going
test-post, new test-post, and so on.


Thanks

Pat


On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 9:32 PM, Pat Haley wrote:



Hi Pranith,

The /home partition is mounted as ext4
/home  ext4 defaults,usrquota,grpquota  1 2

The brick partitions are mounted as xfs
/mnt/brick1  xfs defaults0 0
/mnt/brick2  xfs defaults0 0

Will this cause a problem with creating a volume under /home?


I don't think the bottleneck is the disk. Can you run the same tests you
did earlier on your new volume to confirm?



Pat



On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 8:57 PM, Pat Haley wrote:


Hi Pranith,

Unfortunately, we don't have similar hardware for a small
scale test.  All we have is our production hardware.


You said something about the /home partition, which has fewer disks;
we can create a plain distribute volume inside one of those
directories. After we are done, we can remove the setup. What do
you say?


Pat




On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks, I
tried the dd test writing to
/.glusterfs/.
The results without oflag=sync were 1.6 Gb/s (faster
than gluster but not as fast as I was expecting given
the 1.2 Gb/s to the no-gluster area w/ fewer disks).


Okay, then 1.6 Gb/s is what we need to target, considering your
volume is just distribute. Is there any way you can do tests on
similar hardware but at a small scale, just so we can run the
workload and learn more about the bottlenecks in the system? We can
probably try to get the speed to 1.2 Gb/s on the /home partition you
were telling me about yesterday. Let me know if that is something
you are okay with doing.


Pat



On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley wrote:


Hi Pranith,

Not entirely sure (this isn't my area of
expertise). I'll run your answer by some other
people who are more familiar with this.

I am also uncertain about how to interpret the
results when we also add the dd tests writing to
the /home area (no gluster, still on the same machine)

  * dd test without oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount: 390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount: 200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4
disks while each brick of the gluster area is a
RAID-6 of 32 disks, I would naively expect the
writes to the gluster area to be roughly 8x faster
than to the non-gluster.


I think a better test is to try and write to a file
using nfs without any gluster to a location that is not
inside the brick but some other location that is on the
same disk(s). If you are mounting the partition as the
brick, then we can write to a file inside the .glusterfs
directory, something like
/.glusterfs/.


I still think we have a speed issue, I can't tell
if fuse vs nfs is part of the problem.


I got interested in the post because I read that fuse
speed is lesser than nfs speed which is
counter-intuitive to my understanding. So wanted

[Gluster-users] Mount sometimes stops responding during server's MD RAID check sync_action

2017-05-16 Thread Jan Wrona

Hi,

I have three servers in the linked list topology [1], GlusterFS 3.8.10, 
CentOS 7. Each server has two bricks, both on the same XFS filesystem. 
The XFS is constructed over the whole MD RAID device:
md5 : active raid5 sdj1[6] sdh1[8] sde1[2] sdg1[9] sdd1[1] sdi1[5] sdf1[3] sdc1[0]
      6836411904 blocks super 1.2 level 5, 512k chunk, algorithm 2 [8/8] []
      bitmap: 2/8 pages [8KB], 65536KB chunk

Everything works fine until one of the RAID devices starts its regular
check. During the check, the client's mount sometimes completely stops
responding. I'm mounting using the Pacemaker Filesystem OCF RA [2]
with OCF_CHECK_LEVEL=20, which basically tries to write a small status
file to the filesystem every 2 minutes to see if it's OK. But even this
small write operation sometimes times out (2 minutes) during the check.
Pacemaker then remounts the Gluster volume and everything goes back to
normal.


I understand that the RAID check drains a lot of I/O performance,
but the underlying XFS remains responsive (it is slower, of course, but
by far not as much as Gluster). The check intervals on the servers do
not overlap. I've even decreased /proc/sys/dev/raid/speed_limit_max
from the default 200 MB/s to 50 MB/s, but it helped only a little; the
mount still tends to freeze for a few seconds during the check.
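
(For reference, the md speed limits are expressed in KB/s, so capping
the check at roughly 50 MB/s looks something like the following; the
exact value is only an example:)

echo 50000 > /proc/sys/dev/raid/speed_limit_max
# or, equivalently, via sysctl
sysctl -w dev.raid.speed_limit_max=50000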


What are your suggestions to solve this issue?

Regards,
Jan Wrona

[1] 
https://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/
[2] 
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/Filesystem


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] how to restore snapshot LV's

2017-05-16 Thread WoongHee Han
Hi, all!

I erased the VG containing the snapshot LVs related to my gluster volumes,
and then I tried to restore the volume:

1. vgcreate vg_cluster /dev/sdb
2. lvcreate --size=10G --type=thin-pool -n tp_cluster vg_cluster
3. lvcreate -V 5G --thinpool vg_cluster/tp_cluster -n test_vol vg_cluster
4. gluster v stop test_vol
5. getfattr -n trusted.glusterfs.volume-id /volume/test_vol ( in other node)
6. setfattr -n trusted.glusterfs.volume-id -v
 0sKtUJWIIpTeKWZx+S5PyXtQ== /volume/test_vol (already mounted)
7. gluster v start test_vol
8. restart glusterd
9. lvcreate -s vg_cluster/test_vol --setactivationskip=n
--name 6564c50651484d09a36b912962c573df_0
10. lvcreate -s vg_cluster/test_vol --setactivationskip=n
--name ee8c32a1941e4aba91feab21fbcb3c6c_0
11. lvcreate -s vg_cluster/test_vol --setactivationskip=n
--name bf93dc34233646128f0c5f84c3ac1f83_0
12. reboot

It works, but the bricks for the snapshot are not running.

--
~]# gluster snapshot status
Brick Path:   192.225.3.35:
/var/run/gluster/snaps/bf93dc34233646128f0c5f84c3ac1f83/brick1
Volume Group  :   vg_cluster
Brick Running :   No
Brick PID :   N/A
Data Percentage   :   0.22
LV Size   :   5.00g


Brick Path:   192.225.3.36:
/var/run/gluster/snaps/bf93dc34233646128f0c5f84c3ac1f83/brick2
Volume Group  :   vg_cluster
Brick Running :   No
Brick PID :   N/A
Data Percentage   :   0.22
LV Size   :   5.00g


Brick Path:   192.225.3.37:
/var/run/gluster/snaps/bf93dc34233646128f0c5f84c3ac1f83/brick3
Volume Group  :   vg_cluster
Brick Running :   No
Brick PID :   N/A
Data Percentage   :   0.22
LV Size   :   5.00g


Brick Path:   192.225.3.38:
/var/run/gluster/snaps/bf93dc34233646128f0c5f84c3ac1f83/brick4
Volume Group  :   vg_cluster
Brick Running :   Yes
Brick PID :   N/A
Data Percentage   :   0.22
LV Size   :   5.00g

~]# gluster snapshot deactivate t3_GMT-2017.05.15-08.01.37
Deactivating snap will make its data inaccessible. Do you want to continue?
(y/n) y
snapshot deactivate: failed: Pre Validation failed on 192.225.3.36.
Snapshot t3_GMT-2017.05.15-08.01.37 is already deactivated.
Snapshot command failed

~]# gluster snapshot activate t3_GMT-2017.05.15-08.01.37
snapshot activate: failed: Snapshot t3_GMT-2017.05.15-08.01.37 is already
activated

--


How do I restore the snapshot LVs?

My setup consists of four nodes in a distributed-replicated (2x2) volume.


thank you.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Geo-replication 'faulty' status during initial sync, 'Transport endpoint is not connected'

2017-05-16 Thread Tom Fite
Hi all,

I've hit a strange problem with geo-replication.

On gluster 3.10.1, I have set up geo replication between my replicated /
distributed instance and a remote replicated / distributed instance. The
master and slave instances are connected via VPN. Initially the
geo-replication setup was working fine, I had a status of "Active" with
"Changelog crawl" previously after the initial setup, and I confirmed that
files were synced between the two gluster instances.

Something must have changed between then and now, because about a week
after the instance had been online it switched to a "Faulty" status.

[root@master-gfs1 ~]# gluster volume geo-replication gv0
r...@slave-gfs1.tomfite.com::gv0 status

MASTER NODE              MASTER VOL    MASTER BRICK        SLAVE USER    SLAVE                          SLAVE NODE                STATUS     CRAWL STATUS    LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
master-gfs1.tomfite.com  gv0           /data/brick1/gv0    root          slave-gfs1.tomfite.com::gv0    N/A                       Faulty     N/A             N/A
master-gfs1.tomfite.com  gv0           /data/brick2/gv0    root          slave-gfs1.tomfite.com::gv0    N/A                       Faulty     N/A             N/A
master-gfs1.tomfite.com  gv0           /data/brick3/gv0    root          slave-gfs1.tomfite.com::gv0    N/A                       Faulty     N/A             N/A
master-gfs2.tomfite.com  gv0           /data/brick1/gv0    root          slave-gfs1.tomfite.com::gv0    slave-gfs1.tomfite.com    Passive    N/A             N/A
master-gfs2.tomfite.com  gv0           /data/brick2/gv0    root          slave-gfs1.tomfite.com::gv0    slave-gfs1.tomfite.com    Passive    N/A             N/A
master-gfs2.tomfite.com  gv0           /data/brick3/gv0    root          slave-gfs1.tomfite.com::gv0    slave-gfs1.tomfite.com    Passive    N/A             N/A

From the logs (see below) it seems like there is an issue trying to sync files
to the slave, as I get a "Transport is not connected" error when gsyncd
attempts to sync the first set of files.

Here's what I've tried so far:

1. ssh_port is currently configured on a non-standard port. I switched the
port to the standard 22 but observed no change in behavior.
2. I verified that SELinux is disabled on all boxes, and that there are no
firewalls running.
3. The remote_gsyncd setting was set to "/nonexistent/gsyncd", which looked
incorrect, so I changed it to a valid location for that executable,
/usr/libexec/glusterfs/gsyncd (see the config sketch after this list).
4. In an attempt to start the slave from scratch, I removed all files from
the slave and reset the geo-replication instance by deleting and recreating
the session.
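
For reference, geo-replication options like these are usually inspected
and changed with the session's config sub-command; a rough sketch, with
the slave user and host written as placeholders:

gluster volume geo-replication gv0 <slave-user>@<slave-host>::gv0 config
gluster volume geo-replication gv0 <slave-user>@<slave-host>::gv0 config remote_gsyncd /usr/libexec/glusterfs/gsyncd
gluster volume geo-replication gv0 <slave-user>@<slave-host>::gv0 config ssh_port 22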

Debug logs when trying to start geo-replication:

[2017-05-15 16:31:32.940068] I [gsyncd(conf):689:main_i] : Config Set:
session-owner = d37a7455-0b1b-402e-985b-cf1ace4e513e
[2017-05-15 16:31:33.293926] D [monitor(monitor):434:distribute] :
master bricks: [{'host': 'master-gfs1.tomfite.com', 'uuid':
'e0d9d624-5383-4c43-aca4-e946e7de296d', 'dir': '/data/brick1/gv0'},
{'host': 'master-gfs2.tomfite.com', 'uuid':
'bdbb7a18-3ecf-4733-a5df-447d8c712af5', 'dir': '/data/brick1/gv0'},
{'host': 'master-gfs1.tomfite.com', 'uuid':
'e0d9d624-5383-4c43-aca4-e946e7de296d', 'dir': '/data/brick2/gv0'},
{'host': 'master-gfs2.tomfite.com', 'uuid':
'bdbb7a18-3ecf-4733-a5df-447d8c712af5', 'dir': '/data/brick2/gv0'},
{'host': 'master-gfs1.tomfite.com', 'uuid':
'e0d9d624-5383-4c43-aca4-e946e7de296d', 'dir': '/data/brick3/gv0'},
{'host': 'master-gfs2.tomfite.com', 'uuid':
'bdbb7a18-3ecf-4733-a5df-447d8c712af5', 'dir': '/data/brick3/gv0'}]
[2017-05-15 16:31:33.294250] D [monitor(monitor):443:distribute] :
slave SSH gateway: slave-gfs1.tomfite.com
[2017-05-15 16:31:33.424451] D [monitor(monitor):464:distribute] :
slave bricks: [{'host': 'slave-gfs1.tomfite.com', 'uuid':
'c184bc78-cff0-4cef-8c6a-e637ab52b324', 'dir': '/data/brick1/gv0'},
{'host': 'slave-gfs2.tomfite.com', 'uuid':
'7290f265-0709-45fc-86ef-2ff5125d31e1', 'dir': '/data/brick1/gv0'},
{'host': 'slave-gfs1.tomfite.com', 'uuid':
'c184bc78-cff0-4cef-8c6a-e637ab52b324', 'dir': '/data/brick2/gv0'},
{'host': 'slave-gfs2.tomfite.com', 'uuid':
'7290f265-0709-45fc-86ef-2ff5125d31e1', 'dir': '/data/brick2/gv0'},
{'host': 'slave-gfs1.tomfite.com', 'uuid':
'c184bc78-cff0-4cef-8c6a-e637ab52b324', 'dir': '/data/brick3/gv0'},
{'host': 'slave-gfs2.tomfite.com', 'uuid':
'7290f265-0709-45fc-86ef-2ff5125d31e1', 'dir': '/data/brick3/gv0'}]
[2017-05-15 16:31:33.424927] D [monitor(monitor):119:is_hot] Volinfo:
brickpath: 'master-gfs1.tomfite.com:/data/brick1/gv0'
[2017-05-15 16:31:33.425452] D [monitor(monitor):119:is_hot] Volinfo:
brickpath: 'master-gfs1.tomfite.com:/data/brick2/gv0'
[2017-05-15 16:31:33.425790] D [monitor(monitor):119:is_hot] Volinfo:
brickpath: 'master-gfs1.tomfite.com:/data/brick3/gv0'
[2017-05-15 16:31:33.426130] D 

[Gluster-users] 3.9.1 in docker: problems when one of peers is unavailable.

2017-05-16 Thread RafaƂ Radecki
Hi All.

I have a 9-node dockerized glusterfs cluster and I am seeing the
following situation:
1) The docker daemon on the 8th node fails and, as a result, glusterd on
this node leaves the cluster.
2) As a result, on the 1st node I see a message about the 8th node being
unavailable:

[2017-05-15 12:48:22.142865] I [MSGID: 106004]
[glusterd-handler.c:5808:__glusterd_peer_rpc_notify] 0-management: Peer
<10.10.10.8> (<5cb55b7a-1e04-4fb8-bd1d-55ee647719d2>), in state , has disconnected from glusterd.
[2017-05-15 12:48:22.167746] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x2035a)
[0x7f7d9d62535a]
-->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x29f48)
[0x7f7d9d62ef48] -->/usr/lib64/glus
terfs/3.9.1/xlator/mgmt/glusterd.so(+0xd50aa) [0x7f7d9d6da0aa] )
0-management: Lock for vol csv not held
[2017-05-15 12:48:22.167767] W [MSGID: 106118]
[glusterd-handler.c:5833:__glusterd_peer_rpc_notify] 0-management: Lock not
released for csv


The gluster share is then unavailable, and when I try to list it I get:

Transport endpoint is not connected

3) Then on the 5th node I see a message similar to 2) about the 1st node
being unavailable, and the 5th node also disconnects from the cluster:

[2017-05-15 12:52:54.321189] W
[glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x2035a)
[0x7f7fda22335a]
-->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x29f48)
[0x7f7fda22cf48] -->/usr/lib64/glus
terfs/3.9.1/xlator/mgmt/glusterd.so(+0xd50aa) [0x7f7fda2d80aa] )
0-management: Lock for vol csv not held

[2017-05-15 12:52:54.321200] W [MSGID: 106118]
[glusterd-handler.c:5833:__glusterd_peer_rpc_notify] 0-management: Lock not
released for csv

[2017-05-15 12:53:04.659418] E [socket.c:2307:socket_connect_finish]
0-management: connection to 10.10.10.:24007 failed (Connection refused)


I am quite new to gluster, but as far as I can see this is something of
a chain reaction in which the failure of the 1st node leads to the
disconnect of two other nodes. Any hints on how to solve this? Are there
any settings for retries/timeouts/reconnects in gluster that could help
in my case?
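
(For what it's worth, one commonly tuned client-side setting in this
area is network.ping-timeout; a minimal sketch of inspecting and
changing it on the 'csv' volume seen in the logs, with the value only
as an example:)

gluster volume get csv network.ping-timeout
gluster volume set csv network.ping-timeout 60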

Thanks for all help!

BR,
Rafal.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users