Re: [Gluster-users] Gluster Startup Issue

2016-06-24 Thread Danny Lee
So I've tried using a lot of your script, but I'm still unable to get past
the "Launching heal operation to perform full self heal on volume 
has been unsuccessful on bricks that are down. Please check if all brick
processes are running." error message.  Everything else seems to be
working, but "gluster volume heal appian full" never succeeds.

Is there any way to figure out what exactly happened to cause this error
message?  The logs don't seem very useful in determining that; they just
state that it can't "Commit" with the other bricks.

When I restart the volume, that sometimes fixes it, but I'm not sure I
want to run a script that keeps restarting the volume until "gluster
volume heal appian full" works.
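Before falling back to a volume restart, it may be worth capturing what glusterd thinks of the bricks and the pending heals at the moment "heal full" fails. A minimal sketch (assuming the volume name appian from above; output formats can differ slightly between releases):

  VOL=appian
  # which brick processes glusterd believes are online
  gluster volume status "$VOL" detail
  # entries already queued for heal, without forcing a full crawl
  gluster volume heal "$VOL" info
  # make sure no peer is rejected or disconnected before launching the full heal
  gluster peer status
  gluster volume heal "$VOL" full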

On Thu, Jun 23, 2016 at 2:21 AM, Heiko L.  wrote:

>
> hostname not needed
>
> # nodea=10.1.1.100;bricka=/mnt/sda6/brick4
> should be working
>
> but I prefer to work with hostnames.
>
>
> regards heiko
>
> PS: I forgot some notes:
> - use xfs or zfs (ext3 works, but with partly poor performance, v3.4)
> - the brick dir should not be the top dir of the fs
>   /dev/sda6 /mnt/brick4, brick=/mnt/brick4 ->  not recommended
>   /dev/sda6 /mnt/sda6,   brick=/mnt/sda6/brick4 better
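As an illustration of that layout (a rough sketch only; it reuses /dev/sda6, brick4 and nodea=10.1.1.100 from the notes above, while the volume name gvol and the second/third hosts are placeholders):

  mkfs.xfs /dev/sda6                 # xfs (or zfs) for the brick filesystem
  mkdir -p /mnt/sda6
  mount /dev/sda6 /mnt/sda6          # mount the filesystem itself at /mnt/sda6
  mkdir -p /mnt/sda6/brick4          # brick dir is a subdirectory, not the fs top dir
  gluster volume create gvol replica 3 \
      10.1.1.100:/mnt/sda6/brick4 nodeb:/mnt/sda6/brick4 nodec:/mnt/sda6/brick4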
>
> > Thank you for responding, Heiko.  I'm in the process of looking at the
> > differences between our two scripts.  The first thing I noticed was that
> > the notes state "need to be defined in the /etc/hosts".  Would using the
> > IP address directly be a problem?
> >
> > On Tue, Jun 21, 2016 at 2:10 PM, Heiko L.  wrote:
> >
> >> On Tue, 21.06.2016 at 19:22, Danny Lee wrote:
> >> > Hello,
> >> >
> >> >
> >> > We are currently figuring out how to add GlusterFS to our system to
> >> > make our systems highly available using scripts.  We are using
> >> > Gluster 3.7.11.
> >> >
> >> > Problem:
> >> > Trying to migrate from a non-clustered system to a 3-node glusterfs
> >> > replicated cluster using scripts.  Tried various things to make this
> >> > work, but it sometimes leaves us in an undesirable state where, if you
> >> > call "gluster volume heal  full", we get the error message, "Launching
> >> > heal operation to perform full self heal on volume  has been
> >> > unsuccessful on bricks that are down. Please check if all brick
> >> > processes are running."  All the brick processes are running according
> >> > to the command "gluster volume status volname".
> >> >
> >> > Things we have tried:
> >> > Order of preference
> >> > 1. Create Volume with 3 Filesystems with the same data
> >> > 2. Create Volume with 2 empty filesystems and one with the data
> >> > 3. Create Volume with only one filesystem with data and then using
> >> > "add-brick" command to add the other two empty filesystems
> >> > 4. Create Volume with one empty filesystem, mounting it, and then
> >> > copying the data over to that one.  And then finally, using the
> >> > "add-brick" command to add the other two empty filesystems
> >> - should be working
> >> - read each file on /mnt/gvol, to trigger replication [2]
> >>
> >> > 5. Create Volume
> >> > with 3 empty filesystems, mounting it, and then copying the data over
> >> - my favorite
> >>
> >> >
> >> > Other things to note:
> >> > A few minutes after the volume is created and started successfully, our
> >> > application server starts up against it, so reads and writes may happen
> >> > pretty quickly after the volume has started.  But there is only about
> >> > 50MB of data.
> >> >
> >> > Steps to reproduce (all in a script):
> >> > # This is run by the primary node with the IP Address, , that has data
> >> > systemctl restart glusterd
> >> > gluster peer probe 
> >> > gluster peer probe 
> >> > Wait for "gluster peer status" to all be in "Peer in Cluster" state
> >> > gluster volume create  replica 3 transport tcp ${BRICKS[0]} ${BRICKS[1]} ${BRICKS[2]} force
> >> > gluster volume set  nfs.disable true
> >> > gluster volume start 
> >> > mkdir -p $MOUNT_POINT
> >> > mount -t glusterfs :/volname $MOUNT_POINT
> >> > find $MOUNT_POINT | xargs stat
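To avoid the fixed sleeps mentioned a little further down, the "wait for peers" step above can be made explicit with something like this (a sketch; the expected peer count and the grep pattern are assumptions that may need adjusting per gluster version):

  # wait until both probed peers report "Peer in Cluster (Connected)"
  until [ "$(gluster peer status | grep -c 'Peer in Cluster (Connected)')" -ge 2 ]; do
      sleep 2
  done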
> >>
> >> I have written a script for 2 nodes. [1]
> >> but should be at least 3 nodes.
> >>
> >>
> >> I hope it helps you
> >> regards Heiko
> >>
> >> >
> >> > Note that, when we added sleeps around the gluster commands, there was
> >> > a higher probability of success, but not 100%.
> >> >
> >> > # Once the volume is started, all the clients/servers will mount the
> >> > gluster filesystem by polling "mountpoint -q $MOUNT_POINT":
> >> > mkdir -p $MOUNT_POINT
> >> > mount -t glusterfs :/volname $MOUNT_POINT
> >> >
> >> >
> >> > Logs:
> >> > *etc-glusterfs-glusterd.vol.log* in *server-ip-1*
> >> >
> >> >
> >> > [2016-06-21 14:10:38.285234] I [MSGID: 106533]
> >> > [glusterd-volume-ops.c:857:__glusterd_handle_cli_heal_volume]
> >> 0-management:
> >> > Received heal vol req for volume volname
> >> > [2016-06-21 14:10:38.296801] E [MSGID: 106153]
> >> > 

Re: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

2016-06-24 Thread Vijay Bellur
On Fri, Jun 24, 2016 at 5:27 PM, Alastair Neil  wrote:
> Did I miss something in the release notes?  Surely this is an important
> compatibility issue.  As far as I can see, Fedora has not provided 3.7.11
> clients for F24, so I will probably have to faff about with installing the
> F23 RPMs, as I'm not sure I want to be forced to upgrade my cluster
> servers on a workstation OS release schedule.
>

This is due to a bug that has to be addressed and is not an intended
compatibility break. Expediting 3.8.1 to address this seems to be the
way out here.

Regards,
Vijay


Re: [Gluster-users] Small files performance

2016-06-24 Thread Luciano Giacchetta
About 40~60 MB/s with a 30% IOWait...

--
Regards, LG

On Wed, Jun 22, 2016 at 10:04 AM, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> On 21 Jun 2016 19:02, "Luciano Giacchetta"  wrote:
> >
> > Hi,
> >
> > I have a similar scenario, for a car classifieds site with millions of
> > small files, mounted with the gluster native client in a replica config.
> > The gluster server has 16GB RAM and 4 cores and mounts the glusterfs with
> > direct-io-mode=enable.  Then I export to all servers (Windows included,
> > via CIFS).
> >
> > performance.cache-refresh-timeout: 60
> > performance.read-ahead: enable
> > performance.write-behind-window-size: 4MB
> > performance.io-thread-count: 64
> > performance.cache-size: 12GB
> > performance.quick-read: on
> > performance.flush-behind: on
> > performance.write-behind: on
> > nfs.disable: on
>
> Which performance are you getting?
>

Re: [Gluster-users] Small files performance

2016-06-24 Thread Luciano Giacchetta
This is my fstab

localhost:/root /mnt/root glusterfs defaults,direct-io-mode=enable 0 0
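The same option can also be passed on a one-off mount, which is handy for testing before touching fstab (a sketch using the same paths as the line above):

  mount -t glusterfs -o direct-io-mode=enable localhost:/root /mnt/root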

--
Regards, LG

On Wed, Jun 22, 2016 at 9:49 AM, ML mail  wrote:

> Luciano, how do you enable direct-io-mode?
>
>
> On Wednesday, June 22, 2016 7:09 AM, Luciano Giacchetta <
> ldgiacche...@gmail.com> wrote:
>
>
> Hi,
>
> I have a similar scenario, for a car classifieds site with millions of
> small files, mounted with the gluster native client in a replica config.
> The gluster server has 16GB RAM and 4 cores and mounts the glusterfs with
> direct-io-mode=enable.  Then I export to all servers (Windows included,
> via CIFS).
>
> performance.cache-refresh-timeout: 60
> performance.read-ahead: enable
> performance.write-behind-window-size: 4MB
> performance.io-thread-count: 64
> performance.cache-size: 12GB
> performance.quick-read: on
> performance.flush-behind: on
> performance.write-behind: on
> nfs.disable: on
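For reference, options like the ones above are applied per volume with gluster volume set (a sketch only; <volname> is a placeholder, and the values shown are simply the ones quoted above):

  gluster volume set <volname> performance.cache-size 12GB
  gluster volume set <volname> performance.io-thread-count 64
  gluster volume set <volname> performance.write-behind-window-size 4MB
  gluster volume set <volname> performance.flush-behind on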
>
>
> --
> Regards, LG
>
> On Sat, May 28, 2016 at 6:46 AM, Gandalf Corvotempesta <
> gandalf.corvotempe...@gmail.com> wrote:
>
> If I remember properly, each stat() on a file needs to be sent to every
> host in the replica to check whether they are in sync.
> Is this true for both the gluster native client and NFS-Ganesha?
> Which is best for shared hosting storage with many millions of small
> files?  About 15,000,000 small files in 800GB?  Or even for Maildir hosting?
> Ganesha can be configured for HA and load balancing, so the biggest issue
> that was present in standard NFS is now gone.
> Is there any advantage of native gluster over Ganesha?  Removing the FUSE
> requirement should also be a performance advantage for Ganesha over the
> native client.
>
>
>
>

Re: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

2016-06-24 Thread Vijay Bellur

On 06/24/2016 02:12 PM, Alastair Neil wrote:

I upgraded my Fedora 23 system to F24 a couple of days ago; now I am
unable to mount my gluster cluster.

The update installed:

glusterfs-3.8.0-1.fc24.x86_64
glusterfs-libs-3.8.0-1.fc24.x86_64
glusterfs-fuse-3.8.0-1.fc24.x86_64
glusterfs-client-xlators-3.8.0-1.fc24.x86_64

the gluster servers are running 3.7.11

The volume is replica 3

I see these errors in the mount log:

[2016-06-24 17:55:34.016462] I [MSGID: 100030]
[glusterfsd.c:2408:main] 0-/usr/sbin/glusterfs: Started running
/usr/sbin/glusterfs version 3.8.0 (args: /usr/sbin/glusterfs
--volfile-server=gluster1 --volfile-id=homes /mnt/homes)
[2016-06-24 17:55:34.094345] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 1
[2016-06-24 17:55:34.240135] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 2
[2016-06-24 17:55:34.240130] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 4
[2016-06-24 17:55:34.240130] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 3
[2016-06-24 17:55:34.241499] I [MSGID: 114020]
[client.c:2356:notify] 0-homes-client-2: parent translators are
ready, attempting connect on transport
[2016-06-24 17:55:34.249172] I [MSGID: 114020]
[client.c:2356:notify] 0-homes-client-5: parent translators are
ready, attempting connect on transport
[2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
0-homes-client-2: changing port to 49171 (from 0)
[2016-06-24 17:55:34.253347] I [MSGID: 114020]
[client.c:2356:notify] 0-homes-client-6: parent translators are
ready, attempting connect on transport
[2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
0-homes-client-5: changing port to 49154 (from 0)
[2016-06-24 17:55:34.255115] I [MSGID: 114057]
[client-handshake.c:1441:select_server_supported_programs]
0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2016-06-24 17:55:34.255861] W [MSGID: 114007]
[client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2:
failed to find key 'child_up' in the options
[2016-06-24 17:55:34.259097] I [MSGID: 114057]
[client-handshake.c:1441:select_server_supported_programs]
0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
Final graph:

+--+
  1: volume homes-client-2
  2: type protocol/client
  3: option clnt-lk-version 1
  4: option volfile-checksum 0
  5: option volfile-key homes
  6: option client-version 3.8.0
  7: option process-uuid
Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
  8: option fops-version 1298437
  9: option ping-timeout 20
 10: option remote-host gluster-2
 11: option remote-subvolume /export/brick2/home
 12: option transport-type socket
 13: option event-threads 4
 14: option send-gids true
 15: end-volume
 16:
 17: volume homes-client-5
 18: type protocol/client
 19: option clnt-lk-version 1
 20: option volfile-checksum 0
 21: option volfile-key homes
 22: option client-version 3.8.0
 23: option process-uuid
Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
 24: option fops-version 1298437
 25: option ping-timeout 20
 26: option remote-host gluster1.vsnet.gmu.edu

 27: option remote-subvolume /export/brick2/home
 28: option transport-type socket
 29: option event-threads 4
 30: option send-gids true
 31: end-volume
 32:
 33: volume homes-client-6
 34: type protocol/client
 35: option ping-timeout 20
 36: option remote-host gluster0
 37: option remote-subvolume /export/brick2/home
 38: option transport-type socket
 39: option event-threads 4
 40: option send-gids true
 41: end-volume
 42:
 43: volume homes-replicate-0
 44: type cluster/replicate
 45: option background-self-heal-count 20
 46: option metadata-self-heal on
 47: option data-self-heal off
 48: option entry-self-heal on
 49: option data-self-heal-window-size 8
 50: option data-self-heal-algorithm diff
 51: option eager-lock on
 52: option quorum-type auto
 53: option self-heal-readdir-size 64KB
 54: subvolumes homes-client-2 homes-client-5 homes-client-6
 55: end-volume
 56:
 57: volume homes-dht
 58: type cluster/distribute
 59: option min-free-disk 5%
 60: option rebalance-stats on
 

Re: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

2016-06-24 Thread Manuel Padrón Martínez
Are you using NFS?  By default, 3.8 has NFS disabled.
https://www.gluster.org/community/roadmap/3.8/ 
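If NFS is involved, the current setting can be checked on a server and, if needed, the built-in NFS server re-enabled (a sketch, using the volume name homes from the logs below; note this is separate from the FUSE mount that is failing here):

  # see whether gluster NFS is disabled for the volume
  gluster volume info homes | grep nfs.disable
  # re-enable the built-in gluster NFS server if clients depend on it
  gluster volume set homes nfs.disable off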

Manuel Padrón Martínez 


From: "Alastair Neil" 
To: "gluster-users" 
Sent: Friday, 24 June 2016 19:12:11
Subject: [Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke
mounting

I upgraded my Fedora 23 system to F24 a couple of days ago; now I am unable to
mount my gluster cluster.

The update installed: 

glusterfs-3.8.0-1.fc24.x86_64 
glusterfs-libs-3.8.0-1.fc24.x86_64 
glusterfs-fuse-3.8.0-1.fc24.x86_64 
glusterfs-client-xlators-3.8.0-1.fc24.x86_64 

the gluster servers are running 3.7.11

The volume is replica 3 

I see these errors in the mount log: 



[2016-06-24 17:55:34.016462] I [MSGID: 100030] [glusterfsd.c:2408:main] 
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.0 (args: 
/usr/sbin/glusterfs --volfile-server=gluster1 --volfile-id=homes /mnt/homes) 
[2016-06-24 17:55:34.094345] I [MSGID: 101190] 
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1 
[2016-06-24 17:55:34.240135] I [MSGID: 101190] 
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 2 
[2016-06-24 17:55:34.240130] I [MSGID: 101190] 
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 4 
[2016-06-24 17:55:34.240130] I [MSGID: 101190] 
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 3 
[2016-06-24 17:55:34.241499] I [MSGID: 114020] [client.c:2356:notify] 
0-homes-client-2: parent translators are ready, attempting connect on transport 
[2016-06-24 17:55:34.249172] I [MSGID: 114020] [client.c:2356:notify] 
0-homes-client-5: parent translators are ready, attempting connect on transport 
[2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig] 
0-homes-client-2: changing port to 49171 (from 0) 
[2016-06-24 17:55:34.253347] I [MSGID: 114020] [client.c:2356:notify] 
0-homes-client-6: parent translators are ready, attempting connect on transport 
[2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig] 
0-homes-client-5: changing port to 49154 (from 0) 
[2016-06-24 17:55:34.255115] I [MSGID: 114057] 
[client-handshake.c:1441:select_server_supported_programs] 0-homes-client-2: 
Using Program GlusterFS 3.3, Num (1298437), Version (330) 
[2016-06-24 17:55:34.255861] W [MSGID: 114007] 
[client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2: failed to find 
key 'child_up' in the options 
[2016-06-24 17:55:34.259097] I [MSGID: 114057] 
[client-handshake.c:1441:select_server_supported_programs] 0-homes-client-5: 
Using Program GlusterFS 3.3, Num (1298437), Version (330) 
Final graph: 
+--+
 
1: volume homes-client-2 
2: type protocol/client 
3: option clnt-lk-version 1 
4: option volfile-checksum 0 
5: option volfile-key homes 
6: option client-version 3.8.0 
7: option process-uuid 
Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0 
8: option fops-version 1298437 
9: option ping-timeout 20 
10: option remote-host gluster-2 
11: option remote-subvolume /export/brick2/home 
12: option transport-type socket 
13: option event-threads 4 
14: option send-gids true 
15: end-volume 
16: 
17: volume homes-client-5 
18: type protocol/client 
19: option clnt-lk-version 1 
20: option volfile-checksum 0 
21: option volfile-key homes 
22: option client-version 3.8.0 
23: option process-uuid 
Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0 
24: option fops-version 1298437 
25: option ping-timeout 20 
26: option remote-host gluster1.vsnet.gmu.edu 
27: option remote-subvolume /export/brick2/home 
28: option transport-type socket 
29: option event-threads 4 
30: option send-gids true 
31: end-volume 
32: 
33: volume homes-client-6 
34: type protocol/client 
35: option ping-timeout 20 
36: option remote-host gluster0 
37: option remote-subvolume /export/brick2/home 
38: option transport-type socket 
39: option event-threads 4 
40: option send-gids true 
41: end-volume 
42: 
43: volume homes-replicate-0 
44: type cluster/replicate 
45: option background-self-heal-count 20 
46: option metadata-self-heal on 
47: option data-self-heal off 
48: option entry-self-heal on 
49: option data-self-heal-window-size 8 
50: option data-self-heal-algorithm diff 
51: option eager-lock on 
52: option quorum-type auto 
53: option self-heal-readdir-size 64KB 
54: subvolumes homes-client-2 homes-client-5 homes-client-6 
55: end-volume 
56: 
57: volume homes-dht 
58: type cluster/distribute 
59: option min-free-disk 5% 
60: option rebalance-stats on 
61: option readdir-optimize on 
62: subvolumes homes-replicate-0 
63: end-volume 
64: 
65: volume homes-read-ahead 
66: type performance/read-ahead 
67: subvolumes homes-dht 
68: end-volume 
69: 
70: volume homes-io-cache 
71: type performance/io-cache 
72: subvolumes homes-read-ahead 

Re: [Gluster-users] setfacl: Operation not supported

2016-06-24 Thread Evans, Kyle

Hi Jiffin,

Thanks for the help.  You understand correctly, I am talking about the client.  
The problem is intermittent, and those lines DO appear in the log when it works 
but DO NOT appear in the log when it is broken.  Also, here is another log I am 
getting that may be relevant:

[2016-06-13 17:39:33.128941] I [dict.c:473:dict_get] 
(-->/usr/lib64/glusterfs/3.7.5/xlator/system/posix-acl.so(posix_acl_setxattr_cbk+0x26)
 [0x7effdbdfb3a6] 
-->/usr/lib64/glusterfs/3.7.5/xlator/system/posix-acl.so(handling_other_acl_related_xattr+0x22)
 [0x7effdbdfb2a2] -->/lib64/libglusterfs.so.0(dict_get+0xac) [0x7effef3e80cc] ) 
0-dict: !this || key=system.posix_acl_access [Invalid argument]


Thanks,

Kyle

From: Jiffin Tony Thottan
Date: Friday, June 24, 2016 at 2:17 AM
To: Kyle Evans, "gluster-users@gluster.org"
Subject: Re: [Gluster-users] setfacl: Operation not supported



On 24/06/16 02:08, Evans, Kyle wrote:
I'm using gluster 3.7.5-19 on RHEL 7.2.  Gluster periodically stops allowing
ACLs.  I have it configured in fstab like this:

Server.example.com:/dir /mnt glusterfs defaults,_netdev,acl 0 0


Also, the bricks are XFS.

It usually works fine, but sometimes after a reboot, one of the nodes won't 
allow acl operations like setfacl and getfacl.  They give the error "Operation 
not supported".

Did you mean a client reboot?

Correct me if I am wrong:

You have mounted the glusterfs volume with acl enabled and configured it in fstab.

When you reboot the client, acl operations return the error "Operation not
supported".

Can you please follow these steps if possible:
after mounting, check the client log (in your example it should be
/var/log/glusterfs/mnt.log)
and confirm whether the following block is present in the vol graph:
"volume posix-acl-autoload
type system/posix-acl
subvolumes dir
end-volume"

Clear the log file before the reboot, then check whether the same block is
present after the reboot.

--
Jiffin

Sometimes it's not even after a reboot; it just stops supporting it.

If I unmount and remount, it starts working again.  Does anybody have any 
insight?

Thanks,

Kyle




[Gluster-users] Fedora upgrade to f24 installed 3.8.0 client and broke mounting

2016-06-24 Thread Alastair Neil
I upgraded my Fedora 23 system to F24 a couple of days ago; now I am unable
to mount my gluster cluster.

The update installed:

glusterfs-3.8.0-1.fc24.x86_64
glusterfs-libs-3.8.0-1.fc24.x86_64
glusterfs-fuse-3.8.0-1.fc24.x86_64
glusterfs-client-xlators-3.8.0-1.fc24.x86_64

the gluster servers are running 3.7.11

The volume is replica 3

I see these errors in the mount log:

[2016-06-24 17:55:34.016462] I [MSGID: 100030] [glusterfsd.c:2408:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.0
(args: /usr/sbin/glusterfs --volfile-server=gluster1 --volfile-id=homes
/mnt/homes)
[2016-06-24 17:55:34.094345] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2016-06-24 17:55:34.240135] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 2
[2016-06-24 17:55:34.240130] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 4
[2016-06-24 17:55:34.240130] I [MSGID: 101190]
[event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 3
[2016-06-24 17:55:34.241499] I [MSGID: 114020] [client.c:2356:notify]
0-homes-client-2: parent translators are ready, attempting connect on
transport
[2016-06-24 17:55:34.249172] I [MSGID: 114020] [client.c:2356:notify]
0-homes-client-5: parent translators are ready, attempting connect on
transport
[2016-06-24 17:55:34.250186] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
0-homes-client-2: changing port to 49171 (from 0)
[2016-06-24 17:55:34.253347] I [MSGID: 114020] [client.c:2356:notify]
0-homes-client-6: parent translators are ready, attempting connect on
transport
[2016-06-24 17:55:34.254213] I [rpc-clnt.c:1855:rpc_clnt_reconfig]
0-homes-client-5: changing port to 49154 (from 0)
[2016-06-24 17:55:34.255115] I [MSGID: 114057]
[client-handshake.c:1441:select_server_supported_programs]
0-homes-client-2: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-06-24 17:55:34.255861] W [MSGID: 114007]
[client-handshake.c:1176:client_setvolume_cbk] 0-homes-client-2: failed to
find key 'child_up' in the options
[2016-06-24 17:55:34.259097] I [MSGID: 114057]
[client-handshake.c:1441:select_server_supported_programs]
0-homes-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
Final graph:
+--+
  1: volume homes-client-2
  2: type protocol/client
  3: option clnt-lk-version 1
  4: option volfile-checksum 0
  5: option volfile-key homes
  6: option client-version 3.8.0
  7: option process-uuid
Island-29185-2016/06/24-17:55:34:10054-homes-client-2-0-0
  8: option fops-version 1298437
  9: option ping-timeout 20
 10: option remote-host gluster-2
 11: option remote-subvolume /export/brick2/home
 12: option transport-type socket
 13: option event-threads 4
 14: option send-gids true
 15: end-volume
 16:
 17: volume homes-client-5
 18: type protocol/client
 19: option clnt-lk-version 1
 20: option volfile-checksum 0
 21: option volfile-key homes
 22: option client-version 3.8.0
 23: option process-uuid
Island-29185-2016/06/24-17:55:34:10054-homes-client-5-0-0
 24: option fops-version 1298437
 25: option ping-timeout 20
 26: option remote-host gluster1.vsnet.gmu.edu
 27: option remote-subvolume /export/brick2/home
 28: option transport-type socket
 29: option event-threads 4
 30: option send-gids true
 31: end-volume
 32:
 33: volume homes-client-6
 34: type protocol/client
 35: option ping-timeout 20
 36: option remote-host gluster0
 37: option remote-subvolume /export/brick2/home
 38: option transport-type socket
 39: option event-threads 4
 40: option send-gids true
 41: end-volume
 42:
 43: volume homes-replicate-0
 44: type cluster/replicate
 45: option background-self-heal-count 20
 46: option metadata-self-heal on
 47: option data-self-heal off
 48: option entry-self-heal on
 49: option data-self-heal-window-size 8
 50: option data-self-heal-algorithm diff
 51: option eager-lock on
 52: option quorum-type auto
 53: option self-heal-readdir-size 64KB
 54: subvolumes homes-client-2 homes-client-5 homes-client-6
 55: end-volume
 56:
 57: volume homes-dht
 58: type cluster/distribute
 59: option min-free-disk 5%
 60: option rebalance-stats on
 61: option readdir-optimize on
 62: subvolumes homes-replicate-0
 63: end-volume
 64:
 65: volume homes-read-ahead
 66: type performance/read-ahead
 67: subvolumes homes-dht
 68: end-volume
 69:
 70: volume homes-io-cache
 71: type performance/io-cache
 72: subvolumes homes-read-ahead
 73: end-volume
 74:
 75: volume homes-quick-read
 76: type performance/quick-read
 77: subvolumes homes-io-cache
 78: end-volume
 79:
 80: volume homes-open-behind
 81: type 

Re: [Gluster-users] [Gluster-devel] Fuse client hangs on doing multithreading IO tests

2016-06-24 Thread FNU Raghavendra Manjunath
Hi,

Do you have any idea how big the files being read were?

Can you please attach the logs from all the gluster server and client
nodes? (the logs can be found in /var/log/glusterfs)

Also please provide the /var/log/messages from all the server and client
nodes.
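Something along these lines should gather everything requested into one archive per node (a sketch; the paths are the defaults mentioned above):

  tar czf gluster-logs-$(hostname).tar.gz /var/log/glusterfs /var/log/messages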

Regards,
Raghavendra


On Fri, Jun 24, 2016 at 10:32 AM, 冷波  wrote:

> Hi,
>
>
> We found a problem when doing traffic tests. We created a replicated
> volume with two storage nodes (CentOS 6.5). There was one FUSE client
> (CentOS 6.7) which did multi-threaded reads and writes. Most of the IOs
> are reads of big files. All machines used 10GbE NICs, and the typical read
> throughput was 4-6Gbps (0.5-1.5GB/s).
>
>
> After the test had run for several minutes, the test program hung. The
> throughput suddenly dropped to zero, and then there was no traffic any
> more. If we ran df, df would hang, too. But we could still read or write
> the volume from other clients.
>
>
> We tried several GlusterFS versions from 3.7.5 to 3.8.0. Each version had
> this problem. We also tried restoring the default GlusterFS options, but
> the problem still existed.
>
>
> The GlusterFS version was 3.7.11 for the following stacks.
>
>
> This was the stack of dd when hanging:
>
> [] wait_answer_interruptible+0x81/0xc0 [fuse]
>
> [] __fuse_request_send+0x1db/0x2b0 [fuse]
>
> [] fuse_request_send+0x12/0x20 [fuse]
>
> [] fuse_statfs+0xda/0x150 [fuse]
>
> [] statfs_by_dentry+0x74/0xa0
>
> [] vfs_statfs+0x1b/0xb0
>
> [] user_statfs+0x47/0xb0
>
> [] sys_statfs+0x2a/0x50
>
> [] system_call_fastpath+0x16/0x1b
>
> [] 0x
>
>
> This was the stack of gluster:
>
> [] futex_wait_queue_me+0xba/0xf0
>
> [] futex_wait+0x1c0/0x310
>
> [] do_futex+0x121/0xae0
>
> [] sys_futex+0x7b/0x170
>
> [] system_call_fastpath+0x16/0x1b
>
> [] 0x
>
>
> This was the stack of the test program:
>
> [] hrtimer_nanosleep+0xc4/0x180
>
> [] sys_nanosleep+0x6e/0x80
>
> [] system_call_fastpath+0x16/0x1b
>
> [] 0x
>
>
> Any clue?
>
> Thanks,
> Paul
>

[Gluster-users] Fuse client hangs on doing multithreading IO tests

2016-06-24 Thread 冷波
Hi,


We found a problem when doing traffic tests. We created a replicated volume
with two storage nodes (CentOS 6.5). There was one FUSE client (CentOS 6.7)
which did multi-threaded reads and writes. Most of the IOs are reads of big
files. All machines used 10GbE NICs, and the typical read throughput was
4-6Gbps (0.5-1.5GB/s).


After the test had run for several minutes, the test program hung. The
throughput suddenly dropped to zero, and then there was no traffic any more.
If we ran df, df would hang, too. But we could still read or write the volume
from other clients.


We tried several GlusterFS versions from 3.7.5 to 3.8.0. Each version had this
problem. We also tried restoring the default GlusterFS options, but the
problem still existed.


The GlusterFS version was 3.7.11 for the following stacks.


This was the stack of dd when hanging:
[a046d211] wait_answer_interruptible+0x81/0xc0 [fuse]
[a046d42b] __fuse_request_send+0x1db/0x2b0 [fuse]
[a046d512] fuse_request_send+0x12/0x20 [fuse]
[a0477d4a] fuse_statfs+0xda/0x150 [fuse]
[811c2b64] statfs_by_dentry+0x74/0xa0
[811c2c9b] vfs_statfs+0x1b/0xb0
[811c2e97] user_statfs+0x47/0xb0
[811c2f9a] sys_statfs+0x2a/0x50
[8100b072] system_call_fastpath+0x16/0x1b
[] 0x


This was the stack of gluster:
[810b226a] futex_wait_queue_me+0xba/0xf0
[810b33a0] futex_wait+0x1c0/0x310
[810b4c91] do_futex+0x121/0xae0
[810b56cb] sys_futex+0x7b/0x170
[8100b072] system_call_fastpath+0x16/0x1b
[] 0x


This was the stack of the test program:
[810a3f74] hrtimer_nanosleep+0xc4/0x180
[810a409e] sys_nanosleep+0x6e/0x80
[8100b072] system_call_fastpath+0x16/0x1b
[] 0x
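For reference, kernel stacks like the above can be captured from a hung mount roughly like this (a sketch; it needs root, and picking the right PID when several glusterfs processes are running is left to the reader):

  # stack of the glusterfs FUSE client process
  cat /proc/$(pidof glusterfs)/stack
  # stack of the hung dd/df process (substitute its PID)
  cat /proc/<pid>/stack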


Any clue?


Thanks,
Paul

Re: [Gluster-users] .shard being healed?

2016-06-24 Thread Lindsay Mathieson
VNA:
getfattr -d -m . -e hex /tank/vmdata/datastore4/.shard
getfattr: Removing leading '/' from absolute path names
# file: tank/vmdata/datastore4/.shard
trusted.afr.datastore4-client-0=0x
trusted.afr.dirty=0x0031
trusted.gfid=0xbe318638e8a04c6d977d7a937aa84806
trusted.glusterfs.dht=0x0001


VNB:
getfattr -d -m . -e hex /tank/vmdata/datastore4/.shard
getfattr: Removing leading '/' from absolute path names
# file: tank/vmdata/datastore4/.shard
trusted.afr.datastore4-client-0=0x
trusted.afr.dirty=0x
trusted.gfid=0xbe318638e8a04c6d977d7a937aa84806
trusted.glusterfs.dht=0x0001


VNG
getfattr -d -m . -e hex /tank/vmdata/datastore4/.shard
getfattr: Removing leading '/' from absolute path names
# file: tank/vmdata/datastore4/.shard
trusted.afr.datastore4-client-0=0x
trusted.afr.dirty=0x0031
trusted.gfid=0xbe318638e8a04c6d977d7a937aa84806
trusted.glusterfs.dht=0x0001


Also, an updated heal info. I'd restarted the VMs, so there is ongoing I/O,
which always results in transient shard listings, but the .shard entry was
still there:

gluster v heal datastore4 info
Brick vnb.proxmox.softlog:/tank/vmdata/datastore4
/.shard/6559b07f-51f3-487d-a710-6acee4ec452a.2
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.2131
/.shard/6633b047-bb28-471e-890a-94dd0d3b8e85.1405
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.784
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.47
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.2060
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.63
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.2059
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.48
/.shard/6633b047-bb28-471e-890a-94dd0d3b8e85.1096
/.shard/6cd24745-055a-49fb-8aab-b9ac0d6a0285.47
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.399
Status: Connected
Number of entries: 12

Brick vng.proxmox.softlog:/tank/vmdata/datastore4
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.2247
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.49
/.shard/6cd24745-055a-49fb-8aab-b9ac0d6a0285.55
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.2076
/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.569
/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.48
/.shard/6633b047-bb28-471e-890a-94dd0d3b8e85.1096
/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.568
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.997
/.shard - Possibly undergoing heal

/.shard/b2996a69-f629-4425-9098-e62c25d9f033.47
/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.47
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.47
/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.1
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.784
Status: Connected
Number of entries: 15

Brick vna.proxmox.softlog:/tank/vmdata/datastore4
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.2133
/.shard/cfdf3ba9-1ae7-492a-a0ad-d6c529e9fb30.1681
/.shard/6633b047-bb28-471e-890a-94dd0d3b8e85.1444
/.shard/6633b047-bb28-471e-890a-94dd0d3b8e85.968
/.shard/2bcfb707-74a4-4e33-895c-3721d137fe5a.48
/.shard/6633b047-bb28-471e-890a-94dd0d3b8e85.1409
/.shard - Possibly undergoing heal

/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.261
/.shard/6cd24745-055a-49fb-8aab-b9ac0d6a0285.50
/.shard/007c8fcb-49ba-4e7e-b744-4e3768ac6bf6.2
Status: Connected
Number of entries: 10

thanks,


On 24 June 2016 at 18:43, Krutika Dhananjay  wrote:
> Could you share the output of
> getfattr -d -m . -e hex 
>
> from all of the bricks associated with datastore4?
>
> -Krutika
>
> On Fri, Jun 24, 2016 at 2:04 PM, Lindsay Mathieson
>  wrote:
>>
>> What does this mean?
>>
>> gluster v heal datastore4 info
>> Brick vnb.proxmox.softlog:/tank/vmdata/datastore4
>> Status: Connected
>> Number of entries: 0
>>
>> Brick vng.proxmox.softlog:/tank/vmdata/datastore4
>> /.shard - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 1
>>
>> Brick vna.proxmox.softlog:/tank/vmdata/datastore4
>> /.shard - Possibly undergoing heal
>>
>> Status: Connected
>> Number of entries: 1
>>
> >> All activity on the cluster has been shut down, no I/O, but it's been
> >> sitting like this for a few minutes.
>>
>> Gluster 3.7.11
>>
>> --
>> Lindsay
>
>



-- 
Lindsay


Re: [Gluster-users] .shard being healed?

2016-06-24 Thread Krutika Dhananjay
Could you share the output of
getfattr -d -m . -e hex 

from all of the bricks associated with datastore4?

-Krutika

On Fri, Jun 24, 2016 at 2:04 PM, Lindsay Mathieson <
lindsay.mathie...@gmail.com> wrote:

> What does this mean?
>
> gluster v heal datastore4 info
> Brick vnb.proxmox.softlog:/tank/vmdata/datastore4
> Status: Connected
> Number of entries: 0
>
> Brick vng.proxmox.softlog:/tank/vmdata/datastore4
> /.shard - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 1
>
> Brick vna.proxmox.softlog:/tank/vmdata/datastore4
> /.shard - Possibly undergoing heal
>
> Status: Connected
> Number of entries: 1
>
> All activity on the cluster has been shut down, no I/O, but it's been
> sitting like this for a few minutes.
>
> Gluster 3.7.11
>
> --
> Lindsay
>

[Gluster-users] .shard being healed?

2016-06-24 Thread Lindsay Mathieson
What does this mean?

gluster v heal datastore4 info
Brick vnb.proxmox.softlog:/tank/vmdata/datastore4
Status: Connected
Number of entries: 0

Brick vng.proxmox.softlog:/tank/vmdata/datastore4
/.shard - Possibly undergoing heal

Status: Connected
Number of entries: 1

Brick vna.proxmox.softlog:/tank/vmdata/datastore4
/.shard - Possibly undergoing heal

Status: Connected
Number of entries: 1

All activity on the cluster has been shut down, no I/O, but it's been
sitting like this for a few minutes.

Gluster 3.7.11

-- 
Lindsay


Re: [Gluster-users] setfacl: Operation not supported

2016-06-24 Thread Jiffin Tony Thottan



On 24/06/16 02:08, Evans, Kyle wrote:
I'm using gluster 3.7.5-19 on RHEL 7.2.  Gluster periodically stops
allowing ACLs.  I have it configured in fstab like this:


Server.example.com:/dir /mnt glusterfs defaults,_netdev,acl 0 0


Also, the bricks are XFS.

It usually works fine, but sometimes after a reboot, one of the nodes 
won't allow acl operations like setfacl and getfacl.  They give the 
error "Operation not supported".



Did you mean a client reboot?

Correct me if I am wrong:

You have mounted the glusterfs volume with acl enabled and configured it in
fstab.


When you reboot the client, acl operations return the error "Operation
not supported".


Can you please follow these steps if possible:
after mounting, check the client log (in your example it should be
/var/log/glusterfs/mnt.log)

and confirm whether the following block is present in the vol graph:
"volume posix-acl-autoload
type system/posix-acl
subvolumes dir
end-volume"

Clear the log file before the reboot, then check whether the same block is
present after the reboot.
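A quick way to check for that block after each mount (a sketch, using the log path from your example):

  grep -A 3 "volume posix-acl-autoload" /var/log/glusterfs/mnt.log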


--
Jiffin


Sometimes it's not even after a reboot; it just stops supporting it.

If I unmount and remount, it starts working again.  Does anybody have 
any insight?


Thanks,

Kyle

