[Gluster-users] Monitorig gluster 3.6.1

2015-06-01 Thread Félix de Lelelis
Hi,

I am monitoring gluster with scripts that launch other scripts. All checks are
funnelled through a single script that first checks whether any glusterd process
is active and, if the response is false, launches the check.

All checks are:

   - gluster volume volname info
   - gluster volume heal volname info
   - gluster volume heal volname split-brain
   - gluster volume volname status detail
   - gluster volume volname statistics

Since I enabled the monitoring on our pre-production gluster, gluster has gone
down 2 times. We suspect that the monitoring is overloading it, but it should
not.

The question is: is there any other way to check those states?
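
One thing that might help, as a sketch only: serialize the checks and give each
command a hard timeout, so overlapping monitoring runs cannot pile up on
glusterd (the volume name, lock file and command list below are placeholders):

  #!/bin/bash
  # run all probes one after another; skip this cycle if the previous run
  # is still going, and never let a single gluster command hang forever
  VOL=volname
  exec 9>/var/run/gluster-checks.lock
  flock -n 9 || exit 0
  for args in "volume info $VOL" \
              "volume heal $VOL info" \
              "volume status $VOL detail"; do
      timeout 30 gluster $args > /dev/null 2>&1 || echo "check failed: gluster $args"
  done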

Thanks
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Client load high (300) using fuse mount

2015-06-01 Thread Mitja Mihelič

Hi!

I am trying to set up a Wordpress cluster using GlusterFS for storage. Web
nodes will access the same Wordpress install on a volume mounted via FUSE from
a 3-peer GlusterFS TSP.


I started with one web node and Wordpress on local storage. The load average
was constantly around 5 and stayed below 6. iotop showed about 300kB/s disk
reads or less.


When I mounted the GlusterFS volume on the web node, the 1-minute load
average went over 300. Each of the 3 peers is transmitting about 10MB/s
to my web node regardless of the load.

TSP peers are on 10Gbit NICs and the web node is on a 1Gbit NIC.

I'm out of ideas here... Could it be the network?
What should I look at for optimizing the network stack on the client?

Options set on TSP:
Options Reconfigured:
performance.cache-size: 4GB
network.ping-timeout: 15
cluster.quorum-type: auto
network.remote-dio: on
cluster.eager-lock: on
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.cache-refresh-timeout: 4
performance.io-thread-count: 32
nfs.disable: on

Regards, Mitja

--
--
Mitja Mihelič
ARNES, Tehnološki park 18, p.p. 7, SI-1001 Ljubljana, Slovenia
tel: +386 1 479 8877, fax: +386 1 479 88 78

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster-users Digest, Vol 86, Issue 1 - Message 5: client load high using FUSE mount

2015-06-01 Thread Ben England


- Original Message -
 From: gluster-users-requ...@gluster.org
 To: gluster-users@gluster.org
 Sent: Monday, June 1, 2015 8:00:01 AM
 Subject: Gluster-users Digest, Vol 86, Issue 1
 
 Message: 5
 Date: Mon, 01 Jun 2015 13:11:13 +0200
 From: Mitja Mihelič mitja.mihe...@arnes.si
 To: gluster-users@gluster.org
 Subject: [Gluster-users] Client load high (300) using fuse mount
 Message-ID: 556c3dd1.1080...@arnes.si
 Content-Type: text/plain; charset=utf-8; format=flowed
 
 Hi!
 
 I am trying to set up a Wordpress cluster using GlusterFS used for
 storage. Web nodes will access the same Wordpress install on a volume
 mounted via FUSE from a 3 peer GlusterFS TSP.
 
 I started with one web node and Wordpress on local storage. The load
 average was constantly about 5. iotop showed about 300kB/s disk reads or
 less. The load average was below 6.
 
 When I mounted the GlusterFS volume to the web node the 1min load
 average went over 300. Each of the 3 peers is transmitting about 10MB/s
 to my web node regardless of the load.
 TSP peers are on 10Gbit NICs and the web node is on a 1Gbit NIC.

30 MB/s is about 1/3 of line speed for a 1-Gbps NIC port.  It sounds like network
latency and lack of client-side caching might be your bottleneck; you might want
to put a 10-Gbps NIC port on your client.  You did disable client-side caching
(the md-cache and io-cache translators) below; was that your intent?  Also, the
defaults for these translators are very conservative: with only 1 client you may
want to increase the time that data is cached on the client using the FUSE mount
options entry-timeout=30 and attribute-timeout=30.  Unlike non-distributed
Linux filesystems, Gluster is very conservative about client-side caching to
avoid cache coherency issues.
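
For example, a FUSE mount with those longer metadata cache timeouts might look
like this (server, volume and mount point below are placeholders, and this is
only a sketch; it assumes this web node is effectively the only writer):

  mount -t glusterfs -o entry-timeout=30,attribute-timeout=30 \
      glusterserver:/wpvol /var/www/wordpress

or, as an /etc/fstab entry:

  glusterserver:/wpvol  /var/www/wordpress  glusterfs  _netdev,entry-timeout=30,attribute-timeout=30  0 0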

 
 I'm out of ideas here... Could it be the network?
 What should I look at for optimizing the network stack on the client?
 
 Options set on TSP:
 Options Reconfigured:
 performance.cache-size: 4GB
 network.ping-timeout: 15
 cluster.quorum-type: auto
 network.remote-dio: on
 cluster.eager-lock: on
 performance.stat-prefetch: off
 performance.io-cache: off
 performance.read-ahead: off
 performance.quick-read: off
 performance.cache-refresh-timeout: 4
 performance.io-thread-count: 32
 nfs.disable: on
 

That is a lot of tuning; what are these options intended to do?  The gluster
volume reset command allows you to undo this.  In Gluster 3.7, the gluster
volume get your-volume all command lets you see what the defaults are.
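
For example (the volume name is a placeholder):

  gluster volume reset your-volume performance.cache-size   # undo one option
  gluster volume reset your-volume                          # undo all reconfigured options
  gluster volume get your-volume all                        # 3.7+: show effective values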

 Regards, Mitja
 
 --
 --
 Mitja Mihelič
 ARNES, Tehnološki park 18, p.p. 7, SI-1001 Ljubljana, Slovenia
 tel: +386 1 479 8877, fax: +386 1 479 88 78
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Features - Object Count

2015-06-01 Thread Sachin Pandit


- Original Message -
 From: M S Vishwanath Bhat msvb...@gmail.com
 To: aasenov1989 aasenov1...@gmail.com
 Cc: Gluster-users@gluster.org List gluster-users@gluster.org
 Sent: Monday, June 1, 2015 3:02:08 PM
 Subject: Re: [Gluster-users] Features - Object Count
 
 
 
 On 29 May 2015 at 18:11, aasenov1989  aasenov1...@gmail.com  wrote:
 
 
 
 Hi,
 So is there a way to find how many files I have on each brick of the volume?
 I don't think gluster provides a way to exactly get the number of files in a
 brick or volume.
 
 Sorry if my solution is very obvious. But I generally use find to get the
 number of files in a particular brick.
 
 find /brick/path ! -path "/brick/path/.glusterfs*" | wc -l

Hi,

You can also run: getfattr -d -m . -e hex <brick_path>
This command gets the extended attributes of a directory. When you issue it
after enabling quota, you can see an extended attribute named
trusted.glusterfs.quota.size, which holds the size, file count and directory count.

The extended attribute value consists of 48 hexadecimal digits: the first 16
give you the size, the next 16 the file count, and the last 16 the directory count.
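
A small shell sketch of splitting that value up (the brick path is a placeholder,
and it assumes the 48-digit layout described above, i.e. quota with object
accounting enabled):

  hex=$(getfattr -n trusted.glusterfs.quota.size -e hex /bricks/brick1 2>/dev/null \
        | sed -n 's/^trusted.glusterfs.quota.size=0x//p')
  echo "size (bytes): $((16#${hex:0:16}))"     # first 16 hex digits
  echo "files:        $((16#${hex:16:16}))"    # next 16 hex digits
  echo "directories:  $((16#${hex:32:16}))"    # last 16 hex digits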

Hope this helps.

Thanks,
Sachin Pandit.


 
 
 Best Regards,
 Vishwanath
 
 
 
 
 
 Regards,
 Asen Asenov
 
 On Fri, May 29, 2015 at 3:33 PM, Atin Mukherjee  atin.mukherje...@gmail.com
  wrote:
 
 
 
 
 
 
 Sent from Samsung Galaxy S4
 On 29 May 2015 17:59, aasenov1989  aasenov1...@gmail.com  wrote:
  
  Hi,
  Thanks for the help. I was able to retrieve the number of objects for the entire
  volume. But I didn't figure out how to set quota for particular brick. I
  have replicated volume with 2 bricks on 2 nodes:
  Bricks:
  Brick1: host1:/dataDir
  Brick2: host2:/dataDir
  Both bricks are up and files are replicated. But when I try to set quota on
  a particular brick:
 IIUC, You won't be able to set quota at brick level as multiple bricks
 comprise a volume which is exposed to the user. Quota team can correct me if
 I am wrong.
 
  
  gluster volume quota TestVolume limit-objects /dataDir/ 9223372036854775807
  quota command failed : Failed to get trusted.gfid attribute on path
  /dataDir/. Reason : No such file or directory
  please enter the path relative to the volume
  
  What should be the path to brick directories relative to the volume?
  
  Regards,
  Asen Asenov
  
  
  On Fri, May 29, 2015 at 12:35 PM, Sachin Pandit  span...@redhat.com 
  wrote:
  
  - Original Message -
   From: aasenov1989  aasenov1...@gmail.com 
   To: Humble Devassy Chirammal  humble.deva...@gmail.com 
   Cc:  Gluster-users@gluster.org List  gluster-users@gluster.org 
   Sent: Friday, May 29, 2015 12:22:43 AM
   Subject: Re: [Gluster-users] Features - Object Count
   
   Thanks Humble,
   But as far as I understand the object count is connected with the quotas
   set
   per folders. What I want is to get number of files I have in entire
   volume -
   even when volume is distributed across multiple computers. I think the
   purpose of this feature:
   http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Object%20Count/
  
  Hi,
  
  You are absolutely correct. You can retrieve number of files in the entire
  volume if you have the limit-objects set on the root. If limit-objects
  is set on the directory present in a mount point then it will only show
  the number of files and directories of that particular directory.
  
  In your case, if you want to retrieve number of files and directories
  present in the entire volume then you might have to set the object limit
  on the root.
  
  
  Thanks,
  Sachin Pandit.
  
  
   is to provide such functionality. Am I right or there is no way to
   retrieve
   number of files for entire volume?
   
   Regards,
   Asen Asenov
   
   On Thu, May 28, 2015 at 8:09 PM, Humble Devassy Chirammal 
   humble.deva...@gmail.com  wrote:
   
   
   
   Hi Asen,
   
   https://gluster.readthedocs.org/en/latest/Features/quota-object-count/ ,
   hope
   this helps.
   
   --Humble
   
   
   On Thu, May 28, 2015 at 8:38 PM, aasenov1989  aasenov1...@gmail.com 
   wrote:
   
   
   
   Hi,
   I wanted to ask how to use this feature in gluster 3.7.0, as I was
   unable to
   find anything. How can I retrieve number of objects in volume and number
   of
   objects in particular brick?
   
   Thanks in advance.
   
   Regards,
   Asen Asenov
   
   ___
   Gluster-users mailing list
   Gluster-users@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-users
   
   
   
   ___
   Gluster-users mailing list
   Gluster-users@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-users
  
  
  
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-users
 
 
 
 
 
 ___
 Gluster-users mailing 

Re: [Gluster-users] Gluster 3.6.1

2015-06-01 Thread Atin Mukherjee


On 05/29/2015 01:29 PM, Félix de Lelelis wrote:
 Hi,
 
 I have a cluster with 3 nodes in pre-production. Yesterday, one node went
 down. The error that I have seen is this:
 
 
 [2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin]
 0-management: Unable to acquire lock for cfe-gv1
 The message I [MSGID: 106006]
 [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs
 has disconnected from glusterd. repeated 5 times between [2015-05-28
 19:04:09.346088] and [2015-05-28 19:04:24.349191]
 pending frames:
 frame : type(0) op(0)
 patchset: git://git.gluster.com/glusterfs.git
 signal received: 11
 time of crash:
 2015-05-28 19:04:27
 configuration details:
 argp 1
 backtrace 1
 dlfcn 1
 libpthread 1
 llistxattr 1
 setfsid 1
 spinlock 1
 epoll.h 1
 xattr.h 1
 st_atim.tv_nsec 1
 package-string: glusterfs 3.6.1
 /usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7fd86e2f1232]
 /usr/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7fd86e30871d]
 /usr/lib64/libc.so.6(+0x35640)[0x7fd86d30c640]
 /usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_remove_pending_entry+0x2c)[0x7fd85f52450c]
 /usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(+0x5ae28)[0x7fd85f511e28]
 /usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x237)[0x7fd85f50f027]
 /usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(__glusterd_brick_op_cbk+0x2fe)[0x7fd85f53be5e]
 /usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x4c)[0x7fd85f53d48c]
 /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fd86e0c50b0]
 /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x171)[0x7fd86e0c5321]
 /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fd86e0c1273]
 /usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0x8530)[0x7fd85d17d530]
 /usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0xace4)[0x7fd85d17fce4]
 /usr/lib64/libglusterfs.so.0(+0x76322)[0x7fd86e346322]
 /usr/sbin/glusterd(main+0x502)[0x7fd86e79afb2]
 /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd86d2f8af5]
 /usr/sbin/glusterd(+0x6351)[0x7fd86e79b351]
 -
 
 
 Is that a problem with the software? Is it a bug?
The problem I see here is that concurrent volume status transactions were
run at a given point in time (from the cmd log history in BZ 1226254). 3.6.1
is missing some fixes that take care of the issues identified along these
lines. If you upgrade your cluster to 3.6.3 the problem will go away. However,
3.6.3 still misses one more fix, http://review.gluster.org/#/c/10023/, which
will be released in 3.6.4.

I would request you to upgrade your cluster to 3.6.3, if not 3.7.
 
 Thanks.
 
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 

-- 
~Atin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Features - Object Count

2015-06-01 Thread aasenov1989
Hi,
Thanks for the reply. I understand what you're saying, but can you then give
me an idea how to solve my problem? My setup is as follows:
I have 1 volume comprised of 2 nodes, each node having 2 bricks (4 bricks in
total). The volume is replicated in such a way that each brick on node1
replicates files to the corresponding brick on node2. When I upload, let's say,
1 million files to node1, I want to find out when those files have been
replicated to the second node. My idea was to set an inode quota on the volume
(to check the number of files in the volume) and on each brick (to check the
number of files in a brick), and to verify that the total number of files in
the bricks of each node equals the number of files in the entire volume. This
way I can be sure that the files are replicated correctly.
But, as far as I understand, what I can do is set a quota on the entire volume
to keep track of the total number of files, and then check each brick and count
the number of files in it. I would have to perform this operation on both nodes,
as I can't check the number of files in a brick that is not on the local node.
It is also not really efficient to count millions of files every time.
So is there a way to perform such a check?
Thanks in advance.

Regards,
Asen Asenov

On Mon, Jun 1, 2015 at 7:31 AM, Sachin Pandit span...@redhat.com wrote:



 - Original Message -
  From: aasenov1989 aasenov1...@gmail.com
  To: Atin Mukherjee atin.mukherje...@gmail.com
  Cc: Gluster-users@gluster.org List gluster-users@gluster.org,
 Sachin Pandit span...@redhat.com
  Sent: Friday, May 29, 2015 6:11:36 PM
  Subject: Re: [Gluster-users] Features - Object Count
 
  Hi,
  So is there a way to find how many files I have on each brick of the
 volume?
 

 Hi,

 A quota limit can only be set at the volume level and on directories
 present in it. We don't have a gluster command that lists the number of
 files present in a brick, as Linux commands can take care of that very well.
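
 For reference, counting objects for the whole volume via the object quota
 could look like this; a sketch only, assuming the 3.7 inode-quota feature
 (the volume name is a placeholder, the limit value is the one used earlier
 in this thread):

   gluster volume quota myvol enable
   gluster volume quota myvol limit-objects / 9223372036854775807
   gluster volume quota myvol list-objects /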

 Thanks,
 Sachin Pandit.


  Regards,
  Asen Asenov
 
  On Fri, May 29, 2015 at 3:33 PM, Atin Mukherjee 
 atin.mukherje...@gmail.com
  wrote:
 
   Sent from Samsung Galaxy S4
   On 29 May 2015 17:59, aasenov1989 aasenov1...@gmail.com wrote:
   
Hi,
 Thanks for the help. I was able to retrieve the number of objects for
 entire
   volume. But I didn't figure out how to set quota for particular brick.
 I
   have replicated volume with 2 bricks on 2 nodes:
Bricks:
Brick1: host1:/dataDir
Brick2: host2:/dataDir
Both bricks are up and files are replicated. But when I try to set
 quota
   on a particular brick:
   IIUC, You won't be able to set quota at brick level as multiple bricks
   comprise a volume which is exposed to the user. Quota team can correct
 me
   if I am wrong.
  
   
gluster volume quota TestVolume limit-objects /dataDir/
   9223372036854775807
quota command failed : Failed to get trusted.gfid attribute on path
   /dataDir/. Reason : No such file or directory
please enter the path relative to the volume
   
What should be the path to brick directories relative to the volume?
   
Regards,
Asen Asenov
   
   
On Fri, May 29, 2015 at 12:35 PM, Sachin Pandit span...@redhat.com
   wrote:
   
- Original Message -
 From: aasenov1989 aasenov1...@gmail.com
 To: Humble Devassy Chirammal humble.deva...@gmail.com
 Cc: Gluster-users@gluster.org List gluster-users@gluster.org
 Sent: Friday, May 29, 2015 12:22:43 AM
 Subject: Re: [Gluster-users] Features - Object Count

 Thanks Humble,
 But as far as I understand the object count is connected with the
   quotas set
 per folders. What I want is to get number of files I have in
 entire
   volume -
 even when volume is distributed across multiple computers. I
 think the
 purpose of this feature:

  
 http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Object%20Count/
   
Hi,
   
You are absolutely correct. You can retrieve number of files in the
   entire
volume if you have the limit-objects set on the root. If
   limit-objects
is set on the directory present in a mount point then it will only
 show
the number of files and directories of that particular directory.
   
In your case, if you want to retrieve number of files and
 directories
present in the entire volume then you might have to set the object
 limit
on the root.
   
   
Thanks,
Sachin Pandit.
   
   
 is to provide such functionality. Am I right or there is no way to
   retrieve
 number of files for entire volume?

 Regards,
 Asen Asenov

 On Thu, May 28, 2015 at 8:09 PM, Humble Devassy Chirammal 
 humble.deva...@gmail.com  wrote:



 Hi Asen,


   https://gluster.readthedocs.org/en/latest/Features/quota-object-count/
 ,
   hope
 this helps.

 --Humble


 On Thu, May 28, 2015 at 8:38 PM, aasenov1989 
 aasenov1...@gmail.com
wrote:




Re: [Gluster-users] Features - Object Count

2015-06-01 Thread M S Vishwanath Bhat
On 29 May 2015 at 18:11, aasenov1989 aasenov1...@gmail.com wrote:

 Hi,
 So is there a way to find how many files I have on each brick of the
 volume?

I don't think gluster provides a way to get the exact number of files in
a brick or volume.

Sorry if my solution is very obvious, but I generally use find to get the
number of files in a particular brick:

find /brick/path ! -path "/brick/path/.glusterfs*" | wc -l
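
A variant of the same idea, as a sketch: prune the .glusterfs metadata directory
so find does not descend into it at all, and count only regular files (the brick
path is a placeholder):

find /brick/path -path /brick/path/.glusterfs -prune -o -type f -print | wc -l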


Best Regards,
Vishwanath


 Regards,
 Asen Asenov

 On Fri, May 29, 2015 at 3:33 PM, Atin Mukherjee 
 atin.mukherje...@gmail.com wrote:

 Sent from Samsung Galaxy S4
 On 29 May 2015 17:59, aasenov1989 aasenov1...@gmail.com wrote:
 
  Hi,
  Thanks for the help. I was able to retrieve the number of objects for
 entire volume. But I didn't figure out how to set quota for particular
 brick. I have replicated volume with 2 bricks on 2 nodes:
  Bricks:
  Brick1: host1:/dataDir
  Brick2: host2:/dataDir
  Both bricks are up and files are replicated. But when I try to set
 quota on a particular brick:
 IIUC, You won't be able to set quota at brick level as multiple bricks
 comprise a volume which is exposed to the user. Quota team can correct me
 if I am wrong.

 
  gluster volume quota TestVolume limit-objects /dataDir/
 9223372036854775807
  quota command failed : Failed to get trusted.gfid attribute on path
 /dataDir/. Reason : No such file or directory
  please enter the path relative to the volume
 
  What should be the path to brick directories relative to the volume?
 
  Regards,
  Asen Asenov
 
 
  On Fri, May 29, 2015 at 12:35 PM, Sachin Pandit span...@redhat.com
 wrote:
 
  - Original Message -
   From: aasenov1989 aasenov1...@gmail.com
   To: Humble Devassy Chirammal humble.deva...@gmail.com
   Cc: Gluster-users@gluster.org List gluster-users@gluster.org
   Sent: Friday, May 29, 2015 12:22:43 AM
   Subject: Re: [Gluster-users] Features - Object Count
  
   Thanks Humble,
   But as far as I understand the object count is connected with the
 quotas set
   per folders. What I want is to get number of files I have in entire
 volume -
   even when volume is distributed across multiple computers. I think
 the
   purpose of this feature:
  
 http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Object%20Count/
 
  Hi,
 
  You are absolutely correct. You can retrieve number of files in the
 entire
  volume if you have the limit-objects set on the root. If
 limit-objects
  is set on the directory present in a mount point then it will only show
  the number of files and directories of that particular directory.
 
  In your case, if you want to retrieve number of files and directories
  present in the entire volume then you might have to set the object
 limit
  on the root.
 
 
  Thanks,
  Sachin Pandit.
 
 
   is to provide such functionality. Am I right or there is no way to
 retrieve
   number of files for entire volume?
  
   Regards,
   Asen Asenov
  
   On Thu, May 28, 2015 at 8:09 PM, Humble Devassy Chirammal 
   humble.deva...@gmail.com  wrote:
  
  
  
   Hi Asen,
  
  
 https://gluster.readthedocs.org/en/latest/Features/quota-object-count/ ,
 hope
   this helps.
  
   --Humble
  
  
   On Thu, May 28, 2015 at 8:38 PM, aasenov1989  aasenov1...@gmail.com
  wrote:
  
  
  
   Hi,
   I wanted to ask how to use this feature in gluster 3.7.0, as I was
 unable to
   find anything. How can I retrieve number of objects in volume and
 number of
   objects in particular brick?
  
   Thanks in advance.
  
   Regards,
   Asen Asenov
  
   ___
   Gluster-users mailing list
   Gluster-users@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-users
  
  
  
   ___
   Gluster-users mailing list
   Gluster-users@gluster.org
   http://www.gluster.org/mailman/listinfo/gluster-users
 
 
 
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-users



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] 3.6.3 split brain on web browser cache dir w. replica 3 volume

2015-06-01 Thread Alastair Neil
I have a replica 3 volume I am using to serve my home directory.  I have
noticed a couple of split-brains recently on files used by browsers (for the
most recent see below; I had an earlier one on
.config/google-chrome/Default/Session Storage/).  When I was running
replica 2 I don't recall seeing more than two entries of the form
trusted.afr.volname-client-?.  I did have two other servers that I have
removed from service recently, but I am curious to know if there is some way
to map what the server reports as trusted.afr.volname-client-? to a
hostname?

Thanks, Alastair


# gluster volume heal homes info
 Brick gluster-2:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 Brick gluster1:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 Brick gluster0:/export/brick2/home/
 /a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in split-brain
 Number of entries: 1
 # getfattr -d -m . -e hex
 /export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
 getfattr: Removing leading '/' from absolute path names
 # file:
 export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
 security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
 trusted.afr.dirty=0x
 trusted.afr.homes-client-0=0x
 trusted.afr.homes-client-1=0x
 trusted.afr.homes-client-2=0x
 trusted.afr.homes-client-3=0x0002
 trusted.afr.homes-client-4=0x
 trusted.gfid=0x3ae398227cea4f208d7652dbfb93e3e5
 trusted.glusterfs.dht=0x0001
 trusted.glusterfs.quota.dirty=0x3000

 trusted.glusterfs.quota.edf41dc8-2122-4aa3-bc20-29225564ca8c.contri=0x162d2200
 trusted.glusterfs.quota.size=0x162d2200
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync

2015-06-01 Thread PEPONNET, Cyril N (Cyril)
Some news,

Looks like changelog is not working anymore. When I touch a file on the master it
doesn't propagate to the slave…

The .processing folder contains thousands of unprocessed changelogs.

I had to stop the geo-rep, reset changelog.changelog on the volume and restart
the geo-rep. It’s now sending the missing files using a hybrid crawl.

So geo-rep is not working as expected.

Another thing: we use symlinks to point to the latest release build, and it seems
that symlinks are not synced from master to slave when they change.

Any idea how I can debug this?

--
Cyril Peponnet

On May 29, 2015, at 3:01 AM, Kotresh Hiremath Ravishankar 
khire...@redhat.com wrote:

Yes, geo-rep internally uses fuse mount.
I will explore further and get back to you
if there is a way.

Thanks and Regards,
Kotresh H R

- Original Message -
From: Cyril N PEPONNET (Cyril) cyril.pepon...@alcatel-lucent.com
To: Kotresh Hiremath Ravishankar khire...@redhat.com
Cc: gluster-users gluster-users@gluster.org
Sent: Thursday, May 28, 2015 10:12:57 PM
Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not present 
- Falling back to xsync

One more thing:

nfs.volume-access read-only works only for NFS clients; glusterfs clients still
have write access.

features.read-only on needs a volume restart and sets RO for everyone, but in
this case geo-rep goes faulty.

[2015-05-28 09:42:27.917897] E [repce(/export/raid/usr_global):188:__call__]
RepceClient: call 8739:139858642609920:1432831347.73 (keep_alive) failed on
peer with OSError
[2015-05-28 09:42:27.918102] E
[syncdutils(/export/raid/usr_global):240:log_raise_exception] top: FAIL:
Traceback (most recent call last):
 File /usr/libexec/glusterfs/python/syncdaemon/syncdutils.py, line 266, in
 twrap
   tf(*aa)
 File /usr/libexec/glusterfs/python/syncdaemon/master.py, line 391, in
 keep_alive
   cls.slave.server.keep_alive(vi)
 File /usr/libexec/glusterfs/python/syncdaemon/repce.py, line 204, in
 __call__
   return self.ins(self.meth, *a)
 File /usr/libexec/glusterfs/python/syncdaemon/repce.py, line 189, in
 __call__
   raise res
OSError: [Errno 30] Read-

So there is no proper way to protect the slave against writes.

--
Cyril Peponnet

On May 28, 2015, at 8:54 AM, Cyril Peponnet
cyril.pepon...@alcatel-lucent.com
wrote:

Hi Kotresh,

Inline.

Again, thank for you time.

--
Cyril Peponnet

On May 27, 2015, at 10:47 PM, Kotresh Hiremath Ravishankar
khire...@redhat.com
wrote:

Hi Cyril,

Replies inline.

Thanks and Regards,
Kotresh H R

- Original Message -
From: Cyril N PEPONNET (Cyril) cyril.pepon...@alcatel-lucent.com
To: Kotresh Hiremath Ravishankar khire...@redhat.com
Cc: gluster-users gluster-users@gluster.org
Sent: Wednesday, May 27, 2015 9:28:00 PM
Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
present - Falling back to xsync

Hi and thanks again for those explanation.

Due to lot of missing files and not up to date (with gfid mismatch some
time), I reset the index (or I think I do) by:

deleting the geo-reop, reset geo-replication.indexing (set it to off does not
work for me), and recreate it again.

Resetting index does not initiate geo-replication from the version changelog
is
introduced. It works only for the versions prior to it.

NOTE 1: Recreation of geo-rep session will work only if slave doesn't contain
 file with mismatch gfids. If there are, slave should be cleaned up
 before recreating.

I started it again to transfer the missing files; I'll take care of the gfid
mismatches afterward. Our vol is almost 5TB and it took almost 2 months to crawl
to the slave, so I didn’t want to start over :/


NOTE 2: Another method exists now to initiate a full sync. It also expects
slave
   files should not be in gfid mismatch state (meaning, slave volume
   should not
   written by any other means other than geo-replication). The method is
   to
   reset stime on all the bricks of master.


   Following are the steps to trigger full sync!!!. Let me know if any
   comments/doubts.
   
   1. Stop geo-replication
   2. Remove stime extended attribute all the master brick root using
   following command.
  setfattr -x
  trusted.glusterfs.MASTER_VOL_UUID.SLAVE_VOL_UUID.stime
  brick-root
 NOTE: 1. If AFR is setup, do this for all replicated set

   2. Above mentioned stime key can be got as follows:
  Using 'gluster volume info mastervol', get all brick
  paths 

[Gluster-users] GlusterFS 3.7 - slow/poor performances

2015-06-01 Thread Geoffrey Letessier
Dear all,

I have a crash-test cluster where I’ve tested the new version of GlusterFS
(v3.7) before upgrading my HPC cluster in production.
But… all my tests show me very, very low performance.

For my benchmarks, as you can read below, I run a few actions (untar, du, find,
tar, rm) on the Linux kernel sources, dropping caches, on distributed,
replicated, distributed-replicated and single (single-brick) volumes as well as
on the native FS of one brick.

# time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
# time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
# time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/ | wc -l; echo 3 > /proc/sys/vm/drop_caches)
# time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
# time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)

And here are the process times:

---
| |  UNTAR  |   DU   |  FIND   |   TAR   |   RM   |
---
| single  |  ~3m45s |   ~43s |~47s |  ~3m10s | ~3m15s |
---
| replicated  |  ~5m10s |   ~59s |   ~1m6s |  ~1m19s | ~1m49s |
---
| distributed |  ~4m18s |   ~41s |~57s |  ~2m24s | ~1m38s |
---
| dist-repl   |  ~8m18s |  ~1m4s |  ~1m11s |  ~1m24s | ~2m40s |
---
| native FS   |~11s |~4s | ~2s |~56s |   ~10s |
---

I get the same results with default configurations and with custom
configurations.

If I look at the output of the ifstat command, I can see that my I/O write
processes never exceed 3MB/s...

The native EXT4 FS seems to be faster than the XFS one (by roughly 15-20%, but no more).

My [test] storage cluster is composed of 2 identical servers (dual-CPU
Intel Xeon X5355, 8GB of RAM, 2x2TB HDD (no RAID) and Gb Ethernet).

My volume settings:
single: 1 server, 1 brick
replicated: 2 servers, 1 brick each
distributed: 2 servers, 2 bricks each
dist-repl: 2 bricks on the same server, replica 2

Everything seems to be OK in the gluster status command output.

Do you have an idea why I am getting such bad results?
Thanks in advance.
Geoffrey
---
Geoffrey Letessier

Responsable informatique  ingénieur système
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@cnrs.fr

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 3.6.3 split brain on web browser cache dir w. replica 3 volume

2015-06-01 Thread Ravishankar N



On 06/01/2015 08:15 PM, Alastair Neil wrote:


I have a replica 3 volume I am using to serve my home directory.  I 
have noticed a couple of split-brains recently on files used by 
browsers (for the most recent see below; I had an earlier one on 
.config/google-chrome/Default/Session Storage/).  When I was 
running replica 2 I don't recall seeing more than two entries of the 
form trusted.afr.volname-client-?.  I did have two other servers that 
I have removed from service recently, but I am curious to know if there 
is some way to map what the server reports as 
trusted.afr.volname-client-? to a hostname?





Your volfile 
(/var/lib/glusterd/vols/volname/trusted-volname.tcp-fuse.vol) should 
show you which brick (remote-subvolume + remote-host) a given 
trusted.afr.* xattr maps to.
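
For example, something along these lines should print the mapping for the
volume in this thread (a sketch; adjust the volume name and path as needed):

  grep -E 'volume .*-client-[0-9]+|remote-host|remote-subvolume' \
      /var/lib/glusterd/vols/homes/trusted-homes.tcp-fuse.vol

Each "volume homes-client-N" block lists a remote-host and a remote-subvolume
option, i.e. the server and brick that trusted.afr.homes-client-N refers to.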

Hope that helps,
Ravi



Thanks, Alastair


# gluster volume heal homes info
Brick gluster-2:/export/brick2/home/
/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in
split-brain
Number of entries: 1
Brick gluster1:/export/brick2/home/
/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in
split-brain
Number of entries: 1
Brick gluster0:/export/brick2/home/
/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair - Is in
split-brain
Number of entries: 1
# getfattr -d -m . -e hex
/export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
getfattr: Removing leading '/' from absolute path names
# file:
export/brick2/home/a/n/aneil2/.cache/mozilla/firefox/xecgwc8s.Alastair
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x
trusted.afr.homes-client-0=0x
trusted.afr.homes-client-1=0x
trusted.afr.homes-client-2=0x
trusted.afr.homes-client-3=0x0002
trusted.afr.homes-client-4=0x
trusted.gfid=0x3ae398227cea4f208d7652dbfb93e3e5
trusted.glusterfs.dht=0x0001
trusted.glusterfs.quota.dirty=0x3000

trusted.glusterfs.quota.edf41dc8-2122-4aa3-bc20-29225564ca8c.contri=0x162d2200
trusted.glusterfs.quota.size=0x162d2200




___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Geo-Replication - Changelog socket is not present - Falling back to xsync

2015-06-01 Thread Kotresh Hiremath Ravishankar
Hi Cyril,

Could you please attach the geo-replication logs?

Thanks and Regards,
Kotresh H R

- Original Message -
 From: Cyril N PEPONNET (Cyril) cyril.pepon...@alcatel-lucent.com
 To: Kotresh Hiremath Ravishankar khire...@redhat.com
 Cc: gluster-users gluster-users@gluster.org
 Sent: Monday, June 1, 2015 10:34:42 PM
 Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not 
 present - Falling back to xsync
 
 Some news,
 
 Looks like changelog is not working anymore. When I touch a file in master it
 doesnt propagate to slave…
 
 .processing folder contain a thousand of changelog not processed.
 
 I had to stop the geo-rep, reset changelog.changelog to the volume and
 restart the geo-rep. It’s now sending missing files using hybrid crawl.
 
 So geo-repo is not working as expected.
 
 Another thing, we use symlink to point to latest release build, and it seems
 that symlinks are not synced when they change from master to slave.
 
 Any idea on how I can debug this ?
 
 --
 Cyril Peponnet
 
 On May 29, 2015, at 3:01 AM, Kotresh Hiremath Ravishankar
khire...@redhat.com wrote:
 
 Yes, geo-rep internally uses fuse mount.
 I will explore further and get back to you
 if there is a way.
 
 Thanks and Regards,
 Kotresh H R
 
 - Original Message -
 From: Cyril N PEPONNET (Cyril) cyril.pepon...@alcatel-lucent.com
 To: Kotresh Hiremath Ravishankar khire...@redhat.com
 Cc: gluster-users gluster-users@gluster.org
 Sent: Thursday, May 28, 2015 10:12:57 PM
 Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
 present - Falling back to xsync
 
 One more thing:
 
 nfs.volume-access read-only works only for NFS clients; glusterfs clients still
 have write access.
 
 features.read-only on needs a volume restart and sets RO for everyone, but in
 this case geo-rep goes faulty.
 
 [2015-05-28 09:42:27.917897] E [repce(/export/raid/usr_global):188:__call__]
 RepceClient: call 8739:139858642609920:1432831347.73 (keep_alive) failed on
 peer with OSError
 [2015-05-28 09:42:27.918102] E
 [syncdutils(/export/raid/usr_global):240:log_raise_exception] top: FAIL:
 Traceback (most recent call last):
  File /usr/libexec/glusterfs/python/syncdaemon/syncdutils.py, line 266, in
  twrap
tf(*aa)
  File /usr/libexec/glusterfs/python/syncdaemon/master.py, line 391, in
  keep_alive
cls.slave.server.keep_alive(vi)
  File /usr/libexec/glusterfs/python/syncdaemon/repce.py, line 204, in
  __call__
return self.ins(self.meth, *a)
  File /usr/libexec/glusterfs/python/syncdaemon/repce.py, line 189, in
  __call__
raise res
 OSError: [Errno 30] Read-
 
 So there is no proper way to protect the slave against writes.
 
 --
 Cyril Peponnet
 
 On May 28, 2015, at 8:54 AM, Cyril Peponnet
 cyril.pepon...@alcatel-lucent.com
 wrote:
 
 Hi Kotresh,
 
 Inline.
 
 Again, thank for you time.
 
 --
 Cyril Peponnet
 
 On May 27, 2015, at 10:47 PM, Kotresh Hiremath Ravishankar
 khire...@redhat.com
 wrote:
 
 Hi Cyril,
 
 Replies inline.
 
 Thanks and Regards,
 Kotresh H R
 
 - Original Message -
 From: Cyril N PEPONNET (Cyril) cyril.pepon...@alcatel-lucent.com
 To: Kotresh Hiremath Ravishankar khire...@redhat.com
 Cc: gluster-users gluster-users@gluster.org
 Sent: Wednesday, May 27, 2015 9:28:00 PM
 Subject: Re: [Gluster-users] Geo-Replication - Changelog socket is not
 present - Falling back to xsync
 
 Hi and thanks again for those explanation.
 
 Due to lot of missing files and not up to date (with gfid mismatch some
 time), I reset the index (or I think I do) by:
 
 deleting the geo-reop, reset geo-replication.indexing (set it to off does not
 work for me), and recreate it again.
 
 Resetting index does not initiate geo-replication from the version changelog
 is
 introduced. It works only for the versions prior to it.
 
 NOTE 1: Recreation of geo-rep session will work only if slave doesn't contain
  file with mismatch gfids. If there are, slave should be cleaned up
  before recreating.
 
 I started it again to transfer the missing files; I'll take care of the gfid
 mismatches afterward. Our vol is almost 5TB and it took almost 2 months to crawl
 to the slave, so I didn’t want to start over :/
 
 
 NOTE 2: Another method exists now to initiate a full sync. It also expects
 slave
files should not be in gfid mismatch state (meaning, slave volume
should not
written by any other means other than geo-replication). The method is
to
reset stime on all the bricks of master.
 
 
Following are the steps to trigger full 

Re: [Gluster-users] split brain on / just after installation

2015-06-01 Thread Ravishankar N



On 06/02/2015 09:10 AM, Carl L Hoffman wrote:

Hello - I was wondering if someone could please help me.

I've just setup Gluster 3.6 on two Ubuntu 14.04 hosts.  Gluster is setup to 
replicate two volumes (prod-volume, dev-volume) between the two hosts.  
Replication is working fine.  The glustershd.log shows:


Are you sure you are running gluster 3.6?  The 
'afr_sh_print_split_brain_log' message appears only in gluster 3.5 or lower.





[2015-06-02 03:28:04.495162] E 
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-prod-volume-replicate-0: 
Unable to self-heal contents of 'gfid:----0001' 
(possible split-brain). Please delete the file from all but the preferred subvolume.- 
Pending matrix:  [ [ 0 2 ] [ 2 0 ] ]

and the prod-volume logs shows:

[2015-06-02 02:54:28.286268] E 
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 
0-prod-volume-replicate-0: Unable to self-heal contents of '/' (possible 
split-brain). Please delete the file from all but the preferred subvolume.- 
Pending matrix:  [ [ 0 2 ] [ 2 0 ] ]
[2015-06-02 02:54:28.287476] E 
[afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 
0-prod-volume-replicate-0: background  meta-data self-heal failed on /

I've checked against 
https://github.com/gluster/glusterfs/blob/6c578c03f0d44913d264494de5df004544c96271/doc/features/heal-info-and-split-brain-resolution.md
 but I can't see any scenario that covers mine.  The output of gluster volume 
heal prod-volume info is:



Is the metadata the same on both bricks for the root?  (Compare `ls -ld 
/export/prodvol/brick` and `getfattr -d -m . -e hex 
/export/prodvol/brick` on both servers to see if anything is mismatched.)

-Ravi



Gathering Heal info on volume prod-volume has been successful

Brick server1:/export/prodvol/brick
Number of entries: 1
/

Brick server2
Number of entries: 1
/


and doesn't show anything in split-brain.

But the output of gluster volume heal prod-volume info split brain shows:

Gathering Heal info on volume prod-volume has been successful

Brick server1:/export/prodvol/brick
Number of entries: 6
atpath on brick
---
2015-06-02 03:28:04 /
2015-06-02 03:18:04 /
2015-06-02 03:08:04 /
2015-06-02 02:58:04 /
2015-06-02 02:48:04 /
2015-06-02 02:48:04 /

Brick server2:/export/prodvol/brick
Number of entries: 5
atpath on brick
---
2015-06-02 03:28:00 /
2015-06-02 03:18:00 /
2015-06-02 03:08:00 /
2015-06-02 02:58:00 /
2015-06-02 02:48:04 /


And the number continues to grow.  The count on server2 is always one behind 
server1.

Could someone please help?

Cheers,


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster peer rejected and failed to start

2015-06-01 Thread Atin Mukherjee


On 06/02/2015 10:00 AM, vyyy杨雨阳 wrote:
 Hi
 
 We have a gluster (version 3.6.3) cluster with 6 nodes. I tried to add 4 more 
 nodes, but they ended up in 'Peer Rejected' state. I then tried to resolve it by 
 dumping /var/lib/glusterd and probing again, without success; that is one 
 question. But the strange thing is:
 
 A node already in the cluster is also shown as 'Peer Rejected'.
 
 I tried to restart glusterd, and it failed.
 
 I found that /var/lib/glusterd/peers is empty. I copied the files from other 
 nodes, but still can't start glusterd.
It seems like you are trying to peer probe nodes which are part of some other
cluster (uncleaned nodes). Could you check whether the nodes which you are
adding have an empty /var/lib/glusterd? If not, clean them and retry.
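
A rough sketch of what cleaning a to-be-added node could look like, per the
suggestion above (run this only on the new nodes that have never been part of
the cluster and carry no data; service names and paths are the usual defaults):

  service glusterd stop            # or: systemctl stop glusterd
  rm -rf /var/lib/glusterd/*       # wipes peer/volume state on this new node only
  service glusterd start
  # then, from one of the healthy cluster members:
  gluster peer probe <new-node>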

~Atin
 
 
 
 etc-glusterfs-glusterd.vol.log shown that cluster member as “unknown peer ”
 
 [2015-06-02 01:52:14.650635] C 
 [glusterd-handler.c:2369:__glusterd_handle_friend_update] 0-: Received friend 
 update request from unknown peer 04f22ee8-8e00-4c32-a924-b40a0e413aa6
 [2015-06-02 01:52:14.650786] C 
 [glusterd-handler.c:2369:__glusterd_handle_friend_update] 0-: Received friend 
 update request from unknown peer 674a78b5-0590-48d4-8752-d4608832ed1d
 [2015-06-02 01:52:14.657881] C 
 [glusterd-handler.c:2369:__glusterd_handle_friend_update] 0-: Received friend 
 update request from unknown peer 83e1a9db-3134-45e4-acd2-387b12b5b207
 [2015-06-02 01:52:17.747865] W 
 [glusterd-handler.c:697:__glusterd_handle_cluster_lock] 0-management: 
 04f22ee8-8e00-4c32-a924-b40a0e413aa6 doesn't belong to the cluster. Ignoring 
 request.
 [2015-06-02 01:52:17.747908] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 
 0-rpcsvc: rpc actor failed to complete successfully
 [2015-06-02 01:52:40.338885] W 
 [glusterd-handler.c:697:__glusterd_handle_cluster_lock] 0-management: 
 674a78b5-0590-48d4-8752-d4608832ed1d doesn't belong to the cluster. Ignoring 
 request.
 [2015-06-02 01:52:40.338929] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 
 0-rpcsvc: rpc actor failed to complete successfully
 [2015-06-02 01:52:41.310451] W 
 [glusterd-handler.c:697:__glusterd_handle_cluster_lock] 0-management: 
 674a78b5-0590-48d4-8752-d4608832ed1d doesn't belong to the cluster. Ignoring 
 request.
 [2015-06-02 01:52:41.310486] E [rpcsvc.c:544:rpcsvc_check_and_reply_error] 
 0-rpcsvc: rpc actor failed to complete successfully
 
 
 
 Debug info is as following,
 
 
 
 /usr/sbin/glusterd
 [root@SH02SVR5951 peers]# /usr/sbin/glusterd --debug
 [2015-06-02 04:09:24.626690] I [MSGID: 100030] [glusterfsd.c:2018:main] 
 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.3 (args: 
 /usr/sbin/glusterd --debug)
 [2015-06-02 04:09:24.626739] D [logging.c:1763:__gf_log_inject_timer_event] 
 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
 [2015-06-02 04:09:24.627052] D [MSGID: 0] [glusterfsd.c:613:get_volfp] 
 0-glusterfsd: loading volume file /etc/glusterfs/glusterd.vol
 [2015-06-02 04:09:24.629683] I [glusterd.c:1214:init] 0-management: Maximum 
 allowed open file descriptors set to 65536
 [2015-06-02 04:09:24.629706] I [glusterd.c:1259:init] 0-management: Using 
 /var/lib/glusterd as working directory
 [2015-06-02 04:09:24.629764] D [glusterd.c:391:glusterd_rpcsvc_options_build] 
 0-: listen-backlog value: 128
 [2015-06-02 04:09:24.629895] D [rpcsvc.c:2198:rpcsvc_init] 0-rpc-service: RPC 
 service inited.
 [2015-06-02 04:09:24.629904] D [rpcsvc.c:1801:rpcsvc_program_register] 
 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port:  0
 [2015-06-02 04:09:24.629930] D [rpc-transport.c:262:rpc_transport_load] 
 0-rpc-transport: attempt to load file 
 /usr/lib64/glusterfs/3.6.3/rpc-transport/socket.so
 [2015-06-02 04:09:24.631989] D [socket.c:3807:socket_init] 
 0-socket.management: SSL support on the I/O path is NOT enabled
 [2015-06-02 04:09:24.632005] D [socket.c:3810:socket_init] 
 0-socket.management: SSL support for glusterd is NOT enabled
 [2015-06-02 04:09:24.632013] D [socket.c:3827:socket_init] 
 0-socket.management: using system polling thread
 [2015-06-02 04:09:24.632024] D [name.c:550:server_fill_address_family] 
 0-socket.management: option address-family not specified, defaulting to inet
 [2015-06-02 04:09:24.632072] D [rpc-transport.c:262:rpc_transport_load] 
 0-rpc-transport: attempt to load file 
 /usr/lib64/glusterfs/3.6.3/rpc-transport/rdma.so
 [2015-06-02 04:09:24.632102] E [rpc-transport.c:266:rpc_transport_load] 
 0-rpc-transport: /usr/lib64/glusterfs/3.6.3/rpc-transport/rdma.so: cannot 
 open shared object file: No such file or directory
 [2015-06-02 04:09:24.632112] W [rpc-transport.c:270:rpc_transport_load] 
 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid 
 or not found on this machine
 [2015-06-02 04:09:24.632122] W [rpcsvc.c:1524:rpcsvc_transport_create] 
 0-rpc-service: cannot create listener, initing the transport failed
 [2015-06-02 04:09:24.632132] D [rpcsvc.c:1801:rpcsvc_program_register] 
 0-rpc-service: New 

Re: [Gluster-users] Gluster 3.7.0 released

2015-06-01 Thread Atin Mukherjee


On 06/01/2015 09:01 PM, Ted Miller wrote:
 On 5/27/2015 1:17 PM, Atin Mukherjee wrote:
 On 05/27/2015 07:33 PM, Ted Miller wrote:
 responses below
 Ted Miller

 On 5/26/2015 12:01 AM, Atin Mukherjee wrote:
 On 05/26/2015 03:12 AM, Ted Miller wrote:
 From: Niels de Vos nde...@redhat.com
 Sent: Monday, May 25, 2015 4:44 PM

 On Mon, May 25, 2015 at 06:49:26PM +, Ted Miller wrote:
 From: Humble Devassy Chirammal humble.deva...@gmail.com
 Sent: Monday, May 18, 2015 9:37 AM
 Hi All,

 GlusterFS 3.7.0 RPMs for RHEL, CentOS, Fedora and packages for
 Debian are available at
 download.gluster.orghttp://download.gluster.org [1].

 [1] http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.0/

 --Humble


 On Thu, May 14, 2015 at 2:49 PM, Vijay Bellur
 vbel...@redhat.com wrote:

 Hi All,

 I am happy to announce that Gluster 3.7.0 is now generally
 available. 3.7.0 contains several

 [snip]

 Cheers,
 Vijay

 [snip]
 [snip]

 I have no idea about the problem below, it sounds like something the
 GlusterD developers could help with.

 Niels

 Command 'gluster volume status' on the C5 machine makes everything
 look fine:

 Status of volume: ISO2
 Gluster process                         Port   Online  Pid
 --
 Brick 10.x.x.2:/bricks/01/iso2          49162  Y       4679
 Brick 10.x.x.4:/bricks/01/iso2          49183  Y       6447
 Brick 10.x.x.9:/bricks/01/iso2          49169  Y       1985

 But the same command on either of the C6 machines shows the C5
 machine
 (10.x.x.2) missing in action (though it does recognize that there are
 NFS and heal daemons there):

 Status of volume: ISO2
 Gluster process                         TCP Port  RDMA Port  Online  Pid
 --
 Brick 10.41.65.4:/bricks/01/iso2        49183     0          Y       6447
 Brick 10.41.65.9:/bricks/01/iso2        49169     0          Y       1985
 NFS Server on localhost                 2049      0          Y       2279
 Self-heal Daemon on localhost           N/A       N/A        Y       2754
 NFS Server on 10.41.65.2                2049      0          Y       4757
 Self-heal Daemon on 10.41.65.2          N/A       N/A        Y       4764
 NFS Server on 10.41.65.4                2049      0          Y       6543
 Self-heal Daemon on 10.41.65.4          N/A       N/A        Y       6551

 So, is this just an oversight (I hope), or has support for C5 been
 dropped?
 If support for C5 is gone, how do I downgrade my Centos6 machines
 back
 to 3.6.x? (I know how to change the repo, but the actual sequence of
 yum commands and gluster commands is unknown to me).
 Could you attach the glusterd log file of 10.x.x.2 machine
 attached as etc-glusterfs-glusterd.vol.log.newer.2, starting from last
 machine reboot
and the node from where you triggered volume status.
 attached as etc-glusterfs-glusterd.vol.log.newer4 starting same time as
 .2 log
 Could you also share gluster volume info output of all the nodes?
 I have several volumes, so I chose the one that shows up first on the
 listings:

 *from 10.41.65.2:*

 [root@office2 /var/log/glusterfs]$ gluster volume info

 Volume Name: ISO2
 Type: Replicate
 Volume ID: 090da4b3-c666-41fe-8283-2c029228b3f7
 Status: Started
 Number of Bricks: 1 x 3 = 3
 Transport-type: tcp
 Bricks:
 Brick1: 10.41.65.2:/bricks/01/iso2
 Brick2: 10.41.65.4:/bricks/01/iso2
 Brick3: 10.41.65.9:/bricks/01/iso2

 [root@office2 /var/log/glusterfs]$ gluster volume status ISO2
 Status of volume: ISO2
 Gluster process                         Port   Online  Pid
 --
 Brick 10.41.65.2:/bricks/01/iso2        49162  Y       4463
 Brick 10.41.65.4:/bricks/01/iso2        49183  Y       6447
 Brick 10.41.65.9:/bricks/01/iso2        49169  Y       1985
 NFS Server on localhost                 2049   Y       4536
 Self-heal Daemon on localhost           N/A    Y       4543
 NFS Server on 10.41.65.9                2049   Y       2279
 Self-heal Daemon on 10.41.65.9          N/A    Y       2754
 NFS Server on 10.41.65.4                2049   Y       6543
 Self-heal Daemon on 10.41.65.4          N/A    Y       6551

 Task Status of Volume ISO2
 --


 There are no active volume tasks

 [root@office2 ~]$ gluster peer status
 Number of Peers: 2

 Hostname: 10.41.65.9
 Uuid: cf2ae9c7-833e-4a73-a996-e72158011c69
 State: Peer in Cluster (Connected)

 Hostname: 10.41.65.4
 Uuid: bd3ca8b7-f2da-44ce-8739-c0db5e40158c
 State: Peer in Cluster (Connected)


 *from 10.41.65.4:*

 [root@office4b ~]# gluster volume info ISO2

 Volume Name: ISO2
 Type: Replicate
 Volume ID: 090da4b3-c666-41fe-8283-2c029228b3f7
 Status: Started
 Number of Bricks: 1 x 3 = 3
 Transport-type: tcp
 Bricks:
 Brick1: 10.41.65.2:/bricks/01/iso2
 Brick2: 10.41.65.4:/bricks/01/iso2
 Brick3: 10.41.65.9:/bricks/01/iso2

 [root@office4b ~]# 

[Gluster-users] split brain on / just after installation

2015-06-01 Thread Carl L Hoffman
Hello - I was wondering if someone could please help me.

I've just setup Gluster 3.6 on two Ubuntu 14.04 hosts.  Gluster is setup to 
replicate two volumes (prod-volume, dev-volume) between the two hosts.  
Replication is working fine.  The glustershd.log shows:

[2015-06-02 03:28:04.495162] E 
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 
0-prod-volume-replicate-0: Unable to self-heal contents of 
'gfid:----0001' (possible split-brain). Please 
delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 2 
] [ 2 0 ] ]

and the prod-volume logs shows:

[2015-06-02 02:54:28.286268] E 
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 
0-prod-volume-replicate-0: Unable to self-heal contents of '/' (possible 
split-brain). Please delete the file from all but the preferred subvolume.- 
Pending matrix:  [ [ 0 2 ] [ 2 0 ] ]
[2015-06-02 02:54:28.287476] E 
[afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 
0-prod-volume-replicate-0: background  meta-data self-heal failed on /

I've checked against 
https://github.com/gluster/glusterfs/blob/6c578c03f0d44913d264494de5df004544c96271/doc/features/heal-info-and-split-brain-resolution.md
 but I can't see any scenario that covers mine.  The output of gluster volume 
heal prod-volume info is:

Gathering Heal info on volume prod-volume has been successful

Brick server1:/export/prodvol/brick
Number of entries: 1
/

Brick server2
Number of entries: 1
/


and doesn't show anything in split-brain.

But the output of gluster volume heal prod-volume info split brain shows:

Gathering Heal info on volume prod-volume has been successful

Brick server1:/export/prodvol/brick
Number of entries: 6
atpath on brick
---
2015-06-02 03:28:04 /
2015-06-02 03:18:04 /
2015-06-02 03:08:04 /
2015-06-02 02:58:04 /
2015-06-02 02:48:04 /
2015-06-02 02:48:04 /

Brick server2:/export/prodvol/brick
Number of entries: 5
atpath on brick
---
2015-06-02 03:28:00 /
2015-06-02 03:18:00 /
2015-06-02 03:08:00 /
2015-06-02 02:58:00 /
2015-06-02 02:48:04 /


And the number continues to grow.  The count on server2 is always one behind 
server1.

Could someone please help?

Cheers,


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Client load high (300) using fuse mount

2015-06-01 Thread Pranith Kumar Karampuri

hi Mitja,
 Could you please give output of the following commands:
1) gluster volume info
2) gluster volume profile volname start
3) Wait while the CPU is high for 5-10 minutes
4) gluster volume profile volname info > output-you-need-to-attach-to-this-mail.txt


The 4th command tells you which operations are issued a lot.

Pranith

On 06/01/2015 04:41 PM, Mitja Mihelič wrote:

Hi!

I am trying to set up a Wordpress cluster using GlusterFS used for 
storage. Web nodes will access the same Wordpress install on a volume 
mounted via FUSE from a 3 peer GlusterFS TSP.


I started with one web node and Wordpress on local storage. The load 
average was constantly about 5. iotop showed about 300kB/s disk reads 
or less. The load average was below 6.


When I mounted the GlusterFS volume to the web node the 1min load 
average went over 300. Each of the 3 peers is transmitting about 
10MB/s to my web node regardless of the load.

TSP peers are on 10Gbit NICs and the web node is on a 1Gbit NIC.

I'm out of ideas here... Could it be the network?
What should I look at for optimizing the network stack on the client?

Options set on TSP:
Options Reconfigured:
performance.cache-size: 4GB
network.ping-timeout: 15
cluster.quorum-type: auto
network.remote-dio: on
cluster.eager-lock: on
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.cache-refresh-timeout: 4
performance.io-thread-count: 32
nfs.disable: on

Regards, Mitja



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users