Re: [Gluster-users] Please advise for our file server cluster

2015-06-08 Thread Ben Turner
- Original Message -
 From: Gao g...@pztop.com
 To: gluster-users@gluster.org
 Sent: Monday, June 8, 2015 12:58:56 PM
 Subject: Re: [Gluster-users] Please advise for our file server cluster
 
 On 15-06-05 04:30 PM, Gao wrote:
  Hi,
 
  We are a small business and we are planning to build a new file
  server system. I did some research and I decided to use GlusterFS as
  the cluster system to build a 2-node setup. Our goals are to
  minimize downtime and to avoid a single point of failure. Meanwhile,
  I need to keep an eye on the budget.
 
  In our office we have 20+ computers running Ubuntu. A few (6) machines
  use Windows 8. We use a SAMBA server to take care of file sharing.

What file sizes / access patterns are you planning on using?  Small-file and 
stat / metadata operations over Windows / Samba will be much slower than using 
glusterfs or NFS mounts.  Be sure to clearly identify your performance 
requirements before you size your HW.
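
If it helps, a quick way to get a rough sequential baseline from a client mount 
is a plain dd run (just a sketch -- the mount point and sizes below are made up, 
and dd tells you nothing about small-file / metadata performance):

  # sequential write through the mount, bypassing the client page cache
  dd if=/dev/zero of=/mnt/glustervol/ddtest bs=1M count=4096 oflag=direct
  # sequential read of the same file
  dd if=/mnt/glustervol/ddtest of=/dev/null bs=1M iflag=direct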

 
  I did some research and here are some main components I selected for
  the system:
  M/B: Asus P9D-E/4L (It has 6 SATA ports so I can use softRAID5 for
  data storage. 4 NIC ports so I can do link aggregation)
  CPU: XEON E3-1220v3 3.1GHz (is this overkill? the MB also supports i3
  though.)
  Memory: 4x8GB ECC DDR3
  SSD: 120 GB for OS
  Hard Drive: 4 (or 5) 3TB 7200RPM drive to form soft RAID5
  10GBe card: Intel X540-T1

Seems reasonable.  I would expect 40-60 MB / sec writes and 80-100 MB / sec 
reads over gigabit with sequential workloads.  Over 10G I would expect ~200-400 
MB / sec for sequential reads and writes.  Glusterfs and NFS mounts will 
perform better, but it sounds like you need Samba for your Windows hosts.

 
  About the hardware I am not confident. One thing is the 10GBe card. Is
  it sufficient? I chose this because it's less expensive. But I don't
  want it to drag the system down once I build the nodes. Also, if I only need 2
  nodes, can I just use CAT6 cable to link them together? Or do I have to
  use a 10GBe switch?

It all depends on your performance requirements.  You will need a 10G switch if 
you want the clients to access the servers over 10G.  If you don't need more 
than 120 MB / sec you can use gigabit, but if you need more then you will have 
to go to the 10G NICs.
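
If you do try the back-to-back CAT6 route, it is worth sanity checking the raw 
link with iperf before you blame gluster -- something like this between the two 
nodes (hostnames made up):

  node1# iperf3 -s
  node2# iperf3 -c node1

and confirm you see close to line rate before moving on to gluster tests.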


 
  Could someone give me some advice?
 
  Thanks.
 
  Gao
 
 
 
 
 Any help? Please.
 
 --
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Alessandro De Salvo
OK, I found at least one of the bugs.
The /usr/libexec/ganesha/ganesha.sh has the following lines:

if [ -e /etc/os-release ]; then
RHEL6_PCS_CNAME_OPTION=
fi

This is OK for RHEL < 7, but does not work for >= 7. I have changed it to the 
following, to make it work:

if [ -e /etc/os-release ]; then
    eval $(grep -F REDHAT_SUPPORT_PRODUCT= /etc/os-release)
    [ "$REDHAT_SUPPORT_PRODUCT" == "Fedora" ] && RHEL6_PCS_CNAME_OPTION=""
fi
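
Just to show what that eval is expected to pick up (this is my assumption of 
the check, not verified output -- the exact value can differ per release):

  # grep -F REDHAT_SUPPORT_PRODUCT= /etc/os-release
  REDHAT_SUPPORT_PRODUCT="centos"     # on CentOS 7.x, so the option is kept
  REDHAT_SUPPORT_PRODUCT="Fedora"     # on Fedora, so the option is cleared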

Apart from that, the VIP_<node> entries I was using were wrong, and I should have 
converted all the "-" to underscores; maybe this could be mentioned in the 
documentation when you have it ready.
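
For example (my reading of the requirement), for a node called atlas-node1 the 
key has to be written with underscores in the variable name:

  VIP_atlas_node1="x.x.x.1"

while the hostname itself keeps its dashes everywhere else.
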
Now, the cluster starts, but the VIPs apparently do not:

Online: [ atlas-node1 atlas-node2 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
 Started: [ atlas-node1 atlas-node2 ]
 Clone Set: nfs-grace-clone [nfs-grace]
 Started: [ atlas-node1 atlas-node2 ]
 atlas-node1-cluster_ip-1  (ocf::heartbeat:IPaddr):Stopped 
 atlas-node1-trigger_ip-1  (ocf::heartbeat:Dummy): Started atlas-node1 
 atlas-node2-cluster_ip-1  (ocf::heartbeat:IPaddr):Stopped 
 atlas-node2-trigger_ip-1  (ocf::heartbeat:Dummy): Started atlas-node2 
 atlas-node1-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node1 
 atlas-node2-dead_ip-1 (ocf::heartbeat:Dummy): Started atlas-node2 

PCSD Status:
  atlas-node1: Online
  atlas-node2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled


But the issue that is puzzling me the most is the following:

# showmount -e localhost
rpc mount export: RPC: Timed out

And when I try to enable the ganesha exports on a volume I get this error:

# gluster volume set atlas-home-01 ganesha.enable on
volume set: failed: Failed to create NFS-Ganesha export config file.

But I see the file created in /etc/ganesha/exports/*.conf
Still, showmount hangs and times out.
Any help?
Thanks,

Alessandro

 Il giorno 08/giu/2015, alle ore 20:00, Alessandro De Salvo 
 alessandro.desa...@roma1.infn.it ha scritto:
 
 Hi,
 indeed, it does not work :-)
 OK, this is what I did, with 2 machines, running CentOS 7.1, Glusterfs 3.7.1 
 and nfs-ganesha 2.2.0:
 
 1) ensured that the machines are able to resolve their IPs (but this was 
 already true since they were in the DNS);
 2) disabled NetworkManager and enabled network on both machines;
 3) created a gluster shared volume 'gluster_shared_storage' and mounted it on 
 '/run/gluster/shared_storage' on all the cluster nodes using glusterfs native 
 mount (on CentOS 7.1 there is a link by default /var/run -> ../run)
 4) created an empty /etc/ganesha/ganesha.conf;
 5) installed pacemaker pcs resource-agents corosync on all cluster machines;
 6) set the ‘hacluster’ user the same password on all machines;
 7) pcs cluster auth <hostname> -u hacluster -p <pass> on all the nodes (on 
 both nodes I issued the commands for both nodes)
 8) IPv6 is configured by default on all nodes, although the infrastructure is 
 not ready for IPv6
 9) enabled pcsd and started it on all nodes
 10) populated /etc/ganesha/ganesha-ha.conf with the following contents, one 
 per machine:
 
 
 === atlas-node1
 # Name of the HA cluster created.
 HA_NAME=ATLAS_GANESHA_01
 # The server from which you intend to mount
 # the shared volume.
 HA_VOL_SERVER="atlas-node1"
 # The subset of nodes of the Gluster Trusted Pool
 # that forms the ganesha HA cluster. IP/Hostname
 # is specified.
 HA_CLUSTER_NODES="atlas-node1,atlas-node2"
 # Virtual IPs of each of the nodes specified above.
 VIP_atlas-node1="x.x.x.1"
 VIP_atlas-node2="x.x.x.2"
 
 === atlas-node2
 # Name of the HA cluster created.
 HA_NAME=ATLAS_GANESHA_01
 # The server from which you intend to mount
 # the shared volume.
 HA_VOL_SERVER="atlas-node2"
 # The subset of nodes of the Gluster Trusted Pool
 # that forms the ganesha HA cluster. IP/Hostname
 # is specified.
 HA_CLUSTER_NODES="atlas-node1,atlas-node2"
 # Virtual IPs of each of the nodes specified above.
 VIP_atlas-node1="x.x.x.1"
 VIP_atlas-node2="x.x.x.2"
 
 11) issued gluster nfs-ganesha enable, but it fails with a cryptic message:
 
 # gluster nfs-ganesha enable
 Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted 
 pool. Do you still want to continue? (y/n) y
 nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha. Please check 
 the log file for details
 
 Looking at the logs I found nothing really special but this:
 
 == /var/log/glusterfs/etc-glusterfs-glusterd.vol.log ==
 [2015-06-08 17:57:15.672844] I [MSGID: 106132] 
 [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
 [2015-06-08 17:57:15.675395] I [glusterd-ganesha.c:386:check_host_list] 
 0-management: ganesha host found Hostname is atlas-node2
 [2015-06-08 17:57:15.720692] I [glusterd-ganesha.c:386:check_host_list] 
 0-management: ganesha host found Hostname is atlas-node2
 [2015-06-08 17:57:15.721161] I [glusterd-ganesha.c:335:is_ganesha_host] 
 

Re: [Gluster-users] Errors in quota-crawl.log

2015-06-08 Thread Ryan Clough
I have submitted a BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1229422

___
¯\_(ツ)_/¯
Ryan Clough
Information Systems
Decision Sciences International Corporation
http://www.decisionsciencescorp.com/

On Wed, Apr 8, 2015 at 1:49 AM, Sachin Pandit span...@redhat.com wrote:


 Please find the comments inline.

 - Original Message -
  From: Ryan Clough ryan.clo...@dsic.com
  To: gluster-users gluster-users@gluster.org
  Sent: Wednesday, April 8, 2015 9:59:55 AM
  Subject: Re: [Gluster-users] Errors in quota-crawl.log
 
  No takers? Seems like quota is working but when I see permission denied
  warnings it makes me wonder if the quota calculations are going to be
  accurate. Any help would be much appreciated.
 
  Ryan Clough
  Information Systems
  Decision Sciences International Corporation
 
  On Thu, Apr 2, 2015 at 12:43 PM, Ryan Clough  ryan.clo...@dsic.com 
 wrote:
 
 
 
  We are running the following operating system:
  Scientific Linux release 6.6 (Carbon)
 
  With the following kernel:
  2.6.32-504.3.3.el6.x86_64
 
  We are using the following version of Glusterfs:
  glusterfs-libs-3.6.2-1.el6.x86_64
  glusterfs-3.6.2-1.el6.x86_64
  glusterfs-cli-3.6.2-1.el6.x86_64
  glusterfs-api-3.6.2-1.el6.x86_64
  glusterfs-fuse-3.6.2-1.el6.x86_64
  glusterfs-server-3.6.2-1.el6.x86_64
 
  Here is the current configuration of our 2 node distribute only cluster:
  Volume Name: export_volume
  Type: Distribute
  Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: hgluster01:/gluster_data
  Brick2: hgluster02:/gluster_data
  Options Reconfigured:
  performance.cache-size: 1GB
  diagnostics.brick-log-level: ERROR
  performance.stat-prefetch: on
  performance.write-behind: on
  performance.flush-behind: on
  features.quota-deem-statfs: on
  performance.quick-read: off
  performance.client-io-threads: on
  performance.read-ahead: on
  performance.io-thread-count: 24
  features.quota: on
  cluster.eager-lock: on
  nfs.disable: on
  auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*,10.2.0.*,10.0.60.*
  server.allow-insecure: on
  performance.write-behind-window-size: 1MB
  network.ping-timeout: 60
  features.quota-timeout: 0
  performance.io-cache: off
  server.root-squash: on
  performance.readdir-ahead: on
 
  Here is the status of the nodes:
  Status of volume: export_volume
  Gluster process Port Online Pid
 
 --
  Brick hgluster01:/gluster_data 49152 Y 7370
  Brick hgluster02:/gluster_data 49152 Y 17868
  Quota Daemon on localhost N/A Y 2051
  Quota Daemon on hgluster02.red.dsic.com N/A Y 6691
 
  Task Status of Volume export_volume
 
 --
  There are no active volume tasks
 
  I have just turned quota on and was watching the quota-crawl.log and see
 a
  bunch of these type of messages:
 
  [2015-04-02 19:23:01.540692] W [fuse-bridge.c:483:fuse_entry_cbk]
  0-glusterfs-fuse: 2338683: LOOKUP() /\ = -1 (Permission denied)
 
  [2015-04-02 19:23:01.543565] W
 [client-rpc-fops.c:2766:client3_3_lookup_cbk]
  0-export_volume-client-1: remote operation failed: Permission denied.
 Path:
  /\ (----)
 
  [2015-04-02 17:58:14.090556] W
 [client-rpc-fops.c:2766:client3_3_lookup_cbk]
  0-export_volume-client-0: remote operation failed: Permission denied.
 Path:
  /\ (----)
 
  Should I be worried about this and how do I go about fixing the
 permissions?
  Is this a bug and should it be reported?

 Hi Ryan,

 Apologies for the late reply. Looking at the description of the problem
 I don't think there will be any problem. I think its better if we track
 this problem using a bug. If you have already raised a bug then please
 do provide us a bug-id, or else we will raise a new bug.

 I have one question: Looking at the path /\ , do you have a directory
 with similar path, as we can see accessing that has failed?

 Thanks,
 Sachin.



 
  Thanks, in advance, for your time to help me.
  Ryan Clough
  Information Systems
  Decision Sciences International Corporation
 
 
  This email and its contents are confidential. If you are not the intended
  recipient, please do not disclose or use the information within this
 email
  or its attachments. If you have received this email in error, please
 report
  the error to the sender by return email and delete this communication
 from
  your records.
 
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://www.gluster.org/mailman/listinfo/gluster-users


-- 
This email and its contents are confidential. If you are not the intended 
recipient, please do not disclose or use the information within this email 
or its attachments. If you have received this email in error, 

Re: [Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Soumya Koduri




On 06/08/2015 08:20 PM, Alessandro De Salvo wrote:

Sorry, just another question:

- in my installation of gluster 3.7.1 the command gluster features.ganesha 
enable does not work:

# gluster features.ganesha enable
unrecognized word: features.ganesha (position 0)

Which version has full support for it?


Sorry. This option has recently been changed. It is now

$ gluster nfs-ganesha enable




- in the documentation the ccs and cman packages are required, but they seem 
not to be available anymore on CentOS 7 and similar; I guess they are not 
really required anymore, as pcs should do the full job

Thanks,

Alessandro


Looks like so from http://clusterlabs.org/quickstart-redhat.html. Let us 
know if it doesn't work.


Thanks,
Soumya




Il giorno 08/giu/2015, alle ore 15:09, Alessandro De Salvo 
alessandro.desa...@roma1.infn.it ha scritto:

Great, many thanks Soumya!
Cheers,

Alessandro


Il giorno 08/giu/2015, alle ore 13:53, Soumya Koduri skod...@redhat.com ha 
scritto:

Hi,

Please find the slides of the demo video at [1]

We recommend having a distributed replica volume as the shared volume for better 
data availability.

The size of the volume depends on the workload you may have. Since it is used to 
maintain the states of NLM/NFSv4 clients, you may calculate the size of the volume 
to be, at minimum, the aggregate over the NFS servers of
(typical_size_of_'/var/lib/nfs'_directory + 
~4k * no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point)
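
For example (numbers purely illustrative): with 2 NFS servers, a typical 
/var/lib/nfs of ~10MB each and up to 1000 clients per server at any point, that 
works out to about 2 x (10MB + 1000 x 4k) = 2 x 14MB = 28MB, so even a small 
shared volume leaves a lot of headroom.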

We shall document this feature soon in the gluster docs as well.

Thanks,
Soumya

[1] - http://www.slideshare.net/SoumyaKoduri/high-49117846

On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:

Hi,
I have seen the demo video on ganesha HA, 
https://www.youtube.com/watch?v=Z4mvTQC-efM
However there is no advice on the appropriate size of the shared volume. How is 
it really used, and what should be a reasonable size for it?
Also, are the slides from the video available somewhere, as well as a 
documentation on all this? I did not manage to find them.
Thanks,

Alessandro



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users






___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Alessandro De Salvo
Hi,
indeed, it does not work :-)
OK, this is what I did, with 2 machines, running CentOS 7.1, Glusterfs 3.7.1 
and nfs-ganesha 2.2.0:

1) ensured that the machines are able to resolve their IPs (but this was 
already true since they were in the DNS);
2) disabled NetworkManager and enabled network on both machines;
3) created a gluster shared volume 'gluster_shared_storage' and mounted it on 
'/run/gluster/shared_storage' on all the cluster nodes using glusterfs native 
mount (on CentOS 7.1 there is a link by default /var/run -> ../run)
4) created an empty /etc/ganesha/ganesha.conf;
5) installed pacemaker pcs resource-agents corosync on all cluster machines;
6) set the ‘hacluster’ user the same password on all machines;
7) pcs cluster auth <hostname> -u hacluster -p <pass> on all the nodes (on both 
nodes I issued the commands for both nodes)
8) IPv6 is configured by default on all nodes, although the infrastructure is 
not ready for IPv6
9) enabled pcsd and started it on all nodes
10) populated /etc/ganesha/ganesha-ha.conf with the following contents, one per 
machine:


=== atlas-node1
# Name of the HA cluster created.
HA_NAME=ATLAS_GANESHA_01
# The server from which you intend to mount
# the shared volume.
HA_VOL_SERVER="atlas-node1"
# The subset of nodes of the Gluster Trusted Pool
# that forms the ganesha HA cluster. IP/Hostname
# is specified.
HA_CLUSTER_NODES="atlas-node1,atlas-node2"
# Virtual IPs of each of the nodes specified above.
VIP_atlas-node1="x.x.x.1"
VIP_atlas-node2="x.x.x.2"

=== atlas-node2
# Name of the HA cluster created.
HA_NAME=ATLAS_GANESHA_01
# The server from which you intend to mount
# the shared volume.
HA_VOL_SERVER="atlas-node2"
# The subset of nodes of the Gluster Trusted Pool
# that forms the ganesha HA cluster. IP/Hostname
# is specified.
HA_CLUSTER_NODES="atlas-node1,atlas-node2"
# Virtual IPs of each of the nodes specified above.
VIP_atlas-node1="x.x.x.1"
VIP_atlas-node2="x.x.x.2"

11) issued gluster nfs-ganesha enable, but it fails with a cryptic message:

# gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted 
pool. Do you still want to continue? (y/n) y
nfs-ganesha: failed: Failed to set up HA config for NFS-Ganesha. Please check 
the log file for details

Looking at the logs I found nothing really special but this:

== /var/log/glusterfs/etc-glusterfs-glusterd.vol.log ==
[2015-06-08 17:57:15.672844] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2015-06-08 17:57:15.675395] I [glusterd-ganesha.c:386:check_host_list] 
0-management: ganesha host found Hostname is atlas-node2
[2015-06-08 17:57:15.720692] I [glusterd-ganesha.c:386:check_host_list] 
0-management: ganesha host found Hostname is atlas-node2
[2015-06-08 17:57:15.721161] I [glusterd-ganesha.c:335:is_ganesha_host] 
0-management: ganesha host found Hostname is atlas-node2
[2015-06-08 17:57:16.633048] E [glusterd-ganesha.c:254:glusterd_op_set_ganesha] 
0-management: Initial NFS-Ganesha set up failed
[2015-06-08 17:57:16.641563] E [glusterd-syncop.c:1396:gd_commit_op_phase] 
0-management: Commit of operation 'Volume (null)' failed on localhost : Failed 
to set up HA config for NFS-Ganesha. Please check the log file for details

== /var/log/glusterfs/cmd_history.log ==
[2015-06-08 17:57:16.643615]  : nfs-ganesha enable : FAILED : Failed to set up 
HA config for NFS-Ganesha. Please check the log file for details

== /var/log/glusterfs/cli.log ==
[2015-06-08 17:57:16.643839] I [input.c:36:cli_batch] 0-: Exiting with: -1


Also, pcs seems to be fine for the auth part, although it obviously tells me 
the cluster is not running.

I, [2015-06-08T19:57:16.305323 #7223]  INFO -- : Running: 
/usr/sbin/corosync-cmapctl totem.cluster_name
I, [2015-06-08T19:57:16.345457 #7223]  INFO -- : Running: /usr/sbin/pcs cluster 
token-nodes
:::141.108.38.46 - - [08/Jun/2015 19:57:16] GET /remote/check_auth 
HTTP/1.1 200 68 0.1919
:::141.108.38.46 - - [08/Jun/2015 19:57:16] GET /remote/check_auth 
HTTP/1.1 200 68 0.1920
atlas-node1.mydomain - - [08/Jun/2015:19:57:16 CEST] GET /remote/check_auth 
HTTP/1.1 200 68
- - /remote/check_auth


What am I doing wrong?
Thanks,

Alessandro

 Il giorno 08/giu/2015, alle ore 19:30, Soumya Koduri skod...@redhat.com ha 
 scritto:
 
 
 
 
 On 06/08/2015 08:20 PM, Alessandro De Salvo wrote:
 Sorry, just another question:
 
 - in my installation of gluster 3.7.1 the command gluster features.ganesha 
 enable does not work:
 
 # gluster features.ganesha enable
 unrecognized word: features.ganesha (position 0)
 
 Which version has full support for it?
 
 Sorry. This option has recently been changed. It is now
 
 $ gluster nfs-ganesha enable
 
 
 
 - in the documentation the ccs and cman packages are required, but they 
 seems not to be available anymore on CentOS 7 and similar, I guess they are 
 not really required anymore, as pcs should do the full job
 
 Thanks,
 
  Alessandro
 
 Looks like so from 

Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances

2015-06-08 Thread Geoffrey Letessier
Hi Ben

Here is the expected output:
[root@node048 ~]# iperf3 -c 10.0.4.1
Connecting to host 10.0.4.1, port 5201
[  4] local 10.0.5.48 port 44151 connected to 10.0.4.1 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  4]   0.00-1.00   sec  1.86 GBytes  15.9 Gbits/sec0   8.24 MBytes   
[  4]   1.00-2.00   sec  1.94 GBytes  16.7 Gbits/sec0   8.24 MBytes   
[  4]   2.00-3.00   sec  1.95 GBytes  16.8 Gbits/sec0   8.24 MBytes   
[  4]   3.00-4.00   sec  1.86 GBytes  16.0 Gbits/sec0   8.24 MBytes   
[  4]   4.00-5.00   sec  1.85 GBytes  15.8 Gbits/sec0   8.24 MBytes   
[  4]   5.00-6.00   sec  1.89 GBytes  16.2 Gbits/sec0   8.24 MBytes   
[  4]   6.00-7.00   sec  1.90 GBytes  16.3 Gbits/sec0   8.24 MBytes   
[  4]   7.00-8.00   sec  1.88 GBytes  16.1 Gbits/sec0   8.24 MBytes   
[  4]   8.00-9.00   sec  1.88 GBytes  16.2 Gbits/sec0   8.24 MBytes   
[  4]   9.00-10.00  sec  1.87 GBytes  16.1 Gbits/sec0   8.24 MBytes   
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Retr
[  4]   0.00-10.00  sec  18.9 GBytes  16.2 Gbits/sec0 sender
[  4]   0.00-10.00  sec  18.9 GBytes  16.2 Gbits/sec  receiver

iperf Done.

Here are all the shell commands I used for volume creation with the RDMA transport type:
gluster volume create vol_home replica 2 transport rdma,tcp 
ib-storage1:/export/brick_home/brick1/ ib-storage2:/export/brick_home/brick1/ 
ib-storage3:/export/brick_home/brick1/ ib-storage4:/export/brick_home/brick1/ 
ib-storage1:/export/brick_home/brick2/ ib-storage2:/export/brick_home/brick2/ 
ib-storage3:/export/brick_home/brick2/ ib-storage4:/export/brick_home/brick2/ 
force

and below the current volume information:
[root@lucifer ~]# gluster volume info vol_home
 
Volume Name: vol_home
Type: Distributed-Replicate
Volume ID: f6ebcfc1-b735-4a0e-b1d7-47ed2d2e7af6
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp,rdma
Bricks:
Brick1: ib-storage1:/export/brick_home/brick1
Brick2: ib-storage2:/export/brick_home/brick1
Brick3: ib-storage3:/export/brick_home/brick1
Brick4: ib-storage4:/export/brick_home/brick1
Brick5: ib-storage1:/export/brick_home/brick2
Brick6: ib-storage2:/export/brick_home/brick2
Brick7: ib-storage3:/export/brick_home/brick2
Brick8: ib-storage4:/export/brick_home/brick2
Options Reconfigured:
performance.stat-prefetch: on
performance.flush-behind: on
features.default-soft-limit: 90%
features.quota: on
diagnostics.brick-log-level: CRITICAL
auth.allow: localhost,127.0.0.1,10.*
nfs.disable: on
performance.cache-size: 64MB
performance.write-behind-window-size: 1MB
performance.quick-read: on
performance.io-cache: on
performance.io-thread-count: 64
nfs.enable-ino32: on

and below my mount command:
mount -t glusterfs -o transport=rdma,direct-io-mode=disable,enable-ino32 
ib-storage1:vol_home /home

I don't obtain any error with the RDMA option, but the transport type silently falls 
back to TCP.

Did I make any mistake in my settings?

Can you tell me more about block size and other tunings I should do on my RDMA 
volumes?
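
In the meantime, I guess I can compare block sizes myself with something like the 
following against the mount point (just a sketch, the file name is made up):

  dd if=/dev/zero of=/home/dd.test bs=1M count=1024 conv=fsync
  dd if=/dev/zero of=/home/dd.test bs=64k count=16384 conv=fsync
  rm -f /home/dd.test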

Thanks in advance,
Geoffrey
--
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@ibpc.fr

Le 8 juin 2015 à 18:22, Ben Turner btur...@redhat.com a écrit :

 - Original Message -
 From: Geoffrey Letessier geoffrey.letess...@cnrs.fr
 To: Ben Turner btur...@redhat.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, 
 gluster-users@gluster.org
 Sent: Monday, June 8, 2015 8:37:08 AM
 Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances
 
 Hello,
 
 Do you know more about?
 
 In addition, do you know how to "activate" RDMA for my volume with
 Intel/QLogic QDR? Currently, I mount my volumes with the RDMA transport-type
 option (both on the server and client side) but I notice all streams are using
 the TCP stack - and my bandwidth never exceeds 2.0-2.5Gb/s (250-300MB/s).
 
 That is a little slow for the HW you described.  Can you check what you get 
 with iperf just between the clients and servers? https://iperf.fr/  With 
 replica 2 and a 10G NW you should see ~400 MB / sec sequential writes and ~600 
 MB / sec reads.  Can you send me the output from gluster v info?  You specify 
 RDMA volumes at create time by running gluster v create blah transport rdma; 
 did you specify RDMA when you created the volume?  What block size are you 
 using in your tests?  1024 KB writes perform best with glusterfs, and as the 
 block size gets smaller perf will drop a little bit.  I wouldn't write in 
 anything under 4k blocks; the sweet spot is between 64k and 1024k.
 
 -b
 
 
 Thanks in advance,
 Geoffrey
 --
 Geoffrey Letessier
 

[Gluster-users] reading from local replica?

2015-06-08 Thread Brian Ericson

Am I misunderstanding cluster.read-subvolume/cluster.read-subvolume-index?

I have two regions, A and B, with servers a and b in, 
respectively, each region.  I have clients in both regions. Intra-region 
communication is fast, but the pipe between the regions is terrible.  
I'd like to minimize inter-region communication to as close to glusterfs 
write operations only as possible, and have reads go to the server in the 
region the client is running in.


I have created a replica volume as:
gluster volume create gv0 replica 2 a:/data/brick1/gv0 
b:/data/brick1/gv0 force


As a baseline, if I use scp to copy from the brick directly, I get -- 
for a 100M file -- times of about 6s if the client scps from the server 
in the same region and anywhere from 3 to 5 minutes if the client scps from 
the server in the other region.


I was under the impression (from something I read but can't now find) 
that glusterfs automatically picks the fastest replica, but that has not 
been my experience; glusterfs seems to generally prefer the server in 
the other region over the local one, with times usually in excess of 4 
minutes.


I've also tried having clients mount the volume using the xlator 
options cluster.read-subvolume and cluster.read-subvolume-index, but 
neither seem to have any impact.  Here are sample mount commands to show 
what I'm attempting:


mount -t glusterfs -o xlator-option=cluster.read-subvolume=gv0-client-<0 or 1> a:/gv0 /mnt/glusterfs
mount -t glusterfs -o xlator-option=cluster.read-subvolume-index=<0 or 1> a:/gv0 /mnt/glusterfs


Am I misunderstanding how glusterfs works, particularly when trying to 
read locally?  Is it possible to configure glusterfs to use a local 
replica (or the fastest replica) for reads?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] nfs-ganesha/samba vfs and replica redundancy

2015-06-08 Thread Ted Miller

On 6/3/2015 3:15 AM, Benjamin Kingston wrote:
Can someone give me a hint on the best way to maintain data availability to 
a share on a third system using nfs-ganesha and samba?


I currently have a round-robin DNS entry that nfs-ganesha/samba uses, 
however even with a short ttl, there's brief downtime when a replica node 
fails. I can't see in the samba VFS or ganesha fsal syntax where a 
secondary address can be provided.


I've tried comma separated, space separated, with/without quotes for 
multiple IPs and only seen issues.
Any reason you aren't using a floating IP address?  This isn't the newest 
talk, but the concepts have not changed: 
http://events.linuxfoundation.org/sites/events/files/lcjpcojp13_nakai.pdf
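
For what it's worth, with a pacemaker/pcs setup (like the ganesha HA one 
discussed elsewhere on this list) a floating address is just an IPaddr2 
resource, roughly (address and resource name made up):

  pcs resource create share_vip ocf:heartbeat:IPaddr2 ip=192.168.1.100 \
      cidr_netmask=24 op monitor interval=30s

CTDB is another common way to float an address for samba, but the idea is the 
same: clients follow one address that moves between the nodes.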


Ted Miller
Elkhart, IN, USA

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] The strange behavior whose common denominator is gluster

2015-06-08 Thread Ted Miller
Are you sure you have mounted the gluster volume, and are writing to the 
gluster volume, and NOT to the brick?  What you describe can happen when you 
write to the brick instead of the gluster volume. You can see here: 
http://www.gluster.org/community/documentation/index.php/QuickStart in steps 
6 and 7.  If you do not understand the difference, include the output of the 
'mount' command from one of your servers.
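
In other words (hypothetical names), the clients should be doing something like:

  mount -t glusterfs server1:/gv0 /mnt/gluster
  cp somefile /mnt/gluster/       # goes through gluster, replicated

and never:

  cp somefile /data/brick1/gv0/   # writes straight to the brick, bypassing gluster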

Ted Miller
Elkhart, IN, USA

On 6/5/2015 8:46 AM, Pablo Silva wrote:

Dear Colleagues:

We are using gluster versions 3.3.1-15.el6.x86_64 and GlusterFS-3.6.2-1.el5, 
and we have two types of service:


1) Apache httpd-2.2.3-91.el5.centos + GlusterFS-3.6.2-1.el5 (two bricks)

2) AS2 Mendelson B45 + gluster 3.3.1-15.el6.x86_64 (two bricks)

They are different services with a common problem, which I will explain.

Service N1 (Apache httpd-2.2.3-91.el5.centos + GlusterFS-3.6.2-1.el5 (two 
bricks))

---
We have a high-availability architecture in which two Apache servers see a 
directory that is hosted on a gluster volume. A while ago we had a problem 
where one Apache server could list the files and serve them for download, 
while the other Apache server, watching the same directory with the same 
files, was told by gluster that there were no files for download.


Files are fed into that gluster directory by MULE, asynchronously. In 
summary, one Apache server could access the files and the other was not aware 
of their existence, even though it is the same directory and the same files.


Service N2 (As2 Mendelson B45 + gluster 3.3.1-15.el6.x86_64 (two bricks) )
--
We have only one Mendelson AS2 Server B45 running with gluster (two bricks).
The operation of Mendelson is quite simple: it observes the presence of 
files in a directory every 5 seconds and sends them to the partner. The 
directory is hosted in gluster, and the issue is that every so often 
Mendelson AS2 does not become aware of the existence of files in the 
directory, even though if you enter the directory you can see they exist.


In both cases the services are different and the only common denominator is 
gluster. Is someone else experiencing this problem?


Have we not configured the gluster service well, and are we repeating the same 
mistake, or is it a bug?


Thanks in advance
Pablo


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


--
*Ted Miller*, Design Engineer
*SonSet Solutions*
(formerly HCJB Global Technology Center)
my desk +1 574.970.4272
receptionist +1 574.972.4252
http://sonsetsolutions.org

/Technology for abundant life!/
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS 3.6.1 breaks VM images on cluster node restart

2015-06-08 Thread Joe Julian
Unfortunately, when I restart every node in the cluster 
sequentially...qemu image of the HA VM gets corrupted...

Even client nodes?

Make sure that your client can connect to all of the servers.

Make sure, after you restart a server, that the self-heal finishes 
before you restart the next one. What I suspect is happening is that you 
restart server A, writes happen on server B. You restart server B before 
the heal has happened to copy the changes from server A to server B, 
thus causing the client to write changes to server B. When server A 
comes back, both server A and server B think they have changes for the 
other. This is a classic split-brain state.
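
A quick way to check that, before touching the next server, is something like:

  gluster volume heal pve-vol info

and wait until every brick reports zero entries (the output you pasted below 
already shows one pending entry per brick).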


On 06/04/2015 07:08 AM, Roger Lehmann wrote:

Hello, I'm having a serious problem with my GlusterFS cluster.
I'm using Proxmox 3.4 for highly available VM management, which works 
with GlusterFS as storage.
Unfortunately, when I restart every node in the cluster sequentially 
one by one (with online migration of the running HA VM first of 
course) the qemu image of the HA VM gets corrupted and the VM itself 
has problems accessing it.


May 15 10:35:09 blog kernel: [339003.942602] end_request: I/O error, 
dev vda, sector 2048
May 15 10:35:09 blog kernel: [339003.942829] Buffer I/O error on 
device vda1, logical block 0
May 15 10:35:09 blog kernel: [339003.942929] lost page write due to 
I/O error on vda1
May 15 10:35:09 blog kernel: [339003.942952] end_request: I/O error, 
dev vda, sector 2072
May 15 10:35:09 blog kernel: [339003.943049] Buffer I/O error on 
device vda1, logical block 3
May 15 10:35:09 blog kernel: [339003.943146] lost page write due to 
I/O error on vda1
May 15 10:35:09 blog kernel: [339003.943153] end_request: I/O error, 
dev vda, sector 4196712
May 15 10:35:09 blog kernel: [339003.943251] Buffer I/O error on 
device vda1, logical block 524333
May 15 10:35:09 blog kernel: [339003.943350] lost page write due to 
I/O error on vda1
May 15 10:35:09 blog kernel: [339003.943363] end_request: I/O error, 
dev vda, sector 4197184



After the image is broken, it's impossible to migrate the VM or start 
it when it's down.


root@pve2 ~ # gluster volume heal pve-vol info
Gathering list of entries to be healed on volume pve-vol has been 
successful


Brick pve1:/var/lib/glusterd/brick
Number of entries: 1
/images//200/vm-200-disk-1.qcow2

Brick pve2:/var/lib/glusterd/brick
Number of entries: 1
/images/200/vm-200-disk-1.qcow2

Brick pve3:/var/lib/glusterd/brick
Number of entries: 1
/images//200/vm-200-disk-1.qcow2



I couldn't really reproduce this in my test environment with GlusterFS 
3.6.2 but I had other problems while testing (may also be because of a 
virtualized test environment), so I don't want to upgrade to 3.6.2 
until I definitely know the problems I encountered are fixed in 3.6.2.
Anybody else experienced this problem? I'm not sure if issue 1161885 
(Possible file corruption on dispersed volumes) is the issue I'm 
experiencing. I have a 3 node replicate cluster.

Thanks for your help!

Regards,
Roger Lehmann
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Please advise for our file server cluster

2015-06-08 Thread Gao

On 15-06-05 04:30 PM, Gao wrote:

Hi,

We are a small business and we are planning to build a new file 
server system. I did some research and I decided to use GlusterFS as 
the cluster system to build a 2-node setup. Our goals are to 
minimize downtime and to avoid a single point of failure. Meanwhile, 
I need to keep an eye on the budget.


In our office we have 20+ computers running Ubuntu. A few (6) machines 
use Windows 8. We use a SAMBA server to take care of file sharing.


I did some research and here are some main components I selected for 
the system:
M/B: Asus P9D-E/4L (It has 6 SATA ports so I can use softRAID5 for 
data storage. 4 NIC ports so I can do link aggregation)
CPU: XEON E3-1220v3 3.1GHz (is this overkill? the MB also supports i3 
though.)

Memory: 4x8GB ECC DDR3
SSD: 120 GB for OS
Hard Drive: 4 (or 5) 3TB 7200RPM drive to form soft RAID5
10GBe card: Intel X540-T1

About the hardware I am not confident. One thing is the 10GBe card. Is 
it sufficient? I chose this because it's less expensive. But I don't 
want it to drag the system down once I build the nodes. Also, if I only need 2 
nodes, can I just use CAT6 cable to link them together? Or do I have to 
use a 10GBe switch?


Could someone give me some advice?

Thanks.

Gao





Any help? Please.

--

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] hadoop gluster

2015-06-08 Thread 中川智之
Hi everyone,

I want to use Hadoop 2.x with GlusterFS. For testing, I prepared the
following software.

Configuration
- CentOS 7.1(on VMWare)
- Gluster FS 3.7
- Hadoop 2.x
- Glusterfs-Hadoop plugin 2.3.13

The test process is this.

Test Process
1. Install CentOS 7.1 (3 VM machines)
  - 1 machine is the GlusterFS client and Hadoop NameNode
  - 2 machines are GlusterFS brick nodes and Hadoop DataNodes
2. Install GlusterFS 3.7 and configure the bricks.
3. Mount the Gluster volume from the client machine.
4. Install Hadoop 2.x and configure HDFS.
  * Hadoop is installed by root.
5. Start and test HDFS and MapReduce.
6. Configure Hadoop 2.0 with GlusterFS.

I finished and passed step 5, and started step 6 with these notes.

Note
http://www.gluster.org/community/documentation/index.php/Hadoop
https://forge.gluster.org/hadoop/pages/Configuration

But I don't entirely understand the procedure.
- How do I use glusterfs-hadoop-2.3.13.jar?
- Is it enough to put glusterfs-hadoop-2.3.13.jar in place and edit core-site.xml
for Hadoop to use GlusterFS?
- Once the configuration is finished, can Hadoop access the GlusterFS volume with
the hadoop command?
 ex) 
 $ bin/hadoop dfs -cat GlusterVolume/data
 $ bin/hadoop jar
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep
GlusteVolume/data GlusteVolume/dataout '[a-z.]+'

Please tell me how to configure this.
I'm a Hadoop beginner, and I'm not so good at English.


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] root squash and git clone

2015-06-08 Thread Prasun Gera
Has anyone noticed this? This is easily reproducible for me. It's a bit
strange though since a) git clone isn't/shouldn't be doing anything as root
and b) if it were, it would have failed similarly on a regular (non-glusterfs)
NFS mount with root squash for me, which it doesn't.

On Wed, Jun 3, 2015 at 7:22 AM, Prasun Gera prasun.g...@gmail.com wrote:

 Version: RHS 3.0

 I noticed that if server.root-squash is set on, clients get permissions
 errors on git commands like git clone. Is this a known issue ? I confirmed
 that the write permissions to the destination directories were correct, and
 normal writes were working fine. git clones would fail though with:

 error:unable to write sha1 filename 
 fatal :cannot store pack file
 fatal :index-pack failed

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] root squash and git clone

2015-06-08 Thread Ryan Clough
We had seen this at some point back in the Gluster 3.5.x days but have not
seen it since 3.6.x. If you are truly using fully licensed Red Hat Storage
then I would leverage Red Hat support directly.


___
¯\_(ツ)_/¯
Ryan Clough
Information Systems
Decision Sciences International Corporation
http://www.decisionsciencescorp.com/

On Mon, Jun 8, 2015 at 6:34 PM, Prasun Gera prasun.g...@gmail.com wrote:

 Anyone noticed this ? This is easily reproducible for me. It's a bit
 strange though since a) git clone isn't/shouldn't be doing anything as root
 and b) if it were, it would have failed similarly on a regular(not
 glusterfs) nfs mount with root squash for me, which it doesn't.

 On Wed, Jun 3, 2015 at 7:22 AM, Prasun Gera prasun.g...@gmail.com wrote:

 Version: RHS 3.0

 I noticed that if server.root-squash is set on, clients get permissions
 errors on git commands like git clone. Is this a known issue ? I confirmed
 that the write permissions to the destination directories were correct, and
 normal writes were working fine. git clones would fail though with:

 error:unable to write sha1 filename 
 fatal :cannot store pack file
 fatal :index-pack failed



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users


-- 
This email and its contents are confidential. If you are not the intended 
recipient, please do not disclose or use the information within this email 
or its attachments. If you have received this email in error, please report 
the error to the sender by return email and delete this communication from 
your records.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] hadoop gluster

2015-06-08 Thread Shubhendu Tripathi

You should submit jobs as a user other than yarn that is also a member of the 
hadoop group. We usually add a mapred user.

Also check your <Hadoop Home>/etc/hadoop/container-executor.cfg:

yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=yarn
min.user.id=1000
allowed.system.users=mapred

You'll want the mapred UID > 1000, or else adjust the setting in the file.
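
A sketch of what adding such a user could look like (UID picked arbitrarily; 
the hadoop group may already exist on your nodes):

  groupadd hadoop
  useradd -u 1001 -g hadoop mapred
  id mapred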

Regards,
Shubhendu

On 06/08/2015 07:44 AM, 中川智之 wrote:

Hi EveryOne.

I want to use Hadoop 2.x with GlusterFS. To testing, Prepare these
softwares.

Cnfiguration
- CentOS 7.1(on VMWare)
- Gluster FS 3.7
- Hadoop 2.x
- Glusterfs-Hadoop plugin 2.3.13

Test Process is this.

Test Process
1. Install CentOS 7.1. (3 VM Machine)
   - 1 Machine is GlusterFS Client, and hadoop Namenode
   - 2 Machine is GlusterFS BrickNode, and hadoop Datanode
2. Install GlusterFS 3.7, and cnfigure brick.
3. Mount Gluster volume from client Machine.
4. Install Hadoop 2.x, and HDFS configure.
   * Hadoop installed by root.
5. Starting and testing HDFS and Mapreduce.
6. Configure Hadoop 2.0 with GlusterFS

Finish and Succcess Test 5, and I started Test 6 with this Note.

Note
http://www.gluster.org/community/documentation/index.php/Hadoop
https://forge.gluster.org/hadoop/pages/Configuration

But I'm not entirely the way.
- How to use glusterfs-hadoop-2.3.13.jar ?
- Only put glusterfs-hadoop-2.3.13.jar and edit core-site.xml, Hadoop can
use on GlusterFS?
- If Finished Configuration, Hadoop can access on GlusterFS Volume by hadoop
command ?
  ex)
  $ bin/hadoop dfs -cat GlusterVolume/data
  $ bin/hadoop jar
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep
GlusteVolume/data GlusteVolume/dataout '[a-z.]+'

Please tll me the way of Configuring.
I'm a hadoop Beginner, and I'm not so good at English.





___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Errors in quota-crawl.log

2015-06-08 Thread Sachin Pandit
Hi Ryan,

Thank you for reporting this failure.
We will make sure to fix this as soon as possible.

Thanks,
Sachin Pandit.

- Original Message -
 From: Ryan Clough ryan.clo...@dsic.com
 To: Sachin Pandit span...@redhat.com
 Cc: gluster-users gluster-users@gluster.org, Vijaikumar M 
 vmall...@redhat.com
 Sent: Monday, June 8, 2015 11:21:25 PM
 Subject: Re: [Gluster-users] Errors in quota-crawl.log
 
 I have submitted a BZ:
 https://bugzilla.redhat.com/show_bug.cgi?id=1229422
 
 ___
 ¯\_(ツ)_/¯
 Ryan Clough
 Information Systems
 Decision Sciences International Corporation
 http://www.decisionsciencescorp.com/
 
 On Wed, Apr 8, 2015 at 1:49 AM, Sachin Pandit span...@redhat.com wrote:
 
 
  Please find the comments inline.
 
  - Original Message -
   From: Ryan Clough ryan.clo...@dsic.com
   To: gluster-users gluster-users@gluster.org
   Sent: Wednesday, April 8, 2015 9:59:55 AM
   Subject: Re: [Gluster-users] Errors in quota-crawl.log
  
   No takers? Seems like quota is working but when I see permission denied
   warnings it makes me wonder if the quota calculations are going to be
   accurate. Any help would be much appreciated.
  
   Ryan Clough
   Information Systems
   Decision Sciences International Corporation
  
   On Thu, Apr 2, 2015 at 12:43 PM, Ryan Clough  ryan.clo...@dsic.com 
  wrote:
  
  
  
   We are running the following operating system:
   Scientific Linux release 6.6 (Carbon)
  
   With the following kernel:
   2.6.32-504.3.3.el6.x86_64
  
   We are using the following version of Glusterfs:
   glusterfs-libs-3.6.2-1.el6.x86_64
   glusterfs-3.6.2-1.el6.x86_64
   glusterfs-cli-3.6.2-1.el6.x86_64
   glusterfs-api-3.6.2-1.el6.x86_64
   glusterfs-fuse-3.6.2-1.el6.x86_64
   glusterfs-server-3.6.2-1.el6.x86_64
  
   Here is the current configuration of our 2 node distribute only cluster:
   Volume Name: export_volume
   Type: Distribute
   Volume ID: c74cc970-31e2-4924-a244-4c70d958dadb
   Status: Started
   Number of Bricks: 2
   Transport-type: tcp
   Bricks:
   Brick1: hgluster01:/gluster_data
   Brick2: hgluster02:/gluster_data
   Options Reconfigured:
   performance.cache-size: 1GB
   diagnostics.brick-log-level: ERROR
   performance.stat-prefetch: on
   performance.write-behind: on
   performance.flush-behind: on
   features.quota-deem-statfs: on
   performance.quick-read: off
   performance.client-io-threads: on
   performance.read-ahead: on
   performance.io-thread-count: 24
   features.quota: on
   cluster.eager-lock: on
   nfs.disable: on
   auth.allow: 192.168.10.*,10.0.10.*,10.8.0.*,10.2.0.*,10.0.60.*
   server.allow-insecure: on
   performance.write-behind-window-size: 1MB
   network.ping-timeout: 60
   features.quota-timeout: 0
   performance.io-cache: off
   server.root-squash: on
   performance.readdir-ahead: on
  
   Here is the status of the nodes:
   Status of volume: export_volume
   Gluster process Port Online Pid
  
  --
   Brick hgluster01:/gluster_data 49152 Y 7370
   Brick hgluster02:/gluster_data 49152 Y 17868
   Quota Daemon on localhost N/A Y 2051
   Quota Daemon on hgluster02.red.dsic.com N/A Y 6691
  
   Task Status of Volume export_volume
  
  --
   There are no active volume tasks
  
   I have just turned quota on and was watching the quota-crawl.log and see
  a
   bunch of these type of messages:
  
   [2015-04-02 19:23:01.540692] W [fuse-bridge.c:483:fuse_entry_cbk]
   0-glusterfs-fuse: 2338683: LOOKUP() /\ = -1 (Permission denied)
  
   [2015-04-02 19:23:01.543565] W
  [client-rpc-fops.c:2766:client3_3_lookup_cbk]
   0-export_volume-client-1: remote operation failed: Permission denied.
  Path:
   /\ (----)
  
   [2015-04-02 17:58:14.090556] W
  [client-rpc-fops.c:2766:client3_3_lookup_cbk]
   0-export_volume-client-0: remote operation failed: Permission denied.
  Path:
   /\ (----)
  
   Should I be worried about this and how do I go about fixing the
  permissions?
   Is this a bug and should it be reported?
 
  Hi Ryan,
 
  Apologies for the late reply. Looking at the description of the problem
  I don't think there will be any problem. I think its better if we track
  this problem using a bug. If you have already raised a bug then please
  do provide us a bug-id, or else we will raise a new bug.
 
  I have one question: Looking at the path /\ , do you have a directory
  with similar path, as we can see accessing that has failed?
 
  Thanks,
  Sachin.
 
 
 
  
   Thanks, in advance, for your time to help me.
   Ryan Clough
   Information Systems
   Decision Sciences International Corporation
  
  
   This email and its contents are confidential. If you are not the intended
   recipient, please do not disclose or use the information within this
  email
   

Re: [Gluster-users] root squash and git clone

2015-06-08 Thread Prasun Gera
I am using it through my school's Satellite subscription, so I don't have a
direct interface with RHN support. While it might be theoretically possible
to escalate it to RHN, it would be much easier if I can figure this out on
my own.

On Mon, Jun 8, 2015 at 7:31 PM, Ryan Clough ryan.clo...@dsic.com wrote:

 We had seen this at some point back in the Gluster 3.5.x days but have not
 seen it since 3.6.x. If you are truly using fully licensed Red Hat Storage
 then I would leverage Red Hat support directly.


 ___
 ¯\_(ツ)_/¯
 Ryan Clough
 Information Systems
 Decision Sciences International Corporation
 http://www.decisionsciencescorp.com/

 On Mon, Jun 8, 2015 at 6:34 PM, Prasun Gera prasun.g...@gmail.com wrote:

 Anyone noticed this ? This is easily reproducible for me. It's a bit
 strange though since a) git clone isn't/shouldn't be doing anything as root
 and b) if it were, it would have failed similarly on a regular(not
 glusterfs) nfs mount with root squash for me, which it doesn't.

 On Wed, Jun 3, 2015 at 7:22 AM, Prasun Gera prasun.g...@gmail.com
 wrote:

 Version: RHS 3.0

 I noticed that if server.root-squash is set on, clients get permissions
 errors on git commands like git clone. Is this a known issue ? I confirmed
 that the write permissions to the destination directories were correct, and
 normal writes were working fine. git clones would fail though with:

 error:unable to write sha1 filename 
 fatal :cannot store pack file
 fatal :index-pack failed



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users



 This email and its contents are confidential. If you are not the intended
 recipient, please do not disclose or use the information within this email
 or its attachments. If you have received this email in error, please report
 the error to the sender by return email and delete this communication from
 your records.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] One host won't rebalance

2015-06-08 Thread Nithya Balachandran
The rebalance failures appear to be because the connection to subvolume 
bigdata2-client-8 was lost. Rebalance will stop if any dht subvolume goes down. 
From the logs:

[2015-06-04 23:24:36.714719] I [client.c:2215:client_rpc_notify] 
0-bigdata2-client-8: disconnected from bigdata2-client-8. Client process will 
keep trying to connect to glusterd until brick's port is available
[2015-06-04 23:24:36.714734] W [dht-common.c:5953:dht_notify] 0-bigdata2-dht: 
Received CHILD_DOWN. Exiting
[2015-06-04 23:24:36.714745] I [MSGID: 109029] 
[dht-rebalance.c:2136:gf_defrag_stop] 0-: Received stop command on rebalance


Did anything happen to the brick process for 0-bigdata2-client-8 that would 
cause this? The brick logs might help here.
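
A quick way to check whether that brick process is still up (and on which port) 
would be something like:

  gluster volume status bigdata2

on one of the nodes; if the brick shows N in the Online column, its brick log 
should tell us why it went down.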


I need to look into why the rebalance never proceeded on gluster-6. The logs 
show the following :

[2015-06-03 15:18:17.905569] W [client-handshake.c:1109:client_setvolume_cbk] 
0-bigdata2-client-1: failed to set the volume (Permission denied)
[2015-06-03 15:18:17.905583] W [client-handshake.c:1135:client_setvolume_cbk] 
0-bigdata2-client-1: failed to get 'process-uuid' from reply dict
[2015-06-03 15:18:17.905592] E [client-handshake.c:1141:client_setvolume_cbk] 
0-bigdata2-client-1: SETVOLUME on remote-host failed: Authentication


for all subvols on gluster-6. Can you send us the brick logs for those as well?

Thanks,
Nithya




- Original Message -
 From: Branden Timm bt...@wisc.edu
 To: Nithya Balachandran nbala...@redhat.com
 Cc: gluster-users@gluster.org
 Sent: Saturday, 6 June, 2015 12:20:53 AM
 Subject: Re: [Gluster-users] One host won't rebalance
 
 Update on this.  After two out of three servers entered failed state during
 rebalance, and the third hadn't done anything yet, I cancelled the
 rebalance.  I then stopped/started the volume, and ran rebalance fix-layout.
 As of this point, it is running on all three servers successfully.
 
 Once fix-layout is done I will attempt another data rebalance and update this
 list with the results.
 
 
 
 
 From: gluster-users-boun...@gluster.org gluster-users-boun...@gluster.org
 on behalf of Branden Timm bt...@wisc.edu
 Sent: Friday, June 5, 2015 10:38 AM
 To: Nithya Balachandran
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] One host won't rebalance
 
 Sure, here is gluster volume info:
 
 Volume Name: bigdata2
 Type: Distribute
 Volume ID: 2cd214fa-6fa4-49d0-93f6-de2c510d4dd4
 Status: Started
 Number of Bricks: 15
 Transport-type: tcp
 Bricks:
 Brick1: gluster-6.redacted:/gluster/brick1/data
 Brick2: gluster-6.redacted:/gluster/brick2/data
 Brick3: gluster-6.redacted:/gluster/brick3/data
 Brick4: gluster-6.redacted:/gluster/brick4/data
 Brick5: gluster-7.redacted:/gluster/brick1/data
 Brick6: gluster-7.redacted:/gluster/brick2/data
 Brick7: gluster-7.redacted:/gluster/brick3/data
 Brick8: gluster-7.redacted:/gluster/brick4/data
 Brick9: gluster-8.redacted:/gluster/brick1/data
 Brick10: gluster-8.redacted:/gluster/brick2/data
 Brick11: gluster-8.redacted:/gluster/brick3/data
 Brick12: gluster-8.redacted:/gluster/brick4/data
 Brick13: gluster-7.redacted:/gluster-sata/brick1/data
 Brick14: gluster-8.redacted:/gluster-sata/brick1/data
 Brick15: gluster-6.redacted:/gluster-sata/brick1/data
 Options Reconfigured:
 cluster.readdir-optimize: on
 performance.enable-least-priority: off
 
 Attached is a tarball containing logs for gluster-6, 7 and 8. I should also
 note that as of this morning, the two hosts that were successfully running
 the rebalance show as failed, while the affected host still is sitting at 0
 secs progress:
 
  Node                 Rebalanced-files     size    scanned  failures  skipped       status  run time in secs
  -------------------  ----------------  -------   --------  --------  -------  -----------  ----------------
  localhost                           0   0Bytes          0         0        0  in progress              0.00
  gluster-7.glbrc.org              3020   19.4TB      12730         4        0       failed         105165.00
  gluster-8.glbrc.org                 0   0Bytes          0         0        0       failed              0.00
  volume rebalance: bigdata2: success:
 
 Thanks!
 
 
 From: Nithya Balachandran nbala...@redhat.com
 Sent: Friday, June 5, 2015 4:46 AM
 To: Branden Timm
 Cc: Atin Mukherjee; gluster-users@gluster.org
 Subject: Re: [Gluster-users] One host won't rebalance
 
 Hi,
 
 Can you send us the gluster volume info for the volume and the rebalance log
 for the nodes? What is the pid of the process which does not proceed?
 
 Thanks,
 Nithya
 
 - Original Message -
  From: Atin Mukherjee amukh...@redhat.com
  To: Branden Timm bt...@wisc.edu, Atin Mukherjee
  atin.mukherje...@gmail.com
  Cc: gluster-users@gluster.org
  Sent: Friday, June 5, 2015 9:26:44 AM
  Subject: Re: [Gluster-users] One host won't rebalance
 
 
 
  On 06/05/2015 12:05 

Re: [Gluster-users] Double counting of quota

2015-06-08 Thread Vijaikumar M

Hi Alessandro,

Please provide the test case, so that we can try to re-create this 
problem in-house.


Thanks,
Vijay

On Saturday 06 June 2015 05:59 AM, Alessandro De Salvo wrote:

Hi,
just to answer myself: it really seems the temp files from rsync are the 
culprit. Their size is added to the real contents of the directories I'm 
synchronizing, or in other words their size is not removed from the used size 
after they are removed. I suppose this is somehow connected to the removexattr 
error I'm seeing. The temporary solution I've found is to use rsync with the 
option that writes the temp files to /tmp, but it would be very interesting to 
understand why this is happening.
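
(For reference, the rsync option I mean is the temp-dir one, i.e. something 
along the lines of rsync -a --temp-dir=/tmp SRC/ DST/, with SRC and DST 
standing in for my real paths.)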
Cheers,

Alessandro


Il giorno 06/giu/2015, alle ore 01:19, Alessandro De Salvo 
alessandro.desa...@roma1.infn.it ha scritto:

Hi,
I currently have two brick with replica 2 on the same machine, pointing to 
different disks of a connected SAN.
The volume itself is fine:

# gluster volume info atlas-home-01

Volume Name: atlas-home-01
Type: Replicate
Volume ID: 660db960-31b8-4341-b917-e8b43070148b
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: host1:/bricks/atlas/home02/data
Brick2: host2:/bricks/atlas/home01/data
Options Reconfigured:
performance.write-behind-window-size: 4MB
performance.io-thread-count: 32
performance.readdir-ahead: on
server.allow-insecure: on
nfs.disable: true
features.quota: on
features.inode-quota: on


However, when I set a quota on a dir of the volume, the size shown is twice the 
physical size of the actual dir:

# gluster volume quota atlas-home-01 list /user1
  Path   Hard-limit Soft-limit   Used  
Available  Soft-limit exceeded? Hard-limit exceeded?
---
/user14.0GB   80%   3.2GB 853.4MB   
   No   No

# du -sh /storage/atlas/home/user1
1.6G/storage/atlas/home/user1

If I remove one of the bricks the quota shows the correct value.
Is there any double counting in case the bricks are on the same machine?
Also, I see a lot of errors in the logs like the following:

[2015-06-05 21:59:27.450407] E [posix-handle.c:157:posix_make_ancestryfromgfid] 
0-atlas-home-01-posix: could not read the link from the gfid handle 
/bricks/atlas/home01/data/.glusterfs/be/e5/bee5e2b8-c639-4539-a483-96c19cd889eb 
(No such file or directory)

and also

[2015-06-05 22:52:01.112070] E [marker-quota.c:2363:mq_mark_dirty] 
0-atlas-home-01-marker: failed to get inode ctx for /user1/file1

When running rsync I also see the following errors:

[2015-06-05 23:06:22.203968] E [marker-quota.c:2601:mq_remove_contri] 
0-atlas-home-01-marker: removexattr 
trusted.glusterfs.quota.fddf31ba-7f1d-4ba8-a5ad-2ebd6e4030f3.contri failed for 
/user1/..bashrc.O4kekp: No data available

Those files are the temp files of rsync; I'm not sure why they throw errors in 
glusterfs.
Any help?
Thanks,

Alessandro


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] using a preferred node ?

2015-06-08 Thread Mathieu Chateau
This slide deck (maybe outdated) says that reads are also balanced (in the
replication scenario, slide 22):
http://www.gluster.org/community/documentation/images/8/80/GlusterFS_Architecture_%26_Roadmap-Vijay_Bellur-LinuxCon_EU_2013.pdf

Except for write, having an option to do only failover for reads  lookup
would be possible I guess ?


Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-06-08 8:11 GMT+02:00 Ravishankar N ravishan...@redhat.com:



 On 06/08/2015 11:34 AM, Mathieu Chateau wrote:

 Hello Ravi,

  thanks for clearing things up.

  Anything on the roadmap that would help my case?



 I don't think it would be possible for clients to do I/O only on its local
 brick and yet expect the bricks' contents to be in sync in real-time..



   Cordialement,
 Mathieu CHATEAU
 http://www.lotp.fr

 2015-06-08 6:37 GMT+02:00 Ravishankar N ravishan...@redhat.com:



 On 06/06/2015 12:49 AM, Mathieu Chateau wrote:

 Hello,

  sorry to bother again but I am still facing this issue.

  client still looks on the other side and not using the node declared
 in fstab:
 prd-sta-sto01:/gluster-preprod /mnt/gluster-preprod glusterfs
 defaults,_netdev,backupvolfile-server=prd-sta-sto02 0 0

  I expect client to use sto01 and not sto02 as it's available.


  Hi Mathieu,
 When you do lookups (`ls` etc), they are sent to both bricks of the
 replica. If you write to a file, the write is also sent to both bricks.
 This is how it works. Only reads are served from the local brick.
 -Ravi



  If I add a static route to break connectivity to sto02 and do a df, I
 have around 30s before it works.
 Then it works ok.

  Questions:

- How to force node to stick as possible with one specific (local)
node ?
- How to know where a client is currently connected?

 Thanks for your help :)


  Cordialement,
 Mathieu CHATEAU
 http://www.lotp.fr

 2015-05-11 7:26 GMT+02:00 Mathieu Chateau mathieu.chat...@lotp.fr:

 Hello,

  thanks for helping :)

  If gluster server is rebooted, any way to make client failback on node
 after reboot ?

  How to know which node is using a client ? I see TCP connection to
 both node

  Regards,

  Cordialement,
 Mathieu CHATEAU
 http://www.lotp.fr

 2015-05-11 7:13 GMT+02:00 Ravishankar N ravishan...@redhat.com:



 On 05/10/2015 08:29 PM, Mathieu Chateau wrote:

 Hello,

  Short way: Is there any way to define a preferred Gluster server ?

  Long way:
 I have the following setup (version 3.6.3) :

  Gluster A  == VPN == Gluster B

  Volume is replicated between A and B.

  They are in same datacenter, using a 1Gb/s connection, low latency
 (0.5ms)

  I have gluster clients in lan A  B.

  When doing a ls on big folder (~60k files), both gluster node are
 used, and so it need 9mn instead on 1mn if only the local gluster is
 reachable.


  Lookups (and writes of course) from clients are sent to both  bricks
 because AFR uses the result of the lookup to select which brick to read
 from if there is a pending heal etc.
 If the file is clean on both A and B, then reads are always served from
 the local brick. i.e. reads on clients mounted on A will be served from the
 brick in A (and likewise for B).

 Hope that helps,
 Ravi


   It's HA setup, application is present on both side. I would like a
 master/master setup, but using only local node as possible.


  Regards,
 Mathieu CHATEAU
 http://www.lotp.fr


  ___
 Gluster-users mailing 
 listGluster-users@gluster.orghttp://www.gluster.org/mailman/listinfo/gluster-users








___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Monitorig gluster 3.6.1

2015-06-08 Thread M S Vishwanath Bhat
On 1 June 2015 at 12:28, Félix de Lelelis felix.deleli...@gmail.com wrote:

 Hi,

 I am monitoring gluster with scripts that launch other scripts. All scripts
 are redirected to one script that checks whether any glusterd process is
 active and, if the response is false, launches the check.

 All checks are:

- gluster volume volname info
- gluster volume heal volname info
- gluster volume heal volname split-brain
- gluster volume volname status detail
- gluster volume volname statistics

 Since I enabled the monitoring on our pre-production gluster, gluster has gone
 down 2 times. We suspect that the monitoring is overloading it, but it should
 not.

 The question is: is there any way to check those states otherwise?


You can make use of https://github.com/keithseahus/fluent-plugin-glusterfs
as well.

http://docs.fluentd.org/articles/collect-glusterfs-logs
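
If the CLI checks themselves are suspected of overloading glusterd, a cheaper
gate in front of them can help; a minimal sketch (the volume name volname and
the output path are placeholders):

    #!/bin/bash
    # cheap liveness check first; only run the heavier CLI query when glusterd is up
    if ! pidof glusterd >/dev/null; then
        echo "CRITICAL: glusterd not running"
        exit 2
    fi
    # --xml output is easier to parse than scraping the human-readable listing
    gluster volume status volname detail --xml > /tmp/volname-status.xml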

HTH

Best Regards,
Vishwanath



 Thanks

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Cannot start Gluster -- resolve brick failed in restore

2015-06-08 Thread shacky
Hi.
I have a GlusterFS cluster running on a Debian Wheezy with GlusterFS
3.6.2, with one volume on all three bricks (web1, web2, web3).
All was working well until I changed the IP addresses of the bricks; since then
only the GlusterFS daemon on web1 starts correctly, and the daemons on web2 and
web3 exit with these errors:

[2015-06-08 07:59:15.929330] I [MSGID: 100030]
[glusterfsd.c:2018:main] 0-/usr/sbin/glusterd: Started running
/usr/sbin/glusterd version 3.6.2 (args: /usr/sbin/glusterd -p
/var/run/glusterd.pid)
[2015-06-08 07:59:15.932417] I [glusterd.c:1214:init] 0-management:
Maximum allowed open file descriptors set to 65536
[2015-06-08 07:59:15.932482] I [glusterd.c:1259:init] 0-management:
Using /var/lib/glusterd as working directory
[2015-06-08 07:59:15.933772] W [rdma.c:4221:__gf_rdma_ctx_create]
0-rpc-transport/rdma: rdma_cm event channel creation failed (No such
device)
[2015-06-08 07:59:15.933815] E [rdma.c:4519:init] 0-rdma.management:
Failed to initialize IB Device
[2015-06-08 07:59:15.933838] E
[rpc-transport.c:333:rpc_transport_load] 0-rpc-transport: 'rdma'
initialization failed
[2015-06-08 07:59:15.933887] W [rpcsvc.c:1524:rpcsvc_transport_create]
0-rpc-service: cannot create listener, initing the transport failed
[2015-06-08 07:59:17.354500] I
[glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd:
retrieved op-version: 30600
[2015-06-08 07:59:17.527377] I
[glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
0-management: connect returned 0
[2015-06-08 07:59:17.527446] I
[glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
0-management: connect returned 0
[2015-06-08 07:59:17.527499] I
[rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting
frame-timeout to 600
[2015-06-08 07:59:17.528139] I
[rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting
frame-timeout to 600
[2015-06-08 07:59:17.528861] E
[glusterd-store.c:4244:glusterd_resolve_all_bricks] 0-glusterd:
resolve brick failed in restore
[2015-06-08 07:59:17.528891] E [xlator.c:425:xlator_init]
0-management: Initialization of volume 'management' failed, review
your volfile again
[2015-06-08 07:59:17.528906] E [graph.c:322:glusterfs_graph_init]
0-management: initializing translator failed
[2015-06-08 07:59:17.528917] E [graph.c:525:glusterfs_graph_activate]
0-graph: init failed
[2015-06-08 07:59:17.529257] W [glusterfsd.c:1194:cleanup_and_exit]
(-- 0-: received signum (0), shutting down

Please note that the brick names are set in /etc/hosts and all of them resolve
correctly to the new IP addresses, so I cannot find out where the problem is.

Could you help me please?

Thank you very much!
Bye
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] using a preferred node ?

2015-06-08 Thread Mathieu Chateau
Hello Ravi,

thanks for clearing things up.

Anything on the roadmap that would help my case?

Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-06-08 6:37 GMT+02:00 Ravishankar N ravishan...@redhat.com:



 On 06/06/2015 12:49 AM, Mathieu Chateau wrote:

 Hello,

  sorry to bother again but I am still facing this issue.

  client still looks on the other side and not using the node declared
 in fstab:
 prd-sta-sto01:/gluster-preprod /mnt/gluster-preprod glusterfs
 defaults,_netdev,backupvolfile-server=prd-sta-sto02 0 0

  I expect client to use sto01 and not sto02 as it's available.


 Hi Mathieu,
 When you do lookups (`ls` etc), they are sent to both bricks of the
 replica. If you write to a file, the write is also sent to both bricks.
 This is how it works. Only reads are served from the local brick.
 -Ravi



  If I add a static route to break connectivity to sto02 and do a df, I
 have around 30s before it works.
 Then it works ok.

  Questions:

- How to force node to stick as possible with one specific (local)
node ?
- How to know where a client is currently connected?

 Thanks for your help :)


  Cordialement,
 Mathieu CHATEAU
 http://www.lotp.fr

 2015-05-11 7:26 GMT+02:00 Mathieu Chateau mathieu.chat...@lotp.fr:

 Hello,

  thanks for helping :)

  If gluster server is rebooted, any way to make client failback on node
 after reboot ?

  How to know which node is using a client ? I see TCP connection to both
 node

  Regards,

  Cordialement,
 Mathieu CHATEAU
 http://www.lotp.fr

 2015-05-11 7:13 GMT+02:00 Ravishankar N ravishan...@redhat.com:



 On 05/10/2015 08:29 PM, Mathieu Chateau wrote:

 Hello,

  Short way: Is there any way to define a preferred Gluster server ?

  Long way:
 I have the following setup (version 3.6.3) :

  Gluster A  == VPN == Gluster B

  Volume is replicated between A and B.

  They are in same datacenter, using a 1Gb/s connection, low latency
 (0.5ms)

  I have gluster clients in lan A  B.

  When doing a ls on big folder (~60k files), both gluster node are
 used, and so it need 9mn instead on 1mn if only the local gluster is
 reachable.


  Lookups (and writes of course) from clients are sent to both  bricks
 because AFR uses the result of the lookup to select which brick to read
 from if there is a pending heal etc.
 If the file is clean on both A and B, then reads are always served from
 the local brick. i.e. reads on clients mounted on A will be served from the
 brick in A (and likewise for B).

 Hope that helps,
 Ravi


   It's HA setup, application is present on both side. I would like a
 master/master setup, but using only local node as possible.


  Regards,
 Mathieu CHATEAU
 http://www.lotp.fr


  ___
 Gluster-users mailing 
 listGluster-users@gluster.orghttp://www.gluster.org/mailman/listinfo/gluster-users






___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] HA storage based on two nodes with one point of failure

2015-06-08 Thread Юрий Полторацкий
2015-06-08 8:32 GMT+03:00 Ravishankar N ravishan...@redhat.com:



 On 06/08/2015 02:38 AM, Юрий Полторацкий wrote:

 Hi,

 I have built a lab with the config listed below and got an unexpected result.
 Someone please tell me where I went wrong.

 I am testing oVirt. The Data Center has two clusters: the first is a computing
 cluster with three nodes (node1, node2, node3); the second is a storage
 cluster (node5, node6) based on glusterfs (replica 2).

 I want the storage to be HA. I have read here
 https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html
 next:
 For a replicated volume with two nodes and one brick on each machine, if
 the server-side quorum is enabled and one of the nodes goes offline, the
 other node will also be taken offline because of the quorum configuration.
 As a result, the high availability provided by the replication is
 ineffective. To prevent this situation, a dummy node can be added to the
 trusted storage pool which does not contain any bricks. This ensures that
 even if one of the nodes which contains data goes offline, the other node
 will remain online. Note that if the dummy node and one of the data nodes
 goes offline, the brick on other node will be also be taken offline, and
 will result in data unavailability.

 So, I have added my Engine (not self-hosted) as a dummy node without a
 brick and have configured quorum as listed below:
 cluster.quorum-type: fixed
 cluster.quorum-count: 1
 cluster.server-quorum-type: server
 cluster.server-quorum-ratio: 51%


 Then I ran a VM and dropped the network link from node6; after about an hour I
 switched the link back, and after a while I got a split-brain. But why? No one
 could write to the brick on node6: the VM was running on node3 and node1 was
 SPM.



 It could have happened that after node6 came up, the client(s) saw a
 temporary disconnect of node 5 and a write happened at that time. When the
 node 5 is connected again, we have AFR xattrs on both nodes blaming each
 other, causing split-brain. For a replica 2 setup, it is best to set the
 client-quorum to auto instead of fixed. What this means is that the first
 node of the replica must always be up for writes to be permitted. If the
 first node goes down, the volume becomes read-only.
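
 As a sketch, switching the volume from this thread to that behaviour would look
 like this (resetting quorum-count is an assumption here, since it was
 previously forced to 1):

     gluster volume set vol3 cluster.quorum-type auto
     gluster volume reset vol3 cluster.quorum-count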

Yes, at first I tested with client-quorum auto, but my VMs were paused when the
first node went down, and this is not acceptable.

OK, I understand: there is no way to have fault-tolerant storage with only two
servers using GlusterFS. I have to get another one.

Thanks.


 For better availability , it would be better to use a replica 3 volume
 with (again with client-quorum set to auto). If you are using glusterfs
 3.7, you can also consider using the arbiter configuration [1] for replica
 3.

 [1]
 https://github.com/gluster/glusterfs/blob/master/doc/features/afr-arbiter-volumes.md
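
 For illustration, a minimal sketch of creating a fresh replica 3 arbiter
 volume in this layout (GlusterFS 3.7+; the third host and all brick paths are
 placeholders, not from this thread):

     gluster volume create vol3-arb replica 3 arbiter 1 \
         node5.virt.local:/storage/brick12-arb \
         node6.virt.local:/storage/brick13-arb \
         node7.virt.local:/storage/arbiter-brick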

 Thanks,
 Ravi


  Gluster's log from node6:
 Июн 07 15:35:06 node6.virt.local etc-glusterfs-glusterd.vol[28491]:
 [2015-06-07 12:35:06.106270] C [MSGID: 106002]
 [glusterd-server-quorum.c:356:glusterd_do_volume_quorum_action]
 0-management: Server quorum lost for volume vol3. Stopping local bricks.
 Июн 07 16:30:06 node6.virt.local etc-glusterfs-glusterd.vol[28491]:
 [2015-06-07 13:30:06.261505] C [MSGID: 106003]
 [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
 0-management: Server quorum regained for volume vol3. Starting local bricks.


 gluster volume heal vol3 info
 Brick node5.virt.local:/storage/brick12/
 /5d0bb2f3-f903-4349-b6a5-25b549affe5f/dom_md/ids - Is in split-brain

 Number of entries: 1

 Brick node6.virt.local:/storage/brick13/
 /5d0bb2f3-f903-4349-b6a5-25b549affe5f/dom_md/ids - Is in split-brain

 Number of entries: 1


 gluster volume info vol3

 Volume Name: vol3
 Type: Replicate
 Volume ID: 69ba8c68-6593-41ca-b1d9-40b3be50ac80
 Status: Started
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: node5.virt.local:/storage/brick12
 Brick2: node6.virt.local:/storage/brick13
 Options Reconfigured:
 storage.owner-gid: 36
 storage.owner-uid: 36
 cluster.server-quorum-type: server
 cluster.quorum-type: fixed
 network.remote-dio: enable
 cluster.eager-lock: enable
 performance.stat-prefetch: off
 performance.io-cache: off
 performance.read-ahead: off
 performance.quick-read: off
 auth.allow: *
 user.cifs: disable
 nfs.disable: on
 performance.readdir-ahead: on
 cluster.quorum-count: 1
 cluster.server-quorum-ratio: 51%



 06.06.2015 12:09, Юрий Полторацкий пишет:

 Hi,

 I want to build HA storage based on two servers. I want my storage to remain
 available in RW mode if one of them goes down.

 If I use replica 2, then split-brain can occur. To avoid this I would use a
 quorum. If I understand correctly, I can use quorum on the client side, on the
 server side, or on both. I want to add a dummy node without a brick and use a
 config like this:

 cluster.quorum-type: fixed
 

Re: [Gluster-users] using a preferred node ?

2015-06-08 Thread Ravishankar N



On 06/08/2015 11:34 AM, Mathieu Chateau wrote:

Hello Ravi,

thanks for clearing things up.

Anything on the roadmap that would help my case?




I don't think it would be possible for clients to do I/O only on its 
local brick and yet expect the bricks' contents to be in sync in real-time..




Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-06-08 6:37 GMT+02:00 Ravishankar N ravishan...@redhat.com 
mailto:ravishan...@redhat.com:




On 06/06/2015 12:49 AM, Mathieu Chateau wrote:

Hello,

sorry to bother again but I am still facing this issue.

client still looks on the other side and not using the node
declared in fstab:
prd-sta-sto01:/gluster-preprod /mnt/gluster-preprod glusterfs
defaults,_netdev,backupvolfile-server=prd-sta-sto02 0 0

I expect client to use sto01 and not sto02 as it's available.


Hi Mathieu,
When you do lookups (`ls` etc), they are sent to both bricks of
the replica. If you write to a file, the write is also sent to
both bricks. This is how it works. Only reads are served from the
local brick.
-Ravi




If I add a static route to break connectivity to sto02 and do a
df, I have around 30s before it works.
Then it works ok.

Questions:

  * How to force node to stick as possible with one specific
(local) node ?
  * How to know where a client is currently connected?

Thanks for your help :)


Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-05-11 7:26 GMT+02:00 Mathieu Chateau
mathieu.chat...@lotp.fr mailto:mathieu.chat...@lotp.fr:

Hello,

thanks for helping :)

If gluster server is rebooted, any way to make client
failback on node after reboot ?

How to know which node is using a client ? I see TCP
connection to both node

Regards,

Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-05-11 7:13 GMT+02:00 Ravishankar N
ravishan...@redhat.com mailto:ravishan...@redhat.com:



On 05/10/2015 08:29 PM, Mathieu Chateau wrote:

Hello,

Short way: Is there any way to define a preferred
Gluster server ?

Long way:
I have the following setup (version 3.6.3) :

Gluster A  == VPN == Gluster B

Volume is replicated between A and B.

They are in same datacenter, using a 1Gb/s connection,
low latency (0.5ms)

I have gluster clients in lan A  B.

When doing a ls on big folder (~60k files), both
gluster node are used, and so it need 9mn instead on 1mn
if only the local gluster is reachable.



Lookups (and writes of course) from clients are sent to
both  bricks because AFR uses the result of the lookup to
select which brick to read from if there is a pending
heal etc.
If the file is clean on both A and B, then reads are
always served from the local brick. i.e. reads on clients
mounted on A will be served from the brick in A (and
likewise for B).

Hope that helps,
Ravi



It's HA setup, application is present on both side. I
would like a master/master setup, but using only local
node as possible.


Regards,
Mathieu CHATEAU
http://www.lotp.fr


___
Gluster-users mailing list
Gluster-users@gluster.org  mailto:Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users









___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Monitorig gluster 3.6.1

2015-06-08 Thread Humble Devassy Chirammal
You may use gluster nagios plugin for monitoring purpose.

You can get more details from here :
http://www.gluster.org/pipermail/gluster-users/2014-June/017819.html

--Humble


On Mon, Jun 8, 2015 at 12:42 PM, M S Vishwanath Bhat msvb...@gmail.com
wrote:



 On 1 June 2015 at 12:28, Félix de Lelelis felix.deleli...@gmail.com
 wrote:

 Hi,

 I have monitoring gluster with scripts that lunch scripts. All scripts
 are redirected to a one script that check if is active any process glusterd
 and if the repsonse its false, the script lunch the check.

 All checks are:

- gluster volume volname info
- gluster volume heal volname info
- gluster volume heal volname split-brain
- gluster volume volname status detail
- gluster volume volname statistics

 Since I enable the monitoring in our pre-production gluster, the gluster
 is down 2 times. We  suspect that the monitoring are overloading but should
 not.

 The question is, there any way to check those states otherwise?


 You can make use of https://github.com/keithseahus/fluent-plugin-glusterfs
 as well.

 http://docs.fluentd.org/articles/collect-glusterfs-logs

 HTH

 Best Regards,
 Vishwanath



 Thanks

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users



 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Double counting of quota

2015-06-08 Thread Alessandro De Salvo
Hi Vijay,
the use case is very simple.
I'm using gluster 3.7.1 with a replicated volume (replica 2), with quota enabled.
The 2 bricks are on the same machine but use two disks of the same size;
unfortunately one of them is slower, but I think that is irrelevant. Although I
do not think it matters for this use case, I want to note that the volumes are
xfs-formatted with default options and are thin logical volumes.
The gluster server is running on CentOS 7.1.
The problem occurs when using rsync to copy from an external source into the
gluster volume: if I copy without specifying the temp dir, rsync uses the
current dir for temporaries and their size is taken into account, at least on
one of the two bricks. I can confirm it is only happening on one of the bricks
by looking at the xattrs of the same dir on the two bricks, as the quota values
are different.
At the moment I have recreated the bricks and started the copy over, and it
seems much better now that I'm explicitly asking rsync to use /tmp for
temporaries.
Anyway, I'm still seeing errors in the logs; I will report them later.
Many thanks for the help,

   Alessandro




 Il giorno 08/giu/2015, alle ore 08:38, Vijaikumar M vmall...@redhat.com ha 
 scritto:
 
 Hi Alessandro,
 
 Please provide the test-case, so that we can try to re-create this problem 
 in-house?
 
 Thanks,
 Vijay
 
 On Saturday 06 June 2015 05:59 AM, Alessandro De Salvo wrote:
 Hi,
 just to answer to myself, it really seems the temp files from rsync are the 
 culprit, it seems that their size are summed up to the real contents of the 
 directories I’m synchronizing, or in other terms their size is not removed 
 from the used size after they are removed. I suppose this is someway 
 connected to the error on removexattr I’m seeing. The temporary solution 
 I’ve found is to use rsync with the option to write the temp files to /tmp, 
 but it would be very interesting to understand why this is happening.
 Cheers,
 
  Alessandro
 
 Il giorno 06/giu/2015, alle ore 01:19, Alessandro De Salvo 
 alessandro.desa...@roma1.infn.it ha scritto:
 
 Hi,
 I currently have two brick with replica 2 on the same machine, pointing to 
 different disks of a connected SAN.
 The volume itself is fine:
 
 # gluster volume info atlas-home-01
 
 Volume Name: atlas-home-01
 Type: Replicate
 Volume ID: 660db960-31b8-4341-b917-e8b43070148b
 Status: Started
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: host1:/bricks/atlas/home02/data
 Brick2: host2:/bricks/atlas/home01/data
 Options Reconfigured:
 performance.write-behind-window-size: 4MB
 performance.io-thread-count: 32
 performance.readdir-ahead: on
 server.allow-insecure: on
 nfs.disable: true
 features.quota: on
 features.inode-quota: on
 
 
 However, when I set a quota on a dir of the volume the size show is twice 
 the physical size of the actual dir:
 
 # gluster volume quota atlas-home-01 list /user1
  Path   Hard-limit Soft-limit   Used  
 Available  Soft-limit exceeded? Hard-limit exceeded?
 ---
 /user14.0GB   80%   3.2GB 
 853.4MB  No   No
 
 # du -sh /storage/atlas/home/user1
 1.6G/storage/atlas/home/user1
 
 If I remove one of the bricks the quota shows the correct value.
 Is there any double counting in case the bricks are on the same machine?
 Also, I see a lot of errors in the logs like the following:
 
 [2015-06-05 21:59:27.450407] E 
 [posix-handle.c:157:posix_make_ancestryfromgfid] 0-atlas-home-01-posix: 
 could not read the link from the gfid handle 
 /bricks/atlas/home01/data/.glusterfs/be/e5/bee5e2b8-c639-4539-a483-96c19cd889eb
  (No such file or directory)
 
 and also
 
 [2015-06-05 22:52:01.112070] E [marker-quota.c:2363:mq_mark_dirty] 
 0-atlas-home-01-marker: failed to get inode ctx for /user1/file1
 
 When running rsync I also see the following errors:
 
 [2015-06-05 23:06:22.203968] E [marker-quota.c:2601:mq_remove_contri] 
 0-atlas-home-01-marker: removexattr 
 trusted.glusterfs.quota.fddf31ba-7f1d-4ba8-a5ad-2ebd6e4030f3.contri failed 
 for /user1/..bashrc.O4kekp: No data available
 
 Those files are the temp files of rsync, I’m not sure why the throw errors 
 in glusterfs.
 Any help?
 Thanks,
 
 Alessandro
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] using a preferred node ?

2015-06-08 Thread Ravishankar N



On 06/08/2015 11:51 AM, Mathieu Chateau wrote:
From this slide (maybe outdated) it says that reads are also balanced 
(in replication scenario slide 22):

http://www.gluster.org/community/documentation/images/8/80/GlusterFS_Architecture_%26_Roadmap-Vijay_Bellur-LinuxCon_EU_2013.pdf

Except for write, having an option to do only failover for reads  
lookup would be possible I guess ?




Lookups have to be sent to both bricks because AFR uses the response to 
determine if there is a stale copy etc (and then serve from the good 
copy).  For reads, if a client is also mounted on the same machine as 
the brick, reads will be served from that brick automatically. You can 
also use the cluster.read-subvolume option to explicitly force the 
client to read from a brick:


`gluster volume set help`

<snip>
Option: cluster.read-subvolume
Default Value: (null)
Description: inode-read fops happen only on one of the bricks in
replicate. Afr will prefer the one specified using this option if it is
not stale. Option value must be one of the xlator names of the children.
Ex: volname-client-0 till volname-client-<number-of-bricks - 1>
</snip>
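
So, as a sketch, pinning inode-reads of the volume from this thread to its
first brick would be (the exact client xlator name is an assumption here and
should be confirmed from the client volfile):

    gluster volume set gluster-preprod cluster.read-subvolume gluster-preprod-client-0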


Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-06-08 8:11 GMT+02:00 Ravishankar N ravishan...@redhat.com 
mailto:ravishan...@redhat.com:




On 06/08/2015 11:34 AM, Mathieu Chateau wrote:

Hello Ravi,

thanks for clearing things up.

Anything on the roadmap that would help my case?




I don't think it would be possible for clients to do I/O only on
its local brick and yet expect the bricks' contents to be in sync
in real-time..




Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-06-08 6:37 GMT+02:00 Ravishankar N ravishan...@redhat.com
mailto:ravishan...@redhat.com:



On 06/06/2015 12:49 AM, Mathieu Chateau wrote:

Hello,

sorry to bother again but I am still facing this issue.

client still looks on the other side and not using the
node declared in fstab:
prd-sta-sto01:/gluster-preprod
/mnt/gluster-preprod glusterfs
defaults,_netdev,backupvolfile-server=prd-sta-sto02 0 0

I expect client to use sto01 and not sto02 as it's available.


Hi Mathieu,
When you do lookups (`ls` etc), they are sent to both bricks
of the replica. If you write to a file, the write is also
sent to both bricks. This is how it works. Only reads are
served from the local brick.
-Ravi




If I add a static route to break connectivity to sto02 and
do a df, I have around 30s before it works.
Then it works ok.

Questions:

  * How to force node to stick as possible with one specific
(local) node ?
  * How to know where a client is currently connected?

Thanks for your help :)


Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-05-11 7:26 GMT+02:00 Mathieu Chateau
mathieu.chat...@lotp.fr mailto:mathieu.chat...@lotp.fr:

Hello,

thanks for helping :)

If gluster server is rebooted, any way to make client
failback on node after reboot ?

How to know which node is using a client ? I see TCP
connection to both node

Regards,

Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2015-05-11 7:13 GMT+02:00 Ravishankar N
ravishan...@redhat.com mailto:ravishan...@redhat.com:



On 05/10/2015 08:29 PM, Mathieu Chateau wrote:

Hello,

Short way: Is there any way to define a preferred
Gluster server ?

Long way:
I have the following setup (version 3.6.3) :

Gluster A  == VPN == Gluster B

Volume is replicated between A and B.

They are in same datacenter, using a 1Gb/s
connection, low latency (0.5ms)

I have gluster clients in lan A  B.

When doing a ls on big folder (~60k files), both
gluster node are used, and so it need 9mn instead
on 1mn if only the local gluster is reachable.



Lookups (and writes of course) from clients are sent
to both  bricks because AFR uses the result of the
lookup to select which brick to read from if there
is a pending heal etc.
If the file is clean on both A and B, then reads are
always served from the local brick. i.e. reads on
clients mounted on A will be served from the brick
in A (and likewise for B).

Hope that helps,
Ravi



It's HA setup, application is present on both side.
I would like a master/master setup, but 

Re: [Gluster-users] slave is rebalancing, master is not?

2015-06-08 Thread M S Vishwanath Bhat
On 5 June 2015 at 20:46, Dr. Michael J. Chudobiak m...@avtechpulse.com
wrote:

 I seem to have an issue with my replicated setup.

 The master says no rebalancing is happening, but the slave says there is
 (sort of). The master notes the issue:

 [2015-06-05 15:11:26.735361] E
 [glusterd-utils.c:9993:glusterd_volume_status_aggregate_tasks_status]
 0-management: Local tasks count (0) and remote tasks count (1) do not
 match. Not aggregating tasks status.

 The slave shows some odd messages like this:
 [2015-06-05 14:44:56.525402] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk]
 0-glusterfs: failed to get the 'volume file' from server

 I want the supposed rebalancing to stop, so I can add bricks.

 Any idea what is going on, and how to fix it?

 Both servers were recently upgraded from Fedora 21 to 22.

 Status output is below.

 - Mike



 Master: [root@karsh ~]# /usr/sbin/gluster volume status
 Status of volume: volume1
 Gluster process PortOnline  Pid

 --
 Brick karsh:/gluster/brick1/data49152   Y
  4023
 Brick xena:/gluster/brick2/data 49152   Y
  1719
 Brick karsh:/gluster/brick3/data49153   Y
  4015
 Brick xena:/gluster/brick4/data 49153   Y
  1725
 NFS Server on localhost 2049Y
  4022
 Self-heal Daemon on localhost   N/A Y
  4034
 NFS Server on xena  2049Y
  24550
 Self-heal Daemon on xenaN/A Y
  24557

 Task Status of Volume volume1

 --
 There are no active volume tasks


 [root@xena glusterfs]# /usr/sbin/gluster volume status
 Status of volume: volume1
 Gluster process PortOnline  Pid

 --
 Brick karsh:/gluster/brick1/data49152   Y
  4023
 Brick xena:/gluster/brick2/data 49152   Y
  1719
 Brick karsh:/gluster/brick3/data49153   Y
  4015
 Brick xena:/gluster/brick4/data 49153   Y
  1725
 NFS Server on localhost 2049Y
  24550
 Self-heal Daemon on localhost   N/A Y
  24557
 NFS Server on 192.168.0.240 2049Y
  4022
 Self-heal Daemon on 192.168.0.240   N/A Y
  4034

 Task Status of Volume volume1

 --
 Task : Rebalance
 ID   : f550b485-26c4-49f8-b7dc-055c678afce8
 Status   : in progress

 [root@xena glusterfs]# gluster volume rebalance volume1 status
 volume rebalance: volume1: success:


This is weird. Did you start rebalance yourself? What does gluster volume
rebalance volume1 status say? Also check if both the nodes are properly
connected using gluster peer status.

If it says completed/stopped, you can go ahead and add the bricks. Also, can
you check whether a rebalance process is running on your second server (xena)?
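
A sketch of the checks being suggested, runnable on either node:

    gluster peer status
    gluster volume rebalance volume1 status
    # only if a stale rebalance task is still listed and you want it cleared:
    gluster volume rebalance volume1 stop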

BTW, there is *no* master and slave in a single gluster volume :)

Best Regards,
Vishwanath




 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 2 node replica 2 cluster - volume on one node stopped responding

2015-06-08 Thread Tiemen Ruiten
Some extra points:

- 10.100.3.41 is one of the oVirt hosts.

- I only needed to restart glusterfsd & glusterd on one of the gluster
nodes (also the one where I pulled the logs from) to get everything in
working order.

- it's a separate gluster volume, not managed from oVirt engine.

On 8 June 2015 at 11:35, Tiemen Ruiten t.rui...@rdmedia.com wrote:

 Hello,

 We are running an oVirt cluster on top of a 2 node replica 2 Gluster
 volume. Yesterday we suddenly noticed VMs were not responding and quickly
 found out the Gluster volume had issues. These errors were filling up the
 etc-glusterfs-glusterd.log file:

 [2015-06-07 08:36:26.498012] W [rpcsvc.c:270:rpcsvc_program_actor]
 0-rpc-service: RPC program not available (req 1298437 330) for
 10.100.3.41:1022
 [2015-06-07 08:36:26.498073] E
 [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to
 complete successfully


 A restart of glusterfsd and glusterd resolved the issue, but triggered a
 lot of self-heals.

 We are running glusterfs 3.7.0 on ZFS.

 I have attached etc-glusterfs-glusterd.log, the brick log file and the
 glustershd.log. I would be grateful if anyone could shed any light on what
 happened here and if there's anything we can do to prevent it.

 --
 Tiemen Ruiten
 Systems Engineer
 RD Media




-- 
Tiemen Ruiten
Systems Engineer
RD Media
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster does not seem to detect a split-brain situation

2015-06-08 Thread Sjors Gielen
Ah, that's really weird. I'm pretty sure that nothing ever made write
changes to /export on either machine, so I wonder how the hard links ended
up being split. I'll indeed clean up the .glusterfs directory and keep
close tabs on Gluster's repair.

Glustershd.log and the client mount logs (data.log and gluster.log at
least) on the client are empty and nothing appears when I read the
mismatching studies.dat file.

Thanks for your help!
Sjors

Op zo 7 jun. 2015 om 22:10 schreef Joe Julian j...@julianfamily.org:

  (oops... I hate when I reply off-list)

 That warning should, imho, be an error. That's saying that the handle,
 which should be a hardlink to the file, doesn't have a matching inode. It
 should if it's a hardlink.

 If it were me, I would:

 find /export/sdb1/data/.glusterfs -type f -links 1 -print0 | xargs -0 /bin/rm

 This would clean up any handles that are not hardlinked where they should
 be and will allow gluster to repair them.

 Btw, the self-heal errors would be in glustershd.log and/or the client
 mount log(s), not (usually) the brick logs.


 On 06/07/2015 12:21 PM, Sjors Gielen wrote:

 Oops! Accidentally ran the command as non-root on Curacao, that's why
 there was no output. The actual output is:

  curacao# getfattr -m . -d -e hex
 /export/sdb1/data/Case/21000355/studies.dat
 getfattr: Removing leading '/' from absolute path names
 # file: export/sdb1/data/Case/21000355/studies.dat
 trusted.afr.data-client-0=0x
 trusted.afr.data-client-1=0x
 trusted.afr.dirty=0x
 trusted.gfid=0xfb34574974cf4804b8b80789738c0f81

  For reference, the output on bonaire:

  bonaire# getfattr -m . -d -e hex
 /export/sdb1/data/Case/21000355/studies.dat
 getfattr: Removing leading '/' from absolute path names
 # file: export/sdb1/data/Case/21000355/studies.dat
 trusted.gfid=0xfb34574974cf4804b8b80789738c0f81

  Op zo 7 jun. 2015 om 21:13 schreef Sjors Gielen sj...@sjorsgielen.nl:

  I'm reading about quorums, I haven't set up anything like that yet.

  (In reply to Joe Julian, who responded off-list)

  The output of getfattr on bonaire:

  bonaire# getfattr -m . -d -e hex
 /export/sdb1/data/Case/21000355/studies.dat
 getfattr: Removing leading '/' from absolute path names
 # file: export/sdb1/data/Case/21000355/studies.dat
 trusted.gfid=0xfb34574974cf4804b8b80789738c0f81

  On curacao, the command gives no output.

  From `gluster volume status`, it seems that while the brick
 curacao:/export/sdb1/data is online, it has no associated port number.
 Curacao can connect to the port number provided by Bonaire just fine. There
 are no firewalls on/between the two machines, they are on the same subnet
 connected by Ethernet cables and two switches.

  By the way, warning messages just started appearing to
 /var/log/glusterfs/bricks/export-sdb1-data.log on Bonaire saying
 mismatching ino/dev between file X and handle Y, though, maybe only just
 now even though I started the full self-heal hours ago.

  [2015-06-07 19:10:39.624393] W [posix-handle.c:727:posix_handle_hard]
 0-data-posix: mismatching ino/dev between file
 /export/sdb1/data/Archive/S21/21008971/studies.dat (9127104621/2065) and
 handle
 /export/sdb1/data/.glusterfs/97/c2/97c2a65d-36e0-4566-a5c1-5925f97af1fd
 (9190215976/2065)

  Thanks again!
 Sjors

  Op zo 7 jun. 2015 om 19:13 schreef Sjors Gielen sj...@sjorsgielen.nl:

 Hi all,

  I work at a small, 8-person company that uses Gluster for its primary
 data storage. We have a volume called data that is replicated over two
 servers (details below). This worked perfectly for over a year, but lately
 we've been noticing some mismatches between the two bricks, so it seems
 there has been some split-brain situation that is not being detected or
 resolved. I have two questions about this:

  1) I expected Gluster to (eventually) detect a situation like this;
 why doesn't it?
 2) How do I fix this situation? I've tried an explicit 'heal', but that
 didn't seem to change anything.

  Thanks a lot for your help!
 Sjors

  --8--

  Volume  peer info: http://pastebin.com/PN7tRXdU
 curacao# md5sum /export/sdb1/data/Case/21000355/studies.dat
 7bc2daec6be953ffae920d81fe6fa25c
 /export/sdb1/data/Case/21000355/studies.dat
  bonaire# md5sum /export/sdb1/data/Case/21000355/studies.dat
 28c950a1e2a5f33c53a725bf8cd72681
 /export/sdb1/data/Case/21000355/studies.dat

  # mallorca is one of the clients
 mallorca# md5sum /data/Case/21000355/studies.dat
 7bc2daec6be953ffae920d81fe6fa25c  /data/Case/21000355/studies.dat

  I expected an input/output error after reading this file, because of
 the split-brain situation, but got none. There are no entries in the
 GlusterFS logs of either bonaire or curacao.

  bonaire# gluster volume heal data full
 Launching heal operation to perform full self heal on volume data has
 been successful
 Use heal info commands to check status
 bonaire# gluster volume heal data info
 Brick 

Re: [Gluster-users] Cannot start Gluster -- resolve brick failed in restore

2015-06-08 Thread Atin Mukherjee


On 06/08/2015 01:38 PM, shacky wrote:
 Hi.
 I have a GlusterFS cluster running on a Debian Wheezy with GlusterFS
 3.6.2, with one volume on all three bricks (web1, web2, web3).
 All was working good until I changed the IP addresses of bricks,
 because after then only the GlusterFS daemon on web1 is starting well,
 and the deamons on web2 and web3 are exiting with these errors:
 
 [2015-06-08 07:59:15.929330] I [MSGID: 100030]
 [glusterfsd.c:2018:main] 0-/usr/sbin/glusterd: Started running
 /usr/sbin/glusterd version 3.6.2 (args: /usr/sbin/glusterd -p
 /var/run/glusterd.pid)
 [2015-06-08 07:59:15.932417] I [glusterd.c:1214:init] 0-management:
 Maximum allowed open file descriptors set to 65536
 [2015-06-08 07:59:15.932482] I [glusterd.c:1259:init] 0-management:
 Using /var/lib/glusterd as working directory
 [2015-06-08 07:59:15.933772] W [rdma.c:4221:__gf_rdma_ctx_create]
 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such
 device)
 [2015-06-08 07:59:15.933815] E [rdma.c:4519:init] 0-rdma.management:
 Failed to initialize IB Device
 [2015-06-08 07:59:15.933838] E
 [rpc-transport.c:333:rpc_transport_load] 0-rpc-transport: 'rdma'
 initialization failed
 [2015-06-08 07:59:15.933887] W [rpcsvc.c:1524:rpcsvc_transport_create]
 0-rpc-service: cannot create listener, initing the transport failed
 [2015-06-08 07:59:17.354500] I
 [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd:
 retrieved op-version: 30600
 [2015-06-08 07:59:17.527377] I
 [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
 0-management: connect returned 0
 [2015-06-08 07:59:17.527446] I
 [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo]
 0-management: connect returned 0
 [2015-06-08 07:59:17.527499] I
 [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting
 frame-timeout to 600
 [2015-06-08 07:59:17.528139] I
 [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting
 frame-timeout to 600
 [2015-06-08 07:59:17.528861] E
 [glusterd-store.c:4244:glusterd_resolve_all_bricks] 0-glusterd:
 resolve brick failed in restore
 [2015-06-08 07:59:17.528891] E [xlator.c:425:xlator_init]
 0-management: Initialization of volume 'management' failed, review
 your volfile again
 [2015-06-08 07:59:17.528906] E [graph.c:322:glusterfs_graph_init]
 0-management: initializing translator failed
 [2015-06-08 07:59:17.528917] E [graph.c:525:glusterfs_graph_activate]
 0-graph: init failed
 [2015-06-08 07:59:17.529257] W [glusterfsd.c:1194:cleanup_and_exit]
 (-- 0-: received signum (0), shutting down
 
 Please note that bricks name are setted in /etc/hosts and all of them
 are resolving well with the new IP addresses, so I cannot find out
 where the problem is.
 
 Could you help me please?

Here is what you can do on the nodes where glusterD fails to start:

1. cd /var/lib/glusterd
2. grep -irns '<old ip>' .
The output will be similar to this:

vols/test-vol/info:20:brick-0=172.17.0.2:-tmp-b1
vols/test-vol/info:21:brick-1=172.17.0.2:-tmp-b2
vols/test-vol/test-vol.tcp-fuse.vol:6:option remote-host 172.17.0.2
vols/test-vol/test-vol.tcp-fuse.vol:15:option remote-host 172.17.0.2
vols/test-vol/trusted-test-vol.tcp-fuse.vol:8:option remote-host
172.17.0.2
vols/test-vol/trusted-test-vol.tcp-fuse.vol:19:option remote-host
172.17.0.2
vols/test-vol/test-vol-rebalance.vol:6:option remote-host 172.17.0.2
vols/test-vol/test-vol-rebalance.vol:15:option remote-host 172.17.0.2
vols/test-vol/bricks/172.17.0.1:-tmp-b1:1:hostname=172.17.0.2
vols/test-vol/bricks/172.17.0.1:-tmp-b2:1:hostname=172.17.0.2
nfs/nfs-server.vol:8:option remote-host 172.17.0.2
nfs/nfs-server.vol:19:option remote-host 172.17.0.2

3. find . -type f -exec sed -i 's/<old ip>/<new ip>/g' {} \;

4. You would need to manually rename a few files (for example: mv
vols/test-vol/bricks/172.17.0.1:-tmp-b1
vols/test-vol/bricks/172.17.0.2:-tmp-b1)

Do this exercise on all the failed nodes and recheck and let me know if
it works.
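
A sketch of steps 2-4 put together, with placeholders (OLD_IP, NEW_IP, the
volume name and brick paths must be substituted; take a backup of
/var/lib/glusterd first):

    cp -a /var/lib/glusterd /root/glusterd.backup   # safety copy before editing
    cd /var/lib/glusterd
    grep -irns 'OLD_IP' .
    find . -type f -exec sed -i 's/OLD_IP/NEW_IP/g' {} \;
    # the per-brick store files carry the address in their name, so rename them too
    mv vols/<volname>/bricks/OLD_IP:-<brick-path> vols/<volname>/bricks/NEW_IP:-<brick-path>
    # then restart the gluster management daemon (e.g. service glusterfs-server restart on Debian)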

~Atin
 
 Thank you very much!
 Bye
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 
 

-- 
~Atin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Double counting of quota

2015-06-08 Thread Rajesh kumar Reddy Mekala

We have opened bug 1227724 for a similar problem.

Thanks,
Rajesh

On 06/08/2015 12:08 PM, Vijaikumar M wrote:

Hi Alessandro,

Please provide the test-case, so that we can try to re-create this 
problem in-house?


Thanks,
Vijay

On Saturday 06 June 2015 05:59 AM, Alessandro De Salvo wrote:

Hi,
just to answer to myself, it really seems the temp files from rsync are the 
culprit, it seems that their size are summed up to the real contents of the 
directories I’m synchronizing, or in other terms their size is not removed from 
the used size after they are removed. I suppose this is someway connected to 
the error on removexattr I’m seeing. The temporary solution I’ve found is to 
use rsync with the option to write the temp files to /tmp, but it would be very 
interesting to understand why this is happening.
Cheers,

Alessandro


Il giorno 06/giu/2015, alle ore 01:19, Alessandro De 
Salvoalessandro.desa...@roma1.infn.it  ha scritto:

Hi,
I currently have two brick with replica 2 on the same machine, pointing to 
different disks of a connected SAN.
The volume itself is fine:

# gluster volume info atlas-home-01

Volume Name: atlas-home-01
Type: Replicate
Volume ID: 660db960-31b8-4341-b917-e8b43070148b
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: host1:/bricks/atlas/home02/data
Brick2: host2:/bricks/atlas/home01/data
Options Reconfigured:
performance.write-behind-window-size: 4MB
performance.io-thread-count: 32
performance.readdir-ahead: on
server.allow-insecure: on
nfs.disable: true
features.quota: on
features.inode-quota: on


However, when I set a quota on a dir of the volume the size show is twice the 
physical size of the actual dir:

# gluster volume quota atlas-home-01 list /user1
  Path   Hard-limit Soft-limit   Used  
Available  Soft-limit exceeded? Hard-limit exceeded?
---
/user14.0GB   80%   3.2GB 853.4MB   
   No   No

# du -sh /storage/atlas/home/user1
1.6G/storage/atlas/home/user1

If I remove one of the bricks the quota shows the correct value.
Is there any double counting in case the bricks are on the same machine?
Also, I see a lot of errors in the logs like the following:

[2015-06-05 21:59:27.450407] E [posix-handle.c:157:posix_make_ancestryfromgfid] 
0-atlas-home-01-posix: could not read the link from the gfid handle 
/bricks/atlas/home01/data/.glusterfs/be/e5/bee5e2b8-c639-4539-a483-96c19cd889eb 
(No such file or directory)

and also

[2015-06-05 22:52:01.112070] E [marker-quota.c:2363:mq_mark_dirty] 
0-atlas-home-01-marker: failed to get inode ctx for /user1/file1

When running rsync I also see the following errors:

[2015-06-05 23:06:22.203968] E [marker-quota.c:2601:mq_remove_contri] 
0-atlas-home-01-marker: removexattr 
trusted.glusterfs.quota.fddf31ba-7f1d-4ba8-a5ad-2ebd6e4030f3.contri failed for 
/user1/..bashrc.O4kekp: No data available

Those files are the temp files of rsync, I’m not sure why the throw errors in 
glusterfs.
Any help?
Thanks,

Alessandro


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Monitorig gluster 3.6.1

2015-06-08 Thread Michael Schwartzkopff
Am Montag, 8. Juni 2015, 12:42:17 schrieb M S Vishwanath Bhat:
 On 1 June 2015 at 12:28, Félix de Lelelis felix.deleli...@gmail.com wrote:
  Hi,
  
  I have monitoring gluster with scripts that lunch scripts. All scripts are
  redirected to a one script that check if is active any process glusterd
  and
  if the repsonse its false, the script lunch the check.
  
  All checks are:
 - gluster volume volname info
 - gluster volume heal volname info
 - gluster volume heal volname split-brain
 - gluster volume volname status detail
 - gluster volume volname statistics
  
  Since I enable the monitoring in our pre-production gluster, the gluster
  is down 2 times. We  suspect that the monitoring are overloading but
  should
  not.
  
  The question is, there any way to check those states otherwise?
 
 You can make use of https://github.com/keithseahus/fluent-plugin-glusterfs
 as well.
 
 http://docs.fluentd.org/articles/collect-glusterfs-logs
 
 HTH
 
 Best Regards,
 Vishwanath

Gluster lacks an SNMP agent implementation, which would allow ALL monitoring
systems to monitor it.

Home-grown scripts or implementations for one specific monitoring system cannot
be the solution.


Mit freundlichen Grüßen,

Michael Schwartzkopff

-- 
[*] sys4 AG

http://sys4.de, +49 (89) 30 90 46 64, +49 (162) 165 0044
Franziskanerstraße 15, 81669 München

Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263
Vorstand: Patrick Ben Koetter, Marc Schiffbauer
Aufsichtsratsvorsitzender: Florian Kirstein

signature.asc
Description: This is a digitally signed message part.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Alessandro De Salvo
Hi,
I have seen the demo video on ganesha HA, 
https://www.youtube.com/watch?v=Z4mvTQC-efM
However there is no advice on the appropriate size of the shared volume. How is 
it really used, and what should be a reasonable size for it?
Also, are the slides from the video available somewhere, as well as
documentation on all this? I did not manage to find them.
Thanks,

Alessandro

smime.p7s
Description: S/MIME cryptographic signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances

2015-06-08 Thread Geoffrey Letessier
Hello,

Do you know more about this?

In addition, do you know how to « activate » RDMA for my volume with
Intel/QLogic QDR? Currently I mount my volumes with the RDMA transport-type
option (on both the server and client side), but I notice all streams are using
the TCP stack, and my bandwidth never exceeds 2.0-2.5 Gb/s (250-300 MB/s).
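
For what it's worth, a sketch of how the RDMA transport is usually enabled
(volume, brick and mount names are placeholders; the volume has to carry the
rdma transport, which is chosen at creation time):

    # server side: create the volume with both transports available
    gluster volume create volrdma transport tcp,rdma server1:/bricks/b1 server2:/bricks/b2
    gluster volume start volrdma
    # client side: explicitly request the rdma transport when mounting
    mount -t glusterfs -o transport=rdma server1:/volrdma /mnt/volrdma

If the streams still show up as TCP after that, checking the client mount log
for rdma errors would be a reasonable next step.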

Thanks in advance,
Geoffrey
--
Geoffrey Letessier
Responsable informatique  ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@ibpc.fr

 Le 2 juin 2015 à 23:45, Geoffrey Letessier geoffrey.letess...@cnrs.fr a 
 écrit :
 
 Hi Ben,
 
 I just checked my messages log files, both on client and server, and I don't
 find any of the hung tasks you noticed on yours.

 As you can read below, I don't see the performance issue with a simple DD, but
 I think my issue concerns sets of small files (tens of thousands, maybe
 more)…
 
 [root@nisus test]# ddt -t 10g /mnt/test/
 Writing to /mnt/test/ddt.8362 ... syncing ... done.
 sleeping 10 seconds ... done.
 Reading from /mnt/test/ddt.8362 ... done.
 10240MiBKiB/s  CPU%
 Write  114770 4
 Read40675 4
 
 for info: /mnt/test concerns the single v2 GlFS volume
 
 [root@nisus test]# ddt -t 10g /mnt/fhgfs/
 Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
 sleeping 10 seconds ... done.
 Reading from /mnt/fhgfs/ddt.8380 ... done.
 10240MiBKiB/s  CPU%
 Write  102591 1
 Read98079 2
 
 Do you have an idea how to tune/optimize the performance settings and/or TCP
 settings (MTU, etc.)? (See the sketch after the table below.)
 
 ---
 | |  UNTAR  |   DU   |  FIND   |   TAR   |   RM   |
 ---
 | single  |  ~3m45s |   ~43s |~47s |  ~3m10s | ~3m15s |
 ---
 | replicated  |  ~5m10s |   ~59s |   ~1m6s |  ~1m19s | ~1m49s |
 ---
 | distributed |  ~4m18s |   ~41s |~57s |  ~2m24s | ~1m38s |
 ---
 | dist-repl   |  ~8m18s |  ~1m4s |  ~1m11s |  ~1m24s | ~2m40s |
 ---
 | native FS   |~11s |~4s | ~2s |~56s |   ~10s |
 ---
 | BeeGFS  |  ~3m43s |   ~15s | ~3s |  ~1m33s |   ~46s |
 ---
 | single (v2) |   ~3m6s |   ~14s |~32s |   ~1m2s |   ~44s |
 ---
 for info: 
   -BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 
 servers)
   - single (v2): simple gluster volume with default settings
 
 I also note I obtain the same tar/untar performance issue with FhGFS/BeeGFS 
 but the rest (DU, FIND, RM) looks like to be OK.
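
 On the tuning question above, a sketch of commonly adjusted starting points
 for small-file workloads (the option names exist in 3.7, but the values here
 are assumptions to be benchmarked, and volname is a placeholder):

     gluster volume set volname client.event-threads 4
     gluster volume set volname server.event-threads 4
     gluster volume set volname performance.io-thread-count 32
     gluster volume set volname performance.cache-size 256MB
     # jumbo frames only if every NIC and switch on the path supports them:
     ip link set dev eth0 mtu 9000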
 
 Thank you very much for your reply and help.
 Geoffrey
 ---
 Geoffrey Letessier
 
 Responsable informatique  ingénieur système
 CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
 Institut de Biologie Physico-Chimique
 13, rue Pierre et Marie Curie - 75005 Paris
 Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@cnrs.fr 
 mailto:geoffrey.letess...@cnrs.fr
 Le 2 juin 2015 à 21:53, Ben Turner btur...@redhat.com 
 mailto:btur...@redhat.com a écrit :
 
 I am seeing problems on 3.7 as well.  Can you check /var/log/messages on 
 both the clients and servers for hung tasks like:
 
 Jun  2 15:23:14 gqac006 kernel: echo 0  
 /proc/sys/kernel/hung_task_timeout_secs disables this message.
 Jun  2 15:23:14 gqac006 kernel: iozoneD 0001 0 21999 
  1 0x0080
 Jun  2 15:23:14 gqac006 kernel: 880611321cc8 0082 
 880611321c18 a027236e
 Jun  2 15:23:14 gqac006 kernel: 880611321c48 a0272c10 
 88052bd1e040 880611321c78
 Jun  2 15:23:14 gqac006 kernel: 88052bd1e0f0 88062080c7a0 
 880625addaf8 880611321fd8
 Jun  2 15:23:14 gqac006 kernel: Call Trace:
 Jun  2 15:23:14 gqac006 kernel: [a027236e] ? 
 rpc_make_runnable+0x7e/0x80 [sunrpc]
 Jun  2 15:23:14 gqac006 kernel: [a0272c10] ? rpc_execute+0x50/0xa0 
 [sunrpc]
 Jun  2 15:23:14 gqac006 kernel: [810aaa21] ? ktime_get_ts+0xb1/0xf0
 Jun  2 15:23:14 gqac006 kernel: [811242d0] ? sync_page+0x0/0x50
 Jun  2 15:23:14 gqac006 kernel: [8152a1b3] io_schedule+0x73/0xc0
 Jun  2 15:23:14 gqac006 kernel: [8112430d] sync_page+0x3d/0x50
 Jun  2 15:23:14 gqac006 kernel: [8152ac7f] __wait_on_bit+0x5f/0x90
 Jun  2 15:23:14 gqac006 kernel: [81124543] 
 wait_on_page_bit+0x73/0x80
 Jun  2 15:23:14 gqac006 kernel: [8109eb80] ? 
 wake_bit_function+0x0/0x50
 Jun  2 

Re: [Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Alessandro De Salvo
Great, many thanks Soumya!
Cheers,

Alessandro

 Il giorno 08/giu/2015, alle ore 13:53, Soumya Koduri skod...@redhat.com ha 
 scritto:
 
 Hi,
 
 Please find the slides of the demo video at [1]
 
 We recommend to have a distributed replica volume as a shared volume for 
 better data-availability.
 
 Size of the volume depends on the workload you may have. Since it is used to
 maintain state for NLM/NFSv4 clients, you may size the volume to be, at a
 minimum, the aggregate over all NFS servers of
 (typical_size_of_'/var/lib/nfs'_directory +
 ~4k * no_of_clients_connected_to_each_of_the_nfs_servers_at_any_point)
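
 As a rough worked example (all numbers are assumptions, not taken from this
 thread): with 4 ganesha heads, about 1 MB of /var/lib/nfs state per head and
 up to 500 clients per head, the minimum would be roughly
 4 x (1 MB + 500 x 4 KB) = 4 x 3 MB = 12 MB, so even a small shared volume of a
 few GB leaves a very comfortable margin.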
 
 We shall document about this feature sooner in the gluster docs as well.
 
 Thanks,
 Soumya
 
 [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846
 
 On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:
 Hi,
 I have seen the demo video on ganesha HA, 
 https://www.youtube.com/watch?v=Z4mvTQC-efM
 However there is no advice on the appropriate size of the shared volume. How 
 is it really used, and what should be a reasonable size for it?
 Also, are the slides from the video available somewhere, as well as a 
 documentation on all this? I did not manage to find them.
 Thanks,
 
  Alessandro
 
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 



smime.p7s
Description: S/MIME cryptographic signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Quota issue

2015-06-08 Thread Geoffrey Letessier
In addition, I notice a very big difference between the sum of DU on each brick
and the « quota list » display, as you can read below:
[root@lucifer ~]# pdsh -w cl-storage[1,3] du -sh 
/export/brick_home/brick*/amyloid_team
cl-storage1: 1,6T   /export/brick_home/brick1/amyloid_team
cl-storage3: 1,6T   /export/brick_home/brick1/amyloid_team
cl-storage1: 1,6T   /export/brick_home/brick2/amyloid_team
cl-storage3: 1,6T   /export/brick_home/brick2/amyloid_team
[root@lucifer ~]# gluster volume quota vol_home list /amyloid_team
  Path   Hard-limit Soft-limit   Used  Available

/amyloid_team  9.0TB   90%   7.8TB   1.2TB

As you can notice, the sum over all bricks gives me roughly 6.4TB while « quota
list » shows around 7.8TB, so there is a difference of 1.4TB that I'm not able
to explain…
Do you have any idea?
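
For what it's worth, one way to see where the accounting diverges is to compare
the quota xattrs of the directory on every brick with its on-disk usage; a
sketch (to be run on each storage node, for each brick):

    getfattr -d -m . -e hex /export/brick_home/brick1/amyloid_team | grep quota
    du -sb /export/brick_home/brick1/amyloid_team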

Thanks,
Geoffrey
--
Geoffrey Letessier
Responsable informatique  ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@ibpc.fr

 Le 8 juin 2015 à 14:30, Geoffrey Letessier geoffrey.letess...@cnrs.fr a 
 écrit :
 
 Hello,
 
 Concerning the 3.5.3 version of GlusterFS, I met this morning a strange issue 
 writing file when quota is exceeded. 
 
 One person of my lab, whose her quota is exceeded (but she didn’t know about) 
 try to modify a file but, because of exceeded quota, she was unable to and 
 decided to exit VI. Now, her file is empty/blank as you can read below:
 pdsh@lucifer: cl-storage3: ssh exited with exit code 2
 cl-storage1: -T 2 tarus amyloid_team 0 19 févr. 12:34 
 /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
 cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0  8 juin  12:38 
 /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
 
 In addition, I don't understand why, my volume being a distributed volume 
 inside a replica (cl-storage[1,3] is replicated only on cl-storage[2,4]), I 
 have 2 « identical » files (same complete path) on 2 different bricks (as you 
 can read above).
 
 Thanks in advance for your help and clarification.
 Geoffrey
 --
 Geoffrey Letessier
 Responsable informatique  ingénieur système
 UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
 Institut de Biologie Physico-Chimique
 13, rue Pierre et Marie Curie - 75005 Paris
 Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@ibpc.fr 
 mailto:geoffrey.letess...@ibpc.fr
 On 2 June 2015 at 23:45, Geoffrey Letessier geoffrey.letess...@cnrs.fr wrote:
 
 Hi Ben,
 
 I just checked my messages log files, both on the client and the server, and I 
 don't find any hung tasks like the ones you noticed on yours. 
 
 As you can read below, I don't see the performance issue with a simple dd, but 
 I think my issue concerns sets of small files (tens of thousands, maybe 
 more)…
 
 [root@nisus test]# ddt -t 10g /mnt/test/
 Writing to /mnt/test/ddt.8362 ... syncing ... done.
 sleeping 10 seconds ... done.
 Reading from /mnt/test/ddt.8362 ... done.
 10240MiBKiB/s  CPU%
 Write  114770 4
 Read40675 4
 
 for info: /mnt/test concerns the single v2 GlFS volume
 
 [root@nisus test]# ddt -t 10g /mnt/fhgfs/
 Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
 sleeping 10 seconds ... done.
 Reading from /mnt/fhgfs/ddt.8380 ... done.
 10240MiBKiB/s  CPU%
 Write  102591 1
 Read98079 2
 
 Do you have any idea how to tune/optimize the performance settings and/or the 
 TCP settings (MTU, etc.)?
 
 ---
 | |  UNTAR  |   DU   |  FIND   |   TAR   |   RM   |
 ---
 | single  |  ~3m45s |   ~43s |~47s |  ~3m10s | ~3m15s |
 ---
 | replicated  |  ~5m10s |   ~59s |   ~1m6s |  ~1m19s | ~1m49s |
 ---
 | distributed |  ~4m18s |   ~41s |~57s |  ~2m24s | ~1m38s |
 ---
 | dist-repl   |  ~8m18s |  ~1m4s |  ~1m11s |  ~1m24s | ~2m40s |
 ---
 | native FS   |~11s |~4s | ~2s |~56s |   ~10s |
 ---
 | BeeGFS  |  ~3m43s |   ~15s | ~3s |  ~1m33s |   ~46s |
 ---
 | single (v2) |   ~3m6s |   ~14s |~32s |   ~1m2s |   ~44s |
 ---
 for 

[Gluster-users] Quota issue

2015-06-08 Thread Geoffrey Letessier
Hello,

Concerning the 3.5.3 version of GlusterFS, I hit a strange issue this morning 
when writing a file while quota is exceeded. 

One person in my lab, whose quota is exceeded (but she didn't know about it), 
tried to modify a file but, because of the exceeded quota, she was unable to, 
and decided to exit vi. Now her file is empty/blank, as you can read below:
pdsh@lucifer: cl-storage3: ssh exited with exit code 2
cl-storage1: -T 2 tarus amyloid_team 0 19 févr. 12:34 
/export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0  8 juin  12:38 
/export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh

In addition, I don't understand why, my volume being a distributed volume inside 
a replica (cl-storage[1,3] is replicated only on cl-storage[2,4]), I have 2 « 
identical » files (same complete path) on 2 different bricks (as you can read above).

Thanks in advance for your help and clarification.
Geoffrey
--
Geoffrey Letessier
Responsable informatique  ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@ibpc.fr

 On 2 June 2015 at 23:45, Geoffrey Letessier geoffrey.letess...@cnrs.fr wrote:
 
 Hi Ben,
 
 I just checked my messages log files, both on the client and the server, and I 
 don't find any hung tasks like the ones you noticed on yours. 
 
 As you can read below, I don't see the performance issue with a simple dd, but I 
 think my issue concerns sets of small files (tens of thousands, maybe 
 more)…
 
 [root@nisus test]# ddt -t 10g /mnt/test/
 Writing to /mnt/test/ddt.8362 ... syncing ... done.
 sleeping 10 seconds ... done.
 Reading from /mnt/test/ddt.8362 ... done.
 10240MiBKiB/s  CPU%
 Write  114770 4
 Read40675 4
 
 for info: /mnt/test concerns the single v2 GlFS volume
 
 [root@nisus test]# ddt -t 10g /mnt/fhgfs/
 Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
 sleeping 10 seconds ... done.
 Reading from /mnt/fhgfs/ddt.8380 ... done.
 10240MiBKiB/s  CPU%
 Write  102591 1
 Read98079 2
 
 Do you have any idea how to tune/optimize the performance settings and/or the 
 TCP settings (MTU, etc.)?
 
 ---
 | |  UNTAR  |   DU   |  FIND   |   TAR   |   RM   |
 ---
 | single  |  ~3m45s |   ~43s |~47s |  ~3m10s | ~3m15s |
 ---
 | replicated  |  ~5m10s |   ~59s |   ~1m6s |  ~1m19s | ~1m49s |
 ---
 | distributed |  ~4m18s |   ~41s |~57s |  ~2m24s | ~1m38s |
 ---
 | dist-repl   |  ~8m18s |  ~1m4s |  ~1m11s |  ~1m24s | ~2m40s |
 ---
 | native FS   |~11s |~4s | ~2s |~56s |   ~10s |
 ---
 | BeeGFS  |  ~3m43s |   ~15s | ~3s |  ~1m33s |   ~46s |
 ---
 | single (v2) |   ~3m6s |   ~14s |~32s |   ~1m2s |   ~44s |
 ---
 for info: 
   -BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 
 servers)
   - single (v2): simple gluster volume with default settings
 
 I also note that I get the same tar/untar performance issue with FhGFS/BeeGFS, 
 but the rest (DU, FIND, RM) looks OK.
 
 Thank you very much for your reply and help.
 Geoffrey
 ---
 Geoffrey Letessier
 
 Responsable informatique  ingénieur système
 CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
 Institut de Biologie Physico-Chimique
 13, rue Pierre et Marie Curie - 75005 Paris
 Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@cnrs.fr 
 mailto:geoffrey.letess...@cnrs.fr
 On 2 June 2015 at 21:53, Ben Turner btur...@redhat.com wrote:
 
 I am seeing problems on 3.7 as well.  Can you check /var/log/messages on 
 both the clients and servers for hung tasks like:
 
 Jun  2 15:23:14 gqac006 kernel: echo 0 > 
 /proc/sys/kernel/hung_task_timeout_secs disables this message.
 Jun  2 15:23:14 gqac006 kernel: iozoneD 0001 0 21999 
  1 0x0080
 Jun  2 15:23:14 gqac006 kernel: 880611321cc8 0082 
 880611321c18 a027236e
 Jun  2 15:23:14 gqac006 kernel: 880611321c48 a0272c10 
 88052bd1e040 880611321c78
 Jun  2 15:23:14 gqac006 kernel: 88052bd1e0f0 88062080c7a0 
 880625addaf8 880611321fd8
 Jun  2 15:23:14 gqac006 kernel: Call Trace:
 

Re: [Gluster-users] Double counting of quota

2015-06-08 Thread Alessandro De Salvo
OK, many thanks Rajesh.
I just wanted to add that I see a lot of warnings in the logs like the 
following:

[2015-06-08 13:13:10.365633] W [marker-quota.c:3162:mq_initiate_quota_task] 
0-atlas-data-01-marker: inode ctx get failed, aborting quota txn

I’m not sure if this is a bug (related or not to the one you mention) or if it 
is normal and harmless.
Thanks,

Alessandro


 On 8 Jun 2015, at 10:39, Rajesh kumar Reddy Mekala rmek...@redhat.com wrote:
 
 We have an open bug, 1227724, for a similar problem. 
 
 Thanks,
 Rajesh
 
 On 06/08/2015 12:08 PM, Vijaikumar M wrote:
 Hi Alessandro,
 
 Please provide the test case so that we can try to re-create this problem 
 in-house.
 
 Thanks,
 Vijay
 
 On Saturday 06 June 2015 05:59 AM, Alessandro De Salvo wrote:
 Hi,
 just to answer myself: it really seems the temp files from rsync are the 
 culprit. Their size is apparently added to the real contents of the 
 directories I'm synchronizing, or in other words, their size is not subtracted 
 from the used size after they are removed. I suppose this is somehow connected 
 to the removexattr error I'm seeing. The temporary solution I've found is to 
 use rsync with the option that writes the temp files to /tmp, but it would be 
 very interesting to understand why this is happening.
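 
 For reference, that workaround corresponds to something like the following (a 
 sketch; the rsync --temp-dir option is standard, the paths here are only an 
 illustration):
 
     rsync -av --temp-dir=/tmp /local/source/user1/ /storage/atlas/home/user1/
 
 With --temp-dir the partially transferred files never live under the 
 quota-accounted directory, so the marker never has to subtract them again.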
 Cheers,
 
 Alessandro
 
 On 6 Jun 2015, at 01:19, Alessandro De Salvo alessandro.desa...@roma1.infn.it wrote:
 
 Hi,
 I currently have two bricks with replica 2 on the same machine, pointing to 
 different disks of a connected SAN.
 The volume itself is fine:
 
 # gluster volume info atlas-home-01
 
 Volume Name: atlas-home-01
 Type: Replicate
 Volume ID: 660db960-31b8-4341-b917-e8b43070148b
 Status: Started
 Number of Bricks: 1 x 2 = 2
 Transport-type: tcp
 Bricks:
 Brick1: host1:/bricks/atlas/home02/data
 Brick2: host2:/bricks/atlas/home01/data
 Options Reconfigured:
 performance.write-behind-window-size: 4MB
 performance.io-thread-count: 32
 performance.readdir-ahead: on
 server.allow-insecure: on
 nfs.disable: true
 features.quota: on
 features.inode-quota: on
 
 
 However, when I set a quota on a directory of the volume, the size shown is 
 twice the physical size of the actual directory:
 
 # gluster volume quota atlas-home-01 list /user1
                   Path                   Hard-limit Soft-limit   Used  Available  Soft-limit exceeded? Hard-limit exceeded?
 ---------------------------------------------------------------------------------------------------------------------------
 /user1                                      4.0GB       80%      3.2GB   853.4MB          No                   No
 
 # du -sh /storage/atlas/home/user1
 1.6G/storage/atlas/home/user1
 
 If I remove one of the bricks the quota shows the correct value.
 Is there any double counting in case the bricks are on the same machine?
 Also, I see a lot of errors in the logs like the following:
 
 [2015-06-05 21:59:27.450407] E 
 [posix-handle.c:157:posix_make_ancestryfromgfid] 0-atlas-home-01-posix: 
 could not read the link from the gfid handle 
 /bricks/atlas/home01/data/.glusterfs/be/e5/bee5e2b8-c639-4539-a483-96c19cd889eb
  (No such file or directory)
 
 and also
 
 [2015-06-05 22:52:01.112070] E [marker-quota.c:2363:mq_mark_dirty] 
 0-atlas-home-01-marker: failed to get inode ctx for /user1/file1
 
 When running rsync I also see the following errors:
 
 [2015-06-05 23:06:22.203968] E [marker-quota.c:2601:mq_remove_contri] 
 0-atlas-home-01-marker: removexattr 
 trusted.glusterfs.quota.fddf31ba-7f1d-4ba8-a5ad-2ebd6e4030f3.contri failed 
 for /user1/..bashrc.O4kekp: No data available
 
 Those files are the temp files of rsync; I'm not sure why they throw errors 
 in glusterfs.
 Any help?
 Thanks,
 
Alessandro
 
 



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Soumya Koduri

Hi,

Please find the slides of the demo video at [1]

We recommend using a distributed replicated volume as the shared volume for 
better data availability.


The size of the volume depends on your workload. Since it is used to 
maintain the state of NLM/NFSv4 clients, you can calculate the minimum size 
as the aggregate, over all the NFS servers, of
 (typical_size_of_'/var/lib/nfs'_directory + 
~4k * no_of_clients_connected_to_each_NFS_server_at_any_point)
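
As a purely hypothetical example (the numbers below are assumptions, not 
measurements): with 4 NFS servers, about 250 clients connected to each server 
at peak, and a typical /var/lib/nfs of roughly 10MB per server, the minimum 
comes out to about

    4 x (10MB + 4KB x 250) = 4 x ~11MB ≈ 44MB

so even a small replicated shared volume (say 1GB) leaves a very large margin.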


We will document this feature soon in the gluster docs as well.

Thanks,
Soumya

[1] - http://www.slideshare.net/SoumyaKoduri/high-49117846

On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:

Hi,
I have seen the demo video on ganesha HA, 
https://www.youtube.com/watch?v=Z4mvTQC-efM
However, there is no advice on the appropriate size of the shared volume. How is 
it really used, and what would be a reasonable size for it?
Also, are the slides from the video available somewhere, as well as 
documentation on all this? I did not manage to find them.
Thanks,

Alessandro





___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Questions on ganesha HA and shared storage size

2015-06-08 Thread Alessandro De Salvo
Sorry, just another question:

- in my installation of gluster 3.7.1 the command gluster features.ganesha 
enable does not work:

# gluster features.ganesha enable
unrecognized word: features.ganesha (position 0)

Which version has full support for it?

- in the documentation the ccs and cman packages are required, but they seem not 
to be available anymore on CentOS 7 and similar; I guess they are not really 
required anymore, as pcs should do the full job (see the sketch below)
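
A minimal sketch covering both points (everything below is an assumption to be 
checked against your exact 3.7.x build and its docs, not a confirmed recipe):

    # the CLI verb has changed between builds; if features.ganesha is not
    # recognized, the global toggle may be exposed instead as:
    gluster nfs-ganesha enable
    # and the shared storage volume can be created with:
    gluster volume set all cluster.enable-shared-storage enable

    # on CentOS 7 the HA layer is pcs/pacemaker/corosync only, no ccs/cman:
    yum -y install pcs pacemaker corosync resource-agents fence-agents-all
    systemctl enable pcsd && systemctl start pcsd
    echo '<hacluster-password>' | passwd --stdin hacluster   # on every node
    pcs cluster auth node1 node2 -u hacluster                # hypothetical node names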

Thanks,

Alessandro

 On 8 Jun 2015, at 15:09, Alessandro De Salvo alessandro.desa...@roma1.infn.it wrote:
 
 Great, many thanks Soumya!
 Cheers,
 
   Alessandro
 
 On 8 Jun 2015, at 13:53, Soumya Koduri skod...@redhat.com wrote:
 
 Hi,
 
 Please find the slides of the demo video at [1]
 
 We recommend using a distributed replicated volume as the shared volume for 
 better data availability.
 
 The size of the volume depends on your workload. Since it is used to maintain 
 the state of NLM/NFSv4 clients, you can calculate the minimum size as the 
 aggregate, over all the NFS servers, of
 (typical_size_of_'/var/lib/nfs'_directory + 
 ~4k * no_of_clients_connected_to_each_NFS_server_at_any_point)
 
 We will document this feature soon in the gluster docs as well.
 
 Thanks,
 Soumya
 
 [1] - http://www.slideshare.net/SoumyaKoduri/high-49117846
 
 On 06/08/2015 04:34 PM, Alessandro De Salvo wrote:
 Hi,
 I have seen the demo video on ganesha HA, 
 https://www.youtube.com/watch?v=Z4mvTQC-efM
 However, there is no advice on the appropriate size of the shared volume. 
 How is it really used, and what would be a reasonable size for it?
 Also, are the slides from the video available somewhere, as well as 
 documentation on all this? I did not manage to find them.
 Thanks,
 
 Alessandro
 
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 
 



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] GlusterFS 3.6.1 breaks VM images on cluster node restart

2015-06-08 Thread André Bauer
I saw similar behaviour when the ownership of the VM image was set to
root:root instead of the hypervisor user.

Running chown -R libvirt-qemu:kvm /var/lib/libvirt/images before starting the
VM did the trick for me...
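
Besides ownership, it may also be worth checking the virt tuning profile that
is usually recommended for volumes holding VM images (a sketch; the option
names below come from the stock /var/lib/glusterd/groups/virt file, verify
them against your 3.6.x build before applying):

    gluster volume set pve-vol group virt
    # which is roughly equivalent to setting, per volume:
    #   performance.quick-read off, performance.read-ahead off,
    #   performance.io-cache off, performance.stat-prefetch off,
    #   cluster.eager-lock enable, network.remote-dio enable,
    #   cluster.quorum-type auto, cluster.server-quorum-type server

In particular the quorum options help avoid exactly the kind of image
corruption seen when replica nodes are rebooted one after another.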


On 04.06.2015 at 16:08, Roger Lehmann wrote:
 Hello, I'm having a serious problem with my GlusterFS cluster.
 I'm using Proxmox 3.4 for highly available VM management, which uses
 GlusterFS as storage.
 Unfortunately, when I restart every node in the cluster sequentially, one
 by one (with online migration of the running HA VM first, of course), the
 qemu image of the HA VM gets corrupted and the VM itself has problems
 accessing it.
 
 May 15 10:35:09 blog kernel: [339003.942602] end_request: I/O error, dev
 vda, sector 2048
 May 15 10:35:09 blog kernel: [339003.942829] Buffer I/O error on device
 vda1, logical block 0
 May 15 10:35:09 blog kernel: [339003.942929] lost page write due to I/O
 error on vda1
 May 15 10:35:09 blog kernel: [339003.942952] end_request: I/O error, dev
 vda, sector 2072
 May 15 10:35:09 blog kernel: [339003.943049] Buffer I/O error on device
 vda1, logical block 3
 May 15 10:35:09 blog kernel: [339003.943146] lost page write due to I/O
 error on vda1
 May 15 10:35:09 blog kernel: [339003.943153] end_request: I/O error, dev
 vda, sector 4196712
 May 15 10:35:09 blog kernel: [339003.943251] Buffer I/O error on device
 vda1, logical block 524333
 May 15 10:35:09 blog kernel: [339003.943350] lost page write due to I/O
 error on vda1
 May 15 10:35:09 blog kernel: [339003.943363] end_request: I/O error, dev
 vda, sector 4197184
 
 
 After the image is broken, it's impossible to migrate the VM or start it
 when it's down.
 
 root@pve2 ~ # gluster volume heal pve-vol info
 Gathering list of entries to be healed on volume pve-vol has been
 successful
 
 Brick pve1:/var/lib/glusterd/brick
 Number of entries: 1
 /images//200/vm-200-disk-1.qcow2
 
 Brick pve2:/var/lib/glusterd/brick
 Number of entries: 1
 /images/200/vm-200-disk-1.qcow2
 
 Brick pve3:/var/lib/glusterd/brick
 Number of entries: 1
 /images//200/vm-200-disk-1.qcow2
 
 
 
 I couldn't really reproduce this in my test environment with GlusterFS
 3.6.2, but I had other problems while testing (which may also be because of
 the virtualized test environment), so I don't want to upgrade to 3.6.2 until
 I know for sure that the problems I encountered are fixed there.
 Has anybody else experienced this problem? I'm not sure if issue 1161885
 (Possible file corruption on dispersed volumes) is the issue I'm
 experiencing. I have a 3 node replicate cluster.
 Thanks for your help!
 
 Regards,
 Roger Lehmann
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users
 


-- 
With kind regards
André Bauer

MAGIX Software GmbH
André Bauer
Administrator
August-Bebel-Straße 48
01219 Dresden
GERMANY

tel.: 0351 41884875
e-mail: aba...@magix.net
www.magix.com


Geschäftsführer | Managing Directors: Dr. Arnd Schröder, Michael Keith
Amtsgericht | Commercial Register: Berlin Charlottenburg, HRB 127205

Find us on:

http://www.facebook.com/MAGIX http://www.twitter.com/magix_de
http://www.youtube.com/wwwmagixcom http://www.magixmagazin.de
--
The information in this email is intended only for the addressee named
above. Access to this email by anyone else is unauthorized. If you are
not the intended recipient of this message any disclosure, copying,
distribution or any action taken in reliance on it is prohibited and
may be unlawful. MAGIX does not warrant that any attachments are free
from viruses or other defects and accepts no liability for any losses
resulting from infected email transmissions. Please note that any
views expressed in this email may be those of the originator and do
not necessarily represent the agenda of the company.
--
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances

2015-06-08 Thread Ben Turner
- Original Message -
 From: Geoffrey Letessier geoffrey.letess...@cnrs.fr
 To: Ben Turner btur...@redhat.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, gluster-users@gluster.org
 Sent: Monday, June 8, 2015 8:37:08 AM
 Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances
 
 Hello,
 
 Do you know more about?
 
 In addition, do you know how to « activate » RDMA for my volume with
 Intel/QLogic QDR? Currently, I mount my volumes with the RDMA transport-type
 option (on both the server and client side), but I notice all streams are using
 the TCP stack, and my bandwidth never exceeds 2.0-2.5Gb/s (250-300MB/s).

That is a little slow for the HW you described.  Can you check what you get 
with iperf just between the clients and servers?  https://iperf.fr/  With 
replica 2 and a 10G NW you should see ~400 MB / sec sequential writes and ~600 MB 
/ sec reads.  Can you send me the output from gluster v info?  You specify RDMA 
volumes at create time by running gluster v create blah transport rdma, did you 
specify RDMA when you created the volume?  What block size are you using in 
your tests?  1024 KB writes perform best with glusterfs, and as the block size 
gets smaller perf will drop a little bit.  I wouldn't write anything in blocks 
under 4k; the sweet spot is between 64k and 1024k.
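
A minimal sketch of those checks (host names, brick paths and the volume name
are placeholders; verify every option against the 3.7 docs before using it):

    # raw network throughput between one client and one server (iperf2 syntax):
    server# iperf -s
    client# iperf -c server1 -P 4

    # RDMA has to be requested at volume creation time...
    gluster volume create vol_home replica 2 transport tcp,rdma \
        server1:/export/brick_home/brick1 server2:/export/brick_home/brick1
    # ...and explicitly at mount time on the clients:
    mount -t glusterfs -o transport=rdma server1:/vol_home /mnt/vol_home

    # sequential throughput with the 1024k block size mentioned above:
    dd if=/dev/zero of=/mnt/vol_home/ddfile bs=1024k count=10240 conv=fdatasync

    # for the small-file results in the table further down, raising the event
    # threads sometimes helps on 3.7 (option names assumed from the 3.7 docs):
    gluster volume set vol_home client.event-threads 4
    gluster volume set vol_home server.event-threads 4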

-b

 
 Thanks in advance,
 Geoffrey
 --
 Geoffrey Letessier
 Responsable informatique  ingénieur système
 UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
 Institut de Biologie Physico-Chimique
 13, rue Pierre et Marie Curie - 75005 Paris
 Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@ibpc.fr
 
  On 2 June 2015 at 23:45, Geoffrey Letessier geoffrey.letess...@cnrs.fr wrote:
  
  Hi Ben,
  
  I just checked my messages log files, both on the client and the server, and I
  don't find any hung tasks like the ones you noticed on yours.
  
  As you can read below, I don't see the performance issue with a simple dd, but
  I think my issue concerns sets of small files (tens of thousands, maybe
  more)…
  
  [root@nisus test]# ddt -t 10g /mnt/test/
  Writing to /mnt/test/ddt.8362 ... syncing ... done.
  sleeping 10 seconds ... done.
  Reading from /mnt/test/ddt.8362 ... done.
  10240MiBKiB/s  CPU%
  Write  114770 4
  Read40675 4
  
  for info: /mnt/test concerns the single v2 GlFS volume
  
  [root@nisus test]# ddt -t 10g /mnt/fhgfs/
  Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
  sleeping 10 seconds ... done.
  Reading from /mnt/fhgfs/ddt.8380 ... done.
  10240MiBKiB/s  CPU%
  Write  102591 1
  Read98079 2
  
  Do you have any idea how to tune/optimize the performance settings and/or the
  TCP settings (MTU, etc.)?
  
  ---
  | |  UNTAR  |   DU   |  FIND   |   TAR   |   RM   |
  ---
  | single  |  ~3m45s |   ~43s |~47s |  ~3m10s | ~3m15s |
  ---
  | replicated  |  ~5m10s |   ~59s |   ~1m6s |  ~1m19s | ~1m49s |
  ---
  | distributed |  ~4m18s |   ~41s |~57s |  ~2m24s | ~1m38s |
  ---
  | dist-repl   |  ~8m18s |  ~1m4s |  ~1m11s |  ~1m24s | ~2m40s |
  ---
  | native FS   |~11s |~4s | ~2s |~56s |   ~10s |
  ---
  | BeeGFS  |  ~3m43s |   ~15s | ~3s |  ~1m33s |   ~46s |
  ---
  | single (v2) |   ~3m6s |   ~14s |~32s |   ~1m2s |   ~44s |
  ---
  for info:
  -BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 
  servers)
  - single (v2): simple gluster volume with default settings
  
  I also note that I get the same tar/untar performance issue with FhGFS/BeeGFS,
  but the rest (DU, FIND, RM) looks OK.
  
  Thank you very much for your reply and help.
  Geoffrey
  ---
  Geoffrey Letessier
  
  Responsable informatique  ingénieur système
  CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
  Institut de Biologie Physico-Chimique
  13, rue Pierre et Marie Curie - 75005 Paris
  Tel: 01 58 41 50 93 - eMail: geoffrey.letess...@cnrs.fr
  mailto:geoffrey.letess...@cnrs.fr
  On 2 June 2015 at 21:53, Ben Turner btur...@redhat.com wrote:
  
  I am seeing problems on 3.7 as well.  Can you check /var/log/messages on
  both the clients and servers for hung tasks like:
  
  Jun  2 15:23:14 gqac006 kernel: echo 0 >
  /proc/sys/kernel/hung_task_timeout_secs disables this message.
  Jun  2 15:23:14 gqac006 kernel: iozoneD 0001 0
  21999  1 0x0080
  Jun  2