Re: [Gluster-devel] missing files

2015-02-05 Thread Pranith Kumar Karampuri
This used to happen because of a DHT issue; +Raghavendra to check if he 
knows something about this.


Pranith
On 02/06/2015 10:06 AM, David F. Robinson wrote:
Not repeatable.  Once it shows up, it stays there.  I sent some other 
strange behavior I am seeing to Pranith earlier this evening.  
Attached below...


David

Another issue I am having that might be related is that I cannot 
delete some directories. It complains that the directories are not 
empty. But when I list them out, there is nothing there.
However, if I know the name of the directory, I can cd into it and 
see the files.


[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor 



[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -al
total 0
drwxrws--x 7 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References
drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# rm -rf *
rm: cannot remove `References': Directory not empty
rm: cannot remove `Testing': Directory not empty
rm: cannot remove `Velodyne': Directory not empty
rm: cannot remove `progress_reports/pr2': Directory not empty
rm: cannot remove `progress_reports/pr3': Directory not empty

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR
total 0
drwxrws--x 6 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References *** Note that there 
is nothing in this References directory.

drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports


However, from the bricks (see listings below), there are other 
directories that are not shown. For example, the References directory 
contains the USSOCOM_OPAQUE_ARMOR directory on the brick, but it 
doesn't show up on the volume.
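
(As a quick cross-check, the mount view can be diffed against one brick directly, and the directory's gluster xattrs dumped on the bricks; the paths below are taken from the listings that follow, and getfattr here just dumps the trusted.* metadata gluster keeps on the bricks:)

# Compare what the fuse mount shows with what one brick holds
diff <(ls /homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References) \
     <(ls /data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References)

# Dump the gluster-internal xattrs (gfid, dht layout, afr changelog) for the directory on every brick
getfattr -d -m . -e hex /data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References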


[root@gfs01a USSOCOM_OPAQUE_ARMOR]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor 



[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# cd References/
[root@gfs01a References]# ls -al *** There is nothing shown in the 
References directory

total 0
drwxrws--- 3 root root 133 Feb 4 18:12 .
drwxrws--x 7 root root 449 Feb 4 18:12 ..

[root@gfs01a References]# cd USSOCOM_OPAQUE_ARMOR *** From the brick 
listing, I knew the directory name. Even though it isn't shown, I can 
cd to it and see the files.

[root@gfs01a USSOCOM_OPAQUE_ARMOR]# ls -al
total 6787
drwxrws--- 2 streadway sbir 244 Feb 5 21:28 .
drwxrws--- 3 root root 164 Feb 5 21:28 ..
-rwxrw 1 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw 1 streadway sbir 17248 Jun 19 2014 COMPARISON OF SOLUTIONS.one
-rwxrw 1 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD 
ARMORING.one

-rwxrw 1 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw 1 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one
-rwxrw 1 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one



The recursive file listing (ls -alR) from each of the bricks shows that 
there are files/directories that do not show up on the /homegfs volume.


[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR: 


total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root@gfs01b ~]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References
/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References: 


total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR: 


total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 sgilbert sbir 2974120 Jan 22 

Re: [Gluster-devel] missing files

2015-02-05 Thread David F. Robinson
Not repeatable.  Once it shows up, it stays there.  I sent some other 
strange behavior I am seeing to Pranith earlier this evening.  Attached 
below...


David

Another issue I am having that might be related is that I cannot delete 
some directories. It complains that the directories are not empty. But 
when I list them out, there is nothing there.
However, if I know the name of the directory, I can cd into it and 
see the files.


[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -al
total 0
drwxrws--x 7 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References
drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# rm -rf *
rm: cannot remove `References': Directory not empty
rm: cannot remove `Testing': Directory not empty
rm: cannot remove `Velodyne': Directory not empty
rm: cannot remove `progress_reports/pr2': Directory not empty
rm: cannot remove `progress_reports/pr3': Directory not empty

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR
total 0
drwxrws--x 6 root root 449 Feb 4 18:12 .
drwxrwx--- 3 root root 200 Feb 4 18:19 ..
drwxrws--- 3 root root 41 Feb 4 18:12 References *** Note that there is 
nothing in this References directory.

drwxrws--x 4 root root 54 Feb 4 18:12 Testing
drwxrws--- 4 root root 51 Feb 4 18:12 Velodyne
drwxrws--x 4 root root 38 Feb 4 18:12 progress_reports


However, from the bricks (see listings below), there are other 
directories that are not shown. For example, the References directory 
contains the USSOCOM_OPAQUE_ARMOR directory on the brick, but it doesn't 
show up on the volume.


[root@gfs01a USSOCOM_OPAQUE_ARMOR]# pwd
/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor

[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# cd References/
[root@gfs01a References]# ls -al *** There is nothing shown in the 
References directory

total 0
drwxrws--- 3 root root 133 Feb 4 18:12 .
drwxrws--x 7 root root 449 Feb 4 18:12 ..

[root@gfs01a References]# cd USSOCOM_OPAQUE_ARMOR *** From the brick 
listing, I knew the directory name. Even though it isn't shown, I can cd 
to it and see the files.

[root@gfs01a USSOCOM_OPAQUE_ARMOR]# ls -al
total 6787
drwxrws--- 2 streadway sbir 244 Feb 5 21:28 .
drwxrws--- 3 root root 164 Feb 5 21:28 ..
-rwxrw 1 streadway sbir 42440 Jun 19 2014 ARMOR PACKAGES.one
-rwxrw 1 streadway sbir 17248 Jun 19 2014 COMPARISON OF 
SOLUTIONS.one
-rwxrw 1 streadway sbir 38184 Jun 19 2014 CURRENT STANDARD 
ARMORING.one

-rwxrw 1 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw 1 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one
-rwxrw 1 streadway sbir 49736 Jan 21 13:18 GIVEN TRADE SPACE.one



The recursive file listing (ls -alR) from each of the bricks shows that 
there are files/directories that do not show up on the /homegfs volume.


[root@gfs01a Phase_1_SOCOM14-003_adv_armor]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one

/data/brick02a/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 2 root root 10 Feb 4 18:12 .
drwxrws--x 6 root root 95 Feb 4 18:12 ..

[root@gfs01b ~]# ls -alR 
/data/brick0*/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References:
total 0
drwxrws--- 3 root root 41 Feb 4 18:12 .
drwxrws--x 7 root root 118 Feb 4 18:12 ..
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 USSOCOM_OPAQUE_ARMOR

/data/brick01b/homegfs/documentation/programs/OLD_PROGRAMS/SBIR_TOM/Phase_1_SOCOM14-003_adv_armor/References/USSOCOM_OPAQUE_ARMOR:
total 6648
drwxrws--- 2 streadway sbir 75 Jan 23 14:46 .
drwxrws--- 3 root root 41 Feb 4 18:12 ..
-rwxrw 2 sgilbert sbir 2974120 Jan 22 09:15 FEASABILITY STUDY.docx
-rwxrw 2 streadway sbir 3826704 Jan 21 14:57 FEASABILITY STUDY.one


Re: [Gluster-devel] [Gluster-users] Appending time to snap name in USS

2015-02-05 Thread Mohammed Rafi K C
We decided to append a timestamp to the snap name when creating a snapshot
by default. Users can override this with a no-timestamp flag, in which case
the snapshot will be created without the appended timestamp. The snapshot
create syntax would then be: snapshot create <snapname> <volname(s)>
[no-timestamp] [description <description>] [force].
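
For illustration, with the proposed syntax above (the no-timestamp keyword is what the patch under review adds; the snap and volume names are just examples):

# Default behaviour once the patch lands: a timestamp gets appended to the snap name
gluster snapshot create snap1 homegfs

# Proposed override: keep the snap name exactly as given
gluster snapshot create snap1 homegfs no-timestamp description "nightly snapshot before upgrade"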

Patch for the same can be found here http://review.gluster.org/#/c/9597/1.

Regards
Rafi KC

On 01/09/2015 12:18 PM, Poornima Gurusiddaiah wrote:
 Yes, the creation time of the snap is appended to the snapname
 dynamically,
 i.e. snapview-server takes the snaplist from glusterd, and while
 populating the dentry for the .snaps it appends the time.

 Thanks,
 Poornima

 

 *From: *Anand Avati av...@gluster.org
 *To: *Poornima Gurusiddaiah pguru...@redhat.com, Gluster
 Devel gluster-devel@gluster.org, gluster-users
 gluster-us...@gluster.org
 *Sent: *Friday, January 9, 2015 1:49:02 AM
 *Subject: *Re: [Gluster-devel] Appending time to snap name in USS

 It would be convenient if the time is appended to the snap name on
 the fly (when receiving list of snap names from glusterd?) so that
 the timezone application can be dynamic (which is what users would
 expect).

 Thanks

 On Thu Jan 08 2015 at 3:21:15 AM Poornima Gurusiddaiah
 pguru...@redhat.com wrote:

 Hi,

 Windows has a feature called shadow copy. This is widely used by
 Windows users to view the previous versions of a file.
 For shadow copy to work with a glusterfs backend, the problem
 was that the clients expect snapshots to contain some format
 of time in their name.

 After evaluating the possible ways (asking the user to create
 snapshots with some format of time in their names, and renaming
 existing snapshots), the following method seemed simpler.

 If the USS is enabled, then the creation time of the snapshot is
 appended to the snapname and is listed in the .snaps directory.
 The actual name of the snapshot is left unmodified. i.e. the 
 snapshot
 list/info/restore etc. commands work with the original snapname.
 The patch for the same can be found
 @http://review.gluster.org/#/c/9371/

 The impact is that users would see snapnames in the .snaps folder
 that differ from what they created. Also, the current patch does not
 handle the scenario where the snapname already has a time in its name.

 Eg:
 Without this patch:
 drwxr-xr-x 4 root root 110 Dec 26 04:14 snap1
 drwxr-xr-x 4 root root 110 Dec 26 04:14 snap2

 With this patch
 drwxr-xr-x 4 root root 110 Dec 26 04:14
 snap1@GMT-2014.12.30-05.07.50
 drwxr-xr-x 4 root root 110 Dec 26 04:14
 snap2@GMT-2014.12.30-23.49.02

 Please let me know if you have any suggestions or concerns on
 the same.

 Thanks,
 Poornima
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel




 ___
 Gluster-users mailing list
 gluster-us...@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Input/Output Error on Gluster NFS

2015-02-05 Thread Soumya Koduri



On 02/05/2015 11:32 PM, Peter Auyeung wrote:

Hi Soumya

root@glusterprod001:~# gluster volume info | grep nfs.acl
02/05/15 10:00:05 [ /root ]

Seems like we do not have ACL enabled.

nfs client is a RHEL4 standard NFS client


Oh, by default ACLs are enabled. The option seems to be shown in 'gluster volume 
info' only if we explicitly modify its value to ON/OFF.


Can you please verify whether the filesystem on which your Gluster bricks 
were created has been mounted with ACLs enabled?
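
Something along these lines should show it (the brick paths are examples; XFS enables ACLs unconditionally, while ext3/ext4 may need an explicit acl mount option depending on the distro defaults):

# Look at the mount options of the filesystems backing the bricks
grep '/data/brick' /proc/mounts

# ext3/ext4 bricks can be remounted with ACLs if needed
mount -o remount,acl /data/brick01a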


Thanks,
Soumya



Thanks
-Peter

From: Soumya Koduri [skod...@redhat.com]
Sent: Wednesday, February 04, 2015 11:28 PM
To: Peter Auyeung; gluster-us...@gluster.org; gluster-devel@gluster.org
Subject: Re: [Gluster-devel] [Gluster-users] Input/Output Error on Gluster NFS

Hi Peter,

Have you disabled Gluster-NFS ACLs?

Please check the option value -
#gluster v info | grep nfs.acl
nfs.acl: ON

Also please provide the details of the nfs-client you are using.
Typically, nfs-clients seem to issue getxattr before doing
setxattr/removexattr operations and return 'ENOTSUPP' in case ACLs are
disabled. But from the strace, it looks like the client issued
'removexattr' of 'system.posix_acl_default', which returned EIO.

Anyway, 'removexattr' should also have returned EOPNOTSUPP instead of EIO.
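
If it helps, a targeted strace of just the xattr calls during the copy should confirm exactly which operation fails and with what errno (the command simply mirrors the cp from the original report):

# Trace only the xattr syscalls issued by cp and its children
strace -f -e trace=setxattr,lsetxattr,getxattr,removexattr cp -pr db /mnt/ 2>&1 | grep xattr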

Thanks,
Soumya

On 02/05/2015 02:31 AM, Peter Auyeung wrote:

I was trying to copy a directory of files to Gluster via NFS and getting
permission denied with Input/Output error

--- r...@bizratedbstandby.bo2.shopzilla.sea (0.00)# cp -pr db /mnt/
cp: setting permissions for 
`/mnt/db/full/pr_bizrate_standby_SMLS.F02-01-22-35.d': Input/output error
cp: setting permissions for 
`/mnt/db/full/pr_bizrate_standby_logging.F02-02-18-10.b': Input/output error
cp: setting permissions for `/mnt/db/full/pr_bizrate_SMLS.F02-01-22-35.d': 
Input/output error
cp: setting permissions for 
`/mnt/db/full/pr_bizrate_standby_master.F02-02-22-00': Input/output error
cp: setting permissions for `/mnt/db/full': Input/output error
cp: setting permissions for `/mnt/db': Input/output error

Checked the gluster nfs.log and etc log, and the bricks look clean.
The files end up getting copied over with the right permissions.

Traced the copy, and it seems like it failed on removexattr:

removexattr("/mnt/db", "system.posix_acl_default"...) = -1 EIO (Input/output 
error)

http://pastie.org/9884810

Any Clue?

Thanks
Peter







___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] failed heal

2015-02-05 Thread Niels de Vos
On Thu, Feb 05, 2015 at 11:21:58AM +0530, Pranith Kumar Karampuri wrote:
 
 On 02/04/2015 11:52 PM, David F. Robinson wrote:
 I don't recall if that was before or after my upgrade.
 I'll forward you an email thread for the current heal issues which are
 after the 3.6.2 upgrade...
 This is executed after the upgrade on just one machine. 3.6.2 entry locks
 are not compatible with versions <= 3.5.3 and 3.6.1; that is the reason. From
 3.5.4 and releases >= 3.6.2 it should work fine.

Oh, I was not aware of this requirement. Does it mean we should not mix
deployments with these versions (what about 3.4?) any longer? 3.5.4 has
not been released yet, so anyone with a mixed 3.5/3.6.2 environment will
hit these issues? Is this only for the self-heal daemon, or are the
triggered/stat self-heal procedures affected too?
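
As a stop-gap, anyone unsure what they are running can at least confirm the versions on every server and client before relying on cross-version self-heal (plain version checks, nothing release-specific assumed):

# On each server and client
glusterfs --version | head -n1

# And confirm which peers make up the pool
gluster peer status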

It should be noted *very* clearly in the release notes, and I think an
announcement (email+blog) as a warning/reminder would be good. Could you
get some details and advice written down, please?

Thanks,
Niels


 
 Pranith
 David
 -- Original Message --
 From: Pranith Kumar Karampuri pkara...@redhat.com
 To: David F. Robinson david.robin...@corvidtec.com; gluster-us...@gluster.org;
 Gluster Devel gluster-devel@gluster.org
 Sent: 2/4/2015 2:33:20 AM
 Subject: Re: [Gluster-devel] failed heal
 
 On 02/02/2015 03:34 AM, David F. Robinson wrote:
 I have several files that gluster says it cannot heal. I deleted the
 files from all of the bricks
 (/data/brick0*/hpc_shared/motorsports/gmics/Raven/p3/*) and ran a full
 heal using 'gluster volume heal homegfs full'.  Even after the full
 heal, the entries below still show up.
 How do I clear these?
 3.6.1 had an issue where files undergoing I/O would also be shown in the
 output of 'gluster volume heal <volname> info'; we addressed that in
 3.6.2. Is this output from 3.6.1 by any chance?
 
 Pranith
 [root@gfs01a ~]# gluster volume heal homegfs info
 Gathering list of entries to be healed on volume homegfs has been
 successful
 Brick gfsib01a.corvidtec.com:/data/brick01a/homegfs
 Number of entries: 10
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/Movies
 gfid:a6fc9011-74ad-4128-a232-4ccd41215ac8
 gfid:bc17fa79-c1fd-483d-82b1-2c0d3564ddc5
 gfid:ec804b5c-8bfc-4e7b-91e3-aded7952e609
 gfid:ba62e340-4fad-477c-b450-704133577cbb
 gfid:4843aa40-8361-4a97-88d5-d37fc28e04c0
 gfid:c90a8f1c-c49e-4476-8a50-2bfb0a89323c
 gfid:090042df-855a-4f5d-8929-c58feec10e33
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/.Convrg.swp
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke
 Brick gfsib01b.corvidtec.com:/data/brick01b/homegfs
 Number of entries: 2
 gfid:f96b4ddf-8a75-4abb-a640-15dbe41fdafa
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke
 Brick gfsib01a.corvidtec.com:/data/brick02a/homegfs
 Number of entries: 7
 gfid:5d08fe1d-17b3-4a76-ab43-c708e346162f
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/PICTURES/.tmpcheck
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/PICTURES
 /hpc_shared/motorsports/gmics/Raven/p3/70_rke/Movies
 gfid:427d3738-3a41-4e51-ba2b-f0ba7254d013
 gfid:8ad88a4d-8d5e-408f-a1de-36116cf6d5c1
 gfid:0e034160-cd50-4108-956d-e45858f27feb
 Brick gfsib01b.corvidtec.com:/data/brick02b/homegfs
 Number of entries: 0
 Brick gfsib02a.corvidtec.com:/data/brick01a/homegfs
 Number of entries: 0
 Brick gfsib02b.corvidtec.com:/data/brick01b/homegfs
 Number of entries: 0
 Brick gfsib02a.corvidtec.com:/data/brick02a/homegfs
 Number of entries: 0
 Brick gfsib02b.corvidtec.com:/data/brick02b/homegfs
 Number of entries: 0
 ===
 David F. Robinson, Ph.D.
 President - Corvid Technologies
 704.799.6944 x101 [office]
 704.252.1310 [cell]
 704.799.7974 [fax]
 david.robin...@corvidtec.com
 http://www.corvidtechnologies.com
 
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel
 
 

 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel



pgpssRXZEETwm.pgp
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread David F. Robinson
It was a mix of files from very small to very large, and many terabytes of 
data - approximately 20 TB.

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 5, 2015, at 4:55 PM, Ben Turner btur...@redhat.com wrote:
 
 - Original Message -
 From: Pranith Kumar Karampuri pkara...@redhat.com
 To: Xavier Hernandez xhernan...@datalab.es, David F. Robinson 
 david.robin...@corvidtec.com, Benjamin Turner
 bennytu...@gmail.com
 Cc: gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:30:04 AM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
 
 On 02/05/2015 03:48 PM, Pranith Kumar Karampuri wrote:
 I believe David already fixed this. I hope this is the same
 permissions issue he told us about.
 Oops, it is not. I will take a look.
 
 Yes David exactly like these:
 
 data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from 
 gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from 
 gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from 
 gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
 
 You can 100% verify my theory if you can correlate the time of the 
 disconnects to the time that the missing files were healed.  Can you have a 
 look at /var/log/glusterfs/glustershd.log?  That has all of the healed files 
 + timestamps; if we can see a disconnect during the rsync and a self-heal of 
 the missing file, I think we can safely assume that the disconnects may have 
 caused this.  I'll try this on my test systems - how much data did you rsync?  
 Roughly what size were the files, and what was the directory layout?
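
 A rough way to line the two up, assuming the brick logs and glustershd.log sit under /var/log/glusterfs as in the snippets in this thread:

 # Timestamps of client disconnects as seen by the bricks
 grep 'disconnecting connection' /var/log/glusterfs/bricks/*.log

 # Timestamps of the self-heals that followed
 grep -E 'performing (entry|metadata) selfheal|Completed (entry|metadata) selfheal' /var/log/glusterfs/glustershd.log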
 
 @Pranith - Could bricks flapping up and down during the rsync be a possible 
 cause here: the files were missing on the first ls (written to one subvol but 
 not the other because it was down), the ls triggered SH, and that's why the 
 files were there for the second ls?
 
 -b
 
 
 Pranith
 
 Pranith
 On 02/05/2015 03:44 PM, Xavier Hernandez wrote:
 Is the failure repeatable? With the same directories?
 
 It's very weird that the directories appear on the volume when you do
 an 'ls' on the bricks. Could it be that you only did a single 'ls'
 on the fuse mount, which did not show the directory? Is it possible that
 this 'ls' triggered a self-heal that repaired the problem, whatever
 it was, and when you did another 'ls' on the fuse mount after the
 'ls' on the bricks, the directories were there?
 
 The first 'ls' could have healed the files, causing the
 following 'ls' on the bricks to show the files as if nothing were
 damaged. If that's the case, it's possible that there were some
 disconnections during the copy.
 
 Added Pranith because he knows better replication and self-heal details.
 
 Xavi
 
 On 02/04/2015 07:23 PM, David F. Robinson wrote:
 Distributed/replicated
 
 Volume Name: homegfs
 Type: Distributed-Replicate
 Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
 Status: Started
 Number of Bricks: 4 x 2 = 8
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
 Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
 Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
 Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
 Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
 Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
 Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
 Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
 Options Reconfigured:
 performance.io-thread-count: 32
 performance.cache-size: 128MB
 performance.write-behind-window-size: 128MB
 server.allow-insecure: on
 network.ping-timeout: 10
 storage.owner-gid: 100
 geo-replication.indexing: off
 geo-replication.ignore-pid-check: on
 changelog.changelog: on
 changelog.fsync-interval: 3
 changelog.rollover-time: 15
 server.manage-gids: on
 
 
 -- Original Message --
 From: Xavier Hernandez xhernan...@datalab.es
 To: David F. Robinson david.robin...@corvidtec.com; 

Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread David F. Robinson
Should I run my rsync with --block-size set to something other than the default? Is 
there an optimal value? I think 128k is the max from my quick search; I didn't 
dig into it thoroughly though. 

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote:
 
 - Original Message -
 From: Ben Turner btur...@redhat.com
 To: David F. Robinson david.robin...@corvidtec.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez 
 xhernan...@datalab.es, Benjamin Turner
 bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:22:26 PM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
 - Original Message -
 From: David F. Robinson david.robin...@corvidtec.com
 To: Ben Turner btur...@redhat.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez
 xhernan...@datalab.es, Benjamin Turner
 bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel
 gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:01:13 PM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
 I'll send you the emails I sent Pranith with the logs. What causes these
 disconnects?
 
 Thanks David!  Disconnects happen when there are interruptions in
 communication between peers; normally a ping timeout is what happens.
 It could be anything from a flaky NW to the system being too busy to respond
 to the pings.  My initial take leans more towards the latter, as rsync is
 absolutely the worst use case for gluster - IIRC it writes in 4kb blocks.  I
 try to keep my writes at least 64KB, as in my testing that is the smallest
 block size I can write with before perf starts to really drop off.  I'll try
 something similar in the lab.
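
 To illustrate the comparison (purely a sketch - the mount point and file names are placeholders), writing the same amount of data through the fuse mount with 4k versus 64k blocks shows the block-size effect directly:

 # Worst case: small 4k writes (256MB total)
 dd if=/dev/zero of=/homegfs/ddtest.4k bs=4k count=65536 conv=fsync

 # Same amount of data with 64k writes
 dd if=/dev/zero of=/homegfs/ddtest.64k bs=64k count=4096 conv=fsync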
 
 OK, I do think that the files being self-healed is the RCA for what you were 
 seeing.  Let's look at one of the disconnects:
 
 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
 
 And in the glustershd.log from the gfs01b_glustershd.log file:
 
 [2015-02-03 20:55:48.001797] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
 [2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed entry selfheal on 
 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0 
 [2015-02-03 20:55:49.343093] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
 [2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed entry selfheal on 
 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0 
 [2015-02-03 20:55:51.465289] I 
 [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 
 0-homegfs-replicate-0: performing metadata selfheal on 
 403e661a-1c27-4e79-9867-c0572aba2b3c
 [2015-02-03 20:55:51.466515] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed metadata selfheal on 
 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0 
 [2015-02-03 20:55:51.467098] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
 [2015-02-03 20:55:55.257808] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed entry selfheal on 
 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0 
 [2015-02-03 20:55:55.258548] I 
 [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 
 0-homegfs-replicate-0: performing metadata selfheal on 
 c612ee2f-2fb4-4157-a9ab-5a2d5603c541
 [2015-02-03 20:55:55.259367] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed metadata selfheal on 
 c612ee2f-2fb4-4157-a9ab-5a2d5603c541. source=1 sinks=0 
 [2015-02-03 20:55:55.259980] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541
 
 As you can see, the self-heal logs are just spammed with files being healed, 
 and I looked at a couple of disconnects and see self-heals getting run 
 shortly afterwards on the bricks that were down.  Now we need to find the cause of 
 the disconnects; I am thinking once the disconnects are resolved the files 
 should be properly copied over without SH having to fix things.  Like I said, 
 I'll give this a go on my lab systems and see if I can repro the disconnects; 
 I'll have time to run through it tomorrow.  If in the meantime anyone else 
 has a 

Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread David F. Robinson
I'll send you the emails I sent Pranith with the logs. What causes these 
disconnects?

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 5, 2015, at 4:55 PM, Ben Turner btur...@redhat.com wrote:
 
 - Original Message -
 From: Pranith Kumar Karampuri pkara...@redhat.com
 To: Xavier Hernandez xhernan...@datalab.es, David F. Robinson 
 david.robin...@corvidtec.com, Benjamin Turner
 bennytu...@gmail.com
 Cc: gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:30:04 AM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
 
 On 02/05/2015 03:48 PM, Pranith Kumar Karampuri wrote:
 I believe David already fixed this. I hope this is the same issue he
 told about permissions issue.
 Oops, it is not. I will take a look.
 
 Yes David exactly like these:
 
 data-brick02a-homegfs.log:[2015-02-03 19:09:34.568842] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from 
 gfs02a.corvidtec.com-18563-2015/02/03-19:07:58:519134-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 19:09:41.286551] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from 
 gfs01a.corvidtec.com-12804-2015/02/03-19:09:38:497808-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 19:16:35.906412] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from 
 gfs02b.corvidtec.com-27190-2015/02/03-19:15:53:458467-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 19:51:22.761293] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from gfs01a.corvidtec.com-25926-2015/02/03-19:51:02:89070-homegfs-client-2-0-0
 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
 
 You can 100% verify my theory if you can correlate the time of the 
 disconnects to the time that the missing files were healed.  Can you have a 
 look at /var/log/glusterfs/glustershd.log?  That has all of the healed files 
 + timestamps; if we can see a disconnect during the rsync and a self-heal of 
 the missing file, I think we can safely assume that the disconnects may have 
 caused this.  I'll try this on my test systems - how much data did you rsync?  
 Roughly what size were the files, and what was the directory layout?
 
 @Pranith - Could bricks flapping up and down during the rsync be a possible 
 cause here: the files were missing on the first ls (written to one subvol but 
 not the other because it was down), the ls triggered SH, and that's why the 
 files were there for the second ls?
 
 -b
 
 
 Pranith
 
 Pranith
 On 02/05/2015 03:44 PM, Xavier Hernandez wrote:
 Is the failure repeatable? With the same directories?
 
 It's very weird that the directories appear on the volume when you do
 an 'ls' on the bricks. Could it be that you only did a single 'ls'
 on the fuse mount, which did not show the directory? Is it possible that
 this 'ls' triggered a self-heal that repaired the problem, whatever
 it was, and when you did another 'ls' on the fuse mount after the
 'ls' on the bricks, the directories were there?
 
 The first 'ls' could have healed the files, causing the
 following 'ls' on the bricks to show the files as if nothing were
 damaged. If that's the case, it's possible that there were some
 disconnections during the copy.
 
 Added Pranith because he knows better replication and self-heal details.
 
 Xavi
 
 On 02/04/2015 07:23 PM, David F. Robinson wrote:
 Distributed/replicated
 
 Volume Name: homegfs
 Type: Distributed-Replicate
 Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
 Status: Started
 Number of Bricks: 4 x 2 = 8
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
 Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
 Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
 Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
 Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
 Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
 Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
 Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
 Options Reconfigured:
 performance.io-thread-count: 32
 performance.cache-size: 128MB
 performance.write-behind-window-size: 128MB
 server.allow-insecure: on
 network.ping-timeout: 10
 storage.owner-gid: 100
 geo-replication.indexing: off
 geo-replication.ignore-pid-check: on
 changelog.changelog: on
 changelog.fsync-interval: 3
 changelog.rollover-time: 15
 server.manage-gids: on
 
 
 -- Original Message --
 From: Xavier Hernandez xhernan...@datalab.es
 To: David F. Robinson david.robin...@corvidtec.com; Benjamin
 

Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread David F. Robinson
Isn't rsync what geo-rep uses?

David  (Sent from mobile)

===
David F. Robinson, Ph.D. 
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote:
 
 - Original Message -
 From: Ben Turner btur...@redhat.com
 To: David F. Robinson david.robin...@corvidtec.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez 
 xhernan...@datalab.es, Benjamin Turner
 bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:22:26 PM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
 - Original Message -
 From: David F. Robinson david.robin...@corvidtec.com
 To: Ben Turner btur...@redhat.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier Hernandez
 xhernan...@datalab.es, Benjamin Turner
 bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel
 gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:01:13 PM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
 I'll send you the emails I sent Pranith with the logs. What causes these
 disconnects?
 
 Thanks David!  Disconnects happen when there are interruptions in
 communication between peers; normally a ping timeout is what happens.
 It could be anything from a flaky NW to the system being too busy to respond
 to the pings.  My initial take leans more towards the latter, as rsync is
 absolutely the worst use case for gluster - IIRC it writes in 4kb blocks.  I
 try to keep my writes at least 64KB, as in my testing that is the smallest
 block size I can write with before perf starts to really drop off.  I'll try
 something similar in the lab.
 
 OK, I do think that the files being self-healed is the RCA for what you were 
 seeing.  Let's look at one of the disconnects:
 
 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I 
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection 
 from gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
 
 And in the glustershd.log from the gfs01b_glustershd.log file:
 
 [2015-02-03 20:55:48.001797] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
 [2015-02-03 20:55:49.341996] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed entry selfheal on 
 6c79a368-edaa-432b-bef9-ec690ab42448. source=1 sinks=0 
 [2015-02-03 20:55:49.343093] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
 [2015-02-03 20:55:50.463652] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed entry selfheal on 
 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1 sinks=0 
 [2015-02-03 20:55:51.465289] I 
 [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 
 0-homegfs-replicate-0: performing metadata selfheal on 
 403e661a-1c27-4e79-9867-c0572aba2b3c
 [2015-02-03 20:55:51.466515] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed metadata selfheal on 
 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0 
 [2015-02-03 20:55:51.467098] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
 [2015-02-03 20:55:55.257808] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed entry selfheal on 
 403e661a-1c27-4e79-9867-c0572aba2b3c. source=1 sinks=0 
 [2015-02-03 20:55:55.258548] I 
 [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 
 0-homegfs-replicate-0: performing metadata selfheal on 
 c612ee2f-2fb4-4157-a9ab-5a2d5603c541
 [2015-02-03 20:55:55.259367] I [afr-self-heal-common.c:476:afr_log_selfheal] 
 0-homegfs-replicate-0: Completed metadata selfheal on 
 c612ee2f-2fb4-4157-a9ab-5a2d5603c541. source=1 sinks=0 
 [2015-02-03 20:55:55.259980] I 
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0: 
 performing entry selfheal on c612ee2f-2fb4-4157-a9ab-5a2d5603c541
 
 As you can see, the self-heal logs are just spammed with files being healed, 
 and I looked at a couple of disconnects and see self-heals getting run 
 shortly afterwards on the bricks that were down.  Now we need to find the cause of 
 the disconnects; I am thinking once the disconnects are resolved the files 
 should be properly copied over without SH having to fix things.  Like I said, 
 I'll give this a go on my lab systems and see if I can repro the disconnects; 
 I'll have time to run through it tomorrow.  If in the meantime anyone else 
 has a theory / anything to add here it would be appreciated.
 
 -b
 
 -b
 
 David  (Sent from mobile)
 
 ===
 David F. Robinson, Ph.D.
 

Re: [Gluster-devel] Gluster 3.6.2 On Xeon Phi

2015-02-05 Thread Rudra Siva
Rafi,

Sorry it took me some time - I had to merge these with some of my
changes. The scif0 (iWARP) device does not support SRQ (max_srq : 0), so I have
changed some of the code to use QP instead - I can provide those changes if
there is interest once this is stable.
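
For reference, the SRQ capability can be read off the adapter with the stock verbs utility (nothing Phi-specific assumed here):

# max_srq = 0 means the device offers no shared receive queue support
ibv_devinfo -v | grep -i srq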

Here's the good -

The performance with the patches is better than without (esp.
http://review.gluster.org/#/c/9327/).

The bad - glusterfsd crashes for large files, so it's difficult to get
some decent benchmark numbers; small files look good. I am trying to
understand the patch at this time. It looks like this code comes from
9327 as well.

Can you please review the reset of mr_count?

Info from gdb is as follows - if you need more or something jumps out
please feel free to let me know.

(gdb) p *post
$16 = {next = 0x7fffe003b280, prev = 0x7fffe0037cc0, mr =
0x7fffe0037fb0, buf = 0x7fffe0096000 \005\004, buf_size = 4096, aux
= 0 '\000',
  reused = 1, device = 0x7fffe00019c0, type = GF_RDMA_RECV_POST, ctx =
{mr = {0x7fffe0003020, 0x7fffc8005f20, 0x7fffc8000aa0, 0x7fffc80030c0,
  0x7fffc8002d70, 0x7fffc8008bb0, 0x7fffc8008bf0, 0x7fffc8002cd0},
mr_count = -939493456, vector = {{iov_base = 0x77fd6000,
iov_len = 112}, {iov_base = 0x7fffbf14, iov_len = 131072},
{iov_base = 0x0, iov_len = 0} repeats 14 times}, count = 2,
iobref = 0x7fffc8001670, hdr_iobuf = 0x61d710, is_request = 0
'\000', gf_rdma_reads = 1, reply_info = 0x0}, refcount = 1, lock = {
__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' repeats 39 times, __align = 0}}

(gdb) bt
#0  0x7fffe7142681 in __gf_rdma_register_local_mr_for_rdma
(peer=0x7fffe0001800, vector=0x7fffe003b108, count=1,
ctx=0x7fffe003b0b0)
at rdma.c:2255
#1  0x7fffe7145acd in gf_rdma_do_reads (peer=0x7fffe0001800,
post=0x7fffe003b070, readch=0x7fffe0096010) at rdma.c:3609
#2  0x7fffe714656e in gf_rdma_recv_request (peer=0x7fffe0001800,
post=0x7fffe003b070, readch=0x7fffe0096010) at rdma.c:3859
#3  0x7fffe714691d in gf_rdma_process_recv (peer=0x7fffe0001800,
wc=0x7fffceffcd20) at rdma.c:3967
#4  0x7fffe7146e7d in gf_rdma_recv_completion_proc
(data=0x7fffe0002b30) at rdma.c:4114
#5  0x772cfdf3 in start_thread () from /lib64/libpthread.so.0
#6  0x76c403dd in clone () from /lib64/libc.so.6

On Fri, Jan 30, 2015 at 7:11 AM, Mohammed Rafi K C rkavu...@redhat.com wrote:

 On 01/29/2015 06:13 PM, Rudra Siva wrote:
 Hi,

 Have been able to get Gluster running on Intel's MIC platform. The
 only code change to the Gluster source was an unresolved yylex (I am not
 really sure why that was coming up - maybe someone more familiar with
 its use in Gluster can answer).

 At the step for compiling the binaries (glusterd, glusterfsd,
 glusterfs, glfsheal), the build breaks with an unresolved yylex error.

 For now I have a routine yylex that simply calls graphyylex - I don't
 know if this is even correct; however, mounting works.

 GCC - 4.7 (it's an oddity, latest GCC is missing the Phi patches)

 flex --version
 flex 2.5.39

 bison --version
 bison (GNU Bison) 3.0

 I'm still working on testing the RDMA and Infiniband support and can
 make notes, numbers available when that is complete.
 There are a couple of RDMA performance-related patches under review. If
 you could make use of those patches, I hope they will give a performance
 enhancement.

 [1] : http://review.gluster.org/#/c/9329/
 [2] : http://review.gluster.org/#/c/9321/
 [3] : http://review.gluster.org/#/c/9327/
 [4] : http://review.gluster.org/#/c/9506/

 Let me know if you need any clarification.

 Regards!
 Rafi KC





-- 
-Siva
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread Benjamin Turner
Correct!  I have seen (back in the day - it's been 3ish years since I have
seen it) having, say, 50+ volumes, each with a geo-rep session, take system
load levels to the point where pings couldn't be serviced within the ping
timeout.  So it is known to happen, but there has been a lot of work in the
geo-rep space to help here, some of which is discussed:

https://medium.com/@msvbhat/distributed-geo-replication-in-glusterfs-ec95f4393c50

(think tar + ssh and other fixes). Your symptoms remind me of that case of
50+ geo-repped volumes; that's why I mentioned it from the start.  My current
shoot-from-the-hip theory is that when rsyncing all that data the servers got
too busy to service the pings and it led to disconnects.  This is common
across all of the clustering / distributed software I have worked on: if the
system gets too busy to service heartbeat within the timeout, things go
crazy (think fork bomb on a single host).  Now this could be a case of me
putting symptoms from an old issue onto what you are describing, but that's
where my head is at.  If I'm correct I should be able to repro using a
similar workload.  I think that the multi-threaded epoll changes that
_just_ landed in master will help resolve this, but they are so new I
haven't been able to test this.  I'll know more when I get a chance to test
tomorrow.
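
One knob that is directly in play here is the ping timeout itself: the volume info earlier in the thread shows network.ping-timeout set to 10, well below the usual 42-second default, so a loaded server has very little headroom before clients declare it dead. A sketch of checking and raising it (volume name taken from the thread):

# Current value, as shown in 'gluster volume info'
gluster volume info homegfs | grep ping-timeout

# Give a busy server more headroom before disconnects are declared
gluster volume set homegfs network.ping-timeout 42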

-b

On Thu, Feb 5, 2015 at 6:04 PM, David F. Robinson 
david.robin...@corvidtec.com wrote:

 Isn't rsync what geo-rep uses?

 David  (Sent from mobile)

 ===
 David F. Robinson, Ph.D.
 President - Corvid Technologies
 704.799.6944 x101 [office]
 704.252.1310  [cell]
 704.799.7974  [fax]
 david.robin...@corvidtec.com
 http://www.corvidtechnologies.com

  On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote:
 
  - Original Message -
  From: Ben Turner btur...@redhat.com
  To: David F. Robinson david.robin...@corvidtec.com
  Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier
 Hernandez xhernan...@datalab.es, Benjamin Turner
  bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
  Sent: Thursday, February 5, 2015 5:22:26 PM
  Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
  - Original Message -
  From: David F. Robinson david.robin...@corvidtec.com
  To: Ben Turner btur...@redhat.com
  Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier
 Hernandez
  xhernan...@datalab.es, Benjamin Turner
  bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel
  gluster-devel@gluster.org
  Sent: Thursday, February 5, 2015 5:01:13 PM
  Subject: Re: [Gluster-users] [Gluster-devel] missing files
 
  I'll send you the emails I sent Pranith with the logs. What causes
 these
  disconnects?
 
  Thanks David!  Disconnects happen when there are interruptions in
  communication between peers; normally a ping timeout is what happens.
  It could be anything from a flaky NW to the system being too busy to
  respond to the pings.  My initial take leans more towards the latter, as
  rsync is absolutely the worst use case for gluster - IIRC it writes in 4kb
  blocks.  I try to keep my writes at least 64KB, as in my testing that is
  the smallest block size I can write with before perf starts to really
  drop off.  I'll try something similar in the lab.
 
  OK, I do think that the files being self-healed is the RCA for what you were
 seeing.  Let's look at one of the disconnects:
 
  data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I
 [server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting connection
 from
 gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1
 
  And in the glustershd.log from the gfs01b_glustershd.log file:
 
  [2015-02-03 20:55:48.001797] I
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0:
 performing entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448
  [2015-02-03 20:55:49.341996] I
 [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0:
 Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. source=1
 sinks=0
  [2015-02-03 20:55:49.343093] I
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0:
 performing entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69
  [2015-02-03 20:55:50.463652] I
 [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0:
 Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. source=1
 sinks=0
  [2015-02-03 20:55:51.465289] I
 [afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do]
 0-homegfs-replicate-0: performing metadata selfheal on
 403e661a-1c27-4e79-9867-c0572aba2b3c
  [2015-02-03 20:55:51.466515] I
 [afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0:
 Completed metadata selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c.
 source=1 sinks=0
  [2015-02-03 20:55:51.467098] I
 [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-homegfs-replicate-0:
 performing entry selfheal on 403e661a-1c27-4e79-9867-c0572aba2b3c
  [2015-02-03 

Re: [Gluster-devel] missing files

2015-02-05 Thread Xavier Hernandez

Is the failure repeatable? With the same directories?

It's very weird that the directories appear on the volume when you do an 
'ls' on the bricks. Could it be that you only did a single 'ls' on the fuse 
mount, which did not show the directory? Is it possible that this 'ls' 
triggered a self-heal that repaired the problem, whatever it was, and 
when you did another 'ls' on the fuse mount after the 'ls' on the 
bricks, the directories were there?


The first 'ls' could have healed the files, causing the following 
'ls' on the bricks to show the files as if nothing were damaged. If 
that's the case, it's possible that there were some disconnections 
during the copy.


Added Pranith because he knows better replication and self-heal details.

Xavi

On 02/04/2015 07:23 PM, David F. Robinson wrote:

Distributed/replicated

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 10
storage.owner-gid: 100
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: on
changelog.fsync-interval: 3
changelog.rollover-time: 15
server.manage-gids: on


-- Original Message --
From: Xavier Hernandez xhernan...@datalab.es
To: David F. Robinson david.robin...@corvidtec.com; Benjamin
Turner bennytu...@gmail.com
Cc: gluster-us...@gluster.org gluster-us...@gluster.org; Gluster
Devel gluster-devel@gluster.org
Sent: 2/4/2015 6:03:45 AM
Subject: Re: [Gluster-devel] missing files


On 02/04/2015 01:30 AM, David F. Robinson wrote:

Sorry. Thought about this a little more. I should have been clearer.
The files were on both bricks of the replica, not just one side. So,
both bricks had to have been up... The files/directories just don't show
up on the mount.
I was reading and saw a related bug
(https://bugzilla.redhat.com/show_bug.cgi?id=1159484). I saw it
suggested to run:
 find <mount> -d -exec getfattr -h -n trusted.ec.heal {} \;


This command is specific to a dispersed volume. It won't do anything
(aside from the error you are seeing) on a replicated volume.

I think you are using a replicated volume, right?
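
On a replicated volume, the rough equivalent is to look at the AFR changelog xattrs directly on a brick, or simply ask gluster what it thinks needs healing (the file path below is only an example):

# Replication (AFR) xattrs of a suspect entry, read from one brick
getfattr -d -m trusted.afr -e hex /data/brick01a/homegfs/path/to/suspect/entry

# Or the volume-wide view
gluster volume heal homegfs info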

In this case I'm not sure what can be happening. Is your volume a pure
replicated one or a distributed-replicated one? On a pure replicated volume it
doesn't make sense that some entries do not show in an 'ls' when the
file is in both replicas (at least without any error message in the
logs). On a distributed-replicated volume it could be caused by some problem
while combining the contents of each replica set.

What's the configuration of your volume?

Xavi



I get a bunch of errors for operation not supported:
[root@gfs02a homegfs]# find wks_backup -d -exec getfattr -h -n
trusted.ec.heal {} \;
find: warning: the -d option is deprecated; please use -depth instead,
because the latter is a POSIX-compliant feature.
wks_backup/homer_backup/backup: trusted.ec.heal: Operation not supported
wks_backup/homer_backup/logs/2014_05_20.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_21.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_18.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_19.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs/2014_05_22.log: trusted.ec.heal: Operation
not supported
wks_backup/homer_backup/logs: trusted.ec.heal: Operation not supported
wks_backup/homer_backup: trusted.ec.heal: Operation not supported
-- Original Message --
From: Benjamin Turner bennytu...@gmail.com
To: David F. Robinson david.robin...@corvidtec.com
Cc: Gluster Devel gluster-devel@gluster.org; gluster-us...@gluster.org
Sent: 2/3/2015 7:12:34 PM
Subject: Re: [Gluster-devel] missing files

It sounds to me like the files were only copied to one replica, weren't
there for the initial ls which triggered a self-heal,
and were there for the last ls because they were healed. Is there any
chance that one of the replicas was down during the rsync? It could
be that you lost a brick during the copy or something like that. To
confirm I would look for disconnects in the brick logs as well as
checking glustershd.log to 

Re: [Gluster-devel] [Gluster-users] missing files

2015-02-05 Thread David F. Robinson

Copy that.  Thanks for looking into the issue.

David


-- Original Message --
From: Benjamin Turner bennytu...@gmail.com
To: David F. Robinson david.robin...@corvidtec.com
Cc: Ben Turner btur...@redhat.com; Pranith Kumar Karampuri 
pkara...@redhat.com; Xavier Hernandez xhernan...@datalab.es; 
gluster-us...@gluster.org gluster-us...@gluster.org; Gluster Devel 
gluster-devel@gluster.org

Sent: 2/5/2015 9:05:43 PM
Subject: Re: [Gluster-users] [Gluster-devel] missing files

Correct!  I have seen (back in the day - it's been 3ish years since I have 
seen it) having, say, 50+ volumes, each with a geo-rep session, take system 
load levels to the point where pings couldn't be serviced within the 
ping timeout.  So it is known to happen, but there has been a lot of work 
in the geo-rep space to help here, some of which is discussed:


https://medium.com/@msvbhat/distributed-geo-replication-in-glusterfs-ec95f4393c50

(think tar + ssh and other fixes). Your symptoms remind me of that case 
of 50+ geo-repped volumes; that's why I mentioned it from the start.  My 
current shoot-from-the-hip theory is that when rsyncing all that data the 
servers got too busy to service the pings and it led to disconnects.  
This is common across all of the clustering / distributed software I 
have worked on: if the system gets too busy to service heartbeat within 
the timeout, things go crazy (think fork bomb on a single host).  Now 
this could be a case of me putting symptoms from an old issue onto what 
you are describing, but that's where my head is at.  If I'm correct I 
should be able to repro using a similar workload.  I think that the 
multi-threaded epoll changes that _just_ landed in master will help 
resolve this, but they are so new I haven't been able to test this.  
I'll know more when I get a chance to test tomorrow.


-b

On Thu, Feb 5, 2015 at 6:04 PM, David F. Robinson 
david.robin...@corvidtec.com wrote:

Isn't rsync what geo-rep uses?

David  (Sent from mobile)

===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310  [cell]
704.799.7974  [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

 On Feb 5, 2015, at 5:41 PM, Ben Turner btur...@redhat.com wrote:

 - Original Message -
 From: Ben Turner btur...@redhat.com
 To: David F. Robinson david.robin...@corvidtec.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier 
Hernandez xhernan...@datalab.es, Benjamin Turner
 bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel 
gluster-devel@gluster.org

 Sent: Thursday, February 5, 2015 5:22:26 PM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files

 - Original Message -
 From: David F. Robinson david.robin...@corvidtec.com
 To: Ben Turner btur...@redhat.com
 Cc: Pranith Kumar Karampuri pkara...@redhat.com, Xavier 
Hernandez

 xhernan...@datalab.es, Benjamin Turner
 bennytu...@gmail.com, gluster-us...@gluster.org, Gluster Devel
 gluster-devel@gluster.org
 Sent: Thursday, February 5, 2015 5:01:13 PM
 Subject: Re: [Gluster-users] [Gluster-devel] missing files

 I'll send you the emails I sent Pranith with the logs. What causes 
these

 disconnects?

 Thanks David!  Disconnects happen when there are interruptions in 
 communication between peers; normally a ping timeout is what happens.
 It could be anything from a flaky NW to the system being too busy to 
 respond to the pings.  My initial take leans more towards the latter, as 
 rsync is absolutely the worst use case for gluster - IIRC it writes in 4kb 
 blocks.  I try to keep my writes at least 64KB, as in my testing that is 
 the smallest block size I can write with before perf starts to really 
 drop off.  I'll try something similar in the lab.
 something similar in the lab.

 OK, I do think that the files being self-healed is the RCA for what you 
were seeing.  Let's look at one of the disconnects:


 data-brick02a-homegfs.log:[2015-02-03 20:54:02.772180] I 
[server.c:518:server_rpc_notify] 0-homegfs-server: disconnecting 
connection from 
gfs01b.corvidtec.com-4175-2015/02/02-16:44:31:179119-homegfs-client-2-0-1


 And in the glustershd.log from the gfs01b_glustershd.log file:

 [2015-02-03 20:55:48.001797] I 
[afr-self-heal-entry.c:554:afr_selfheal_entry_do] 
0-homegfs-replicate-0: performing entry selfheal on 
6c79a368-edaa-432b-bef9-ec690ab42448
 [2015-02-03 20:55:49.341996] I 
[afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: 
Completed entry selfheal on 6c79a368-edaa-432b-bef9-ec690ab42448. 
source=1 sinks=0
 [2015-02-03 20:55:49.343093] I 
[afr-self-heal-entry.c:554:afr_selfheal_entry_do] 
0-homegfs-replicate-0: performing entry selfheal on 
792cb0d6-9290-4447-8cd7-2b2d7a116a69
 [2015-02-03 20:55:50.463652] I 
[afr-self-heal-common.c:476:afr_log_selfheal] 0-homegfs-replicate-0: 
Completed entry selfheal on 792cb0d6-9290-4447-8cd7-2b2d7a116a69. 
source=1 sinks=0
 [2015-02-03 20:55:51.465289] I 
[afr-self-heal-metadata.c:54:__afr_selfheal_metadata_do] 

Re: [Gluster-devel] [Gluster-users] Input/Output Error on Gluster NFS

2015-02-05 Thread Peter Auyeung
Hi Soumya

root@glusterprod001:~# gluster volume info | grep nfs.acl
02/05/15 10:00:05 [ /root ]

Seems like we do not have ACL enabled.

nfs client is a RHEL4 standard NFS client

Thanks
-Peter

From: Soumya Koduri [skod...@redhat.com]
Sent: Wednesday, February 04, 2015 11:28 PM
To: Peter Auyeung; gluster-us...@gluster.org; gluster-devel@gluster.org
Subject: Re: [Gluster-devel] [Gluster-users] Input/Output Error on Gluster NFS

Hi Peter,

Have you disabled Gluster-NFS ACLs?

Please check the option value -
#gluster v info | grep nfs.acl
nfs.acl: ON

Also please provide the details of the nfs-client you are using.
Typically, nfs-clients seem to issue getxattr before doing
setxattr/removexattr operations and return 'ENOTSUPP' in case ACLs are
disabled. But from the strace, it looks like the client issued
'removexattr' of 'system.posix_acl_default', which returned EIO.

Anyway, 'removexattr' should also have returned EOPNOTSUPP instead of EIO.

Thanks,
Soumya

On 02/05/2015 02:31 AM, Peter Auyeung wrote:
 I was trying to copy a directory of files to Gluster via NFS and getting
 permission denied with Input/Output error

 --- r...@bizratedbstandby.bo2.shopzilla.sea (0.00)# cp -pr db /mnt/
 cp: setting permissions for 
 `/mnt/db/full/pr_bizrate_standby_SMLS.F02-01-22-35.d': Input/output error
 cp: setting permissions for 
 `/mnt/db/full/pr_bizrate_standby_logging.F02-02-18-10.b': Input/output error
 cp: setting permissions for `/mnt/db/full/pr_bizrate_SMLS.F02-01-22-35.d': 
 Input/output error
 cp: setting permissions for 
 `/mnt/db/full/pr_bizrate_standby_master.F02-02-22-00': Input/output error
 cp: setting permissions for `/mnt/db/full': Input/output error
 cp: setting permissions for `/mnt/db': Input/output error

 Checked the gluster nfs.log and etc log, and the bricks look clean.
 The files end up getting copied over with the right permissions.

 Traced the copy, and it seems like it failed on removexattr:

 removexattr("/mnt/db", "system.posix_acl_default"...) = -1 EIO (Input/output 
 error)

 http://pastie.org/9884810

 Any Clue?

 Thanks
 Peter







 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] REMINDER: GlusterFS.next (a.k.a. 4.0) status/planning meeting

2015-02-05 Thread Jeff Darcy
This is *tomorrow* at 12:00 UTC (approximately 15.5 hours from now) in
#gluster-meeting on Freenode.  See you all there!

- Original Message -
 Perhaps it's not obvious to the broader community, but a bunch of people
 have put a bunch of work into various projects under the 4.0 banner.
 Some of the results can be seen in the various feature pages here:
 
 http://www.gluster.org/community/documentation/index.php/Planning40
 
 Now that the various subproject feature pages have been updated, it's
 time to get people together and decide what 4.0 is *really* going to be.
 To that end, I'd like to schedule an IRC meeting for February 6 at 12:00
 UTC - that's this Friday, same time as the triage/community meetings but
 on Friday instead of Tuesday/Wednesday.  An initial agenda includes:
 
 * Introduction and expectation-setting
 
 * Project-by-project status and planning
 
 * Discussion of future meeting formats and times
 
 * Discussion of collaboration tools (e.g. gluster.org wiki or
   Freedcamp) going forward.
 
 Anyone with an interest in the future of GlusterFS is welcome to attend.
 This is *not* a Red Hat only effort, tied to Red Hat product needs and
 schedules and strategies.  This is a chance for the community to come
 together and define what the next generation of distributed file
 systems for the real world will look like.  I hope to see everyone
 there.
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel