Re: [Gluster-devel] Fw: Re: Corvid gluster testing
Just to clarify a little, there are two cases where I was evaluating performance.

1) The first case that Pranith was working on involved 20 nodes with 4 processors on each node, for a total of 80 processors. Each processor does its own independent i/o. These files are roughly 100-200MB each and there are several hundred of them. When I mounted the gluster system using FUSE, it took 1.5 hours to do the i/o. When I mounted the same system using NFS, it took 30 minutes. Note that in order to get the gluster-mounted file system down to 1.5 hours, I had to get rid of the replicated volume (this was done during troubleshooting with Pranith to rule out other possible issues). The timing was significantly worse (3+ hours) when I was using a replicated pair.

2) The second case was the output of a larger single file (roughly 2.5TB). For this case, the gluster-mounted filesystem takes 60 seconds (although I got that down to 52 seconds with some gluster parameter tuning). The NFS mount takes 38 seconds. I sent the results of this to the developer list first as this case is much easier to test (roughly 50 seconds versus what could be 3+ hours).

I am headed out of town for a few days and will not be able to do additional testing until Monday. For the second case, I will turn off cluster.eager-lock and send the results to the email list. If there is any other testing that you would like to see for the first case, let me know and I will be happy to perform the tests and send in the results...

Sorry for the confusion...

David

------ Original Message ------
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Anand Avati av...@gluster.org
Cc: David F. Robinson david.robin...@corvidtec.com; Gluster Devel gluster-devel@gluster.org
Sent: 8/6/2014 9:51:11 PM
Subject: Re: [Gluster-devel] Fw: Re: Corvid gluster testing

On 08/07/2014 07:18 AM, Anand Avati wrote:
It would be worth checking the perf numbers without -o acl (in case it was enabled, as seen in the other gid thread). The client-side -o acl mount option can have a negative impact on performance because of the increased number of up-calls from FUSE for access().

Actually it is all write intensive. Here are the numbers they gave me from earlier runs:

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls          Fop
 ---------   -----------   -----------   -----------   ------------   ----------
      0.00       0.00 us       0.00 us        0.00 us             99       FORGET
      0.00       0.00 us       0.00 us        0.00 us           1093      RELEASE
      0.00       0.00 us       0.00 us        0.00 us            468   RELEASEDIR
      0.00      60.00 us      26.00 us      107.00 us              4      SETATTR
      0.00      91.56 us      42.00 us      157.00 us             27       UNLINK
      0.00      20.75 us      12.00 us       55.00 us            132     GETXATTR
      0.00      19.03 us       9.00 us       95.00 us            152     READLINK
      0.00      43.19 us      12.00 us      106.00 us             83         OPEN
      0.00      18.37 us       8.00 us       92.00 us            257       STATFS
      0.00      32.42 us      11.00 us      118.00 us            322      OPENDIR
      0.00      36.09 us       5.00 us      109.00 us            359        FSTAT
      0.00      51.14 us      37.00 us      183.00 us            663       RENAME
      0.00      33.32 us       6.00 us      123.00 us           1451         STAT
      0.00     821.79 us      21.00 us    22678.00 us             84         READ
      0.00      34.88 us       3.00 us      139.00 us           2326        FLUSH
      0.01     789.33 us      72.00 us    64054.00 us            347       CREATE
      0.01    1144.63 us      43.00 us   280735.00 us            337    FTRUNCATE
      0.01      47.82 us      16.00 us    19817.00 us          16513       LOOKUP
      0.02     604.85 us      11.00 us     1233.00 us           1423     READDIRP
     99.95      17.51 us       6.00 us   212701.00 us      300715967        WRITE

 Duration: 5390 seconds
 Data Read: 1495257497 bytes
 Data Written: 166546887668 bytes

Pranith

Thanks

On Wed, Aug 6, 2014 at 6:26 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:

On 08/07/2014 06:48 AM, Anand Avati wrote:
On Wed, Aug 6, 2014 at 6:05 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
We checked this performance with plain distribute as well and on nfs it gave 25 minutes where as on nfs it gave around 90 minutes after disabling throttling in both situations.

This sentence is very confusing. Can you please state it more clearly?

Sorry :-D. We checked this performance on a plain distribute volume with throttling disabled. On nfs the run took 25 minutes. On fuse the run took 90 minutes.

Pranith

Thanks

I was wondering if any of you guys know what could contribute to this difference.

Pranith

On 08/07/2014 01:33 AM, Anand Avati wrote:
Seems like heavy FINODELK contention. As a diagnostic step, can you try disabling eager-locking and check the write performance again (gluster volume set $name cluster.eager-lock off)?
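For anyone who wants to reproduce the numbers above, here is a minimal sketch of how a per-fop profile like this is normally gathered, together with the eager-lock diagnostic suggested at the end of the quoted thread. This assumes the volume name homegfs used elsewhere in the thread and a gluster CLI of roughly the 3.5 era; verify the option names against your installed version.

    # start collecting per-brick fop statistics
    gluster volume profile homegfs start

    # ... run the MPI write test against the fuse mount ...

    # dump the cumulative latency/count table (the output quoted above)
    gluster volume profile homegfs info

    # diagnostic suggested by Avati: disable eager locking and rerun the test
    gluster volume set homegfs cluster.eager-lock off

    # restore the default once the comparison run is done
    gluster volume set homegfs cluster.eager-lock on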
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
David,

Is it possible to profile the app to understand the block sizes used for performing write() (using strace, source code inspection, etc.)? The block sizes reported by gluster volume profile are measured on the server side and are subject to some aggregation by the client-side write-behind xlator. Typically the biggest hurdle for small-block writes is FUSE context switches, which happen even before reaching the client-side write-behind xlator. You could also enable the io-stats xlator on the client side just below FUSE (before reaching write-behind) and extract data using setfattr.

On Wed, Aug 6, 2014 at 10:00 AM, David F. Robinson david.robin...@corvidtec.com wrote:

My apologies. I did some additional testing and realized that my timing wasn't right. I believe that after I do the write, NFS caches the data, and until I close and flush the file the timing isn't correct. I believe the appropriate timing is now 38 seconds for NFS and 60 seconds for gluster. I played around with some of the parameters and got it down to 52 seconds with gluster by setting:

    performance.write-behind-window-size: 128MB
    performance.cache-size: 128MB

I couldn't get it closer to the NFS timing on the writes, although the read speeds were slightly better than NFS. I am not sure if this is reasonable, or if I should be able to get write speeds that are more comparable to the NFS mount... Sorry for the confusion I might have caused with my first email... It isn't 25x slower. It is roughly 30% slower for the writes...

David

------ Original Message ------
From: Vijay Bellur vbel...@redhat.com
To: David F. Robinson david.robin...@corvidtec.com; gluster-devel@gluster.org
Sent: 8/6/2014 12:48:09 PM
Subject: Re: [Gluster-devel] Fw: Re: Corvid gluster testing

On 08/06/2014 12:11 AM, David F. Robinson wrote:
I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes:

I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45 seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0

When I switch this to use the NFS protocol (see below), the i/o time is 2.5 seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0

The read times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower.

What is the block size of the writes that are being performed? You can expect better throughput and lower latency with block sizes that are close to or greater than 128KB.

-Vijay
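To act on Avati's suggestion, here is a rough sketch of measuring the application's actual write() block sizes with strace. The './solver' binary name, the log path, and the dump file are placeholders, and the trusted.io-stats-dump xattr behaviour differs between gluster releases, so treat both as assumptions to verify rather than the prescribed procedure.

    # trace only write()/pwrite64() calls of the writer process
    strace -f -T -e trace=write,pwrite64 -o /tmp/writes.log ./solver

    # each logged call ends with ') = <bytes> <elapsed>'; build a quick
    # histogram of the write sizes the app actually issues
    sed -n 's/.*) = \([0-9]*\) <.*/\1/p' /tmp/writes.log | sort -n | uniq -c

    # (if supported by your release) dump client-side io-stats counters by
    # setting a virtual xattr on the fuse mount point
    setfattr -n trusted.io-stats-dump -v /tmp/gluster_io_stats.txt /homegfs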
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
Forgot to attach profile info in previous email. Attached...

David

------ Original Message ------
From: David F. Robinson david.robin...@corvidtec.com
To: gluster-devel@gluster.org
Sent: 8/5/2014 2:41:34 PM
Subject: Fw: Re: Corvid gluster testing

I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes:

I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45 seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0

When I switch this to use the NFS protocol (see below), the i/o time is 2.5 seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0

The read times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower.

I am running SL 6.4 and glusterfs-3.5.2-0.1.beta1.el6.x86_64...

    [root@gfs01a glusterfs]# gluster volume info homegfs
    Volume Name: homegfs
    Type: Distributed-Replicate
    Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
    Status: Started
    Number of Bricks: 2 x 2 = 4
    Transport-type: tcp
    Bricks:
    Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
    Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
    Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
    Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs

David

------ Forwarded Message ------
From: Pranith Kumar Karampuri pkara...@redhat.com
To: David Robinson david.robin...@corvidtec.com
Cc: Young Thomas tom.yo...@corvidtec.com
Sent: 8/5/2014 2:25:38 AM
Subject: Re: Corvid gluster testing

gluster-devel@gluster.org is the email-id for the mailing list. We should probably start with the initial run numbers and the comparison for glusterfs mounts and nfs mounts, maybe something like:

    glusterfs mount: 90 minutes
    nfs mount: 25 minutes

And profile outputs, volume config, number of mounts, and hardware configuration would be a good start.

Pranith

On 08/05/2014 09:28 AM, David Robinson wrote:
Thanks pranith

===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310 [cell]
704.799.7974 [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

On Aug 4, 2014, at 11:22 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
On 08/05/2014 08:33 AM, Pranith Kumar Karampuri wrote:
On 08/05/2014 08:29 AM, David F. Robinson wrote:
On 08/05/2014 12:51 AM, David F. Robinson wrote:

No. I don't want to use nfs. It eliminates most of the benefits of why I want to use gluster: failover redundancy of the pair, load balancing, etc.

What is the meaning of 'failover redundancy of the pair, load balancing'? Could you elaborate more? smb/nfs/glusterfs are just access protocols that gluster supports; the functionality is almost the same.

Here is my understanding. Please correct me where I am wrong. With gluster, if I am doing a write and one of the replicated pairs goes down, there is no interruption to the i/o. The failover is handled by gluster and the fuse client. This isn't done if I use an nfs mount, unless the component of the pair that goes down isn't the one I used for the mount. With nfs, I will have to mount one of the bricks. So, if I have gfs01a, gfs01b, gfs02a, gfs02b, gfs03a, gfs03b, etc. and my fstab mounts gfs01a, it is my understanding that all of my i/o will go through gfs01a, which then gets distributed to all of the other bricks. gfs01a throughput becomes a bottleneck. Whereas if I do a gluster mount using fuse, the load balancing is handled on the client side, not the server side. If I have 1000 nodes accessing 20 gluster bricks, I need the load-balancing aspect. I cannot have all traffic going through the network interface on a single brick. If I am wrong with the above assumptions, I guess my question is why would one ever use the gluster mount instead of nfs and/or samba? Tom: feel free to chime in if I have missed anything.

I see your point now. Yes, the gluster server where you did the mount is kind of a bottleneck. Now that we established the problem is in the clients/protocols, you should send out a detailed mail on gluster-devel and see if anyone can help you with performance xlators that can improve it a bit more. My area of expertise is more on replication; I am sub-maintainer for the replication and locks components. I also know connection-management/io-threads related issues which lead to hangs, as I worked on them before. Performance xlators are a black box to me. Performance xlators are enabled only on the fuse gluster stack. On nfs server mounts we
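To illustrate the difference being discussed above: with a FUSE mount the client fetches the volume layout from one server but then talks to every brick directly, while an NFS mount funnels all traffic through the single server named in fstab. A hedged fstab sketch follows; the backupvolfile-server mount option only covers the volfile fetch at mount time, and its exact spelling should be checked against your glusterfs version.

    # FUSE mount: client-side replication/distribution; gfsib01b is used as a
    # fallback volfile server if gfsib01a is unreachable at mount time
    gfsib01a.corvidtec.com:/homegfs  /homegfs  glusterfs  transport=tcp,backupvolfile-server=gfsib01b.corvidtec.com,_netdev  0 0

    # NFS mount: all i/o goes through gfsib01a
    gfsib01a.corvidtec.com:/homegfs  /homegfs  nfs  vers=3,intr,bg,rsize=32768,wsize=32768  0 0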
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
On 08/06/2014 12:11 AM, David F. Robinson wrote:
I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes:

I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0

When I switch this to use the NFS protocol (see below), the i/o time is 2.5-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0

The read-times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower.

What is the block size of the writes that are being performed? You can expect better throughput and lower latency with block sizes that are close to or greater than 128KB.

-Vijay
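As a quick way to see the block-size effect Vijay describes, here is a sketch comparing small and large writes through the fuse mount, plus the two write-side options David reports tuning elsewhere in the thread. The test file path is a placeholder, and this is an illustration rather than a benchmark methodology.

    # 1 GiB written as 4KB blocks: every write pays the FUSE round-trip cost
    dd if=/dev/zero of=/homegfs/blocksize_test bs=4k count=262144 conv=fsync

    # the same 1 GiB written as 1MB blocks: far fewer context switches
    dd if=/dev/zero of=/homegfs/blocksize_test bs=1M count=1024 conv=fsync

    # write-behind/cache settings mentioned elsewhere in this thread
    gluster volume set homegfs performance.write-behind-window-size 128MB
    gluster volume set homegfs performance.cache-size 128MB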
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
hi Avati,
We checked this performance with plain distribute as well and on nfs it gave 25 minutes where as on nfs it gave around 90 minutes after disabling throttling in both situations. I was wondering if any of you guys know what could contribute to this difference.

Pranith

On 08/07/2014 01:33 AM, Anand Avati wrote:
Seems like heavy FINODELK contention. As a diagnostic step, can you try disabling eager-locking and check the write performance again (gluster volume set $name cluster.eager-lock off)?

On Tue, Aug 5, 2014 at 11:44 AM, David F. Robinson david.robin...@corvidtec.com wrote:

Forgot to attach profile info in previous email. Attached...

David

------ Original Message ------
From: David F. Robinson david.robin...@corvidtec.com
To: gluster-devel@gluster.org
Sent: 8/5/2014 2:41:34 PM
Subject: Fw: Re: Corvid gluster testing

I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes:

I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0

When I switch this to use the NFS protocol (see below), the i/o time is 2.5-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0

The read-times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower. I am running SL 6.4 and glusterfs-3.5.2-0.1.beta1.el6.x86_64...

    [root@gfs01a glusterfs]# gluster volume info homegfs
    Volume Name: homegfs
    Type: Distributed-Replicate
    Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
    Status: Started
    Number of Bricks: 2 x 2 = 4
    Transport-type: tcp
    Bricks:
    Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
    Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
    Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
    Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs

David

------ Forwarded Message ------
From: Pranith Kumar Karampuri pkara...@redhat.com
To: David Robinson david.robin...@corvidtec.com
Cc: Young Thomas tom.yo...@corvidtec.com
Sent: 8/5/2014 2:25:38 AM
Subject: Re: Corvid gluster testing

gluster-devel@gluster.org is the email-id for the mailing list. We should probably start with the initial run numbers and the comparison for glusterfs mount and nfs mounts. May be something like

    glusterfs mount: 90 minutes
    nfs mount: 25 minutes

And profile outputs, volume config, number of mounts, hardware configuration should be a good start.

Pranith

On 08/05/2014 09:28 AM, David Robinson wrote:
Thanks pranith

===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310 [cell]
704.799.7974 [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

On Aug 4, 2014, at 11:22 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
On 08/05/2014 08:33 AM, Pranith Kumar Karampuri wrote:
On 08/05/2014 08:29 AM, David F. Robinson wrote:
On 08/05/2014 12:51 AM, David F. Robinson wrote:

No. I don't want to use nfs. It eliminates most of the benefits of why I want to use gluster. Failover redundancy of the pair, load balancing, etc.

What is the meaning of 'Failover redundancy of the pair, load balancing'? Could you elaborate more? smb/nfs/glusterfs are just access protocols that gluster supports; functionality is almost same.

Here is my understanding. Please correct me where I am wrong. With gluster, if I am doing a write and one of the replicated pairs goes down, there is no interruption to the I/o. The failover is handled by gluster and the fuse client. This isn't done if I use an nfs mount unless the component of the pair that goes down isn't the one I used for the mount. With nfs, I will have to mount one of the bricks. So, if I have gfs01a, gfs01b, gfs02a, gfs02b, gfs03a, gfs03b, etc and my
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
On 08/07/2014 06:48 AM, Anand Avati wrote:
On Wed, Aug 6, 2014 at 6:05 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
We checked this performance with plain distribute as well and on nfs it gave 25 minutes where as on nfs it gave around 90 minutes after disabling throttling in both situations.

This sentence is very confusing. Can you please state it more clearly?

sorry :-D. We checked this performance on plain distribute volume by disabling throttling. On nfs the run took 25 minutes. On fuse the run took 90 minutes.

Pranith

Thanks

I was wondering if any of you guys know what could contribute to this difference.

Pranith

On 08/07/2014 01:33 AM, Anand Avati wrote:
Seems like heavy FINODELK contention. As a diagnostic step, can you try disabling eager-locking and check the write performance again (gluster volume set $name cluster.eager-lock off)?

On Tue, Aug 5, 2014 at 11:44 AM, David F. Robinson david.robin...@corvidtec.com wrote:

Forgot to attach profile info in previous email. Attached...

David

------ Original Message ------
From: David F. Robinson david.robin...@corvidtec.com
To: gluster-devel@gluster.org
Sent: 8/5/2014 2:41:34 PM
Subject: Fw: Re: Corvid gluster testing

I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes:

I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0

When I switch this to use the NFS protocol (see below), the i/o time is 2.5-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0

The read-times for gluster are 10-20% faster than NFS, but the write times are almost 20x slower. I am running SL 6.4 and glusterfs-3.5.2-0.1.beta1.el6.x86_64...

    [root@gfs01a glusterfs]# gluster volume info homegfs
    Volume Name: homegfs
    Type: Distributed-Replicate
    Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
    Status: Started
    Number of Bricks: 2 x 2 = 4
    Transport-type: tcp
    Bricks:
    Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
    Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
    Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
    Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs

David

------ Forwarded Message ------
From: Pranith Kumar Karampuri pkara...@redhat.com
To: David Robinson david.robin...@corvidtec.com
Cc: Young Thomas tom.yo...@corvidtec.com
Sent: 8/5/2014 2:25:38 AM
Subject: Re: Corvid gluster testing

gluster-devel@gluster.org is the email-id for the mailing list. We should probably start with the initial run numbers and the comparison for glusterfs mount and nfs mounts. May be something like

    glusterfs mount: 90 minutes
    nfs mount: 25 minutes

And profile outputs, volume config, number of mounts, hardware configuration should be a good start.

Pranith

On 08/05/2014 09:28 AM, David Robinson wrote:
Thanks pranith

===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310 [cell]
704.799.7974 [fax]
david.robin...@corvidtec.com
http://www.corvidtechnologies.com

On Aug 4, 2014, at 11:22 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
On 08/05/2014 08:33 AM, Pranith Kumar Karampuri wrote:
On 08/05/2014 08:29 AM, David F. Robinson wrote:
On 08/05/2014 12:51 AM, David F. Robinson wrote:

No. I don't want to use nfs. It eliminates most of the benefits of why I want to use gluster. Failover
Re: [Gluster-devel] Fw: Re: Corvid gluster testing
On 08/07/2014 07:18 AM, Anand Avati wrote:
It would be worth checking the perf numbers without -o acl (in case it was enabled, as seen in the other gid thread). Client side -o acl mount option can have a negative impact on performance because of the increased number of up-calls from FUSE for access().

Actually it is all write intensive. Here are the numbers they gave me from earlier runs:

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls          Fop
 ---------   -----------   -----------   -----------   ------------   ----------
      0.00       0.00 us       0.00 us        0.00 us             99       FORGET
      0.00       0.00 us       0.00 us        0.00 us           1093      RELEASE
      0.00       0.00 us       0.00 us        0.00 us            468   RELEASEDIR
      0.00      60.00 us      26.00 us      107.00 us              4      SETATTR
      0.00      91.56 us      42.00 us      157.00 us             27       UNLINK
      0.00      20.75 us      12.00 us       55.00 us            132     GETXATTR
      0.00      19.03 us       9.00 us       95.00 us            152     READLINK
      0.00      43.19 us      12.00 us      106.00 us             83         OPEN
      0.00      18.37 us       8.00 us       92.00 us            257       STATFS
      0.00      32.42 us      11.00 us      118.00 us            322      OPENDIR
      0.00      36.09 us       5.00 us      109.00 us            359        FSTAT
      0.00      51.14 us      37.00 us      183.00 us            663       RENAME
      0.00      33.32 us       6.00 us      123.00 us           1451         STAT
      0.00     821.79 us      21.00 us    22678.00 us             84         READ
      0.00      34.88 us       3.00 us      139.00 us           2326        FLUSH
      0.01     789.33 us      72.00 us    64054.00 us            347       CREATE
      0.01    1144.63 us      43.00 us   280735.00 us            337    FTRUNCATE
      0.01      47.82 us      16.00 us    19817.00 us          16513       LOOKUP
      0.02     604.85 us      11.00 us     1233.00 us           1423     READDIRP
     99.95      17.51 us       6.00 us   212701.00 us      300715967        WRITE

 Duration: 5390 seconds
 Data Read: 1495257497 bytes
 Data Written: 166546887668 bytes

Pranith

Thanks

On Wed, Aug 6, 2014 at 6:26 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:

On 08/07/2014 06:48 AM, Anand Avati wrote:
On Wed, Aug 6, 2014 at 6:05 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote:
We checked this performance with plain distribute as well and on nfs it gave 25 minutes where as on nfs it gave around 90 minutes after disabling throttling in both situations.

This sentence is very confusing. Can you please state it more clearly?

sorry :-D. We checked this performance on plain distribute volume by disabling throttling. On nfs the run took 25 minutes. On fuse the run took 90 minutes.

Pranith

Thanks

I was wondering if any of you guys know what could contribute to this difference.

Pranith

On 08/07/2014 01:33 AM, Anand Avati wrote:
Seems like heavy FINODELK contention. As a diagnostic step, can you try disabling eager-locking and check the write performance again (gluster volume set $name cluster.eager-lock off)?

On Tue, Aug 5, 2014 at 11:44 AM, David F. Robinson david.robin...@corvidtec.com wrote:

Forgot to attach profile info in previous email. Attached...

David

------ Original Message ------
From: David F. Robinson david.robin...@corvidtec.com
To: gluster-devel@gluster.org
Sent: 8/5/2014 2:41:34 PM
Subject: Fw: Re: Corvid gluster testing

I have been testing some of the fixes that Pranith incorporated into the 3.5.2-beta to see how they performed for moderate levels of i/o. All of the stability issues that I had seen in previous versions seem to have been fixed in 3.5.2; however, there still seem to be some significant performance issues. Pranith suggested that I send this to the gluster-devel email list, so here goes:

I am running an MPI job that saves a restart file to the gluster file system. When I use the following in my fstab to mount the gluster volume, the i/o time for the 2.5GB file is roughly 45-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs glusterfs transport=tcp,_netdev 0 0

When I switch this to use the NFS protocol (see below), the i/o time is 2.5-seconds.

    gfsib01a.corvidtec.com:/homegfs /homegfs nfs vers=3,intr,bg,rsize=32768,wsize=32768 0 0

The read-times for gluster are 10-20% faster than NFS, but the write times are