On Thu, May 11, 2017 at 2:48 AM, Pat Haley <[email protected]> wrote:

> Hi Pranith,
>
> Since we are mounting the partitions as the bricks, I tried the dd test
> writing to <brick-path>/.glusterfs/<file-to-be-removed-after-test>. The
> results without oflag=sync were 1.6 Gb/s (faster than gluster, but not as
> fast as I was expecting given the 1.2 Gb/s to the no-gluster area with
> fewer disks).

Okay, then 1.6 Gb/s is what we need to target, considering your volume is
just a distribute volume. Is there any way you can do tests on similar
hardware but at a smaller scale, just so we can run the workload and learn
more about the bottlenecks in the system? We can probably try to get the
speed to 1.2 Gb/s on the /home partition you were telling me about
yesterday. Let me know if that is something you are okay to do.
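For reference, this is roughly the comparison I have in mind (only a
sketch: the scratch file names and the /home directory are placeholders,
and the files should be removed after the test):

# Directly on the brick filesystem, bypassing gluster (as you did above):
dd if=/dev/zero of=/mnt/brick1/.glusterfs/ddtest.tmp bs=1048576 count=4096

# The same size and block size on the non-gluster /home area:
dd if=/dev/zero of=/home/<some-dir>/ddtest.tmp bs=1048576 count=4096

# Then repeat both with oflag=sync appended, to compare the synchronous
# write cost on the two areas as well.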
> Pat
>
> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>
> On Wed, May 10, 2017 at 10:15 PM, Pat Haley <[email protected]> wrote:
>
>> Hi Pranith,
>>
>> Not entirely sure (this isn't my area of expertise). I'll run your
>> answer by some other people who are more familiar with this.
>>
>> I am also uncertain about how to interpret the results when we also add
>> the dd tests writing to the /home area (no gluster, still on the same
>> machine):
>>
>> - dd test without oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount: 570 Mb/s
>>   - gluster w/ nfs mount: 390 Mb/s
>>   - nfs (no gluster): 1.2 Gb/s
>> - dd test with oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount: 5 Mb/s
>>   - gluster w/ nfs mount: 200 Mb/s
>>   - nfs (no gluster): 20 Mb/s
>>
>> Given that the non-gluster area is a RAID-6 of 4 disks while each brick
>> of the gluster area is a RAID-6 of 32 disks, I would naively expect the
>> writes to the gluster area to be roughly 8x faster than to the
>> non-gluster area.
>
> I think a better test is to try to write to a file using NFS without any
> gluster, to a location that is not inside the brick but some other
> location on the same disk(s). If you are mounting the partition as the
> brick, then we can write to a file inside the .glusterfs directory,
> something like <brick-path>/.glusterfs/<file-to-be-removed-after-test>.
>
>> I still think we have a speed issue; I can't tell if fuse vs nfs is part
>> of the problem.
>
> I got interested in the post because I read that fuse speed is lower than
> nfs speed, which is counter-intuitive to my understanding, so I wanted
> clarification. Now that I have my clarification (fuse outperformed nfs
> without sync), we can resume testing as described above and try to find
> what it is. Based on your email ID I am guessing you are in Boston and I
> am in Bangalore, so if you are okay with this debugging taking multiple
> days because of the timezones, I will be happy to help. Please be a bit
> patient with me; I am under a release crunch, but I am very curious about
> the problem you posted.
>
>> Was there anything useful in the profiles?
>
> Unfortunately the profiles didn't help me much. I think we are collecting
> them from an active volume, so they have a lot of information that does
> not pertain to dd, which makes it difficult to isolate dd's contribution.
> So I went through your post again and found something I didn't pay much
> attention to earlier, i.e. oflag=sync, did my own tests on my setup with
> FUSE, and sent that reply.
>
>> Pat
>>
>> On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
>>
>> Okay good. At least this validates my doubts. Handling O_SYNC in gluster
>> NFS and fuse is a bit different.
>> When an application opens a file with O_SYNC on a fuse mount, each write
>> syscall has to be written to disk as part of the syscall, whereas in the
>> case of NFS there is no concept of open. NFS performs the write through
>> a handle saying it needs to be a synchronous write, so the write()
>> syscall is performed first and then it performs fsync(). So a write on
>> an fd with O_SYNC becomes write+fsync. I am suspecting that when
>> multiple threads do this write+fsync() operation on the same file,
>> multiple writes are batched together to be written to disk, so the
>> throughput on the disk increases; at least that is my guess.
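If we want to check that suspicion directly, one way (just a sketch:
<brick-pid> is a placeholder for the glusterfsd PID of one brick, which
"gluster volume status data-volume" should list, and the trace file name
is only an example) would be to strace a brick process while one dd run
happens on the mount:

# Attach to the brick process for the duration of a single dd run (note
# that stracing will slow the brick down, so keep the run short):
strace -f -p <brick-pid> -e trace=open,openat,write,pwrite64,fsync \
    -o /tmp/brick-during-dd.strace

# Per the explanation above, the FUSE + oflag=sync run should show writes
# on an fd opened with O_SYNC, while the gNFS run should show each write
# followed by a separate fsync():
grep -c fsync /tmp/brick-during-dd.strace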
>> Does it answer your doubts?
>>
>> On Wed, May 10, 2017 at 9:35 PM, Pat Haley <[email protected]> wrote:
>>
>>> Without the oflag=sync and only a single test of each, the FUSE is
>>> going faster than NFS:
>>>
>>> FUSE:
>>> mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
>>> of=zeros.txt conv=sync
>>> 4096+0 records in
>>> 4096+0 records out
>>> 4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>>>
>>> NFS:
>>> mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt
>>> conv=sync
>>> 4096+0 records in
>>> 4096+0 records out
>>> 4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>>>
>>> On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>>>
>>> Could you let me know the speed without oflag=sync on both the mounts?
>>> No need to collect profiles.
>>>
>>> On Wed, May 10, 2017 at 9:17 PM, Pat Haley <[email protected]> wrote:
>>>
>>>> Here is what I see now:
>>>>
>>>> [root@mseas-data2 ~]# gluster volume info
>>>>
>>>> Volume Name: data-volume
>>>> Type: Distribute
>>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>> Status: Started
>>>> Number of Bricks: 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: mseas-data2:/mnt/brick1
>>>> Brick2: mseas-data2:/mnt/brick2
>>>> Options Reconfigured:
>>>> diagnostics.count-fop-hits: on
>>>> diagnostics.latency-measurement: on
>>>> nfs.exports-auth-enable: on
>>>> diagnostics.brick-sys-log-level: WARNING
>>>> performance.readdir-ahead: on
>>>> nfs.disable: on
>>>> nfs.export-volumes: off
>>>>
>>>> On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>>>>
>>>> Is this the volume info you have?
>>>>
>>>> > [root@mseas-data2 ~]# gluster volume info
>>>> >
>>>> > Volume Name: data-volume
>>>> > Type: Distribute
>>>> > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>> > Status: Started
>>>> > Number of Bricks: 2
>>>> > Transport-type: tcp
>>>> > Bricks:
>>>> > Brick1: mseas-data2:/mnt/brick1
>>>> > Brick2: mseas-data2:/mnt/brick2
>>>> > Options Reconfigured:
>>>> > performance.readdir-ahead: on
>>>> > nfs.disable: on
>>>> > nfs.export-volumes: off
>>>>
>>>> I copied this from an old thread from 2016. This is a distribute
>>>> volume. Did you change any of the options in between?
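One more thought on the profiles: the diagnostics.* options in your
current volume info show that profiling is already enabled, and the
earlier profiles were hard to read because of unrelated activity. One
thing we could try (a sketch only; the mount path and output file name
are placeholders, and this assumes the interval counters reset each time
"info" is read) is to bracket a single dd run with two "profile info"
calls and look mainly at the interval section of the second one:

# Read (and thereby reset) the interval counters just before the test:
gluster volume profile data-volume info > /dev/null

# Run one dd on the FUSE mount:
dd if=/dev/zero of=/path/to/fuse-mount/ddtest.tmp bs=1048576 count=4096 oflag=sync

# Capture the profile right after; the interval section should now mostly
# reflect the dd workload:
gluster volume profile data-volume info > /tmp/profile-after-dd.txt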
--
Pranith
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
