On 05/10/17 14:18, Pat Haley wrote:
Hi Pranith,
Since we are mounting the partitions as the bricks, I tried the dd
test writing to
<brick-path>/.glusterfs/<file-to-be-removed-after-test>. The results
without oflag=sync were 1.6 GB/s (faster than gluster, but not as fast
as I was expecting given the 1.2 GB/s to the no-gluster area w/ fewer
disks).
Pat
Is that true for every disk? If you're choosing the same filename every
time for your dd test, you're likely only doing that test against one
disk. If that disk is slow, you would get the same results every time
despite other disks performing normally.
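One way to check each disk separately is to time a scratch write in every brick directory rather than reusing one filename. The following is only a rough sketch: the helper names `write_throughput` and `compare_directories` and the scratch filename are made up for illustration, and the fsync at the end means the numbers are closer to dd with conv=fsync than to a plain dd run.

```python
import os
import time


def write_throughput(directory, total_bytes=64 * 1024 * 1024,
                     block_size=1024 * 1024, name="dd-test-remove-me.tmp"):
    """Write zeros to a scratch file in `directory`, fsync it, return MB/s."""
    path = os.path.join(directory, name)  # hypothetical scratch filename
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            written += os.write(fd, block)
        os.fsync(fd)  # include the flush to disk in the timing
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)  # never leave the scratch file inside a brick
    return written / elapsed / 1e6


def compare_directories(directories):
    """Return {directory: MB/s}, testing each directory on its own."""
    return {d: write_throughput(d) for d in directories}
```

With the brick paths from the volume info, this could be called as `compare_directories(["/mnt/brick1/.glusterfs", "/mnt/brick2/.glusterfs"])`; a large gap between the two numbers would point at one slow brick.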
On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
On Wed, May 10, 2017 at 10:15 PM, Pat Haley <[email protected]> wrote:
Hi Pranith,
Not entirely sure (this isn't my area of expertise). I'll run
your answer by some other people who are more familiar with this.
I am also uncertain about how to interpret the results when we
also add the dd tests writing to the /home area (no gluster,
still on the same machine):
* dd test without oflag=sync (rough average of multiple tests)
    o gluster w/ fuse mount: 570 MB/s
    o gluster w/ nfs mount: 390 MB/s
    o nfs (no gluster): 1.2 GB/s
* dd test with oflag=sync (rough average of multiple tests)
    o gluster w/ fuse mount: 5 MB/s
    o gluster w/ nfs mount: 200 MB/s
    o nfs (no gluster): 20 MB/s
Given that the non-gluster area is a RAID-6 of 4 disks while each
brick of the gluster area is a RAID-6 of 32 disks, I would
naively expect the writes to the gluster area to be roughly 8x
faster than to the non-gluster.
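To put numbers on that naive expectation, here is a small back-of-the-envelope sketch. It assumes ideal streaming writes that scale linearly with disk count, which real arrays rarely achieve, and it reads the dd figures above as MB/s and GB/s (the units dd itself reports). RAID-6 spends two disks' worth of capacity on parity, so counting data disks only makes the expected gap larger than 8x. The function name `raid6_data_disks` is made up for illustration.

```python
def raid6_data_disks(total_disks):
    """RAID-6 keeps two disks' worth of parity; the rest hold data."""
    if total_disks < 4:
        raise ValueError("RAID-6 needs at least 4 disks")
    return total_disks - 2


home_disks, brick_disks = 4, 32

# Ratio by raw disk count (the 8x quoted above).
raw_ratio = brick_disks / home_disks

# Ratio by data disks only: 30 vs 2.
data_ratio = raid6_data_disks(brick_disks) / raid6_data_disks(home_disks)

# Measured without oflag=sync: gluster fuse 570 MB/s vs /home 1.2 GB/s,
# taking 1.2 GB/s as 1200 MB/s (an assumption about the units).
measured_ratio = 570 / 1200

print(raw_ratio, data_ratio, round(measured_ratio, 3))
```

Either way of counting, the gluster area comes out at roughly half the /home speed rather than several times faster.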
I think a better test is to write a file using nfs without any
gluster to a location that is not inside the brick but is some other
location on the same disk(s). If you are mounting the partition
as the brick, then we can write to a file inside the .glusterfs
directory, something like
<brick-path>/.glusterfs/<file-to-be-removed-after-test>.
I still think we have a speed issue; I can't tell if fuse vs nfs
is part of the problem.
I got interested in the post because I read that fuse was slower
than nfs, which is counter-intuitive to my understanding, so I
wanted clarification. Now that I have it (fuse outperformed nfs
without sync), we can resume testing as described above and try to
find what the problem is. Based on your email-id I am guessing you
are in Boston, and I am in Bangalore, so if you are okay with this
debugging taking multiple days because of timezones, I will be happy
to help. Please be a bit patient with me; I am under a release
crunch, but I am very curious about the problem you posted.
Was there anything useful in the profiles?
Unfortunately the profiles didn't help me much. I think we are
collecting the profiles from an active volume, so they contain a lot
of information that does not pertain to dd, and it is difficult to
isolate dd's contribution. So I went through your post again and
found something I hadn't paid much attention to earlier, i.e.
oflag=sync, so I ran my own tests on my setup with FUSE and sent
that reply.
Pat
On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
Okay good. At least this validates my doubts. Handling of O_SYNC in
gluster NFS and fuse is a bit different.
When an application opens a file with O_SYNC on a fuse mount, each
write syscall has to be written to disk as part of the syscall,
whereas in the case of NFS there is no concept of open. NFS performs
the write through a handle, saying it needs to be a synchronous
write, so the write() syscall is performed first and then fsync().
So a write on an fd with O_SYNC becomes write+fsync. My guess is
that when multiple threads do this write+fsync() operation on the
same file, multiple writes are batched together to be written to
disk, so the throughput on the disk increases.
Does it answer your doubts?
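For readers less familiar with the two patterns, here is a minimal sketch of what they look like from an application's point of view on a local file. It is not gluster's implementation; the function names are hypothetical and the code only shows the syscall sequences described above.

```python
import os


def write_with_o_sync(path, blocks):
    """O_SYNC pattern: every write() returns only once the data is on disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    try:
        for block in blocks:
            os.write(fd, block)
    finally:
        os.close(fd)


def write_then_fsync(path, blocks):
    """write+fsync pattern: a plain write() followed by an explicit fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for block in blocks:
            os.write(fd, block)
            os.fsync(fd)  # flush this file's dirty data to disk
    finally:
        os.close(fd)
```

Both end with the same bytes on disk. The difference is where the wait happens, and in the second pattern several pending writes can be flushed by a single fsync(), which would fit the batching guess above.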
On Wed, May 10, 2017 at 9:35 PM, Pat Haley <[email protected]> wrote:
Without the oflag=sync and only a single test of each, the
FUSE is going faster than NFS:
FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
NFS:
mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
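As a sanity check on the two runs above, the rates can be recomputed from the byte counts and elapsed times that dd printed. This is only a sketch; the helper `dd_rate_mb_per_s` is hypothetical and assumes the GNU dd summary format shown in the output. Note also that in GNU dd, conv=sync only pads short input blocks with NULs and does not force synchronous writes (that is oflag=sync or conv=fsync), so these really are unsynchronized runs.

```python
import re

# Matches e.g. "4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s"
DD_SUMMARY = re.compile(r"(\d+) bytes .* copied, ([0-9.]+) s")


def dd_rate_mb_per_s(summary_line):
    """Return bytes/second divided by 1e6 from a GNU dd summary line."""
    match = DD_SUMMARY.search(summary_line)
    if match is None:
        raise ValueError("not a dd summary line: %r" % summary_line)
    nbytes = int(match.group(1))
    seconds = float(match.group(2))
    return nbytes / seconds / 1e6


fuse = dd_rate_mb_per_s("4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s")
nfs = dd_rate_mb_per_s("4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s")
print(round(fuse), round(nfs), round(fuse / nfs, 2))
```

The recomputed values agree with what dd reported, with FUSE about 1.5 times faster than NFS in this single pair of runs.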
On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
Could you let me know the speed without oflag=sync on both
the mounts? No need to collect profiles.
On Wed, May 10, 2017 at 9:17 PM, Pat Haley <[email protected]> wrote:
Here is what I see now:
[root@mseas-data2 ~]# gluster volume info
Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off
On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
Is this the volume info you have?
> [root@mseas-data2 ~]# gluster volume info
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off
I copied this from an old thread from 2016. This is a
distribute volume. Did you change any of the options
in between?
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email: [email protected]
Center for Ocean Engineering       Phone: (617) 253-6824
Dept. of Mechanical Engineering    Fax:   (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301
--
Pranith
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users