What you are likely seeing is the OS holding dirty pages in the page cache
before writing them out. If you were untarring a file significantly larger
than the available memory on the server, the server would be forced to write
to disk and you would likely see performance fall more into line with the
results you get when you call sync.
Gluster is probably flushing data to disk more aggressively than the OS
would on its own, which may be intended to reduce data loss in server
failure scenarios. Someone on the Gluster team can probably comment on any
settings that exist for controlling Gluster's flushing behaviour.
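For what it's worth, the first knob I would look at is the client-side
write-behind translator. As a rough sketch only (I am assuming this
GlusterFS release accepts the flush-behind option and a larger window; the
Gluster folks can confirm), something like this lets flush/close return
before the background writes reach the bricks:

volume writebehind
type performance/write-behind
option cache-size 4MB # assumption: larger window than the 1MB in your vol file
option flush-behind on # assumption: return from flush/close before data hits disk
subvolumes readahead
end-volume

Whether that is safe for your workload is a separate question, since it
widens the window of data that can be lost on a crash.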
Steven Truelove
On 29/03/2010 5:09 PM, Jeremy Enos wrote:
I've already determined that && sync brings the values at least to
the same order (gluster is about 75% of direct disk there). I could
accept that for the benefit of having a parallel filesystem.
What I'm actually trying to achieve now is exactly what leaving out
the && sync yields in perceived performance, which translates to real
performance if the user can continue on to another task instead of
blocking because Gluster isn't utilizing cache. How, with Gluster,
can I achieve the same cache benefit that direct disk gets? Will a
user ever be able to untar a moderately sized (below physical memory)
file on to a Gluster filesystem as fast as to a single disk? (as I
did in my initial comparison) Is there something fundamentally
preventing that in Gluster's design, or am I misconfiguring it?
thx-
Jeremy
On 3/29/2010 2:00 PM, Bryan Whitehead wrote:
heh, don't forget the && sync
:)
On Mon, Mar 29, 2010 at 11:21 AM, Jeremy Enos <[email protected]>
wrote:
Got a chance to run your suggested test:
##############GLUSTER SINGLE DISK##############
[r...@ac33 gjenos]# dd bs=4096 count=32768 if=/dev/zero
of=./filename.test
32768+0 records in
32768+0 records out
134217728 bytes (134 MB) copied, 8.60486 s, 15.6 MB/s
[r...@ac33 gjenos]#
[r...@ac33 gjenos]# cd /export/jenos/
##############DIRECT SINGLE DISK##############
[r...@ac33 jenos]# dd bs=4096 count=32768 if=/dev/zero
of=./filename.test
32768+0 records in
32768+0 records out
134217728 bytes (134 MB) copied, 0.21915 s, 612 MB/s
[r...@ac33 jenos]#
For anything that can benefit from cache, Gluster's performance can't
compare. Is it even using cache?
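(For comparison, here is a variant of that dd that takes the page cache out
of the direct-disk number; the flags are standard GNU dd, and the paths are
just the ones from this test:

dd bs=4096 count=32768 if=/dev/zero of=./filename.test conv=fdatasync # flush before reporting the rate
dd bs=4096 count=32768 if=/dev/zero of=./filename.test oflag=direct # bypass the page cache entirely

With conv=fdatasync the 612 MB/s direct-disk figure should drop toward the
raw disk speed.)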
This is the client vol file I used for that test:
[r...@ac33 jenos]# cat /etc/glusterfs/ghome.vol
#-----------IB remotes------------------
volume ghome
type protocol/client
option transport-type tcp/client
option remote-host ac33
option remote-subvolume ibstripe
end-volume
#------------Performance Options-------------------
volume readahead
type performance/read-ahead
option page-count 4 # 2 is default option
option force-atime-update off # default is off
subvolumes ghome
end-volume
volume writebehind
type performance/write-behind
option cache-size 1MB
subvolumes readahead
end-volume
volume cache
type performance/io-cache
option cache-size 2GB
subvolumes writebehind
end-volume
Any suggestions appreciated. thx-
Jeremy
On 3/26/2010 6:09 PM, Bryan Whitehead wrote:
One more thought: it looks like (from your emails) you are always running
the gluster test first. Maybe the tar file is being read from disk when you
do the gluster test, then read from cache when you run against the disk
directly.
What if you just pull a chunk of 0's off /dev/zero?
dd bs=4096 count=32768 if=/dev/zero of=./filename.test
or stick the tar in a ramdisk?
(or run the benchmark 10 times for each, drop the best and the worse,
and average the remaining 8)
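A rough sketch of that repeat-and-average idea (the tarball path and run
count are placeholders, and it assumes GNU time at /usr/bin/time):

for i in $(seq 1 10); do
  rm -rf ./untar-test && mkdir ./untar-test
  /usr/bin/time -f %e -o run-$i.time sh -c 'tar xzf /path/to/test.tgz -C ./untar-test && sync'
done
# drop the best and worst runs, then average the remaining 8
cat run-*.time | sort -n | head -n 9 | tail -n 8 | awk '{s+=$1} END {print "avg:", s/NR}'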
I'd also be curious whether adding another node would halve the time, and
whether adding another two would halve it again. I guess that depends on
whether striping or just replication is being used.
(Unfortunately I don't have access to more than one test box right now.)
On Wed, Mar 24, 2010 at 11:06 PM, Jeremy
Enos <[email protected]> wrote:
For completeness:
##############GLUSTER SINGLE DISK NO PERFORMANCE
OPTIONS##############
[r...@ac33 gjenos]# time (tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
real 0m41.052s
user 0m7.705s
sys 0m3.122s
##############DIRECT SINGLE DISK##############
[r...@ac33 gjenos]# cd /export/jenos
[r...@ac33 jenos]# time (tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
real 0m22.093s
user 0m6.932s
sys 0m2.459s
[r...@ac33 jenos]#
The performance options don't appear to be the problem. So the question
stands: how do I get the disk cache advantage through the Gluster-mounted
filesystem? It seems to be key to the large performance difference.
Jeremy
On 3/24/2010 4:47 PM, Jeremy Enos wrote:
Good suggestion- I hadn't tried that yet. It brings them much
closer.
##############GLUSTER SINGLE DISK##############
[r...@ac33 gjenos]# time (tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
real 0m32.089s
user 0m6.516s
sys 0m3.177s
##############DIRECT SINGLE DISK##############
[r...@ac33 gjenos]# cd /export/jenos/
[r...@ac33 jenos]# time (tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
real 0m25.089s
user 0m6.850s
sys 0m2.058s
##############DIRECT SINGLE DISK CACHED##############
[r...@ac33 jenos]# time (tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz )
real 0m8.955s
user 0m6.785s
sys 0m1.848s
Oddly, I'm also seeing better performance on the gluster system than in
previous tests (it used to be ~39 s). The direct disk time is obviously
benefiting from cache. There is still a difference, but most of it
disappears with the cache advantage removed. That said, the relative
performance issue still exists with Gluster. What can be done to make it
benefit from cache the same way direct disk does?
thx-
Jeremy
P.S.
I'll be posting results w/ performance options completely removed
from
gluster as soon as I get a chance.
Jeremy
On 3/24/2010 4:23 PM, Bryan Whitehead wrote:
I'd like to see results with this:
time ( tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )
I've found local filesystems seem to use cache very heavily. The untarred
file could mostly be sitting in RAM with a local fs, versus going through
FUSE (which might do many more synced flushes to disk?).
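One way to level that out is to flush and drop the page cache before each
run (needs root, and assumes a Linux kernel that exposes drop_caches, i.e.
2.6.16 or later):

sync # write out any dirty pages first
echo 3 > /proc/sys/vm/drop_caches # drop pagecache, dentries and inodes
time ( tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz && sync )

That way neither the source tarball nor the target pages start out cached.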
On Wed, Mar 24, 2010 at 2:25 AM, Jeremy Enos <[email protected]>
wrote:
I also neglected to mention that the underlying filesystem is
ext3.
On 3/24/2010 3:44 AM, Jeremy Enos wrote:
I haven't tried it with all performance options disabled yet; I can try
that tomorrow when the resource frees up. I was actually asking first,
before blindly trying different configuration matrices, in case there's a
clear direction I should take with it. I'll let you know.
Jeremy
On 3/24/2010 2:54 AM, Stephan von Krawczynski wrote:
Hi Jeremy,
have you tried to reproduce this with all performance options disabled?
They are possibly not a good idea on a local system.
What local fs do you use?
--
Regards,
Stephan
On Tue, 23 Mar 2010 19:11:28 -0500
Jeremy Enos <[email protected]> wrote:
Stephan is correct; I primarily did this test to show a demonstrable
overhead example that I'm trying to eliminate. It's pronounced enough that
it can be seen in a single disk / single node configuration, which is good
in a way (so anyone can easily repro).
My distributed/clustered solution would be ideal if it were fast enough for
small block I/O as well as large block; I was hoping that single node
systems would achieve that, hence the single node test. Because the single
node test performed poorly, I eventually reduced down to a single disk to
see if the overhead could still be seen, and it clearly can be.
Perhaps it's something in my configuration? I've pasted my config files
below.
thx-
Jeremy
######################glusterfsd.vol######################
volume posix
type storage/posix
option directory /export
end-volume
volume locks
type features/locks
subvolumes posix
end-volume
volume disk
type performance/io-threads
option thread-count 4
subvolumes locks
end-volume
volume server-ib
type protocol/server
option transport-type ib-verbs/server
option auth.addr.disk.allow *
subvolumes disk
end-volume
volume server-tcp
type protocol/server
option transport-type tcp/server
option auth.addr.disk.allow *
subvolumes disk
end-volume
######################ghome.vol######################
#-----------IB remotes------------------
volume ghome
type protocol/client
option transport-type ib-verbs/client
# option transport-type tcp/client
option remote-host acfs
option remote-subvolume raid
end-volume
#------------Performance Options-------------------
volume readahead
type performance/read-ahead
option page-count 4 # 2 is default option
option force-atime-update off # default is off
subvolumes ghome
end-volume
volume writebehind
type performance/write-behind
option cache-size 1MB
subvolumes readahead
end-volume
volume cache
type performance/io-cache
option cache-size 1GB
subvolumes writebehind
end-volume
######################END######################
On 3/23/2010 6:02 AM, Stephan von Krawczynski wrote:
On Tue, 23 Mar 2010 02:59:35 -0600 (CST)
"Tejas N. Bhise"<[email protected]> wrote:
Out of curiosity, if you want to do stuff only on one machine, why do you
want to use a distributed, multi-node, clustered file system?
Because what he does is a very good way to show the overhead produced only
by glusterfs and nothing else (i.e. no network involved).
A pretty relevant test scenario, I would say.
--
Regards,
Stephan
Am I missing something here?
Regards,
Tejas.
----- Original Message -----
From: "Jeremy Enos"<[email protected]>
To: [email protected]
Sent: Tuesday, March 23, 2010 2:07:06 PM GMT +05:30 Chennai,
Kolkata,
Mumbai, New Delhi
Subject: [Gluster-users] gluster local vs local = gluster x4
slower
This test is pretty easy to replicate anywhere; it only takes one disk, one
machine, one tarball. Untarring directly to local disk is about 4.5x faster
than untarring through gluster. At first I thought this might be due to a
slow host (Opteron 2.4 GHz), but it's not: the same configuration on a much
faster machine (dual 3.33 GHz Xeon) yields the performance below.
####THIS TEST WAS TO A LOCAL DISK THRU GLUSTER####
[r...@ac33 jenos]# time tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
real 0m41.290s
user 0m14.246s
sys 0m2.957s
####THIS TEST WAS TO A LOCAL DISK (BYPASS GLUSTER)####
[r...@ac33 jenos]# cd /export/jenos/
[r...@ac33 jenos]# time tar xzf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
real 0m8.983s
user 0m6.857s
sys 0m1.844s
####THESE ARE TEST FILE DETAILS####
[r...@ac33 jenos]# tar tzvf
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz |wc -l
109
[r...@ac33 jenos]# ls -l
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
-rw-r--r-- 1 jenos ac 804385203 2010-02-07 06:32
/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
[r...@ac33 jenos]#
These are the relevant performance options I'm using in my .vol file:
#------------Performance Options-------------------
volume readahead
type performance/read-ahead
option page-count 4 # 2 is default option
option force-atime-update off # default is off
subvolumes ghome
end-volume
volume writebehind
type performance/write-behind
option cache-size 1MB
subvolumes readahead
end-volume
volume cache
type performance/io-cache
option cache-size 1GB
subvolumes writebehind
end-volume
What can I do to improve gluster's performance?
Jeremy
--
Steven Truelove
Array Systems Computing, Inc.
1120 Finch Avenue West, 7th Floor
Toronto, Ontario
M3J 3H7
CANADA
http://www.array.ca
[email protected]
Phone: (416) 736-0900 x307
Fax: (416) 736-4715
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users