Right, so what I did is:

- on one node (gluster 3.7.3), run 'gluster volume profile shared start'
- on the client mount, run the test
- on the node, run 'gluster volume profile shared info' (and copy the output)
- finally, run 'gluster volume profile shared stop'
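In shell form, the sequence was roughly this (a sketch, using the volume name 'shared' as above):

    # on one gluster node: start collecting per-brick FOP statistics
    gluster volume profile shared start

    # on the client mount: run the workload under test
    # (e.g. the rm + svn checkout, or the full build)

    # back on the node: dump the accumulated stats, then stop profiling
    gluster volume profile shared info > shared-profile.txt
    gluster volume profile shared stop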
I repeated this for two different tests (a simple rm followed by an svn checkout, and a more complete build test), on an NFS mount and on a Fuse mount. To my surprise, the svn checkout is actually a lot faster (3x) on the Fuse mount than on NFS. However, the build test is a lot slower on the Fuse mount (+50%, which is a lot considering the compilation is CPU intensive, not just I/O!).

Ben, I will send you the profile outputs separately now...

On 29 Sep 2015 9:40 pm, "Ben Turner" <[email protected]> wrote:

> ----- Original Message -----
> > From: "Thibault Godouet" <[email protected]>
> > To: "Ben Turner" <[email protected]>
> > Cc: [email protected], [email protected]
> > Sent: Tuesday, September 29, 2015 1:36:20 PM
> > Subject: Re: [Gluster-users] Tuning for small files
> >
> > Ben,
> >
> > I suspect meta-data / 'ls -l' performance is very important for my svn
> > use-case.
> >
> > Having said that, what do you mean by small file performance? I thought
> > what people meant by this was really the overhead of meta-data, with an
> > 'ls -l' being a sort of extreme case (pure meta-data).
> > Obviously if you also have to read and write actual data (albeit not
> > much at all per file), then the effect of meta-data overhead would get
> > diluted to a degree, but potentially still be very present.
>
> Where you run into problems with smallfiles on gluster is the latency of
> sending data over the wire. For every smallfile create there are a bunch
> of different file operations we have to do on every file. For example, we
> have to do at least 1 lookup per brick to make sure that the file doesn't
> exist anywhere before we create it. We actually got it down to 1 per
> brick with lookup optimize on; it's 2 IIRC (maybe more?) with it
> disabled. So the time we spend waiting for those lookups to complete adds
> to latency, which lowers the number of files that can be created in a
> given period of time. Lookup optimize was implemented in 3.7 and, like I
> said, it's now at the optimal 1 lookup per brick on creates.
>
> The other problem with small files that we had in 3.6 is that we were
> using a single-threaded event listener (epoll is what we call it). This
> single thread would spike a CPU to 100% (called a hot thread) and
> glusterfs would become CPU bound. The solution here was to make the event
> listener multi-threaded so that we could spread the epoll load across
> CPUs, thereby eliminating the CPU bottleneck and allowing us to process
> more events in a given time. FYI epoll defaults to 2 threads in 3.7, but
> I have seen cases where I still bottlenecked on CPU with fewer than 4
> threads in my envs, so I usually do 4. This was implemented in upstream
> 3.7 but was backported to RHGS 3.0.4 if you have a RH based version.
>
> Fixing these two issues led to the performance gains I was talking about
> with smallfile creates. You are probably thinking from a distributed FS +
> metadata server (MDS) perspective, where the MDS is the bottleneck for
> smallfiles. Since gluster doesn't have an MDS, that load is transferred
> to the clients / servers, and this led to a CPU bottleneck when epoll was
> single threaded. I think this is the piece you may have been missing.
>
> > Would there be an easy way to tell how much time is spent on meta-data
> > vs. data in a profile output?
>
> Yep! Can you gather some profiling info and send it to me?
>
> > One thing I wonder: do your comments apply to both native Fuse and NFS
> > mounts?
> >
> > Finally, all this brings me back to my initial question really: are
> > there any tuning recommendations for my requirement (small file
> > reads/writes on a pair of nodes with replication) beyond the thread
> > counts and lookup optimize?
> > Or are those by far the most important in this scenario?
>
> For creating a bunch of small files those are the only two that I know of
> that will have a large impact; maybe some others from the list can give
> some input on anything else we can do here.
>
> -b
>
> > Thx,
> > Thibault.
> >
> > ----- Original Message -----
> > > From: [email protected]
> > > To: [email protected]
> > > Cc: [email protected]
> > > Sent: Monday, September 28, 2015 7:40:52 AM
> > > Subject: Re: [Gluster-users] Tuning for small files
> > >
> > > I'm also quite interested in small file performance optimization, but
> > > I'm a bit confused about the best option between 3.6/3.7.
> > >
> > > Ben Turner was saying that 3.6 might give the best performance:
> > > http://www.gluster.org/pipermail/gluster-users/2015-September/023733.html
> > >
> > > What kind of gain is expected (with consistent-metadata) if this
> > > regression is solved?
> >
> > Just to be clear, the issue I am talking about is metadata only (think
> > ls -l or file browsing). It doesn't affect small file perf (well, not
> > that much; I'm sure a little, but I have never quantified it). With
> > server and client event threads set to 4 + lookup optimize I see between
> > a 200-300% gain on my systems on 3.7 vs 3.6 builds. If I needed fast
> > metadata I would go with 3.6; if I needed fast smallfile I would go with
> > 3.7. If I needed both I would pick the lesser of the two evils, go with
> > that one, and upgrade when the fix is released.
> >
> > -b
> >
> > > I tried 3.6.5 (the last version for debian jessie), and it's a bit
> > > better than 3.7.4, but not by much (10-15%).
> > >
> > > I was also wondering if there are recommendations for the underlying
> > > file system of the bricks (xfs, ext4, tuning...).
> > >
> > > Regards
> > >
> > > Thomas HAMEL
> > >
> > > On 2015-09-28 12:04, André Bauer wrote:
> > > > If you're not already on Glusterfs 3.7.x I would recommend an
> > > > update first.
> > > >
> > > > On 25.09.2015 at 17:49, Thibault Godouet wrote:
> > > >> Hi,
> > > >>
> > > >> There are quite a few tuning parameters for Gluster (as seen in
> > > >> 'gluster volume get XYZ all'), but I didn't find much
> > > >> documentation on those. Some people do seem to set at least some
> > > >> of them, so the knowledge must be somewhere...
> > > >>
> > > >> Is there a good source of information to understand what they
> > > >> mean, and recommendations on how to set them to get good small
> > > >> file performance?
> > > >>
> > > >> Basically what I'm trying to optimize is for svn operations (e.g.
> > > >> svn checkout, or svn branch) on a replicated 2 x 1 volume (hosted
> > > >> on 2 VMs, 16GB RAM, 4 cores each, 10Gb/s network tested at full
> > > >> speed), using an NFS mount, which appears much faster than fuse in
> > > >> this case (but still much slower than when served by a normal NFS
> > > >> server).
> > > >> Any recommendation for such a setup?
> > > >>
> > > >> Thanks,
> > > >> Thibault.
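For completeness, the two tunables Ben describes above map to these volume options (a sketch, assuming gluster 3.7 option names and the volume name 'shared'):

    # one lookup per brick on creates instead of two or more (gluster >= 3.7)
    gluster volume set shared cluster.lookup-optimize on

    # spread the epoll event listener over 4 threads on both server and client
    # (defaults to 2 in 3.7; Ben suggests 4)
    gluster volume set shared server.event-threads 4
    gluster volume set shared client.event-threads 4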
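And a rough way to answer the earlier "meta-data vs. data" question from the profile output — a sketch that assumes the standard FOP table layout of 'gluster volume profile <vol> info' (%-latency in the first column, FOP name in the last), with illustrative rather than exhaustive FOP buckets:

    gluster volume profile shared info | awk '
        # bucket metadata-style FOPs vs. data FOPs by their %-latency share
        $NF ~ /^(LOOKUP|STAT|FSTAT|OPENDIR|READDIR|READDIRP|GETXATTR|SETATTR)$/ { meta += $1 }
        $NF ~ /^(READ|WRITE|FLUSH|FSYNC)$/                                      { data += $1 }
        END { printf "metadata: %.1f%%   data: %.1f%%\n", meta, data }'

Note this sums across all bricks (and across the cumulative and interval sections), so treat the result as a rough ratio rather than an exact split.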
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
