Hello Gluster Gurus,

I'm trying to find out what performance figures you have seen when running an eDiscovery search application against a namespace with over 3 billion small files on GlusterFS...
Thanks & Good w/e

Henry PAN
Sr. Data Storage Eng/Adm
Iron Mountain
650-962-6184 (o)
650-930-6544 (c)
[email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Saturday, January 15, 2011 1:20 AM
To: [email protected]
Subject: Gluster-very bad performance on small files

Send Gluster-users mailing list submissions to
    [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
    http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
or, via email, send a message with subject or body 'help' to
    [email protected]

You can reach the person managing the list at
    [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Gluster-users digest..."

Today's Topics:

   1. Re: very bad performance on small files (Marcus Bointon)
   2. Re: very bad performance on small files (Joe Landman)
   3. Re: very bad performance on small files (Max Ivanov)
   4. Re: very bad performance on small files (Joe Landman)
   5. Re: very bad performance on small files (Marcus Bointon)
   6. Re: very bad performance on small files (Joe Landman)
   7. Re: very bad performance on small files (Max Ivanov)
   8. Re: very bad performance on small files (Rudi Ahlers)

----------------------------------------------------------------------

Message: 1
Date: Fri, 14 Jan 2011 22:50:37 +0100
From: Marcus Bointon <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: Gluster General Discussion List <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

On 14 Jan 2011, at 18:58, Jacob Shucart wrote:

> This kind of thing is fine on local disks, but when you're talking about a
> distributed filesystem the network latency starts to add up since 1
> request to the web server results in a bunch of file requests.

I think the main objection is that it takes a huge amount of network latency to explain a >1,500% overhead with only 2 machines.

On 14 Jan 2011, at 15:20, Joe Landman wrote:

> MB size or larger

So does gluster become faster abruptly when file sizes cross some threshold? Or are average speeds proportional to file size? It would be good to see a wider spread of values on benchmarks of throughput vs file size for the same overall volume (like Max's data but with more intermediate values).

Marcus

------------------------------

Message: 2
Date: Fri, 14 Jan 2011 17:12:01 -0500
From: Joe Landman <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 01/14/2011 04:50 PM, Marcus Bointon wrote:
> On 14 Jan 2011, at 18:58, Jacob Shucart wrote:
>
>> This kind of thing is fine on local disks, but when you're talking
>> about a distributed filesystem the network latency starts to add up
>> since 1 request to the web server results in a bunch of file
>> requests.
>
> I think the main objection is that it takes a huge amount of network
> latency to explain a >1,500% overhead with only 2 machines.

If most of your file access times are dominated by latency (e.g. small, seeky-like loads), and you are going over a gigabit connection, yeah, your performance is going to crater on any cluster file system.

Local latency to traverse the storage stack is on the order of 10's of microseconds. Physical latency of the disk medium is on the order of 10's of microseconds for RAMdisk, 100's of microseconds for flash/ssd, and 1000's of microseconds (i.e. milliseconds) for spinning rust.

Now take 1 million small file writes. Say 1024 bytes. These million writes have to traverse the storage stack in the kernel to get to disk.

Now add in a network latency event on the order of 1000's of microseconds for the remote storage stack and network stack to respond.

I haven't measured it yet in a methodical manner, but I wouldn't be surprised to see IOP rates within a factor of 2 of the bare metal for a sufficiently fast network such as Infiniband, and within a factor of 4 or 5 for a slow network like Gigabit.

Our own experience has been generally that you are IOP constrained because of the stack you have to traverse. If you add more latency into this stack, you have more to traverse, and therefore you have to wait longer. This has a magnification effect on times for small IO ops which are seeky (stat, small writes, random ops).

> On 14 Jan 2011, at 15:20, Joe Landman wrote:
>
>> MB size or larger
>
> So does gluster become faster abruptly when file sizes cross some
> threshold? Or are average speeds proportional to file size? Would

It's a continuous curve, and very much user load specific. The fewer seeky operations you do, the better (true of all cluster file systems).

> be good to see a wider spread of values on benchmarks of throughput
> vs file size for the same overall volume (like Max's data but with
> more intermediate values)

I haven't seen Max's data, so I can't comment on this. Understand that performance is going to be bound by many things. One of them is the speed of the spinning disk, if that's what you use. Another will be network.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: [email protected]
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

------------------------------

Message: 3
Date: Fri, 14 Jan 2011 22:19:58 +0000
From: Max Ivanov <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8

> I haven't seen Max's data, so I can't comment on this. Understand that
> performance is going to be bound by many things. One of many things is the
> speed of the spinning disk if that's what you use. Another will be network.

It is very similar to a kernel source tree: tons of small (2-20 KB) files, 1.1G in total.

------------------------------

Message: 4
Date: Fri, 14 Jan 2011 17:20:58 -0500
From: Joe Landman <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 01/14/2011 05:19 PM, Max Ivanov wrote:
>> I haven't seen Max's data, so I can't comment on this. Understand that
>> performance is going to be bound by many things. One of many things is the
>> speed of the spinning disk if that's what you use. Another will be network.
>
> It is very similar to a kernel source tree: tons of small (2-20 KB)
> files, 1.1G in total.

Ok, worth looking into.

------------------------------

Message: 5
Date: Sat, 15 Jan 2011 00:26:53 +0100
From: Marcus Bointon <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: Gluster General Discussion List <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

On 14 Jan 2011, at 23:12, Joe Landman wrote:

> If most of your file access times are dominated by latency (e.g. small,
> seeky-like loads), and you are going over a gigabit connection, yeah, your
> performance is going to crater on any cluster file system.
>
> Local latency to traverse the storage stack is on the order of 10's of
> microseconds. Physical latency of the disk medium is on the order of 10's of
> microseconds for RAMdisk, 100's of microseconds for flash/ssd, and 1000's of
> microseconds (i.e. milliseconds) for spinning rust.
>
> Now take 1 million small file writes. Say 1024 bytes. These million writes
> have to traverse the storage stack in the kernel to get to disk.
>
> Now add in a network latency event on the order of 1000's of microseconds for
> the remote storage stack and network stack to respond.
>
> I haven't measured it yet in a methodical manner, but I wouldn't be surprised
> to see IOP rates within a factor of 2 of the bare metal for a sufficiently
> fast network such as Infiniband, and within a factor of 4 or 5 for a slow
> network like Gigabit.
>
> Our own experience has been generally that you are IOP constrained because of
> the stack you have to traverse. If you add more latency into this stack, you
> have more to traverse, and therefore, you have more you need to wait. Which
> will have a magnification effect upon times for small IO ops which are seeky
> (stat, small writes, random ops).
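
As a rough illustration of the latency argument quoted above: if each small write is issued one at a time and pays the full per-operation latency, elapsed time is simply the operation count times the latency. The sketch below assumes that serial model and reuses the order-of-magnitude latencies from the quote (10, 100 and 1000 microseconds). The function names and scenario labels are made up for illustration, and none of the figures are measurements.

```python
# Back-of-envelope model: operations are issued strictly one after another,
# so elapsed time = number of operations * latency per operation.
# Latencies are assumed orders of magnitude, not measured values.


def serial_time_seconds(num_ops: int, per_op_latency_us: float) -> float:
    """Elapsed seconds when every operation waits for the previous one."""
    return num_ops * per_op_latency_us / 1_000_000


def throughput_mb_per_s(num_ops: int, op_size_bytes: int,
                        per_op_latency_us: float) -> float:
    """Effective throughput in MB/s (1 MB = 10**6 bytes) under the same model."""
    seconds = serial_time_seconds(num_ops, per_op_latency_us)
    return num_ops * op_size_bytes / seconds / 1_000_000


NUM_WRITES = 1_000_000   # "1 million small file writes"
WRITE_SIZE = 1024        # "Say 1024 bytes"

# Hypothetical scenarios keyed by per-operation latency in microseconds.
SCENARIOS_US = {
    "storage stack only (~10 us)": 10,
    "flash/ssd (~100 us)": 100,
    "spinning disk or network round trip (~1000 us)": 1000,
}

for name, latency_us in SCENARIOS_US.items():
    seconds = serial_time_seconds(NUM_WRITES, latency_us)
    mbps = throughput_mb_per_s(NUM_WRITES, WRITE_SIZE, latency_us)
    print(f"{name}: {seconds:.0f} s, {mbps:.2f} MB/s")
```

Under these assumptions, 1 KB writes at 1 ms per operation cannot exceed roughly 1 MB/s however fast the link is. Real systems overlap and batch requests, so this is a pessimistic serial estimate rather than a prediction.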

Sure, and all that applies equally to both NFS and gluster, yet in Max's example NFS was ~50x faster than gluster for an identical small-file workload. So what's gluster doing over and above what NFS is doing that's taking so long, given that network and disk factors are equal? I'd buy a factor of 2 for replication, but not 50.

In case you missed what I'm on about, it was these stats that Max posted:

> Here are the results per command:
>
> dd if=/dev/zero of=M/tmp bs=1M count=16384
>     69.2 MB/sec (Native)   69.2 MB/sec (FUSE)   52 MB/sec (NFS)
> dd if=/dev/zero of=M/tmp bs=1K count=163840000
>     88.1 MB/sec (Native)   1.1 MB/sec (FUSE)    52.4 MB/sec (NFS)
> time tar cf - M | pv > /dev/null
>     15.8 MB/sec (Native)   3.48 MB/sec (FUSE)   254 Kb/sec (NFS)

In my case I'm running 30kiops SSDs over gigabit. At the moment my problem (running 3.0.6) isn't performance but reliability - files are occasionally reported as 'vanished' by front-end apps (like rsync) even though they are present on both backing stores; no errors in gluster logs, self-heal doesn't help.

Marcus

------------------------------

Message: 6
Date: Fri, 14 Jan 2011 18:51:39 -0500
From: Joe Landman <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 01/14/2011 06:26 PM, Marcus Bointon wrote:
>> Our own experience has been generally that you are IOP constrained
>> because of the stack you have to traverse. If you add more latency
>> into this stack, you have more to traverse, and therefore, you have
>> more you need to wait. Which will have a magnification effect upon
>> times for small IO ops which are seeky (stat, small writes, random
>> ops).
>
> Sure, and all that applies equally to both NFS and gluster, yet in
> Max's example NFS was ~50x faster than gluster for an identical
> small-file workload.
> So what's gluster doing over and above what NFS is doing that's
> taking so long, given that network and disk factors are equal?
> I'd buy a factor of 2 for replication, but not 50.

If the NFS was doing attribute caching and the GlusterFS implementation had stat prefetch and other caching turned off, this could explain it.

> In case you missed what I'm on about, it was these stats that Max
> posted:
>
>> Here are the results per command:
>>
>> dd if=/dev/zero of=M/tmp bs=1M count=16384
>>     69.2 MB/sec (Native)   69.2 MB/sec (FUSE)   52 MB/sec (NFS)
>> dd if=/dev/zero of=M/tmp bs=1K count=163840000
>>     88.1 MB/sec (Native)   1.1 MB/sec (FUSE)    52.4 MB/sec (NFS)
>> time tar cf - M | pv > /dev/null
>>     15.8 MB/sec (Native)   3.48 MB/sec (FUSE)   254 Kb/sec (NFS)

Ok, I am not sure if I saw the numbers before. Thanks.

> In my case I'm running 30kiops SSDs over gigabit. At the moment my
> problem (running 3.0.6) isn't performance but reliability - files are
> occasionally reported as 'vanished' by front-end apps (like rsync)
> even though they are present on both backing stores; no errors in
> gluster logs, self-heal doesn't help.

Check your stat-prefetch settings, and your time base. We've had some strange issues that seem to be correlated with time bases drifting, including files disappearing. We have a few open tickets on this.

The way we've worked around this problem is to abandon the NFS client and use the glusterfs client. Not our preferred option, but it provides a workaround for the moment.

The NFS translator does appear to have a few issues. I am hoping we get more tuning knobs for it soon so we can see if we can work around this.

Regards,

Joe

------------------------------

Message: 7
Date: Sat, 15 Jan 2011 00:30:15 +0000
From: Max Ivanov <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: Marcus Bointon <[email protected]>
Cc: Gluster General Discussion List <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8

> Sure, and all that applies equally to both NFS and gluster, yet in Max's
> example NFS was ~50x faster than gluster for an identical small-file
> workload. So what's gluster doing over and above what NFS is doing that's
> taking so long, given that network and disk factors are equal? I'd buy a
> factor of 2 for replication, but not 50.

Sorry if I didn't make it clear, but the NFS in my tests is not the well-known classic NFS but glusterfs in NFS mode.

------------------------------

Message: 8
Date: Sat, 15 Jan 2011 11:18:22 +0200
From: Rudi Ahlers <[email protected]>
Subject: Re: [Gluster-users] very bad performance on small files
To: Jacob Shucart <[email protected]>
Cc: [email protected]
Message-ID: <sig.3996530d0f.AANLkTinY=zubjghto470ygtwhd_vzbb6fpj4-we+m...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Fri, Jan 14, 2011 at 7:58 PM, Jacob Shucart <[email protected]> wrote:
> For web hosting it is best to put user generated content (images, etc) on
> Gluster but to leave application files like PHP files on the local disk.
> This is because a single application file request could result in 20 other
> file requests since applications like PHP use includes/inherits, etc.
> This kind of thing is fine on local disks, but when you're talking about a
> distributed filesystem the network latency starts to add up since 1
> request to the web server results in a bunch of file requests.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Max Ivanov
> Sent: Friday, January 14, 2011 6:09 AM
> To: Burnash, James
> Cc: [email protected]
> Subject: Re: [Gluster-users] very bad performance on small files
>
>> Gluster - and in fact most (all?) parallel filesystems are optimized for
>> very large files. That being the case, small files are not retrieved as
>> efficiently, and result in a larger number of file operations in total
>> because there are a fixed number for each file accessed.
>
> Which makes glusterfs performance unacceptable for web hosting purposes =(

So what can one use for web hosting purposes?

We use XEN / KVM virtual machines hosted on NAS devices, but the NAS devices don't have an easy upgrade path. We literally have to rsync all the data to the new device, then shut down all the machines on the old one and restart them on the new one. They don't provide 100% uptime either.

So I'm looking for something with an easier upgrade path (GlusterFS can do this) and better uptime (again, GlusterFS can do this). But it's clear that GlusterFS isn't made for small files, so what else could work well for us?

--
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532

------------------------------

End of Gluster-users Digest, Vol 33, Issue 23
*********************************************
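
The thread asks for throughput figures at more file sizes for the same overall data volume. One possible way to collect them is sketched below: it writes a fixed number of bytes as files of each size into a scratch directory and reports MB/s. Everything here is an assumption for illustration (the function names, the size list, the 1 MiB total, and the fsync after every file). To compare backends, point base_dir at a GlusterFS mount and then at a local disk.

```python
import os
import tempfile
import time


def write_files(target_dir: str, file_size: int, total_bytes: int) -> float:
    """Write about total_bytes as files of file_size bytes; return MB/s."""
    count = max(1, total_bytes // file_size)
    payload = b"\0" * file_size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(target_dir, f"f{i:08d}")
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # force the data out so caching is not all we time
    # Guard against a zero interval on very fast or coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return count * file_size / elapsed / 1_000_000


def sweep(base_dir: str, sizes: list, total_bytes: int) -> dict:
    """Run write_files once per file size, each in a fresh scratch directory."""
    results = {}
    for size in sizes:
        # The scratch directory and its files are removed after each run.
        with tempfile.TemporaryDirectory(dir=base_dir) as scratch:
            results[size] = write_files(scratch, size, total_bytes)
    return results


if __name__ == "__main__":
    # Replace with a directory on the filesystem under test (hypothetical choice).
    base_dir = tempfile.gettempdir()
    sizes = [1024, 4096, 16384, 65536, 262144, 1048576]
    for size, mbps in sweep(base_dir, sizes, 1024 * 1024).items():
        print(f"{size:>8} bytes/file: {mbps:8.2f} MB/s")
```

A larger total volume and several repetitions per size would give steadier numbers; the small default only keeps the sketch quick to run.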
