On 09/29/2011 01:44 PM, David Miller wrote:
On Thu, Sep 29, 2011 at 1:32 PM, David Miller <[email protected]> wrote:

    Couldn't you accomplish the same thing with flashcache?
    https://github.com/facebook/flashcache/


I should expand on that a little bit.  Flashcache is a kernel module
created by Facebook that uses the device mapper interface in Linux to
provide an SSD cache layer to any block device.

What I think would be interesting is using flashcache with a PCIe SSD as
the caching device.  That would add about $500-$600 to the cost of each
brick node but should be able to buffer the active IO from the spinning
media pretty well.

Erp ... low-end PCIe flash with decent performance starts much higher than $500-600 USD.

Something like this:
http://www.amazon.com/OCZ-Technology-Drive-240GB-Express/dp/B0058RECUE
or something from FusionIO if you want something that's aimed more at
the enterprise.

Flashcache is reasonably good, but there are many variables in using it, and it's designed for a different use case. For most people writeback mode may be reasonable, but other use cases would require different configurations.
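
As a rough sketch of what that setup looks like (device names, brick paths, and the choice of writeback mode below are placeholders, not a recommendation; check the options against the flashcache version you actually build):

    # create a writeback cache device named "cachedev": SSD in front of the spinning brick disk
    # (-p back = writeback; -p thru or -p around are the more conservative modes)
    flashcache_create -p back cachedev /dev/ssd_device /dev/brick_disk

    # the cached device appears under /dev/mapper; the brick file system goes on top of it
    mkfs.xfs /dev/mapper/cachedev
    mount /dev/mapper/cachedev /bricks/brick01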

This said, please understand that flashcache (and L2ARC, and other similar things) is *not* a silver bullet (i.e. not a magical thing that will instantly make something far better at no cost or effort). These tools do introduce additional complexity and additional tuning points.

The thing you cannot get rid of, the network traversal, accounts for much of the performance degradation on small files. Putting the file system on a RAM disk (if that were possible; tmpfs doesn't support the xattrs Gluster needs) wouldn't make the system much faster for small files. Eliminating the network traversal by doing local distributed caching of metadata on the client side ... could ... but that would be a huge new complication, and I'd argue it probably isn't worth it.
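
If you do want to experiment with different backing file systems for a brick, a quick sanity check of whether one supports the trusted.* extended attributes Gluster relies on looks roughly like this (paths are placeholders, run as root):

    touch /mnt/candidate_fs/xattr_test
    setfattr -n trusted.glusterfs.test -v works /mnt/candidate_fs/xattr_test
    getfattr -n trusted.glusterfs.test /mnt/candidate_fs/xattr_test
    # "Operation not supported" from setfattr means the file system
    # (e.g. tmpfs here) can't hold Gluster's metadata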

In the near term, small-file performance is going to be bad. You might be able to play some games to improve it (L2ARC etc. could help in some respects, but it won't be universally much better).
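
For the ZFS case, the L2ARC route is just a cache vdev added to the pool; something along these lines (pool and device names are placeholders):

    # add an SSD as a level-2 ARC (read cache) to an existing pool
    zpool add tank cache /dev/ssd_device

    # watch how the cache device fills and gets hit
    zpool iostat -v tank 5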

What matters most is a very good design on the storage backend (we are biased, given what we sell/support), very good networking, and a very good Gluster implementation and tuning. It's very easy to end up with very slow performance by missing critical elements. We field many inquiries that usually start out with "we built our own and the performance isn't that good." You won't get good performance out of the cluster file system if the underlying file system and storage design can't deliver it in the first place.
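
To give a flavor of the Gluster-side tuning, these are the sorts of volume options people end up looking at (the volume name and values are placeholder examples, not recommendations; measure before and after):

    gluster volume info myvol

    # client-side read cache and brick io thread count
    gluster volume set myvol performance.cache-size 256MB
    gluster volume set myvol performance.io-thread-count 16

    # aggregate small writes before they cross the network
    gluster volume set myvol performance.write-behind-window-size 1MB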

This said, please understand that there is a (significant) performance cost to all those nice features in ZFS, and there is a reason why it's not generally considered a high-performance file system. So if you start building with it, you shouldn't assume the whole will be faster than the sum of the parts. It might be worse.

This is a caution from someone who has tested and shipped many different file systems in the past, ZFS included, on Solaris and other platforms. There is a very significant performance penalty you pay for using some of these features, and you have to decide whether that penalty is worth it.
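
If you do build on ZFS, the trade-off shows up directly in the per-dataset properties; a minimal sketch (the pool/dataset name is a placeholder, and whether any of these are appropriate depends entirely on your data):

    # see what the dataset is currently set to
    zfs get atime,compression,checksum,primarycache tank/bricks

    # each of these buys speed by giving up a ZFS feature
    zfs set atime=off tank/bricks
    zfs set compression=off tank/bricks
    # caching only metadata in ARC frees RAM but hurts cached data reads
    zfs set primarycache=metadata tank/bricks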


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: [email protected]
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
