Hi Ian,

Though it may seem irrelevant, before answering your questions I'd like to understand what your web content caching strategy is.
With heavy-traffic web sites like the one you described, caching can tremendously improve performance at all tiers, and especially for storage. There are several caching options available: browser caching (Cache-Control and Expires HTTP headers...), memory caching (mod_memcached, mod_mem_cache), and a caching proxy (nginx, varnish). Caching static pages is relatively easy to implement in the web server layer without code modification; caching dynamic resources may require code changes. So I'd strongly suggest exploring all your caching options first and deciding on the storage update after that.

To answer your questions:

> 1. Should we consider running a VM on this same server and host e.g. the
> web server on a VM which accesses files through the virtualization
> layer, rather than a physical network interconnect.

I'd recommend keeping storage separate from the web servers; it will let your web tier scale. It is also more secure that way.

> 2. What combination of network filesystem and local file system
> combination makes sense? (currently NFS + ext4 is on the cards)
>
> 3. Should we consider alternatives to GigE for interconnect.

It depends on several factors; one of the important ones is how the NFS server on the storage side is implemented. On the client side, NFS + ext4 over a GigE interface is usually sufficient. There are several tools available that simulate network traffic and measure performance, and you can also write your own script for basic measurements.

> 4. How can we estimate our IOPs and throughput requirements?

That's a tough one. Try collecting web server logs for several days and calculate from them the total size of downloaded data per day. From that you can calculate the approximate average throughput per hour, per second, etc. That should give you some idea of your throughput requirements. Think about the expected growth in web site usage to project potential throughput for the future.
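
The log calculation for question 4 could look roughly like the sketch below. It assumes the access log is in Apache common or combined format, where the field after the status code is the response size in bytes; the names bytes_per_day and average_rate are made up for illustration, and the log path is whatever your httpd is configured to write.

```python
import re
import sys
from collections import defaultdict

# Assumed format (Apache common/combined), e.g.
# 1.2.3.4 - - [11/Aug/2010:13:55:36 -0400] "GET /a.png HTTP/1.1" 200 2326
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\]\s+"[^"]*"\s+\d{3}\s+(?P<size>\d+|-)'
)

def bytes_per_day(lines):
    """Sum the response-size field of each log line, grouped by day."""
    totals = defaultdict(int)
    for line in lines:
        m = LOG_LINE.search(line)
        if m is None:
            continue  # skip lines that do not look like access-log entries
        size = m.group("size")
        if size != "-":  # "-" means no response body was sent
            totals[m.group("day")] += int(size)
    return dict(totals)

def average_rate(total_bytes, seconds=86400):
    """Average bytes per second over a window (default: one day)."""
    return total_bytes / seconds

# Usage: python script.py /path/to/access_log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        # Days are sorted as strings here, which is not chronological
        # across month boundaries.
        for day, total in sorted(bytes_per_day(f).items()):
            print(f"{day}: {total} bytes, {average_rate(total):.1f} B/s average")
```

Keep in mind that a daily average hides the peaks; grouping by hour instead of by day would give a better picture of the busiest periods. This only estimates throughput, not IOPs.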
Regards,
- Eugene Gorelik

On Wed, Aug 11, 2010 at 1:55 PM, Ian Stokes-Rees <[email protected]> wrote:

> Diligent readers will recall the thread a few weeks ago on slow disk
> performance with a PATA XRaid system from Apple (HFS, RAID5). Having
> evaluated the situation, we're looking to get a new file server that
> combines some fast disk with some bulk storage. We have a busy web
> server that is mostly occupied with serving static content (read only
> access), some dynamic content (Django portal with mod_python/httpd), and
> then scientific compute users who do lots of writes (including a 100
> core cluster).
>
> We have about a $10k budget (ideally $8k). The current plan looks
> roughly like this:
>
> AMD quad socket MB
> 1x12-core AMD CPU
> 8 GB RAM
> 2x160 GB 7200 RPM SATA drives for system software
> 11x300 GB 15000 RPM SAS2 fast storage (RAID10 + 1 hot swap, 1.5 TB volume)
> 5x2 TB 7200 RPM SATA drives (RAID10 + 1 hot swap, 4 TB volume)
>
> A 3U chassis will be filled, and the 4U chassis will have some empty bays.
>
> We can also upgrade processors and RAM as funds become available and the
> need arises.
>
> This will support a compute cluster (~100 cores), 10-20 users (typically
> 3-4 active), and a busy web server.
>
> Besides the obvious question of whether this setup is sensible/cost
> efficient (mixing two kinds of storage, etc.), the main unknowns we have
> are:
>
> 1. Should we consider running a VM on this same server and host e.g. the
> web server on a VM which accesses files through the virtualization
> layer, rather than a physical network interconnect.
>
> 2. What combination of network filesystem and local file system
> combination makes sense? (currently NFS + ext4 is on the cards)
>
> 3. Should we consider alternatives to GigE for interconnect.
>
> 4. How can we estimate our IOPs and throughput requirements?
>
> 5. Perspectives on SLC SSDs vs. SAS2 w/ 15k drives, since we could
> probably transfer the 11x300 GB SAS2 drive budget to a collection of
> SSDs and live with the reduced storage if that was expected to have a
> big performance benefit.
>
> Thanks in advance for any opinions on this.
>
> Ian
>
> _______________________________________________
> bblisa mailing list
> [email protected]
> http://www.bblisa.org/mailman/listinfo/bblisa
