Hi Ian,

Though it may seem irrelevant, before answering your questions I'd like
to understand: what is your web content caching strategy?

With heavy-traffic web sites like the one you described, caching may
tremendously improve your performance at all tiers, and especially for
storage.

There are several caching options available: browser caching (Cache-Control
and Expires HTTP headers, etc.), memory caching (mod_memcached, mod_mem_cache),
and caching proxies (nginx, Varnish).

Caching static pages is relatively easy to implement at the web server layer
without code changes; caching dynamic resources may require code
modification.
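As a rough sketch of what that code modification might look like on the
dynamic side (a hypothetical helper, not tied to Django or any particular
framework; the max-age value is illustrative), the handler would attach
caching headers to its responses:

```python
from datetime import datetime, timedelta, timezone

def caching_headers(max_age=3600):
    """Build Cache-Control and Expires headers for a cacheable response.

    max_age (seconds) is illustrative -- tune it per resource type.
    """
    expires = datetime.now(timezone.utc) + timedelta(seconds=max_age)
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        # Expires must use the RFC 1123 date format HTTP requires
        "Expires": expires.strftime("%a, %d %b %Y %H:%M:%S GMT"),
    }
```

A dynamic view would merge these into its response headers; browsers and
intermediate proxies then serve repeat requests without touching storage.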

So, I'd strongly suggest exploring all your caching options and making a
decision on the storage update after that.


To answer your questions:

> 1. Should we consider running a VM on this same server and host e.g. the
web server on a VM which accesses files through the virtualization
layer, rather than a physical network interconnect.

I'd recommend keeping storage separate from the web servers; it will let
your web tier scale.
It is also more secure that way.

> 2. What combination of network filesystem and local file system
combination makes sense? (currently NFS + ext4 is on the cards)
> 3. Should we consider alternatives to GigE for interconnect.

It depends on several factors; one of the important ones is how the NFS
server on the storage side is implemented.
On the client side, NFS + ext4 over a GigE interface is usually sufficient.
There are several tools available that let you simulate network traffic and
measure performance, and you can also write your own script for basic
measurements.
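A minimal version of such a script (a rough sketch; dedicated tools like
iperf or bonnie++ will give you more careful numbers) might time a
sequential write to a directory on the NFS mount:

```python
import os
import time

def measure_write_throughput(path, size_mb=64, block_kb=64):
    """Sequentially write size_mb of data under `path` and return MB/s.

    Point `path` at a directory on the NFS mount to get a crude
    client-side number; results vary with mount options and caching.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = size_mb * 1024 // block_kb
    fname = os.path.join(path, "nfs_bench.tmp")
    start = time.time()
    with open(fname, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force data out of the page cache
    elapsed = time.time() - start
    os.unlink(fname)
    return size_mb / elapsed
```

Run it a few times and at different block sizes; a single run mostly tells
you about caches, not the disks.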

> 4. How can we estimate our IOPs and throughput requirements?

That's a tough one. Try collecting web server logs for several days and
calculate from them the total size of data downloaded per day.
From that you can derive an approximate average throughput per hour/second,
etc., which should give you some idea of your throughput requirements.
Also think about the expected increase in web site usage to project
throughput needs for the future.
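For illustration, a sketch of that log-based estimate (assuming the common
Apache/nginx "combined" access-log format, where the response size is the
field after the status code):

```python
import re

# Matches the combined log format up through the response-bytes field;
# that field is "-" when no body was sent.
LOG_RE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-)')

def daily_bytes(lines):
    """Sum response bytes over a day's worth of access-log lines."""
    total = 0
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group(1) != "-":
            total += int(m.group(1))
    return total

def avg_throughput_mbps(total_bytes, seconds=86400):
    """Average throughput in megabits/s over the window (a day by default)."""
    return total_bytes * 8 / seconds / 1e6
```

Averages hide bursts, so also check the busiest hour; peak demand is what
the storage actually has to serve.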

Regards,

- Eugene Gorelik

On Wed, Aug 11, 2010 at 1:55 PM, Ian Stokes-Rees <
[email protected]> wrote:

>
> Diligent readers will recall the thread a few weeks ago on slow disk
> performance with a PATA XRaid system from Apple (HFS, RAID5).  Having
> evaluated the situation, we're looking to get a new file server that
> combines some fast disk with some bulk storage.  We have a busy web
> server that is mostly occupied with serving static content (read only
> access), some dynamic content (Django portal with mod_python/httpd), and
> then scientific compute users who do lots of writes (including a 100
> core cluster).
>
> We have about a $10k budget (ideally $8k).  The current plan looks
> roughly like this:
>
> AMD quad socket MB
> 1x12-core AMD CPU
> 8 GB RAM
> 2x160 GB 7200 RPM SATA drives for system software
> 11x300 GB 15000 RPM SAS2 fast storage (RAID10 + 1 hot swap, 1.5 TB volume)
> 5x2 TB 7200 RPM SATA drives (RAID10 + 1 hot swap, 4 TB volume)
>
> A 3U chassis will be filled, and the 4U chassis will have some empty bays.
>
> We can also upgrade processors and RAM as funds become available and the
> need arises.
>
> This will support a compute cluster (~100 cores), 10-20 users (typically
> 3-4 active), and a busy web server.
>
> Besides the obvious question of whether this setup is sensible/cost
> efficient (mixing two kinds of storage, etc.), the main unknowns we have
> are:
>
> 1. Should we consider running a VM on this same server and host e.g. the
> web server on a VM which accesses files through the virtualization
> layer, rather than a physical network interconnect.
>
> 2. What combination of network filesystem and local file system
> combination makes sense? (currently NFS + ext4 is on the cards)
>
> 3. Should we consider alternatives to GigE for interconnect.
>
> 4. How can we estimate our IOPs and throughput requirements?
>
> 5. Perspectives on SLC SSDs vs. SAS2 w/ 15k drives, since we could
> probably transfer the 11x300 GB SAS2 drive budget to a collection of
> SSDs and live with the reduced storage if that was expected to have a
> big performance benefit.
>
> Thanks in advance for any opinions on this.
>
> Ian
>
>
> _______________________________________________
> bblisa mailing list
> [email protected]
> http://www.bblisa.org/mailman/listinfo/bblisa
>