Hi Brian! Thanks for the reply; comments below.
Brian J. Murrell wrote:
Instead of just adding another 1TB server, I need to plan for a more
scalable solution. Immediately Lustre came to mind, but I'm wondering
about the performance. Basically our company does niche web-hosting for
"Creative Professionals" so we need fast access to the data in order to
have snappy web services for our clients. Typically these are smaller
files (2MB pictures, 50MB videos, .swf files, etc.).
Well, I'm not sure those files would fall within our general
classification of "small files" (wherein we know we don't perform very
well). Our small-file issues are usually characterized by "kernel
builds" and ~ use, where files are usually much smaller than 1MB.
Aha, OK, well then that's good to know. There's also some kind of
read-ahead and client-side caching, right? So files which are accessed a
lot will be faster to access.
Also I'm wondering about the best way to set this up in terms of speed
and ease of growth. I want the web-servers and the storage pool to be
independent of each other, so I can add web-servers as the web traffic
increases, and add more storage as our storage needs grow.
Well, your web-servers would be Lustre clients. There is no required
relationship between the number of clients and the number of servers
being used. You use as many servers as your client load
demands. So you could imagine both ends of the spectrum: one where
relatively few clients tax quite a few servers, or the
opposite, where a lot of clients with modest demand require only a few
servers.
I was thinking initially we could start with 2 servers, both attached
to the storage array, set up as OSSes and functioning as (load balanced)
web-servers as well.
Sounds like you are describing 2 storage servers, which would require at
least 3 servers total. Don't forget about the MDS. Also don't forget
about HA if that's a concern for you. You could make the 2 OSSes
failover partners for each other if you are willing to accept possibly
lower performance when one of the OSSes fails.
If HA is important to you however, you need to address an MDS failover
with a second server to pick up the MDT should the active MDS fail.
HA is definitely critical; if the storage pool becomes inaccessible we
lose clients (and all fingers point at me!). However, I need to find a
reasonable balance between cost / scalability / performance. The idea
would be to start small, with the simplest configuration, but allow for
a lot of growth. In a year's time, if we are using 5TB of data, we will
be in a very good position financially and can afford a systems expansion.
So for starters, what can I get away with here? 1 OSS, 1 MDS & 1 Client
node? Is it a smart thing to do to have the MDS and OSS share the same
storage target (just a separate partition for the MDS)? What kind of
system specs are advisable for each type (MDS, OSS & Client node) as far
as RAM, CPU, disk configuration etc? Also, is it possible to add more
OSSes to take over existing OSTs that another OSS was previously
managing? I.e., if I have the MD3000i split into 5x1TB volumes (5 OSTs),
and the OSS is getting hammered, I set another OSS up and hand off 2 or
3 OSTs from the old OSS to the new one, and set it up as failover for
the remaining OSTs. Do-able?
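To make the hand-off question concrete, here is a minimal Python sketch of the layout change being asked about. It only models which OSS would serve which OST before and after. It is not Lustre configuration and does not say whether Lustre supports the move. The names `oss1`, `oss2` and `ost0`..`ost4` are hypothetical placeholders.

```python
# Toy model of the proposed OST hand-off; all names are hypothetical.

def hand_off(layout, src, dst, count):
    """Return a new layout with the last `count` OSTs moved from src to dst."""
    if not 0 < count <= len(layout[src]):
        raise ValueError("count must be between 1 and the number of OSTs on src")
    # Copy so the original layout is left untouched.
    new_layout = {oss: list(osts) for oss, osts in layout.items()}
    moved = new_layout[src][-count:]
    new_layout[src] = new_layout[src][:-count]
    new_layout.setdefault(dst, []).extend(moved)
    return new_layout


def failover_partners(layout, a, b):
    """Map each OST to the OSS that would take it over if its primary failed."""
    partners = {ost: b for ost in layout[a]}
    partners.update({ost: a for ost in layout[b]})
    return partners


if __name__ == "__main__":
    before = {"oss1": ["ost0", "ost1", "ost2", "ost3", "ost4"]}
    after = hand_off(before, "oss1", "oss2", 2)
    print("primary:", after)
    print("failover:", failover_partners(after, "oss1", "oss2"))
```

In this model the old OSS keeps three OSTs, the new one takes two, and each server is listed as the failover partner for the other's OSTs.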
As for OSSes being web-servers, that would require the OSS/Webservers
also be clients and that is an unsupported configuration due to the risk
of deadlock due to memory pressure. The recommended architecture would
be to make the webservers Lustre clients.
I see, so from the get-go I'm going to need an internal gigE network for
OSS/Client communication.
performance can I expect, am I out of touch to expect something similar
to a directly attached RAID array?
I think our generally talked about numbers are something on the order of
achieving 80% of the raw storage bandwidth (assuming a capable network
and so on). Maybe somebody who is closer to the benchmarking that we
are constantly doing can comment further on how close-to-raw-disk we are
achieving lately.
Is it safe to say my bottleneck is going to be the OSS & not the
network? Is there some documentation I can read about typical setups,
use cases & methods for optimal performance?
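A rough back-of-the-envelope sketch of the bottleneck question, in Python. It assumes the "80% of raw storage bandwidth" figure quoted above and a single gigE link at its theoretical 125 MB/s. The raw array speeds in the loop are made-up illustrative values, not measurements of any real hardware, and the model ignores protocol overhead, caching and small-file effects.

```python
# Rough estimate only; every number below is an assumption for illustration.

GIGE_MB_PER_S = 1000 / 8  # 1 Gbit/s = 125 MB/s, before any protocol overhead


def estimate_throughput(raw_storage_mb_s,
                        network_mb_s=GIGE_MB_PER_S,
                        lustre_efficiency=0.8):
    """Return (achievable MB/s, limiting component) for one client link."""
    storage_limit = raw_storage_mb_s * lustre_efficiency
    achievable = min(storage_limit, network_mb_s)
    bottleneck = "network" if network_mb_s < storage_limit else "storage/OSS"
    return achievable, bottleneck


if __name__ == "__main__":
    # Hypothetical raw array bandwidths in MB/s.
    for raw in (100, 200, 400):
        mb_s, limit = estimate_throughput(raw)
        print(f"raw {raw} MB/s: about {mb_s:.0f} MB/s, limited by {limit}")
```

Under these assumptions, a single gigE link becomes the limit once the array can sustain more than roughly 156 MB/s raw (125 / 0.8). Below that, the storage side is the limit.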
Thanks!
-Nick
_______________________________________________
Lustre-discuss mailing list
http://lists.lustre.org/mailman/listinfo/lustre-discuss