I think it would make sense if you have a moderate number of files (< 10
million) and reasonable performance expectations.

Any time you are serving lots of little files pretty much at random, your
performance will be limited by the startup cost of each transfer: connection
setup, directory lookup, disk seek and so on.  You obviously should put a
memory-based LRU cache in front to absorb some of those costs, but the
limitations still apply to whatever misses the cache.  Eventually, not having
read-only slaves for the name server will hurt you as well.  Hence you have
to have realistic performance expectations.
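As a rough illustration of what I mean by a memory-based LRU cache in front
of the file store, here is a minimal sketch in Java.  It just keeps the bytes
of recently served files keyed by path; the class name, capacity, and the
idea of caching whole files as byte arrays are my assumptions for the
example, not anything Hadoop provides.

    import java.util.LinkedHashMap;
    import java.util.Map;

    /**
     * Minimal in-memory LRU cache sketch for fronting small-file reads.
     * Keys are file paths, values are the raw bytes of recently served
     * files.  Capacity and eviction policy here are illustrative only.
     */
    public class SmallFileLruCache extends LinkedHashMap<String, byte[]> {
        private final int maxEntries;

        public SmallFileLruCache(int maxEntries) {
            // accessOrder = true keeps iteration order least-recently-used
            // first, which is what makes removeEldestEntry evict correctly.
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
            // Evict the least recently used entry once the cache is full.
            return size() > maxEntries;
        }
    }

On a read you would check the cache first and only go to the filesystem
(paying the connection/seek costs) on a miss, then store the result back in
the cache.  The misses are the "residue" I am talking about above.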

That said, having lots of cheap spindles will still help enormously with
this kind of load and Hadoop can give you lots of those spindles in a usable
form.  If your traffic grows beyond the limits of Hadoop before those limits
are extended, you will be doing very well indeed.


On 4/19/08 9:33 PM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:

> Ian - re static files and HDFS - I believe you could, but it would only make
> sense if those files are large (e.g. streaming video)
