I think it would make sense if you have a moderate number of files (< 10 million) and reasonable performance expectations.
Any time you are serving lots of little files pretty much at random, you will have limited performance due to per-transfer startup costs: connection setup, directory search, disk seek and so on. You obviously should put a memory-based LRU cache in front to absorb some of those costs (a rough sketch follows below the quoted message), but the limitations still apply to whatever misses the cache. Eventually, the lack of read-only replicas of the name server will hurt you as well. Hence you have to have realistic performance expectations.

That said, having lots of cheap spindles will still help enormously with this kind of load, and Hadoop can give you lots of those spindles in a usable form. If your traffic grows beyond the limits of Hadoop before those limits are extended, you will be doing very well indeed.

On 4/19/08 9:33 PM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:

> Ian - re static files and HDFS - I believe you could, but it would only make
> sense if those files are large (e.g. streaming video)
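
To illustrate the caching suggestion above: a minimal sketch of such an in-memory LRU cache in Java, assuming the files are small enough to keep whole contents in memory and that a loader function (here a hypothetical java.util.function.Function that would read the file out of HDFS) fills misses. It is just a sketch of the idea, not a production cache.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.function.Function;

    /** Tiny in-memory LRU cache keyed by file path; evicts the
     *  least-recently-used entry once maxEntries is exceeded. */
    public class FileLruCache {
        private final Map<String, byte[]> cache;
        // Loader invoked on a miss, e.g. a function that reads the file from HDFS (assumption).
        private final Function<String, byte[]> loader;

        public FileLruCache(final int maxEntries, Function<String, byte[]> loader) {
            this.loader = loader;
            // accessOrder=true makes the map order entries by recency of access.
            this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        public synchronized byte[] get(String path) {
            byte[] data = cache.get(path);   // a hit also refreshes recency
            if (data == null) {
                data = loader.apply(path);   // a miss pays the connection/lookup/seek cost once
                cache.put(path, data);
            }
            return data;
        }
    }

A real front end would bound the cache by total bytes rather than entry count and deal with concurrent loads, but the point is the same: repeat reads of hot files never touch the name server or the disks.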
