That said, we routinely deal with large numbers of files by the simple expedient of packaging many files together for storage in HDFS. If you can do that, you are set.
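In case it helps, here is a minimal sketch of that packing approach using Hadoop's SequenceFile API: each small file becomes one key/value record (name -> bytes) inside a single HDFS file. The local directory and the output path below are placeholders of my own, not anything from this thread.

    import java.io.File;
    import java.nio.file.Files;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackSmallFiles {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // One SequenceFile holds many small files: key = original file
            // name, value = raw bytes. The namenode then tracks a single
            // file and its blocks instead of millions of tiny entries.
            Path out = new Path("/data/packed/small-files.seq");
            SequenceFile.Writer writer = SequenceFile.createWriter(
                    fs, conf, out, Text.class, BytesWritable.class);
            try {
                for (File f : new File("/local/small-files").listFiles()) {
                    byte[] contents = Files.readAllBytes(f.toPath());
                    writer.append(new Text(f.getName()),
                                  new BytesWritable(contents));
                }
            } finally {
                writer.close();
            }
        }
    }

The trade-off is that you give up random access by file name unless you also keep an index (or use a MapFile), which is usually fine for batch processing.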
On 9/18/07 6:27 PM, "Raghu Angadi" <[EMAIL PROTECTED]> wrote:

> Andrew Cantino wrote:
>> I know that HDFS can handle huge files, but how does it do with very
>> large numbers of medium sized files? I'm interested in using it to
>> store very large numbers of files (tens to hundreds of millions).
>> Will this be a problem?
>
> pretty much. On a 64 bit JVM, with the current Hadoop (0.14.x), memory
> required would be: ~600 * (number of files + total number of blocks
> across the files).
>
> With trunk the picture is much better.. around 300 * (...), but still
> not enough for hundreds of millions of files (on a 16 GB machine).
>
> Raghu.
>
>> Thanks,
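To put Raghu's formula in concrete terms (a rough back-of-the-envelope with illustrative numbers, not figures from the thread): 100 million files averaging one block each comes to roughly 600 * (100M + 100M) = ~120 GB of namenode heap on 0.14.x, and roughly 300 * 200M = ~60 GB on trunk, both well beyond a 16 GB machine. That is the gap the packing approach above is meant to close.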
