How many small files do you have? What is the typical size of a file? What
are the file creation/deletion rates?

HDFS stores the metadata for each file in the NameNode's main memory, so the
number of files directly determines the amount of memory (and CPU) the
NameNode needs.

If you have a cluster with 10 million files, you might need to run the
NameNode on a machine that has 16 GB of RAM.
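
For a rough sense of where a number like that comes from, here is a
back-of-envelope sketch (not an official formula). It uses the commonly
quoted rule of thumb of roughly 150 bytes of NameNode heap per namespace
object (file, directory, or block); the directory count, the one-block-per-
small-file assumption, and the per-object byte figure are all assumptions,
and a real cluster needs generous headroom beyond the raw estimate.

// Back-of-envelope NameNode heap estimate for a small-file workload.
// Assumes ~150 bytes of heap per namespace object (file, directory, block);
// real deployments need headroom for JVM overhead, transient state, growth.
public class NameNodeHeapEstimate {
    private static final long BYTES_PER_OBJECT = 150; // rule-of-thumb assumption

    static long estimateHeapBytes(long files, long directories, long blocks) {
        long objects = files + directories + blocks;
        return objects * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long files = 10_000_000L;       // 10 million files, as in the example above
        long directories = 1_000_000L;  // assumed directory count
        long blocks = 10_000_000L;      // small files -> roughly one block each
        long bytes = estimateHeapBytes(files, directories, blocks);
        System.out.printf("Raw namespace estimate: ~%.1f GB of NameNode heap%n",
                bytes / (1024.0 * 1024 * 1024));
        // Prints roughly 2.9 GB for the raw namespace objects alone; the
        // 16 GB figure above leaves room for everything else on the heap.
    }
}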

Thanks
dhruba

-----Original Message-----
From: rlucindo [mailto:[EMAIL PROTECTED] 
Sent: Friday, August 03, 2007 7:15 AM
To: hadoop-user
Subject: HDFS and Small Files



I would like to know if anyone is using HDFS as a general-purpose file
system (not for MapReduce). If so, how well does HDFS handle lots of small
files?
I'm considering HDFS as an alternative to MogileFS for a large file system
made up mostly of small files for a web application (the file system will
store HTML, images, videos, etc.), where high availability is essential.
The documentation and wiki present HDFS as a file system designed to support
MapReduce over large data volumes, but not necessarily large files.

[]'s

Lucindo


