How many small files do you have? What is the typical size of a file? What are the file creation/deletion rates?
HDFS stores metadata about each file in the NameNode's main memory, so the number of files directly determines the resources (CPU and memory) required by the NameNode. If you have a cluster with 10 million files, you might need to run the NameNode on a machine that has 16 GB of RAM.

Thanks,
dhruba

-----Original Message-----
From: rlucindo [mailto:[EMAIL PROTECTED]
Sent: Friday, August 03, 2007 7:15 AM
To: hadoop-user
Subject: HDFS and Small Files

I would like to know if anyone is using HDFS as a general-purpose file system (not for MapReduce). If so, how well does HDFS handle lots of small files? I'm considering HDFS as an alternative to MogileFS for a large file system made up mostly of small files, backing a web application (the file system will store HTML, images, videos, etc.) where high availability is essential. The documentation and wiki present HDFS as a file system to support MapReduce over large data volumes, but not necessarily large files.

[]'s
Lucindo
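A rough back-of-the-envelope sketch of the sizing arithmetic in the reply above. It assumes roughly 150 bytes of NameNode heap per namespace object (file, directory, or block) and one block per small file; neither figure comes from the message itself, and the 16 GB quoted above presumably also covers JVM overhead, block reports, and room for growth.

#!/usr/bin/env python3
# Rough NameNode heap estimate for a small-file workload.
# The ~150 bytes per namespace object (file, directory, or block) is an
# assumed rule of thumb, not a figure stated in the thread above.

BYTES_PER_OBJECT = 150  # assumed heap cost per file, directory, or block

def estimate_namenode_heap_bytes(num_files, avg_blocks_per_file=1.0):
    """Approximate NameNode heap needed for the file metadata alone."""
    objects = num_files * (1 + avg_blocks_per_file)  # one inode plus its blocks
    return objects * BYTES_PER_OBJECT

if __name__ == "__main__":
    files = 10_000_000
    gib = estimate_namenode_heap_bytes(files) / 2**30
    # Raw metadata only; real deployments add a large safety margin on top.
    print(f"{files:,} small files -> roughly {gib:.1f} GiB of raw metadata heap")

Under these assumptions the raw metadata for 10 million single-block files comes out to a few gigabytes, so a 16 GB machine as suggested above would leave comfortable headroom for the rest of the JVM heap and operational overhead.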
