But how can I say that a 1KB file will use only 1KB of disc space if the block size is configured as 64MB? In my view, if a 1KB file uses a 64MB block, the file will occupy 64MB on the disc.
How can you dissociate a 64MB HDFS data block from a disk block?

On Fri, Jun 10, 2011 at 5:01 PM, Marcos Ortiz <mlor...@uci.cu> wrote:
> On 06/10/2011 10:35 AM, Pedro Costa wrote:
>> Hi,
>>
>> If I define HDFS to use blocks of 64 MB, and I store in HDFS a 1KB
>> file, will this file occupy 64MB in HDFS?
>>
>> Thanks,
>
> HDFS is not very efficient at storing small files, because each file is stored
> in a block (of 64 MB in your case), and the block metadata
> is held in memory by the NN. But you should know that this 1KB file
> will use only 1KB of disc space.
>
> For small files, you can use Hadoop archives.
> Regards
>
> --
> Marcos Luís Ortíz Valmaseda
> Software Engineer (UCI)
> http://marcosluis2186.posterous.com
> http://twitter.com/marcosluis2186
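
To make the arithmetic in the quoted answer concrete, here is a rough sketch in Python. It assumes that the HDFS block size is only an upper bound on how much data one block holds, so the last (or only) block of a file holds just the bytes that remain. The function name hdfs_block_sizes is made up for illustration and is not part of any Hadoop API. The sketch ignores replication, the small checksum file kept next to each block, and rounding by the local file system.

```python
def hdfs_block_sizes(file_size, block_size=64 * 1024 * 1024):
    """Return the number of data bytes held by each HDFS block of a file.

    Hypothetical helper for illustration only; not a Hadoop API.
    """
    # Full blocks hold exactly block_size bytes; the remainder, if any,
    # goes into one final block that is smaller than block_size.
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


KB = 1024
MB = 1024 * KB

small = hdfs_block_sizes(1 * KB)
print(len(small), sum(small))                # 1 1024

large = hdfs_block_sizes(130 * MB)
print(len(large), [s // MB for s in large])  # 3 [64, 64, 2]
```

Under these assumptions the 1KB file still costs the NameNode one block entry in memory, which is the small-files problem mentioned in the quoted answer, but it holds only 1024 bytes of data on the DataNode per replica.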