If you set the block size (BS) to less than 64 MB, you'll run into NameNode issues as soon as a client reads a larger file. The client has to ask the NameNode for the locations of every single block, and the NameNode keeps metadata for every block in memory - imagine what happens when you want to read a 1 TB file. The optimal BS is 128 MB. You also have to keep in mind that every block will be replicated (typically 3 times). And since Hadoop is made to store large files in a JBOD (just a bunch of disks) configuration, a BS of less than 64 MB would also overwhelm the physical disks with lots of small reads and seeks instead of long sequential I/O.
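For your 4 MB images I would keep the cluster default as it is and, if you really need a different value, override the block size per file when you create it. Untested sketch with the Java FileSystem API - the path and sizes below are just made-up examples, not something from your setup:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            // Assumes fs.defaultFS points at your cluster (core-site.xml on the classpath).
            Configuration conf = new Configuration();
            // Cluster-wide default block size (dfs.blocksize in Hadoop 2.x), here 128 MB.
            conf.setLong("dfs.blocksize", 128L * 1024 * 1024);

            FileSystem fs = FileSystem.get(conf);

            // The block size can also be chosen per file at create time,
            // so the global default does not have to shrink for small images.
            long perFileBlockSize = 64L * 1024 * 1024;      // 64 MB, for this file only
            FSDataOutputStream out = fs.create(
                    new Path("/images/photo-0001.jpg"),     // hypothetical path
                    true,                                   // overwrite if it exists
                    4096,                                   // io buffer size in bytes
                    (short) 3,                              // replication factor
                    perFileBlockSize);
            out.write(new byte[4 * 1024 * 1024]);           // ~4 MB dummy payload
            out.close();
            fs.close();
        }
    }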
BR,
Alex

> On 12 May 2015, at 07:47, Krishna Kishore Bonagiri <[email protected]> wrote:
>
> The default HDFS block size of 64 MB is the maximum size of a block of data
> written to HDFS. So if you write 4 MB files, each will still occupy only one
> block of 4 MB, not more than that. If your file is larger than 64 MB, it gets
> split into multiple blocks.
>
> If you set the HDFS block size to 2 MB, then your 4 MB file will get split
> into two blocks.
>
> On Tue, May 12, 2015 at 8:38 AM, Himawan Mahardianto <[email protected]> wrote:
> Hi guys, I have a couple of questions about HDFS block size:
>
> What if I set my HDFS block size from the default 64 MB down to 2 MB per
> block - what will happen?
>
> I want to decrease the block size because I want to store image files (jpeg,
> png, etc.) of about 4 MB each - what is your opinion or suggestion?
>
> What will happen if I don't change the default block size and store a 4 MB
> image file: will Hadoop use a full 64 MB block, or will it create a 4 MB
> block instead of a 64 MB one?
>
> How much RAM is used to store each block if my block size is 64 MB, or if my
> block size is 4 MB?
>
> Does anyone have experience with this? Any suggestions are welcome.
> Thank you
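P.S. If you want to see for yourself that a small file only occupies its own size and a single block, something like the following (untested, hypothetical path again) should do - the reported length stays at ~4 MB even with a 64 MB block size:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class InspectSmallFile {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // Hypothetical 4 MB image written with the default 64 MB block size.
            FileStatus st = fs.getFileStatus(new Path("/images/photo-0001.jpg"));
            System.out.println("file length (bytes): " + st.getLen());       // ~4 MB, not 64 MB
            System.out.println("block size (bytes) : " + st.getBlockSize()); // 64 MB, only an upper bound per block

            // The file still maps to exactly one block; only ~4 MB of disk is
            // used (times the replication factor).
            BlockLocation[] blocks = fs.getFileBlockLocations(st, 0, st.getLen());
            System.out.println("number of blocks   : " + blocks.length);     // 1
            fs.close();
        }
    }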
