On Tue, Aug 28, 2012 at 7:39 PM, Ted Dunning wrote:
> Can't do variable block size in vanilla hadoop. That is part of the whole
> namenode legacy.
>

Exactly. HDFS doesn't support variable block sizes. There is an HDFS JIRA
that mentions such a feature (HDFS-2362). Then again, variable block sizes
would make things more complex. It seems that we need to make a tradeoff:
locality or simplicity.

On Tue, Aug 28, 2012 at 2:56 AM, Min Zhou wrote:
>
> > 1. If it's one data file for each column, data locality is difficult to
> > guarantee when rebuilding a row from column files. Unless
> > that GFS can keep all fields from the same row in files of the
> > same node. Moreover that, data block can't be a fixed
> > size like 1MB/64MB/128MB, cuz
> >
>

Regards,
Min
--
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
