Re: A list of questions on Dremel (or Apache Drill)'s columnar storage

Min Zhou Tue, 28 Aug 2012 06:08:02 -0700

On Tue, Aug 28, 2012 at 7:39 PM, Ted Dunning <[email protected]> wrote:


> Can't do variable block size in vanilla hadoop.  That is part of the whole
> namenode legacy.
>
Exactly. HDFS doesn't support variable block sizes. There is an HDFS JIRA
that mentions such a feature (HDFS-2362). After all, variable block sizes
would make things more complex. It seems that we need a tradeoff: locality
or simplicity.
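
To make the locality side of that tradeoff concrete, here is a rough sketch
(not how HDFS or Dremel actually lays out data). It assumes one file per
column, fixed-width values, and a fixed 64 MB block size; the column names
and widths are made up for illustration. With fixed-size blocks, the same
row ends up in different block indexes of different column files, so
nothing forces those blocks onto the same node.

```python
# Sketch only: hypothetical columns with fixed-width values, one file per column.
BLOCK_SIZE = 64 * 1024 * 1024  # assumed fixed block size in bytes (64 MB)

def rows_per_block(block_size_bytes, value_width_bytes):
    """Number of fixed-width values that fit in one fixed-size block."""
    return block_size_bytes // value_width_bytes

def block_of_row(row, block_size_bytes, value_width_bytes):
    """Index of the block in a column file that holds the given row."""
    return row // rows_per_block(block_size_bytes, value_width_bytes)

# Hypothetical schema: column name -> value width in bytes.
columns = {"id": 8, "flag": 1}

row = 10_000_000
placement = {
    name: block_of_row(row, BLOCK_SIZE, width)
    for name, width in columns.items()
}

# The same row lives in block 1 of the "id" file but block 0 of the "flag" file.
print(placement)
```

Variable block sizes would let each column file cut its blocks at the same
row boundaries, at the cost of the extra complexity mentioned above.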





On Tue, Aug 28, 2012 at 2:56 AM, Min Zhou <[email protected]> wrote:
>
> > 1. If there is one data file for each column, data locality is
> >    difficult to guarantee when rebuilding a row from column files,
> >    unless GFS can keep all fields from the same row in files on the
> >    same node. Moreover, a data block can't be a fixed
> >    size like 1MB/64MB/128MB, cuz
> >
>


Regards,
Min
-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
