After going through the Dremel and Dryad papers, here is my understanding:

1. Columnar storage is chosen so that columns a query does not need are never read, which means less IO.

2. All values of a field are kept together to improve retrieval efficiency. My understanding is that if a query requires that field, all of its values can be fetched efficiently in one seek.

3. The paper gives no detail on how to store values, repetition levels and definition levels. As David said, it can be done with separate files for values, repetition levels and definition levels. On top of this we need to index records so that we can either seek to the right position and fetch only the desired values, or read more and discard the excess later.

4. I agree with Ted and Camuel on the data locality part. It is desirable but not mandatory. The Dremel paper states that Dremel can access local data, data in GFS, or data in another store such as BigTable.

5. Dremel and Dryad both mention a similar way to retrieve data using a serving tree, in which each node acts independently as an operator or runs some custom code. The user-submitted query is translated into a DAG of execution. Dryad states that relational algebra can be expressed as a DAG. General graphs are more complicated to implement and need to take care of cycles during execution, hence Dryad chose a DAG as its query execution model.
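To make point 3 concrete, here is a minimal sketch of the "separate streams" idea for a single top-level repeated field. It is an illustration under stated assumptions, not how Dremel actually lays data out on disk: the three Python lists stand in for the three hypothetical files, and the function names `stripe` and `assemble` are made up for this example. For one repeated field, the maximum repetition level and the maximum definition level are both 1.

```python
from typing import List, Tuple


def stripe(records: List[List[int]]) -> Tuple[List[int], List[int], List[int]]:
    """Split one repeated field into value, repetition and definition streams.

    Each record is the list of values of that field (possibly empty).
    Assumption: the three returned lists stand in for three separate files.
    """
    values: List[int] = []
    rep: List[int] = []
    dfn: List[int] = []
    for items in records:
        if not items:
            # Field is absent in this record: no value is stored, only levels.
            rep.append(0)
            dfn.append(0)
            continue
        for i, item in enumerate(items):
            values.append(item)
            # Repetition level 0 starts a new record, 1 continues the list.
            rep.append(0 if i == 0 else 1)
            dfn.append(1)
    return values, rep, dfn


def assemble(values: List[int], rep: List[int], dfn: List[int]) -> List[List[int]]:
    """Rebuild the records from the three streams."""
    records: List[List[int]] = []
    next_value = iter(values)
    for r, d in zip(rep, dfn):
        if r == 0:
            records.append([])  # a repetition level of 0 marks a record boundary
        if d == 1:
            records[-1].append(next(next_value))  # the value is actually present
    return records


if __name__ == "__main__":
    original = [[10, 20], [], [30]]
    values, rep, dfn = stripe(original)
    print(values, rep, dfn)
    print(assemble(values, rep, dfn) == original)
```

A query that does not project this field never has to open any of the three streams, which is the IO saving from point 1. The record index mentioned in point 3 is not shown here.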
Please share your understanding of this to improve (or correct) mine.

Regards,
Dharm

On Tue, Aug 28, 2012 at 4:40 AM, Camuel Gilyadov <[email protected]> wrote:

> On Mon, Aug 27, 2012 at 8:40 PM, Min Zhou <[email protected]> wrote:
>
> > Hi all,
> >
> > I was very excited that you guys decided to start Apache Drill, an open
> > source version of Google's Dremel. I was a contributor to Apache Hive and
> > am skilled in Hadoop-related development. We have a nearly 3000-node
> > cluster in production, one of the largest clusters in the world.
> >
> > Dremel has become more and more popular since Google's BigQuery was
> > released. I took an interest in this nearly two years ago. This paper
> > (http://research.google.com/pubs/...<http://research.google.com/pubs/pub36632.html>)
> > describes how Dremel organizes records into nested columnar data. But
> > there is almost no information about how Dremel stores those columns.
> > I have many questions on this point.
> >
> > 1. Is there one file for each column?
>
> I think it is a less important implementation detail. What is important is
> that you don't incur IO for non-projected columns.
>
> > 2. It seems that Dremel has no restriction on where data must be stored:
> > local disk, GFS or Bigtable could all be the target storage. If the
> > data is in GFS, how does Dremel retrieve records from different nodes?
> > How does it guarantee data locality?
>
> Data locality is not mandatory. It is clearly written that data is either
> local or accessed remotely. Search the Dremel paper or slide deck for
> "in-situ" and "local".
>
> > 3. The paper mentions that "The blocks in each stripe are prefetched
> > asynchronously; the read-ahead cache typically achieves hit rates of
> > 95%." Does GFS support async prefetching?
> >
> > Have you considered the questions above? What are your answers?
> >
> > BTW, could I join you guys to start such a cool project?
>
> It is open to everyone
>
> > Thanks,
> > Min
> >
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode based virtual machine.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
