Hi All,

Wanted to pass along some good foundational material about databases. We find 
ourselves immersed day-to-day in the details of Drill's implementation. It is 
helpful to occasionally step back and look at the larger DB tradition in which 
Drill resides. This material is especially good for anyone who didn't study DB 
theory in college.

"Architecture of a Database System": 
http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf - By Hellerstein, 
Stonebraker, and Hamilton. While focused on "classic" DB systems, the ideas 
readily apply to "Big 
Data" distributed engines such as Drill. Walks through many of the basic 
architectural choices. You'll find yourself saying, "I see, Drill chose the 
shared-nothing, OS thread model but random heap allocation rather than a buffer 
pool." That is, you can see Drill's design choices in the context of the 
overall DB solution space.

"Database Management Systems", 3e by Ramakrishnan & Gehrke. A textbook-length 
overview of DB theory. I used the second edition years ago to design and build 
a complete embedded hybrid DB and object store. I keep returning to the book 
any time I need a refresher on some topic or other.

What other favorites do people have? Anyone know of any good references that 
explain the rule-based architecture of a planner such as Calcite? (R&G, 2e, 
mostly discuss the classic "dynamic programming" style of planner.)

Thanks,
- Paul
