Thanks, Paul. It is good to know the design principles behind the Drill query 
execution process model.
I am very new to Drill, so please bear with me.

One more question.
As you mentioned, schema-free processing is the key feature that gives Drill an 
advantage over Spark. Are there any performance considerations behind this 
design beyond dynamic code generation and vectorized computation?

Regards,
Qiaoyi


------------------------------------------------------------------
From: Paul Rogers <[email protected]>
Sent: Saturday, August 4, 2018, 02:27
To: dev <[email protected]>
Subject: Re: Is Drill query execution processing model just the same idea with the 
Spark whole-stage codegen improvement

Hi Qiaoyi,
As you noted, Drill and Spark have similar models -- but with important 
differences.
Drill is schema-on-read (also called "schema-less"). In particular, this means 
that Drill does not know the schema of the data until the first row (actually 
"record batch") arrives at each operator. Once Drill sees that first batch, it 
has a data schema, and can generate the corresponding code; but only for that 
one operator.
The same process repeats at each operator up the fragment ("fragment" is 
Drill's term for a Spark stage).
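
The per-operator flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: RecordBatch, LazyOperator, generate and compileCount are hypothetical names, not Drill's actual classes, and a lambda stands in for real generated code. Regenerating on a schema change is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical types for illustration only; these are not Drill classes.
final class RecordBatch {
    final List<String> schema;   // column types, discovered from the data itself
    final List<Object[]> rows;

    RecordBatch(List<String> schema, List<Object[]> rows) {
        this.schema = schema;
        this.rows = rows;
    }
}

final class LazyOperator {
    private List<String> seenSchema;                 // null until the first batch arrives
    private Function<Object[], Object[]> generated;  // stands in for generated code
    int compileCount = 0;                            // how often "code generation" ran

    RecordBatch next(RecordBatch in) {
        // The schema is unknown until a batch arrives, so generation happens here.
        // Assumption: regenerate whenever the schema differs from the last one seen.
        if (!in.schema.equals(seenSchema)) {
            seenSchema = in.schema;
            generated = generate(in.schema);
            compileCount++;
        }
        List<Object[]> out = new ArrayList<>();
        for (Object[] row : in.rows) {
            out.add(generated.apply(row));
        }
        return new RecordBatch(in.schema, out);
    }

    // Builds a row function specialized to the schema: INT columns are
    // incremented, all other columns pass through. Type checks happen once
    // here, not once per row.
    private static Function<Object[], Object[]> generate(List<String> schema) {
        List<Function<Object, Object>> perColumn = new ArrayList<>();
        for (String type : schema) {
            if (type.equals("INT")) {
                perColumn.add(v -> ((Integer) v) + 1);
            } else {
                perColumn.add(v -> v);
            }
        }
        return row -> {
            Object[] result = new Object[row.length];
            for (int i = 0; i < row.length; i++) {
                result[i] = perColumn.get(i).apply(row[i]);
            }
            return result;
        };
    }
}
```

Chaining two such operators, as in upstream.next(downstream.next(batch)), shows the point: each operator generates its own code only when the first batch reaches it, so no single compilation step can cover the whole fragment ahead of time.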
I believe that Spark requires (or at least allows) the user to define a schema 
up front. This is particularly true for the more modern data frame APIs.
Do you think the Spark improvement would apply to Drill's case of determining 
the schema operator-by-operator up the DAG?
Thanks,
- Paul



    On Friday, August 3, 2018, 8:57:29 AM PDT, 丁乔毅(智乔) 
<[email protected]> wrote:  


Hi, all. 

I'm very new to Apache Drill. 

I'm quite interested in the implementation of Drill's query execution. 
After reading a little of the source code, I found that it is built on a 
processing model much like the data-centric, push-based style, which is very 
similar to the idea behind the Spark whole-stage codegen improvement (JIRA ticket 
https://issues.apache.org/jira/browse/SPARK-12795).

Is there any detailed documentation about this? What are the considerations 
behind this design in the Drill project? : )

Regards,
Qiaoyi  
