You are pretty close. QueryExecution is what drives the phases from parsing through execution. Once we have the final SparkPlan (the physical plan), toRdd just calls execute() on the root of that plan. Each operator's execute() in turn calls execute() on its children, and so on until we hit a leaf operator, so the recursion lives inside the individual operators rather than in a loop in toRdd. This gives us an RDD[Row] that will compute the answer, and from there the actual execution is driven by Spark Core.
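To make the recursion concrete, here is a rough toy model of the pattern. It is only a sketch and not the actual Spark SQL code: ToyPlan, ToyScan, ToyFilter and ToyProject are made-up names, Seq[Any] stands in for Row, and a lazy Iterator stands in for RDD[Row].

```scala
// Toy model only: these are NOT Spark's classes. Seq[Any] stands in for Row,
// and a lazy Iterator stands in for RDD[Row].
object ToyPlanDemo {
  type Row = Seq[Any]

  // Stand-in for SparkPlan: every operator knows how to produce its own output.
  sealed trait ToyPlan {
    def execute(): Iterator[Row]
  }

  // Leaf operator: it has no children, so the recursion stops here.
  case class ToyScan(rows: Seq[Row]) extends ToyPlan {
    def execute(): Iterator[Row] = rows.iterator
  }

  // Unary operator: asks its child for output, then wraps it lazily.
  case class ToyFilter(pred: Row => Boolean, child: ToyPlan) extends ToyPlan {
    def execute(): Iterator[Row] = child.execute().filter(pred)
  }

  // Unary operator: keeps only the requested column positions.
  case class ToyProject(columns: Seq[Int], child: ToyPlan) extends ToyPlan {
    def execute(): Iterator[Row] =
      child.execute().map(row => columns.map(i => row(i)))
  }

  def main(args: Array[String]): Unit = {
    val plan: ToyPlan =
      ToyProject(Seq(0),
        ToyFilter(row => row(1).asInstanceOf[Int] > 20,
          ToyScan(Seq(Seq("a", 10), Seq("b", 30)))))

    // Only the root's execute() is called here; the children are reached
    // through the calls each operator makes inside its own execute().
    println(plan.execute().toList)
  }
}
```

Nothing in main walks the tree. The one call on the root is enough because every non-leaf operator calls execute() on its child, which is the same shape you should find when you read the execute() methods of the physical operators.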

On Mon, Nov 24, 2014 at 9:52 AM, Tim Chou <timchou....@gmail.com> wrote:

> Hi All,
>
> I'm learning the code of Spark SQL.
>
> I'm confused about how SchemaRDD executes each operator.
>
> I'm tracing the code. I found toRDD() function in QueryExecution is the
> start for running a query. toRDD function will run SparkPlan, which is a
> tree structure.
>
> However, I didn't find any iterative sentence in execute function for any
> detail operators. It seems Spark SQL will only run the top node in this
> tree.
>
> I know the conclusion is wrong. But which code have I missed?
>
> Thanks,
> Tim