Hi calcite developers,


I am currently working on a project about query optimization and trying to
set up Calcite to use Spark as the execution engine. However, I have several
questions about the current Calcite Spark adapter.


I understand that the Spark adapter needs to get metadata about tables from
other source systems. Therefore, what I am trying to do is use the
schema factory provided by the JDBC adapter to get metadata from a MySQL
server, and then use Spark as the execution engine.
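
For reference, this is roughly how I am wiring things up (the host, database
name, and credentials below are placeholders): a model file that points the
JDBC adapter at MySQL, which I then open over a Calcite JDBC connection with
the `spark=true` property so the planner registers the Spark rules.

```json
{
  "version": "1.0",
  "defaultSchema": "mysql",
  "schemas": [
    {
      "name": "mysql",
      "type": "jdbc",
      "jdbcDriver": "com.mysql.cj.jdbc.Driver",
      "jdbcUrl": "jdbc:mysql://localhost:3306/mydb",
      "jdbcUser": "user",
      "jdbcPassword": "password"
    }
  ]
}
```

and the connection string is along the lines of
`jdbc:calcite:model=/path/to/model.json;spark=true`.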



Currently, I am able to get a plan from the Volcano planner. The root of
the plan is a SparkToEnumerableConverter and its input is a
JdbcToSparkConverter, whose input is the rest of the query plan. However, the
program crashes and exits in a later phase, when the plan is being implemented.



I am not sure if this is the correct way to use the Spark adapter, and I am
still unclear about how the Spark adapter works in detail. I think it should
compose a Java program based on the query plan using Spark's RDD API and
submit the program to the Spark cluster. However, I find that only a very
limited number of RDD operations are used in the Spark adapter's code. I am
wondering whether the Spark adapter is capable of converting a non-trivial
query (with joins and aggregations) into a Spark program.



Cheers,

Hao Tan
