Greetings, Calcite devs. First of all, thank you for your work on Calcite! I am working on a federated query engine that will use Spark (or something similar) as its main execution engine. Among other data sources, the query engine will read from Apache Phoenix tables/views. The hope is to use Calcite as the query planning and optimization component of this engine.
At a high level, I am trying to build the following using Calcite:

1. Generate a relational algebra expression tree using RelBuilder based on user input. I plan to implement custom Schema and Table classes backed by my own metadata.
2. Provide Calcite with query optimization rules.
3. Traverse the optimized expression tree to generate a set of Spark instructions.
4. Execute those instructions via Spark.

A few questions regarding the above:

1. Are there existing examples of code that does #3 above? I looked at the Spark submodule and it seems pretty bare-bones. What would be great to see is an example of a RelNode tree being traversed to create a plan for asynchronous execution via something like Spark or Pig.
2. An important query optimization planned for the initial version is pushing simple filters down to Phoenix (the plan is to use the Phoenix-Spark integration <http://phoenix.apache.org/phoenix_spark.html> for reading data). Any examples of such push-downs to specific data sources in a federated query scenario would be much appreciated.

Thank you! Looking forward to working with the Calcite community.

-------------
Eli Levine
Software Engineering Architect
--
Salesforce.com
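To make #3 concrete, here is a minimal sketch of the kind of traversal I have in mind: a post-order walk over a plan tree that emits one execution step per operator, so each step appears after the steps producing its inputs. Note this is not the Calcite API; `PlanNode` and the string "instructions" are hypothetical stand-ins (real code would walk `org.apache.calcite.rel.RelNode`, e.g. with a visitor), used here only to show the shape of the translation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a relational operator tree; real code
// would traverse org.apache.calcite.rel.RelNode instead.
class PlanNode {
    final String op;              // e.g. "Scan", "Filter", "Project"
    final List<PlanNode> inputs;
    PlanNode(String op, PlanNode... inputs) {
        this.op = op;
        this.inputs = List.of(inputs);
    }
}

public class PlanWalker {
    // Post-order: emit steps for the inputs first, then for this node,
    // so the resulting list is already in a valid execution order.
    static void emit(PlanNode node, List<String> out) {
        for (PlanNode input : node.inputs) {
            emit(input, out);
        }
        out.add("spark step for " + node.op);
    }

    public static List<String> compile(PlanNode root) {
        List<String> out = new ArrayList<>();
        emit(root, out);
        return out;
    }

    public static void main(String[] args) {
        PlanNode plan = new PlanNode("Project",
            new PlanNode("Filter", new PlanNode("Scan")));
        System.out.println(compile(plan));
        // -> [spark step for Scan, spark step for Filter, spark step for Project]
    }
}
```

In a real implementation each emitted "step" would of course be a Spark operation (or a deferred RDD/DataFrame transformation) rather than a string, but the bottom-up ordering is the part I wanted to ask about.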
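And for #2, a toy illustration of the rewrite I would like Calcite to perform: when a Filter sits directly on a Scan, fold the predicate into the scan so the data source (Phoenix, in my case) can evaluate it server-side. Again, `Rel`, `Scan`, and `Filter` here are hypothetical classes, not Calcite's; in Calcite this would presumably be a planner rule matching that operator pattern.

```java
// Toy filter push-down rewrite; the node classes are hypothetical
// stand-ins, not the Calcite API.
abstract class Rel {}

class Scan extends Rel {
    final String table;
    final String predicate;   // null = nothing pushed to the source yet
    Scan(String table, String predicate) {
        this.table = table;
        this.predicate = predicate;
    }
}

class Filter extends Rel {
    final Rel input;
    final String predicate;
    Filter(Rel input, String predicate) {
        this.input = input;
        this.predicate = predicate;
    }
}

public class PushDown {
    // If a Filter sits directly on a Scan with no predicate yet,
    // replace the pair with a Scan that carries the predicate, so the
    // source (e.g. Phoenix) filters rows before they reach Spark.
    public static Rel apply(Rel node) {
        if (node instanceof Filter) {
            Filter f = (Filter) node;
            if (f.input instanceof Scan) {
                Scan s = (Scan) f.input;
                if (s.predicate == null) {
                    return new Scan(s.table, f.predicate);
                }
            }
        }
        return node;   // pattern did not match; leave the tree as-is
    }

    public static void main(String[] args) {
        Rel before = new Filter(new Scan("PHOENIX.T", null), "A > 10");
        Scan after = (Scan) apply(before);
        System.out.println(after.table + " WHERE " + after.predicate);
        // -> PHOENIX.T WHERE A > 10
    }
}
```

The interesting part for me is how to express this kind of source-specific rewrite as a proper Calcite rule in a federated setting, hence the question above.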
