Edmon Begoli created CALCITE-2169:
-------------------------------------

             Summary: Conduct a comparative performance study of the framework 
                 Key: CALCITE-2169
                 URL: https://issues.apache.org/jira/browse/CALCITE-2169
             Project: Calcite
          Issue Type: Task
          Components: core
         Environment: Use Calcite Benchmark, and run it on the Benchmark 
environment. 
            Reporter: Edmon Begoli
            Assignee: Edmon Begoli


Design and implement a performance study of the Calcite framework using the 
benchmark that is to be developed for CALCITE-2168 (Implement a General Purpose 
Benchmark for Calcite). The study should analyze the performance of the Calcite 
optimizer itself, compare the performance of queries with and without Calcite 
optimization, and compare both against standalone databases or other frameworks.

Some ideas and targets for the study:

* Planning and execution time with queries that span across multiple systems 
(e.g. Postgres and Cassandra, Postgres and Pig, Pig and Cassandra).
* For TPC-DS, study the plan produced by Calcite vs. existing RDBMS optimizers 
(e.g. Postgres, MySQL). This could also serve as a feature used in conjunction 
with the lattice framework: the estimated time savings would help decide which 
queries to eventually build lattices for.
* Optimizer runtime for complex queries (we could also compare with the runtime 
of executing the optimized query directly):
  * Calcite optimized query
  * Unoptimized query with the optimizer of the backend disabled
  * Unoptimized query with the optimizer of the backend enabled
* Comparison with other federated query processing engines such as Spark SQL, 
PrestoDB, and maybe KSQL[1] and InfluxDB
* Grunion, which uses Calcite to optimize Spark queries [2]

[1] https://github.com/confluentinc/ksql
[2] 
https://www.datascience.com/blog/grunion-data-science-tools-query-optimizer-apache-spark
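
One possible shape for the measurement harness behind the targets above, as a 
minimal sketch only. It assumes each configuration under study (Calcite 
optimized, backend optimizer disabled, backend optimizer enabled) can be 
wrapped as a pair of callables: one that produces a plan and one that executes 
it. The names `time_query`, `compare` and `Timing` are hypothetical and are not 
part of Calcite or of the CALCITE-2168 benchmark.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Timing:
    planning_ms: float   # median time spent producing the plan
    execution_ms: float  # median time spent running the plan


def time_query(plan: Callable[[], object],
               execute: Callable[[object], object],
               warmup: int = 2,
               runs: int = 5) -> Timing:
    """Time planning and execution separately; report medians over `runs`."""
    planning: List[float] = []
    execution: List[float] = []
    for i in range(warmup + runs):
        t0 = time.perf_counter()
        prepared = plan()          # hypothetical: parse + optimize
        t1 = time.perf_counter()
        execute(prepared)          # hypothetical: run and drain the result
        t2 = time.perf_counter()
        if i >= warmup:            # discard warm-up iterations
            planning.append((t1 - t0) * 1000.0)
            execution.append((t2 - t1) * 1000.0)
    return Timing(statistics.median(planning), statistics.median(execution))


def compare(configs: Dict[str, Tuple[Callable, Callable]],
            **kwargs) -> Dict[str, Timing]:
    """Run time_query for every named (plan, execute) pair."""
    return {name: time_query(plan, execute, **kwargs)
            for name, (plan, execute) in configs.items()}
```

Keeping planning and execution as separate measurements lets the study report 
optimizer runtime on its own and also total time per configuration. The 
callables would be supplied by whatever driver connects to each system; 
nothing here assumes a particular one.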



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
