One interesting exercise would also be to pick a popular benchmark (e.g. TPC-H) and just look at the plan produced by Calcite vs existing RDBMS optimizers (e.g. Postgres, MySQL). Along with performance analysis of the various options, it seems there's a paper in there.
--
Michael Mior
[email protected]

2018-02-03 23:21 GMT-05:00 Edmon Begoli <[email protected]>:

> I am planning on opening an issue and coordinating an initiative to
> develop a Calcite-focused benchmark.
>
> This would lead to the development of an executable, reportable benchmark,
> and to our next publication, aimed at another significant computer science
> conference or journal.
>
> Before I submit a JIRA issue, I would like to get your feedback on what
> this benchmark might be, both in terms of what it should benchmark and how
> it should be implemented.
>
> A couple of preliminary thoughts that came out of the conversation with the
> co-authors of our SIGMOD paper are:
>
> * Optimizer runtime for complex queries (we could also compare with the
> runtime of executing the optimized query directly)
> * Calcite optimized query
> * Unoptimized query with the optimizer of the backend disabled
> * Unoptimized query with the optimizer of the backend enabled
> * Overhead of going through Calcite adapters vs. natively accessing the
> target DB
> * Comparison with other federated query processing engines such as Spark
> SQL and PrestoDB
> * use TPC-H or TPC-DS for this purpose
> * use Star Schema Benchmark (SSB)
> * Planning and execution time with queries that span across multiple
> systems (e.g. Postgres and Cassandra, Postgres and Pig, Pig and Cassandra).
>
> Follow approaches similar to:
> * https://www.slideshare.net/julianhyde/w-435phyde-3
> * https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_hive-performance-tuning/content/ch_cost-based-optimizer.html
> * (How much of this is still relevant (Hive 0.14)? Can we use
> queries/benchmarks?)
> https://hortonworks.com/blog/hive-0-14-cost-based-optimizer-cbo-technical-overview/
>
> Please share your suggestions.
