Hello Nikolay.
You've asked very good questions. I'll try to answer.
1. What are the exact issues with the H2 integration?
Can you send the ticket links?
Can we label all H2 integration issues in JIRA? I propose using the "h2" label.
The current SQL engine is confined to a single-pass map-reduce algorithm.
This makes it impossible to execute complex queries that cannot be
expressed as a single map-reduce pass, such as subqueries with aggregates
[1]. Another problem is that the H2 optimizer is very primitive and is
not able to perform many useful optimizations [2].
Also, Apache Calcite is commonly used in popular Apache projects such as
Hive, Drill, Flink and others [3]. So it is a mature and well
battle-tested framework, while H2 is a toy database that is hardly ever
used in real production systems.
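To make the single-pass limitation concrete, here is a minimal sketch in
Python (not Ignite code). It assumes a hypothetical table of salaries split
across two nodes and a query along the lines of "select salaries above the
average salary". The data, the function names and the query itself are
illustrative assumptions, not taken from the tickets.

```python
from statistics import mean

# Hypothetical data: one "salary" column partitioned across two nodes.
partitions = [
    [10, 20, 30],   # rows stored on node A
    [100, 200],     # rows stored on node B
]

def single_pass(partitions):
    """One map step per node, then a reduce step that only merges rows."""
    result = []
    for local_rows in partitions:
        # The aggregate subquery is evaluated on local rows only.
        local_avg = mean(local_rows)
        result.extend(s for s in local_rows if s > local_avg)
    return sorted(result)

def two_pass(partitions):
    """Pass 1 computes the global aggregate, pass 2 filters with it."""
    total = sum(sum(rows) for rows in partitions)
    count = sum(len(rows) for rows in partitions)
    global_avg = total / count
    return sorted(s for rows in partitions for s in rows if s > global_avg)
```

With this data the single pass returns [30, 200], while the two-pass plan
returns the expected [100, 200], because the aggregate has to be known
globally before the filter can run.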
2. What are the requirements for the new SQL engine?
We should write it down and discuss.
The main requirement is to fix the problems listed above. The new SQL
engine should be able to *effectively* execute SQL queries of
*arbitrary complexity*. For example, the new engine will be able to
perform distributed joins in multiple ways [4], whereas the current engine
can do it in only two ways: collocated and distributed (the latter is
usually not very efficient and has to be enabled manually).
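As a rough illustration of what "multiple ways" can mean, the Python sketch
below contrasts two commonly described strategies for joining tables that
are not collocated on the join key: broadcasting one table to every node,
and repartitioning both tables by the join key. The strategy names, the
cluster size and the sample rows are my assumptions for illustration; this
is not a description of what the new engine will implement.

```python
from collections import defaultdict

NODES = 3  # hypothetical cluster size

def local_hash_join(left, right):
    """Plain hash join of two lists of (key, value) rows on a single node."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(k, lv, rv) for k, lv in left for rv in index.get(k, [])]

def broadcast_join(left_parts, right_parts):
    """Ship the whole right table to every node, then join locally."""
    full_right = [row for part in right_parts for row in part]
    rows = []
    for left_part in left_parts:
        rows.extend(local_hash_join(left_part, full_right))
    return sorted(rows)

def repartition_join(left_parts, right_parts):
    """Reshuffle both tables by join key so matching rows meet on one node."""
    left_shuffled = [[] for _ in range(NODES)]
    right_shuffled = [[] for _ in range(NODES)]
    for part in left_parts:
        for key, value in part:
            left_shuffled[key % NODES].append((key, value))
    for part in right_parts:
        for key, value in part:
            right_shuffled[key % NODES].append((key, value))
    rows = []
    for node in range(NODES):
        rows.extend(local_hash_join(left_shuffled[node], right_shuffled[node]))
    return sorted(rows)

# Hypothetical tables, neither of them partitioned by the join key.
left_parts = [[(1, "a"), (2, "b")], [(3, "c"), (1, "d")], [(2, "e")]]
right_parts = [[(1, "X")], [(2, "Y")], [(4, "Z")]]
```

Both functions return the same rows. They differ in how much data moves
between nodes, which is the kind of trade-off a cost-based optimizer is
meant to weigh.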
3. What options do we have?
Are there any alternatives to Calcite on the market?
We once made a wrong choice that looked obvious at the time.
So we should be careful to avoid that this time.
I know of only one open source implementation of an efficient query
optimization strategy, and that is Apache Calcite. The alternative
is to write our own query optimizer from scratch, which is not a trivial
task at all.
4. What improvements to Ignite do we want to make with the new engine?
Ignite will be able to execute complex queries using an optimal strategy. I
think this is quite a good improvement.
[1] https://issues.apache.org/jira/browse/IGNITE-11448
[2] https://issues.apache.org/jira/browse/IGNITE-6085
[3] https://calcite.apache.org/docs/powered_by.html
[4] https://www.memsql.com/blog/scaling-distributed-joins/
--
Kind Regards
Roman Kondakov
On 27.09.2019 12:20, Nikolay Izhikov wrote:
Hello, Igor.
Thanks for starting this discussion.
I think we should take a step back in it and answer the following questions:
1. What are the exact issues with the H2 integration?
Can you send the ticket links?
Can we label all H2 integration issues in JIRA? I propose using the "h2" label.
2. What are the requirements for the new SQL engine?
We should write it down and discuss.
3. What options do we have?
Are there any alternatives to Calcite on the market?
We once made a wrong choice that looked obvious at the time.
So we should be careful to avoid that this time.
4. What improvements to Ignite do we want to make with the new engine?
On Fri, 27/09/2019 at 08:44 +0000, Igor Seliverstov wrote:
Hi Igniters!
As you might know, we currently have many open issues relating to the current
H2-based engine and its execution flow.
Some of them are critical (like the impossibility of executing particular
queries), some are major (like the impossibility of executing particular
queries without first preparing your data to be collocated), and many are minor.
Most of the issues cannot be solved without a redesign of the whole engine.
So, here is the proposal:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084
I'd appreciate it if you shared your thoughts on it.
Regards,
Igor