Hello Nikolay.

You've asked very good questions. I'll try to answer.

1. What are the exact issues with the H2 integration?
Can you send the ticket links?
Can we label all H2 integration issues in JIRA? I propose using the "h2" label.
The current SQL engine is confined to a single-pass map-reduce algorithm. This makes it impossible to execute complex queries that cannot be expressed as a single map-reduce pass, such as subqueries with aggregates [1]. Another problem is that the H2 optimizer is very primitive and cannot perform many useful optimizations [2].

Also, Apache Calcite is widely used in popular Apache projects such as Hive, Drill, Flink and others [3]. So it is a mature and battle-tested framework, while H2 is a toy database that is hardly ever used in real production systems.
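
To illustrate the single-pass limitation mentioned above, here is a toy sketch in Python. It is not Ignite or H2 code; the data, the function names, and the query (something like SELECT salary FROM emp WHERE salary > (SELECT AVG(salary) FROM emp)) are assumptions made up for the example. The point is only that the filter needs a global aggregate, which a node cannot know during a single map step.

```python
from statistics import mean

# Hypothetical data: salaries spread over three partitions (nodes).
partitions = [
    [10, 20],
    [30, 100],
    [40],
]

def single_pass(parts):
    """One map step per partition, then one merge.

    Each node only sees its local rows, so it can only compare
    against a local average, which gives a wrong answer.
    """
    mapped = [[s for s in p if s > mean(p)] for p in parts]
    return sorted(s for m in mapped for s in m)

def two_pass(parts):
    """Pass 1 computes the global average, pass 2 filters against it."""
    # Pass 1: map to partial (sum, count), reduce to the global average.
    partials = [(sum(p), len(p)) for p in parts]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    avg = total / count
    # Pass 2: map again with the global average, reduce by merging.
    return sorted(s for p in parts for s in p if s > avg)

print(single_pass(partitions))  # [20, 100] (incorrect)
print(two_pass(partitions))     # [100] (correct, the global average is 40)
```

The second function needs the result of one reduce step before the next map step can start, which is exactly what a single map-reduce pass cannot express.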

2. What are the requirements for the new SQL engine?
We should write it down and discuss.
The main requirement is to fix the problems listed above. The new SQL engine should be able to *efficiently* execute SQL queries of *arbitrary complexity*. For example, the new engine will be able to perform distributed joins in multiple ways [4], whereas the current engine can do it in only two ways: collocated and distributed (the latter is usually not very efficient and has to be enabled manually).
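
As a rough illustration of why collocation matters for joins, here is a toy model in Python. It does not reflect how Ignite implements either join mode; the modulo partitioning rule, the table layouts, and all names are hypothetical. It shows that a purely local join misses matches when the rows are not partitioned by the join key, and that re-shuffling one side by the join key (one of several possible strategies) restores the full result.

```python
from collections import defaultdict

NODES = 2

def node_for(key):
    # Hypothetical partitioning rule: key modulo the number of nodes.
    return key % NODES

def partition(rows, key_index):
    """Assign each row to a node based on the chosen column."""
    parts = defaultdict(list)
    for row in rows:
        parts[node_for(row[key_index])].append(row)
    return parts

def local_join(left_parts, right_parts, left_key, right_key):
    """Each node joins only the rows it already holds."""
    out = []
    for node in range(NODES):
        for l in left_parts.get(node, []):
            for r in right_parts.get(node, []):
                if l[left_key] == r[right_key]:
                    out.append((l, r))
    return sorted(out)

# orders: (order_id, customer_id); customers: (customer_id, name)
orders = [(1, 10), (2, 11), (3, 10)]
customers = [(10, "Ann"), (11, "Bob")]

customers_by_id = partition(customers, 0)

# Orders partitioned by order_id, so they are NOT collocated with customers.
orders_by_id = partition(orders, 0)
not_collocated = local_join(orders_by_id, customers_by_id, 1, 0)

# Orders re-shuffled by customer_id, the join key: now collocated.
orders_by_customer = partition(orders, 1)
collocated = local_join(orders_by_customer, customers_by_id, 1, 0)

print(len(not_collocated))  # 0, every match is lost
print(len(collocated))      # 3, all matches found
```

A cost-based optimizer can choose between strategies like re-shuffling one side, re-shuffling both, or broadcasting the smaller table, instead of leaving that choice to the user.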

3. What options do we have?
Are there any alternatives to Calcite on the market?
We once made a wrong choice that looked obvious at the time.
So we should be careful to avoid that this time.
I know of only one open-source implementation of an efficient query optimization strategy, and that is Apache Calcite. The alternative is to write our own query optimizer from scratch, which is not a trivial task at all.


4. What improvements to Ignite do we want to make with the new engine?
Ignite will be able to execute complex queries using an optimal strategy. I think this is quite a good improvement.


[1] https://issues.apache.org/jira/browse/IGNITE-11448
[2] https://issues.apache.org/jira/browse/IGNITE-6085
[3] https://calcite.apache.org/docs/powered_by.html
[4] https://www.memsql.com/blog/scaling-distributed-joins/
--
Kind Regards
Roman Kondakov

On 27.09.2019 12:20, Nikolay Izhikov wrote:
Hello, Igor.

Thanks for starting this discussion.

I think we should take a step back in it and answer the following questions:

1. What are the exact issues with the H2 integration?
Can you send the ticket links?
Can we label all H2 integration issues in JIRA? I propose using the "h2" label.

2. What are the requirements for the new SQL engine?
We should write it down and discuss.

3. What options do we have?
Are there any alternatives to Calcite on the market?
We once made a wrong choice that looked obvious at the time.
So we should be careful to avoid that this time.

4. What improvements to Ignite do we want to make with the new engine?


On Fri, 27/09/2019 at 08:44 +0000, Igor Seliverstov wrote:
Hi Igniters!

As you might know, we currently have many open issues related to the current
H2-based engine and its execution flow.

Some of them are critical (like the impossibility of executing particular
queries), some are major (like the impossibility of executing particular
queries without first preparing your data to be collocated), and many are minor.

Most of the issues cannot be solved without a redesign of the whole engine.

So, here is the proposal:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084

I'd appreciate it if you shared your thoughts on it.

Regards,
Igor
