Nikolay,

> Which project hosts the Calcite-based engine?
Currently the prototype lives in my personal Ignite fork. I need an appropriate ticket before pushing it to the ASF git repository. First, I think, we should discuss the idea in general.

> Personally, I'm against supporting two independent implementations of the SQL
> engine for several releases.

I don't like the idea of having two engines either. But even developing the engine on top of the Calcite library is a big undertaking. I'm not sure it will be ready; in fact, I'm sure it WON'T be ready by the Ignite 3 release. That is why I mentioned the option of having two engines at the same time.

> Let's start with the IEP clarification and replace the SQL engine with the
> best one for the good of Ignite.

Of course, but in any case it's good to get familiar with the couple of examples the IEP already describes and to clarify any additional questions the community may ask.

Regards,
Igor

> On 27 Sep 2019, at 18:22, Nikolay Izhikov <nizhi...@apache.org> wrote:
>
> Igor.
>
>> There is no decision, here we should decide.
>
> Great.
>
>> Right now the Calcite-based engine lives in a separate module
>
> Which project hosts the Calcite-based engine?
>
>> It's possible to develop it as an experimental extension at first (not a
>> replacement)
>
> For me, Ignite 3 is the place where the new engine has to go.
> Personally, I'm against supporting two independent implementations of the SQL
> engine for several releases.
>
> Ignite has too many partially implemented features to include one more :)
>
> Let's start with the IEP clarification and replace the SQL engine with the
> best one for the good of Ignite.
>
>
> On Fri, 27/09/2019 at 18:08 +0300, Seliverstov Igor wrote:
>> Nikolay,
>>
>> At last we have better questions.
>>
>> There is no decision, here we should decide.
>>
>> Doing nothing isn't a decision, it's just doing nothing.
>>
>> Spark Catalyst is a good example, but under the hood it has absolutely the
>> same idea, only adapted to Spark. Calcite is the same, but general. That's
>> why it's a better starting point.
>>
>> Implementing an engine from scratch is really cool, but it looks like
>> reinventing the wheel; I don't think it makes sense. At least I am against
>> this option.
>>
>> I added requirements to the IEP (as you asked); as you can see, it's in
>> DRAFT state and will be supplemented with details.
>>
>> We have some thoughts on how to make the replacement smooth, but first we
>> should decide what to replace and with what.
>>
>> Right now the Calcite-based engine lives in a separate module; we have
>> checked that it can build an execution graph for both local and distributed
>> cases, and it has good extensibility.
>> We talked to the Calcite community to identify possible future issues, and
>> everything points to the fact that it's the best option.
>> It's possible to develop it as an experimental extension at first (not a
>> replacement) until we make sure that it works as expected. This way there
>> are no risks for anybody who uses Ignite in a production environment.
>>
>> Regards,
>> Igor
>>
>>
>>> On 27 Sep 2019, at 17:25, Nikolay Izhikov <nizhi...@apache.org> wrote:
>>>
>>> Igor.
>>>
>>>> The main issue - there is no *selection*.
>>>
>>> 1. I don't remember a community decision about this.
>>>
>>> 2. We should avoid making such a long-term decision so quickly.
>>> We made this kind of decision with H2 and have come to the point where we
>>> have to revisit it.
>>>
>>>> 1) Implementing white papers from scratch
>>>> 2) Adapting Calcite to our needs.
>>>
>>> The third option doesn't fix the issues we have with H2.
>>> The fourth option I know of is using spark-catalyst.
>>>
>>> What is wrong with writing an engine from scratch?
>>>
>>> I ask you to start with the engine requirements.
>>> Can we please discuss them?
>>>
>>>> If you have an alternative - you're welcome, I'll gratefully listen to you.
>>>
>>> We have an alternative for now - the H2-based engine.
>>>
>>>> The main question isn't "WHAT" but "HOW" - that's the discussion topic
>>>> from my point of view.
>>>
>>> Once we make a decision about the engine, we can discuss the roadmap for
>>> the replacement.
>>> Once again - replacing the SQL engine with a more customizable one makes
>>> sense to me.
>>> But this kind of decision needs careful discussion.
>>>
>>> On Fri, 27/09/2019 at 17:08 +0300, Seliverstov Igor wrote:
>>>> Nikolay,
>>>>
>>>> The main issue - there is no *selection*.
>>>>
>>>> There is a field of knowledge - relational algebra, which describes how to
>>>> transform relational expressions while preserving their semantics, and a
>>>> couple of implementations (Calcite is the only one written in Java).
>>>>
>>>> There are only two alternatives:
>>>>
>>>> 1) Implementing white papers from scratch
>>>> 2) Adapting Calcite to our needs.
>>>>
>>>> The second way was chosen by several other projects; there is experience,
>>>> there is a list of known issues (like using indexes), so almost everything
>>>> is already done for us.
>>>>
>>>> Implementing a planner is a big deal; I think everybody here understands
>>>> that. That's why our proposal to reuse others' experience is the obvious
>>>> one.
>>>>
>>>> If you have an alternative - you're welcome, I'll gratefully listen to you.
>>>>
>>>> The main question isn't "WHAT" but "HOW" - that's the discussion topic
>>>> from my point of view.
>>>>
>>>> Regards,
>>>> Igor
>>>>
>>>>> On 27 Sep 2019, at 16:37, Nikolay Izhikov <nizhi...@apache.org> wrote:
>>>>>
>>>>> Roman.
>>>>>
>>>>>> Nikolay, Maxim, I understand that our arguments may not be as obvious
>>>>>> to you as they are to the SQL team. So, please arrange your questions in
>>>>>> a more constructive way.
>>>>>
>>>>> What is the SQL team?
>>>>> I only know the Ignite community :)
>>>>>
>>>>> Please share your knowledge in the IEP.
>>>>> I want to join the process of engine *selection*.
>>>>> It should start with the requirements for such an engine.
>>>>> Can you write them in the IEP, please?
>>>>>
>>>>> My point is very simple:
>>>>>
>>>>> 1. We made the wrong decision with H2.
>>>>> 2. We should make a well-considered decision about the new engine.
>>>>>
>>>>>> How many tickets would satisfy you?
>>>>>
>>>>> You write about "issueS" with H2.
>>>>> All I see is one open ticket.
>>>>> The IEP doesn't provide enough information.
>>>>> So it's not about the number of tickets, it's about
>>>>>
>>>>>> These two points (single map-reduce execution and inflexible optimizer)
>>>>>> are the main problems with the current engine.
>>>>>
>>>>> We may come to the point where Calcite (or any other engine) brings us a
>>>>> third and further "main problems".
>>>>> This is what happened with H2.
>>>>>
>>>>> Let's start from what we want to get from the engine and move forward
>>>>> from that base.
>>>>> What do you think?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, 27/09/2019 at 16:15 +0300, Roman Kondakov wrote:
>>>>>> Maxim, Nikolay,
>>>>>>
>>>>>> I've listed two issues which show the ideological flaws of the current
>>>>>> engine.
>>>>>>
>>>>>> 1. IGNITE-11448 - Open. This ticket describes the impossibility of
>>>>>> executing queries which do not fit into the hardcoded one-pass
>>>>>> map-reduce paradigm.
>>>>>>
>>>>>> 2. IGNITE-6085 - Closed (won't fix) - This ticket describes the second
>>>>>> major problem with the current engine: the H2 query optimizer is very
>>>>>> primitive and can not perform many useful optimizations.
>>>>>>
>>>>>> These two points (single map-reduce execution and inflexible optimizer)
>>>>>> are the main problems with the current engine. It means that our engine
>>>>>> is currently suitable for executing only a very limited subset of
>>>>>> typical SQL queries. For example, it can not even run most of the TPC-H
>>>>>> benchmark queries because they don't fit the simple map-reduce
>>>>>> paradigm.
>>>>>>
>>>>>>> All I see is links to two tickets:
>>>>>>
>>>>>> How many tickets would satisfy you? I named two. And it looks like that
>>>>>> is not enough from your point of view. Ok, so how many is enough? The set
>>>>>> of problems caused by the tickets listed above is infinite, therefore I
>>>>>> can not create a ticket for each of them.
>>>>>>
>>>>>>> Tech details also should be added.
>>>>>>
>>>>>> Tech details are in the tickets.
>>>>>>
>>>>>>> We can't discuss such a huge change as an execution engine replacement
>>>>>>> with a description like:
>>>>>>> "No data co-location control, i.e. arbitrary data can be returned
>>>>>>> silently" or
>>>>>>> "Low control on how query executes internally, as a result we have
>>>>>>> limited possibility to implement improvements/fixes."
>>>>>>
>>>>>> Why not? Don't you understand these problems? Or don't you think this is
>>>>>> a problem?
>>>>>>
>>>>>>> Let's make these descriptions more specific.
>>>>>>
>>>>>> What do you mean by "more specific"? What are the criteria for a
>>>>>> specific description?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nikolay, Maxim, I understand that our arguments may not be as obvious
>>>>>> to you as they are to the SQL team. So, please arrange your questions in
>>>>>> a more constructive way.
>>>>>>
>>>>>> Thank you!