Hi all,

Thanks for opening this discussion. I agree with Francesco that we should go
for the best solution instead of another band-aid or intermediate solution.
I think there's actually consensus for that, given the argument provided by
Yuxia:
> The first way is the ideal way and we should go in that direction. But it
> will take much effort, as it requires rewriting all the code for the Hive
> dialect, and it's hard to do in one shot. And given we want to move out the
> Hive connector in 1.16, it's more practical to decouple first and then
> migrate to the Operation tree.

Why should we first invest time in another intermediate solution and then
spend time afterwards to actually get to the proper solution? I would
propose to:

- Remove the Hive connector for versions 1.*, 2.1.* and 2.2.* in Flink 1.16
- Upgrade to the latest supported Hive 2.3.* and 3.1.* in Flink 1.16
- Get as much work done as possible to decouple the Hive connector from
  Flink in 1.16

There's customer value in this approach, because we'll support the latest
supported Hive versions. If customers need support for older Hive versions,
they can still use older Flink versions. There is also community value in
this approach, because we're actively working on making our codebase
maintainable.

If the entire decoupling is not possible, then we won't move the Hive 2.3.*
and 3.1.* connector out in Flink 1.16, but in Flink 1.17. Support for the
new Hive version 4.* could then also be added in Flink 1.17 (we should not
add support for newer versions of Hive until this decoupling has been
finished).

Best regards,

Martijn Visser
https://twitter.com/MartijnVisser82
https://github.com/MartijnVisser

On Wed, 30 Mar 2022 at 17:07, Francesco Guardiani <france...@ververica.com>
wrote:

> Sorry, I replied on the wrong thread; I repost my answer here :)
>
> As there was already a discussion in the doc, I'll just summarize my
> opinions here on the proposed execution of this FLIP.
>
> I think we should avoid exposing internal details, which I consider
> Calcite to be part of, and instead reuse what we already have to define an
> AST from the Table API, which is what I'll refer to in this mail as the
> Operation tree.
>
> First of all, the reason I think this FLIP is not a good idea is that it
> proposes to expose types out of our control, that is, an API we cannot
> control and may realistically never be able to stabilize. A Calcite bump
> in the table project is already pretty hard today, as shown by tasks like
> https://github.com/apache/flink/pull/13577, and this will make it even
> harder. Essentially it will couple us to Calcite even more, and create a
> different but still big maintenance/complexity burden of the kind we would
> like to get rid of with this FLIP.
>
> There are also some technical aspects that seem a bit overlooked here:
>
> * What about Scala? Is flink-table-planner-spi going to be a Scala module
> with the related suffix? I see you want to expose a couple of types which
> we have implemented in Scala right now, and making this module
> Scala-dependent makes it even more complicated to ship both the modules
> that use it and flink-table-planner-loader.
> * Are you sure exposing the Calcite interfaces is going to be enough?
> Don't you also require some instance-specific methods, e.g.
> FlinkTypeFactory#toLogicalType? What if at some point you need to expose
> something like FlinkTypeFactory? How do you plan to support it and
> stabilize it in the long term?
>
> Now let me talk a bit about the Operation tree. For those who don't know
> what it is, it's the pure Flink AST for defining DML, used for converting
> the Table API DSL into an AST the planner can manipulate. Essentially,
> it's our own version of the RelNode tree/RexNode tree. This Operation tree
> can already be used by Hive, without any API changes on the Table side.
> You just need a downcast of TableEnvironmentImpl to use getPlanner() and
> then use Planner#translate, or alternatively you can add getPlanner to
> TableEnvironmentInternal directly. From what I've seen of your use case,
> and please correct me if I'm wrong, you can implement your SQL ->
> Operation tree layer without substantial changes on both sides.
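> To make that concrete, here is a minimal sketch, assuming Flink 1.15-era
> internals (TableEnvironmentImpl, Planner#translate and ModifyOperation
> exist today; the "orders" table and the origin of the ModifyOperation are
> placeholders for whatever a Hive parser front end would produce):
>
>     import static org.apache.flink.table.api.Expressions.$;
>
>     import java.util.Collections;
>     import java.util.List;
>
>     import org.apache.flink.api.dag.Transformation;
>     import org.apache.flink.table.api.Table;
>     import org.apache.flink.table.api.TableEnvironment;
>     import org.apache.flink.table.api.internal.TableEnvironmentImpl;
>     import org.apache.flink.table.operations.ModifyOperation;
>     import org.apache.flink.table.operations.QueryOperation;
>
>     public final class OperationTreeSketch {
>
>         // The Table API already builds Operation trees today: every Table
>         // wraps a QueryOperation, Flink's own AST node. Assumes a table
>         // named "orders" is registered in the catalog.
>         static void printOperationTree(TableEnvironment env) {
>             Table t = env.from("orders").select($("amount"));
>             QueryOperation ast = t.getQueryOperation();
>             System.out.println(ast.asSummaryString());
>         }
>
>         // A dialect front end that produces ModifyOperations (e.g. a
>         // future Hive parser) can hand them straight to the planner; no
>         // Calcite types cross the module boundary.
>         static List<Transformation<?>> translate(TableEnvironment env,
>                 ModifyOperation op) {
>             TableEnvironmentImpl internal = (TableEnvironmentImpl) env;
>             return internal.getPlanner()
>                     .translate(Collections.singletonList(op));
>         }
>     }
>
> The downcast is not pretty, which is why exposing getPlanner on
> TableEnvironmentInternal may be the cleaner variant of the same idea.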
> The reason why I think this is a better idea, rather than exposing
> Calcite and RelNodes directly, is:
>
> * The aforementioned downsides of exposing Calcite
> * It doesn't require a new API to get started with
> * It doesn't add complexity on the planner side; it just removes the
> existing coupling with Hive
> * Letting another project use the Operation tree will harden it, make it
> more stable, and eventually lead to it becoming public
>
> The last point in particular is extremely interesting for the future of
> the project, as having a stable public Operation tree will allow people to
> implement other relational APIs on top of Flink SQL, or manipulate the AST
> to define new semantics, or even crazier things we can't think of right
> now, leading to a broader and more diverse ecosystem. That is exactly what
> Hive is doing right now, at the end of the day: defining a new relational
> API on top of the Flink Table planner's functionality.
>
> On Wed, Mar 30, 2022 at 4:45 PM 罗宇侠(莫辞)
> <luoyuxia.luoyu...@alibaba-inc.com.invalid> wrote:
>
> > Hi, I would like to explain a bit more about the current discussion[1]
> > of the ways to decouple the Hive connector from the Flink planner.
> >
> > The background is that, to support the Hive dialect, the Hive connector
> > depends on the Flink planner: the current implementation generates
> > RelNodes and then delivers those RelNodes to Flink. But this also brings
> > much complexity and a maintenance burden, so we want to decouple the
> > Hive connector from the Flink planner.
> >
> > There are two ways to do that:
> > 1. Make the Hive parser generate an Operation tree, just like the Table
> > API currently does.
> > 2. Introduce a slim module called flink-table-planner-spi that provides
> > the Calcite dependency and exposes a limited set of public interfaces.
> > The Hive connector would then keep generating Calcite RelNodes but only
> > require the slim module.
> >
> > The first way is the ideal way and we should go in that direction. But
> > it will take much effort, as it requires rewriting all the code for the
> > Hive dialect, and it's hard to do in one shot. And given we want to move
> > out the Hive connector in 1.16, it's more practical to decouple first
> > and then migrate to the Operation tree. So, FLIP-216[2] is for the
> > second way. It explains the public interfaces to be exposed, all of
> > which have already been implemented by PlannerContext.
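> > As a purely hypothetical illustration of the second way (the actual
> > interface list is in FLIP-216[2]; none of the names below are from the
> > FLIP), the slim module would expose something of this shape. Note how
> > the Calcite types RelNode and RelDataType leak into the public surface:
> >
> >     import org.apache.calcite.rel.RelNode;
> >     import org.apache.calcite.rel.type.RelDataType;
> >     import org.apache.calcite.tools.FrameworkConfig;
> >     import org.apache.flink.table.types.logical.LogicalType;
> >
> >     // Hypothetical sketch only; illustrative names, not the FLIP's API.
> >     public interface PlannerContextSpi {
> >         // Parser configuration the Hive dialect needs to build RelNodes.
> >         FrameworkConfig getFrameworkConfig();
> >
> >         // Type conversion in the spirit of FlinkTypeFactory#toLogicalType.
> >         LogicalType toLogicalType(RelDataType relDataType);
> >
> >         // Hand a finished RelNode plan back to the planner for execution.
> >         void submitRelNodePlan(RelNode plan);
> >     }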
> > [1]
> > https://docs.google.com/document/d/1LMQ_mWfB_mkYkEBCUa2DgCO2YdtiZV7YRs2mpXyjdP4/edit?usp=sharing
> > [2]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-216%3A+Decouple+Hive+connector+with+Flink+planner
> >
> > Best regards,
> > Yuxia
> > ------------------------------------------------------------------
> > From: 罗宇侠(莫辞) <luoyuxia.luoyu...@alibaba-inc.com.INVALID>
> > Date: 2022-03-25 20:41:15
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Subject: [DISCUSS] FLIP-216 Decouple Hive connector with Flink planner
> >
> > Hi everyone,
> >
> > I would like to open a discussion about decoupling the Hive connector
> > from the Flink table planner. It's a follow-up discussion after the Hive
> > syntax discussion[1], but it focuses only on how to decouple the Hive
> > connector. The original doc is here[2], in which you can see the
> > detailed work and the heated discussion about whether to expose Calcite
> > APIs or to reuse the Operation tree for decoupling. I have created
> > FLIP-216: Decouple Hive connector with Flink planner[3] for it.
> >
> > Thanks, and looking forward to a lively discussion!
> >
> > [1] https://lists.apache.org/thread/2w046dwl46tf2wy750gzmt0qrcz17z8t
> > [2]
> > https://docs.google.com/document/d/1LMQ_mWfB_mkYkEBCUa2DgCO2YdtiZV7YRs2mpXyjdP4/edit?usp=sharing
> > [3]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-216%3A+Decouple+Hive+connector+with+Flink+planner
> >
> > Best regards,
> > Yuxia