+1 Thank you, Pan, for the summarization.
Kent

Cheng Pan <pan3...@gmail.com> wrote on Fri, Dec 1, 2023 at 15:42:
>
> With additional offline discussion and feedback from other Kyuubi developers,
> I understand the major concern here is that the user/developer may not be
> able to distinguish these two kinds of artifacts because the same word
> "shaded" is used in the package name.
>
> Given that, I propose
>
> 1. use a different word "relocated" for artifacts for Kyuubi internal usage,
>    so the new pattern is `kyuubi-relocated-<third_party_name>`, and the
>    proposed jar in this thread will be `kyuubi-relocated-hive-service-rpc`
>    (I will rename the existing `kyuubi-shaded-zookeeper-*`)
> 2. enrich the description in the module's pom, and the description will be
>    present on the Maven Central page[1], e.g. the current description of
>    `kyuubi-hive-jdbc-shaded` is "Kyuubi Project Hive JDBC Shaded Client", I
>    propose to change it to "Kyuubi Hive JDBC Driver with dependencies shaded";
>    and the current description of `kyuubi-shaded-zookeeper-34` is "Kyuubi
>    Shaded ZooKeeper 34", I propose to change it to "Relocated Zookeeper 3.4
>    classes used by Kyuubi internally."
>
> [1] https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
>
> Thanks,
> Cheng Pan
>
>
>
> On Nov 30, 2023, at 14:46, Cheng Pan <pan3...@gmail.com> wrote:
> >
> >> You completely misunderstood what I meant.
> >
> > I don’t get your point, I tried my best to answer/explain each of your
> > questions/concerns.
> >
> >> kyuubi-shaded-hive-service-rpc,
> >> kyuubi-hive-service-rpc-shaded
> >>
> >> I just wanted to know why these two kinds of naming policies can make
> >> both end-users and kyuubi developers happy?
> >
> > Just follow the common practice, I believe I have listed sufficient
> > examples to prove that.
> >
> >>> Personally, I think it’s a quite common practice, especially in Apache
> >>> projects, thus it’s obvious to me, for example:
> >>>
> >>> - https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
> >>> - https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
> >>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
> >>> - https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
> >
> >>> Also, I just follow the most popular name pattern
> >>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded
> >>> artifacts.
> >
> >
> > ====================================================================================================
> >
> >> Is it said that if an end user uses kyuubi-hive-jdbc-shaded as a JDBC
> >> Driver, he/she could not use kyuubi-shaded-hive-service-rpc as
> >> a thrift client?
> >
> >
> > We encourage users to use `kyuubi-hive-jdbc-shaded` and do not recommend
> > users to use `kyuubi-shaded-hive-service-rpc`.
> > `kyuubi-shaded-hive-service-rpc` is designed for Kyuubi project internal
> > usage, we should not expose those shaded classes to public API. But as I
> > said before
> >
> >> Technically, we can not stop the user from using such shaded API directly.
> >
> > Thanks,
> > Cheng Pan
> >
> >
> >> On Nov 30, 2023, at 14:34, Kent Yao <y...@apache.org> wrote:
> >>
> >> Hi Pan,
> >>
> >> You completely misunderstood what I meant.
> >>
> >> kyuubi-shaded-hive-service-rpc,
> >> kyuubi-hive-service-rpc-shaded
> >>
> >> I just wanted to know why these two kinds of naming policies can make
> >> both end-users and kyuubi developers happy?
> >>
> >> Why is this even necessary?
> >>
> >> Is it said that if an end user uses kyuubi-hive-jdbc-shaded as a JDBC
> >> Driver, he/she could not use kyuubi-shaded-hive-service-rpc as
> >> a thrift client?
> >>
> >> Cheng Pan <pan3...@gmail.com> wrote on Thu, Nov 30, 2023 at 11:25:
> >>>
> >>>
> >>>>> The artifacts ending with “-shaded” are for end users.
> >>>>
> >>>> This looks like an explanation for mistakes we've already made.
> >>>
> >>> I don’t think they're mistakes, only facts. In practice, there are
> >>> various ways to name this type of jar, we just picked one of them.
> >>>
> >>> There are some examples:
> >>>
> >>> - https://mvnrepository.com/artifact/com.datastax.oss/java-driver-core-shaded
> >>> - https://mvnrepository.com/artifact/org.apache.zeppelin/zeppelin-interpreter-shaded
> >>> - https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-runtime-3.3
> >>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-api
> >>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-runtime
> >>> - https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kinesis-asl-assembly
> >>>
> >>> Currently, Kyuubi has two such modules
> >>>
> >>> - kyuubi-hive-jdbc-shaded
> >>> - kyuubi-spark-authz-shaded
> >>>
> >>> The main purpose of this kind of artifact is to assemble all transitive
> >>> dependencies into one jar, usually with proper third-party class
> >>> relocation but WITHOUT changing any public API, e.g. for the JDBC driver,
> >>> the public API is JDBC API, it also means that the user can migrate from
> >>> `A` to `A-shaded` smoothly if only public API is consumed.
> >>>
> >>>> I've never heard of such a policy being discussed or documented.
> >>>
> >>> There hasn't been any specific topic of discussion about name policy
> >>> before. It's impossible to discuss everything in detail. If someone
> >>> thinks something matters, then we should discuss it.
> >>>
> >>>> Furthermore, since we've already posted them publicly in the maven
> >>>> repository,
> >>>> how can we decide that they won't be used by end-users?
> >>>
> >>> Technically, we can not stop the user from using such shaded API
> >>> directly. Personally, I think it’s a quite common practice, especially in
> >>> Apache projects, thus it’s obvious to me, for example:
> >>>
> >>> - https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
> >>> - https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
> >>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
> >>> - https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
> >>>
> >>> Also, I just follow the most popular name pattern
> >>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded
> >>> artifacts.
> >>>
> >>>> Did we explicitly remove the public APIs from the original artifact?
> >>>
> >>>
> >>> No, in most cases, just simple relocation and repackaging of the original
> >>> jar, with class overwriting if necessary.
> >>>
> >>> Thanks,
> >>> Cheng Pan
> >>>
> >>>
> >>>> On Nov 29, 2023, at 11:13, Kent Yao <y...@apache.org> wrote:
> >>>>
> >>>>> The artifacts ending with “-shaded” are for end users.
> >>>>
> >>>> This looks like an explanation for mistakes we've already made.
> >>>>
> >>>> I've never heard of such a policy being discussed or documented.
> >>>>
> >>>> Furthermore, since we've already posted them publicly in the maven
> >>>> repository, how can we decide that they won't be used by end-users?
> >>>> Did we explicitly remove the public APIs from the original artifact?
> >>>>
> >>>> Kent
> >>>>
> >>>> Cheng Pan <pan3...@gmail.com> wrote on Wed, Nov 29, 2023 at 10:25:
> >>>>>
> >>>>> Kent
> >>>>>
> >>>>> They are for different purposes.
> >>>>>
> >>>>> The artifacts starting with “kyuubi-shaded” are for the Kyuubi
> >>>>> project's internal usage, see [1] and [2]
> >>>>>
> >>>>> The artifacts ending with “-shaded” are for end users.
> >>>>>
> >>>>> [1] https://github.com/apache/kyuubi-shaded
> >>>>> [2] https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-shaded-zookeeper-34
> >>>>>
> >>>>> Thanks,
> >>>>> Cheng Pan
> >>>>>
> >>>>>
> >>>>>> On Nov 28, 2023, at 18:36, Kent Yao <y...@apache.org> wrote:
> >>>>>>
> >>>>>> -1 for the arbitrary naming
> >>>>>>
> >>>>>>
> >>>>>> FYI,
> >>>>>> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
> >>>>>>
> >>>>>> XiDuo You <ulyssesyo...@gmail.com> wrote on Tue, Nov 28, 2023 at 18:03:
> >>>>>>>
> >>>>>>> +1, looks fine to me
> >>>>>>>
> >>>>>>> Zhen Wang <wangz...@apache.org> wrote on Tue, Nov 28, 2023 at 12:53:
> >>>>>>>>
> >>>>>>>> +1, I like this idea, it avoids the possibility of engine thrift
> >>>>>>>> class conflicts in different environments.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Kind Regards,
> >>>>>>>> Zhen Wang
> >>>>>>>>
> >>>>>>>> Cheng Pan <pan3...@gmail.com> wrote on Tue, Nov 28, 2023 at 01:00:
> >>>>>>>>>
> >>>>>>>>> Hi Kyuubi developers,
> >>>>>>>>>
> >>>>>>>>> Recently, I've been looking at the code of the Hive engine, one of
> >>>>>>>>> my goals is to make it support multiple versions of Hive runtime,
> >>>>>>>>> including
> >>>>>>>>> - 3.1.3, which is the latest stable version of Apache Hive, already
> >>>>>>>>>   supported
> >>>>>>>>> - 2.3.9, which is the latest stable version of Apache Hive 2.x, and
> >>>>>>>>>   is widely adopted, including by Apache Spark and Apache Flink
> >>>>>>>>> - 2.1.1-cdh6.3.2, the latest free version of CDH, has lots of users
> >>>>>>>>>
> >>>>>>>>> When I tried to run the Hive engine built against Hive 3.x with Hive
> >>>>>>>>> 2.3.9/2.1.1-cdh6.3.2 runtime, I encountered some thrift class
> >>>>>>>>> conflict issues which were hard to resolve, thus I propose to create
> >>>>>>>>> a pre-shaded hive-service-rpc to tackle such issues.
> >>>>>>>>>
> >>>>>>>>> I have created two PRs[1][2] to demonstrate and verify my idea, it
> >>>>>>>>> also introduces additional benefits, e.g. speed up the engine
> >>>>>>>>> packaging by reducing relocation classes.
> >>>>>>>>>
> >>>>>>>>> Looking forward to the community feedback and PR reviews.
> >>>>>>>>>
> >>>>>>>>> Once the idea is accepted, and [1] gets merged, I will start the
> >>>>>>>>> kyuubi-shaded release voting.
> >>>>>>>>>
> >>>>>>>>> [1] https://github.com/apache/kyuubi-shaded/pull/20
> >>>>>>>>> [2] https://github.com/apache/kyuubi/pull/5783
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Cheng Pan
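
For readers skimming the thread, the naming convention proposed in the Dec 1 message might be sketched roughly as below. This is only an illustration, not code from Kyuubi: the function name `classify_artifact` and the labels "internal", "end-user" and "unclassified" are hypothetical, and the rules assume the proposal is adopted as written (prefix `kyuubi-relocated-` for internal artifacts, suffix `-shaded` for end-user artifacts).

```python
def classify_artifact(artifact_id: str) -> str:
    """Classify an artifact ID by the naming convention proposed in the thread.

    Hypothetical helper for illustration only; not part of Kyuubi.
    """
    # Proposed prefix for third-party classes relocated for Kyuubi internal usage.
    if artifact_id.startswith("kyuubi-relocated-"):
        return "internal"
    # Existing suffix for end-user artifacts that bundle shaded dependencies.
    if artifact_id.startswith("kyuubi-") and artifact_id.endswith("-shaded"):
        return "end-user"
    # Anything else, including the legacy `kyuubi-shaded-*` names that the
    # proposal intends to rename, is left unclassified by this sketch.
    return "unclassified"


if __name__ == "__main__":
    for name in (
        "kyuubi-relocated-hive-service-rpc",
        "kyuubi-hive-jdbc-shaded",
        "kyuubi-spark-authz-shaded",
        "kyuubi-shaded-zookeeper-34",
    ):
        print(f"{name} -> {classify_artifact(name)}")
```

Under these assumptions the two kinds of artifacts no longer share the word "shaded", so a simple prefix/suffix check is enough to tell them apart.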