After additional offline discussion and feedback from other Kyuubi developers, I understand that the major concern here is that users and developers may not be able to distinguish these two kinds of artifacts, because the same word “shaded” is used in both artifact names.
Given that, I propose to:

1. Use a different word, “relocated”, for artifacts intended for Kyuubi internal usage, so the new pattern is `kyuubi-relocated-<third_party_name>`, and the jar proposed in this thread will be `kyuubi-relocated-hive-service-rpc` (I will also rename the existing `kyuubi-shaded-zookeeper-*`).
2. Enrich the description in each module's pom; the description is shown on the Maven Central page[1]. For example, the current description of `kyuubi-hive-jdbc-shaded` is "Kyuubi Project Hive JDBC Shaded Client", and I propose to change it to "Kyuubi Hive JDBC Driver with dependencies shaded"; the current description of `kyuubi-shaded-zookeeper-34` is "Kyuubi Shaded ZooKeeper 34", and I propose to change it to "Relocated ZooKeeper 3.4 classes used by Kyuubi internally."

[1] https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded

Thanks,
Cheng Pan

> On Nov 30, 2023, at 14:46, Cheng Pan <pan3...@gmail.com> wrote:
>
>> You completely misunderstood what I meant.
>
> I don’t get your point, I tried my best to answer/explain each of your
> questions/concerns.
>
>> kyuubi-shaded-hive-service-rpc,
>> kyuubi-hive-service-rpc-shaded
>>
>> I just wanted to know why these two kinds of naming policies can make
>> both end-users and kyuubi developers happy?
>
> Just follow the common practice, I believe I have listed sufficient examples
> to prove that.
>
>>> Personally, I think it’s a quite common practice, especially in Apache
>>> projects, thus it’s obvious to me, for example:
>>>
>>> - https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
>>> - https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
>>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
>>> - https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
>
>>> Also, I just follow the most popular name pattern
>>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded artifacts.
>
>
> ====================================================================================================
>
>> Is it said that if an end-users use kyuubi-hive-jdbc-shaded as a JDBC
>> Driver, he/she could not use kyuubi-shaded-hive-service-rpc
>> as a thrift client?
>
>
> We encourage users to use `kyuubi-hive-jdbc-shaded` and do not recommend
> users to use `kyuubi-shaded-hive-service-rpc`,
> `kyuubi-shaded-hive-service-rpc` is designed for Kyuubi project internal
> usage, we should not expose those shaded classes to public API. But as I said
> before
>
>> Technically, we can not stop the user from using such shaded API directly.
>
> Thanks,
> Cheng Pan
>
>
>> On Nov 30, 2023, at 14:34, Kent Yao <y...@apache.org> wrote:
>>
>> Hi Pan,
>>
>> You completely misunderstood what I meant.
>>
>> kyuubi-shaded-hive-service-rpc,
>> kyuubi-hive-service-rpc-shaded
>>
>> I just wanted to know why these two kinds of naming policies can make
>> both end-users and kyuubi developers happy?
>>
>> Why is even this necessary?
>>
>> Is it said that if an end-users use kyuubi-hive-jdbc-shaded as a JDBC
>> Driver, he/she could not use kyuubi-shaded-hive-service-rpc
>> as a thrift client?
>>
>> Cheng Pan <pan3...@gmail.com> wrote on Thu, Nov 30, 2023 at 11:25:
>>>
>>>
>>>>> The artifacts end with “-shaded” are for end users.
>>>>
>>>> This looks like an explanation for mistakes we've already made.
>>>
>>> I don’t think they're mistakes, only facts. In practice, there are various
>>> ways to name this type of jar, we just picked one of them.
>>>
>>> There are some examples:
>>>
>>> - https://mvnrepository.com/artifact/com.datastax.oss/java-driver-core-shaded
>>> - https://mvnrepository.com/artifact/org.apache.zeppelin/zeppelin-interpreter-shaded
>>> - https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-runtime-3.3
>>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-api
>>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-runtime
>>> - https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kinesis-asl-assembly
>>>
>>> Currently, Kyuubi has two such modules
>>>
>>> - kyuubi-hive-jdbc-shaded
>>> - kyuubi-spark-authz-shaded
>>>
>>> The main purpose of this kind of artifact is to assemble all transitive
>>> dependencies into one jar, usually with proper third-party class relocation
>>> but WITHOUT changing any public API, e.g. for the JDBC driver, the public
>>> API is JDBC API, it also means that the user can migrate from `A` to
>>> `A-shaded` smoothly if only public API is consumed.
>>>
>>>> I've never been heard of such policy being discussed, documented.
>>>
>>> There hasn't been any specific topic of discussion about name policy
>>> before. It's impossible to discuss everything in detail. If someone thinks
>>> something matters, then we should discuss it.
>>>
>>>> Furthermore, since we've already posted them publicly in the maven
>>>> repository, how can we decide that they won't be used by end-users?
>>>
>>> Technically, we can not stop the user from using such shaded API directly.
>>> Personally, I think it’s a quite common practice, especially in Apache
>>> projects, thus it’s obvious to me, for example:
>>>
>>> - https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
>>> - https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
>>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
>>> - https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
>>>
>>> Also, I just follow the most popular name pattern
>>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded artifacts.
>>>
>>>> Did we explicitly remove the public APIs from the original artifact?
>>>
>>>
>>> No, in most cases, just simple relocation and repackaging of the original
>>> jar, with class overwriting if necessary.
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>>
>>>> On Nov 29, 2023, at 11:13, Kent Yao <y...@apache.org> wrote:
>>>>
>>>>> The artifacts end with “-shaded” are for end users.
>>>>
>>>> This looks like an explanation for mistakes we've already made.
>>>>
>>>> I've never been heard of such policy being discussed, documented.
>>>>
>>>> Furthermore, since we've already posted them publicly in the maven
>>>> repository, how can we decide that they won't be used by end-users?
>>>> Did we explicitly remove the public APIs from the original artifact?
>>>>
>>>> Kent
>>>>
>>>> Cheng Pan <pan3...@gmail.com> wrote on Wed, Nov 29, 2023 at 10:25:
>>>>>
>>>>> Kent
>>>>>
>>>>> They are for different purposes.
>>>>>
>>>>> The artifacts start with “kyuubi-shaded” is for the Kyuubi project's
>>>>> internal usage, see [1] and [2]
>>>>>
>>>>> The artifacts end with “-shaded” are for end users.
>>>>>
>>>>> [1] https://github.com/apache/kyuubi-shaded
>>>>> [2] https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-shaded-zookeeper-34
>>>>>
>>>>> Thanks,
>>>>> Cheng Pan
>>>>>
>>>>>
>>>>>> On Nov 28, 2023, at 18:36, Kent Yao <y...@apache.org> wrote:
>>>>>>
>>>>>> -1 for the arbitrary naming
>>>>>>
>>>>>>
>>>>>> FYI,
>>>>>> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
>>>>>>
>>>>>> XiDuo You <ulyssesyo...@gmail.com> wrote on Tue, Nov 28, 2023 at 18:03:
>>>>>>>
>>>>>>> +1, looks fine to me
>>>>>>>
>>>>>>> Zhen Wang <wangz...@apache.org> wrote on Tue, Nov 28, 2023 at 12:53:
>>>>>>>>
>>>>>>>> +1, I like this idea, it avoids the possibility of engine thrift class
>>>>>>>> conflicts in different environments.
>>>>>>>>
>>>>>>>>
>>>>>>>> Kind Regards,
>>>>>>>> Zhen Wang
>>>>>>>>
>>>>>>>> Cheng Pan <pan3...@gmail.com> wrote on Tue, Nov 28, 2023 at 01:00:
>>>>>>>>>
>>>>>>>>> Hi Kyuubi developers,
>>>>>>>>>
>>>>>>>>> Recently, I've been looking at the code of the Hive engine, one of my
>>>>>>>>> goals is to make it support multiple versions of Hive runtime,
>>>>>>>>> including
>>>>>>>>> - 3.1.3, which is the latest stable version of Apache Hive, already
>>>>>>>>> supported
>>>>>>>>> - 2.3.9, which is the latest stable version of Apache Hive 2.x, is
>>>>>>>>> adopted widely, including Apache Spark, Apache Flink
>>>>>>>>> - 2.1.1-cdh6.3.2, the latest free version of CDH, has lots of users
>>>>>>>>>
>>>>>>>>> When I tried to run the Hive engine built against Hive 3.x with Hive
>>>>>>>>> 2.3.9/2.1.1-cdh6.3.2 runtime, I encountered some thrift class conflict
>>>>>>>>> issues which were hard to resolve, thus I propose to create a
>>>>>>>>> pre-shaded hive-service-rpc to tackle such issues.
>>>>>>>>>
>>>>>>>>> I have created two PRs[1][2] to demonstrate and verify my idea, it
>>>>>>>>> also introduces additional benefits, e.g. speed up the engine
>>>>>>>>> packaging by reducing relocation classes.
>>>>>>>>>
>>>>>>>>> Looking forward to the community feedback and PR reviews.
>>>>>>>>>
>>>>>>>>> Once the idea is accepted, and [1] gets merged, I will start the
>>>>>>>>> kyuubi-shaded release voting.
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/kyuubi-shaded/pull/20
>>>>>>>>> [2] https://github.com/apache/kyuubi/pull/5783
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Cheng Pan
>>>>>
>>>
>