+1

Thank you, Pan, for the summary.


kent

Cheng Pan <pan3...@gmail.com> wrote on Fri, Dec 1, 2023 at 15:42:
>
> After additional offline discussion and feedback from other Kyuubi developers, 
> I understand that the major concern here is that users/developers may not be able to 
> distinguish these two kinds of artifacts because the same word “shaded” is 
> used in the package name.
>
> Given that, I propose
>
> 1. Use a different word, “relocated”, for artifacts intended for Kyuubi internal usage, 
> so the new pattern is `kyuubi-relocated-<third_party_name>`, and the proposed 
> jar in this thread will be `kyuubi-relocated-hive-service-rpc` (I will also rename 
> the existing `kyuubi-shaded-zookeeper-*`).
> 2. Enrich the description in the module's pom; the description will be 
> shown on the Maven Central page[1]. E.g. the current description of 
> `kyuubi-hive-jdbc-shaded` is "Kyuubi Project Hive JDBC Shaded Client"; I 
> propose to change it to "Kyuubi Hive JDBC Driver with dependencies shaded". 
> The current description of `kyuubi-shaded-zookeeper-34` is "Kyuubi Shaded 
> ZooKeeper 34"; I propose to change it to "Relocated Zookeeper 3.4 classes 
> used by Kyuubi internally."
>
> [1] 
> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
>
> Thanks,
> Cheng Pan
>
>
> > On Nov 30, 2023, at 14:46, Cheng Pan <pan3...@gmail.com> wrote:
> >
> >> You completely misunderstood what I meant.
> >
> > I don’t get your point. I tried my best to answer/explain each of your 
> > questions/concerns.
> >
> >> kyuubi-shaded-hive-service-rpc,
> >> kyuubi-hive-service-rpc-shaded
> >>
> >> I just wanted to know why these two kinds of naming policies can make
> >> both end-users and kyuubi developers happy?
> >
> > They just follow common practice; I believe I have listed sufficient 
> > examples to show that.
> >
> >>> Personally, I think it’s a quite common practice, especially in Apache 
> >>> projects, thus it’s obvious to me, for example:
> >>>
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
> >>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
> >
> >>> Also, I just follow the most popular name pattern 
> >>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded 
> >>> artifacts.
> >
> >
> > ====================================================================================================
> >
> >> Does this mean that if an end user uses kyuubi-hive-jdbc-shaded as a JDBC
> >> Driver, he/she cannot use kyuubi-shaded-hive-service-rpc
> >> as a thrift client?
> >
> >
> > We encourage users to use `kyuubi-hive-jdbc-shaded` and do not recommend 
> > using `kyuubi-shaded-hive-service-rpc`. 
> > `kyuubi-shaded-hive-service-rpc` is designed for Kyuubi project internal 
> > usage, and we should not expose those shaded classes in the public API. But as I 
> > said before:
> >
> >> Technically, we can not stop the user from using such shaded API directly.
> >
> > Thanks,
> > Cheng Pan
> >
> >
> >> On Nov 30, 2023, at 14:34, Kent Yao <y...@apache.org> wrote:
> >>
> >> Hi Pan,
> >>
> >> You completely misunderstood what I meant.
> >>
> >> kyuubi-shaded-hive-service-rpc,
> >> kyuubi-hive-service-rpc-shaded
> >>
> >> I just wanted to know why these two kinds of naming policies can make
> >> both end-users and kyuubi developers happy?
> >>
> >> Why is even this necessary?
> >>
> >> Does this mean that if an end user uses kyuubi-hive-jdbc-shaded as a JDBC
> >> Driver, he/she cannot use kyuubi-shaded-hive-service-rpc
> >> as a thrift client?
> >>
> >> Cheng Pan <pan3...@gmail.com> wrote on Thu, Nov 30, 2023 at 11:25:
> >>>
> >>>
> >>>>> The artifacts ending with “-shaded” are for end users.
> >>>>
> >>>> This looks like an explanation for mistakes we've already made.
> >>>
> >>> I don’t think they're mistakes, only facts. In practice, there are 
> >>> various ways to name this type of jar; we just picked one of them.
> >>>
> >>> There are some examples:
> >>>
> >>> - 
> >>> https://mvnrepository.com/artifact/com.datastax.oss/java-driver-core-shaded
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.zeppelin/zeppelin-interpreter-shaded
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-runtime-3.3
> >>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-api
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-runtime
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kinesis-asl-assembly
> >>>
> >>> Currently, Kyuubi has two such modules
> >>>
> >>> - kyuubi-hive-jdbc-shaded
> >>> - kyuubi-spark-authz-shaded
> >>>
> >>> The main purpose of this kind of artifact is to assemble all transitive 
> >>> dependencies into one jar, usually with proper third-party class 
> >>> relocation but WITHOUT changing any public API. E.g. for the JDBC driver, 
> >>> the public API is the JDBC API. It also means that the user can migrate from 
> >>> `A` to `A-shaded` smoothly if only the public API is consumed.
> >>>
> >>>> I've never heard of such a policy being discussed or documented.
> >>>
> >>> There hasn't been any specific topic of discussion about name policy 
> >>> before. It's impossible to discuss everything in detail. If someone 
> >>> thinks something matters, then we should discuss it.
> >>>
> >>>> Furthermore, since we've already posted them publicly in the maven 
> >>>> repository,
> >>>> how can we decide that they won't be used by end-users?
> >>>
> >>> Technically, we cannot stop the user from using such shaded API 
> >>> directly. Personally, I think it’s quite a common practice, especially in 
> >>> Apache projects, so it’s obvious to me. For example:
> >>>
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
> >>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
> >>> - 
> >>> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
> >>>
> >>> Also, I just follow the most popular name pattern 
> >>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded 
> >>> artifacts.
> >>>
> >>>> Did we explicitly remove the public APIs from the original artifact?
> >>>
> >>>
> >>> No. In most cases, it is just a simple relocation and repackaging of the original 
> >>> jar, with class overwriting if necessary.
> >>>
> >>> Thanks,
> >>> Cheng Pan
> >>>
> >>>
> >>>> On Nov 29, 2023, at 11:13, Kent Yao <y...@apache.org> wrote:
> >>>>
> >>>>> The artifacts ending with “-shaded” are for end users.
> >>>>
> >>>> This looks like an explanation for mistakes we've already made.
> >>>>
> >>>> I've never heard of such a policy being discussed or documented.
> >>>>
> >>>> Furthermore, since we've already posted them publicly in the maven 
> >>>> repository,
> >>>> how can we decide that they won't be used by end-users? Did we 
> >>>> explicitly remove
> >>>> the public APIs from the original artifact?
> >>>>
> >>>> Kent
> >>>>
> >>>> Cheng Pan <pan3...@gmail.com> wrote on Wed, Nov 29, 2023 at 10:25:
> >>>>>
> >>>>> Kent
> >>>>>
> >>>>> They are for different purposes.
> >>>>>
> >>>>> The artifacts starting with “kyuubi-shaded” are for the Kyuubi project's 
> >>>>> internal usage; see [1] and [2].
> >>>>>
> >>>>> The artifacts ending with “-shaded” are for end users.
> >>>>>
> >>>>> [1] https://github.com/apache/kyuubi-shaded
> >>>>> [2] 
> >>>>> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-shaded-zookeeper-34
> >>>>>
> >>>>> Thanks,
> >>>>> Cheng Pan
> >>>>>
> >>>>>
> >>>>>> On Nov 28, 2023, at 18:36, Kent Yao <y...@apache.org> wrote:
> >>>>>>
> >>>>>> -1 for the arbitrary naming
> >>>>>>
> >>>>>>
> >>>>>> FYI, 
> >>>>>> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
> >>>>>>
> >>>>>> XiDuo You <ulyssesyo...@gmail.com> wrote on Tue, Nov 28, 2023 at 18:03:
> >>>>>>>
> >>>>>>> +1, looks fine to me
> >>>>>>>
> >>>>>>> Zhen Wang <wangz...@apache.org> wrote on Tue, Nov 28, 2023 at 12:53:
> >>>>>>>>
> >>>>>>>> +1, I like this idea, it avoids the possibility of engine thrift 
> >>>>>>>> class
> >>>>>>>> conflicts in different environments.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Kind Regards,
> >>>>>>>> Zhen Wang
> >>>>>>>>
> >>>>>>>> Cheng Pan <pan3...@gmail.com> wrote on Tue, Nov 28, 2023 at 01:00:
> >>>>>>>>>
> >>>>>>>>> Hi Kyuubi developers,
> >>>>>>>>>
> >>>>>>>>> Recently, I've been looking at the code of the Hive engine. One of my
> >>>>>>>>> goals is to make it support multiple versions of the Hive runtime,
> >>>>>>>>> including:
> >>>>>>>>> - 3.1.3, which is the latest stable version of Apache Hive, already 
> >>>>>>>>> supported
> >>>>>>>>> - 2.3.9, which is the latest stable version of Apache Hive 2.x and is
> >>>>>>>>> widely adopted, including by Apache Spark and Apache Flink
> >>>>>>>>> - 2.1.1-cdh6.3.2, the latest free version of CDH, has lots of users
> >>>>>>>>>
> >>>>>>>>> When I tried to run the Hive engine built against Hive 3.x with the Hive
> >>>>>>>>> 2.3.9/2.1.1-cdh6.3.2 runtime, I encountered some thrift class conflict
> >>>>>>>>> issues which were hard to resolve, so I propose to create a
> >>>>>>>>> pre-shaded hive-service-rpc to tackle such issues.
> >>>>>>>>>
> >>>>>>>>> I have created two PRs[1][2] to demonstrate and verify my idea. It
> >>>>>>>>> also brings additional benefits, e.g. speeding up the engine
> >>>>>>>>> packaging by reducing the number of relocated classes.
> >>>>>>>>>
> >>>>>>>>> Looking forward to the community feedback and PR reviews.
> >>>>>>>>>
> >>>>>>>>> Once the idea is accepted, and [1] gets merged, I will start the
> >>>>>>>>> kyuubi-shaded release voting.
> >>>>>>>>>
> >>>>>>>>> [1] https://github.com/apache/kyuubi-shaded/pull/20
> >>>>>>>>> [2] https://github.com/apache/kyuubi/pull/5783
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Cheng Pan
> >>>>>
> >>>
> >
>
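
For readers skimming the thread, the naming convention proposed at the top can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `classify_artifact` and its return labels are hypothetical names, not part of Kyuubi, and the rules simply encode the prefixes and suffixes mentioned in the discussion.

```python
def classify_artifact(artifact_id: str) -> str:
    """Classify a Kyuubi artifactId by the naming convention discussed in the thread.

    Hypothetical helper for illustration only; not part of any Kyuubi API.
    """
    # Proposed new pattern for artifacts meant for Kyuubi internal usage.
    if artifact_id.startswith("kyuubi-relocated-"):
        return "internal"
    # Existing internal artifacts that the proposal would rename.
    if artifact_id.startswith("kyuubi-shaded-"):
        return "internal (legacy name)"
    # Artifacts ending with "-shaded" are the ones meant for end users.
    if artifact_id.endswith("-shaded"):
        return "end-user"
    return "unclassified"


if __name__ == "__main__":
    for name in (
        "kyuubi-relocated-hive-service-rpc",
        "kyuubi-shaded-zookeeper-34",
        "kyuubi-hive-jdbc-shaded",
        "kyuubi-spark-authz-shaded",
    ):
        print(name, "->", classify_artifact(name))
```

The prefix checks run before the suffix check, so a name that begins with `kyuubi-shaded-` is always treated as internal, which mirrors the distinction drawn in the thread.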
