After additional offline discussion and feedback from other Kyuubi developers, I 
understand that the major concern here is that users and developers may not be able to 
distinguish these two kinds of artifacts, because the same word “shaded” is used 
in both artifact names.

Given that, I propose

1. use a different word, “relocated”, for artifacts intended for Kyuubi internal usage, so 
the new pattern is `kyuubi-relocated-<third_party_name>`, and the jar proposed 
in this thread will be `kyuubi-relocated-hive-service-rpc` (I will also rename the 
existing `kyuubi-shaded-zookeeper-*`)
2. enrich the description in each module's pom, since the description is 
shown on the Maven Central page[1]. For example, the current description of 
`kyuubi-hive-jdbc-shaded` is "Kyuubi Project Hive JDBC Shaded Client", which I 
propose to change to "Kyuubi Hive JDBC Driver with dependencies shaded"; and 
the current description of `kyuubi-shaded-zookeeper-34` is "Kyuubi Shaded 
ZooKeeper 34", which I propose to change to "Relocated Zookeeper 3.4 classes used 
by Kyuubi internally."
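
As a rough sketch of what (1) and (2) could look like together for the 
ZooKeeper module, the pom fragment below uses an artifactId derived from the 
proposed pattern and the description text from this mail. The `<name>` value 
and the omitted parent/version elements are assumptions for illustration, not 
the actual content of the kyuubi-shaded repository.

```xml
<!-- Illustrative fragment only; parent, version, and build sections are omitted. -->
<project>
  <modelVersion>4.0.0</modelVersion>

  <!-- artifactId derived from the proposed kyuubi-relocated-<third_party_name> pattern -->
  <artifactId>kyuubi-relocated-zookeeper-34</artifactId>

  <!-- hypothetical display name, not taken from the existing pom -->
  <name>Kyuubi Relocated ZooKeeper 3.4</name>

  <!-- proposed description, shown on the artifact's repository page -->
  <description>Relocated Zookeeper 3.4 classes used by Kyuubi internally.</description>
</project>
```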

[1] https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded

Thanks,
Cheng Pan


> On Nov 30, 2023, at 14:46, Cheng Pan <pan3...@gmail.com> wrote:
> 
>> You completely misunderstood what I meant.
> 
> I don’t get your point; I tried my best to answer/explain each of your 
> questions/concerns.
> 
>> kyuubi-shaded-hive-service-rpc,
>> kyuubi-hive-service-rpc-shaded
>> 
>> I just wanted to know why these two kinds of naming policies can make
>> both end-users and kyuubi developers happy?
> 
> It just follows the common practice; I believe I have listed sufficient examples 
> to prove that.
> 
>>> Personally, I think it’s a quite common practice, especially in Apache 
>>> projects, thus it’s obvious to me, for example:
>>> 
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
>>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
> 
>>> Also, I just follow the most popular name pattern 
>>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded artifacts.
> 
> 
> ====================================================================================================
> 
>> Does it mean that if an end user uses kyuubi-hive-jdbc-shaded as a JDBC
>> Driver, he/she cannot use kyuubi-shaded-hive-service-rpc
>> as a thrift client?
> 
> 
> We encourage users to use `kyuubi-hive-jdbc-shaded` and do not recommend 
> using `kyuubi-shaded-hive-service-rpc`. 
> `kyuubi-shaded-hive-service-rpc` is designed for Kyuubi project internal 
> usage, and we should not expose those shaded classes in public APIs. But as I said 
> before:
> 
>> Technically, we can not stop the user from using such shaded API directly.
> 
> Thanks,
> Cheng Pan
> 
> 
>> On Nov 30, 2023, at 14:34, Kent Yao <y...@apache.org> wrote:
>> 
>> Hi Pan,
>> 
>> You completely misunderstood what I meant.
>> 
>> kyuubi-shaded-hive-service-rpc,
>> kyuubi-hive-service-rpc-shaded
>> 
>> I just wanted to know why these two kinds of naming policies can make
>> both end-users and kyuubi developers happy?
>> 
>> Why is even this necessary?
>> 
>> Does it mean that if an end user uses kyuubi-hive-jdbc-shaded as a JDBC
>> Driver, he/she cannot use kyuubi-shaded-hive-service-rpc
>> as a thrift client?
>> 
>> On Thu, Nov 30, 2023 at 11:25, Cheng Pan <pan3...@gmail.com> wrote:
>>> 
>>> 
>>>>> The artifacts ending with “-shaded” are for end users.
>>>> 
>>>> This looks like an explanation for mistakes we've already made.
>>> 
>>> I don’t think they're mistakes, only facts. In practice, there are various 
>>> ways to name this type of jar; we just picked one of them.
>>> 
>>> There are some examples:
>>> 
>>> - 
>>> https://mvnrepository.com/artifact/com.datastax.oss/java-driver-core-shaded
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.zeppelin/zeppelin-interpreter-shaded
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-runtime-3.3
>>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-api
>>> - https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client-runtime
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kinesis-asl-assembly
>>> 
>>> Currently, Kyuubi has two such modules
>>> 
>>> - kyuubi-hive-jdbc-shaded
>>> - kyuubi-spark-authz-shaded
>>> 
>>> The main purpose of this kind of artifact is to assemble all transitive 
>>> dependencies into one jar, usually with proper third-party class relocation 
>>> but WITHOUT changing any public API. For example, for the JDBC driver the 
>>> public API is the JDBC API. It also means that users can migrate from `A` to 
>>> `A-shaded` smoothly if only the public API is consumed.
>>> 
>>>> I've never heard of such a policy being discussed or documented.
>>> 
>>> There hasn't been any specific discussion about a naming policy 
>>> before. It's impossible to discuss everything in detail. If someone thinks 
>>> something matters, then we should discuss it.
>>> 
>>>> Furthermore, since we've already posted them publicly in the maven 
>>>> repository,
>>>> how can we decide that they won't be used by end-users?
>>> 
>>> Technically, we can not stop the user from using such shaded API directly. 
>>> Personally, I think it’s a quite common practice, especially in Apache 
>>> projects, thus it’s obvious to me, for example:
>>> 
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.hbase.thirdparty/hbase-shaded-netty
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.hadoop.thirdparty/hadoop-shaded-guava
>>> - https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson
>>> - 
>>> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-bundled-guava
>>> 
>>> Also, I just follow the most popular name pattern 
>>> `<project_name>-shaded-<third_party_name>` to name Kyuubi shaded artifacts.
>>> 
>>>> Did we explicitly remove the public APIs from the original artifact?
>>> 
>>> 
>>> No. In most cases, it is just a simple relocation and repackaging of the 
>>> original jar, with class overwriting if necessary.
>>> 
>>> Thanks,
>>> Cheng Pan
>>> 
>>> 
>>>> On Nov 29, 2023, at 11:13, Kent Yao <y...@apache.org> wrote:
>>>> 
>>>>> The artifacts ending with “-shaded” are for end users.
>>>> 
>>>> This looks like an explanation for mistakes we've already made.
>>>> 
>>>> I've never heard of such a policy being discussed or documented.
>>>> 
>>>> Furthermore, since we've already posted them publicly in the maven 
>>>> repository,
>>>> how can we decide that they won't be used by end-users? Did we explicitly 
>>>> remove
>>>> the public APIs from the original artifact?
>>>> 
>>>> Kent
>>>> 
>>>> On Wed, Nov 29, 2023 at 10:25, Cheng Pan <pan3...@gmail.com> wrote:
>>>>> 
>>>>> Kent
>>>>> 
>>>>> They are for different purposes.
>>>>> 
>>>>> The artifacts starting with “kyuubi-shaded” are for the Kyuubi project's 
>>>>> internal usage; see [1] and [2].
>>>>> 
>>>>> The artifacts ending with “-shaded” are for end users.
>>>>> 
>>>>> [1] https://github.com/apache/kyuubi-shaded
>>>>> [2] 
>>>>> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-shaded-zookeeper-34
>>>>> 
>>>>> Thanks,
>>>>> Cheng Pan
>>>>> 
>>>>> 
>>>>>> On Nov 28, 2023, at 18:36, Kent Yao <y...@apache.org> wrote:
>>>>>> 
>>>>>> -1 for the arbitrary naming
>>>>>> 
>>>>>> 
>>>>>> FYI, 
>>>>>> https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
>>>>>> 
>>>>>> On Tue, Nov 28, 2023 at 18:03, XiDuo You <ulyssesyo...@gmail.com> wrote:
>>>>>>> 
>>>>>>> +1, looks fine to me
>>>>>>> 
>>>>>>> On Tue, Nov 28, 2023 at 12:53, Zhen Wang <wangz...@apache.org> wrote:
>>>>>>>> 
>>>>>>>> +1, I like this idea, it avoids the possibility of engine thrift class
>>>>>>>> conflicts in different environments.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Kind Regards,
>>>>>>>> Zhen Wang
>>>>>>>> 
>>>>>>>> On Tue, Nov 28, 2023 at 01:00, Cheng Pan <pan3...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Kyuubi developers,
>>>>>>>>> 
>>>>>>>>> Recently, I've been looking at the code of the Hive engine. One of my
>>>>>>>>> goals is to make it support multiple versions of the Hive runtime,
>>>>>>>>> including
>>>>>>>>> - 3.1.3, which is the latest stable version of Apache Hive, already 
>>>>>>>>> supported
>>>>>>>>> - 2.3.9, which is the latest stable version of Apache Hive 2.x and is
>>>>>>>>> widely adopted, including by Apache Spark and Apache Flink
>>>>>>>>> - 2.1.1-cdh6.3.2, the latest free version of CDH, has lots of users
>>>>>>>>> 
>>>>>>>>> When I tried to run the Hive engine built against Hive 3.x with the Hive
>>>>>>>>> 2.3.9/2.1.1-cdh6.3.2 runtime, I encountered some thrift class conflict
>>>>>>>>> issues which were hard to resolve, so I propose to create a
>>>>>>>>> pre-shaded hive-service-rpc to tackle such issues.
>>>>>>>>> 
>>>>>>>>> I have created two PRs[1][2] to demonstrate and verify my idea. It
>>>>>>>>> also introduces additional benefits, e.g. speeding up the engine
>>>>>>>>> packaging by reducing the number of relocated classes.
>>>>>>>>> 
>>>>>>>>> Looking forward to the community feedback and PR reviews.
>>>>>>>>> 
>>>>>>>>> Once the idea is accepted, and [1] gets merged, I will start the
>>>>>>>>> kyuubi-shaded release voting.
>>>>>>>>> 
>>>>>>>>> [1] https://github.com/apache/kyuubi-shaded/pull/20
>>>>>>>>> [2] https://github.com/apache/kyuubi/pull/5783
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Cheng Pan
>>>>> 
>>> 
> 
