Re: [Discussion] HIVE-28211: Restore hive-exec:core jar

2024-05-03 Thread Denys Kuzmenko
I agree that shaded hive-exec should be the proper way to go, however, ATM it's 
a show-stopper for many downstream projects to upgrade. 
Also based on the mail threads, they clearly understand the risks of using an 
unshaded jar but still insist on keeping it. 
If we'd like to improve the project acceptance, perhaps we could allow some 
flexibility. 


Re: [Discussion] HIVE-28211: Restore hive-exec:core jar

2024-05-03 Thread Zoltan Haindrich

I think the shading should be fixed instead of restoring this core jar.
Providing a core-jar means that we support it and I think that would be a bad 
move:
I believe it's unreasonable to expect any project to use the same, or 
compatible, deps as the ones hive-exec was compiled against!
For example hive-exec uses an ancient guava which was released back in 2017 
https://mvnrepository.com/artifact/com.google.guava/guava/22.0
and has 3 CVEs listed... and that's just one of the many deps the core-jar will 
pull into a build.
Also note that guava tends to break its API quite frequently - so I guess anyone 
using a somewhat more recent guava will have a hard time consuming the artifact.
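
To make the guava concern concrete: with an unshaded jar, whichever guava is 
first on the classpath wins. A minimal sketch for checking which jar a class 
was actually loaded from follows. It uses only standard JDK calls; the class 
name WhichJar is hypothetical and not part of Hive, and the guava class name 
in main is only an example of something to probe.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Hive): reports which jar a class came from.
class WhichJar {

    // Returns the code source location of an already loaded class.
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    // Looks the class up by name without initializing it.
    static String locationOf(String className) {
        try {
            ClassLoader loader = WhichJar.class.getClassLoader();
            return locationOf(Class.forName(className, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Example probe: a well-known guava class; pass other names as arguments.
        String name = args.length > 0 ? args[0] : "com.google.common.collect.ImmutableList";
        System.out.println(name + " -> " + locationOf(name));
    }
}
```

Running this inside a downstream build shows whether the guava in use is the 
one pulled in by the core jar or the one the project declared itself.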

Downstream projects had the opportunity to try the alpha releases and report 
issues before 4.0 came out, didn't they?
If they were not doing that - I think that's not our fault!

A middle ground could be to suggest that they try the shaded hive-exec jar (we 
still have nightly builds [1]); notify these projects to try it and report back 
issues - give them some time to fix up any further shading issues, and done.


[1] http://ci.hive.apache.org/job/hive-nightly/

cheers,
Zoltan

On 4/29/24 09:16, Stamatis Zampetakis wrote:
I shared the reasons behind the removal of the jar and my concerns around bringing it back. I'm still not convinced that it's needed but if the rest of the community feels 
that it's the right path forward then I am ok with this.


Best,
Stamatis

On Fri, Apr 26, 2024, 2:42 PM Ayush Saxena <ayush...@gmail.com> wrote:

Stamatis,
Isn't the removal itself an incompatible change? There are a lot of projects 
using it, and we suddenly removed a jar because some people were not sure how 
to use it properly and were complaining about it.

What about the projects which are now stuck? Reading the thread at [1], there 
were promises made that everything would be relocated and sorted before the 
release, but we couldn't do it. AFAIK it isn't a trivial task to just relocate 
all the dependencies.

As I see here, @Chao Sun even raised concerns [2] that the removal just blocks 
the way for upgrading downstream projects, and it got countered with the 
argument that the folks chasing the removal would help get all the dependencies 
relocated or solve the issues for downstream. I think none volunteered.

I would recommend either:
* Best case, we relocate all the dependencies present in hive-exec, not just 
one or two. Somebody volunteers to raise one PR relocating "all", we commit 
that, and we should be sorted.
* Restore the core jar, because a lot of projects depend on it and the removal 
itself was incompatible. I don't think the removal had a clear community 
agreement; it was a conditional agreement whose condition, I think, never got 
sorted, so we should roll back.

On a lighter note, we might release with some 5000+ commits, with the best 
performance or so, but if nobody is able to consume those release bits, I think 
those efforts are just going to waste. Eventually people will just stick to 
their older versions and not even try to upgrade, and we will be releasing for 
nobody, or maybe for the few folks who have only Hive in their stack (I don't 
know if there are folks like that). No matter how good a product is, if people 
don't use it, it is gonna die :-(


I think we have a ticket which talks about relocating all dependencies. I agree 
we should drop the core jar for sure, since it leads to all the problems 
Stamatis mentioned, but let's restore the core jar for now and drop it when 
that relocation ticket is resolved. Does that sound convincing, or even worth a 
thought?

btw. having jars with one set of dependencies shaded and other ones unshaded is 
done in hadoop as well (hadoop-minicluster vs hadoop-client-minicluster), and 
such problems from users keep coming up, e.g. [3]

Anyone else, any thoughts?

-Ayush

[1] https://lists.apache.org/thread/cwtxnffoqpwgmdtlc9hyor2cm22djpkg 

[2] https://lists.apache.org/thread/23sshgolmbpcc01npqgt03woljdy6hdn 

[3] https://lists.apache.org/thread/f47s6bxrtslkxbc8s2gybwrxps8vk63x 




On Fri, 26 Apr 2024 at 16:37, Stamatis Zampetakis <zabe...@gmail.com> wrote:

Hey Simhadri, thanks for starting this discussion.

Maven has many limitations when it comes to publishing multiple
artifacts from the same module. In most cases, the end result is
broken and hard to use. The pom file that is published for a given
module is not able to describe correctly all artifacts of the module
and that's why there is one main artifact for every module; dependency
declarations are usually correct for the main artifact but are not
representative for the rest.

For 

CVE-2023-35701: Apache Hive: Arbitrary command execution via JDBC driver

2024-05-03 Thread Stamatis Zampetakis
Severity: moderate

Affected versions:

- Apache Hive 4.0.0-alpha-1 before 4.0.0

Description:

Improper Control of Generation of Code ('Code Injection') vulnerability in 
Apache Hive.

The vulnerability affects the Hive JDBC driver component and it can potentially 
lead to arbitrary code execution on the machine/endpoint that the JDBC driver 
(client) is running. The malicious user must have sufficient permissions to 
specify/edit JDBC URL(s) in an endpoint relying on the Hive JDBC driver and the 
JDBC client process must run under a privileged user to fully exploit the 
vulnerability. 

The attacker can set up a malicious HTTP server and specify a JDBC URL pointing 
towards this server. When a JDBC connection is attempted, the malicious HTTP 
server can provide a special response with customized payload that can trigger 
the execution of certain commands in the JDBC client.

This issue affects Apache Hive: from 4.0.0-alpha-1 before 4.0.0.

Users are recommended to upgrade to version 4.0.0, which fixes the issue.

This issue is being tracked as HIVE-27554 

Credit:

Kostya Kortchinsky (reporter)

References:

https://hive.apache.org/
https://www.cve.org/CVERecord?id=CVE-2023-35701
https://issues.apache.org/jira/browse/HIVE-27554