[GitHub] [hadoop] steveloughran commented on pull request #4491: HADOOP-18311. Upgrade dependencies to address several CVEs

GitBox Thu, 23 Jun 2022 06:35:40 -0700


steveloughran commented on PR #4491:
URL: https://github.com/apache/hadoop/pull/4491#issuecomment-1164417288

(you are going to hate me here. sorry)

First, please let's not have "update a few dependencies" patches. It is not a
useful title, and updating multiple dependencies simultaneously makes it a
lot harder to identify problems through git bisect and makes the changes harder
to roll back and cherry pick.
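
To illustrate the bisect point, here is a minimal sketch in a throwaway repository. Everything in it is hypothetical: `versions.txt` stands in for a POM, `liba`/`libb`/`libc` are made-up dependencies, and the "test" is just a grep that pretends `libb=2` is the breaking change. It is not taken from the Hadoop build.

```shell
#!/bin/sh
set -e

# Throwaway repository; nothing here touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Helper so commits work without any global git identity configured.
g() { git -c user.name=demo -c user.email=demo@example.invalid -c commit.gpgsign=false "$@"; }

# versions.txt is a stand-in for a POM with three dependency versions.
printf 'liba=1\nlibb=1\nlibc=1\n' > versions.txt
git add versions.txt
g commit -q -m "baseline"
good=$(git rev-parse HEAD)

# One commit per dependency bump.
for dep in liba libb libc; do
  sed "s/^$dep=1/$dep=2/" versions.txt > versions.tmp
  mv versions.tmp versions.txt
  g commit -q -am "bump $dep to 2"
done

# Bisect between the baseline (good) and HEAD (bad).
# The run script exits 0 (good) when libb=2 is absent, 1 (bad) when present.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c '! grep -q "^libb=2" versions.txt' >/dev/null

# refs/bisect/bad now points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $culprit"
```

With one bump per commit, bisect names the single dependency at fault. Had all three bumps been squashed into one commit, bisect could only point at that commit and the culprit would still have to be found by hand.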

Second, we must not have anything in this release which isn't already in
branch-3.3, where it has been stabilising through the use other developers
have been making of that branch.

Finally, I am scared of any and all last-minute dependency updates, as
the blast radius of a change of a few digits in a version number in a POM file
can reach a project two hops away.

That's why I believe the default decision on any last minute dependency
update should be "no". This is worth bearing in mind as I intend to share
release manager responsibilities with Mukund on the branch-3.3 feature release
this summer, and refusing last-minute changes is going to be my default action,
especially when it comes to jar updates. Get those changes in and stabilising
now!

## jetty

-1 to the jetty update because I'm scared of what will break. The hadoop.next
release will upgrade to jetty 2 and shade it.

## Htrace

-1 to htrace as it was fixed in this branch by #3520

```
9e2936f8d1f HADOOP-17424. Replace HTrace with No-Op tracer (#3520)
```

If this is not the case then we have a serious issue which needs to be fixed
across all the recent branches. File a critical Hadoop JIRA and we can go from
there.

## Zookeeper

-1 until/unless in branch-3.3

Interesting one there. Trunk is on 3.6.3 after #3241, "HADOOP-17612. Upgrade
Zookeeper to 3.6.3 and Curator to 5.2.0".

For any change there, an increment on 3.5.x is lower risk and may not need a
matching Curator increment, but that'd still need qualification.

For the branch-3.3 release, why don't we cherrypick #3241 and follow-ons?

## AWS SDK

-1 to updating the AWS SDK except as a standalone cherrypick of our
branch-3.3 patch #3864 with full requalification

```
d8ab84275e0 - HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)
```

The SDK is covered in HADOOP-18068; any backporting should just be a
cherrypick. But as with most AWS SDK updates, it caused a regression
(HADOOP-18085). Anyone proposing it as a backport has to
1. Run the full hadoop-aws integration test suite with `-Dscale` and declare
which endpoint they ran against.
2. Look at the section "Qualifying an AWS SDK Update" and treat the
instructions there as a MUST not a MAY
https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/testing.html#Qualifying_an_AWS_SDK_Update
3. Note that instruction 1 there is "Don’t make this a last minute action."

I have encountered other cases where people have been updating this SDK
dependency without raising it with me. Yes, tools do highlight Jackson
serialisation issues which exist in the shaded Jackson dependency. However, the
AWS SDK does not use those bits of Jackson. And because the Jackson classes in
this library are shaded, nothing else uses those bits either, so the
risk is not actually manifest in the S3A connector. Given this fact and the
qualification process, I don't want to include it.

If you really want this in, create a single PR cherry picking HADOOP-18068
and all follow-on fixes which are applicable to this branch, and say which AWS
endpoint you ran the hadoop-aws test suites against. Also do the entire SDK
update qualification covered in the testing doc. I will then merge the chain of
commits in one by one.

This should be safe because we have actually been using this in branch 3.3+
and other than the regression in tests there have been no adverse consequences.
It MUST be the exact version we have been using (1.12.132) as no later release
has been validated.
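
The backport steps above could be sketched roughly as follows. This is an outline under assumptions, not a prescribed procedure: the `step` and `backport_sdk_update` helpers, the `BACKPORT_RUN` switch, the `backport-HADOOP-18068` branch name and the `release-branch` placeholder are all hypothetical, and the commit hashes of the follow-on fixes are not listed in this comment, so they are left as a note. By default the script only prints the commands.

```shell
#!/bin/sh
# Dry run by default: prints each command. Set BACKPORT_RUN=1 to execute
# them from the root of a Hadoop checkout (hypothetical switch).
step() {
  if [ "${BACKPORT_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '+ %s\n' "$*"
  fi
}

# $1: the release branch to backport onto (placeholder; not named in this comment).
backport_sdk_update() {
  target_branch="$1"

  # Work on a dedicated branch so the chain of commits forms a single PR.
  step git checkout -b backport-HADOOP-18068 "$target_branch"

  # -x records the original commit id in the cherry-picked commit message.
  step git cherry-pick -x d8ab84275e0

  # Repeat the cherry-pick here for each applicable follow-on fix
  # (hashes not given in this comment, so none are filled in).

  # Full hadoop-aws integration suite with scale tests enabled; this assumes
  # the test credentials and endpoint are already configured as the testing
  # doc describes.
  step mvn -f hadoop-tools/hadoop-aws/pom.xml verify -Dscale
}

backport_sdk_update "${TARGET_BRANCH:-release-branch}"
```

The test run is only the first part of the requirement; the "Qualifying an AWS SDK Update" checklist in the testing doc still has to be worked through by hand, and the endpoint used has to be stated in the PR.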

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
