mumrah commented on PR #18449:
URL: https://github.com/apache/kafka/pull/18449#issuecomment-2594038906
Ok, there is a fundamental problem here. The `pull_request` event is
building the merge commit of this PR against the base rather than just the PR
contents. This means the build will include changes from trunk that have not
yet been cached.
When trunk is moving quickly, our PRs have little hope of benefiting much
from caching.
For example:
```
(trunk) HEAD --- A --- B --- C
(PR) HEAD --- X --- Y --- Z --- C
```
If commit C was the last trunk commit to be built, there will be Gradle
cache files for that commit. Commits A and B are still building. If the PR were
simply building X (the PR head), this would be fine and we would expect cache
hits for anything not changed by X, Y, Z. However, the `pull_request` event
will result in a build of something totally different:
```
(merge) HEAD --- A --- B --- C
        `X --- Y --- Z --- C
```
So when the PR is built, it will be fetching the latest cache (C), but will
include file changes from A and B in addition to the PR changes. This greatly
increases cache misses.
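
To make the effect concrete, here is a rough sketch of the reasoning above. It is a toy model, not how Gradle actually keys its build cache (Gradle hashes task inputs, and changes also invalidate downstream tasks). The module names and the commit-to-module mapping are made up for illustration; only the commit labels come from the diagrams.

```python
# Toy model: which modules can reuse the cache that was produced for commit C?
# ASSUMPTION: the module names and per-commit changes below are hypothetical.
CHANGED_MODULES = {
    "A": {"clients"},            # trunk commit, not yet built/cached
    "B": {"core", "streams"},    # trunk commit, not yet built/cached
    "X": {"tools"},              # PR commit
    "Y": {"tools"},              # PR commit
    "Z": {"connect"},            # PR commit
}
ALL_MODULES = {"clients", "core", "streams", "tools", "connect", "raft"}


def cache_hits(commits_on_top_of_c):
    """Return the modules untouched by any commit applied on top of C."""
    dirty = set()
    for commit in commits_on_top_of_c:
        dirty |= CHANGED_MODULES[commit]
    return ALL_MODULES - dirty


# Building only the PR head: just X, Y, Z sit on top of the cached commit C.
pr_head_hits = cache_hits(["Z", "Y", "X"])

# Building the merge commit: A and B are pulled in as well.
merge_hits = cache_hits(["Z", "Y", "X", "A", "B"])

print("PR head build hits:", sorted(pr_head_hits))
print("merge build hits:  ", sorted(merge_hits))
```

In this made-up example the PR-head build can reuse four of six modules, while the merge build can reuse only one, even though the PR itself is unchanged.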
---
I think the merge queue might be a solution to this. If we do a full build
as part of the merge queue, then no code will land on trunk that has not been
built, tested, and cached. The risk with this approach is that flaky builds
will prevent things from getting into trunk.
@ijuma @dajac thoughts?