zentol commented on code in PR #21339:
URL: https://github.com/apache/flink/pull/21339#discussion_r1026130555
##########
flink-end-to-end-tests/test-scripts/common_yarn_docker.sh:
##########
@@ -100,13 +100,10 @@ function build_image() {
     echo "Pre-downloading Hadoop tarball"
     local cache_path
     cache_path=$(get_artifact "http://archive.apache.org/dist/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz")
-    ln "${cache_path}" "${END_TO_END_DIR}/test-scripts/docker-hadoop-secure-cluster/hadoop-2.8.5.tar.gz"
+    ln "${cache_path}" "${END_TO_END_DIR}/test-scripts/docker-hadoop-secure-cluster/hadoop/hadoop.tar.gz"
Review Comment:
> I thought the cache is per run. OK, then putting back the version here
too...
That would significantly limit the caching benefits, because many artifacts only need to be downloaded once.
> Wait a minute, get_artifact downloads the jar if needed and we just make a
link to the cache entry.
[Kafka](https://github.com/apache/flink/blob/master/flink-end-to-end-tests/test-scripts/kafka-common.sh#L43)
works the same way, and several version bumps have already happened there.
The code as-is in the PR is fine. The important thing here is that the URL
passed to `get_artifact` contains the version, since we only create a hard
link to the path returned by `get_artifact` (which also contains the version,
extracted from the URL).
If you were to _copy_ the file from `cache_path` instead of linking it, then
@MartijnVisser would be correct though (kinda; caching just wouldn't work lol).
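To make the point concrete, here is a minimal, self-contained sketch of the pattern being discussed. The `get_artifact` below is a stand-in (the real helper lives in the Flink test scripts and does an actual download); the sketch only shows why the versioned URL keeps the cache correct while the link target can stay unversioned:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for Flink's get_artifact helper: the cache entry
# name is derived from the URL, so it stays versioned even though the
# link we create for the test can use an unversioned name.
CACHE_DIR=$(mktemp -d)
TARGET_DIR=$(mktemp -d)

get_artifact() {
    local url="$1"
    local name="${url##*/}"              # e.g. hadoop-2.8.5.tar.gz (versioned)
    local path="${CACHE_DIR}/${name}"
    # Download only on a cache miss; faked here with a placeholder file.
    [ -f "${path}" ] || echo "placeholder tarball" > "${path}"
    echo "${path}"
}

cache_path=$(get_artifact "http://archive.apache.org/dist/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz")
# Hard-link the versioned cache entry under an unversioned name, as in the PR.
# A version bump only changes the URL; this line never needs to change.
ln "${cache_path}" "${TARGET_DIR}/hadoop.tar.gz"
```

A copy here instead of `ln` would still produce a working tarball, but each run would duplicate the data, and nothing would tie the unversioned name back to the versioned cache entry.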
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]