zentol commented on code in PR #21339:
URL: https://github.com/apache/flink/pull/21339#discussion_r1026130555


##########
flink-end-to-end-tests/test-scripts/common_yarn_docker.sh:
##########
@@ -100,13 +100,10 @@ function build_image() {
     echo "Pre-downloading Hadoop tarball"
     local cache_path
    cache_path=$(get_artifact "http://archive.apache.org/dist/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz")
-    ln "${cache_path}" "${END_TO_END_DIR}/test-scripts/docker-hadoop-secure-cluster/hadoop-2.8.5.tar.gz"
+    ln "${cache_path}" "${END_TO_END_DIR}/test-scripts/docker-hadoop-secure-cluster/hadoop/hadoop.tar.gz"

Review Comment:
   > I thought the cache is per run. OK, then putting back the version here too...
   
   That would limit the caching benefits significantly because many things are only required once.
   
   > Wait a minute, get_artifact downloads the jar if needed and we just make a link to the cache entry.
   
   [Kafka](https://github.com/apache/flink/blob/master/flink-end-to-end-tests/test-scripts/kafka-common.sh#L43) works the same way, and several version bumps have already happened there.
   
   The code as-is in the PR is fine. The important thing here is that the URL passed to `get_artifact` contains the version, since we only create a hard link to the path returned by `get_artifact` (which also contains the version, extracted from the URL).
   If you were to _copy_ the file to `cache_path` then @MartijnVisser would be correct though (kinda; caching just wouldn't work lol).
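   To illustrate the point being made: the sketch below is a hypothetical, simplified version of a `get_artifact`-style helper (the function name, `CACHE_DIR` variable, and download logic are illustrative assumptions, not Flink's actual implementation). Because the cache key is derived from the URL, it embeds the version, so each version gets its own cache entry and the link target on the caller's side can stay version-free:
   
   ```shell
   #!/usr/bin/env bash
   # Hypothetical sketch of a URL-keyed artifact cache; not Flink's real code.
   CACHE_DIR="${CACHE_DIR:-/tmp/artifact-cache}"
   
   get_artifact() {
       local url="$1"
       # The cache key is the file name from the URL, which contains the
       # version (e.g. hadoop-2.8.5.tar.gz), so version bumps get new entries.
       local file_name
       file_name=$(basename "$url")
       local cache_path="${CACHE_DIR}/${file_name}"
       if [ ! -f "$cache_path" ]; then
           mkdir -p "$CACHE_DIR"
           curl -fsSL "$url" -o "$cache_path"
       fi
       echo "$cache_path"
   }
   
   # Callers link (rather than copy) the cached file, so only one copy of
   # each version exists on disk and the cache entry is never overwritten:
   #   cache_path=$(get_artifact "http://.../hadoop-2.8.5.tar.gz")
   #   ln "${cache_path}" "target/hadoop.tar.gz"
   ```
   
   Copying into `cache_path` instead would clobber the cache entry and defeat the cache entirely, which is the failure mode described above.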
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.