kennknowles commented on a change in pull request #13342:
URL: https://github.com/apache/beam/pull/13342#discussion_r524593181



##########
File path: sdks/java/build-tools/beam-linkage-check.sh
##########
@@ -36,46 +36,60 @@ set -o pipefail
 set -e
 
 # These default artifacts are common causes of linkage errors.
-ARTIFACTS="beam-sdks-java-core
-  beam-sdks-java-io-google-cloud-platform
-  beam-runners-google-cloud-dataflow-java
-  beam-sdks-java-io-hadoop-format"
-
-if [ ! -z "$1" ]; then
-  ARTIFACTS=$1
+DEFAULT_ARTIFACT_LISTS=" \
+  beam-sdks-java-core \
+  beam-sdks-java-io-google-cloud-platform \
+  beam-runners-google-cloud-dataflow-java \
+  beam-sdks-java-io-hadoop-format \
+"
+
+BASELINE_REF=$1
+PROPOSED_REF=$2
+ARTIFACT_LISTS=$3
+
+if [ -z "$ARTIFACT_LISTS" ]; then
+  ARTIFACT_LISTS=$DEFAULT_ARTIFACT_LISTS
 fi
 
-BRANCH_NAME=$(git symbolic-ref --short HEAD)
+if [ -z "$BASELINE_REF" ] || [ -z "$PROPOSED_REF" ] || [ -z "$ARTIFACT_LISTS" 
] ; then
+  echo "Usage: $0 <baseline ref> <proposed ref> [artifact lists]"
+  exit 1
+fi
 
 if [ ! -d buildSrc ]; then
-  echo "Please run this script in the Beam project root:"
-  echo "  /bin/bash sdks/java/build-tools/beam-linkage-check.sh"
-  exit 1
+  echo "Directory 'buildSrc' not found. Please run this script from the root 
directory of a clone of the Beam git repo."
 fi
 
-if [ "$BRANCH_NAME" = "master" ]; then
-  echo "Please run this script on a branch other than master"
+if [ "$BASELINE_REF" = "$PROPOSED_REF" ]; then
+  echo "Baseline and proposed refs are identical; cannot compare their linkage 
errors!"
   exit 1
 fi
 
-OUTPUT_DIR=build/linkagecheck
-mkdir -p $OUTPUT_DIR
-
 if [ ! -z "$(git diff)" ]; then
   echo "Uncommited change detected. Please commit changes and ensure 'git 
diff' is empty."
   exit 1
 fi
 
+STARTING_REF=$(git rev-parse --abbrev-ref HEAD)
+function cleanup() {
+  git checkout $STARTING_REF
+}
+trap cleanup EXIT
+
+echo "Comparing linkage of artifact lists $ARTIFACT_LISTS using baseline 
$BASELINE_REF and proposal $PROPOSED_REF"
+
+OUTPUT_DIR=build/linkagecheck
+mkdir -p $OUTPUT_DIR
+
 ACCUMULATED_RESULT=0
 
 function runLinkageCheck () {
-  COMMIT=$1
-  BRANCH=$2
-  MODE=$3 # "baseline" or "validate"
-  for ARTIFACT in $ARTIFACTS; do
-    echo "`date`:" "Running linkage check (${MODE}) for ${ARTIFACT} in 
${BRANCH}"
+  MODE=$1 # "baseline" or "validate"
 
-    BASELINE_FILE=${OUTPUT_DIR}/baseline-${ARTIFACT}.xml
+  for ARTIFACT_LIST in $ARTIFACT_LISTS; do
+    echo "`date`:" "Running linkage check (${MODE}) for ${ARTIFACT_LISTS}"
+
+    BASELINE_FILE=${OUTPUT_DIR}/baseline-${ARTIFACT_LIST}.xml

Review comment:
       It is a stylistic "nit" really. My expectation of gradle commands is 
that they operate on the current source tree. There are exceptions, sure. But 
my thinking was:
   
    - `:checkJavaLinkage` always writes to `build/linkagecheck` and the report 
is the linkage report for the current state of the worktree.
    - `beam-linkage-check.sh` operates at level above that. It produces the 
above file for two separate states of the source tree and compares them.
   
   It is not that important. Just recording my thoughts so we can discuss the 
script a bit.

##########
File path: sdks/java/build-tools/beam-linkage-check.sh
##########
@@ -36,46 +36,60 @@ set -o pipefail
 set -e
 
 # These default artifacts are common causes of linkage errors.
-ARTIFACTS="beam-sdks-java-core
-  beam-sdks-java-io-google-cloud-platform
-  beam-runners-google-cloud-dataflow-java
-  beam-sdks-java-io-hadoop-format"
-
-if [ ! -z "$1" ]; then
-  ARTIFACTS=$1
+DEFAULT_ARTIFACT_LISTS=" \

Review comment:
       Yes, I understand what the comma-separated list and space-separated list 
do. I just don't understand why we use both. I trust that there is a good 
reason, but I do not know it. Like if you had two subsets of artifacts that are 
not expected to work together?
   
   I would expect that we do want to run all the GCP modules together with the 
core SDK and, separately, run all the hadoop modules together with the core 
SDK, etc. Or maybe we just run them all together.
   
   So my proposal would be to drop the space-separated and only keep the 
comma-separated logic. But like I said, I trust there is a reason. So I am just 
making this proposal so you can tell me why it does not work.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to