Copilot commented on code in PR #50364:
URL: https://github.com/apache/arrow/pull/50364#discussion_r3521783951


##########
dev/release/utils-watch-gh-workflow.sh:
##########
@@ -21,33 +21,43 @@ set -e
 set -u
 set -o pipefail
 
-if [ "$#" -ne 2 ]; then
-  echo "Usage: $0 <tag> <workflow>"
+if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
+  echo "Usage: $0 <tag> <workflow> [run-id]"
   exit 1
 fi
 
 TAG=$1
 WORKFLOW=$2
+RUN_ID="${3:-}"
 : "${REPOSITORY:=${GITHUB_REPOSITORY:-apache/arrow}}"
 
-echo "Looking for GitHub Actions workflow on ${REPOSITORY}:${TAG}"
-RUN_ID=""
-while true; do
-  echo "Waiting for run to start..."
-  RUN_ID=$(gh run list \
-              --branch "${TAG}" \
-              --jq '.[].databaseId' \
-              --json databaseId \
-              --limit 1 \
-              --repo "${REPOSITORY}" \
-              --workflow "${WORKFLOW}")
-  if [ -n "${RUN_ID}" ]; then
-    break
-  fi
-  sleep 60
-done
+if [ -z "${RUN_ID}" ]; then
+  echo "Looking for GitHub Actions workflow on ${REPOSITORY}:${TAG} (any ID)"
+else
+  echo "Looking for GitHub Actions workflow on ${REPOSITORY}:${TAG} with run ID ${RUN_ID}"
+fi
+if [ -z "${RUN_ID}" ]; then
+  while true; do
+    echo "Waiting for run to start..."
+    RUN_ID=$(gh run list \
+                --branch "${TAG}" \
+                --jq '.[].databaseId' \

Review Comment:
   When no run ID is provided, the watcher now filters `gh run list` with 
`--status in_progress`. The loop then never finds a run, and hangs 
indefinitely, if the workflow has already completed by the time this script is 
run (e.g. `dev/release/post-05-update-gh-release-notes.sh` calls this watcher 
without a run ID). It also ignores the most recent run while that run is still 
`queued` (not yet `in_progress`). Consider selecting the newest non-completed 
run first, and falling back to the newest run (any status) after a short grace 
period. The script then still works for already-completed workflows, and it 
avoids picking an older completed run right after a new one is triggered.
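
   A minimal sketch of that selection logic, under stated assumptions: the 
function name `find_run_id` and the `GRACE_SECONDS` / `POLL_SECONDS` variables 
are hypothetical and not part of the PR, and the sketch relies on `gh run list` 
returning the newest run first, as the existing `--limit 1` call already does.

```shell
# Hypothetical helper (not in the PR): print the ID of the run to watch.
# GRACE_SECONDS and POLL_SECONDS are assumed tunables with arbitrary defaults.
find_run_id() {
  local tag=$1 workflow=$2 repository=$3
  local grace=${GRACE_SECONDS:-300} poll=${POLL_SECONDS:-15}
  local waited=0 run_id=""
  while true; do
    # Prefer the newest run that has not completed yet (queued, in_progress, ...).
    run_id=$(gh run list \
               --branch "${tag}" \
               --jq 'map(select(.status != "completed")) | .[0].databaseId // empty' \
               --json databaseId,status \
               --limit 10 \
               --repo "${repository}" \
               --workflow "${workflow}")
    if [ -z "${run_id}" ] && [ "${waited}" -ge "${grace}" ]; then
      # Grace period is over: accept the newest run regardless of its status.
      run_id=$(gh run list \
                 --branch "${tag}" \
                 --jq '.[0].databaseId // empty' \
                 --json databaseId \
                 --limit 1 \
                 --repo "${repository}" \
                 --workflow "${workflow}")
    fi
    if [ -n "${run_id}" ]; then
      echo "${run_id}"
      return 0
    fi
    sleep "${poll}"
    waited=$((waited + poll))
  done
}
```

   The watcher could then call `RUN_ID=$(find_run_id "${TAG}" "${WORKFLOW}" 
"${REPOSITORY}")` in place of the current loop. The grace period is a 
trade-off: too short and an older completed run may be picked right after a 
trigger, too long and callers that watch an already-completed workflow wait 
needlessly.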



##########
dev/release/utils-watch-gh-workflow.sh:
##########
@@ -21,33 +21,43 @@ set -e
 set -u
 set -o pipefail
 
-if [ "$#" -ne 2 ]; then
-  echo "Usage: $0 <tag> <workflow>"
+if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
+  echo "Usage: $0 <tag> <workflow> [run-id]"
   exit 1
 fi
 
 TAG=$1
 WORKFLOW=$2
+RUN_ID="${3:-}"
 : "${REPOSITORY:=${GITHUB_REPOSITORY:-apache/arrow}}"
 
-echo "Looking for GitHub Actions workflow on ${REPOSITORY}:${TAG}"
-RUN_ID=""
-while true; do
-  echo "Waiting for run to start..."
-  RUN_ID=$(gh run list \
-              --branch "${TAG}" \
-              --jq '.[].databaseId' \
-              --json databaseId \
-              --limit 1 \
-              --repo "${REPOSITORY}" \
-              --workflow "${WORKFLOW}")
-  if [ -n "${RUN_ID}" ]; then
-    break
-  fi
-  sleep 60
-done
+if [ -z "${RUN_ID}" ]; then
+  echo "Looking for GitHub Actions workflow on ${REPOSITORY}:${TAG} (any ID)"
+else
+  echo "Looking for GitHub Actions workflow on ${REPOSITORY}:${TAG} with run ID ${RUN_ID}"
+fi
+if [ -z "${RUN_ID}" ]; then
+  while true; do
+    echo "Waiting for run to start..."
+    RUN_ID=$(gh run list \
+                --branch "${TAG}" \
+                --jq '.[].databaseId' \
+                --json databaseId \
+                --limit 1 \
+                --repo "${REPOSITORY}" \
+                --status "in_progress" \
+                --workflow "${WORKFLOW}")
+    if [ -n "${RUN_ID}" ]; then
+      break
+    fi
+    sleep 60
+  done
+  echo "Found GitHub Actions workflow with ID: ${RUN_ID}"
+else
+  echo "Using provided run ID: ${RUN_ID}. Sleeping for 10 seconds to let the job become available..."
+  sleep 10
+fi

Review Comment:
   A fixed `sleep 10` before `gh run watch` can be flaky: the run may take 
longer than 10 seconds to become visible via the API, causing `gh run watch` to 
fail intermittently. Poll for run availability (e.g., `gh run view`) with a 
timeout instead of a hard-coded sleep.
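
   One possible shape for that polling, as a sketch only: the function name 
`wait_for_run_visible` and the `RUN_VISIBLE_TIMEOUT` / `RUN_VISIBLE_INTERVAL` 
variables are hypothetical, and the default values are arbitrary.

```shell
# Hypothetical helper (not in the PR): wait until the run is visible via the API.
# RUN_VISIBLE_TIMEOUT and RUN_VISIBLE_INTERVAL are assumed names, in seconds.
wait_for_run_visible() {
  local run_id=$1 repository=$2
  local timeout=${RUN_VISIBLE_TIMEOUT:-120} interval=${RUN_VISIBLE_INTERVAL:-5}
  local waited=0
  # `gh run view` exits non-zero while the run cannot be found yet.
  until gh run view "${run_id}" --repo "${repository}" >/dev/null 2>&1; do
    if [ "${waited}" -ge "${timeout}" ]; then
      echo "Run ${run_id} did not become visible within ${timeout} seconds" >&2
      return 1
    fi
    sleep "${interval}"
    waited=$((waited + interval))
  done
}
```

   Calling `wait_for_run_visible "${RUN_ID}" "${REPOSITORY}"` in the `else` 
branch would replace the `sleep 10`. Because the script runs under `set -e`, a 
timeout stops the script before `gh run watch` is reached.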



##########
dev/release/07-flightsqlodbc-upload.sh:
##########
@@ -137,14 +137,19 @@ fi
 
 if [ "${PHASE_BUILD_MSI}" -gt 0 ]; then
   echo "[4/8] Triggering odbc_release_step in package_odbc.yml workflow..."
-  gh workflow run package_odbc.yml \
+  workflow_url=$(gh workflow run package_odbc.yml \
     --repo "${GITHUB_REPOSITORY}" \
     --ref "${tag}" \
-    --field odbc_release_step=true
+    --field odbc_release_step=true)
+  echo "${workflow_url}"
+  # Extract run ID from `gh workflow run` output. The output is structured like,
+  # https://github.com/apache/arrow/actions/runs/28679576610 and we just need
+  # the id.
+  run_id=$(echo "${workflow_url}" | grep -Eo 'actions/runs/[0-9]+' | grep -Eo '[0-9]+$' || true)
 

Review Comment:
   `run_id` extraction is allowed to silently fail (`|| true`) and the script 
will then pass an empty run ID to the watcher. That undermines the goal of 
waiting for the exact run that was just triggered and can reintroduce the 
“wrong run” problem. Consider failing fast with a clear error if the run ID 
can’t be parsed from `gh workflow run` output.
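
   A fail-fast variant might look like the sketch below. The function name 
`extract_run_id` is hypothetical; the parsing itself reuses the `grep` pipeline 
from the diff and assumes, as the diff's own comment does, that the output 
contains a URL of the form `.../actions/runs/<id>`.

```shell
# Hypothetical helper (not in the PR): print the run ID found in the output of
# `gh workflow run`, or report an error and return non-zero.
extract_run_id() {
  local output=$1 run_id
  # `|| true` keeps `set -e` / `pipefail` from aborting here, so that the
  # explicit check below can print a clear message instead.
  run_id=$(printf '%s\n' "${output}" \
             | grep -Eo 'actions/runs/[0-9]+' \
             | grep -Eo '[0-9]+$' \
             | head -n 1 || true)
  if [ -z "${run_id}" ]; then
    echo "Failed to parse a run ID from gh workflow run output: ${output}" >&2
    return 1
  fi
  echo "${run_id}"
}
```

   With `run_id=$(extract_run_id "${workflow_url}")`, a parse failure aborts 
the script under `set -e` rather than handing an empty run ID to the watcher.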



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
