Re: [PR] [VL][Delta] Add Delta Spark UT pipeline gated against a known-failures baseline [gluten]

via GitHub Sun, 28 Jun 2026 00:37:35 -0700


Copilot commented on code in PR #12388:
URL: https://github.com/apache/gluten/pull/12388#discussion_r3487529897



##########
.github/workflows/util/delta-spark-ut/setup-delta.sh:
##########
@@ -0,0 +1,206 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#
+# Prepares a delta-io/delta clone for running its `spark` module tests with the
+# Gluten (Velox) bundle jar on the classpath.
+#
+# Usage:
+#   setup-delta.sh <delta_ref> <delta_dir> <gluten_bundle_jar> <gluten_repo_root>
+#
+# Arguments:
+#   delta_ref           - git ref (tag/branch/sha) to check out (e.g. v4.2.0)
+#   delta_dir           - destination directory for the Delta clone
+#   gluten_bundle_jar   - path to the gluten-velox-bundle fat jar
+#   gluten_repo_root    - path to the Gluten repository root (used to locate
+#                         backends-velox/src-delta40/.../DeltaSQLCommandTest.scala)
+#
+
+set -euo pipefail
+
+if [ "$#" -ne 4 ]; then
+  echo "Usage: $0 <delta_ref> <delta_dir> <gluten_bundle_jar> <gluten_repo_root>" >&2
+  exit 1
+fi
+
+DELTA_REF="$1"
+DELTA_DIR="$2"
+GLUTEN_BUNDLE_JAR="$3"
+GLUTEN_ROOT="$4"
+
+if [ ! -f "$GLUTEN_BUNDLE_JAR" ]; then
+  echo "Gluten bundle jar not found: $GLUTEN_BUNDLE_JAR" >&2
+  exit 1
+fi
+
+# Reuse the existing DeltaSQLCommandTest from Gluten's backends-velox module
+# rather than maintaining a separate copy. This file is compiled as part of the
+# unified `spark` project's Test scope, which has the Gluten bundle on its
+# classpath (via spark-unified/lib/), so the typed GlutenConfig / VeloxDeltaConfig
+# imports resolve correctly.
+PATCH_SOURCE="$GLUTEN_ROOT/backends-velox/src-delta40/test/scala/org/apache/spark/sql/delta/test/DeltaSQLCommandTest.scala"
+if [ ! -f "$PATCH_SOURCE" ]; then
+  echo "Gluten DeltaSQLCommandTest not found: $PATCH_SOURCE" >&2
+  exit 1
+fi
+
+echo "::group::Cloning delta-io/delta @ ${DELTA_REF}"
+# Shallow clone the requested tag/branch. Fall back to full clone when the ref is a SHA.
+if ! git clone --depth 1 --branch "$DELTA_REF" https://github.com/delta-io/delta.git "$DELTA_DIR"; then
+  echo "Shallow clone of ref '${DELTA_REF}' failed, falling back to full clone."
+  rm -rf "$DELTA_DIR"
+  git clone https://github.com/delta-io/delta.git "$DELTA_DIR"
+  git -C "$DELTA_DIR" checkout "$DELTA_REF"
+fi
+git -C "$DELTA_DIR" --no-pager log -1 --oneline
+echo "::endgroup::"
+
+echo "::group::Injecting Gluten bundle jar onto the spark project's TEST classpath"
+# The Gluten bundle jar must be on the spark project's TEST runtime classpath
+# (so DeltaSQLCommandTest can load org.apache.gluten.GlutenPlugin by name) but
+# NOT on the COMPILE classpath of `sparkV1`, which is the project that holds
+# Delta's main sources. The bundle's transitive contents include extra symbols
+# under `org.apache.spark.sql` that collide with Delta's main sources -- e.g.
+# MergeOutputGeneration.scala imports both `org.apache.spark.sql._` and
+# `org.apache.spark.sql.delta.ClassicColumnConversions._`, and would then fail
+# with `reference to expression is ambiguous`.
+#
+# sbt auto-scans `<baseDirectory>/lib` via `unmanagedBase`. Two relevant
+# projects in Delta v4.2.0 have a `lib/` baseDirectory:
+#   - sparkV1: `project in file("spark")`     -> spark/lib
+#   - spark  : `project in file("spark-unified")` -> spark-unified/lib
+# unmanagedJars are project-scoped (NOT inherited by dependents), so dropping
+# the bundle into spark-unified/lib/ adds it to the unified `spark` project's
+# Compile *and* Test classpaths -- but NOT to sparkV1's. That's exactly what
+# we want:
+#   * sparkV1/Compile sees ONLY Delta's regular deps -> Delta main compiles.
+#   * spark/Test/fullClasspath sees the bundle -> tests load GlutenPlugin.
+# (Verified empirically: with bundle only in spark-unified/lib/, sbt's
+#  `show sparkV1/Compile/dependencyClasspath` excludes the bundle and
+#  `show spark/Test/fullClasspath` includes it.)
+#
+# We deliberately do NOT also drop the bundle into spark/lib/, which is what
+# caused the previous compile failure: spark/lib/ is sparkV1's unmanagedBase,
+# and putting the bundle there would re-introduce the ambiguity errors.
+SPARK_UNIFIED_LIB="$DELTA_DIR/spark-unified/lib"
+mkdir -p "$SPARK_UNIFIED_LIB"
+cp "$GLUTEN_BUNDLE_JAR" "$SPARK_UNIFIED_LIB/gluten-velox-bundle.jar"
+ls -lh "$SPARK_UNIFIED_LIB"
+echo "::endgroup::"
+
+echo "::group::Patching DeltaSQLCommandTest to enable Gluten plugin"
+TARGET="$DELTA_DIR/spark/src/test/scala/org/apache/spark/sql/delta/test/DeltaSQLCommandTest.scala"
+if [ ! -f "$TARGET" ]; then
+  echo "Expected file not found in Delta clone: $TARGET" >&2
+  echo "The Delta directory layout for ref '${DELTA_REF}' may have changed."
+  exit 1
+fi
+cp "$PATCH_SOURCE" "$TARGET"
+echo "Patched $TARGET"
+echo "--- diff vs. upstream ---"
+git -C "$DELTA_DIR" --no-pager diff -- "spark/src/test/scala/org/apache/spark/sql/delta/test/DeltaSQLCommandTest.scala" || true
+echo "::endgroup::"
+
+# Delta's tests collect file-source scans by matching the concrete
+# `FileSourceScanExec` case class; Gluten offloads the scan to
+# DeltaScanTransformer, a `FileSourceScanLike` sibling, so those matches miss
+# (`scala.MatchError: List()`, empty partition filters, broken column-pruning /
+# scan-metric checks across many suites). delta-io/delta#7104 and #7105 widen the
+# matches to the shared `FileSourceScanLike` interface that both the vanilla and
+# Gluten scans implement (behavior-preserving for vanilla). Both are merged
+# upstream but land after the pinned DELTA_REF (v4.2.0), so apply them here; once
+# DELTA_REF includes a commit its cherry-pick is a clean no-op and the call can go.
+#
+# Depth-2 fetch brings each fix commit and its parent, which cherry-pick needs to
+# diff against (a depth-1 fetch grafts the parent away); `-n` stages the change
+# without requiring a committer identity.
+cherry_pick_delta_fix() {
+  local sha="$1" pr="$2"
+  echo "Cherry-picking delta-io/delta${pr}"
+  git -C "$DELTA_DIR" fetch --quiet --depth 2 origin "$sha"
+  git -C "$DELTA_DIR" cherry-pick -n "$sha"
+}

Review Comment:
   `git cherry-pick` is not a clean no-op when the target ref already contains 
(or has already backported) the change: it can fail with an "empty 
cherry-pick" and leave the repo in a CHERRY_PICK_HEAD state, which breaks setup 
when `delta_ref` is bumped. Add a guard that skips commits already present in 
the checked-out Delta ref (at minimum, when the SHA is an ancestor of HEAD).
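
One way the requested guard could look, sketched in Python only so the decision logic is easy to test. `setup-delta.sh` itself is bash, and the same `git merge-base --is-ancestor` call would slot into `cherry_pick_delta_fix` ahead of the fetch. The names `already_contains` and `fixes_to_apply`, and the injectable `run` parameter, are hypothetical and not part of the PR. The sketch also assumes the clone has enough history: in a `--depth 1` clone the ancestry link is likely truncated, so the check would report "not an ancestor" and the cherry-pick would still run.

```python
import subprocess


def already_contains(repo_dir, sha, run=subprocess.run):
    """Return True when `sha` is an ancestor of (or equal to) HEAD in repo_dir.

    `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B and
    non-zero otherwise (1 for "not an ancestor", other codes for errors such
    as an unknown commit). Anything but 0 is treated as "not present" here.
    """
    cmd = ["git", "-C", repo_dir, "merge-base", "--is-ancestor", sha, "HEAD"]
    # Hypothetical `run` hook: defaults to subprocess.run, swappable in tests.
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def fixes_to_apply(repo_dir, shas, run=subprocess.run):
    """Filter the fix SHAs down to those the checked-out ref still lacks."""
    return [sha for sha in shas if not already_contains(repo_dir, sha, run=run)]
```

Treating every non-zero exit as "not present" keeps the current behavior (attempt the cherry-pick) whenever ancestry cannot be established, so the guard only ever removes work.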



##########
.github/workflows/delta_spark_ut.yml:
##########
@@ -0,0 +1,672 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Runs Delta Lake's `spark` sbt module unit tests against a Gluten Velox bundle
+# that is built from the source in this repository. The pipeline:
+#
+#   1. Builds the Velox/Gluten native libraries (centos-7 + vcpkg, x86_64).
+#   2. Builds the Gluten Java/Scala jars and assembles the
+#      `gluten-velox-bundle-spark<spark>_<scala>-linux_amd64-<version>.jar`
+#      fat jar for Spark 4.1 + Scala 2.13 + Java 17 with the Delta profile.
+#   3. Clones delta-io/delta at the requested release tag (default `v4.2.0`),
+#      drops the bundle jar into `spark-unified/lib/` only (NOT `spark/lib/`
+#      -- see setup-delta.sh for the unmanagedJars scoping rationale),
+#      patches Delta's `DeltaSQLCommandTest` to register the Gluten plugin,
+#      and runs `sbt spark/test` sharded across the matrix.
+#
+# Limited to Velox + x86 to keep the matrix simple, per the pipeline's purpose
+# of validating Gluten changes against the latest Delta release.
+
+name: Delta Spark UT (Gluten)
+
+on:
+  # Reusable workflow. velox_backend_x86.yml calls this (gated on Delta-relevant
+  # changes) and passes the native-lib + arrow-jars artifacts it already built,
+  # so the expensive native C++ build is NOT duplicated. Those artifacts live in
+  # the CALLER's run (a called workflow runs as part of the caller run), so the
+  # jobs below download them by name. See velox_backend_x86.yml `delta-spark-ut`.
+  #
+  # NOTE: the `pull_request` trigger was removed so this no longer runs as its own
+  # workflow on PRs (which would double-run the Delta suite). velox_backend_x86.yml
+  # is now the single PR entry point; `workflow_dispatch` keeps manual standalone
+  # runs working (those build the native lib themselves -- see build-native-lib).
+  workflow_call:
+    inputs:
+      native_lib_artifact:
+        description: 'Name of the cpp/build artifact uploaded by the caller'
+        type: string
+        required: true
+      arrow_jars_artifact:
+        description: 'Name of the org.apache.arrow jars artifact uploaded by the caller'
+        type: string
+        required: true
+      delta_ref:
+        type: string
+        required: false
+        default: 'v4.2.0'
+      spark_version:
+        type: string
+        required: false
+        default: '4.1'
+      test_parallelism:
+        type: string
+        required: false
+        default: '4'
+      update_baseline:
+        type: boolean
+        required: false
+        default: false
+      fail_on_fixed:
+        type: boolean
+        required: false
+        default: true
+  workflow_dispatch:
+    inputs:
+      delta_ref:
+        description: 'delta-io/delta git ref (tag/branch/SHA) to test against'
+        required: true
+        default: 'v4.2.0'
+      spark_version:
+        description: 'Delta `-DsparkVersion` value (must match the Gluten -P profile below)'
+        required: true
+        default: '4.1'
+      test_parallelism:
+        description: 'Forked test JVMs per shard (TEST_PARALLELISM_COUNT)'
+        required: true
+        default: '4'
+      update_baseline:
+        description: 'Seed/refresh the known-failures baseline instead of enforcing it'
+        type: boolean
+        required: false
+        default: false
+      fail_on_fixed:
+        description: 'Fail when a baseline test now passes (keeps the baseline honest)'
+        type: boolean
+        required: false
+        default: true
+
+env:
+  ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
+  MVN_CMD: 'build/mvn -ntp'
+  CCACHE_DIR: "${{ github.workspace }}/.ccache"
+  # Gluten profile / bundle naming for the build-gluten-bundle and
+  # delta-spark-test jobs. Spark 4.1 + Scala 2.13 + JDK 17 matches Delta v4.2.0's
+  # default Spark version (4.1.0) from project/CrossSparkVersions.scala.
+  GLUTEN_SPARK_PROFILE: 'spark-4.1'
+  GLUTEN_SCALA_PROFILE: 'scala-2.13'
+  GLUTEN_JAVA_PROFILE: 'java-17'
+  GLUTEN_BUNDLE_SPARK_VERSION: '4.1'
+  GLUTEN_BUNDLE_SCALA_VERSION: '2.13'
+  DELTA_SCALA_VERSION: '2.13.16'
+  # Number of shards in the delta-spark-test matrix. Must equal the length of
+  # the `shard` matrix below.
+  #
+  # 4 shards x TEST_PARALLELISM_COUNT=4 gives ~16-way parallelism packed into 4
+  # runner jobs (4 forks each) rather than 16 single-fork jobs -- fewer concurrent
+  # runners for the same throughput. Sharding is by SUITE; total work
+  # (~1250 shard-minutes) is fixed. Each forked test JVM uses ~4G (2G heap + 2G
+  # off-heap), so 4 forks plus the sbt launcher sit close to the ~16G runner limit;
+  # this fits because the worst memory hog (DeletionVectorsSuite 2B-row) is
+  # force-failed in setup-delta.sh.
+  DELTA_NUM_SHARDS: '4'
+
+# No `concurrency:` here on purpose. As a reusable workflow this runs inside the
+# caller's run, where `github.workflow` resolves to the CALLER's name -- a group
+# keyed on it would collide with the caller's own group and, with
+# cancel-in-progress, could cancel the parent run. The caller's concurrency
+# already governs cancellation. (A standalone workflow_dispatch run just won't
+# auto-cancel, which is fine for infrequent manual runs.)
+
+jobs:
+  build-native-lib-centos-7:
+    # Standalone (workflow_dispatch) only. When called by velox_backend_x86.yml
+    # the caller already built the native lib + arrow jars and passes them as
+    # inputs, so this job is skipped and the duplicate native build is avoided.
+    if: github.event_name == 'workflow_dispatch'
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@v4
+      - name: Get Ccache
+        uses: actions/cache/restore@v4
+        with:
+          path: '${{ env.CCACHE_DIR }}'
+          key: ccache-delta-spark-ut-centos7-release-default-${{github.sha}}
+          restore-keys: |
+            ccache-delta-spark-ut-centos7-release-default
+            ccache-centos7-release-default
+      - name: Build Gluten native libraries
+        run: |
+          docker run -v $GITHUB_WORKSPACE:/work -w /work apache/gluten:vcpkg-centos-7-gcc13 bash -c "
+            set -e
+            yum install tzdata -y
+            df -a
+            cd /work
+            export CCACHE_DIR=/work/.ccache
+            export CCACHE_MAXSIZE=1G
+            mkdir -p /work/.ccache
+            ccache -sz
+            bash dev/ci-velox-buildstatic-centos-7.sh
+            ccache -s
+            mkdir -p /work/.m2/repository/org/apache/arrow/
+            cp -r /root/.m2/repository/org/apache/arrow/* /work/.m2/repository/org/apache/arrow/
+          "
+      - name: Save Ccache
+        if: always()
+        uses: actions/cache/save@v4
+        with:
+          path: '${{ env.CCACHE_DIR }}'
+          key: ccache-delta-spark-ut-centos7-release-default-${{github.sha}}
+      - uses: actions/upload-artifact@v4
+        with:
+          name: delta-spark-ut-native-lib-centos-7-${{github.sha}}
+          path: ./cpp/build/
+          if-no-files-found: error
+      - uses: actions/upload-artifact@v4
+        with:
+          name: delta-spark-ut-arrow-jars-centos-7-${{github.sha}}
+          path: .m2/repository/org/apache/arrow/
+          if-no-files-found: error
+
+  build-gluten-bundle:
+    needs: build-native-lib-centos-7
+    # Run whether the native lib was built here (dispatch -> success) or provided
+    # by the caller (workflow_call -> build-native-lib-centos-7 skipped).
+    if: ${{ always() && needs.build-native-lib-centos-7.result != 'failure' && needs.build-native-lib-centos-7.result != 'cancelled' }}
+    runs-on: ubuntu-22.04
+    container: apache/gluten:centos-9-jdk17
+    steps:
+      - uses: actions/checkout@v4
+      - name: Download native artifacts
+        uses: actions/download-artifact@v4
+        with:
+          name: ${{ inputs.native_lib_artifact || format('delta-spark-ut-native-lib-centos-7-{0}', github.sha) }}
+          path: ./cpp/build/
+      - name: Download Arrow jars
+        uses: actions/download-artifact@v4
+        with:
+          name: ${{ inputs.arrow_jars_artifact || format('delta-spark-ut-arrow-jars-centos-7-{0}', github.sha) }}
+          path: /root/.m2/repository/org/apache/arrow/
+      - name: Cache Maven repository
+        uses: actions/cache@v4
+        with:
+          path: /root/.m2/repository
+          key: m2-delta-spark-ut-bundle-${{ env.GLUTEN_SPARK_PROFILE }}-${{ env.GLUTEN_SCALA_PROFILE }}-${{ hashFiles('pom.xml', '**/pom.xml') }}
+          restore-keys: |
+            m2-delta-spark-ut-bundle-${{ env.GLUTEN_SPARK_PROFILE }}-${{ env.GLUTEN_SCALA_PROFILE }}-
+            m2-delta-spark-ut-bundle-

Review Comment:
   The Maven cache restore runs *after* downloading the custom Arrow jars. 
`actions/cache` restore can overwrite files under `/root/.m2/repository`, which 
risks replacing the caller-provided Arrow artifacts with whatever is in the 
cache (defeating the purpose of passing the Arrow jars artifact). Reorder these 
steps so the cache is restored first, then download Arrow jars afterwards to 
guarantee the custom jars win.



##########
.github/workflows/util/delta-spark-ut/compare-test-results.py:
##########
@@ -0,0 +1,467 @@
+#!/usr/bin/env python3
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Gate / seed / aggregate the Delta-on-Gluten unit test results.
+
+Running delta-io/delta's ScalaTest suite against the Gluten Velox bundle
+produces many *expected* failures (Gluten does not yet support every Delta
+code path). To keep the red/green signal meaningful while we fix those
+failures incrementally, we maintain a committed baseline of known failing
+tests (``known-failures.txt``) and compare each CI run against it.
+
+This script has three modes:
+
+``enforce`` (default, per shard)
+    Parse the JUnit XML produced by ``sbt spark/test`` (ScalaTest ``-u``
+    reporter) and compare against the baseline:
+
+      * regression -- a test that FAILED but is NOT in the baseline. These
+        fail the build: a previously-passing test just started failing.
+      * expected   -- a test that failed and IS in the baseline. Ignored.
+      * fixed      -- a baseline test that now PASSES. By default these also
+        fail the build (``--fail-on-fixed true``) so the baseline stays honest
+        and contributors remove entries as they fix them.
+
+    If the baseline is empty (not yet bootstrapped) the mode automatically
+    degrades to ``seed`` so the first run is never spuriously red.
+
+``seed`` (bootstrap / ``update_baseline``)
+    Never fails. Just writes the current shard's failing tests so the baseline
+    can be (re)generated from a real run.
+
+``aggregate`` (final job)
+    Merge every shard's ``--failures-out`` / ``--ran-out`` file into a single,
+    sorted, ready-to-commit ``known-failures.txt`` and report stale baseline
+    entries (tests no longer present in any shard).
+
+Baseline file format (``known-failures.txt``)::
+
+    # comment lines start with '#'
+    <fully.qualified.SuiteName>#<test display name>
+
+The suite is always a JVM class name (dot-separated, never starts with '#'),
+so a line whose first non-space character is '#' is unambiguously a comment,
+and the FIRST '#' after the suite separates suite from the (possibly
+'#'-containing) test name.
+
+Only the Python standard library is used so the script runs in the bare
+centos image used by the Delta UT pipeline with no ``pip install``.
+"""
+
+import argparse
+import glob
+import os
+import sys
+import xml.etree.ElementTree as ET
+
+# Synthetic "test name" recorded when a whole suite aborts (e.g. beforeAll
+# throws) so that the JUnit XML reports a suite-level error with no per-test
+# <testcase>. Without this, a suite that used to pass but now aborts entirely
+# would record zero failing testcases and the regression would be missed.
+SUITE_ABORTED = "<suite aborted>"
+
+SEP = "#"
+
+
+def eprint(*args, **kwargs):
+    print(*args, file=sys.stderr, **kwargs)
+
+
+# --------------------------------------------------------------------------- #
+# Baseline (known-failures.txt) parsing / formatting
+# --------------------------------------------------------------------------- #
+def format_entry(suite, test):
+    return "{}{}{}".format(suite, SEP, test)
+
+
+def parse_entry(line):
+    """Parse a 'suite#test' line into (suite, test) or return None for blanks/comments."""
+    stripped = line.strip()
+    if not stripped or stripped.startswith("#"):
+        return None
+    idx = stripped.find(SEP)
+    if idx < 0:
+        # No separator: treat the whole line as a suite-level entry.
+        return (stripped, SUITE_ABORTED)
+    return (stripped[:idx], stripped[idx + len(SEP) :])
+
+
+def load_entries(path):
+    """Load a set of (suite, test) tuples from a baseline/shard-list file."""
+    entries = set()
+    if not path or not os.path.exists(path):
+        return entries
+    with open(path, "r", encoding="utf-8") as fh:
+        for line in fh:
+            parsed = parse_entry(line)
+            if parsed is not None:
+                entries.add(parsed)
+    return entries
+
+
+def write_entries(path, entries, header=None):
+    """Write a sorted set of (suite, test) tuples to a file."""
+    os.makedirs(os.path.dirname(os.path.abspath(path)) or ".", exist_ok=True)
+    with open(path, "w", encoding="utf-8") as fh:
+        if header:
+            for hl in header.splitlines():
+                fh.write(hl.rstrip() + "\n")
+        for suite, test in sorted(entries):
+            # Defensive: collapse any stray newlines so each entry stays on one line.
+            safe_test = test.replace("\r", " ").replace("\n", " ")
+            fh.write(format_entry(suite, safe_test) + "\n")
+
+
+# --------------------------------------------------------------------------- #
+# JUnit XML parsing
+# --------------------------------------------------------------------------- #
+def _iter_testsuites(root):
+    """Yield every <testsuite> element regardless of whether the file root is
+    <testsuites> (wrapper) or a single <testsuite>."""
+    tag = root.tag.split("}")[-1]  # strip any namespace
+    if tag == "testsuites":
+        for child in root:
+            if child.tag.split("}")[-1] == "testsuite":
+                yield child
+    elif tag == "testsuite":
+        yield root
+
+
+def _child_local_tags(elem):
+    return {c.tag.split("}")[-1] for c in elem}
+
+
+def parse_reports(reports_dir):
+    """Walk reports_dir for JUnit XML and classify every test.
+
+    Returns (passed, failed, skipped) sets of (suite, test) tuples. A test is
+    'failed' if its <testcase> has a <failure> or <error> child, 'skipped' if
+    it has a <skipped> child, otherwise 'passed'. Suite-level aborts (a
+    <testsuite> reporting errors/failures with no failing <testcase>) are
+    recorded as a synthetic (suite, SUITE_ABORTED) failure.
+    """
+    passed, failed, skipped = set(), set(), set()
+
+    xml_files = []
+    # ScalaTest's -u reporter and Maven surefire both write `TEST-<suite>.xml`
+    # under a `target/.../*-reports/` dir. Restrict the secondary glob to
+    # `target/` so we never parse Delta's own XML *test resources* (which live
+    # under src/test/resources and are not reports). The <testsuite>-root guard
+    # below is a final safety net.
+    for pattern in ("**/TEST-*.xml", "**/target/**/*.xml"):
+        xml_files.extend(glob.glob(os.path.join(reports_dir, pattern), recursive=True))
+    xml_files = sorted(set(xml_files))
+
+    parsed_any = False
+    for xml_file in xml_files:
+        try:
+            tree = ET.parse(xml_file)
+        except ET.ParseError as exc:
+            eprint("WARNING: could not parse {}: {}".format(xml_file, exc))
+            continue
+        root = tree.getroot()
+        root_tag = root.tag.split("}")[-1]
+        if root_tag not in ("testsuites", "testsuite"):
+            continue  # not a JUnit report
+
+        for ts in _iter_testsuites(root):
+            parsed_any = True
+            suite_name = ts.get("name") or ""
+            suite_has_failing_tc = False
+            for tc in ts:
+                if tc.tag.split("}")[-1] != "testcase":
+                    continue
+                suite = tc.get("classname") or suite_name
+                name = tc.get("name") or ""
+                key = (suite, name)
+                tags = _child_local_tags(tc)
+                if "failure" in tags or "error" in tags:
+                    failed.add(key)
+                    suite_has_failing_tc = True
+                elif "skipped" in tags:
+                    skipped.add(key)
+                else:
+                    passed.add(key)
+
+            # Suite-level abort: counters say something failed but no testcase
+            # carried the failure (the suite blew up in beforeAll/constructor).
+            # Record a
+            # synthetic entry so the regression is visible.
+            try:
+                errors = int(ts.get("errors", "0") or "0")
+                failures = int(ts.get("failures", "0") or "0")
+            except ValueError:
+                errors = failures = 0
+            if (errors + failures) > 0 and not suite_has_failing_tc:
+                failed.add((suite_name, SUITE_ABORTED))
+
+    if not parsed_any:
+        eprint(
+            "WARNING: no JUnit <testsuite> elements found under {}".format(reports_dir)
+        )
+
+    # A test can't be both passed and failed; failure wins. Skipped only counts
+    # if the test was not otherwise seen (e.g. retried).
+    passed -= failed
+    skipped -= failed
+    skipped -= passed
+    return passed, failed, skipped
+
+
+# --------------------------------------------------------------------------- #
+# Reporting helpers
+# --------------------------------------------------------------------------- #
+def _summary_sink():
+    """Return a writer that mirrors to GITHUB_STEP_SUMMARY when available."""
+    path = os.environ.get("GITHUB_STEP_SUMMARY")
+    handle = open(path, "a", encoding="utf-8") if path else None
+
+    def write(line=""):
+        print(line)
+        if handle:
+            handle.write(line + "\n")
+
+    return write, handle
+
+
+def _print_block(write, title, entries, limit=50):
+    write("")
+    write("### {} ({})".format(title, len(entries)))
+    if not entries:
+        return
+    write("")
+    write("```")
+    for i, (suite, test) in enumerate(sorted(entries)):
+        if i >= limit:
+            write("... and {} more".format(len(entries) - limit))
+            break
+        write(format_entry(suite, test))
+    write("```")
+
+
+# --------------------------------------------------------------------------- #
+# Modes
+# --------------------------------------------------------------------------- #
+def run_enforce(args):
+    baseline = load_entries(args.known_failures)
+    passed, failed, skipped = parse_reports(args.reports_dir)
+
+    # Always emit this shard's artifacts for the aggregation job.
+    if args.failures_out:
+        write_entries(args.failures_out, failed)
+    if args.ran_out:
+        write_entries(args.ran_out, passed | failed)

Review Comment:
   The per-shard `ran-out` file excludes skipped tests (`passed | failed`). In 
aggregate mode, `stale = baseline - union_ran` will then incorrectly mark 
baseline tests as "stale" when they were actually executed and reported as 
skipped in JUnit XML. If you want stale to mean "suite/test no longer exists", 
consider emitting a skipped list too (or including skipped in ran-out and 
adjusting the aggregate "fixed" computation to avoid treating skipped as 
passing).
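
The second option could be sketched roughly as below, assuming the three disjoint sets that `parse_reports` returns. `classify_baseline` is a hypothetical helper, not a function in the PR, and the sample entries are made up for illustration.

```python
def classify_baseline(baseline, passed, failed, skipped):
    """Split baseline entries into (expected, fixed, stale).

    All arguments are sets of (suite, test) tuples; passed/failed/skipped are
    assumed disjoint, as parse_reports() returns them.
    """
    seen = passed | failed | skipped   # every test that appeared in a report
    expected = baseline & failed       # known failure that still fails
    fixed = baseline & passed          # only a real pass counts as fixed
    stale = baseline - seen            # absent from every report
    return expected, fixed, stale


# Illustrative data only: one passing, one failing, one skipped, one removed test.
baseline = {("a.S", "t1"), ("a.S", "t2"), ("a.S", "t3"), ("a.S", "gone")}
passed, failed, skipped = {("a.S", "t1")}, {("a.S", "t2")}, {("a.S", "t3")}

expected, fixed, stale = classify_baseline(baseline, passed, failed, skipped)
# With ran-out written as `passed | failed`, the skipped t3 also lands in stale.
stale_as_written = baseline - (passed | failed)
```

Keeping `fixed` tied to `passed` alone means a skipped baseline test is reported as neither fixed nor stale, which matches the "suite/test no longer exists" reading of stale.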



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

