This is an automated email from the ASF dual-hosted git repository.

zhengruifeng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 283949ccbb37 [SPARK-56744][INFRA] Document test base class hierarchy in AGENTS.md
283949ccbb37 is described below

commit 283949ccbb37bd961dbc47b38a6a0445397cabe0
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Wed May 13 19:21:47 2026 +0800

    [SPARK-56744][INFRA] Document test base class hierarchy in AGENTS.md
    
    ### What changes were proposed in this pull request?
    
    Add a `Scala Test Base Classes` section to `AGENTS.md` that documents the layered Scala test base hierarchy in this repo and how to pick a base class for a new test suite. Spark uses the `AnyFunSuite` ScalaTest style throughout, and the chain is:
    
        SparkFunSuite        (core)
          <- PlanTest        (sql/catalyst)
            <- QueryTest     (sql/core)
    
    `QueryTest` declares `spark: SparkSession` abstractly via `SparkSessionProvider`, so a concrete SQL test suite mixes in one of the session-providing traits:
    
        QueryTest                                (abstract `spark`)
          + SharedSparkSession (sql/core)        -> classic in-process `TestSparkSession`
          + TestHiveSingleton  (sql/hive)        -> Hive-backed `TestHive` session
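
A minimal sketch of why a session-providing trait has to be mixed in, using stand-in types rather than Spark's real classes. `SessionProviderLike`, `QueryTestLike`, `SharedSessionLike`, and `HiveSingletonLike` are hypothetical names, and a `String` stands in for `SparkSession`:

```scala
// Stand-in types for illustration only; the real traits live in Spark's test sources.
object SessionProviderSketch {
  trait SessionProviderLike {                    // models SparkSessionProvider
    def spark: String                            // abstract, like `spark: SparkSession`
  }

  abstract class QueryTestLike extends SessionProviderLike {
    // Helpers may use `spark` even though nothing in this class defines it.
    def describeSession: String = s"queries run against: $spark"
  }

  trait SharedSessionLike extends QueryTestLike { // models SharedSparkSession
    override def spark: String = "in-process test session"
  }

  trait HiveSingletonLike extends SessionProviderLike { // models TestHiveSingleton
    override def spark: String = "Hive-backed test session"
  }

  // Concrete suites become instantiable once a provider supplies `spark`.
  class CoreSuite extends QueryTestLike with SharedSessionLike
  class HiveSuite extends QueryTestLike with HiveSingletonLike

  // Rejected by the compiler: `spark` is never defined.
  // class Incomplete extends QueryTestLike

  def main(args: Array[String]): Unit = {
    println(new CoreSuite().describeSession)
    println(new HiveSuite().describeSession)
  }
}
```

In the real code base the same shape applies: the provider trait, not the suite, decides which kind of session the helpers see.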
    
    The new section also includes:
    - A decision table mapping test scope (plain JVM, Catalyst plans, SQL/DataFrame with a session) to the right base.
    - A session-provider table noting that `SharedSparkSession` itself extends `QueryTest` (so concrete suites just `extends SharedSparkSession`), while `TestHiveSingleton` is mixed in alongside `QueryTest`.
    - A linearization gotcha: the first item in an `extends` clause must transitively extend a class. Pure helper traits (`*ErrorsBase`, `*Helper`) cannot be put first.
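
The linearization gotcha can be reproduced with a small self-contained sketch. The types below (`SuiteBase`, `SessionSuite`, `ErrorsHelper`) are hypothetical stand-ins, not Spark classes:

```scala
// Stand-in types for illustration only; these are not Spark's real classes.
object LinearizationSketch {
  abstract class SuiteBase {               // plays the role of a base class such as SparkFunSuite
    def suiteKind: String = "base"
  }

  trait SessionSuite extends SuiteBase {   // a trait that transitively extends a class
    override def suiteKind: String = "session"
  }

  trait ErrorsHelper {                     // pure helper trait: no class in its ancestry
    def expectedError: String = "SOME_ERROR"
  }

  // Compiles: the first parent leads back to SuiteBase, and the helper follows it.
  class GoodSuite extends SessionSuite with ErrorsHelper

  // Does not compile: with the helper first, the superclass is taken from the helper
  // (plain AnyRef), which is not a subclass of SuiteBase as SessionSuite requires.
  // class BadSuite extends ErrorsHelper with SessionSuite

  def main(args: Array[String]): Unit = {
    val suite = new GoodSuite
    println(s"${suite.suiteKind} / ${suite.expectedError}")
  }
}
```

Uncommenting `BadSuite` should produce an illegal-inheritance compile error, which is the failure this bullet warns about.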
    
    `CLAUDE.md` is a symlink to `AGENTS.md`, so this change is picked up by both AI agent toolchains.
    
    ### Why are the changes needed?
    
    Picking the wrong test base class (e.g. extending `QueryTest` directly when a session is needed, or `SparkFunSuite` when `PlanTest` would do) is a common stumble when adding new Scala test suites. The information is currently spread across the source of `SparkFunSuite`, `PlanTest`, `QueryTest`, and the session-providing traits, with no single place that summarizes when to use which. Documenting it in `AGENTS.md` gives both contributors and AI coding agents a quick reference.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. Documentation-only change to a developer/agent guide file.
    
    ### How was this patch tested?
    
    N/A. Documentation-only change; no code or tests are affected.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude opus-4-7
    
    Closes #55707 from zhengruifeng/add-test-base-class-guide.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 AGENTS.md | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index 96f5b7917cae..28944c9d7810 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -20,6 +20,33 @@ Spark Connect protocol is defined in proto files under `sql/connect/common/src/m
 
 Avoid introducing non-ASCII characters in code or comments. String literals may contain non-ASCII when the content requires it (error messages, test data, etc.). Identifiers are ASCII by convention. The common failure mode is typographic characters (em-dash, smart quotes, ellipsis, non-breaking space) sneaking into comments; scalastyle flags some of these. Spot-check before committing: `grep -rn -P "[^\x00-\x7F]" <files>`.
 
+## Scala Test Base Classes
+
+When writing a new Scala test suite, pick the lowest base class that provides what the test actually needs. Spark uses the `AnyFunSuite` ScalaTest style throughout, so the bases below are the chain to choose from. Each adds capability on top of the previous:
+
+    SparkFunSuite        (core)
+      <- PlanTest        (sql/catalyst)
+        <- QueryTest     (sql/core)
+
+| Test scope | Base | Notes |
+|------------|------|-------|
+| Plain JVM/Scala, no Spark SQL | `SparkFunSuite` | `core` utilities, RDD, network, util classes, etc. Adds per-test timeout, `testRetry`, `gridTest`, thread audit, fixed timezone/locale, `withTempDir`, `withLogAppender`, `checkError`. |
+| Catalyst plan tests, no `SparkSession` | `PlanTest` | Adds `comparePlans`, `normalizePlan`, `normalizeExprIds`. For analyzer / optimizer / planner rule tests. |
+| SQL/DataFrame tests, needs a `SparkSession` | `QueryTest` | Adds `checkAnswer`, codegen-on/off helpers. `spark: SparkSession` is abstract and must be supplied by a session-providing trait (see below). |
+
+### Providing a `SparkSession` for `QueryTest`
+
+`QueryTest` declares `spark: SparkSession` abstractly via `SparkSessionProvider`, so it cannot be instantiated on its own. A concrete suite mixes in one of the session-providing traits below:
+
+    QueryTest                                (abstract `spark`)
+      + SharedSparkSession (sql/core)        -> classic in-process `TestSparkSession`
+      + TestHiveSingleton  (sql/hive)        -> Hive-backed `TestHive` session
+
+| Session provider | Module / location | Typical usage |
+|---|---|---|
+| `SharedSparkSession` | `sql/core` | Already extends `QueryTest` for historical reasons, but still mix in `QueryTest` explicitly, e.g. `class X extends QueryTest with SharedSparkSession`. Default for tests under `sql/core`. |
+| `TestHiveSingleton` | `sql/hive` | Mixed in alongside `QueryTest`, e.g. `class X extends QueryTest with TestHiveSingleton`. Used by tests under `sql/hive`. |
+
 ## Build and Test
 
 Build and tests can take a long time. If the user explicitly asked to run tests, run them. Otherwise (you are running tests on your own to verify a change), first ask the user if they have more changes to make.

