vaquar khan created SPARK-56040:
-----------------------------------

             Summary: Spark Automated Integrity Validation (AIV) Gate
                 Key: SPARK-56040
                 URL: https://issues.apache.org/jira/browse/SPARK-56040
             Project: Spark
          Issue Type: Improvement
          Components: Build, Project Infra
    Affects Versions: 4.1.1
            Reporter: vaquar khan


*Background / Motivation:* The open-source ecosystem is facing a surge in
low-quality, automated pull requests ("AI slop"). Given the volume of
contributions Spark handles, relying purely on PR template checkboxes
(soft controls) is becoming unsustainable for maintainers. Spark needs
a deterministic "hard control" to catch structurally flawed submissions before
they reach a human reviewer.
 
*Proposed Solution:* We propose adding an Automated Integrity Validation (AIV) 
Gate to Spark's CI pipeline. This tool will perform deterministic, AST-based 
build validation to catch the two most damaging categories of low-quality 
contributions:
 
- Scaffolding-heavy PRs with no real logic (boilerplate inflation) via Logic 
Density Ratio (LDR) validation.
- Code that violates Spark's specific architectural rules (domain-specific 
anti-patterns) via a declarative YAML-based design compliance checker.
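
The issue does not define how LDR is computed. As a minimal sketch, assuming LDR means the fraction of AST nodes that carry logic (control flow, calls, operators), the idea can be illustrated with Python's standard ast module on Python source. This stands in for the tree-sitter-scala parsing named in the proposal; the LOGIC_NODES set and the function name are hypothetical.

```python
import ast

# Hypothetical choice of node types that count as "logic"; the issue does not
# specify how LDR is computed, so this set is an assumption for illustration.
LOGIC_NODES = (
    ast.If, ast.For, ast.While, ast.Try, ast.With,
    ast.Call, ast.BinOp, ast.BoolOp, ast.Compare, ast.Return, ast.Raise,
)


def logic_density_ratio(source: str) -> float:
    """Return the fraction of AST nodes that are logic-bearing (0.0 for empty input)."""
    tree = ast.parse(source)
    # The Module wrapper is excluded so that empty input yields no nodes.
    nodes = [n for n in ast.walk(tree) if not isinstance(n, ast.Module)]
    if not nodes:
        return 0.0
    logic = sum(isinstance(n, LOGIC_NODES) for n in nodes)
    return logic / len(nodes)


boilerplate = "class A:\n    pass\n\nclass B:\n    pass\n"
real_logic = "def f(xs):\n    return sum(x * 2 for x in xs if x > 0)\n"

# Scaffolding-only code scores lower than code with real logic.
print(logic_density_ratio(boilerplate) < logic_density_ratio(real_logic))  # prints True
```

A real gate would compare the ratio for the changed lines of a PR against a calibrated threshold rather than comparing two snippets.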
 
*Technical Implementation*
 
- Written in Python, following existing precedents such as
dev/structured_logging_style.py.
- Uses tree-sitter-scala and jAST for source-level AST parsing.
- Runs entirely locally within the lint job of the existing
.github/workflows/build_and_test.yml, requiring no external dependencies or
APIs, to limit exposure to supply-chain attacks.
- Includes a GPG-signed /aiv skip bypass for committers, so that maintainers
are not blocked during release freezes.
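
The declarative design compliance checker mentioned under Proposed Solution could look roughly like the sketch below. The rule schema (id, forbid_call, message) and the two example rules are hypothetical and are not Spark's actual rules. A plain dict stands in for the YAML file so the sketch needs only the standard library, and Python's ast module again stands in for the Scala parser.

```python
import ast

# Hypothetical rule set. In the proposal these would live in a YAML file; a
# plain list of dicts is used here so the sketch runs with the standard library only.
RULES = [
    {"id": "no-print", "forbid_call": "print",
     "message": "use the logging framework instead of print"},
    {"id": "no-eval", "forbid_call": "eval",
     "message": "eval is not allowed"},
]


def check(source: str, rules=RULES):
    """Return (rule id, line number, message) for each forbidden call found."""
    forbidden = {r["forbid_call"]: r for r in rules}
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Only direct calls by bare name are matched, e.g. print(...), not obj.print(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            rule = forbidden.get(node.func.id)
            if rule is not None:
                violations.append((rule["id"], node.lineno, rule["message"]))
    return sorted(violations, key=lambda v: v[1])


sample = "x = 1\nprint(x)\ny = eval('x + 1')\n"
for rule_id, line, message in check(sample):
    print(f"line {line}: [{rule_id}] {message}")
```

Keeping the rules as data means a new anti-pattern can be added without touching the checker code, which is presumably the point of the declarative design.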
 
*Rollout Plan:* The plugin architecture is modular. We propose an initial
deployment in a non-blocking "Shadow Mode" to collect baseline data, calibrate
LDR thresholds, and ensure zero disruption to current contributor workflows.
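
A minimal sketch of how a non-blocking Shadow Mode might differ from an enforcing mode is shown below. The function name, the shadow_mode flag, and the threshold values are assumptions for illustration; the issue does not specify them.

```python
import sys


def gate_exit_code(ldr: float, threshold: float, shadow_mode: bool) -> int:
    """Return the CI exit code: 0 passes the job, 1 fails it."""
    if ldr >= threshold:
        return 0
    # Always report the finding so baseline data can be collected.
    print(f"AIV: LDR {ldr:.2f} is below threshold {threshold:.2f}", file=sys.stderr)
    # In shadow mode the finding is logged but never blocks the build.
    return 0 if shadow_mode else 1


# Placeholder values; real thresholds would come from the calibration phase.
print(gate_exit_code(ldr=0.10, threshold=0.30, shadow_mode=True))
print(gate_exit_code(ldr=0.10, threshold=0.30, shadow_mode=False))
```

With this shape, moving from Shadow Mode to enforcement is a single configuration change once the thresholds are calibrated.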
 
*References*
 
SPIP Document: 
https://docs.google.com/document/d/1-PCSq0PT_B45MbXVxkJ_E3GUHvK-8VV6WxQjKSGEh9o/edit?usp=sharing
 
dev@ Mailing List Discussion: 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

