This is an automated email from the ASF dual-hosted git repository.
liuneng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git
The following commit(s) were added to refs/heads/main by this push:
new d4c6fd82db [CORE] Add Bloop integration for faster Scala incremental compilation (#11645)
d4c6fd82db is described below
commit d4c6fd82dbaf32010c56f141f7a2ae25350f1a61
Author: LiuNeng <[email protected]>
AuthorDate: Thu Feb 26 10:11:28 2026 +0800
[CORE] Add Bloop integration for faster Scala incremental compilation (#11645)
Summary
This PR adds Bloop build server integration to accelerate Scala development
workflows. Bloop maintains a persistent JVM with warm Zinc compiler state,
dramatically reducing incremental compilation times.
Key Changes
- pom.xml: Add bloop-maven-plugin (v2.0.3) with version property, pluginManagement entry, and -Pbloop profile to skip style checks during config generation
- dev/bloop-setup.sh: Script to generate Bloop configuration with JVM test options injection (required for Java 17+) and -release option removal for compatibility
- dev/bloop-test.sh: Convenience wrapper with SPARK_ANSI_SQL_MODE=false default for Spark 4.x test compatibility
- docs/developers/bloop-integration.md: Comprehensive usage guide with setup instructions and troubleshooting
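
As a rough illustration of the -release option removal mentioned above, the following standalone sketch mirrors the filtering that dev/bloop-setup.sh performs on the generated option lists. The function name `strip_release_flag` is hypothetical, and handling the scalac (`-release`) and javac (`--release`) spellings in one pass is a simplification; the script processes the two lists separately.

```python
from typing import List


def strip_release_flag(opts: List[str], version: str = "21") -> List[str]:
    """Drop '-release <version>', '--release <version>' and '-release:<version>'.

    Hypothetical helper: a simplified, combined form of the two loops that
    dev/bloop-setup.sh runs over the scala and java option lists.
    """
    cleaned = []
    skip_next = False
    for i, opt in enumerate(opts):
        if skip_next:
            # This element is the version that followed a removed flag.
            skip_next = False
            continue
        is_flag = opt in ("-release", "--release")
        if is_flag and i + 1 < len(opts) and opts[i + 1] == version:
            skip_next = True
            continue
        if opt == f"-release:{version}":
            continue
        cleaned.append(opt)
    return cleaned


print(strip_release_flag(["-deprecation", "-release", "21", "-feature"]))
```

Options that target a different release are left untouched, so the filter only affects the one value the script treats as unsupported.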
Benchmark Results
Scenario                    Maven    Bloop    Speedup
gluten-core incremental     8.26s    0.23s    35.9x
gluten-core clean+compile   29.62s   14.90s   2.0x
Full project incremental    ~2 min   2.82s    40.1x
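
The speedup column appears to be the Maven time divided by the Bloop time. A minimal sketch that recomputes it for the two rows with exact timings (the helper name `speedup` is an assumption; the third row's Maven time is only approximate, so it is left out):

```python
# Timings in seconds, copied from the first two rows of the benchmark table.
timings = {
    "gluten-core incremental": (8.26, 0.23),
    "gluten-core clean+compile": (29.62, 14.90),
}


def speedup(maven_s: float, bloop_s: float) -> float:
    """Ratio of Maven time to Bloop time, rounded to one decimal place."""
    return round(maven_s / bloop_s, 1)


for scenario, (maven_s, bloop_s) in timings.items():
    print(f"{scenario}: {speedup(maven_s, bloop_s)}x")
```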
Usage
# Generate bloop config (first time / when changing profiles)
./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox
# Compile with watch mode (auto-recompile on save)
bloop compile gluten-core -w
# Run tests
bloop test backends-velox -o '*VeloxHashJoinSuite*'
# Or use the convenience wrapper
./dev/bloop-test.sh -pl backends-velox -s VeloxHashJoinSuite
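
The wrapper takes a Maven module path (`-pl`), while `bloop test` expects a project name. A minimal sketch of that translation, assuming a small excerpt of the mapping table in dev/bloop-test.sh; the function name `bloop_project` is hypothetical:

```python
# Excerpt of the module-path -> Bloop project-name table from dev/bloop-test.sh.
MODULE_MAP = {
    "gluten-ras/common": "gluten-ras-common",
    "shims/spark35": "spark-sql-columnar-shims-spark35",
    "gluten-ut/spark35": "gluten-ut-spark35",
}


def bloop_project(module: str) -> str:
    """Look up the project name; fall back to replacing '/' with '-'."""
    return MODULE_MAP.get(module) or module.replace("/", "-")


print(bloop_project("shims/spark35"))
print(bloop_project("gluten-ut/spark40"))
```

The fallback covers modules whose artifactId is simply the path with dashes; modules such as the shims need an explicit entry because their artifactId differs from the path.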
Test plan
- Verified bloop config generation with ./dev/bloop-setup.sh -Pjava-17,spark-4.1,scala-2.13,backends-velox
- Verified compilation: bloop compile gluten-core passes
- Verified tests: bloop test gluten-core (6 suites, 24 tests pass)
- Verified Spark 4.1 tests with ANSI mode disabled: SPARK_ANSI_SQL_MODE=false bloop test backends-velox -o '*VeloxHashJoinSuite*' (7 tests pass)
- Verified benchmark results show significant speedup
- Default Maven builds remain unaffected (bloop profile is opt-in)
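
One way to spot-check the generated configuration is to confirm that the `--add-opens` flags landed in each project file. This is a sketch only: it assumes the JSON layout that dev/bloop-setup.sh writes to (`project.platform.config.options` on JVM platforms), and both function names are hypothetical.

```python
import json
from pathlib import Path


def has_add_opens(config: dict) -> bool:
    """True if a Bloop project config carries at least one --add-opens option."""
    platform = config.get("project", {}).get("platform", {})
    if platform.get("name") != "jvm":
        return False
    options = platform.get("config", {}).get("options", [])
    return any(opt.startswith("--add-opens=") for opt in options)


def check_bloop_dir(bloop_dir: str = ".bloop") -> dict:
    """Map each config file name in bloop_dir to whether it has the flags."""
    return {
        path.name: has_add_opens(json.loads(path.read_text()))
        for path in sorted(Path(bloop_dir).glob("*.json"))
    }


sample = {
    "project": {
        "platform": {
            "name": "jvm",
            "config": {"options": ["--add-opens=java.base/java.lang=ALL-UNNAMED"]},
        }
    }
}
print(has_add_opens(sample))
```

Running `check_bloop_dir()` from the repository root after the setup script would list any project file that was skipped.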
🤖 Generated with Claude Code
---
dev/bloop-setup.sh | 326 +++++++++++++++++++++++++++++++++++
dev/bloop-test.sh | 289 +++++++++++++++++++++++++++++++
docs/developers/bloop-integration.md | 300 ++++++++++++++++++++++++++++++++
pom.xml | 24 +++
4 files changed, 939 insertions(+)
diff --git a/dev/bloop-setup.sh b/dev/bloop-setup.sh
new file mode 100755
index 0000000000..e633e2453b
--- /dev/null
+++ b/dev/bloop-setup.sh
@@ -0,0 +1,326 @@
+#!/bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# =============================================================================
+# bloop-setup.sh - Generate Bloop Configuration for Gluten
+# =============================================================================
+#
+# This script generates Bloop configuration files for fast incremental Scala
+# compilation. Bloop maintains a persistent build server that:
+# - Keeps the JVM warm (no startup overhead)
+# - Maintains Zinc incremental compiler state
+# - Enables watch mode for automatic recompilation
+#
+# Usage:
+# ./dev/bloop-setup.sh -P<profiles>
+#
+# Examples:
+# # Velox backend with Spark 3.5
+# ./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox
+#
+# # Velox backend with Spark 4.0 and unit tests
+# ./dev/bloop-setup.sh -Pjava-17,spark-4.0,scala-2.13,backends-velox,spark-ut
+#
+# # ClickHouse backend
+# ./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-clickhouse
+#
+# After running this script:
+# - Use 'bloop projects' to list available projects
+# - Use 'bloop compile <project>' to compile
+# - Use 'bloop compile <project> -w' for watch mode
+# - Use 'bloop test <project>' to run tests
+#
+# =============================================================================
+
+set -e
+
+# Get script directory and project root
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+GLUTEN_HOME="$(cd "${SCRIPT_DIR}/.." && pwd)"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m'
+
+# =============================================================================
+# Functions
+# =============================================================================
+
+print_usage() {
+ cat << EOF
+Usage: $0 -P<profiles>
+
+Generate Bloop configuration for the specified Maven profiles.
+
+Required:
+ -P<profiles> Maven profiles (e.g., -Pspark-3.5,scala-2.12,backends-velox)
+
+Optional:
+ --help Show this help message
+
+Examples:
+ # Velox backend with Spark 3.5
+ $0 -Pspark-3.5,scala-2.12,backends-velox
+
+ # Velox backend with Spark 4.0 (requires JDK 17)
+ $0 -Pjava-17,spark-4.0,scala-2.13,backends-velox,spark-ut
+
+ # ClickHouse backend
+ $0 -Pspark-3.5,scala-2.12,backends-clickhouse
+
+Common Profile Combinations:
+ Spark 3.5 + Velox: -Pspark-3.5,scala-2.12,backends-velox
+ Spark 4.0 + Velox: -Pjava-17,spark-4.0,scala-2.13,backends-velox
+ Spark 4.1 + Velox: -Pjava-17,spark-4.1,scala-2.13,backends-velox
+ With unit tests: Add ,spark-ut to any profile
+ With Delta: Add ,delta to any profile
+ With Iceberg: Add ,iceberg to any profile
+EOF
+}
+
+log_info() {
+ echo -e "${GREEN}[INFO]${NC} $1"
+}
+
+log_warn() {
+ echo -e "${YELLOW}[WARN]${NC} $1"
+}
+
+log_error() {
+ echo -e "${RED}[ERROR]${NC} $1"
+}
+
+log_step() {
+ echo -e "${BLUE}[STEP]${NC} $1"
+}
+
+# =============================================================================
+# Parse Arguments
+# =============================================================================
+
+PROFILES=""
+
+while [[ $# -gt 0 ]]; do
+ case $1 in
+ -P*)
+ PROFILES="${1#-P}"
+ shift
+ ;;
+ --help)
+ print_usage
+ exit 0
+ ;;
+ *)
+ log_error "Unknown argument: $1"
+ print_usage
+ exit 1
+ ;;
+ esac
+done
+
+# Validate required arguments
+if [[ -z "$PROFILES" ]]; then
+ log_error "Missing required argument: -P<profiles>"
+ print_usage
+ exit 1
+fi
+
+# =============================================================================
+# Check Prerequisites
+# =============================================================================
+
+log_step "Checking prerequisites..."
+
+# Check if bloop CLI is installed
+if ! command -v bloop &> /dev/null; then
+ log_warn "Bloop CLI not found in PATH"
+ echo ""
+ echo "Install bloop CLI using one of these methods:"
+ echo ""
+ echo " # Using coursier (recommended)"
+ echo " cs install bloop"
+ echo ""
+ echo " # Using Homebrew (macOS)"
+ echo " brew install scalacenter/bloop/bloop"
+ echo ""
+ echo " # Using apt (Debian/Ubuntu)"
+ echo " echo 'deb https://dl.bintray.com/scalacenter/releases /' | sudo tee /etc/apt/sources.list.d/sbt.list"
+ echo " sudo apt-get update && sudo apt-get install bloop"
+ echo ""
+ echo "See https://scalacenter.github.io/bloop/setup for more options."
+ echo ""
+ log_info "Continuing with bloopInstall (you'll need bloop CLI to use the generated config)..."
+fi
+
+# =============================================================================
+# Generate Bloop Configuration
+# =============================================================================
+
+log_step "Generating Bloop configuration for profiles: ${PROFILES}"
+log_info "This may take several minutes on first run (full Maven dependency resolution)..."
+
+cd "${GLUTEN_HOME}"
+
+# Remove existing bloop config to ensure clean generation
+if [[ -d ".bloop" ]]; then
+ log_info "Removing existing .bloop directory..."
+ rm -rf .bloop
+fi
+
+# Run generate-sources + bloopInstall in single Maven invocation
+# This ensures:
+# 1. Protobuf and other code generators run first (generate-sources phase)
+# 2. bloop-maven-plugin picks up the generated source directories
+# Skip style checks for faster generation
+log_step "Running generate-sources + bloopInstall..."
+./build/mvn generate-sources bloop:bloopInstall \
+ -P"${PROFILES}",bloop \
+ -DskipTests
+
+if [[ ! -d ".bloop" ]]; then
+ log_error "Bloop configuration generation failed - .bloop directory not created"
+ exit 1
+fi
+
+# Count generated projects
+PROJECT_COUNT=$(ls -1 .bloop/*.json 2>/dev/null | wc -l)
+
+log_info "Successfully generated ${PROJECT_COUNT} project configurations in .bloop/"
+
+# =============================================================================
+# Inject JVM Options for Tests
+# =============================================================================
+# Bloop-maven-plugin doesn't automatically pick up Maven's argLine/extraJavaTestArgs.
+# We need to inject the --add-opens flags required for Spark tests on Java 17+.
+
+log_step "Injecting JVM test options into bloop configurations..."
+
+# JVM options required for Spark tests (matches extraJavaTestArgs in pom.xml)
+JVM_OPTIONS='[
+ "-XX:+IgnoreUnrecognizedVMOptions",
+ "--add-opens=java.base/java.lang=ALL-UNNAMED",
+ "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
+ "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
+ "--add-opens=java.base/java.io=ALL-UNNAMED",
+ "--add-opens=java.base/java.net=ALL-UNNAMED",
+ "--add-opens=java.base/java.nio=ALL-UNNAMED",
+ "--add-opens=java.base/java.util=ALL-UNNAMED",
+ "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
+ "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
+ "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED",
+ "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
+ "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED",
+ "--add-opens=java.base/sun.security.action=ALL-UNNAMED",
+ "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
+ "-Djdk.reflect.useDirectMethodHandle=false",
+ "-Dio.netty.tryReflectionSetAccessible=true"
+]'
+
+# Inject JVM options into each bloop config file
+INJECTED_COUNT=0
+for config_file in .bloop/*.json; do
+ if [[ -f "$config_file" ]]; then
+ # Use python to update the JSON (available on most systems)
+ python3 -c "
+import json
+import sys
+
+with open('$config_file', 'r') as f:
+    config = json.load(f)
+
+if 'project' in config:
+    project = config['project']
+
+    # Set JVM options in platform.config.options
+    if 'platform' in project:
+        platform = project['platform']
+        if platform.get('name') == 'jvm' and 'config' in platform:
+            platform['config']['options'] = $JVM_OPTIONS
+
+    # Fix scala options: remove unsupported -release 21 (separate args)
+    if 'scala' in project and 'options' in project['scala']:
+        opts = project['scala']['options']
+        new_opts = []
+        skip_next = False
+        for i, opt in enumerate(opts):
+            if skip_next:
+                skip_next = False
+                continue
+            if opt == '-release' and i+1 < len(opts) and opts[i+1] == '21':
+                skip_next = True
+                continue
+            # Also remove combined form
+            if opt == '-release:21':
+                continue
+            new_opts.append(opt)
+        project['scala']['options'] = new_opts
+
+    # Fix java options: remove --release 21 (separate args)
+    if 'java' in project and 'options' in project['java']:
+        opts = project['java']['options']
+        new_opts = []
+        skip_next = False
+        for i, opt in enumerate(opts):
+            if skip_next:
+                skip_next = False
+                continue
+            if opt == '--release' and i+1 < len(opts) and opts[i+1] == '21':
+                skip_next = True
+                continue
+            new_opts.append(opt)
+        project['java']['options'] = new_opts
+
+with open('$config_file', 'w') as f:
+    json.dump(config, f, indent=2)
+" 2>/dev/null && INJECTED_COUNT=$((INJECTED_COUNT + 1))
+ fi
+done
+
+log_info "Injected JVM options into ${INJECTED_COUNT} configurations"
+
+# =============================================================================
+# Print Usage Instructions
+# =============================================================================
+
+echo ""
+echo -e "${GREEN}=========================================="
+echo " Bloop Configuration Complete"
+echo -e "==========================================${NC}"
+echo ""
+echo "Quick Start:"
+echo " bloop projects # List all projects"
+echo " bloop compile gluten-core # Compile a project"
+echo " bloop compile gluten-core -w # Watch mode (auto-recompile)"
+echo " bloop test gluten-core # Run tests"
+echo ""
+echo "Tips:"
+echo " - Use 'bloop compile -w <project>' for instant feedback during development"
+echo " - Bloop keeps a warm JVM, so subsequent compiles are much faster"
+echo " - Project names match Maven artifactIds (e.g., gluten-core, backends-velox)"
+echo ""
+echo "Environment Variables:"
+echo " - SPARK_ANSI_SQL_MODE=false Required for Spark 4.x tests (set by bloop-test.sh)"
+echo " - JAVA_HOME Set to JDK 21 path if bloop uses wrong JDK version"
+echo ""
+echo "Unit Test Examples:"
+echo " bloop test gluten-ut-spark35 -o '*GlutenSuite*' # Run matching suites"
+echo " bloop test gluten-ut-spark40 -o SomeSuite # Run specific suite"
+echo ""
+echo -e "${YELLOW}Note:${NC} Re-run this script when changing Maven profiles."
+echo ""
diff --git a/dev/bloop-test.sh b/dev/bloop-test.sh
new file mode 100755
index 0000000000..7af2562010
--- /dev/null
+++ b/dev/bloop-test.sh
@@ -0,0 +1,289 @@
+#!/bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# =============================================================================
+# bloop-test.sh - Run Scala Tests via Bloop
+# =============================================================================
+#
+# A convenience wrapper matching the run-scala-test.sh interface for running
+# tests via Bloop instead of Maven. Requires bloop configuration to be
+# generated first via bloop-setup.sh.
+#
+# Usage:
+# ./dev/bloop-test.sh -pl <module> -s <suite> [-t "test name"]
+#
+# Examples:
+# # Run entire suite
+# ./dev/bloop-test.sh -pl gluten-ut/spark35 -s GlutenSQLQuerySuite
+#
+# # Run specific test method
+# ./dev/bloop-test.sh -pl gluten-ut/spark35 -s GlutenSQLQuerySuite -t "test method"
+#
+# # Run with wildcard pattern
+# ./dev/bloop-test.sh -pl gluten-ut/spark40 -s '*Aggregate*'
+#
+# Prerequisites:
+# 1. Install bloop CLI: cs install bloop
+# 2. Generate config: ./dev/bloop-setup.sh -P<profiles>
+#
+# =============================================================================
+
+set -e
+
+# Get script directory and project root
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+GLUTEN_HOME="$(cd "${SCRIPT_DIR}/.." && pwd)"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m'
+
+# =============================================================================
+# Module Mapping: Maven module path -> Bloop project name
+# =============================================================================
+# Bloop uses Maven artifactIds as project names.
+# This maps the -pl module path format to the corresponding bloop project name.
+
+declare -A MODULE_MAP=(
+ # Core modules
+ ["gluten-core"]="gluten-core"
+ ["gluten-substrait"]="gluten-substrait"
+ ["gluten-ui"]="gluten-ui"
+ ["gluten-arrow"]="gluten-arrow"
+
+ # Backend modules
+ ["backends-velox"]="backends-velox"
+ ["backends-clickhouse"]="backends-clickhouse"
+
+ # RAS modules
+ ["gluten-ras/common"]="gluten-ras-common"
+ ["gluten-ras/planner"]="gluten-ras-planner"
+
+ # Shims modules
+ ["shims/common"]="spark-sql-columnar-shims-common"
+ ["shims/spark32"]="spark-sql-columnar-shims-spark32"
+ ["shims/spark33"]="spark-sql-columnar-shims-spark33"
+ ["shims/spark34"]="spark-sql-columnar-shims-spark34"
+ ["shims/spark35"]="spark-sql-columnar-shims-spark35"
+ ["shims/spark40"]="spark-sql-columnar-shims-spark40"
+ ["shims/spark41"]="spark-sql-columnar-shims-spark41"
+
+ # Unit test modules
+ ["gluten-ut/common"]="gluten-ut-common"
+ ["gluten-ut/test"]="gluten-ut-test"
+ ["gluten-ut/spark32"]="gluten-ut-spark32"
+ ["gluten-ut/spark33"]="gluten-ut-spark33"
+ ["gluten-ut/spark34"]="gluten-ut-spark34"
+ ["gluten-ut/spark35"]="gluten-ut-spark35"
+ ["gluten-ut/spark40"]="gluten-ut-spark40"
+ ["gluten-ut/spark41"]="gluten-ut-spark41"
+
+ # Data lake modules
+ ["gluten-delta"]="gluten-delta"
+ ["gluten-iceberg"]="gluten-iceberg"
+ ["gluten-hudi"]="gluten-hudi"
+ ["gluten-paimon"]="gluten-paimon"
+
+ # Shuffle modules
+ ["gluten-celeborn"]="gluten-celeborn"
+ ["gluten-uniffle"]="gluten-uniffle"
+
+ # Other modules
+ ["gluten-kafka"]="gluten-kafka"
+)
+
+# =============================================================================
+# Functions
+# =============================================================================
+
+print_usage() {
+ cat << EOF
+Usage: $0 -pl <module> -s <suite> [-t "test name"]
+
+Run Scala tests via Bloop (faster than Maven).
+
+Required:
+ -pl <module> Target module (e.g., gluten-ut/spark40)
+ -s <suite> Suite class name (can be short name or fully qualified)
+
+Optional:
+ -t "test name" Specific test method name to run
+ --help Show this help message
+
+Examples:
+ # Run entire suite
+ $0 -pl gluten-ut/spark35 -s GlutenSQLQuerySuite
+
+ # Run specific test method
+ $0 -pl gluten-ut/spark35 -s GlutenSQLQuerySuite -t "test method name"
+
+ # Run suites matching pattern
+ $0 -pl gluten-ut/spark40 -s '*Aggregate*'
+
+Prerequisites:
+ 1. Install bloop CLI: cs install bloop
+ 2. Generate config: ./dev/bloop-setup.sh -P<your-profiles>
+EOF
+}
+
+log_info() {
+ echo -e "${GREEN}[INFO]${NC} $1"
+}
+
+log_warn() {
+ echo -e "${YELLOW}[WARN]${NC} $1"
+}
+
+log_error() {
+ echo -e "${RED}[ERROR]${NC} $1"
+}
+
+log_step() {
+ echo -e "${BLUE}[STEP]${NC} $1"
+}
+
+# Get bloop project name from module path
+get_bloop_project() {
+ local module="$1"
+ local project="${MODULE_MAP[$module]}"
+
+ if [[ -z "$project" ]]; then
+ # Try using the module path as-is (may work for simple cases)
+ # Replace slashes with dashes for bloop naming convention
+ project=$(echo "$module" | tr '/' '-')
+ fi
+
+ echo "$project"
+}
+
+# =============================================================================
+# Parse Arguments
+# =============================================================================
+
+MODULE=""
+SUITE=""
+TEST_METHOD=""
+
+while [[ $# -gt 0 ]]; do
+ case $1 in
+ -pl)
+ MODULE="$2"
+ shift 2
+ ;;
+ -s)
+ SUITE="$2"
+ shift 2
+ ;;
+ -t)
+ TEST_METHOD="$2"
+ shift 2
+ ;;
+ --help)
+ print_usage
+ exit 0
+ ;;
+ *)
+ log_error "Unknown argument: $1"
+ print_usage
+ exit 1
+ ;;
+ esac
+done
+
+# Validate required arguments
+if [[ -z "$MODULE" ]]; then
+ log_error "Missing required argument: -pl <module>"
+ print_usage
+ exit 1
+fi
+
+if [[ -z "$SUITE" ]]; then
+ log_error "Missing required argument: -s <suite>"
+ print_usage
+ exit 1
+fi
+
+# =============================================================================
+# Check Prerequisites
+# =============================================================================
+
+cd "${GLUTEN_HOME}"
+
+# Check for bloop CLI
+if ! command -v bloop &> /dev/null; then
+ log_error "Bloop CLI not found. Install with: cs install bloop"
+ exit 1
+fi
+
+# Check for .bloop directory
+if [[ ! -d ".bloop" ]]; then
+ log_error "Bloop configuration not found (.bloop directory missing)"
+ echo ""
+ echo "Generate configuration first:"
+ echo " ./dev/bloop-setup.sh -P<your-profiles>"
+ echo ""
+ echo "Example:"
+ echo " ./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox"
+ exit 1
+fi
+
+# =============================================================================
+# Run Tests
+# =============================================================================
+
+PROJECT=$(get_bloop_project "$MODULE")
+
+# Verify project exists
+if [[ ! -f ".bloop/${PROJECT}.json" ]]; then
+ log_error "Bloop project '${PROJECT}' not found"
+ echo ""
+ echo "Available projects:"
+ bloop projects 2>/dev/null | head -20
+ echo ""
+ echo "If your project is not listed, regenerate config with the correct profiles:"
+ echo " ./dev/bloop-setup.sh -P<profiles>"
+ exit 1
+fi
+
+log_step "Running tests via Bloop"
+log_info "Project: ${PROJECT}"
+log_info "Suite: ${SUITE}"
+[[ -n "$TEST_METHOD" ]] && log_info "Test method: ${TEST_METHOD}"
+
+echo ""
+echo "=========================================="
+echo "Running Tests via Bloop"
+echo "=========================================="
+echo ""
+
+# Build test command
+BLOOP_ARGS=("test" "$PROJECT" "-o" "$SUITE")
+
+# Add test method filter if specified
+# Note: arguments after "--" go to the test framework; ScalaTest's -t selects a test by its full name
+if [[ -n "$TEST_METHOD" ]]; then
+ BLOOP_ARGS+=("--" "-t" "$TEST_METHOD")
+fi
+
+# Set SPARK_ANSI_SQL_MODE=false by default for Gluten tests
+# Spark 4.x enables ANSI mode by default, which is incompatible with some Gluten features
+export SPARK_ANSI_SQL_MODE="${SPARK_ANSI_SQL_MODE:-false}"
+
+# Execute tests
+bloop "${BLOOP_ARGS[@]}"
diff --git a/docs/developers/bloop-integration.md b/docs/developers/bloop-integration.md
new file mode 100644
index 0000000000..2364b8f596
--- /dev/null
+++ b/docs/developers/bloop-integration.md
@@ -0,0 +1,300 @@
+# Bloop Integration for Faster Scala Builds
+
+Bloop is a build server for Scala that dramatically accelerates incremental compilation by maintaining a persistent JVM with warm compiler state. For Gluten development, this eliminates the ~52s Zinc analysis loading overhead that occurs with every Maven build.
+
+## Benefits
+
+- **Persistent incremental compilation**: Bloop keeps Zinc's incremental compiler state warm
+- **Watch mode**: Automatic recompilation when files change (`bloop compile -w`)
+- **Fast test iterations**: Skip Maven overhead for repeated test runs
+- **IDE integration**: Metals/VS Code can use Bloop for builds
+
+## Prerequisites
+
+### Install Bloop CLI
+
+Choose one of these installation methods:
+
+```bash
+# Using Coursier (recommended)
+cs install bloop
+
+# Using Homebrew (macOS)
+brew install scalacenter/bloop/bloop
+
+# Using SDKMAN
+sdk install bloop
+
+# Manual installation
+# See https://scalacenter.github.io/bloop/setup
+```
+
+Verify installation:
+```bash
+bloop --version
+```
+
+## Setup
+
+### Generate Bloop Configuration
+
+Run the setup script with your desired Maven profiles:
+
+```bash
+# Velox backend with Spark 3.5
+./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox
+
+# Velox backend with Spark 4.0 (requires JDK 17)
+./dev/bloop-setup.sh -Pjava-17,spark-4.0,scala-2.13,backends-velox,spark-ut
+
+# ClickHouse backend
+./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-clickhouse
+
+# With optional modules
+./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox,delta,iceberg
+```
+
+This generates a `.bloop/` directory with a JSON configuration file for each Maven module.
+
+### Using the Maven Profile Directly
+
+The `-Pbloop` profile automatically skips style checks during configuration generation. You can use it directly with Maven:
+
+```bash
+# These are equivalent:
+./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox
+
+# Manual invocation with profile
+./build/mvn generate-sources bloop:bloopInstall -Pspark-3.5,scala-2.12,backends-velox,bloop -DskipTests
+```
+
+The bloop profile sets these properties automatically:
+- `spotless.check.skip=true`
+- `scalastyle.skip=true`
+- `checkstyle.skip=true`
+- `maven.gitcommitid.skip=true`
+- `remoteresources.skip=true`
+
+**Note:** The setup script also injects JVM options (e.g., `--add-opens` flags) required for Spark tests on Java 17+. If you run `bloop:bloopInstall` manually without the script, tests may fail with `IllegalAccessError`. Use the setup script to ensure proper configuration.
+
+### Common Profile Combinations
+
+| Use Case | Profiles |
+|----------|----------|
+| Spark 3.5 + Velox | `-Pspark-3.5,scala-2.12,backends-velox` |
+| Spark 4.0 + Velox | `-Pjava-17,spark-4.0,scala-2.13,backends-velox` |
+| Spark 4.1 + Velox | `-Pjava-17,spark-4.1,scala-2.13,backends-velox` |
+| With unit tests | Add `,spark-ut` to any profile |
+| ClickHouse backend | Replace `backends-velox` with `backends-clickhouse` |
+| With Delta Lake | Add `,delta` to any profile |
+| With Iceberg | Add `,iceberg` to any profile |
+
+## Usage
+
+### Basic Commands
+
+```bash
+# List all projects
+bloop projects
+
+# Compile a project
+bloop compile gluten-core
+
+# Compile with watch mode (auto-recompile on changes)
+bloop compile gluten-core -w
+
+# Compile a project and all projects that depend on it
+bloop compile --cascade gluten-core
+
+# Run tests
+bloop test gluten-core
+
+# Run specific test suite
+bloop test gluten-ut-spark35 -o GlutenSQLQuerySuite
+
+# Run tests matching pattern
+bloop test gluten-ut-spark35 -o '*Aggregate*'
+```
+
+### Running Tests
+
+Use the convenience wrapper, which matches the `run-scala-test.sh` interface:
+
+```bash
+# Run entire suite
+./dev/bloop-test.sh -pl gluten-ut/spark35 -s GlutenSQLQuerySuite
+
+# Run specific test method
+./dev/bloop-test.sh -pl gluten-ut/spark35 -s GlutenSQLQuerySuite -t "test method name"
+
+# Run with wildcard pattern
+./dev/bloop-test.sh -pl gluten-ut/spark40 -s '*Aggregate*'
+```
+
+### Environment Variables
+
+When running tests with bloop directly (not via `bloop-test.sh`), set these environment variables:
+
+```bash
+# Required for Spark 4.x tests - disables ANSI mode which is incompatible with some Gluten features
+export SPARK_ANSI_SQL_MODE=false
+
+# If bloop uses wrong JDK version, set JAVA_HOME before starting bloop server
+export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64
+bloop exit && bloop about # Restart server with new JDK
+
+# Then run tests
+bloop test backends-velox -o '*VeloxHashJoinSuite*'
+```
+
+**Note:** The `bloop-test.sh` wrapper automatically sets `SPARK_ANSI_SQL_MODE=false`.
+
+### Watch Mode for Rapid Development
+
+Watch mode is ideal for iterative development:
+
+```bash
+# Terminal 1: Start watch mode for your module
+bloop compile gluten-core -w
+
+# Terminal 2: Edit files and see instant compilation feedback
+# Errors appear immediately as you save files
+```
+
+## Comparison: Bloop vs Maven
+
+| Aspect | Maven | Bloop |
+|--------|-------|-------|
+| First compilation | Baseline | Same (full build needed) |
+| Incremental compilation | ~52s+ (Zinc reload) | <5s (warm JVM) |
+| Watch mode | Not supported | Native support |
+| Test execution | Full Maven lifecycle | Direct execution |
+| IDE integration | Limited | Metals/VS Code native |
+| Profile switching | Edit command | Re-run setup script |
+
+### When to Use Each
+
+**Use Bloop when:**
+- Rapid iteration during development
+- Running tests repeatedly
+- Want instant feedback on changes
+- Using Metals/VS Code
+
+**Use Maven when:**
+- CI/CD builds
+- Full release builds
+- First-time setup
+- Switching between profile combinations
+- Need Maven-specific plugins
+
+## IDE Integration
+
+### VS Code with Metals
+
+1. Install Metals extension in VS Code
+2. Generate bloop configuration: `./dev/bloop-setup.sh -P<profiles>`
+3. Open the project folder in VS Code
+4. Metals will detect `.bloop/` and use it for builds
+
+### IntelliJ IDEA
+
+IntelliJ uses its own incremental compiler by default. However, you can:
+1. Use the terminal for bloop commands
+2. Configure IntelliJ to use BSP (Build Server Protocol) with bloop
+
+## Troubleshooting
+
+### "Bloop project not found"
+
+```
+Error: Bloop project 'gluten-ut-spark35' not found
+```
+
+The project wasn't included in the generated configuration. Regenerate with the correct profiles:
+
+```bash
+# Make sure to include the spark-ut profile for test modules
+./dev/bloop-setup.sh -Pspark-3.5,scala-2.12,backends-velox,spark-ut
+```
+
+### "Bloop CLI not found"
+
+```
+Error: Bloop CLI not found. Install with: cs install bloop
+```
+
+Install the bloop CLI:
+```bash
+# Using Coursier
+cs install bloop
+
+# Or check if it's in your PATH
+which bloop
+```
+
+### Configuration Out of Sync
+
+If compilation fails with unexpected errors, regenerate the configuration:
+
+```bash
+# Remove old config
+rm -rf .bloop
+
+# Regenerate
+./dev/bloop-setup.sh -P<your-profiles>
+```
+
+### Bloop Server Issues
+
+```bash
+# Restart bloop server
+bloop exit
+bloop about # This starts a new server
+
+# Or kill all bloop processes
+pkill -f bloop
+```
+
+### Profile Mismatch
+
+Remember that bloop configuration is generated for a specific set of Maven profiles. If you need to switch profiles:
+
+```bash
+# Switching from Spark 3.5 to Spark 4.0
+./dev/bloop-setup.sh -Pjava-17,spark-4.0,scala-2.13,backends-velox,spark-ut
+```
+
+## Advanced Usage
+
+### Parallel Compilation
+
+Bloop automatically uses parallel compilation. Control with:
+
+```bash
+# Limit parallelism
+bloop compile gluten-core --parallelism 4
+```
+
+### Clean Build
+
+```bash
+# Clean specific project
+bloop clean gluten-core
+
+# Clean all projects
+bloop clean
+```
+
+### Dependency Graph
+
+```bash
+# Show project dependencies
+bloop projects --dot-graph | dot -Tpng -o deps.png
+```
+
+## Notes
+
+- **Configuration is not committed**: `.bloop/` is in `.gitignore` by design
+- **Profile-specific**: Must regenerate when changing Maven profiles
+- **Complements Maven**: Bloop accelerates development; Maven remains for CI/production builds
+- **First run is slow**: Initial `bloopInstall` does full Maven resolution
diff --git a/pom.xml b/pom.xml
index 48e6415ab3..9af9927f9e 100644
--- a/pom.xml
+++ b/pom.xml
@@ -145,6 +145,7 @@
<maven.jar.plugin>3.2.2</maven.jar.plugin>
<scalastyle.version>1.0.0</scalastyle.version>
<scalatest-maven-plugin.version>2.2.0</scalatest-maven-plugin.version>
+ <bloop-maven-plugin.version>2.0.3</bloop-maven-plugin.version>
<argLine/>
<!-- Required since there is extensive setting of spark home through argLine-->
<extraJavaTestArgs>
@@ -917,6 +918,11 @@
<artifactId>maven-enforcer-plugin</artifactId>
<version>3.3.0</version>
</plugin>
+ <plugin>
+ <groupId>ch.epfl.scala</groupId>
+ <artifactId>bloop-maven-plugin</artifactId>
+ <version>${bloop-maven-plugin.version}</version>
+ </plugin>
</plugins>
</pluginManagement>
<plugins>
@@ -966,6 +972,10 @@
<groupId>com.diffplug.spotless</groupId>
<artifactId>spotless-maven-plugin</artifactId>
</plugin>
+ <plugin>
+ <groupId>ch.epfl.scala</groupId>
+ <artifactId>bloop-maven-plugin</artifactId>
+ </plugin>
</plugins>
<extensions>
<!-- provides os.detected.classifier (i.e. linux-x86_64, osx-x86_64) property -->
@@ -2136,6 +2146,20 @@
</plugins>
</build>
</profile>
+ <profile>
+ <id>bloop</id>
+ <activation>
+ <activeByDefault>false</activeByDefault>
+ </activation>
+ <properties>
+ <!-- Skip style checks during bloop configuration generation -->
+ <spotless.check.skip>true</spotless.check.skip>
+ <scalastyle.skip>true</scalastyle.skip>
+ <checkstyle.skip>true</checkstyle.skip>
+ <maven.gitcommitid.skip>true</maven.gitcommitid.skip>
+ <remoteresources.skip>true</remoteresources.skip>
+ </properties>
+ </profile>
<!-- Profiles for different platforms -->
<profile>