davidzollo commented on code in PR #10411: URL: https://github.com/apache/seatunnel/pull/10411#discussion_r2758916432
##########
AGENTS.md:
##########
@@ -0,0 +1,238 @@
+# LLM Context Guide for Apache SeaTunnel
+
+This guide helps AI assistants (LLMs / Agents) make **safe, consistent, and verifiable** changes to the Apache SeaTunnel codebase. It mirrors practices from mature Apache projects and adapts them to SeaTunnel’s **build, testing, architecture, and documentation conventions**.
+
+## ⚠️ CRITICAL: Validate Before Proposing Changes
+
+**Agents MUST run verification commands locally before suggesting or finalizing changes.**
+
+```bash
+# Format code (mandatory)
+./mvnw spotless:apply
+
+# Quick verification (mandatory)
+./mvnw -q -DskipTests verify
+
+# Unit tests (strongly recommended)
+./mvnw test
+```
+
+Failure to meet these requirements will likely result in PR rejection.
+
+## Git Commit Message Convention
+
+SeaTunnel follows a **strict commit message format** to maintain a clean and searchable history.
+
+**Format**:
+
+```
+[Type][Module] Description
+```
+
+### Types
+
+* `Feature` – New features
+* `Fix` – Bug fixes
+* `Improve` – Improvements to existing behavior
+* `Docs` – Documentation-only changes
+* `Test` – Test cases or test framework changes
+* `Chore` – Build, dependency, or maintenance tasks
+
+### Modules
+
+* `Connector-V2` – seatunnel-connectors-v2
+* `Zeta` – seatunnel-engine (Zeta engine)
+* `Core` – seatunnel-core
+* `API` – seatunnel-api
+* `Transform-V2` – seatunnel-transforms-v2
+* `Format` – seatunnel-formats
+* `Translation` – seatunnel-translation
+* `E2E` – seatunnel-e2e
+
+### Examples
+
+* `[Fix][Connector-V2] Fix MySQL source split enumeration bug`
+* `[Fix][Zeta] Fix checkpoint timeout under heavy backpressure`
+* `[Feature][Transform-V2] Add LLM transform plugin`
+* `[Improve][Core] Optimize jar package loading speed`
+* `[Docs] Update quick start guide`
+
+## Repository Structure
+
+```text
+seatunnel/
+├── seatunnel-api/              # Core API definitions
+├── seatunnel-connectors-v2/    # Source & Sink connectors (main contribution area)
+├── seatunnel-transforms-v2/    # Transform plugins (including LLM)
+├── seatunnel-engine/           # Zeta engine & Web UI
+├── seatunnel-core/             # Job submission & CLI entry points
+├── seatunnel-translation/      # Flink & Spark adapters
+├── seatunnel-formats/          # Data formats (JSON, Avro, etc.)
+├── seatunnel-e2e/              # End-to-End integration tests
+├── docs/                       # Documentation (en & zh)
+└── config/                     # Default configurations
+```
+
+## Code Standards
+
+### Java Backend
+
+* **Formatting**: Google Java Format (AOSP style), enforced by Spotless
+* **Imports**:
+  * No wildcard imports
+  * Use shaded dependencies: `org.apache.seatunnel.shade.*`
+* **Nullability**: Avoid implicit null assumptions
+* **Visibility**: Keep APIs minimal; prefer package-private when possible
+* **Comments**: Add comments for important methods (public APIs, complex logic).
+
+### Apache License Header (MANDATORY)
+
+All **new files** MUST include the ASF license header:
+
+```java
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+```
+
+## 🚨 Backward Compatibility (VERY IMPORTANT)
+
+Agents MUST treat backward compatibility as a **hard constraint**.
+
+* DO NOT remove or rename existing config options
+* DO NOT change default values casually
+* DO NOT break public APIs or SPI contracts
+
+Any incompatible change MUST:
+
+* Be explicitly documented
+* Be documented in `docs/en/introduction/concepts/incompatible-changes.md`
+* Include migration guidance
+* Be clearly explained in the PR description
+
+## Dependency Rules
+
+* DO NOT introduce new dependencies unless absolutely necessary
+* Prefer existing shaded dependencies under `org.apache.seatunnel.shade.*`
+* Any new dependency MUST:
+  * Be justified in the PR description
+  * Consider shading, size, and conflict risks
+
+## Architecture Guidelines
+
+### Connector (V2)
+
+* Implement `SeaTunnelSource` or `SeaTunnelSink`
+* Define configs using `Option`
+* Support parallelism via `SourceSplitEnumerator`
+* Avoid connector-specific logic leaking into engine or core
+
+### Zeta Engine
+
+* **Client**: Submits job config
+* **Master**: Schedules & coordinates
+* **Worker**: Executes tasks (Source → Transform → Sink)
+
+Respect task boundaries and lifecycle semantics.
+
+## Configuration (Option) Rules
+
+* All user-facing configs MUST be defined using `Option`
+* Each option MUST include:
+  * name
+  * type
+  * default value (if applicable)
+  * clear description
+* Option names are **stable contracts** and must not be renamed lightly
+
+## Error Handling & Logging
+
+* Exceptions MUST include sufficient context (table, task, config key)
+* Avoid swallowing exceptions
+* Use proper log levels:
+  * INFO – lifecycle events
+  * WARN – recoverable issues
+  * ERROR – task-failing errors
+* NEVER log sensitive information (passwords, tokens, credentials)
+
+## Documentation Rules
+
+* Any user-visible change MUST update:
+
+  * `docs/en`
+  * `docs/zh`
+* Config names, defaults, and examples MUST match the code exactly
+* Documentation is part of the feature, not an afterthought
+
+## Testing Guidelines
+
+### Unit Tests
+
+* Located under `src/test/java`
+* Validate behavior, not implementation details
+* Prefer deterministic and minimal tests
+
+Command:
+
+```bash
+./mvnw test
+```
+
+### E2E Tests
+
+* Located in `seatunnel-e2e`
+* Uses Testcontainers
+* Extend `TestSuiteBase`
+
+Command:
+
+```bash
+./mvnw -DskipUT -DskipIT=false verify
+```

Review Comment:
   ```bash
   # Run specific connector E2E tests
   ./mvnw test -pl seatunnel-e2e/seatunnel-connector-v2-e2e/connector-xxx-e2e

   # Run all E2E tests (warning: time-consuming)
   ./mvnw -DskipUT -DskipIT=false verify
   ```
