This is an automated email from the ASF dual-hosted git repository.
nielifeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/seatunnel-tools.git
The following commit(s) were added to refs/heads/main by this push:
new becc021 [Feature][SeaTunnel Skill] add seatunnel-skills (#3)
becc021 is described below
commit becc0217dc3ecfdc9f6fe684d31f9adb017f7a46
Author: ocean-zhc <[email protected]>
AuthorDate: Thu Jan 29 12:52:44 2026 +0800
[Feature][SeaTunnel Skill] add seatunnel-skills (#3)
---
README.md | 653 ++++++++++++++++++++++++-
README_CN.md | 641 ++++++++++++++++++++++++
SKILL_SETUP_GUIDE.md | 673 +++++++++++++++++++++++++
seatunnel-skill/SKILL.md | 1212 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 3154 insertions(+), 25 deletions(-)
diff --git a/README.md b/README.md
index 1817050..9c1f431 100644
--- a/README.md
+++ b/README.md
@@ -1,38 +1,641 @@
# Apache SeaTunnel Tools
-This repository hosts auxiliary tools for Apache SeaTunnel. It focuses on developer/operator productivity around configuration, conversion, packaging and diagnostics. Current modules:
+**English** | [中文](README_CN.md)
-- x2seatunnel: Convert configurations (e.g., DataX) into SeaTunnel configuration files.
+Auxiliary tools for Apache SeaTunnel focusing on developer/operator productivity around configuration, conversion, LLM integration, packaging, and diagnostics.
-More tools may be added in the future. For the main data integration engine, see the
-[Apache SeaTunnel](https://github.com/apache/seatunnel) project.
+## 🎯 What's Inside
-## Tool 1 - SeaTunnel MCP Server
+| Tool | Purpose | Status |
+|------|---------|--------|
+| **SeaTunnel Skill** | Claude AI integration for SeaTunnel operations | ✅ New |
+| **SeaTunnel MCP Server** | Model Context Protocol for LLM integration | ✅ Available |
+| **x2seatunnel** | Configuration converter (DataX → SeaTunnel) | ✅ Available |
-What is MCP?
-- MCP (Model Context Protocol) is an open protocol for connecting LLMs to tools, data, and systems. With SeaTunnel MCP, you can operate SeaTunnel directly from an LLM-powered interface while keeping the server-side logic secure and auditable.
-- Learn more: https://github.com/modelcontextprotocol
+---
-SeaTunnel MCP Server
-- Source folder: [seatunnel-mcp/](seatunnel-mcp/)
-- English README: [seatunnel-mcp/README.md](seatunnel-mcp/README.md)
-- Chinese: [seatunnel-mcp/README_CN.md](seatunnel-mcp/README_CN.md)
-- Quick Start: [seatunnel-mcp/docs/QUICK_START.md](seatunnel-mcp/docs/QUICK_START.md)
-- User Guide: [seatunnel-mcp/docs/USER_GUIDE.md](seatunnel-mcp/docs/USER_GUIDE.md)
-- Developer Guide: [seatunnel-mcp/docs/DEVELOPER_GUIDE.md](seatunnel-mcp/docs/DEVELOPER_GUIDE.md)
+## ⚡ Quick Start
-For screenshots, demo video, features, installation and usage instructions, please refer to the README in the seatunnel-mcp directory.
+### For SeaTunnel Skill (Claude Code Integration)
-## Tool 2 - x2seatunnel
+**Installation & Setup:**
-What is x2seatunnel?
-- x2seatunnel is a configuration conversion tool that helps users migrate from other data integration tools (e.g., DataX) to SeaTunnel by converting existing configurations into SeaTunnel-compatible formats.
-- x2seatunnel
- - English: [x2seatunnel/README.md](x2seatunnel/README.md)
- - Chinese: [x2seatunnel/README_zh.md](x2seatunnel/README_zh.md)
+```bash
+# 1. Clone this repository
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
-## Contributing
+# 2. Copy seatunnel-skill to Claude Code skills directory
+cp -r seatunnel-skill ~/.claude/skills/
-Issues and PRs are welcome.
+# 3. Restart Claude Code or reload skills
+# Then use: /seatunnel-skill "your prompt here"
+```
-Get the main project from [Apache SeaTunnel](https://github.com/apache/seatunnel)
+**Quick Example:**
+
+```bash
+# Query SeaTunnel documentation
+/seatunnel-skill "How do I configure a MySQL to PostgreSQL job?"
+
+# Get connector information
+/seatunnel-skill "List all available Kafka connector options"
+
+# Debug configuration issues
+/seatunnel-skill "Why is my job failing with OutOfMemoryError?"
+```
+
+### For SeaTunnel Core (Direct Installation)
+
+```bash
+# Download binary (recommended)
+wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
+tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
+cd apache-seatunnel-2.3.12
+
+# Verify installation
+./bin/seatunnel.sh --version
+
+# Run your first job
+./bin/seatunnel.sh -c config/hello_world.conf -e spark
+```
+
+---
+
+## 📋 Features Overview
+
+### SeaTunnel Skill
+- 🤖 **AI-Powered Assistant**: Get instant help with SeaTunnel concepts and configurations
+- 📚 **Knowledge Integration**: Query official documentation and best practices
+- 🔍 **Smart Debugging**: Analyze errors and suggest fixes
+- 💡 **Code Examples**: Generate configuration examples for your use case
+
+### SeaTunnel Core Engine
+- **Multimodal Support**: Structured, unstructured, and semi-structured data
+- **100+ Connectors**: Databases, data warehouses, cloud services, message queues
+- **Multiple Engines**: Zeta (lightweight), Spark, Flink
+- **Synchronization Modes**: Batch, Streaming, CDC (Change Data Capture)
+- **Real-time Performance**: 100K - 1M records/second throughput
+
+---
+
+## 🔧 Installation & Setup
+
+### Method 1: SeaTunnel Skill (AI Integration)
+
+**Step 1: Copy Skill File**
+```bash
+mkdir -p ~/.claude/skills
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+**Step 2: Verify Installation**
+```bash
+# In Claude Code, try:
+/seatunnel-skill "What is SeaTunnel?"
+```
+
+**Step 3: Start Using**
+```bash
+# Help with configuration
+/seatunnel-skill "Create a MySQL to Elasticsearch job config"
+
+# Troubleshoot errors
+/seatunnel-skill "My Kafka connector keeps timing out"
+
+# Learn features
+/seatunnel-skill "Explain CDC (Change Data Capture) in SeaTunnel"
+```
+
+### Method 2: SeaTunnel Binary Installation
+
+**Supported Platforms**: Linux, macOS, Windows
+
+```bash
+# Download latest version
+VERSION=2.3.12
+wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
+
+# Extract
+tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
+cd apache-seatunnel-${VERSION}
+
+# Set environment
+export JAVA_HOME=/path/to/java
+export PATH=$PATH:$(pwd)/bin
+
+# Verify
+seatunnel.sh --version
+```
+
+### Method 3: Build from Source
+
+```bash
+# Clone repository
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# Build
+mvn clean install -DskipTests
+
+# Run from distribution
+cd seatunnel-dist/target/apache-seatunnel-*-bin/apache-seatunnel-*
+./bin/seatunnel.sh --version
+```
+
+### Method 4: Docker
+
+```bash
+# Pull official image
+docker pull apache/seatunnel:latest
+
+# Run container
+docker run -it apache/seatunnel:latest /bin/bash
+
+# Run job directly
+docker run -v /path/to/config:/config \
+ apache/seatunnel:latest \
+ seatunnel.sh -c /config/job.conf -e spark
+```
+
+---
+
+## 💻 Usage Guide
+
+### Use Case 1: MySQL to PostgreSQL (Batch)
+
+**config/mysql_to_postgres.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "MySQL to PostgreSQL"
+}
+
+source {
+ Jdbc {
+ driver = "com.mysql.cj.jdbc.Driver"
+ url = "jdbc:mysql://mysql-host:3306/mydb"
+ user = "root"
+ password = "password"
+ query = "SELECT * FROM users"
+ connection_check_timeout_sec = 100
+ }
+}
+
+sink {
+ Jdbc {
+ driver = "org.postgresql.Driver"
+ url = "jdbc:postgresql://pg-host:5432/mydb"
+ user = "postgres"
+ password = "password"
+ database = "mydb"
+ table = "users"
+ primary_keys = ["id"]
+ connection_check_timeout_sec = 100
+ }
+}
+```
+
+**Run:**
+```bash
+seatunnel.sh -c config/mysql_to_postgres.conf -e spark
+```
+
+### Use Case 2: Kafka Streaming to Elasticsearch
+
+**config/kafka_to_es.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "Kafka to Elasticsearch"
+ parallelism = 2
+}
+
+source {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "events"
+ consumer.group = "seatunnel-group"
+ format = "json"
+ schema = {
+ fields {
+ event_id = "bigint"
+ event_name = "string"
+ timestamp = "bigint"
+ }
+ }
+ }
+}
+
+sink {
+ Elasticsearch {
+ hosts = ["es-host:9200"]
+ index = "events"
+ username = "elastic"
+ password = "password"
+ }
+}
+```
+
+**Run:**
+```bash
+seatunnel.sh -c config/kafka_to_es.conf -e flink
+```
+
+### Use Case 3: MySQL CDC to Kafka
+
+**config/mysql_cdc_kafka.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "MySQL CDC to Kafka"
+}
+
+source {
+ Mysql {
+ server_id = 5400
+ hostname = "mysql-host"
+ port = 3306
+ username = "root"
+ password = "password"
+ database = ["mydb"]
+ table = ["users", "orders"]
+ startup.mode = "initial"
+ }
+}
+
+sink {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "mysql_cdc"
+ format = "canal_json"
+ semantic = "EXACTLY_ONCE"
+ }
+}
+```
+
+**Run:**
+```bash
+seatunnel.sh -c config/mysql_cdc_kafka.conf -e flink
+```
+
+---
+
+## 📚 API Reference
+
+### Core Connector Types
+
+**Source Connectors**
+- `Jdbc` - Generic JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server)
+- `Kafka` - Apache Kafka topics
+- `Mysql` - MySQL with CDC support
+- `MongoDB` - MongoDB collections
+- `PostgreSQL` - PostgreSQL with CDC
+- `S3` - Amazon S3 and compatible storage
+- `Http` - HTTP/HTTPS endpoints
+- `FakeSource` - For testing
+
+**Sink Connectors**
+- `Jdbc` - Write to JDBC-compatible databases
+- `Kafka` - Publish to Kafka topics
+- `Elasticsearch` - Write to Elasticsearch indices
+- `S3` - Write to S3 buckets
+- `Redis` - Write to Redis
+- `HBase` - Write to HBase tables
+- `Console` - Output to console
+
+**Transform Connectors**
+- `Sql` - Execute SQL transformations
+- `FieldMapper` - Rename/map columns (sketch below)
+- `JsonPath` - Extract data from JSON
+
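+A minimal `FieldMapper` sketch (illustrative mapping; the `field_mapper` option name follows the transform-v2 docs, so verify it against your SeaTunnel version):
+
+```hocon
+transform {
+  FieldMapper {
+    # Keep id as-is; expose name as user_name in the output
+    field_mapper = {
+      id = id
+      name = user_name
+    }
+  }
+}
+```
+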
+---
+
+## ⚙️ Configuration & Tuning
+
+### Environment Variables
+
+```bash
+# Java configuration
+export JAVA_HOME=/path/to/java
+export JVM_OPTS="-Xms1G -Xmx4G"
+
+# Spark configuration (if using Spark engine)
+export SPARK_HOME=/path/to/spark
+export SPARK_MASTER=spark://master:7077
+
+# Flink configuration (if using Flink engine)
+export FLINK_HOME=/path/to/flink
+
+# SeaTunnel configuration
+export SEATUNNEL_HOME=/path/to/seatunnel
+```
+
+### Performance Tuning for Batch Jobs
+
+```hocon
+env {
+ job.mode = "BATCH"
+ parallelism = 8 # Increase for larger clusters
+}
+
+source {
+ Jdbc {
+ split_size = 100000 # Parallel reads
+ fetch_size = 5000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 1000 # Batch inserts
+ max_retries = 3
+ }
+}
+```
+
+### Performance Tuning for Streaming Jobs
+
+```hocon
+env {
+ job.mode = "STREAMING"
+ parallelism = 4
+ checkpoint.interval = 30000 # 30 seconds
+}
+
+source {
+ Kafka {
+ consumer.group = "seatunnel-consumer"
+ max_poll_records = 500
+ }
+}
+```
+
+---
+
+## 🛠️ Development Guide
+
+### Project Structure
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # Claude Code AI skill
+├── seatunnel-mcp/ # MCP server for LLM integration
+├── x2seatunnel/ # DataX to SeaTunnel converter
+└── README.md
+```
+
+### SeaTunnel Core Architecture
+
+```
+seatunnel/
+├── seatunnel-api/ # Core APIs
+├── seatunnel-core/ # Execution engine
+├── seatunnel-engines/ # Engine implementations
+│ ├── seatunnel-engine-flink/
+│ ├── seatunnel-engine-spark/
+│ └── seatunnel-engine-zeta/
+├── seatunnel-connectors/ # Connector implementations
+└── seatunnel-dist/ # Distribution package
+```
+
+### Building SeaTunnel from Source
+
+```bash
+# Full build
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+mvn clean install -DskipTests
+
+# Build specific module
+mvn clean install -pl seatunnel-connectors/seatunnel-connectors-seatunnel-kafka -DskipTests
+```
+
+### Running Tests
+
+```bash
+# Unit tests
+mvn test
+
+# Specific test class
+mvn test -Dtest=MySqlConnectorTest
+
+# Integration tests
+mvn verify
+```
+
+---
+
+## 🐛 Troubleshooting (6 Common Issues)
+
+### Issue 1: ClassNotFoundException: com.mysql.jdbc.Driver
+
+**Solution:**
+```bash
+wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.33.jar
+cp mysql-connector-java-8.0.33.jar $SEATUNNEL_HOME/lib/
+seatunnel.sh -c config/job.conf -e spark
+```
+
+### Issue 2: OutOfMemoryError: Java heap space
+
+**Solution:**
+```bash
+export JVM_OPTS="-Xms2G -Xmx8G"
+echo 'JVM_OPTS="-Xms2G -Xmx8G"' >> $SEATUNNEL_HOME/bin/seatunnel-env.sh
+```
+
+### Issue 3: Connection refused: connect
+
+**Solution:**
+```bash
+# Verify connectivity
+ping source-host
+telnet source-host 3306
+
+# Check credentials
+mysql -h source-host -u root -p
+```
+
+### Issue 4: Table not found during CDC
+
+**Solution:**
+```sql
+-- Check binlog status (CDC requires the binary log to be enabled)
+SHOW VARIABLES LIKE 'log_bin';
+
+-- If it is OFF, enable binlog in my.cnf and restart MySQL:
+--   [mysqld]
+--   log_bin = mysql-bin
+--   binlog_format = row
+```
+
+### Issue 5: Slow Job Performance
+
+**Solution:**
+```hocon
+env {
+ parallelism = 8 # Increase parallelism
+}
+
+source {
+ Jdbc {
+ fetch_size = 5000
+ split_size = 100000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 2000
+ }
+}
+```
+
+### Issue 6: Kafka offset out of range
+
+**Solution:**
+```hocon
+source {
+ Kafka {
+ auto.offset.reset = "earliest" # or "latest"
+ }
+}
+```
+
+---
+
+## ❓ FAQ (8 Common Questions)
+
+**Q: What's the difference between BATCH and STREAMING mode?**
+
+A:
+- **BATCH**: One-time execution, suitable for full database migration
+- **STREAMING**: Continuous execution, suitable for real-time sync and CDC
+
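+In configuration terms, the only required difference is `job.mode` (minimal sketch; pick one per job file):
+
+```hocon
+env { job.mode = "BATCH" }      # one-shot: ends once the source is exhausted
+env { job.mode = "STREAMING" }  # continuous: runs until cancelled
+```
+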
+**Q: How do I handle schema changes during CDC?**
+
+A: Configure auto-detection in source:
+```hocon
+source {
+ Mysql {
+ schema_change_mode = "auto"
+ }
+}
+```
+
+**Q: Can I transform data during synchronization?**
+
+A: Yes, use SQL transform:
+```hocon
+transform {
+ Sql {
+ sql = "SELECT id, UPPER(name) as name FROM source"
+ }
+}
+```
+
+**Q: What's the maximum throughput?**
+
+A: Typical throughput is 100K - 1M records/second per executor, depending on:
+- Hardware (CPU, RAM, Network)
+- Database configuration
+- Data size per record
+- Network latency
+
+**Q: How do I handle errors in production?**
+
+A: Configure restart strategy:
+```hocon
+env {
+ restart_strategy = "exponential_delay"
+ restart_strategy.exponential_delay.initial_delay = 1000
+ restart_strategy.exponential_delay.max_delay = 30000
+ restart_strategy.exponential_delay.multiplier = 2.0
+}
+```
+
+**Q: Is there a web UI for job management?**
+
+A: Yes! Use SeaTunnel Web Project:
+```bash
+git clone https://github.com/apache/seatunnel-web.git
+cd seatunnel-web
+mvn clean install
+java -jar target/seatunnel-web-*.jar
+# Access at http://localhost:8080
+```
+
+**Q: How do I use the SeaTunnel Skill with Claude Code?**
+
+A: After copying to `~/.claude/skills/`, use:
+```bash
+/seatunnel-skill "your question about SeaTunnel"
+```
+
+**Q: Which engine should I use: Spark, Flink, or Zeta?**
+
+A:
+- **Zeta**: Lightweight, no external dependencies, single machine
+- **Spark**: Batch and batch-stream processing on distributed clusters
+- **Flink**: Advanced streaming and CDC on distributed clusters
+
+---
+
+## 🔗 Resources & Links
+
+### Official Documentation
+- [SeaTunnel Website](https://seatunnel.apache.org/)
+- [GitHub Repository](https://github.com/apache/seatunnel)
+- [Connector List](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
+- [HOCON Configuration Guide](https://github.com/lightbend/config/blob/main/HOCON.md)
+
+### Community & Support
+- [Slack Channel](https://the-asf.slack.com/archives/C01CB5186TL)
+- [Mailing Lists](https://seatunnel.apache.org/community/mail-lists/)
+- [GitHub Issues](https://github.com/apache/seatunnel/issues)
+- [Discussion Forum](https://github.com/apache/seatunnel/discussions)
+
+### Related Projects
+- [SeaTunnel Web UI](https://github.com/apache/seatunnel-web)
+- [SeaTunnel Tools](https://github.com/apache/seatunnel-tools)
+- [Apache Kafka](https://kafka.apache.org/)
+- [Apache Flink](https://flink.apache.org/)
+- [Apache Spark](https://spark.apache.org/)
+
+---
+
+## 📄 Individual Tools
+
+### 1. SeaTunnel Skill (New)
+- **Purpose**: AI-powered assistant for SeaTunnel in Claude Code
+- **Location**: [seatunnel-skill/](seatunnel-skill/)
+- **Quick Setup**: `cp -r seatunnel-skill ~/.claude/skills/`
+- **Usage**: `/seatunnel-skill "your question"`
+
+### 2. SeaTunnel MCP Server
+- **Purpose**: Model Context Protocol integration for LLM systems
+- **Location**: [seatunnel-mcp/](seatunnel-mcp/)
+- **English**: [README.md](seatunnel-mcp/README.md)
+- **Chinese**: [README_CN.md](seatunnel-mcp/README_CN.md)
+- **Quick Start**: [QUICK_START.md](seatunnel-mcp/docs/QUICK_START.md)
+
+### 3. x2seatunnel
+- **Purpose**: Convert DataX and other configurations to SeaTunnel format
+- **Location**: [x2seatunnel/](x2seatunnel/)
+- **English**: [README.md](x2seatunnel/README.md)
+- **Chinese**: [README_zh.md](x2seatunnel/README_zh.md)
+
+---
+
+## 🤝 Contributing
+
+Issues and PRs are welcome!
+
+For the main SeaTunnel engine, see [Apache SeaTunnel](https://github.com/apache/seatunnel).
+
+For these tools, please contribute to [SeaTunnel Tools](https://github.com/apache/seatunnel-tools).
+
+---
+
+**Last Updated**: 2026-01-28 | **License**: Apache 2.0
diff --git a/README_CN.md b/README_CN.md
new file mode 100644
index 0000000..5541aa1
--- /dev/null
+++ b/README_CN.md
@@ -0,0 +1,641 @@
+# Apache SeaTunnel 工具集
+
+[English](README.md) | **中文**
+
+Apache SeaTunnel 辅助工具集,重点关注开发者/运维生产力,包括配置转换、LLM 集成、打包和诊断。
+
+## 🎯 工具概览
+
+| 工具 | 用途 | 状态 |
+|------|------|------|
+| **SeaTunnel Skill** | Claude AI 集成 | ✅ 新功能 |
+| **SeaTunnel MCP 服务** | LLM 集成协议 | ✅ 可用 |
+| **x2seatunnel** | 配置转换工具 (DataX → SeaTunnel) | ✅ 可用 |
+
+---
+
+## ⚡ 快速开始
+
+### SeaTunnel Skill (Claude Code 集成)
+
+**安装步骤:**
+
+```bash
+# 1. 克隆本仓库
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
+
+# 2. 复制 seatunnel-skill 到 Claude Code 技能目录
+cp -r seatunnel-skill ~/.claude/skills/
+
+# 3. 重启 Claude Code 或重新加载技能
+# 然后使用: /seatunnel-skill "你的问题"
+```
+
+**快速示例:**
+
+```bash
+# 查询 SeaTunnel 文档
+/seatunnel-skill "如何配置 MySQL 到 PostgreSQL 的数据同步?"
+
+# 获取连接器信息
+/seatunnel-skill "列出所有可用的 Kafka 连接器选项"
+
+# 调试配置问题
+/seatunnel-skill "为什么我的任务出现 OutOfMemoryError 错误?"
+```
+
+### SeaTunnel 核心引擎(直接安装)
+
+```bash
+# 下载二进制文件(推荐)
+wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
+tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
+cd apache-seatunnel-2.3.12
+
+# 验证安装
+./bin/seatunnel.sh --version
+
+# 运行第一个任务
+./bin/seatunnel.sh -c config/hello_world.conf -e spark
+```
+
+---
+
+## 📋 功能概览
+
+### SeaTunnel Skill
+- 🤖 **AI 助手**: 获得 SeaTunnel 概念和配置的即时帮助
+- 📚 **知识集成**: 查询官方文档和最佳实践
+- 🔍 **智能调试**: 分析错误并提出修复建议
+- 💡 **代码示例**: 为您的用例生成配置示例
+
+### SeaTunnel 核心引擎
+- **多模式支持**: 结构化、非结构化和半结构化数据
+- **100+ 连接器**: 数据库、数据仓库、云服务、消息队列
+- **多引擎支持**: Zeta(轻量级)、Spark、Flink
+- **同步模式**: 批处理、流处理、CDC(变更数据捕获)
+- **实时性能**: 每秒 100K - 1M 条记录吞吐量
+
+---
+
+## 🔧 安装与设置
+
+### 方法 1: SeaTunnel Skill (AI 集成)
+
+**第一步:复制技能文件**
+```bash
+mkdir -p ~/.claude/skills
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+**第二步:验证安装**
+```bash
+# 在 Claude Code 中尝试:
+/seatunnel-skill "什么是 SeaTunnel?"
+```
+
+**第三步:开始使用**
+```bash
+# 帮助配置
+/seatunnel-skill "创建一个 MySQL 到 Elasticsearch 的任务配置"
+
+# 故障排除
+/seatunnel-skill "我的 Kafka 连接器一直超时"
+
+# 学习功能
+/seatunnel-skill "在 SeaTunnel 中解释 CDC(变更数据捕获)"
+```
+
+### 方法 2: 二进制安装
+
+**支持平台**: Linux、macOS、Windows
+
+```bash
+# 下载最新版本
+VERSION=2.3.12
+wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
+
+# 解压
+tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
+cd apache-seatunnel-${VERSION}
+
+# 设置环境
+export JAVA_HOME=/path/to/java
+export PATH=$PATH:$(pwd)/bin
+
+# 验证
+seatunnel.sh --version
+```
+
+### 方法 3: 从源代码构建
+
+```bash
+# 克隆仓库
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# 构建
+mvn clean install -DskipTests
+
+# 从分发目录运行
+cd seatunnel-dist/target/apache-seatunnel-*-bin/apache-seatunnel-*
+./bin/seatunnel.sh --version
+```
+
+### 方法 4: Docker
+
+```bash
+# 拉取官方镜像
+docker pull apache/seatunnel:latest
+
+# 运行容器
+docker run -it apache/seatunnel:latest /bin/bash
+
+# 直接运行任务
+docker run -v /path/to/config:/config \
+ apache/seatunnel:latest \
+ seatunnel.sh -c /config/job.conf -e spark
+```
+
+---
+
+## 💻 使用指南
+
+### 用例 1: MySQL 到 PostgreSQL(批处理)
+
+**config/mysql_to_postgres.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "MySQL 到 PostgreSQL"
+}
+
+source {
+ Jdbc {
+ driver = "com.mysql.cj.jdbc.Driver"
+ url = "jdbc:mysql://mysql-host:3306/mydb"
+ user = "root"
+ password = "password"
+ query = "SELECT * FROM users"
+ connection_check_timeout_sec = 100
+ }
+}
+
+sink {
+ Jdbc {
+ driver = "org.postgresql.Driver"
+ url = "jdbc:postgresql://pg-host:5432/mydb"
+ user = "postgres"
+ password = "password"
+ database = "mydb"
+ table = "users"
+ primary_keys = ["id"]
+ connection_check_timeout_sec = 100
+ }
+}
+```
+
+**运行:**
+```bash
+seatunnel.sh -c config/mysql_to_postgres.conf -e spark
+```
+
+### 用例 2: Kafka 流到 Elasticsearch
+
+**config/kafka_to_es.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "Kafka 到 Elasticsearch"
+ parallelism = 2
+}
+
+source {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "events"
+ consumer.group = "seatunnel-group"
+ format = "json"
+ schema = {
+ fields {
+ event_id = "bigint"
+ event_name = "string"
+ timestamp = "bigint"
+ }
+ }
+ }
+}
+
+sink {
+ Elasticsearch {
+ hosts = ["es-host:9200"]
+ index = "events"
+ username = "elastic"
+ password = "password"
+ }
+}
+```
+
+**运行:**
+```bash
+seatunnel.sh -c config/kafka_to_es.conf -e flink
+```
+
+### 用例 3: MySQL CDC 到 Kafka
+
+**config/mysql_cdc_kafka.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "MySQL CDC 到 Kafka"
+}
+
+source {
+ Mysql {
+ server_id = 5400
+ hostname = "mysql-host"
+ port = 3306
+ username = "root"
+ password = "password"
+ database = ["mydb"]
+ table = ["users", "orders"]
+ startup.mode = "initial"
+ }
+}
+
+sink {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "mysql_cdc"
+ format = "canal_json"
+ semantic = "EXACTLY_ONCE"
+ }
+}
+```
+
+**运行:**
+```bash
+seatunnel.sh -c config/mysql_cdc_kafka.conf -e flink
+```
+
+---
+
+## 📚 API 参考
+
+### 核心连接器类型
+
+**源连接器**
+- `Jdbc` - 通用 JDBC 数据库(MySQL、PostgreSQL、Oracle、SQL Server)
+- `Kafka` - Apache Kafka 主题
+- `Mysql` - 支持 CDC 的 MySQL
+- `MongoDB` - MongoDB 集合
+- `PostgreSQL` - 支持 CDC 的 PostgreSQL
+- `S3` - Amazon S3 和兼容存储
+- `Http` - HTTP/HTTPS 端点
+- `FakeSource` - 用于测试
+
+**宿连接器**
+- `Jdbc` - 写入 JDBC 兼容数据库
+- `Kafka` - 发布到 Kafka 主题
+- `Elasticsearch` - 写入 Elasticsearch 索引
+- `S3` - 写入 S3 存储桶
+- `Redis` - 写入 Redis
+- `HBase` - 写入 HBase 表
+- `Console` - 输出到控制台
+
+**转换连接器**
+- `Sql` - 执行 SQL 转换
+- `FieldMapper` - 列重命名/映射(见下方示例)
+- `JsonPath` - 从 JSON 提取数据
+
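+一个最小的 `FieldMapper` 示意(映射内容为示例;`field_mapper` 选项名以所用 SeaTunnel 版本的 transform-v2 文档为准):
+
+```hocon
+transform {
+  FieldMapper {
+    # 保留 id,并将 name 在输出中重命名为 user_name
+    field_mapper = {
+      id = id
+      name = user_name
+    }
+  }
+}
+```
+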
+---
+
+## ⚙️ 配置与优化
+
+### 环境变量
+
+```bash
+# Java 配置
+export JAVA_HOME=/path/to/java
+export JVM_OPTS="-Xms1G -Xmx4G"
+
+# Spark 配置(使用 Spark 引擎时)
+export SPARK_HOME=/path/to/spark
+export SPARK_MASTER=spark://master:7077
+
+# Flink 配置(使用 Flink 引擎时)
+export FLINK_HOME=/path/to/flink
+
+# SeaTunnel 配置
+export SEATUNNEL_HOME=/path/to/seatunnel
+```
+
+### 批处理任务性能调优
+
+```hocon
+env {
+ job.mode = "BATCH"
+ parallelism = 8 # 根据集群大小增加
+}
+
+source {
+ Jdbc {
+ split_size = 100000 # 并行读取
+ fetch_size = 5000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 1000 # 批量插入
+ max_retries = 3
+ }
+}
+```
+
+### 流处理任务性能调优
+
+```hocon
+env {
+ job.mode = "STREAMING"
+ parallelism = 4
+ checkpoint.interval = 30000 # 30 秒
+}
+
+source {
+ Kafka {
+ consumer.group = "seatunnel-consumer"
+ max_poll_records = 500
+ }
+}
+```
+
+---
+
+## 🛠️ 开发指南
+
+### 项目结构
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # Claude Code AI 技能
+├── seatunnel-mcp/ # LLM 集成 MCP 服务
+├── x2seatunnel/ # DataX 到 SeaTunnel 转换器
+└── README_CN.md
+```
+
+### SeaTunnel 核心架构
+
+```
+seatunnel/
+├── seatunnel-api/ # 核心 API
+├── seatunnel-core/ # 执行引擎
+├── seatunnel-engines/ # 引擎实现
+│ ├── seatunnel-engine-flink/
+│ ├── seatunnel-engine-spark/
+│ └── seatunnel-engine-zeta/
+├── seatunnel-connectors/ # 连接器实现
+└── seatunnel-dist/ # 分发包
+```
+
+### 从源代码构建 SeaTunnel
+
+```bash
+# 完整构建
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+mvn clean install -DskipTests
+
+# 构建特定模块
+mvn clean install -pl seatunnel-connectors/seatunnel-connectors-seatunnel-kafka -DskipTests
+```
+
+### 运行测试
+
+```bash
+# 单元测试
+mvn test
+
+# 特定测试类
+mvn test -Dtest=MySqlConnectorTest
+
+# 集成测试
+mvn verify
+```
+
+---
+
+## 🐛 故障排查(6 个常见问题)
+
+### 问题 1: ClassNotFoundException: com.mysql.jdbc.Driver
+
+**解决方案:**
+```bash
+wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.33.jar
+cp mysql-connector-java-8.0.33.jar $SEATUNNEL_HOME/lib/
+seatunnel.sh -c config/job.conf -e spark
+```
+
+### 问题 2: OutOfMemoryError: Java heap space
+
+**解决方案:**
+```bash
+export JVM_OPTS="-Xms2G -Xmx8G"
+echo 'JVM_OPTS="-Xms2G -Xmx8G"' >> $SEATUNNEL_HOME/bin/seatunnel-env.sh
+```
+
+### 问题 3: Connection refused: connect
+
+**解决方案:**
+```bash
+# 验证连接
+ping source-host
+telnet source-host 3306
+
+# 检查凭证
+mysql -h source-host -u root -p
+```
+
+### 问题 4: CDC 期间找不到表
+
+**解决方案:**
+```sql
+-- 检查二进制日志状态(CDC 要求开启 binlog)
+SHOW VARIABLES LIKE 'log_bin';
+
+-- 如果为 OFF,在 my.cnf 中启用 binlog 并重启 MySQL:
+--   [mysqld]
+--   log_bin = mysql-bin
+--   binlog_format = row
+```
+
+### 问题 5: 任务性能缓慢
+
+**解决方案:**
+```hocon
+env {
+ parallelism = 8 # 增加并行性
+}
+
+source {
+ Jdbc {
+ fetch_size = 5000
+ split_size = 100000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 2000
+ }
+}
+```
+
+### 问题 6: Kafka 偏移量超出范围
+
+**解决方案:**
+```hocon
+source {
+ Kafka {
+ auto.offset.reset = "earliest" # 或 "latest"
+ }
+}
+```
+
+---
+
+## ❓ 常见问题(8 个常见问题)
+
+**Q: BATCH 和 STREAMING 模式有什么区别?**
+
+A:
+- **BATCH**: 一次性执行,适合全量数据库迁移
+- **STREAMING**: 持续执行,适合实时同步和 CDC
+
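+从配置上看,两者唯一必需的差异是 `job.mode`(最小示意;每个任务文件只取其一):
+
+```hocon
+env { job.mode = "BATCH" }      # 一次性执行:源读完即结束
+env { job.mode = "STREAMING" }  # 持续执行:直到任务被取消
+```
+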
+**Q: 如何在 CDC 期间处理架构更改?**
+
+A: 在源中配置自动检测:
+```hocon
+source {
+ Mysql {
+ schema_change_mode = "auto"
+ }
+}
+```
+
+**Q: 我能在同步期间转换数据吗?**
+
+A: 可以,使用 SQL 转换:
+```hocon
+transform {
+ Sql {
+ sql = "SELECT id, UPPER(name) as name FROM source"
+ }
+}
+```
+
+**Q: 最大吞吐量是多少?**
+
+A: 典型吞吐量为每个执行器每秒 100K - 1M 条记录,具体取决于:
+- 硬件(CPU、RAM、网络)
+- 数据库配置
+- 每条记录的数据大小
+- 网络延迟
+
+**Q: 如何在生产环境中处理错误?**
+
+A: 配置重启策略:
+```hocon
+env {
+ restart_strategy = "exponential_delay"
+ restart_strategy.exponential_delay.initial_delay = 1000
+ restart_strategy.exponential_delay.max_delay = 30000
+ restart_strategy.exponential_delay.multiplier = 2.0
+}
+```
+
+**Q: 是否有用于任务管理的 Web UI?**
+
+A: 是的!使用 SeaTunnel Web 项目:
+```bash
+git clone https://github.com/apache/seatunnel-web.git
+cd seatunnel-web
+mvn clean install
+java -jar target/seatunnel-web-*.jar
+# 访问 http://localhost:8080
+```
+
+**Q: 如何在 Claude Code 中使用 SeaTunnel Skill?**
+
+A: 复制到 `~/.claude/skills/` 后,使用:
+```bash
+/seatunnel-skill "关于 SeaTunnel 的问题"
+```
+
+**Q: 应该使用哪个引擎:Spark、Flink 还是 Zeta?**
+
+A:
+- **Zeta**: 轻量级,无外部依赖,单机
+- **Spark**: 分布式集群的批处理和批流混合
+- **Flink**: 分布式集群的高级流处理和 CDC
+
+---
+
+## 🔗 资源与链接
+
+### 官方文档
+- [SeaTunnel 官网](https://seatunnel.apache.org/)
+- [GitHub 仓库](https://github.com/apache/seatunnel)
+- [连接器列表](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
+- [HOCON 配置指南](https://github.com/lightbend/config/blob/main/HOCON.md)
+
+### 社区与支持
+- [Slack 频道](https://the-asf.slack.com/archives/C01CB5186TL)
+- [邮件列表](https://seatunnel.apache.org/community/mail-lists/)
+- [GitHub Issues](https://github.com/apache/seatunnel/issues)
+- [讨论论坛](https://github.com/apache/seatunnel/discussions)
+
+### 相关项目
+- [SeaTunnel Web UI](https://github.com/apache/seatunnel-web)
+- [SeaTunnel 工具集](https://github.com/apache/seatunnel-tools)
+- [Apache Kafka](https://kafka.apache.org/)
+- [Apache Flink](https://flink.apache.org/)
+- [Apache Spark](https://spark.apache.org/)
+
+---
+
+## 📄 单个工具说明
+
+### 1. SeaTunnel Skill(新功能)
+- **用途**: Claude Code 中 SeaTunnel 的 AI 助手
+- **位置**: [seatunnel-skill/](seatunnel-skill/)
+- **快速设置**: `cp -r seatunnel-skill ~/.claude/skills/`
+- **使用方法**: `/seatunnel-skill "你的问题"`
+
+### 2. SeaTunnel MCP 服务
+- **用途**: LLM 系统的模型上下文协议集成
+- **位置**: [seatunnel-mcp/](seatunnel-mcp/)
+- **英文**: [README.md](seatunnel-mcp/README.md)
+- **中文**: [README_CN.md](seatunnel-mcp/README_CN.md)
+- **快速开始**: [QUICK_START.md](seatunnel-mcp/docs/QUICK_START.md)
+
+### 3. x2seatunnel
+- **用途**: 将 DataX 等配置转换为 SeaTunnel 格式
+- **位置**: [x2seatunnel/](x2seatunnel/)
+- **英文**: [README.md](x2seatunnel/README.md)
+- **中文**: [README_zh.md](x2seatunnel/README_zh.md)
+
+---
+
+## 🤝 贡献
+
+欢迎提交 Issues 和 Pull Requests!
+
+对于主要的 SeaTunnel 引擎,请参阅 [Apache SeaTunnel](https://github.com/apache/seatunnel)。
+
+对于这些工具,请贡献到 [SeaTunnel 工具集](https://github.com/apache/seatunnel-tools)。
+
+---
+
+**最后更新**: 2026-01-28 | **许可证**: Apache 2.0
\ No newline at end of file
diff --git a/SKILL_SETUP_GUIDE.md b/SKILL_SETUP_GUIDE.md
new file mode 100644
index 0000000..c9922d3
--- /dev/null
+++ b/SKILL_SETUP_GUIDE.md
@@ -0,0 +1,673 @@
+# SeaTunnel Skill Setup Guide
+
+**English** | [中文](#中文版本)
+
+## Getting Started with SeaTunnel Skill in Claude Code
+
+SeaTunnel Skill is an AI-powered assistant for Apache SeaTunnel integrated directly into Claude Code. It helps you with configuration, troubleshooting, learning, and best practices.
+
+---
+
+## Installation
+
+### Step 1: Clone the Repository
+
+```bash
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
+```
+
+### Step 2: Locate Skills Directory
+
Claude Code stores skills in your home directory. Create the skills directory if it doesn't exist:
+
+```bash
+# Create ~/.claude/skills directory if it doesn't exist
+mkdir -p ~/.claude/skills
+```
+
+**Directory Locations by OS:**
+- **macOS/Linux**: `~/.claude/skills/`
+- **Windows**: `%USERPROFILE%\.claude\skills\`
+
+### Step 3: Copy the Skill
+
+```bash
+# Copy seatunnel-skill to Claude Code skills directory
+cp -r seatunnel-skill ~/.claude/skills/
+
+# Verify installation
+ls ~/.claude/skills/seatunnel-skill/
+```
+
+You should see:
+```
+SKILL.md # Skill definition and metadata
+README.md # Skill documentation
+```
+
+### Step 4: Verify Installation
+
+**Option A: Using Claude Code Terminal**
+
+```bash
+# In Claude Code terminal, run:
+ls ~/.claude/skills/seatunnel-skill/
+
+# You should see the skill files listed
+```
+
+**Option B: Check Skill Loading**
+
+In Claude Code, you might see a skill reload notification. If not:
+1. Restart Claude Code
+2. Or reload the skills manually through the skill menu
+
+### Step 5: Test the Skill
+
+Open Claude Code and try:
+
+```bash
+/seatunnel-skill "What is SeaTunnel?"
+```
+
+You should get an AI-powered response about SeaTunnel.
+
+---
+
+## Usage Examples
+
+### Getting Help with Configuration
+
+**Question:** How do I configure a MySQL to PostgreSQL job?
+
+```bash
+/seatunnel-skill "Create a job configuration to sync data from MySQL to
PostgreSQL with batch mode"
+```
+
+**Response:** The skill will provide a complete HOCON configuration example with explanations.
+
+### Learning SeaTunnel Concepts
+
+**Question:** Explain CDC mode
+
+```bash
+/seatunnel-skill "Explain Change Data Capture (CDC) in SeaTunnel. When should
I use it?"
+```
+
+**Response:** Comprehensive explanation of CDC, use cases, and configuration examples.
+
+### Troubleshooting
+
+**Question:** My job is failing
+
+```bash
+/seatunnel-skill "I'm getting 'OutOfMemoryError: Java heap space' in my batch
job. How do I fix it?"
+```
+
+**Response:** Detailed diagnosis and solutions, including:
+- Root cause explanation
+- Configuration fixes
+- Environment variable adjustments
+- Performance tuning tips
+
+### Connector Information
+
+**Question:** Available Kafka options
+
+```bash
+/seatunnel-skill "What are all the configuration options for Kafka source
connector?"
+```
+
+**Response:** Complete list of options with descriptions and examples.
+
+### Performance Optimization
+
+**Question:** How to optimize streaming
+
+```bash
+/seatunnel-skill "How do I optimize a Kafka to Elasticsearch streaming job for
maximum throughput?"
+```
+
+**Response:** Performance tuning recommendations for parallelism, batch sizes, and resource allocation.
+
+---
+
+## Common Questions
+
+### Q: Why doesn't the skill show up?
+
+**A:** Make sure you:
+1. Copied the folder to `~/.claude/skills/` (not a subdirectory)
+2. Restarted Claude Code or reloaded skills
+3. The folder is named exactly `seatunnel-skill`
+
+**Fix:**
+```bash
+# Verify the path
+ls -la ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# If it doesn't exist, copy it again
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### Q: How do I update the skill?
+
+**A:**
+```bash
+# Navigate to seatunnel-tools directory
+cd /path/to/seatunnel-tools
+
+# Pull latest changes
+git pull origin main
+
+# Update the skill
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+
+# Restart Claude Code
+```
+
+### Q: Can I customize the skill?
+
+**A:** Yes! Edit `seatunnel-skill/SKILL.md`:
+
+```bash
+# Open the skill definition
+nano ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# Make your changes
+# The skill will use your customizations
+```
+
+### Q: Where are skill responses saved?
+
+**A:** Skill responses are part of your Claude Code conversation history. They are saved in:
+- Local Claude Code workspace
+- Optionally synced to Claude.ai if configured
+
+---
+
+## Advanced Usage
+
+### Chaining Questions
+
+You can build on previous questions in the same conversation:
+
+```bash
+/seatunnel-skill "What is batch mode?"
+
+# In next message, reference previous context:
+/seatunnel-skill "Show me a complete example combining batch mode with MySQL
source"
+
+# The skill understands the context from previous messages
+```
+
+### Getting Code Examples
+
+The skill can generate complete, production-ready configurations:
+
+```bash
+/seatunnel-skill "Generate a complete SeaTunnel job configuration that:
+1. Reads from MySQL database 'sales_db' table 'orders'
+2. Filters orders from last 30 days
+3. Writes to PostgreSQL 'analytics_db' table 'orders_processed'
+4. Uses batch mode with 4 parallelism"
+```
+
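+A response might resemble the sketch below. Everything in it is illustrative: hosts and credentials are placeholders, and the 30-day filter assumes an `order_date` column that the prompt does not actually specify:
+
+```hocon
+env {
+  job.mode = "BATCH"
+  parallelism = 4
+}
+
+source {
+  Jdbc {
+    driver = "com.mysql.cj.jdbc.Driver"
+    url = "jdbc:mysql://mysql-host:3306/sales_db"
+    user = "user"
+    password = "password"
+    # Filter column name is an assumption
+    query = "SELECT * FROM orders WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)"
+  }
+}
+
+sink {
+  Jdbc {
+    driver = "org.postgresql.Driver"
+    url = "jdbc:postgresql://pg-host:5432/analytics_db"
+    user = "postgres"
+    password = "password"
+    database = "analytics_db"
+    table = "orders_processed"
+  }
+}
+```
+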
+### Integration with Your Workflow
+
+**Development Pipeline:**
+```bash
+# 1. Understand requirements
+/seatunnel-skill "Explain how to set up CDC from MySQL"
+
+# 2. Design solution
+/seatunnel-skill "Design a real-time data pipeline from MySQL CDC to Kafka"
+
+# 3. Generate configuration
+/seatunnel-skill "Generate the complete HOCON configuration for the pipeline"
+
+# 4. Debug issues
+/seatunnel-skill "My job is timing out. Debug this configuration: [paste
config]"
+
+# 5. Optimize performance
+/seatunnel-skill "How can I optimize this job for better throughput?"
+```
+
+---
+
+## Troubleshooting
+
+### Issue: Skill not found error
+
+```
+Error: Unknown skill: seatunnel-skill
+```
+
+**Solution:**
+```bash
+# 1. Verify skill exists
+ls ~/.claude/skills/seatunnel-skill/
+
+# 2. Check file permissions
+chmod +r ~/.claude/skills/seatunnel-skill/*
+
+# 3. Restart Claude Code and try again
+```
+
+### Issue: Outdated responses
+
+**Solution:**
+```bash
+# Update skill to latest version
+cd seatunnel-tools
+git pull origin main
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### Issue: Responses are too generic
+
+**Try:**
+```bash
+# Be more specific in your question:
+# Instead of:
+/seatunnel-skill "How to configure MySQL?"
+
+# Try:
+/seatunnel-skill "Configure MySQL source for a batch job that reads table
'users' with filters"
+```
+
+---
+
+## Tips for Best Results
+
+1. **Be Specific**: More details in your question = better responses
+2. **Include Context**: Mention your use case (batch/streaming, source/sink types)
+3. **Show Configuration**: Paste your HOCON config for debugging
+4. **Reference Versions**: Specify SeaTunnel version (e.g., 2.3.12)
+5. **Ask Follow-ups**: The skill remembers conversation context
+
+---
+
+## Keyboard Shortcuts
+
+- **Cmd+K** (macOS) / **Ctrl+K** (Windows/Linux): Quick open skill
+- **Type** `/seatunnel-skill`: Invoke skill
+- **Tab**: Auto-complete skill parameters
+- **Esc**: Cancel skill input
+
+---
+
+## File Locations
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # AI Skill
+│ ├── SKILL.md # Skill definition
+│ └── README.md # Documentation
+├── README.md # Main documentation
+├── README_CN.md # Chinese documentation
+└── SKILL_SETUP_GUIDE.md # This file
+```
+
+---
+
+## Getting Help
+
+- **Skill Issues**: Try `/seatunnel-skill "How do I troubleshoot..."`
+- **SeaTunnel Questions**: Ask the skill directly
+- **Installation Help**: See [README.md](README.md) or [README_CN.md](README_CN.md)
+- **Report Issues**: [GitHub Issues](https://github.com/apache/seatunnel-tools/issues)
+
+---
+
+## Next Steps
+
+1. ✅ Install skill (`cp -r seatunnel-skill ~/.claude/skills/`)
+2. ✅ Test skill (`/seatunnel-skill "What is SeaTunnel?"`)
+3. 📚 Explore examples in this guide
+4. 🚀 Use skill for your SeaTunnel projects
+5. 📝 Share feedback and improvements
+
+---
+
+---
+
+# 中文版本
+
+# SeaTunnel Skill 安装使用指南
+
+## 开始使用 SeaTunnel Skill
+
+SeaTunnel Skill 是一个集成到 Claude Code 中的 AI 助手,帮助您进行 Apache SeaTunnel 的配置、故障排查、学习和最佳实践。
+
+---
+
+## 安装步骤
+
+### 第一步:克隆仓库
+
+```bash
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
+```
+
+### 第二步:定位技能目录
+
+Claude Code 在您的主目录中存储技能。如果目录不存在,请创建:
+
+```bash
+# 如果目录不存在,则创建 ~/.claude/skills 目录
+mkdir -p ~/.claude/skills
+```
+
+**不同操作系统的目录位置:**
+- **macOS/Linux**: `~/.claude/skills/`
+- **Windows**: `%USERPROFILE%\.claude\skills\`
+
+### 第三步:复制技能文件
+
+```bash
+# 复制 seatunnel-skill 到 Claude Code 技能目录
+cp -r seatunnel-skill ~/.claude/skills/
+
+# 验证安装
+ls ~/.claude/skills/seatunnel-skill/
+```
+
+您应该看到:
+```
+SKILL.md # 技能定义和元数据
+README.md # 技能文档
+```
+
+### 第四步:验证安装
+
+**选项 A:使用 Claude Code 终端**
+
+```bash
+# 在 Claude Code 终端中运行:
+ls ~/.claude/skills/seatunnel-skill/
+
+# 您应该看到技能文件列出
+```
+
+**选项 B:检查技能加载**
+
+在 Claude Code 中,您可能会看到技能重新加载通知。如果没有:
+1. 重启 Claude Code
+2. 或通过技能菜单手动重新加载
+
+### 第五步:测试技能
+
+打开 Claude Code 并尝试:
+
+```bash
+/seatunnel-skill "什么是 SeaTunnel?"
+```
+
+您应该获得关于 SeaTunnel 的 AI 驱动响应。
+
+---
+
+## 使用示例
+
+### 获取配置帮助
+
+**问题:** 如何配置从 MySQL 到 PostgreSQL 的任务?
+
+```bash
+/seatunnel-skill "创建一个任务配置,以批处理模式将数据从 MySQL 同步到 PostgreSQL"
+```
+
+**响应:** 技能将提供完整的 HOCON 配置示例和说明。
+
+### 学习 SeaTunnel 概念
+
+**问题:** 解释 CDC 模式
+
+```bash
+/seatunnel-skill "在 SeaTunnel 中解释变更数据捕获 (CDC)。何时应该使用它?"
+```
+
+**响应:** 关于 CDC 的全面解释、用例和配置示例。
+
+### 故障排查
+
+**问题:** 我的任务失败了
+
+```bash
+/seatunnel-skill "我的批处理任务出现 'OutOfMemoryError: Java heap space' 错误。我应该如何修复?"
+```
+
+**响应:** 详细的诊断和解决方案,包括:
+- 根本原因说明
+- 配置修复
+- 环境变量调整
+- 性能调优建议
+
+### 连接器信息
+
+**问题:** 可用的 Kafka 选项
+
+```bash
+/seatunnel-skill "Kafka 源连接器的所有配置选项是什么?"
+```
+
+**响应:** 完整的选项列表,带有描述和示例。
+
+### 性能优化
+
+**问题:** 如何优化流处理
+
+```bash
+/seatunnel-skill "如何优化从 Kafka 到 Elasticsearch 的流处理任务以获得最大吞吐量?"
+```
+
+**响应:** 并行度、批大小和资源分配的性能调优建议。
+
+---
+
+## 常见问题
+
+### Q: 为什么技能不显示?
+
+**A:** 请确保您:
+1. 将文件夹复制到 `~/.claude/skills/`(不是子目录)
+2. 重启了 Claude Code 或重新加载了技能
+3. 文件夹名称完全是 `seatunnel-skill`
+
+**修复:**
+```bash
+# 验证路径
+ls -la ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# 如果不存在,再次复制
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### Q: 如何更新技能?
+
+**A:**
+```bash
+# 导航到 seatunnel-tools 目录
+cd /path/to/seatunnel-tools
+
+# 拉取最新更改
+git pull origin main
+
+# 更新技能
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+
+# 重启 Claude Code
+```
+
+### Q: 我可以自定义技能吗?
+
+**A:** 可以!编辑 `seatunnel-skill/SKILL.md`:
+
+```bash
+# 打开技能定义
+nano ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# 进行更改
+# 技能将使用您的自定义设置
+```
+
+### Q: 技能响应保存在哪里?
+
+**A:** 技能响应是您的 Claude Code 对话历史的一部分。它们保存在:
+- 本地 Claude Code 工作区
+- 如果配置,可选地同步到 Claude.ai
+
+---
+
+## 高级用法
+
+### 链接问题
+
+您可以在同一对话中基于之前的问题进行构建:
+
+```bash
+/seatunnel-skill "什么是批处理模式?"
+
+# 在下一条消息中,参考之前的上下文:
+/seatunnel-skill "展示一个结合批处理模式和 MySQL 源的完整示例"
+
+# 技能理解来自之前消息的上下文
+```
+
+### 获取代码示例
+
+技能可以生成完整的、生产就绪的配置:
+
+```bash
+/seatunnel-skill "生成一个完整的 SeaTunnel 任务配置,该配置:
+1. 从 MySQL 数据库 'sales_db' 表 'orders' 读取
+2. 过滤最近 30 天的订单
+3. 写入 PostgreSQL 'analytics_db' 表 'orders_processed'
+4. 使用 4 个并行度的批处理模式"
+```
+
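+生成的回答可能类似下面的示意。内容均为示例:主机与凭证为占位符,"最近 30 天"的过滤假设存在 order_date 列(提示词中并未指明):
+
+```hocon
+env {
+  job.mode = "BATCH"
+  parallelism = 4
+}
+
+source {
+  Jdbc {
+    driver = "com.mysql.cj.jdbc.Driver"
+    url = "jdbc:mysql://mysql-host:3306/sales_db"
+    user = "user"
+    password = "password"
+    # 过滤所用列名为假设
+    query = "SELECT * FROM orders WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)"
+  }
+}
+
+sink {
+  Jdbc {
+    driver = "org.postgresql.Driver"
+    url = "jdbc:postgresql://pg-host:5432/analytics_db"
+    user = "postgres"
+    password = "password"
+    database = "analytics_db"
+    table = "orders_processed"
+  }
+}
+```
+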
+### 与您的工作流集成
+
+**开发流程:**
+```bash
+# 1. 了解需求
+/seatunnel-skill "解释如何从 MySQL 设置 CDC"
+
+# 2. 设计解决方案
+/seatunnel-skill "设计从 MySQL CDC 到 Kafka 的实时数据管道"
+
+# 3. 生成配置
+/seatunnel-skill "为管道生成完整的 HOCON 配置"
+
+# 4. 调试问题
+/seatunnel-skill "我的任务超时。调试此配置:[粘贴配置]"
+
+# 5. 优化性能
+/seatunnel-skill "我应该如何优化此任务以获得更好的吞吐量?"
+```
+
+---
+
+## 故障排查
+
+### 问题:技能未找到错误
+
+```
+Error: Unknown skill: seatunnel-skill
+```
+
+**解决方案:**
+```bash
+# 1. 验证技能存在
+ls ~/.claude/skills/seatunnel-skill/
+
+# 2. 检查文件权限
+chmod +r ~/.claude/skills/seatunnel-skill/*
+
+# 3. 重启 Claude Code 并重试
+```
+
+### 问题:响应过时
+
+**解决方案:**
+```bash
+# 更新技能到最新版本
+cd seatunnel-tools
+git pull origin main
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### 问题:响应过于笼统
+
+**尝试:**
+```bash
+# 在问题中更具体:
+# 不是:
+/seatunnel-skill "如何配置 MySQL?"
+
+# 而是:
+/seatunnel-skill "配置 MySQL 源进行批处理任务,读取 'users' 表并应用过滤器"
+```
+
+---
+
+## 获得最佳结果的提示
+
+1. **具体明确**: 问题中的细节越多 = 响应越好
+2. **包含上下文**: 提及您的用例(批/流、源/宿类型)
+3. **显示配置**: 粘贴您的 HOCON 配置以进行调试
+4. **参考版本**: 指定 SeaTunnel 版本(例如 2.3.12)
+5. **提出后续问题**: 技能会记住对话上下文
+
+---
+
+## 键盘快捷键
+
+- **Cmd+K** (macOS) / **Ctrl+K** (Windows/Linux): 快速打开技能
+- **输入** `/seatunnel-skill`: 调用技能
+- **Tab**: 自动完成技能参数
+- **Esc**: 取消技能输入
+
+---
+
+## 文件位置
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # AI 技能
+│ ├── SKILL.md # 技能定义
+│ └── README.md # 文档
+├── README.md # 主文档
+├── README_CN.md # 中文文档
+└── SKILL_SETUP_GUIDE.md # 此文件
+```
+
+---
+
+## 获取帮助
+
+- **技能问题**: 尝试 `/seatunnel-skill "我应该如何故障排查..."`
+- **SeaTunnel 问题**: 直接向技能提问
+- **安装帮助**: 查看 [README.md](README.md) 或 [README_CN.md](README_CN.md)
+- **报告问题**: [GitHub Issues](https://github.com/apache/seatunnel-tools/issues)
+
+---
+
+## 后续步骤
+
+1. ✅ 安装技能 (`cp -r seatunnel-skill ~/.claude/skills/`)
+2. ✅ 测试技能 (`/seatunnel-skill "什么是 SeaTunnel?"`)
+3. 📚 探索本指南中的示例
+4. 🚀 将技能用于您的 SeaTunnel 项目
+5. 📝 分享反馈和改进
+
+---
+
+**最后更新**: 2026-01-28 | **许可证**: Apache 2.0
\ No newline at end of file
diff --git a/seatunnel-skill/SKILL.md b/seatunnel-skill/SKILL.md
new file mode 100644
index 0000000..b0ceb2d
--- /dev/null
+++ b/seatunnel-skill/SKILL.md
@@ -0,0 +1,1212 @@
+---
+name: seatunnel-skill
+description: Apache SeaTunnel - A multimodal, high-performance, distributed data integration tool for massive data synchronization across 100+ connectors
+author: auto-generated by repo2skill
+platform: github
+source: https://github.com/apache/seatunnel
+tags: [data-integration, data-pipeline, etl, elt, real-time-streaming, batch-processing, cdc, distributed-computing, apache, java]
+version: 2.3.13
+generated: 2026-01-28
+license: Apache 2.0
+repository: apache/seatunnel
+---
+
+# Apache SeaTunnel OpenCode Skill
+
+Apache SeaTunnel is a **multimodal, high-performance, distributed data integration tool** capable of synchronizing vast amounts of data daily. It connects hundreds of evolving data sources with support for real-time, CDC (Change Data Capture), and full database synchronization.
+
+## Quick Start
+
+### Prerequisites
+
+- **Java**: JDK 8 or higher
+- **Maven**: 3.6.0 or higher (for building from source)
+- **Python**: 3.7+ (optional, for Python API)
+- **Git**: For cloning the repository
+
+### Installation
+
+#### 1. Clone Repository
+```bash
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+```
+
+#### 2. Build from Source
+```bash
+# Build entire project
+mvn clean install -DskipTests
+
+# Build specific module
+mvn clean install -pl seatunnel-core -DskipTests
+```
+
+#### 3. Download Pre-built Binary (Recommended)
+Visit the [official download page](https://seatunnel.apache.org/download) and select your version:
+
+```bash
+# Example: Download SeaTunnel 2.3.12
+wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
+tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
+cd apache-seatunnel-2.3.12
+```
+
+#### 4. Basic Configuration
+```bash
+# Set JAVA_HOME
+export JAVA_HOME=/path/to/java
+
+# Add to PATH
+export PATH=$PATH:/path/to/seatunnel/bin
+```
+
+#### 5. Verify Installation
+```bash
+seatunnel.sh --version
+```
+
+### Hello World Example
+
+Create a simple data integration job:
+
+**config/hello_world.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "Hello World"
+}
+
+source {
+ FakeSource {
+ row.num = 100
+ schema = {
+ fields {
+ id = "bigint"
+ name = "string"
+ age = "int"
+ }
+ }
+ }
+}
+
+sink {
+ Console {
+ format = "json"
+ }
+}
+```
+
+Run the job:
+```bash
+seatunnel.sh -c config/hello_world.conf -e spark
+# or with flink
+seatunnel.sh -c config/hello_world.conf -e flink
+```
+
+---
+
+## Overview
+
+### Purpose and Target Users
+
+**SeaTunnel** is designed for:
+- **Data Engineers**: Building large-scale data pipelines with minimal complexity
+- **DevOps Teams**: Managing data integration infrastructure
+- **Enterprise Platforms**: Handling 100+ billion data records daily
+- **Real-time Analytics**: Supporting streaming data synchronization
+- **Legacy System Migration**: CDC-based incremental sync from transactional databases
+
+### Core Capabilities
+
+1. **Multimodal Support**
+ - Structured data (databases, data warehouses)
+ - Unstructured data (video, images, binaries)
+ - Semi-structured data (JSON, logs, binlog streams)
+
+2. **Multiple Synchronization Methods**
+ - **Batch**: Full historical data transfer
+ - **Streaming**: Real-time data pipeline
+ - **CDC**: Incremental capture from databases
+   - **Full + Incremental**: Combined approach (see the sketch after this list)
+
+3. **100+ Pre-built Connectors**
+ - Databases: MySQL, PostgreSQL, Oracle, SQL Server, MongoDB
+ - Data Warehouses: Snowflake, BigQuery, Redshift, Iceberg
+ - Cloud SaaS: Salesforce, Shopify, Google Sheets
+ - Message Queues: Kafka, RabbitMQ, Pulsar
+ - Search Engines: Elasticsearch, OpenSearch
+ - Object Storage: S3, GCS, HDFS
+
+4. **Multi-Engine Support**
+   - **Zeta Engine**: Lightweight, standalone deployment (no Spark/Flink required)
+ - **Apache Flink**: Distributed streaming engine
+ - **Apache Spark**: Distributed batch/batch-stream processing
+
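+For CDC sources, the full + incremental combination is selected via `startup.mode`, as in the MySQL CDC example later in this document (sketch; connection options omitted):
+
+```hocon
+source {
+  Mysql {
+    # "initial" takes a full snapshot first, then switches to incremental binlog reading
+    startup.mode = "initial"
+  }
+}
+```
+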
+---
+
+## Features
+
+### High-Performance
+
+- **Distributed Snapshot Algorithm**: Ensures data consistency without locks
+- **JDBC Multiplexing**: Minimizes database connections for real-time sync
+- **Log Parsing**: Efficient CDC implementation with binary log analysis
+- **Resource Optimization**: Reduces computing resources and I/O overhead
+
+### Data Quality & Reliability
+
+- **Real-time Monitoring**: Track synchronization progress and data metrics
+- **Data Loss Prevention**: Transactional guarantees (exactly-once semantics; see the sketch after this list)
+- **Deduplication**: Prevents duplicate records during reprocessing
+- **Error Handling**: Graceful failure recovery and retry logic
+
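+Exactly-once delivery is opted into per sink where the connector supports it, and relies on checkpointing for recovery. A minimal sketch (the Kafka `semantic` option appears in the CDC example later in this document):
+
+```hocon
+env {
+  checkpoint.interval = 30000   # periodic state snapshots used on recovery
+}
+
+sink {
+  Kafka {
+    semantic = "EXACTLY_ONCE"   # transactional writes; replays do not duplicate records
+  }
+}
+```
+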
+### Developer-Friendly
+
+- **SQL-like Configuration**: Intuitive job definition syntax
+- **Visual Web UI**: Drag-and-drop job builder (SeaTunnel Web Project)
+- **Extensive Documentation**: Comprehensive guides and examples
+- **Community Support**: Active community via Slack and mailing lists
+
+### Production Ready
+
+- **Proven at Scale**: Used in enterprises processing billions of records daily
+- **Version Stability**: Regular releases with backward compatibility
+- **Enterprise Features**: Multi-tenancy, RBAC, audit logging
+- **Cloud Native**: Kubernetes-ready deployment
+
+---
+
+## Installation
+
+### Installation Methods
+
+#### Method 1: Binary Download (Recommended for Quick Start)
+
+```bash
+# 1. Download binary
+VERSION=2.3.12
+wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
+
+# 2. Extract
+tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
+cd apache-seatunnel-${VERSION}
+
+# 3. Verify
+./bin/seatunnel.sh --version
+```
+
+#### Method 2: Build from Source
+
+```bash
+# 1. Clone repository
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# 2. Build with Maven
+mvn clean install -DskipTests
+
+# 3. Navigate to distribution
+cd seatunnel-dist/target/apache-seatunnel-*-bin/apache-seatunnel-*
+
+# 4. Verify
+./bin/seatunnel.sh --version
+```
+
+#### Method 3: Docker
+
+```bash
+# Pull official Docker image
+docker pull apache/seatunnel:latest
+
+# Run container
+docker run -it apache/seatunnel:latest /bin/bash
+
+# Or run a job directly
+docker run -v /path/to/config:/config \
+ apache/seatunnel:latest \
+ seatunnel.sh -c /config/job.conf -e spark
+```
+
+### Environment Setup
+
+#### Set Java Home
+```bash
+# Bash/Zsh
+export JAVA_HOME=/path/to/java
+export PATH=$JAVA_HOME/bin:$PATH
+
+# Verify
+java -version
+```
+
+#### Configure for Spark Engine
+```bash
+# Set Spark Home (if using Spark engine)
+export SPARK_HOME=/path/to/spark
+
+# Verify Spark installation
+$SPARK_HOME/bin/spark-submit --version
+```
+
+#### Configure for Flink Engine
+```bash
+# Set Flink Home (if using Flink engine)
+export FLINK_HOME=/path/to/flink
+
+# Verify Flink installation
+$FLINK_HOME/bin/flink --version
+```
+
+### System Requirements
+
+| Requirement | Version/Spec |
+|---|---|
+| Java | JDK 1.8+ |
+| Memory | 2GB+ (minimum), 8GB+ (recommended) |
+| Disk | 500MB (binary) + job storage |
+| Network | Connectivity to source/sink systems |
+| Scala | 2.12.15 (for Spark/Flink integration) |
+
+---
+
+## Usage
+
+### 1. Job Configuration (HOCON Format)
+
+SeaTunnel uses **HOCON** (Human-Optimized Config Object Notation) for job configuration.
+
+**Basic Structure:**
+```hocon
+env {
+ job.mode = "BATCH" # or STREAMING
+ job.name = "My Job"
+ parallelism = 4
+}
+
+source {
+ SourceConnector {
+ option1 = value1
+ option2 = value2
+ schema = {
+ fields {
+ column1 = "type"
+ column2 = "type"
+ }
+ }
+ }
+}
+
+# Optional: Transform data
+transform {
+ TransformName {
+ option = value
+ }
+}
+
+sink {
+ SinkConnector {
+ option1 = value1
+ option2 = value2
+ }
+}
+```
+
+### 2. Common Use Cases
+
+#### Use Case 1: MySQL to PostgreSQL (Batch)
+
+**config/mysql_to_postgres.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "MySQL to PostgreSQL"
+}
+
+source {
+ Jdbc {
+ driver = "com.mysql.cj.jdbc.Driver"
+ url = "jdbc:mysql://mysql-host:3306/mydb"
+ user = "root"
+ password = "password"
+ query = "SELECT * FROM users"
+ connection_check_timeout_sec = 100
+ }
+}
+
+sink {
+ Jdbc {
+ driver = "org.postgresql.Driver"
+ url = "jdbc:postgresql://pg-host:5432/mydb"
+ user = "postgres"
+ password = "password"
+ database = "mydb"
+ table = "users"
+ primary_keys = ["id"]
+ connection_check_timeout_sec = 100
+ }
+}
+```
+
+Run:
+```bash
+seatunnel.sh -c config/mysql_to_postgres.conf -e spark
+```
+
+#### Use Case 2: Kafka Streaming to Elasticsearch
+
+**config/kafka_to_es.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "Kafka to Elasticsearch"
+ parallelism = 2
+}
+
+source {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "events"
+ patterns = "event.*"
+ consumer.group = "seatunnel-group"
+ format = "json"
+ schema = {
+ fields {
+ event_id = "bigint"
+ event_name = "string"
+ timestamp = "bigint"
+ payload = "string"
+ }
+ }
+ }
+}
+
+transform {
+ Sql {
+ sql = "SELECT event_id, event_name, FROM_UNIXTIME(timestamp/1000) as ts,
payload FROM source"
+ }
+}
+
+sink {
+ Elasticsearch {
+ hosts = ["es-host:9200"]
+ index = "events"
+ index_type = "_doc"
+ username = "elastic"
+ password = "password"
+ }
+}
+```
+
+Run:
+```bash
+seatunnel.sh -c config/kafka_to_es.conf -e flink
+```
+
+#### Use Case 3: CDC from MySQL to Kafka
+
+**config/mysql_cdc_kafka.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "MySQL CDC to Kafka"
+}
+
+source {
+ Mysql {
+ server_id = 5400
+ hostname = "mysql-host"
+ port = 3306
+ username = "root"
+ password = "password"
+ database = ["mydb"]
+ table = ["users", "orders"]
+ startup.mode = "initial"
+ snapshot.split.size = 8096
+ incremental.snapshot.chunk.size = 1024
+ snapshot_fetch_size = 1024
+ snapshot_lock_timeout_sec = 10
+ server_time_zone = "UTC"
+ }
+}
+
+sink {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "mysql_cdc"
+ format = "canal_json"
+ semantic = "EXACTLY_ONCE"
+ }
+}
+```
+
+Run:
+```bash
+seatunnel.sh -c config/mysql_cdc_kafka.conf -e flink
+```
+
+### 3. Running Jobs
+
+#### Local Mode (Single Machine)
+```bash
+# Using Spark (default)
+seatunnel.sh -c config/job.conf -e spark
+
+# Using Flink
+seatunnel.sh -c config/job.conf -e flink
+
+# Using Zeta (lightweight)
+seatunnel.sh -c config/job.conf -e zeta
+```
+
+#### Cluster Mode
+```bash
+# Submit to Spark cluster
+seatunnel.sh -c config/job.conf -e spark -m cluster -n hadoop-master:7077
+
+# Submit to Flink cluster
+seatunnel.sh -c config/job.conf -e flink -m remote -s localhost 8081
+```
+
+#### Verbose Output
+```bash
+seatunnel.sh -c config/job.conf -e spark -l DEBUG
+```
+
+#### Check Status
+```bash
+# View running jobs (Spark Cluster)
+spark-submit --status <driver-id>
+
+# View running jobs (Flink Cluster)
+$FLINK_HOME/bin/flink list
+```
+
+### 4. SQL API (Advanced)
+
+Use SQL for complex transformations:
+
+```hocon
+source {
+ MySQL {
+ # Source config...
+ }
+}
+
+transform {
+ Sql {
+ # Multiple SQL statements
+ sql = """
+ SELECT
+ id,
+ UPPER(name) as name,
+ age + 10 as age_plus_10,
+ CURRENT_TIMESTAMP as created_at
+ FROM source
+ WHERE age > 18
+ """
+ }
+}
+
+sink {
+ PostgreSQL {
+ # Sink config...
+ }
+}
+```
+
+---
+
+## API Reference
+
+### Core Connector Types
+
+#### Source Connectors
+- **Jdbc**: Generic JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server)
+- **Kafka**: Apache Kafka topics
+- **Mysql**: MySQL with CDC support
+- **MongoDB**: MongoDB collections
+- **PostgreSQL**: PostgreSQL with CDC
+- **S3**: Amazon S3 and compatible storage
+- **Http**: HTTP/HTTPS endpoints
+- **FakeSource**: For testing and development
+
+#### Transform Connectors
+- **Sql**: Execute SQL transformations
+- **Dummy**: Pass-through (testing)
+- **FieldMapper**: Rename/map columns
+- **JsonPath**: Extract data from JSON
+
+#### Sink Connectors
+- **Jdbc**: Write to JDBC-compatible databases
+- **Kafka**: Publish to Kafka topics
+- **Elasticsearch**: Write to Elasticsearch indices
+- **S3**: Write to S3 buckets
+- **Redis**: Write to Redis
+- **HBase**: Write to HBase tables
+- **StarRocks**: Write to StarRocks tables
+- **Console**: Output to console (testing)
+
+### Configuration Options
+
+#### Common Source Options
+```hocon
+source {
+ ConnectorName {
+ # Connection
+ hostname = "host"
+ port = 3306
+ username = "user"
+ password = "pass"
+
+ # Data selection
+ database = "db_name"
+ table = "table_name"
+
+ # Performance
+ fetch_size = 1000
+ connection_check_timeout_sec = 100
+ split_size = 10000
+
+ # Schema
+ schema = {
+ fields {
+ id = "bigint"
+ name = "string"
+ }
+ }
+ }
+}
+```
+
+#### Common Sink Options
+```hocon
+sink {
+ ConnectorName {
+ # Connection
+ hostname = "host"
+ port = 3306
+ username = "user"
+ password = "pass"
+
+ # Write behavior
+ database = "db_name"
+ table = "table_name"
+ primary_keys = ["id"]
+ batch_size = 500
+
+ # Error handling
+ max_retries = 3
+ retry_wait_time_ms = 1000
+ on_duplicate_key_update_column_names = ["field1"]
+ }
+}
+```
+
+#### Engine Options
+```hocon
+env {
+ # Execution mode
+ job.mode = "BATCH" # or STREAMING
+
+ # Job identity
+ job.name = "My Job"
+
+ # Parallelism
+ parallelism = 4
+
+ # Checkpoint (streaming)
+ checkpoint.interval = 60000
+
+ # Restart strategy
+ restart_strategy = "fixed_delay"
+ restart_strategy.fixed_delay.attempts = 3
+ restart_strategy.fixed_delay.delay = 10000
+}
+```
+
+### Debugging and Monitoring
+
+#### View Job Metrics
+```bash
+# During execution, monitor logs
+tail -f logs/seatunnel.log
+
+# Check specific log level
+grep ERROR logs/seatunnel.log
+```
+
+#### Enable Debug Logging
+```bash
+# Set log level to DEBUG
+seatunnel.sh -c config/job.conf -e spark -l DEBUG
+```
+
+#### Use Test Sources
+```hocon
+source {
+ FakeSource {
+ row.num = 1000
+ schema = {
+ fields {
+ id = "bigint"
+ name = "string"
+ }
+ }
+ }
+}
+
+sink {
+ Console {
+ format = "json"
+ }
+}
+```
+
+---
+
+## Configuration
+
+### Project Structure
+```
+seatunnel/
+├── bin/ # Executable scripts
+│ ├── seatunnel.sh # Main entry point
+│ └── seatunnel-submit.sh # Spark submission script
+├── config/ # Configuration examples
+│ ├── flink-conf.yaml # Flink configuration
+│ └── spark-conf.yaml # Spark configuration
+├── connectors/ # Pre-built connectors
+├── lib/ # JAR dependencies
+├── logs/ # Runtime logs
+└── plugin/ # Plugin directory
+```
+
+### Environment Variables
+
+```bash
+# Java configuration
+export JAVA_HOME=/path/to/java
+export JVM_OPTS="-Xms1G -Xmx4G"
+
+# Spark configuration
+export SPARK_HOME=/path/to/spark
+export SPARK_MASTER=spark://master:7077
+
+# Flink configuration
+export FLINK_HOME=/path/to/flink
+
+# SeaTunnel configuration
+export SEATUNNEL_HOME=/path/to/seatunnel
+```
+
+### Key Configuration Files
+
+#### `seatunnel-env.sh`
+Configure runtime environment:
+```bash
+JAVA_HOME=/path/to/java
+JVM_OPTS="-Xms1G -Xmx4G -XX:+UseG1GC"
+SPARK_HOME=/path/to/spark
+FLINK_HOME=/path/to/flink
+```
+
+#### `flink-conf.yaml` (for Flink engine)
+```yaml
+taskmanager.memory.process.size: 2g
+taskmanager.memory.jvm-overhead.fraction: 0.1
+parallelism.default: 4
+```
+
+#### `spark-conf.yaml` (for Spark engine)
+```yaml
+driver-memory: 2g
+executor-memory: 4g
+num-executors: 4
+```
+
+### Performance Tuning
+
+#### For Batch Jobs
+```hocon
+env {
+ job.mode = "BATCH"
+ parallelism = 8 # Increase for larger clusters
+}
+
+source {
+ Jdbc {
+ # Split large tables for parallel reads
+ split_size = 100000
+ fetch_size = 5000
+ }
+}
+
+sink {
+ Jdbc {
+ # Batch inserts for better throughput
+ batch_size = 1000
+ # Use connection pooling
+ max_retries = 3
+ }
+}
+```
+
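+As a rough sizing check: with `split_size = 100000`, a 10-million-row table
+yields about 100 splits (10,000,000 / 100,000), so `parallelism = 8` gives each
+worker roughly 12–13 splits to work through.
+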
+#### For Streaming Jobs
+```hocon
+env {
+ job.mode = "STREAMING"
+ parallelism = 4
+ checkpoint.interval = 30000 # 30 seconds
+}
+
+source {
+ Kafka {
+ # Consumer group for parallel reads
+ consumer.group = "seatunnel-consumer"
+ # Batch reading
+ max_poll_records = 500
+ }
+}
+```
+
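+The checkpoint interval is the main latency/overhead knob in streaming mode: a
+shorter interval means less data to replay after a failure, at the cost of more
+frequent checkpoint I/O.
+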
+---
+
+## Development
+
+### Project Architecture
+
+**SeaTunnel** follows a modular architecture:
+
+```
+seatunnel/
+├── seatunnel-api/ # Core APIs
+├── seatunnel-core/ # Execution engine
+├── seatunnel-engines/ # Engine implementations
+│ ├── seatunnel-engine-flink/
+│ ├── seatunnel-engine-spark/
+│ └── seatunnel-engine-zeta/
+├── seatunnel-connectors/ # Connector implementations
+│   └── seatunnel-connectors-*/  # One per connector type
+└── seatunnel-dist/ # Distribution package
+```
+
+### Building from Source
+
+#### Full Build
+```bash
+# Clone repository
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# Build all modules
+mvn clean install -DskipTests
+
+# Build with tests (slower)
+mvn clean install
+```
+
+#### Build Specific Module
+```bash
+# Build only Kafka connector
+mvn clean install -pl seatunnel-connectors/seatunnel-connectors-seatunnel-kafka -DskipTests
+
+# Build only Flink engine
+mvn clean install -pl seatunnel-engines/seatunnel-engine-flink -DskipTests
+```
+
+#### Build Distribution
+```bash
+# Build the binary distribution (tarball lands in seatunnel-dist/target)
+mvn clean install -DskipTests
+cd seatunnel-dist
+
+# Inspect the contents of the generated tarball
+tar -tzf target/apache-seatunnel-*-bin.tar.gz | head
+```
+
+### Running Tests
+
+#### Unit Tests
+```bash
+# Run all tests
+mvn test
+
+# Run specific test class
+mvn test -Dtest=MySqlConnectorTest
+
+# Run with coverage
+mvn test jacoco:report
+```
+
+#### Integration Tests
+```bash
+# Run integration tests
+mvn verify
+
+# Run with Docker containers
+mvn verify -Pintegration-tests
+```
+
+### Development Setup
+
+#### IDE Configuration (IntelliJ IDEA)
+
+1. **Import Project**
+ - File → Open → Select seatunnel directory
+ - Choose Maven as build system
+
+2. **Configure JDK**
+ - Project Settings → Project → JDK
+ - Select JDK 1.8 or higher
+
+3. **Enable Annotation Processing**
+ - Project Settings → Build → Compiler → Annotation Processors
+ - Enable annotation processing
+
+4. **Run Configuration**
+ - Run → Edit Configurations
+ - Add new "Application" configuration
+   - Set main class: `org.apache.seatunnel.core.starter.command.CommandExecuteRunner`
+
+#### Code Style
+
+SeaTunnel follows Apache project conventions:
+- 4-space indentation (not tabs)
+- Line length max 120 characters
+- Standard Java naming conventions
+- Organize imports alphabetically
+
+Use the provided `.editorconfig`: enable the EditorConfig plugin in IntelliJ
+(bundled in recent releases) and the IDE will apply these rules automatically.
+
+### Creating Custom Connectors
+
+#### 1. Implement the Source Interface
+```java
+// Illustrative skeleton -- check seatunnel-api for the exact interface and
+// method signatures in your version.
+import org.apache.seatunnel.api.source.Boundedness;
+import org.apache.seatunnel.api.source.SeSource;
+
+public class CustomSource implements SeSource {
+    @Override
+    public String getPluginName() {
+        return "Custom";  // must match the block name used in job configs
+    }
+
+    @Override
+    public void validate() {
+        // Validate required options (host, port, ...) before the job starts
+    }
+
+    @Override
+    public ResultSet read(Boundedness boundedness) {
+        // Produce rows: bounded for BATCH, unbounded for STREAMING
+        return null;  // placeholder
+    }
+}
+```
+
+#### 2. Create Configuration Class
+```java
+import org.apache.seatunnel.api.configuration.Option;
+import org.apache.seatunnel.api.configuration.Options;
+
+public class CustomSourceOptions {
+ public static final Option<String> HOST =
+ Options.key("host")
+ .stringType()
+ .noDefaultValue()
+ .withDescription("Source hostname");
+
+ public static final Option<Integer> PORT =
+ Options.key("port")
+ .intType()
+ .defaultValue(9200)
+ .withDescription("Source port");
+}
+```
+
+#### 3. Register Connector
+```
+META-INF/services/org.apache.seatunnel.api.source.SeSource
+```
+
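+This is a standard Java SPI descriptor: each line of the file is the
+fully-qualified class name of an implementation (for the illustrative
+`CustomSource` above, its package-qualified name), which lets the engine
+discover the plugin at runtime via `ServiceLoader`.
+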
+### Contributing Guide
+
+1. **Fork and Clone**
+ ```bash
+ git clone https://github.com/YOUR_USERNAME/seatunnel.git
+ cd seatunnel
+ git remote add upstream https://github.com/apache/seatunnel.git
+ ```
+
+2. **Create Feature Branch**
+ ```bash
+ git checkout -b feature/my-feature
+ ```
+
+3. **Make Changes**
+ - Follow code style guide
+ - Add tests for new features
+ - Update documentation
+
+4. **Commit and Push**
+ ```bash
+ git add .
+ git commit -m "feat: add new feature"
+ git push origin feature/my-feature
+ ```
+
+5. **Create Pull Request**
+ - Go to GitHub repository
+ - Create PR with clear description
+ - Link any related issues
+ - Wait for review
+
+6. **Code Review Process**
+ - Address feedback from maintainers
+ - Update PR with changes
+ - After approval, maintainers will merge
+
+---
+
+## Troubleshooting
+
+### Common Issues and Solutions
+
+#### Issue 1: "ClassNotFoundException: com.mysql.jdbc.Driver"
+
+**Cause**: JDBC driver JAR not in classpath
+
+**Solution**:
+```bash
+# 1. Download MySQL JDBC driver
+wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.33.jar
+
+# 2. Copy to lib directory
+cp mysql-connector-java-8.0.33.jar $SEATUNNEL_HOME/lib/
+
+# 3. Restart job
+seatunnel.sh -c config/job.conf -e spark
+```
+
+#### Issue 2: "OutOfMemoryError: Java heap space"
+
+**Cause**: Insufficient JVM heap memory
+
+**Solution**:
+```bash
+# Increase JVM memory
+export JVM_OPTS="-Xms2G -Xmx8G"
+
+# Or set in seatunnel-env.sh
+echo 'JVM_OPTS="-Xms2G -Xmx8G"' >> $SEATUNNEL_HOME/bin/seatunnel-env.sh
+```
+
+#### Issue 3: "Connection refused: connect"
+
+**Cause**: Source/sink service not reachable
+
+**Solution**:
+```bash
+# 1. Verify connectivity
+ping source-host
+telnet source-host 3306
+
+# 2. Check credentials
+mysql -h source-host -u root -p
+
+# 3. Check firewall rules
+# Ensure port 3306 is open
+```
+
+#### Issue 4: "Table not found" during CDC
+
+**Cause**: Binlog not enabled on MySQL
+
+**Solution**:
+```sql
+-- Check binlog status
+SHOW VARIABLES LIKE 'log_bin';
+```
+
+Enable binlog in `my.cnf`:
+```ini
+[mysqld]
+log_bin       = mysql-bin
+binlog_format = row   # row-level events are required for CDC
+```
+
+Restart MySQL and verify:
+```sql
+SHOW MASTER STATUS;
+```
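+
+Besides the binlog itself, the MySQL user used by the CDC source typically also
+needs the `REPLICATION SLAVE` and `REPLICATION CLIENT` privileges.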
+
+#### Issue 5: Slow Job Performance
+
+**Cause**: Suboptimal configuration
+
+**Solutions**:
+```hocon
+# 1. Increase parallelism
+env {
+ parallelism = 8 # or higher based on cluster
+}
+
+# 2. Increase batch sizes
+source {
+ Jdbc {
+ fetch_size = 5000
+ split_size = 100000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 2000
+ }
+}
+
+# 3. Enable connection pooling
+source {
+ Jdbc {
+ pool_size = 10
+ max_idle_time = 300
+ }
+}
+```
+
+#### Issue 6: "Kafka topic offset out of range"
+
+**Cause**: Offset doesn't exist in topic
+
+**Solution**:
+```hocon
+source {
+ Kafka {
+ # Reset to earliest or latest
+ auto.offset.reset = "earliest" # or "latest"
+
+    # Or control the starting position via the connector's start_mode option
+ start_mode = "earliest"
+ }
+}
+```
+
+### FAQ
+
+**Q: What's the difference between BATCH and STREAMING mode?**
+
+A:
+- **BATCH**: One-time execution, suitable for full database migration
+- **STREAMING**: Continuous execution, suitable for real-time data sync and CDC
+
+**Q: How do I handle schema changes during CDC?**
+
+A: SeaTunnel automatically detects schema changes in CDC mode. Configure it in the source:
+```hocon
+source {
+ Mysql {
+ schema_change_mode = "auto" # auto-detect and apply
+ }
+}
+```
+
+**Q: Can I transform data during synchronization?**
+
+A: Yes, use SQL transform:
+```hocon
+transform {
+ Sql {
+ sql = "SELECT id, UPPER(name) as name FROM source"
+ }
+}
+```
+
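+For plain column renames, the FieldMapper transform listed above avoids SQL
+entirely; a sketch (exact option spelling may vary by version):
+
+```hocon
+transform {
+  FieldMapper {
+    field_mapper = {
+      id = id
+      name = user_name   # rename `name` to `user_name`
+    }
+  }
+}
+```
+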
+**Q: What's the maximum throughput?**
+
+A: Depends on:
+- Hardware (CPU, RAM, Network)
+- Source/sink database configuration
+- Data size per record
+- Network latency
+
+Typical throughput is on the order of 100K–1M records/second per executor, depending on the factors above.
+
+**Q: How do I handle errors in production?**
+
+A: Configure restart strategy:
+```hocon
+env {
+ restart_strategy = "exponential_delay"
+ restart_strategy.exponential_delay.initial_delay = 1000
+ restart_strategy.exponential_delay.max_delay = 30000
+ restart_strategy.exponential_delay.multiplier = 2.0
+ restart_strategy.exponential_delay.attempts_unlimited = true
+}
+```
+
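+With these settings the retry waits grow as 1 s, 2 s, 4 s, 8 s, 16 s, then stay
+capped at the 30 s `max_delay` for every subsequent attempt.
+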
+**Q: Is there a web UI for job management?**
+
+A: Yes! Use **SeaTunnel Web Project**:
+```bash
+# Check out web UI project
+git clone https://github.com/apache/seatunnel-web.git
+cd seatunnel-web
+mvn clean install
+
+# Run web UI
+java -jar target/seatunnel-web-*.jar
+# Access at http://localhost:8080
+```
+
+---
+
+## Resources
+
+### Official Documentation
+- [SeaTunnel Official Website](https://seatunnel.apache.org/)
+- [GitHub Repository](https://github.com/apache/seatunnel)
+- [Documentation Hub](https://seatunnel.apache.org/docs/)
+- [Connector List](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
+
+### Community
+- [Slack Channel](https://the-asf.slack.com/archives/C01CB5186TL)
+- [Mailing Lists](https://seatunnel.apache.org/community/mail-lists/)
+- [Issue Tracker](https://github.com/apache/seatunnel/issues)
+- [Discussion Forum](https://github.com/apache/seatunnel/discussions)
+
+### Related Projects
+- [SeaTunnel Web UI](https://github.com/apache/seatunnel-web)
+- [Apache Kafka](https://kafka.apache.org/)
+- [Apache Flink](https://flink.apache.org/)
+- [Apache Spark](https://spark.apache.org/)
+
+### Learning Resources
+- [HOCON Configuration Guide](https://github.com/lightbend/config/blob/main/HOCON.md)
+- [SQL Functions Reference](https://seatunnel.apache.org/docs/2.3.12/transform-v2/sql)
+- [CDC Pattern Explained](https://en.wikipedia.org/wiki/Change_data_capture)
+- [Distributed Systems Concepts](https://en.wikipedia.org/wiki/Distributed_computing)
+
+### Version History
+- **2.3.12** (Stable) - Current recommended version
+- **2.3.13-SNAPSHOT** (Development)
+- [All Releases](https://archive.apache.org/dist/seatunnel/)
+
+---
+
+## Additional Notes
+
+### License
+Apache License 2.0 - See [LICENSE](https://github.com/apache/seatunnel/blob/master/LICENSE) file
+
+### Security
+- Report security issues via [Apache Security](https://www.apache.org/security/)
+- Do NOT create public issues for security vulnerabilities
+
+### Support & Contribution
+- Join the community Slack for support
+- Submit feature requests on GitHub Issues
+- Contribute code via Pull Requests
+- Follow [Contributing Guide](https://github.com/apache/seatunnel/blob/master/CONTRIBUTING.md)
+
+---
+
+**Last Updated**: 2026-01-28
+**Skill Version**: 2.3.13
+**Status**: Production Ready ✓