This is an automated email from the ASF dual-hosted git repository.
nielifeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/seatunnel-tools.git
The following commit(s) were added to refs/heads/main by this push:
new becc021 [Feature][SeaTunnel Skill] add seatunnel-skills (#3)
becc021 is described below
commit becc0217dc3ecfdc9f6fe684d31f9adb017f7a46
Author: ocean-zhc <[email protected]>
AuthorDate: Thu Jan 29 12:52:44 2026 +0800
[Feature][SeaTunnel Skill] add seatunnel-skills (#3)
---
README.md | 653 ++++++++++++++++++++++++-
README_CN.md | 641 ++++++++++++++++++++++++
SKILL_SETUP_GUIDE.md | 673 +++++++++++++++++++++++++
seatunnel-skill/SKILL.md | 1212 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 3154 insertions(+), 25 deletions(-)
diff --git a/README.md b/README.md
index 1817050..9c1f431 100644
--- a/README.md
+++ b/README.md
@@ -1,38 +1,641 @@
# Apache SeaTunnel Tools
-This repository hosts auxiliary tools for Apache SeaTunnel. It focuses on developer/operator productivity around configuration, conversion, packaging and diagnostics. Current modules:
+**English** | [中文](README_CN.md)
-- x2seatunnel: Convert configurations (e.g., DataX) into SeaTunnel configuration files.
+Auxiliary tools for Apache SeaTunnel focusing on developer/operator productivity around configuration, conversion, LLM integration, packaging, and diagnostics.
-More tools may be added in the future. For the main data integration engine, see the
-[Apache SeaTunnel](https://github.com/apache/seatunnel) project.
+## 🎯 What's Inside
-## Tool 1 - SeaTunnel MCP Server
+| Tool | Purpose | Status |
+|------|---------|--------|
+| **SeaTunnel Skill** | Claude AI integration for SeaTunnel operations | ✅ New |
+| **SeaTunnel MCP Server** | Model Context Protocol for LLM integration | ✅ Available |
+| **x2seatunnel** | Configuration converter (DataX → SeaTunnel) | ✅ Available |
-What is MCP?
-- MCP (Model Context Protocol) is an open protocol for connecting LLMs to tools, data, and systems. With SeaTunnel MCP, you can operate SeaTunnel directly from an LLM-powered interface while keeping the server-side logic secure and auditable.
-- Learn more: https://github.com/modelcontextprotocol
+---
-SeaTunnel MCP Server
-- Source folder: [seatunnel-mcp/](seatunnel-mcp/)
-- English README: [seatunnel-mcp/README.md](seatunnel-mcp/README.md)
-- Chinese: [seatunnel-mcp/README_CN.md](seatunnel-mcp/README_CN.md)
-- Quick Start: [seatunnel-mcp/docs/QUICK_START.md](seatunnel-mcp/docs/QUICK_START.md)
-- User Guide: [seatunnel-mcp/docs/USER_GUIDE.md](seatunnel-mcp/docs/USER_GUIDE.md)
-- Developer Guide: [seatunnel-mcp/docs/DEVELOPER_GUIDE.md](seatunnel-mcp/docs/DEVELOPER_GUIDE.md)
+## ⚡ Quick Start
-For screenshots, demo video, features, installation and usage instructions, please refer to the README in the seatunnel-mcp directory.
+### For SeaTunnel Skill (Claude Code Integration)
-## Tool 2 - x2seatunnel
+**Installation & Setup:**
-What is x2seatunnel?
-- x2seatunnel is a configuration conversion tool that helps users migrate from other data integration tools (e.g., DataX) to SeaTunnel by converting existing configurations into SeaTunnel-compatible formats.
-- x2seatunnel
- - English: [x2seatunnel/README.md](x2seatunnel/README.md)
- - Chinese: [x2seatunnel/README_zh.md](x2seatunnel/README_zh.md)
+```bash
+# 1. Clone this repository
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
-## Contributing
+# 2. Copy seatunnel-skill to Claude Code skills directory
+cp -r seatunnel-skill ~/.claude/skills/
-Issues and PRs are welcome.
+# 3. Restart Claude Code or reload skills
+# Then use: /seatunnel-skill "your prompt here"
+```
-Get the main project from [Apache SeaTunnel](https://github.com/apache/seatunnel)
+**Quick Example:**
+
+```bash
+# Query SeaTunnel documentation
+/seatunnel-skill "How do I configure a MySQL to PostgreSQL job?"
+
+# Get connector information
+/seatunnel-skill "List all available Kafka connector options"
+
+# Debug configuration issues
+/seatunnel-skill "Why is my job failing with OutOfMemoryError?"
+```
+
+### For SeaTunnel Core (Direct Installation)
+
+```bash
+# Download binary (recommended)
+wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
+tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
+cd apache-seatunnel-2.3.12
+
+# Verify installation
+./bin/seatunnel.sh --version
+
+# Run your first job
+./bin/seatunnel.sh -c config/hello_world.conf -e spark
+```
+
+---
+
+## 📋 Features Overview
+
+### SeaTunnel Skill
+- 🤖 **AI-Powered Assistant**: Get instant help with SeaTunnel concepts and configurations
+- 📚 **Knowledge Integration**: Query official documentation and best practices
+- 🔍 **Smart Debugging**: Analyze errors and suggest fixes
+- 💡 **Code Examples**: Generate configuration examples for your use case
+
+### SeaTunnel Core Engine
+- **Multimodal Support**: Structured, unstructured, and semi-structured data
+- **100+ Connectors**: Databases, data warehouses, cloud services, message queues
+- **Multiple Engines**: Zeta (lightweight), Spark, Flink
+- **Synchronization Modes**: Batch, Streaming, CDC (Change Data Capture)
+- **Real-time Performance**: 100K - 1M records/second throughput
+
+---
+
+## 🔧 Installation & Setup
+
+### Method 1: SeaTunnel Skill (AI Integration)
+
+**Step 1: Copy Skill File**
+```bash
+mkdir -p ~/.claude/skills
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+**Step 2: Verify Installation**
+```bash
+# In Claude Code, try:
+/seatunnel-skill "What is SeaTunnel?"
+```
+
+**Step 3: Start Using**
+```bash
+# Help with configuration
+/seatunnel-skill "Create a MySQL to Elasticsearch job config"
+
+# Troubleshoot errors
+/seatunnel-skill "My Kafka connector keeps timing out"
+
+# Learn features
+/seatunnel-skill "Explain CDC (Change Data Capture) in SeaTunnel"
+```
+
+### Method 2: SeaTunnel Binary Installation
+
+**Supported Platforms**: Linux, macOS, Windows
+
+```bash
+# Download latest version
+VERSION=2.3.12
+wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
+
+# Extract
+tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
+cd apache-seatunnel-${VERSION}
+
+# Set environment
+export JAVA_HOME=/path/to/java
+export PATH=$PATH:$(pwd)/bin
+
+# Verify
+seatunnel.sh --version
+```
+
+### Method 3: Build from Source
+
+```bash
+# Clone repository
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# Build
+mvn clean install -DskipTests
+
+# Run from distribution
+cd seatunnel-dist/target/apache-seatunnel-*-bin/apache-seatunnel-*
+./bin/seatunnel.sh --version
+```
+
+### Method 4: Docker
+
+```bash
+# Pull official image
+docker pull apache/seatunnel:latest
+
+# Run container
+docker run -it apache/seatunnel:latest /bin/bash
+
+# Run job directly
+docker run -v /path/to/config:/config \
+ apache/seatunnel:latest \
+ seatunnel.sh -c /config/job.conf -e spark
+```
+
+---
+
+## 💻 Usage Guide
+
+### Use Case 1: MySQL to PostgreSQL (Batch)
+
+**config/mysql_to_postgres.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "MySQL to PostgreSQL"
+}
+
+source {
+ Jdbc {
+ driver = "com.mysql.cj.jdbc.Driver"
+ url = "jdbc:mysql://mysql-host:3306/mydb"
+ user = "root"
+ password = "password"
+ query = "SELECT * FROM users"
+ connection_check_timeout_sec = 100
+ }
+}
+
+sink {
+ Jdbc {
+ driver = "org.postgresql.Driver"
+ url = "jdbc:postgresql://pg-host:5432/mydb"
+ user = "postgres"
+ password = "password"
+ database = "mydb"
+ table = "users"
+ primary_keys = ["id"]
+ connection_check_timeout_sec = 100
+ }
+}
+```
+
+**Run:**
+```bash
+seatunnel.sh -c config/mysql_to_postgres.conf -e spark
+```
+
+### Use Case 2: Kafka Streaming to Elasticsearch
+
+**config/kafka_to_es.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "Kafka to Elasticsearch"
+ parallelism = 2
+}
+
+source {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "events"
+ consumer.group = "seatunnel-group"
+ format = "json"
+ schema = {
+ fields {
+ event_id = "bigint"
+ event_name = "string"
+ timestamp = "bigint"
+ }
+ }
+ }
+}
+
+sink {
+ Elasticsearch {
+ hosts = ["es-host:9200"]
+ index = "events"
+ username = "elastic"
+ password = "password"
+ }
+}
+```
+
+**Run:**
+```bash
+seatunnel.sh -c config/kafka_to_es.conf -e flink
+```
+
+### Use Case 3: MySQL CDC to Kafka
+
+**config/mysql_cdc_kafka.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "MySQL CDC to Kafka"
+}
+
+source {
+ Mysql {
+ server_id = 5400
+ hostname = "mysql-host"
+ port = 3306
+ username = "root"
+ password = "password"
+ database = ["mydb"]
+ table = ["users", "orders"]
+ startup.mode = "initial"
+ }
+}
+
+sink {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "mysql_cdc"
+ format = "canal_json"
+ semantic = "EXACTLY_ONCE"
+ }
+}
+```
+
+**Run:**
+```bash
+seatunnel.sh -c config/mysql_cdc_kafka.conf -e flink
+```
+
+---
+
+## 📚 API Reference
+
+### Core Connector Types
+
+**Source Connectors**
+- `Jdbc` - Generic JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server)
+- `Kafka` - Apache Kafka topics
+- `Mysql` - MySQL with CDC support
+- `MongoDB` - MongoDB collections
+- `PostgreSQL` - PostgreSQL with CDC
+- `S3` - Amazon S3 and compatible storage
+- `Http` - HTTP/HTTPS endpoints
+- `FakeSource` - For testing
+
+**Sink Connectors**
+- `Jdbc` - Write to JDBC-compatible databases
+- `Kafka` - Publish to Kafka topics
+- `Elasticsearch` - Write to Elasticsearch indices
+- `S3` - Write to S3 buckets
+- `Redis` - Write to Redis
+- `HBase` - Write to HBase tables
+- `Console` - Output to console
+
+**Transform Connectors**
+- `Sql` - Execute SQL transformations
+- `FieldMapper` - Rename/map columns (sketch below)
+- `JsonPath` - Extract data from JSON
+
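+A minimal `FieldMapper` sketch (illustrative mapping; the `field_mapper` option name follows the transform-v2 docs, so verify it against your SeaTunnel version):
+
+```hocon
+transform {
+  FieldMapper {
+    # Keep id as-is; expose name as user_name in the output
+    field_mapper = {
+      id = id
+      name = user_name
+    }
+  }
+}
+```
+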
+---
+
+## ⚙️ Configuration & Tuning
+
+### Environment Variables
+
+```bash
+# Java configuration
+export JAVA_HOME=/path/to/java
+export JVM_OPTS="-Xms1G -Xmx4G"
+
+# Spark configuration (if using Spark engine)
+export SPARK_HOME=/path/to/spark
+export SPARK_MASTER=spark://master:7077
+
+# Flink configuration (if using Flink engine)
+export FLINK_HOME=/path/to/flink
+
+# SeaTunnel configuration
+export SEATUNNEL_HOME=/path/to/seatunnel
+```
+
+### Performance Tuning for Batch Jobs
+
+```hocon
+env {
+ job.mode = "BATCH"
+ parallelism = 8 # Increase for larger clusters
+}
+
+source {
+ Jdbc {
+ split_size = 100000 # Parallel reads
+ fetch_size = 5000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 1000 # Batch inserts
+ max_retries = 3
+ }
+}
+```
+
+### Performance Tuning for Streaming Jobs
+
+```hocon
+env {
+ job.mode = "STREAMING"
+ parallelism = 4
+ checkpoint.interval = 30000 # 30 seconds
+}
+
+source {
+ Kafka {
+ consumer.group = "seatunnel-consumer"
+ max_poll_records = 500
+ }
+}
+```
+
+---
+
+## 🛠️ Development Guide
+
+### Project Structure
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # Claude Code AI skill
+├── seatunnel-mcp/ # MCP server for LLM integration
+├── x2seatunnel/ # DataX to SeaTunnel converter
+└── README.md
+```
+
+### SeaTunnel Core Architecture
+
+```
+seatunnel/
+├── seatunnel-api/ # Core APIs
+├── seatunnel-core/ # Execution engine
+├── seatunnel-engines/ # Engine implementations
+│ ├── seatunnel-engine-flink/
+│ ├── seatunnel-engine-spark/
+│ └── seatunnel-engine-zeta/
+├── seatunnel-connectors/ # Connector implementations
+└── seatunnel-dist/ # Distribution package
+```
+
+### Building SeaTunnel from Source
+
+```bash
+# Full build
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+mvn clean install -DskipTests
+
+# Build specific module
+mvn clean install -pl seatunnel-connectors/seatunnel-connectors-seatunnel-kafka -DskipTests
+```
+
+### Running Tests
+
+```bash
+# Unit tests
+mvn test
+
+# Specific test class
+mvn test -Dtest=MySqlConnectorTest
+
+# Integration tests
+mvn verify
+```
+
+---
+
+## 🐛 Troubleshooting (6 Common Issues)
+
+### Issue 1: ClassNotFoundException: com.mysql.jdbc.Driver
+
+**Solution:**
+```bash
+wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.33.jar
+cp mysql-connector-java-8.0.33.jar $SEATUNNEL_HOME/lib/
+seatunnel.sh -c config/job.conf -e spark
+```
+
+### Issue 2: OutOfMemoryError: Java heap space
+
+**Solution:**
+```bash
+export JVM_OPTS="-Xms2G -Xmx8G"
+echo 'JVM_OPTS="-Xms2G -Xmx8G"' >> $SEATUNNEL_HOME/bin/seatunnel-env.sh
+```
+
+### Issue 3: Connection refused: connect
+
+**Solution:**
+```bash
+# Verify connectivity
+ping source-host
+telnet source-host 3306
+
+# Check credentials
+mysql -h source-host -u root -p
+```
+
+### Issue 4: Table not found during CDC
+
+**Solution:**
+```sql
+-- Check binlog status (CDC requires the binary log to be enabled)
+SHOW VARIABLES LIKE 'log_bin';
+
+-- If it is OFF, enable binlog in my.cnf and restart MySQL:
+--   [mysqld]
+--   log_bin = mysql-bin
+--   binlog_format = row
+```
+
+### Issue 5: Slow Job Performance
+
+**Solution:**
+```hocon
+env {
+ parallelism = 8 # Increase parallelism
+}
+
+source {
+ Jdbc {
+ fetch_size = 5000
+ split_size = 100000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 2000
+ }
+}
+```
+
+### Issue 6: Kafka offset out of range
+
+**Solution:**
+```hocon
+source {
+ Kafka {
+ auto.offset.reset = "earliest" # or "latest"
+ }
+}
+```
+
+---
+
+## ❓ FAQ (8 Common Questions)
+
+**Q: What's the difference between BATCH and STREAMING mode?**
+
+A:
+- **BATCH**: One-time execution, suitable for full database migration
+- **STREAMING**: Continuous execution, suitable for real-time sync and CDC
+
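+In configuration terms, the only required difference is `job.mode` (minimal sketch; pick one per job file):
+
+```hocon
+env { job.mode = "BATCH" }      # one-shot: ends once the source is exhausted
+env { job.mode = "STREAMING" }  # continuous: runs until cancelled
+```
+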
+**Q: How do I handle schema changes during CDC?**
+
+A: Configure auto-detection in source:
+```hocon
+source {
+ Mysql {
+ schema_change_mode = "auto"
+ }
+}
+```
+
+**Q: Can I transform data during synchronization?**
+
+A: Yes, use SQL transform:
+```hocon
+transform {
+ Sql {
+ sql = "SELECT id, UPPER(name) as name FROM source"
+ }
+}
+```
+
+**Q: What's the maximum throughput?**
+
+A: Typical throughput is 100K - 1M records/second per executor, depending on:
+- Hardware (CPU, RAM, Network)
+- Database configuration
+- Data size per record
+- Network latency
+
+**Q: How do I handle errors in production?**
+
+A: Configure restart strategy:
+```hocon
+env {
+ restart_strategy = "exponential_delay"
+ restart_strategy.exponential_delay.initial_delay = 1000
+ restart_strategy.exponential_delay.max_delay = 30000
+ restart_strategy.exponential_delay.multiplier = 2.0
+}
+```
+
+**Q: Is there a web UI for job management?**
+
+A: Yes! Use SeaTunnel Web Project:
+```bash
+git clone https://github.com/apache/seatunnel-web.git
+cd seatunnel-web
+mvn clean install
+java -jar target/seatunnel-web-*.jar
+# Access at http://localhost:8080
+```
+
+**Q: How do I use the SeaTunnel Skill with Claude Code?**
+
+A: After copying to `~/.claude/skills/`, use:
+```bash
+/seatunnel-skill "your question about SeaTunnel"
+```
+
+**Q: Which engine should I use: Spark, Flink, or Zeta?**
+
+A:
+- **Zeta**: Lightweight, no external dependencies, single machine
+- **Spark**: Batch and batch-stream processing on distributed clusters
+- **Flink**: Advanced streaming and CDC on distributed clusters
+
+---
+
+## 🔗 Resources & Links
+
+### Official Documentation
+- [SeaTunnel Website](https://seatunnel.apache.org/)
+- [GitHub Repository](https://github.com/apache/seatunnel)
+- [Connector List](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
+- [HOCON Configuration Guide](https://github.com/lightbend/config/blob/main/HOCON.md)
+
+### Community & Support
+- [Slack Channel](https://the-asf.slack.com/archives/C01CB5186TL)
+- [Mailing Lists](https://seatunnel.apache.org/community/mail-lists/)
+- [GitHub Issues](https://github.com/apache/seatunnel/issues)
+- [Discussion Forum](https://github.com/apache/seatunnel/discussions)
+
+### Related Projects
+- [SeaTunnel Web UI](https://github.com/apache/seatunnel-web)
+- [SeaTunnel Tools](https://github.com/apache/seatunnel-tools)
+- [Apache Kafka](https://kafka.apache.org/)
+- [Apache Flink](https://flink.apache.org/)
+- [Apache Spark](https://spark.apache.org/)
+
+---
+
+## 📄 Individual Tools
+
+### 1. SeaTunnel Skill (New)
+- **Purpose**: AI-powered assistant for SeaTunnel in Claude Code
+- **Location**: [seatunnel-skill/](seatunnel-skill/)
+- **Quick Setup**: `cp -r seatunnel-skill ~/.claude/skills/`
+- **Usage**: `/seatunnel-skill "your question"`
+
+### 2. SeaTunnel MCP Server
+- **Purpose**: Model Context Protocol integration for LLM systems
+- **Location**: [seatunnel-mcp/](seatunnel-mcp/)
+- **English**: [README.md](seatunnel-mcp/README.md)
+- **Chinese**: [README_CN.md](seatunnel-mcp/README_CN.md)
+- **Quick Start**: [QUICK_START.md](seatunnel-mcp/docs/QUICK_START.md)
+
+### 3. x2seatunnel
+- **Purpose**: Convert DataX and other configurations to SeaTunnel format
+- **Location**: [x2seatunnel/](x2seatunnel/)
+- **English**: [README.md](x2seatunnel/README.md)
+- **Chinese**: [README_zh.md](x2seatunnel/README_zh.md)
+
+---
+
+## 🤝 Contributing
+
+Issues and PRs are welcome!
+
+For the main SeaTunnel engine, see [Apache SeaTunnel](https://github.com/apache/seatunnel).
+
+For these tools, please contribute to [SeaTunnel Tools](https://github.com/apache/seatunnel-tools).
+
+---
+
+**Last Updated**: 2026-01-28 | **License**: Apache 2.0
diff --git a/README_CN.md b/README_CN.md
new file mode 100644
index 0000000..5541aa1
--- /dev/null
+++ b/README_CN.md
@@ -0,0 +1,641 @@
+# Apache SeaTunnel 工具集
+
+[English](README.md) | **中文**
+
+Apache SeaTunnel 辅助工具集,重点关注开发者/运维生产力,包括配置转换、LLM 集成、打包和诊断。
+
+## 🎯 工具概览
+
+| 工具 | 用途 | 状态 |
+|------|------|------|
+| **SeaTunnel Skill** | Claude AI 集成 | ✅ 新功能 |
+| **SeaTunnel MCP 服务** | LLM 集成协议 | ✅ 可用 |
+| **x2seatunnel** | 配置转换工具 (DataX → SeaTunnel) | ✅ 可用 |
+
+---
+
+## ⚡ 快速开始
+
+### SeaTunnel Skill (Claude Code 集成)
+
+**安装步骤:**
+
+```bash
+# 1. 克隆本仓库
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
+
+# 2. 复制 seatunnel-skill 到 Claude Code 技能目录
+cp -r seatunnel-skill ~/.claude/skills/
+
+# 3. 重启 Claude Code 或重新加载技能
+# 然后使用: /seatunnel-skill "你的问题"
+```
+
+**快速示例:**
+
+```bash
+# 查询 SeaTunnel 文档
+/seatunnel-skill "如何配置 MySQL 到 PostgreSQL 的数据同步?"
+
+# 获取连接器信息
+/seatunnel-skill "列出所有可用的 Kafka 连接器选项"
+
+# 调试配置问题
+/seatunnel-skill "为什么我的任务出现 OutOfMemoryError 错误?"
+```
+
+### SeaTunnel 核心引擎(直接安装)
+
+```bash
+# 下载二进制文件(推荐)
+wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
+tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
+cd apache-seatunnel-2.3.12
+
+# 验证安装
+./bin/seatunnel.sh --version
+
+# 运行第一个任务
+./bin/seatunnel.sh -c config/hello_world.conf -e spark
+```
+
+---
+
+## 📋 功能概览
+
+### SeaTunnel Skill
+- 🤖 **AI 助手**: 获得 SeaTunnel 概念和配置的即时帮助
+- 📚 **知识集成**: 查询官方文档和最佳实践
+- 🔍 **智能调试**: 分析错误并提出修复建议
+- 💡 **代码示例**: 为您的用例生成配置示例
+
+### SeaTunnel 核心引擎
+- **多模式支持**: 结构化、非结构化和半结构化数据
+- **100+ 连接器**: 数据库、数据仓库、云服务、消息队列
+- **多引擎支持**: Zeta(轻量级)、Spark、Flink
+- **同步模式**: 批处理、流处理、CDC(变更数据捕获)
+- **实时性能**: 每秒 100K - 1M 条记录吞吐量
+
+---
+
+## 🔧 安装与设置
+
+### 方法 1: SeaTunnel Skill (AI 集成)
+
+**第一步:复制技能文件**
+```bash
+mkdir -p ~/.claude/skills
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+**第二步:验证安装**
+```bash
+# 在 Claude Code 中尝试:
+/seatunnel-skill "什么是 SeaTunnel?"
+```
+
+**第三步:开始使用**
+```bash
+# 帮助配置
+/seatunnel-skill "创建一个 MySQL 到 Elasticsearch 的任务配置"
+
+# 故障排除
+/seatunnel-skill "我的 Kafka 连接器一直超时"
+
+# 学习功能
+/seatunnel-skill "在 SeaTunnel 中解释 CDC(变更数据捕获)"
+```
+
+### 方法 2: 二进制安装
+
+**支持平台**: Linux、macOS、Windows
+
+```bash
+# 下载最新版本
+VERSION=2.3.12
+wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
+
+# 解压
+tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
+cd apache-seatunnel-${VERSION}
+
+# 设置环境
+export JAVA_HOME=/path/to/java
+export PATH=$PATH:$(pwd)/bin
+
+# 验证
+seatunnel.sh --version
+```
+
+### 方法 3: 从源代码构建
+
+```bash
+# 克隆仓库
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# 构建
+mvn clean install -DskipTests
+
+# 从分发目录运行
+cd seatunnel-dist/target/apache-seatunnel-*-bin/apache-seatunnel-*
+./bin/seatunnel.sh --version
+```
+
+### 方法 4: Docker
+
+```bash
+# 拉取官方镜像
+docker pull apache/seatunnel:latest
+
+# 运行容器
+docker run -it apache/seatunnel:latest /bin/bash
+
+# 直接运行任务
+docker run -v /path/to/config:/config \
+ apache/seatunnel:latest \
+ seatunnel.sh -c /config/job.conf -e spark
+```
+
+---
+
+## 💻 使用指南
+
+### 用例 1: MySQL 到 PostgreSQL(批处理)
+
+**config/mysql_to_postgres.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "MySQL 到 PostgreSQL"
+}
+
+source {
+ Jdbc {
+ driver = "com.mysql.cj.jdbc.Driver"
+ url = "jdbc:mysql://mysql-host:3306/mydb"
+ user = "root"
+ password = "password"
+ query = "SELECT * FROM users"
+ connection_check_timeout_sec = 100
+ }
+}
+
+sink {
+ Jdbc {
+ driver = "org.postgresql.Driver"
+ url = "jdbc:postgresql://pg-host:5432/mydb"
+ user = "postgres"
+ password = "password"
+ database = "mydb"
+ table = "users"
+ primary_keys = ["id"]
+ connection_check_timeout_sec = 100
+ }
+}
+```
+
+**运行:**
+```bash
+seatunnel.sh -c config/mysql_to_postgres.conf -e spark
+```
+
+### 用例 2: Kafka 流到 Elasticsearch
+
+**config/kafka_to_es.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "Kafka 到 Elasticsearch"
+ parallelism = 2
+}
+
+source {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "events"
+ consumer.group = "seatunnel-group"
+ format = "json"
+ schema = {
+ fields {
+ event_id = "bigint"
+ event_name = "string"
+ timestamp = "bigint"
+ }
+ }
+ }
+}
+
+sink {
+ Elasticsearch {
+ hosts = ["es-host:9200"]
+ index = "events"
+ username = "elastic"
+ password = "password"
+ }
+}
+```
+
+**运行:**
+```bash
+seatunnel.sh -c config/kafka_to_es.conf -e flink
+```
+
+### 用例 3: MySQL CDC 到 Kafka
+
+**config/mysql_cdc_kafka.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "MySQL CDC 到 Kafka"
+}
+
+source {
+ Mysql {
+ server_id = 5400
+ hostname = "mysql-host"
+ port = 3306
+ username = "root"
+ password = "password"
+ database = ["mydb"]
+ table = ["users", "orders"]
+ startup.mode = "initial"
+ }
+}
+
+sink {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "mysql_cdc"
+ format = "canal_json"
+ semantic = "EXACTLY_ONCE"
+ }
+}
+```
+
+**运行:**
+```bash
+seatunnel.sh -c config/mysql_cdc_kafka.conf -e flink
+```
+
+---
+
+## 📚 API 参考
+
+### 核心连接器类型
+
+**源连接器**
+- `Jdbc` - 通用 JDBC 数据库(MySQL、PostgreSQL、Oracle、SQL Server)
+- `Kafka` - Apache Kafka 主题
+- `Mysql` - 支持 CDC 的 MySQL
+- `MongoDB` - MongoDB 集合
+- `PostgreSQL` - 支持 CDC 的 PostgreSQL
+- `S3` - Amazon S3 和兼容存储
+- `Http` - HTTP/HTTPS 端点
+- `FakeSource` - 用于测试
+
+**宿连接器**
+- `Jdbc` - 写入 JDBC 兼容数据库
+- `Kafka` - 发布到 Kafka 主题
+- `Elasticsearch` - 写入 Elasticsearch 索引
+- `S3` - 写入 S3 存储桶
+- `Redis` - 写入 Redis
+- `HBase` - 写入 HBase 表
+- `Console` - 输出到控制台
+
+**转换连接器**
+- `Sql` - 执行 SQL 转换
+- `FieldMapper` - 列重命名/映射(见下方示例)
+- `JsonPath` - 从 JSON 提取数据
+
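+一个最小的 `FieldMapper` 示意(映射内容为示例;`field_mapper` 选项名以所用 SeaTunnel 版本的 transform-v2 文档为准):
+
+```hocon
+transform {
+  FieldMapper {
+    # 保留 id,并将 name 在输出中重命名为 user_name
+    field_mapper = {
+      id = id
+      name = user_name
+    }
+  }
+}
+```
+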
+---
+
+## ⚙️ 配置与优化
+
+### 环境变量
+
+```bash
+# Java 配置
+export JAVA_HOME=/path/to/java
+export JVM_OPTS="-Xms1G -Xmx4G"
+
+# Spark 配置(使用 Spark 引擎时)
+export SPARK_HOME=/path/to/spark
+export SPARK_MASTER=spark://master:7077
+
+# Flink 配置(使用 Flink 引擎时)
+export FLINK_HOME=/path/to/flink
+
+# SeaTunnel 配置
+export SEATUNNEL_HOME=/path/to/seatunnel
+```
+
+### 批处理任务性能调优
+
+```hocon
+env {
+ job.mode = "BATCH"
+ parallelism = 8 # 根据集群大小增加
+}
+
+source {
+ Jdbc {
+ split_size = 100000 # 并行读取
+ fetch_size = 5000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 1000 # 批量插入
+ max_retries = 3
+ }
+}
+```
+
+### 流处理任务性能调优
+
+```hocon
+env {
+ job.mode = "STREAMING"
+ parallelism = 4
+ checkpoint.interval = 30000 # 30 秒
+}
+
+source {
+ Kafka {
+ consumer.group = "seatunnel-consumer"
+ max_poll_records = 500
+ }
+}
+```
+
+---
+
+## 🛠️ 开发指南
+
+### 项目结构
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # Claude Code AI 技能
+├── seatunnel-mcp/ # LLM 集成 MCP 服务
+├── x2seatunnel/ # DataX 到 SeaTunnel 转换器
+└── README_CN.md
+```
+
+### SeaTunnel 核心架构
+
+```
+seatunnel/
+├── seatunnel-api/ # 核心 API
+├── seatunnel-core/ # 执行引擎
+├── seatunnel-engines/ # 引擎实现
+│ ├── seatunnel-engine-flink/
+│ ├── seatunnel-engine-spark/
+│ └── seatunnel-engine-zeta/
+├── seatunnel-connectors/ # 连接器实现
+└── seatunnel-dist/ # 分发包
+```
+
+### 从源代码构建 SeaTunnel
+
+```bash
+# 完整构建
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+mvn clean install -DskipTests
+
+# 构建特定模块
+mvn clean install -pl seatunnel-connectors/seatunnel-connectors-seatunnel-kafka -DskipTests
+```
+
+### 运行测试
+
+```bash
+# 单元测试
+mvn test
+
+# 特定测试类
+mvn test -Dtest=MySqlConnectorTest
+
+# 集成测试
+mvn verify
+```
+
+---
+
+## 🐛 故障排查(6 个常见问题)
+
+### 问题 1: ClassNotFoundException: com.mysql.jdbc.Driver
+
+**解决方案:**
+```bash
+wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.33.jar
+cp mysql-connector-java-8.0.33.jar $SEATUNNEL_HOME/lib/
+seatunnel.sh -c config/job.conf -e spark
+```
+
+### 问题 2: OutOfMemoryError: Java heap space
+
+**解决方案:**
+```bash
+export JVM_OPTS="-Xms2G -Xmx8G"
+echo 'JVM_OPTS="-Xms2G -Xmx8G"' >> $SEATUNNEL_HOME/bin/seatunnel-env.sh
+```
+
+### 问题 3: Connection refused: connect
+
+**解决方案:**
+```bash
+# 验证连接
+ping source-host
+telnet source-host 3306
+
+# 检查凭证
+mysql -h source-host -u root -p
+```
+
+### 问题 4: CDC 期间找不到表
+
+**解决方案:**
+```sql
+-- 检查二进制日志状态(CDC 要求开启 binlog)
+SHOW VARIABLES LIKE 'log_bin';
+
+-- 如果为 OFF,在 my.cnf 中启用 binlog 并重启 MySQL:
+--   [mysqld]
+--   log_bin = mysql-bin
+--   binlog_format = row
+```
+
+### 问题 5: 任务性能缓慢
+
+**解决方案:**
+```hocon
+env {
+ parallelism = 8 # 增加并行性
+}
+
+source {
+ Jdbc {
+ fetch_size = 5000
+ split_size = 100000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 2000
+ }
+}
+```
+
+### 问题 6: Kafka 偏移量超出范围
+
+**解决方案:**
+```hocon
+source {
+ Kafka {
+ auto.offset.reset = "earliest" # 或 "latest"
+ }
+}
+```
+
+---
+
+## ❓ 常见问题(8 个常见问题)
+
+**Q: BATCH 和 STREAMING 模式有什么区别?**
+
+A:
+- **BATCH**: 一次性执行,适合全量数据库迁移
+- **STREAMING**: 持续执行,适合实时同步和 CDC
+
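+从配置上看,两者唯一必需的差异是 `job.mode`(最小示意;每个任务文件只取其一):
+
+```hocon
+env { job.mode = "BATCH" }      # 一次性执行:源读完即结束
+env { job.mode = "STREAMING" }  # 持续执行:直到任务被取消
+```
+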
+**Q: 如何在 CDC 期间处理架构更改?**
+
+A: 在源中配置自动检测:
+```hocon
+source {
+ Mysql {
+ schema_change_mode = "auto"
+ }
+}
+```
+
+**Q: 我能在同步期间转换数据吗?**
+
+A: 可以,使用 SQL 转换:
+```hocon
+transform {
+ Sql {
+ sql = "SELECT id, UPPER(name) as name FROM source"
+ }
+}
+```
+
+**Q: 最大吞吐量是多少?**
+
+A: 典型吞吐量为每个执行器每秒 100K - 1M 条记录,具体取决于:
+- 硬件(CPU、RAM、网络)
+- 数据库配置
+- 每条记录的数据大小
+- 网络延迟
+
+**Q: 如何在生产环境中处理错误?**
+
+A: 配置重启策略:
+```hocon
+env {
+ restart_strategy = "exponential_delay"
+ restart_strategy.exponential_delay.initial_delay = 1000
+ restart_strategy.exponential_delay.max_delay = 30000
+ restart_strategy.exponential_delay.multiplier = 2.0
+}
+```
+
+**Q: 是否有用于任务管理的 Web UI?**
+
+A: 是的!使用 SeaTunnel Web 项目:
+```bash
+git clone https://github.com/apache/seatunnel-web.git
+cd seatunnel-web
+mvn clean install
+java -jar target/seatunnel-web-*.jar
+# 访问 http://localhost:8080
+```
+
+**Q: 如何在 Claude Code 中使用 SeaTunnel Skill?**
+
+A: 复制到 `~/.claude/skills/` 后,使用:
+```bash
+/seatunnel-skill "关于 SeaTunnel 的问题"
+```
+
+**Q: 应该使用哪个引擎:Spark、Flink 还是 Zeta?**
+
+A:
+- **Zeta**: 轻量级,无外部依赖,单机
+- **Spark**: 分布式集群的批处理和批流混合
+- **Flink**: 分布式集群的高级流处理和 CDC
+
+---
+
+## 🔗 资源与链接
+
+### 官方文档
+- [SeaTunnel 官网](https://seatunnel.apache.org/)
+- [GitHub 仓库](https://github.com/apache/seatunnel)
+- [连接器列表](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
+- [HOCON 配置指南](https://github.com/lightbend/config/blob/main/HOCON.md)
+
+### 社区与支持
+- [Slack 频道](https://the-asf.slack.com/archives/C01CB5186TL)
+- [邮件列表](https://seatunnel.apache.org/community/mail-lists/)
+- [GitHub Issues](https://github.com/apache/seatunnel/issues)
+- [讨论论坛](https://github.com/apache/seatunnel/discussions)
+
+### 相关项目
+- [SeaTunnel Web UI](https://github.com/apache/seatunnel-web)
+- [SeaTunnel 工具集](https://github.com/apache/seatunnel-tools)
+- [Apache Kafka](https://kafka.apache.org/)
+- [Apache Flink](https://flink.apache.org/)
+- [Apache Spark](https://spark.apache.org/)
+
+---
+
+## 📄 单个工具说明
+
+### 1. SeaTunnel Skill(新功能)
+- **用途**: Claude Code 中 SeaTunnel 的 AI 助手
+- **位置**: [seatunnel-skill/](seatunnel-skill/)
+- **快速设置**: `cp -r seatunnel-skill ~/.claude/skills/`
+- **使用方法**: `/seatunnel-skill "你的问题"`
+
+### 2. SeaTunnel MCP 服务
+- **用途**: LLM 系统的模型上下文协议集成
+- **位置**: [seatunnel-mcp/](seatunnel-mcp/)
+- **英文**: [README.md](seatunnel-mcp/README.md)
+- **中文**: [README_CN.md](seatunnel-mcp/README_CN.md)
+- **快速开始**: [QUICK_START.md](seatunnel-mcp/docs/QUICK_START.md)
+
+### 3. x2seatunnel
+- **用途**: 将 DataX 等配置转换为 SeaTunnel 格式
+- **位置**: [x2seatunnel/](x2seatunnel/)
+- **英文**: [README.md](x2seatunnel/README.md)
+- **中文**: [README_zh.md](x2seatunnel/README_zh.md)
+
+---
+
+## 🤝 贡献
+
+欢迎提交 Issues 和 Pull Requests!
+
+对于主要的 SeaTunnel 引擎,请参阅 [Apache SeaTunnel](https://github.com/apache/seatunnel)。
+
+对于这些工具,请贡献到 [SeaTunnel 工具集](https://github.com/apache/seatunnel-tools)。
+
+---
+
+**最后更新**: 2026-01-28 | **许可证**: Apache 2.0
\ No newline at end of file
diff --git a/SKILL_SETUP_GUIDE.md b/SKILL_SETUP_GUIDE.md
new file mode 100644
index 0000000..c9922d3
--- /dev/null
+++ b/SKILL_SETUP_GUIDE.md
@@ -0,0 +1,673 @@
+# SeaTunnel Skill Setup Guide
+
+**English** | [中文](#中文版本)
+
+## Getting Started with SeaTunnel Skill in Claude Code
+
+SeaTunnel Skill is an AI-powered assistant for Apache SeaTunnel integrated directly into Claude Code. It helps you with configuration, troubleshooting, learning, and best practices.
+
+---
+
+## Installation
+
+### Step 1: Clone the Repository
+
+```bash
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
+```
+
+### Step 2: Locate Skills Directory
+
Claude Code stores skills in your home directory. Create the skills directory if it doesn't exist:
+
+```bash
+# Create ~/.claude/skills directory if it doesn't exist
+mkdir -p ~/.claude/skills
+```
+
+**Directory Locations by OS:**
+- **macOS/Linux**: `~/.claude/skills/`
+- **Windows**: `%USERPROFILE%\.claude\skills\`
+
+### Step 3: Copy the Skill
+
+```bash
+# Copy seatunnel-skill to Claude Code skills directory
+cp -r seatunnel-skill ~/.claude/skills/
+
+# Verify installation
+ls ~/.claude/skills/seatunnel-skill/
+```
+
+You should see:
+```
+SKILL.md # Skill definition and metadata
+README.md # Skill documentation
+```
+
+### Step 4: Verify Installation
+
+**Option A: Using Claude Code Terminal**
+
+```bash
+# In Claude Code terminal, run:
+ls ~/.claude/skills/seatunnel-skill/
+
+# You should see the skill files listed
+```
+
+**Option B: Check Skill Loading**
+
+In Claude Code, you might see a skill reload notification. If not:
+1. Restart Claude Code
+2. Or reload the skills manually through the skill menu
+
+### Step 5: Test the Skill
+
+Open Claude Code and try:
+
+```bash
+/seatunnel-skill "What is SeaTunnel?"
+```
+
+You should get an AI-powered response about SeaTunnel.
+
+---
+
+## Usage Examples
+
+### Getting Help with Configuration
+
+**Question:** How do I configure a MySQL to PostgreSQL job?
+
+```bash
+/seatunnel-skill "Create a job configuration to sync data from MySQL to
PostgreSQL with batch mode"
+```
+
+**Response:** The skill will provide a complete HOCON configuration example with explanations.
+
+### Learning SeaTunnel Concepts
+
+**Question:** Explain CDC mode
+
+```bash
+/seatunnel-skill "Explain Change Data Capture (CDC) in SeaTunnel. When should
I use it?"
+```
+
+**Response:** Comprehensive explanation of CDC, use cases, and configuration examples.
+
+### Troubleshooting
+
+**Question:** My job is failing
+
+```bash
+/seatunnel-skill "I'm getting 'OutOfMemoryError: Java heap space' in my batch
job. How do I fix it?"
+```
+
+**Response:** Detailed diagnosis and solutions, including:
+- Root cause explanation
+- Configuration fixes
+- Environment variable adjustments
+- Performance tuning tips
+
+### Connector Information
+
+**Question:** Available Kafka options
+
+```bash
+/seatunnel-skill "What are all the configuration options for Kafka source
connector?"
+```
+
+**Response:** Complete list of options with descriptions and examples.
+
+### Performance Optimization
+
+**Question:** How to optimize streaming
+
+```bash
+/seatunnel-skill "How do I optimize a Kafka to Elasticsearch streaming job for
maximum throughput?"
+```
+
+**Response:** Performance tuning recommendations for parallelism, batch sizes, and resource allocation.
+
+---
+
+## Common Questions
+
+### Q: Why doesn't the skill show up?
+
+**A:** Make sure you:
+1. Copied the folder to `~/.claude/skills/` (not a subdirectory)
+2. Restarted Claude Code or reloaded skills
+3. The folder is named exactly `seatunnel-skill`
+
+**Fix:**
+```bash
+# Verify the path
+ls -la ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# If it doesn't exist, copy it again
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### Q: How do I update the skill?
+
+**A:**
+```bash
+# Navigate to seatunnel-tools directory
+cd /path/to/seatunnel-tools
+
+# Pull latest changes
+git pull origin main
+
+# Update the skill
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+
+# Restart Claude Code
+```
+
+### Q: Can I customize the skill?
+
+**A:** Yes! Edit `seatunnel-skill/SKILL.md`:
+
+```bash
+# Open the skill definition
+nano ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# Make your changes
+# The skill will use your customizations
+```
+
+### Q: Where are skill responses saved?
+
+**A:** Skill responses are part of your Claude Code conversation history. They are saved in:
+- Local Claude Code workspace
+- Optionally synced to Claude.ai if configured
+
+---
+
+## Advanced Usage
+
+### Chaining Questions
+
+You can build on previous questions in the same conversation:
+
+```bash
+/seatunnel-skill "What is batch mode?"
+
+# In next message, reference previous context:
+/seatunnel-skill "Show me a complete example combining batch mode with MySQL
source"
+
+# The skill understands the context from previous messages
+```
+
+### Getting Code Examples
+
+The skill can generate complete, production-ready configurations:
+
+```bash
+/seatunnel-skill "Generate a complete SeaTunnel job configuration that:
+1. Reads from MySQL database 'sales_db' table 'orders'
+2. Filters orders from last 30 days
+3. Writes to PostgreSQL 'analytics_db' table 'orders_processed'
+4. Uses batch mode with 4 parallelism"
+```
+
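+A response might resemble the sketch below. Everything in it is illustrative: hosts and credentials are placeholders, and the 30-day filter assumes an `order_date` column that the prompt does not actually specify:
+
+```hocon
+env {
+  job.mode = "BATCH"
+  parallelism = 4
+}
+
+source {
+  Jdbc {
+    driver = "com.mysql.cj.jdbc.Driver"
+    url = "jdbc:mysql://mysql-host:3306/sales_db"
+    user = "user"
+    password = "password"
+    # Filter column name is an assumption
+    query = "SELECT * FROM orders WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)"
+  }
+}
+
+sink {
+  Jdbc {
+    driver = "org.postgresql.Driver"
+    url = "jdbc:postgresql://pg-host:5432/analytics_db"
+    user = "postgres"
+    password = "password"
+    database = "analytics_db"
+    table = "orders_processed"
+  }
+}
+```
+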
+### Integration with Your Workflow
+
+**Development Pipeline:**
+```bash
+# 1. Understand requirements
+/seatunnel-skill "Explain how to set up CDC from MySQL"
+
+# 2. Design solution
+/seatunnel-skill "Design a real-time data pipeline from MySQL CDC to Kafka"
+
+# 3. Generate configuration
+/seatunnel-skill "Generate the complete HOCON configuration for the pipeline"
+
+# 4. Debug issues
+/seatunnel-skill "My job is timing out. Debug this configuration: [paste
config]"
+
+# 5. Optimize performance
+/seatunnel-skill "How can I optimize this job for better throughput?"
+```
+
+---
+
+## Troubleshooting
+
+### Issue: Skill not found error
+
+```
+Error: Unknown skill: seatunnel-skill
+```
+
+**Solution:**
+```bash
+# 1. Verify skill exists
+ls ~/.claude/skills/seatunnel-skill/
+
+# 2. Check file permissions
+chmod +r ~/.claude/skills/seatunnel-skill/*
+
+# 3. Restart Claude Code and try again
+```
+
+### Issue: Outdated responses
+
+**Solution:**
+```bash
+# Update skill to latest version
+cd seatunnel-tools
+git pull origin main
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### Issue: Responses are too generic
+
+**Try:**
+```bash
+# Be more specific in your question:
+# Instead of:
+/seatunnel-skill "How to configure MySQL?"
+
+# Try:
+/seatunnel-skill "Configure MySQL source for a batch job that reads table
'users' with filters"
+```
+
+---
+
+## Tips for Best Results
+
+1. **Be Specific**: More details in your question = better responses
+2. **Include Context**: Mention your use case (batch/streaming, source/sink types)
+3. **Show Configuration**: Paste your HOCON config for debugging
+4. **Reference Versions**: Specify SeaTunnel version (e.g., 2.3.12)
+5. **Ask Follow-ups**: The skill remembers conversation context
+
+---
+
+## Keyboard Shortcuts
+
+- **Cmd+K** (macOS) / **Ctrl+K** (Windows/Linux): Quick open skill
+- **Type** `/seatunnel-skill`: Invoke skill
+- **Tab**: Auto-complete skill parameters
+- **Esc**: Cancel skill input
+
+---
+
+## File Locations
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # AI Skill
+│ ├── SKILL.md # Skill definition
+│ └── README.md # Documentation
+├── README.md # Main documentation
+├── README_CN.md # Chinese documentation
+└── SKILL_SETUP_GUIDE.md # This file
+```
+
+---
+
+## Getting Help
+
+- **Skill Issues**: Try `/seatunnel-skill "How do I troubleshoot..."`
+- **SeaTunnel Questions**: Ask the skill directly
+- **Installation Help**: See [README.md](README.md) or [README_CN.md](README_CN.md)
+- **Report Issues**: [GitHub Issues](https://github.com/apache/seatunnel-tools/issues)
+
+---
+
+## Next Steps
+
+1. ✅ Install skill (`cp -r seatunnel-skill ~/.claude/skills/`)
+2. ✅ Test skill (`/seatunnel-skill "What is SeaTunnel?"`)
+3. 📚 Explore examples in this guide
+4. 🚀 Use skill for your SeaTunnel projects
+5. 📝 Share feedback and improvements
+
+---
+
+---
+
+# 中文版本
+
+# SeaTunnel Skill 安装使用指南
+
+## 开始使用 SeaTunnel Skill
+
+SeaTunnel Skill 是一个集成到 Claude Code 中的 AI 助手,帮助您进行 Apache SeaTunnel 的配置、故障排查、学习和最佳实践。
+
+---
+
+## 安装步骤
+
+### 第一步:克隆仓库
+
+```bash
+git clone https://github.com/apache/seatunnel-tools.git
+cd seatunnel-tools
+```
+
+### 第二步:定位技能目录
+
+Claude Code 在您的主目录中存储技能。如果目录不存在,请创建:
+
+```bash
+# 如果目录不存在,则创建 ~/.claude/skills 目录
+mkdir -p ~/.claude/skills
+```
+
+**不同操作系统的目录位置:**
+- **macOS/Linux**: `~/.claude/skills/`
+- **Windows**: `%USERPROFILE%\.claude\skills\`
+
+### 第三步:复制技能文件
+
+```bash
+# 复制 seatunnel-skill 到 Claude Code 技能目录
+cp -r seatunnel-skill ~/.claude/skills/
+
+# 验证安装
+ls ~/.claude/skills/seatunnel-skill/
+```
+
+您应该看到:
+```
+SKILL.md # 技能定义和元数据
+README.md # 技能文档
+```
+
+### 第四步:验证安装
+
+**选项 A:使用 Claude Code 终端**
+
+```bash
+# 在 Claude Code 终端中运行:
+ls ~/.claude/skills/seatunnel-skill/
+
+# 您应该看到技能文件列出
+```
+
+**选项 B:检查技能加载**
+
+在 Claude Code 中,您可能会看到技能重新加载通知。如果没有:
+1. 重启 Claude Code
+2. 或通过技能菜单手动重新加载
+
+### 第五步:测试技能
+
+打开 Claude Code 并尝试:
+
+```bash
+/seatunnel-skill "什么是 SeaTunnel?"
+```
+
+您应该获得关于 SeaTunnel 的 AI 驱动响应。
+
+---
+
+## 使用示例
+
+### 获取配置帮助
+
+**问题:** 如何配置从 MySQL 到 PostgreSQL 的任务?
+
+```bash
+/seatunnel-skill "创建一个任务配置,以批处理模式将数据从 MySQL 同步到 PostgreSQL"
+```
+
+**响应:** 技能将提供完整的 HOCON 配置示例和说明。
+
+### 学习 SeaTunnel 概念
+
+**问题:** 解释 CDC 模式
+
+```bash
+/seatunnel-skill "在 SeaTunnel 中解释变更数据捕获 (CDC)。何时应该使用它?"
+```
+
+**响应:** 关于 CDC 的全面解释、用例和配置示例。
+
+### 故障排查
+
+**问题:** 我的任务失败了
+
+```bash
+/seatunnel-skill "我的批处理任务出现 'OutOfMemoryError: Java heap space' 错误。我应该如何修复?"
+```
+
+**响应:** 详细的诊断和解决方案,包括:
+- 根本原因说明
+- 配置修复
+- 环境变量调整
+- 性能调优建议
+
+### 连接器信息
+
+**问题:** 可用的 Kafka 选项
+
+```bash
+/seatunnel-skill "Kafka 源连接器的所有配置选项是什么?"
+```
+
+**响应:** 完整的选项列表,带有描述和示例。
+
+### 性能优化
+
+**问题:** 如何优化流处理
+
+```bash
+/seatunnel-skill "如何优化从 Kafka 到 Elasticsearch 的流处理任务以获得最大吞吐量?"
+```
+
+**响应:** 并行度、批大小和资源分配的性能调优建议。
+
+---
+
+## 常见问题
+
+### Q: 为什么技能不显示?
+
+**A:** 请确保您:
+1. 将文件夹复制到 `~/.claude/skills/`(不是子目录)
+2. 重启了 Claude Code 或重新加载了技能
+3. 文件夹名称完全是 `seatunnel-skill`
+
+**修复:**
+```bash
+# 验证路径
+ls -la ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# 如果不存在,再次复制
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### Q: 如何更新技能?
+
+**A:**
+```bash
+# 导航到 seatunnel-tools 目录
+cd /path/to/seatunnel-tools
+
+# 拉取最新更改
+git pull origin main
+
+# 更新技能
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+
+# 重启 Claude Code
+```
+
+### Q: 我可以自定义技能吗?
+
+**A:** 可以!编辑 `seatunnel-skill/SKILL.md`:
+
+```bash
+# 打开技能定义
+nano ~/.claude/skills/seatunnel-skill/SKILL.md
+
+# 进行更改
+# 技能将使用您的自定义设置
+```
+
+### Q: 技能响应保存在哪里?
+
+**A:** 技能响应是您的 Claude Code 对话历史的一部分。它们保存在:
+- 本地 Claude Code 工作区
+- 如果配置,可选地同步到 Claude.ai
+
+---
+
+## 高级用法
+
+### 链接问题
+
+您可以在同一对话中基于之前的问题进行构建:
+
+```bash
+/seatunnel-skill "什么是批处理模式?"
+
+# 在下一条消息中,参考之前的上下文:
+/seatunnel-skill "展示一个结合批处理模式和 MySQL 源的完整示例"
+
+# 技能理解来自之前消息的上下文
+```
+
+### 获取代码示例
+
+技能可以生成完整的、生产就绪的配置:
+
+```bash
+/seatunnel-skill "生成一个完整的 SeaTunnel 任务配置,该配置:
+1. 从 MySQL 数据库 'sales_db' 表 'orders' 读取
+2. 过滤最近 30 天的订单
+3. 写入 PostgreSQL 'analytics_db' 表 'orders_processed'
+4. 使用 4 个并行度的批处理模式"
+```
+
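+生成的回答可能类似下面的示意。内容均为示例:主机与凭证为占位符,"最近 30 天"的过滤假设存在 order_date 列(提示词中并未指明):
+
+```hocon
+env {
+  job.mode = "BATCH"
+  parallelism = 4
+}
+
+source {
+  Jdbc {
+    driver = "com.mysql.cj.jdbc.Driver"
+    url = "jdbc:mysql://mysql-host:3306/sales_db"
+    user = "user"
+    password = "password"
+    # 过滤所用列名为假设
+    query = "SELECT * FROM orders WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)"
+  }
+}
+
+sink {
+  Jdbc {
+    driver = "org.postgresql.Driver"
+    url = "jdbc:postgresql://pg-host:5432/analytics_db"
+    user = "postgres"
+    password = "password"
+    database = "analytics_db"
+    table = "orders_processed"
+  }
+}
+```
+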
+### 与您的工作流集成
+
+**开发流程:**
+```bash
+# 1. 了解需求
+/seatunnel-skill "解释如何从 MySQL 设置 CDC"
+
+# 2. 设计解决方案
+/seatunnel-skill "设计从 MySQL CDC 到 Kafka 的实时数据管道"
+
+# 3. 生成配置
+/seatunnel-skill "为管道生成完整的 HOCON 配置"
+
+# 4. 调试问题
+/seatunnel-skill "我的任务超时。调试此配置:[粘贴配置]"
+
+# 5. 优化性能
+/seatunnel-skill "我应该如何优化此任务以获得更好的吞吐量?"
+```
+
+---
+
+## 故障排查
+
+### 问题:技能未找到错误
+
+```
+Error: Unknown skill: seatunnel-skill
+```
+
+**解决方案:**
+```bash
+# 1. 验证技能存在
+ls ~/.claude/skills/seatunnel-skill/
+
+# 2. 检查文件权限
+chmod +r ~/.claude/skills/seatunnel-skill/*
+
+# 3. 重启 Claude Code 并重试
+```
+
+### 问题:响应过时
+
+**解决方案:**
+```bash
+# 更新技能到最新版本
+cd seatunnel-tools
+git pull origin main
+rm -rf ~/.claude/skills/seatunnel-skill
+cp -r seatunnel-skill ~/.claude/skills/
+```
+
+### 问题:响应过于笼统
+
+**尝试:**
+```bash
+# 在问题中更具体:
+# 不是:
+/seatunnel-skill "如何配置 MySQL?"
+
+# 而是:
+/seatunnel-skill "配置 MySQL 源进行批处理任务,读取 'users' 表并应用过滤器"
+```
+
+---
+
+## 获得最佳结果的提示
+
+1. **具体明确**: 问题中的细节越多 = 响应越好
+2. **包含上下文**: 提及您的用例(批/流、源/宿类型)
+3. **显示配置**: 粘贴您的 HOCON 配置以进行调试
+4. **参考版本**: 指定 SeaTunnel 版本(例如 2.3.12)
+5. **提出后续问题**: 技能会记住对话上下文
+
+---
+
+## 键盘快捷键
+
+- **Cmd+K** (macOS) / **Ctrl+K** (Windows/Linux): 快速打开技能
+- **输入** `/seatunnel-skill`: 调用技能
+- **Tab**: 自动完成技能参数
+- **Esc**: 取消技能输入
+
+---
+
+## 文件位置
+
+```
+seatunnel-tools/
+├── seatunnel-skill/ # AI 技能
+│ ├── SKILL.md # 技能定义
+│ └── README.md # 文档
+├── README.md # 主文档
+├── README_CN.md # 中文文档
+└── SKILL_SETUP_GUIDE.md # 此文件
+```
+
+---
+
+## 获取帮助
+
+- **技能问题**: 尝试 `/seatunnel-skill "我应该如何故障排查..."`
+- **SeaTunnel 问题**: 直接向技能提问
+- **安装帮助**: 查看 [README.md](README.md) 或 [README_CN.md](README_CN.md)
+- **报告问题**: [GitHub Issues](https://github.com/apache/seatunnel-tools/issues)
+
+---
+
+## 后续步骤
+
+1. ✅ 安装技能 (`cp -r seatunnel-skill ~/.claude/skills/`)
+2. ✅ 测试技能 (`/seatunnel-skill "什么是 SeaTunnel?"`)
+3. 📚 探索本指南中的示例
+4. 🚀 将技能用于您的 SeaTunnel 项目
+5. 📝 分享反馈和改进
+
+---
+
+**最后更新**: 2026-01-28 | **许可证**: Apache 2.0
\ No newline at end of file
diff --git a/seatunnel-skill/SKILL.md b/seatunnel-skill/SKILL.md
new file mode 100644
index 0000000..b0ceb2d
--- /dev/null
+++ b/seatunnel-skill/SKILL.md
@@ -0,0 +1,1212 @@
+---
+name: seatunnel-skill
+description: Apache SeaTunnel - A multimodal, high-performance, distributed data integration tool for massive data synchronization across 100+ connectors
+author: auto-generated by repo2skill
+platform: github
+source: https://github.com/apache/seatunnel
+tags: [data-integration, data-pipeline, etl, elt, real-time-streaming, batch-processing, cdc, distributed-computing, apache, java]
+version: 2.3.13
+generated: 2026-01-28
+license: Apache 2.0
+repository: apache/seatunnel
+---
+
+# Apache SeaTunnel OpenCode Skill
+
+Apache SeaTunnel is a **multimodal, high-performance, distributed data integration tool** capable of synchronizing vast amounts of data daily. It connects hundreds of evolving data sources with support for real-time, CDC (Change Data Capture), and full database synchronization.
+
+## Quick Start
+
+### Prerequisites
+
+- **Java**: JDK 8 or higher
+- **Maven**: 3.6.0 or higher (for building from source)
+- **Python**: 3.7+ (optional, for Python API)
+- **Git**: For cloning the repository
+
+### Installation
+
+#### 1. Clone Repository
+```bash
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+```
+
+#### 2. Build from Source
+```bash
+# Build entire project
+mvn clean install -DskipTests
+
+# Build specific module
+mvn clean install -pl seatunnel-core -DskipTests
+```
+
+#### 3. Download Pre-built Binary (Recommended)
+Visit the [official download page](https://seatunnel.apache.org/download) and select your version:
+
+```bash
+# Example: Download SeaTunnel 2.3.12
+wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
+tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
+cd apache-seatunnel-2.3.12
+```
+
+#### 4. Basic Configuration
+```bash
+# Set JAVA_HOME
+export JAVA_HOME=/path/to/java
+
+# Add to PATH
+export PATH=$PATH:/path/to/seatunnel/bin
+```
+
+#### 5. Verify Installation
+```bash
+seatunnel.sh --version
+```
+
+### Hello World Example
+
+Create a simple data integration job:
+
+**config/hello_world.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "Hello World"
+}
+
+source {
+ FakeSource {
+ row.num = 100
+ schema = {
+ fields {
+ id = "bigint"
+ name = "string"
+ age = "int"
+ }
+ }
+ }
+}
+
+sink {
+ Console {
+ format = "json"
+ }
+}
+```
+
+Run the job:
+```bash
+seatunnel.sh -c config/hello_world.conf -e spark
+# or with flink
+seatunnel.sh -c config/hello_world.conf -e flink
+```
+
+---
+
+## Overview
+
+### Purpose and Target Users
+
+**SeaTunnel** is designed for:
+- **Data Engineers**: Building large-scale data pipelines with minimal complexity
+- **DevOps Teams**: Managing data integration infrastructure
+- **Enterprise Platforms**: Handling 100+ billion data records daily
+- **Real-time Analytics**: Supporting streaming data synchronization
+- **Legacy System Migration**: CDC-based incremental sync from transactional databases
+
+### Core Capabilities
+
+1. **Multimodal Support**
+ - Structured data (databases, data warehouses)
+ - Unstructured data (video, images, binaries)
+ - Semi-structured data (JSON, logs, binlog streams)
+
+2. **Multiple Synchronization Methods**
+ - **Batch**: Full historical data transfer
+ - **Streaming**: Real-time data pipeline
+ - **CDC**: Incremental capture from databases
+   - **Full + Incremental**: Combined approach (see the sketch after this list)
+
+3. **100+ Pre-built Connectors**
+ - Databases: MySQL, PostgreSQL, Oracle, SQL Server, MongoDB
+ - Data Warehouses: Snowflake, BigQuery, Redshift, Iceberg
+ - Cloud SaaS: Salesforce, Shopify, Google Sheets
+ - Message Queues: Kafka, RabbitMQ, Pulsar
+ - Search Engines: Elasticsearch, OpenSearch
+ - Object Storage: S3, GCS, HDFS
+
+4. **Multi-Engine Support**
+   - **Zeta Engine**: Lightweight, standalone deployment (no Spark/Flink required)
+ - **Apache Flink**: Distributed streaming engine
+ - **Apache Spark**: Distributed batch/batch-stream processing
+
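+For CDC sources, the full + incremental combination is selected via `startup.mode`, as in the MySQL CDC example later in this document (sketch; connection options omitted):
+
+```hocon
+source {
+  Mysql {
+    # "initial" takes a full snapshot first, then switches to incremental binlog reading
+    startup.mode = "initial"
+  }
+}
+```
+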
+---
+
+## Features
+
+### High-Performance
+
+- **Distributed Snapshot Algorithm**: Ensures data consistency without locks
+- **JDBC Multiplexing**: Minimizes database connections for real-time sync
+- **Log Parsing**: Efficient CDC implementation with binary log analysis
+- **Resource Optimization**: Reduces computing resources and I/O overhead
+
+### Data Quality & Reliability
+
+- **Real-time Monitoring**: Track synchronization progress and data metrics
+- **Data Loss Prevention**: Transactional guarantees (exactly-once semantics; see the sketch after this list)
+- **Deduplication**: Prevents duplicate records during reprocessing
+- **Error Handling**: Graceful failure recovery and retry logic
+
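+Exactly-once delivery is opted into per sink where the connector supports it, and relies on checkpointing for recovery. A minimal sketch (the Kafka `semantic` option appears in the CDC example later in this document):
+
+```hocon
+env {
+  checkpoint.interval = 30000   # periodic state snapshots used on recovery
+}
+
+sink {
+  Kafka {
+    semantic = "EXACTLY_ONCE"   # transactional writes; replays do not duplicate records
+  }
+}
+```
+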
+### Developer-Friendly
+
+- **SQL-like Configuration**: Intuitive job definition syntax
+- **Visual Web UI**: Drag-and-drop job builder (SeaTunnel Web Project)
+- **Extensive Documentation**: Comprehensive guides and examples
+- **Community Support**: Active community via Slack and mailing lists
+
+### Production Ready
+
+- **Proven at Scale**: Used in enterprises processing billions of records daily
+- **Version Stability**: Regular releases with backward compatibility
+- **Enterprise Features**: Multi-tenancy, RBAC, audit logging
+- **Cloud Native**: Kubernetes-ready deployment
+
+---
+
+## Installation
+
+### Installation Methods
+
+#### Method 1: Binary Download (Recommended for Quick Start)
+
+```bash
+# 1. Download binary
+VERSION=2.3.12
+wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
+
+# 2. Extract
+tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
+cd apache-seatunnel-${VERSION}
+
+# 3. Verify
+./bin/seatunnel.sh --version
+```
+
+#### Method 2: Build from Source
+
+```bash
+# 1. Clone repository
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# 2. Build with Maven
+mvn clean install -DskipTests
+
+# 3. Navigate to distribution
+cd seatunnel-dist/target/apache-seatunnel-*-bin/apache-seatunnel-*
+
+# 4. Verify
+./bin/seatunnel.sh --version
+```
+
+#### Method 3: Docker
+
+```bash
+# Pull official Docker image
+docker pull apache/seatunnel:latest
+
+# Run container
+docker run -it apache/seatunnel:latest /bin/bash
+
+# Or run a job directly
+docker run -v /path/to/config:/config \
+ apache/seatunnel:latest \
+ seatunnel.sh -c /config/job.conf -e spark
+```
+
+### Environment Setup
+
+#### Set Java Home
+```bash
+# Bash/Zsh
+export JAVA_HOME=/path/to/java
+export PATH=$JAVA_HOME/bin:$PATH
+
+# Verify
+java -version
+```
+
+#### Configure for Spark Engine
+```bash
+# Set Spark Home (if using Spark engine)
+export SPARK_HOME=/path/to/spark
+
+# Verify Spark installation
+$SPARK_HOME/bin/spark-submit --version
+```
+
+#### Configure for Flink Engine
+```bash
+# Set Flink Home (if using Flink engine)
+export FLINK_HOME=/path/to/flink
+
+# Verify Flink installation
+$FLINK_HOME/bin/flink --version
+```
+
+### System Requirements
+
+| Requirement | Version/Spec |
+|---|---|
+| Java | JDK 1.8+ |
+| Memory | 2GB+ (minimum), 8GB+ (recommended) |
+| Disk | 500MB (binary) + job storage |
+| Network | Connectivity to source/sink systems |
+| Scala | 2.12.15 (for Spark/Flink integration) |
+
+---
+
+## Usage
+
+### 1. Job Configuration (HOCON Format)
+
+SeaTunnel uses **HOCON** (Human-Optimized Config Object Notation) for job configuration.
+
+**Basic Structure:**
+```hocon
+env {
+ job.mode = "BATCH" # or STREAMING
+ job.name = "My Job"
+ parallelism = 4
+}
+
+source {
+ SourceConnector {
+ option1 = value1
+ option2 = value2
+ schema = {
+ fields {
+ column1 = "type"
+ column2 = "type"
+ }
+ }
+ }
+}
+
+# Optional: Transform data
+transform {
+ TransformName {
+ option = value
+ }
+}
+
+sink {
+ SinkConnector {
+ option1 = value1
+ option2 = value2
+ }
+}
+```
+
+### 2. Common Use Cases
+
+#### Use Case 1: MySQL to PostgreSQL (Batch)
+
+**config/mysql_to_postgres.conf**
+```hocon
+env {
+ job.mode = "BATCH"
+ job.name = "MySQL to PostgreSQL"
+}
+
+source {
+ Jdbc {
+ driver = "com.mysql.cj.jdbc.Driver"
+ url = "jdbc:mysql://mysql-host:3306/mydb"
+ user = "root"
+ password = "password"
+ query = "SELECT * FROM users"
+ connection_check_timeout_sec = 100
+ }
+}
+
+sink {
+ Jdbc {
+ driver = "org.postgresql.Driver"
+ url = "jdbc:postgresql://pg-host:5432/mydb"
+ user = "postgres"
+ password = "password"
+ database = "mydb"
+ table = "users"
+ primary_keys = ["id"]
+ connection_check_timeout_sec = 100
+ }
+}
+```
+
+Run:
+```bash
+seatunnel.sh -c config/mysql_to_postgres.conf -e spark
+```
+
+#### Use Case 2: Kafka Streaming to Elasticsearch
+
+**config/kafka_to_es.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "Kafka to Elasticsearch"
+ parallelism = 2
+}
+
+source {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "events"
+ patterns = "event.*"
+ consumer.group = "seatunnel-group"
+ format = "json"
+ schema = {
+ fields {
+ event_id = "bigint"
+ event_name = "string"
+ timestamp = "bigint"
+ payload = "string"
+ }
+ }
+ }
+}
+
+transform {
+ Sql {
+ sql = "SELECT event_id, event_name, FROM_UNIXTIME(timestamp/1000) as ts,
payload FROM source"
+ }
+}
+
+sink {
+ Elasticsearch {
+ hosts = ["es-host:9200"]
+ index = "events"
+ index_type = "_doc"
+ username = "elastic"
+ password = "password"
+ }
+}
+```
+
+Run:
+```bash
+seatunnel.sh -c config/kafka_to_es.conf -e flink
+```
+
+#### Use Case 3: CDC from MySQL to Kafka
+
+**config/mysql_cdc_kafka.conf**
+```hocon
+env {
+ job.mode = "STREAMING"
+ job.name = "MySQL CDC to Kafka"
+}
+
+source {
+ Mysql {
+ server_id = 5400
+ hostname = "mysql-host"
+ port = 3306
+ username = "root"
+ password = "password"
+ database = ["mydb"]
+ table = ["users", "orders"]
+ startup.mode = "initial"
+ snapshot.split.size = 8096
+ incremental.snapshot.chunk.size = 1024
+ snapshot_fetch_size = 1024
+ snapshot_lock_timeout_sec = 10
+ server_time_zone = "UTC"
+ }
+}
+
+sink {
+ Kafka {
+ bootstrap.servers = "kafka-host:9092"
+ topic = "mysql_cdc"
+ format = "canal_json"
+ semantic = "EXACTLY_ONCE"
+ }
+}
+```
+
+Run:
+```bash
+seatunnel.sh -c config/mysql_cdc_kafka.conf -e flink
+```
+
+### 3. Running Jobs
+
+#### Local Mode (Single Machine)
+```bash
+# Using Spark (default)
+seatunnel.sh -c config/job.conf -e spark
+
+# Using Flink
+seatunnel.sh -c config/job.conf -e flink
+
+# Using Zeta (lightweight)
+seatunnel.sh -c config/job.conf -e zeta
+```
+
+#### Cluster Mode
+```bash
+# Submit to Spark cluster
+seatunnel.sh -c config/job.conf -e spark -m cluster -n hadoop-master:7077
+
+# Submit to Flink cluster
+seatunnel.sh -c config/job.conf -e flink -m remote -s localhost 8081
+```
+
+#### Verbose Output
+```bash
+seatunnel.sh -c config/job.conf -e spark -l DEBUG
+```
+
+#### Check Status
+```bash
+# View running jobs (Spark Cluster)
+spark-submit --status <driver-id>
+
+# View running jobs (Flink Cluster)
+$FLINK_HOME/bin/flink list
+```
+
+### 4. SQL API (Advanced)
+
+Use SQL for complex transformations:
+
+```hocon
+source {
+ MySQL {
+ # Source config...
+ }
+}
+
+transform {
+ Sql {
+ # Multiple SQL statements
+ sql = """
+ SELECT
+ id,
+ UPPER(name) as name,
+ age + 10 as age_plus_10,
+ CURRENT_TIMESTAMP as created_at
+ FROM source
+ WHERE age > 18
+ """
+ }
+}
+
+sink {
+ PostgreSQL {
+ # Sink config...
+ }
+}
+```
+
+---
+
+## API Reference
+
+### Core Connector Types
+
+#### Source Connectors
+- **Jdbc**: Generic JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server)
+- **Kafka**: Apache Kafka topics
+- **Mysql**: MySQL with CDC support
+- **MongoDB**: MongoDB collections
+- **PostgreSQL**: PostgreSQL with CDC
+- **S3**: Amazon S3 and compatible storage
+- **Http**: HTTP/HTTPS endpoints
+- **FakeSource**: For testing and development
+
+#### Transform Connectors
+- **Sql**: Execute SQL transformations
+- **Dummy**: Pass-through (testing)
+- **FieldMapper**: Rename/map columns
+- **JsonPath**: Extract data from JSON
+
+#### Sink Connectors
+- **Jdbc**: Write to JDBC-compatible databases
+- **Kafka**: Publish to Kafka topics
+- **Elasticsearch**: Write to Elasticsearch indices
+- **S3**: Write to S3 buckets
+- **Redis**: Write to Redis
+- **HBase**: Write to HBase tables
+- **StarRocks**: Write to StarRocks tables
+- **Console**: Output to console (testing)
+
+### Configuration Options
+
+#### Common Source Options
+```hocon
+source {
+ ConnectorName {
+ # Connection
+ hostname = "host"
+ port = 3306
+ username = "user"
+ password = "pass"
+
+ # Data selection
+ database = "db_name"
+ table = "table_name"
+
+ # Performance
+ fetch_size = 1000
+ connection_check_timeout_sec = 100
+ split_size = 10000
+
+ # Schema
+ schema = {
+ fields {
+ id = "bigint"
+ name = "string"
+ }
+ }
+ }
+}
+```
+
+#### Common Sink Options
+```hocon
+sink {
+ ConnectorName {
+ # Connection
+ hostname = "host"
+ port = 3306
+ username = "user"
+ password = "pass"
+
+ # Write behavior
+ database = "db_name"
+ table = "table_name"
+ primary_keys = ["id"]
+ batch_size = 500
+
+ # Error handling
+ max_retries = 3
+ retry_wait_time_ms = 1000
+ on_duplicate_key_update_column_names = ["field1"]
+ }
+}
+```
+
+#### Engine Options
+```hocon
+env {
+ # Execution mode
+ job.mode = "BATCH" # or STREAMING
+
+ # Job identity
+ job.name = "My Job"
+
+ # Parallelism
+ parallelism = 4
+
+ # Checkpoint (streaming)
+ checkpoint.interval = 60000
+
+ # Restart strategy
+ restart_strategy = "fixed_delay"
+ restart_strategy.fixed_delay.attempts = 3
+ restart_strategy.fixed_delay.delay = 10000
+}
+```
+
+### Debugging and Monitoring
+
+#### View Job Metrics
+```bash
+# During execution, monitor logs
+tail -f logs/seatunnel.log
+
+# Check specific log level
+grep ERROR logs/seatunnel.log
+```
+
+#### Enable Debug Logging
+```bash
+# Set log level to DEBUG
+seatunnel.sh -c config/job.conf -e spark -l DEBUG
+```
+
+#### Use Test Sources
+```hocon
+source {
+ FakeSource {
+ row.num = 1000
+ schema = {
+ fields {
+ id = "bigint"
+ name = "string"
+ }
+ }
+ }
+}
+
+sink {
+ Console {
+ format = "json"
+ }
+}
+```
+
+---
+
+## Configuration
+
+### Project Structure
+```
+seatunnel/
+├── bin/ # Executable scripts
+│ ├── seatunnel.sh # Main entry point
+│ └── seatunnel-submit.sh # Spark submission script
+├── config/ # Configuration examples
+│ ├── flink-conf.yaml # Flink configuration
+│ └── spark-conf.yaml # Spark configuration
+├── connectors/ # Pre-built connectors
+├── lib/ # JAR dependencies
+├── logs/ # Runtime logs
+└── plugin/ # Plugin directory
+```
+
+### Environment Variables
+
+```bash
+# Java configuration
+export JAVA_HOME=/path/to/java
+export JVM_OPTS="-Xms1G -Xmx4G"
+
+# Spark configuration
+export SPARK_HOME=/path/to/spark
+export SPARK_MASTER=spark://master:7077
+
+# Flink configuration
+export FLINK_HOME=/path/to/flink
+
+# SeaTunnel configuration
+export SEATUNNEL_HOME=/path/to/seatunnel
+```
+
+### Key Configuration Files
+
+#### `seatunnel-env.sh`
+Configure runtime environment:
+```bash
+JAVA_HOME=/path/to/java
+JVM_OPTS="-Xms1G -Xmx4G -XX:+UseG1GC"
+SPARK_HOME=/path/to/spark
+FLINK_HOME=/path/to/flink
+```
+
+#### `flink-conf.yaml` (for Flink engine)
+```yaml
+taskmanager.memory.process.size: 2g
+taskmanager.memory.jvm-overhead.fraction: 0.1
+parallelism.default: 4
+```
+
+#### `spark-conf.yaml` (for Spark engine)
+```yaml
+driver-memory: 2g
+executor-memory: 4g
+num-executors: 4
+```
+
+### Performance Tuning
+
+#### For Batch Jobs
+```hocon
+env {
+ job.mode = "BATCH"
+ parallelism = 8 # Increase for larger clusters
+}
+
+source {
+ Jdbc {
+ # Split large tables for parallel reads
+ split_size = 100000
+ fetch_size = 5000
+ }
+}
+
+sink {
+ Jdbc {
+ # Batch inserts for better throughput
+ batch_size = 1000
+ # Use connection pooling
+ max_retries = 3
+ }
+}
+```
+
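+As a rough sizing check: with `split_size = 100000`, a 10-million-row table
+yields about 100 splits (10,000,000 / 100,000), so `parallelism = 8` gives each
+worker roughly 12–13 splits to work through.
+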
+#### For Streaming Jobs
+```hocon
+env {
+ job.mode = "STREAMING"
+ parallelism = 4
+ checkpoint.interval = 30000 # 30 seconds
+}
+
+source {
+ Kafka {
+ # Consumer group for parallel reads
+ consumer.group = "seatunnel-consumer"
+ # Batch reading
+ max_poll_records = 500
+ }
+}
+```
+
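+The checkpoint interval is the main latency/overhead knob in streaming mode: a
+shorter interval means less data to replay after a failure, at the cost of more
+frequent checkpoint I/O.
+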
+---
+
+## Development
+
+### Project Architecture
+
+**SeaTunnel** follows a modular architecture:
+
+```
+seatunnel/
+├── seatunnel-api/ # Core APIs
+├── seatunnel-core/ # Execution engine
+├── seatunnel-engines/ # Engine implementations
+│ ├── seatunnel-engine-flink/
+│ ├── seatunnel-engine-spark/
+│ └── seatunnel-engine-zeta/
+├── seatunnel-connectors/ # Connector implementations
+│   └── seatunnel-connectors-*/  # One per connector type
+└── seatunnel-dist/ # Distribution package
+```
+
+### Building from Source
+
+#### Full Build
+```bash
+# Clone repository
+git clone https://github.com/apache/seatunnel.git
+cd seatunnel
+
+# Build all modules
+mvn clean install -DskipTests
+
+# Build with tests (slower)
+mvn clean install
+```
+
+#### Build Specific Module
+```bash
+# Build only Kafka connector
+mvn clean install -pl seatunnel-connectors/seatunnel-connectors-seatunnel-kafka -DskipTests
+
+# Build only Flink engine
+mvn clean install -pl seatunnel-engines/seatunnel-engine-flink -DskipTests
+```
+
+#### Build Distribution
+```bash
+# Build the binary distribution (tarball lands in seatunnel-dist/target)
+mvn clean install -DskipTests
+cd seatunnel-dist
+
+# Inspect the contents of the generated tarball
+tar -tzf target/apache-seatunnel-*-bin.tar.gz | head
+```
+
+### Running Tests
+
+#### Unit Tests
+```bash
+# Run all tests
+mvn test
+
+# Run specific test class
+mvn test -Dtest=MySqlConnectorTest
+
+# Run with coverage
+mvn test jacoco:report
+```
+
+#### Integration Tests
+```bash
+# Run integration tests
+mvn verify
+
+# Run with Docker containers
+mvn verify -Pintegration-tests
+```
+
+### Development Setup
+
+#### IDE Configuration (IntelliJ IDEA)
+
+1. **Import Project**
+ - File → Open → Select seatunnel directory
+ - Choose Maven as build system
+
+2. **Configure JDK**
+ - Project Settings → Project → JDK
+ - Select JDK 1.8 or higher
+
+3. **Enable Annotation Processing**
+ - Project Settings → Build → Compiler → Annotation Processors
+ - Enable annotation processing
+
+4. **Run Configuration**
+ - Run → Edit Configurations
+ - Add new "Application" configuration
+   - Set main class: `org.apache.seatunnel.core.starter.command.CommandExecuteRunner`
+
+#### Code Style
+
+SeaTunnel follows Apache project conventions:
+- 4-space indentation (not tabs)
+- Line length max 120 characters
+- Standard Java naming conventions
+- Organize imports alphabetically
+
+Use the provided `.editorconfig`: enable the EditorConfig plugin in IntelliJ
+(bundled in recent releases) and the IDE will apply these rules automatically.
+
+### Creating Custom Connectors
+
+#### 1. Implement the Source Interface
+```java
+// Illustrative skeleton -- check seatunnel-api for the exact interface and
+// method signatures in your version.
+import org.apache.seatunnel.api.source.Boundedness;
+import org.apache.seatunnel.api.source.SeSource;
+
+public class CustomSource implements SeSource {
+    @Override
+    public String getPluginName() {
+        return "Custom";  // must match the block name used in job configs
+    }
+
+    @Override
+    public void validate() {
+        // Validate required options (host, port, ...) before the job starts
+    }
+
+    @Override
+    public ResultSet read(Boundedness boundedness) {
+        // Produce rows: bounded for BATCH, unbounded for STREAMING
+        return null;  // placeholder
+    }
+}
+```
+
+#### 2. Create Configuration Class
+```java
+import org.apache.seatunnel.api.configuration.Option;
+import org.apache.seatunnel.api.configuration.Options;
+
+public class CustomSourceOptions {
+ public static final Option<String> HOST =
+ Options.key("host")
+ .stringType()
+ .noDefaultValue()
+ .withDescription("Source hostname");
+
+ public static final Option<Integer> PORT =
+ Options.key("port")
+ .intType()
+ .defaultValue(9200)
+ .withDescription("Source port");
+}
+```
+
+#### 3. Register Connector
+```
+META-INF/services/org.apache.seatunnel.api.source.SeSource
+```
+
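+This is a standard Java SPI descriptor: each line of the file is the
+fully-qualified class name of an implementation (for the illustrative
+`CustomSource` above, its package-qualified name), which lets the engine
+discover the plugin at runtime via `ServiceLoader`.
+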
+### Contributing Guide
+
+1. **Fork and Clone**
+ ```bash
+ git clone https://github.com/YOUR_USERNAME/seatunnel.git
+ cd seatunnel
+ git remote add upstream https://github.com/apache/seatunnel.git
+ ```
+
+2. **Create Feature Branch**
+ ```bash
+ git checkout -b feature/my-feature
+ ```
+
+3. **Make Changes**
+ - Follow code style guide
+ - Add tests for new features
+ - Update documentation
+
+4. **Commit and Push**
+ ```bash
+ git add .
+ git commit -m "feat: add new feature"
+ git push origin feature/my-feature
+ ```
+
+5. **Create Pull Request**
+ - Go to GitHub repository
+ - Create PR with clear description
+ - Link any related issues
+ - Wait for review
+
+6. **Code Review Process**
+ - Address feedback from maintainers
+ - Update PR with changes
+ - After approval, maintainers will merge
+
+---
+
+## Troubleshooting
+
+### Common Issues and Solutions
+
+#### Issue 1: "ClassNotFoundException: com.mysql.jdbc.Driver"
+
+**Cause**: JDBC driver JAR not in classpath
+
+**Solution**:
+```bash
+# 1. Download MySQL JDBC driver
+wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.33.jar
+
+# 2. Copy to lib directory
+cp mysql-connector-java-8.0.33.jar $SEATUNNEL_HOME/lib/
+
+# 3. Restart job
+seatunnel.sh -c config/job.conf -e spark
+```
+
+#### Issue 2: "OutOfMemoryError: Java heap space"
+
+**Cause**: Insufficient JVM heap memory
+
+**Solution**:
+```bash
+# Increase JVM memory
+export JVM_OPTS="-Xms2G -Xmx8G"
+
+# Or set in seatunnel-env.sh
+echo 'JVM_OPTS="-Xms2G -Xmx8G"' >> $SEATUNNEL_HOME/bin/seatunnel-env.sh
+```
+
+#### Issue 3: "Connection refused: connect"
+
+**Cause**: Source/sink service not reachable
+
+**Solution**:
+```bash
+# 1. Verify connectivity
+ping source-host
+telnet source-host 3306
+
+# 2. Check credentials
+mysql -h source-host -u root -p
+
+# 3. Check firewall rules
+# Ensure port 3306 is open
+```
+
+#### Issue 4: "Table not found" during CDC
+
+**Cause**: Binlog not enabled on MySQL
+
+**Solution**:
+```sql
+-- Check binlog status
+SHOW VARIABLES LIKE 'log_bin';
+```
+
+Enable binlog in `my.cnf`:
+```ini
+[mysqld]
+log_bin       = mysql-bin
+binlog_format = row   # row-level events are required for CDC
+```
+
+Restart MySQL and verify:
+```sql
+SHOW MASTER STATUS;
+```
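+
+Besides the binlog itself, the MySQL user used by the CDC source typically also
+needs the `REPLICATION SLAVE` and `REPLICATION CLIENT` privileges.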
+
+#### Issue 5: Slow Job Performance
+
+**Cause**: Suboptimal configuration
+
+**Solutions**:
+```hocon
+# 1. Increase parallelism
+env {
+ parallelism = 8 # or higher based on cluster
+}
+
+# 2. Increase batch sizes
+source {
+ Jdbc {
+ fetch_size = 5000
+ split_size = 100000
+ }
+}
+
+sink {
+ Jdbc {
+ batch_size = 2000
+ }
+}
+
+# 3. Enable connection pooling
+source {
+ Jdbc {
+ pool_size = 10
+ max_idle_time = 300
+ }
+}
+```
+
+#### Issue 6: "Kafka topic offset out of range"
+
+**Cause**: Offset doesn't exist in topic
+
+**Solution**:
+```hocon
+source {
+ Kafka {
+ # Reset to earliest or latest
+ auto.offset.reset = "earliest" # or "latest"
+
+    # Or control the starting position via the connector's start_mode option
+ start_mode = "earliest"
+ }
+}
+```
+
+### FAQ
+
+**Q: What's the difference between BATCH and STREAMING mode?**
+
+A:
+- **BATCH**: One-time execution, suitable for full database migration
+- **STREAMING**: Continuous execution, suitable for real-time data sync and CDC
+
+**Q: How do I handle schema changes during CDC?**
+
+A: SeaTunnel automatically detects schema changes in CDC mode. Configure it in the source:
+```hocon
+source {
+ Mysql {
+ schema_change_mode = "auto" # auto-detect and apply
+ }
+}
+```
+
+**Q: Can I transform data during synchronization?**
+
+A: Yes, use SQL transform:
+```hocon
+transform {
+ Sql {
+ sql = "SELECT id, UPPER(name) as name FROM source"
+ }
+}
+```
+
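+For plain column renames, the FieldMapper transform listed above avoids SQL
+entirely; a sketch (exact option spelling may vary by version):
+
+```hocon
+transform {
+  FieldMapper {
+    field_mapper = {
+      id = id
+      name = user_name   # rename `name` to `user_name`
+    }
+  }
+}
+```
+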
+**Q: What's the maximum throughput?**
+
+A: Depends on:
+- Hardware (CPU, RAM, Network)
+- Source/sink database configuration
+- Data size per record
+- Network latency
+
+Typical throughput is on the order of 100K–1M records/second per executor, depending on the factors above.
+
+**Q: How do I handle errors in production?**
+
+A: Configure restart strategy:
+```hocon
+env {
+ restart_strategy = "exponential_delay"
+ restart_strategy.exponential_delay.initial_delay = 1000
+ restart_strategy.exponential_delay.max_delay = 30000
+ restart_strategy.exponential_delay.multiplier = 2.0
+ restart_strategy.exponential_delay.attempts_unlimited = true
+}
+```
+
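+With these settings the retry waits grow as 1 s, 2 s, 4 s, 8 s, 16 s, then stay
+capped at the 30 s `max_delay` for every subsequent attempt.
+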
+**Q: Is there a web UI for job management?**
+
+A: Yes! Use **SeaTunnel Web Project**:
+```bash
+# Check out web UI project
+git clone https://github.com/apache/seatunnel-web.git
+cd seatunnel-web
+mvn clean install
+
+# Run web UI
+java -jar target/seatunnel-web-*.jar
+# Access at http://localhost:8080
+```
+
+---
+
+## Resources
+
+### Official Documentation
+- [SeaTunnel Official Website](https://seatunnel.apache.org/)
+- [GitHub Repository](https://github.com/apache/seatunnel)
+- [Documentation Hub](https://seatunnel.apache.org/docs/)
+- [Connector List](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
+
+### Community
+- [Slack Channel](https://the-asf.slack.com/archives/C01CB5186TL)
+- [Mailing Lists](https://seatunnel.apache.org/community/mail-lists/)
+- [Issue Tracker](https://github.com/apache/seatunnel/issues)
+- [Discussion Forum](https://github.com/apache/seatunnel/discussions)
+
+### Related Projects
+- [SeaTunnel Web UI](https://github.com/apache/seatunnel-web)
+- [Apache Kafka](https://kafka.apache.org/)
+- [Apache Flink](https://flink.apache.org/)
+- [Apache Spark](https://spark.apache.org/)
+
+### Learning Resources
+- [HOCON Configuration Guide](https://github.com/lightbend/config/blob/main/HOCON.md)
+- [SQL Functions Reference](https://seatunnel.apache.org/docs/2.3.12/transform-v2/sql)
+- [CDC Pattern Explained](https://en.wikipedia.org/wiki/Change_data_capture)
+- [Distributed Systems Concepts](https://en.wikipedia.org/wiki/Distributed_computing)
+
+### Version History
+- **2.3.12** (Stable) - Current recommended version
+- **2.3.13-SNAPSHOT** (Development)
+- [All Releases](https://archive.apache.org/dist/seatunnel/)
+
+---
+
+## Additional Notes
+
+### License
+Apache License 2.0 - See [LICENSE](https://github.com/apache/seatunnel/blob/master/LICENSE) file
+
+### Security
+- Report security issues via [Apache Security](https://www.apache.org/security/)
+- Do NOT create public issues for security vulnerabilities
+
+### Support & Contribution
+- Join the community Slack for support
+- Submit feature requests on GitHub Issues
+- Contribute code via Pull Requests
+- Follow [Contributing Guide](https://github.com/apache/seatunnel/blob/master/CONTRIBUTING.md)
+
+---
+
+**Last Updated**: 2026-01-28
+**Skill Version**: 2.3.13
+**Status**: Production Ready ✓