DanielCarter-stack commented on PR #10429:
URL: https://github.com/apache/seatunnel/pull/10429#issuecomment-3831182419

   <!-- code-pr-reviewer -->
   <!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10429", "part": 1, 
"total": 1} -->
   ### Issue 1: PR description severely mismatches actual changes
   
   **Location**: PR metadata  
   **Problem description**: The PR description states "review and optimize 
Chinese architecture documentation", but actually:
   1. Added a large amount of English documentation (this is a good thing)
   2. Deleted multiple production code features (should be a separate PR)
   3. Deleted multiple tests
   
   **Potential risks**:
   - Users may not notice that code features were deleted
   - May break existing production environments
   
   **Impact scope**:
   - Direct impact: Kafka Schema Registry users, REST API metrics users
   - Affected area: all users using these features
   
   **Severity**: **CRITICAL**
   
   **Improvement suggestions**: 
   ```
   应该拆分成多个 PR:
   1. PR-A: 纯文档新增(英文+中文架构文档)
   2. PR-B: 删除 Schema Registry 功能(需要 JIRA 讨论、废弃通知、迁移指南)
   3. PR-C: REST API metrics 简化(需要说明动机和影响)
   ```
   
   ---
   
   ### Issue 2: Important content deleted from incompatible-changes
   
   **Location**: `docs/en/introduction/concepts/incompatible-changes.md`  
   **Related context**: REST API metrics format change  
   
   **Problem description**: 
   Deleted the description about REST API table metrics key format change:
   - Old format: `{tableName}` 
   - New format: `{VertexIdentifier}.{tableName}` (e.g. 
`Sink[0].fake.user_table`)
   
   This deletion itself may be because the code has been synchronously 
simplified, but:
   1. No explanation of when this change occurred
   2. No explanation of the impact on existing users
   
   **Potential risks**:
   - Existing Grafana dashboards and Prometheus alert rules may fail
   - Users lack clear migration guidelines
   
   **Impact scope**:
   - Direct impact: monitoring systems using REST API metrics
   - Affected area: operations, SRE teams
   
   **Severity**: **MAJOR**
   
   **Improvement suggestions**:
   ```markdown
   应该在 incompatible-changes.md 中保留历史记录,或添加迁移说明:
   
   ### API Changes
   
   - **Removed: Engine REST table metrics VertexIdentifier prefix format**
     - **Version**: 2.x.x
     - **Description**: Simplified table metrics format by removing 
VertexIdentifier prefix
     - **Migration**: Update monitoring dashboards to use simplified format
   ```
   
   ---
   
   ### Issue 3: checkpoint.mode configuration option does not exist
   
   **Location**: `docs/en/architecture/design-philosophy.md:361`  
   **Related context**: EnvCommonOptions.java  
   
   **Problem description**:
   The documentation mentions configuration example:
   ```hocon
   env {
     checkpoint.mode = "EXACTLY_ONCE"  # or "AT_LEAST_ONCE"
     checkpoint.interval = 60000
   }
   ```
   
   But in the code, only `checkpoint.interval` was found, not `checkpoint.mode`.
   
   **Potential risks**:
   - Users attempting to use this configuration will fail
   - Reduces documentation credibility
   
   **Impact scope**:
   - Direct impact: users attempting to configure checkpoint mode
   - Affected area: credibility of configuration documentation
   
   **Severity**: **MAJOR**
   
   **Improvement suggestions**:
   1. Verify whether there are other ways in the code to configure checkpoint 
mode
   2. If it truly does not exist, this example should be deleted or marked as 
"planned feature"
   3. Update to actually available configuration options
   
   ---
   
   ### Issue 4: Kafka Schema Registry feature deletion lacks migration guide
   
   **Location**: 
   - `seatunnel-connectors-v2/connector-kafka/.../KafkaSourceOptions.java`
   - 
`seatunnel-formats/seatunnel-format-protobuf/.../SchemaRegistryAwareProtobufDeserializationSchema.java`
   
   **Problem description**:
   Deleted `strip_schema_registry_header` configuration and related 
implementation, but did not:
   1. Record in the changelog
   2. Provide migration guide
   3. Explain the reason for deletion
   
   **Potential risks**:
   - Users using Confluent Schema Registry cannot upgrade
   - May cause data parsing failures
   
   **Impact scope**:
   - Direct impact: Kafka Protobuf + Schema Registry users
   - Affected area: individual Connector
   
   **Severity**: **CRITICAL**
   
   **Improvement suggestions**:
   ```
   1. 在 incompatible-changes.md 中添加说明
   2. 提供替代方案(如果有)
   3. 如果是临时删除,应该在 JIRA 中跟踪重新添加
   4. 如果是永久删除,需要邮件列表通知用户
   ```
   
   ---
   
   ### Issue 5: Documentation and code evolution consistency
   
   **Location**: Multiple architecture documents  
   **Problem description**:
   Chinese documentation has adopted a "no embedded source code snippets" 
strategy to avoid maintenance drift, but English documentation still contains a 
large number of code examples. This will lead to:
   1. Documentation becoming outdated after code refactoring
   2. Inconsistent maintenance methods between Chinese and English documentation
   
   **Potential risks**:
   - English documentation may become inaccurate over time
   - Increased maintenance burden
   
   **Impact scope**:
   - Direct impact: English readers
   - Affected area: documentation maintenance cost
   
   **Severity**: **MINOR**
   
   **Improvement suggestions**:
   ```
   统一中英文文档策略:
   1. 在英文文档中也采用"描述性"而非"代码复制"的方式
   2. 或者明确说明代码示例基于特定版本,并定期审核更新
   ```
   
   ---
   
   ### Issue 6: Test coverage decreased
   
   **Location**: Multiple test files  
   **Problem description**:
   Deleted the following tests:
   - `SchemaRegistryAwareProtobufDeserializationSchemaTest`
   - `FileSourceSplitCompatibilityTest`
   - `HdfsFileAccordingToSplitSizeSplitStrategyTest`
   - Multiple CDC heartbeat tests
   
   **Potential risks**:
   - Weakened code quality assurance
   - Future refactoring may introduce bugs
   
   **Impact scope**:
   - Direct impact: code quality
   - Affected area: overall test coverage
   
   **Severity**: **MAJOR**
   
   **Improvement suggestions**:
   ```
   如果删除测试是因为功能删除,应明确说明。
   如果删除测试是为了简化,应确保核心逻辑仍有测试覆盖。
   ```
   
   ---


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to