DanielCarter-stack commented on PR #10372:
URL: https://github.com/apache/seatunnel/pull/10372#issuecomment-3773116547
<!-- code-pr-reviewer -->
<!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10372", "part": 1,
"total": 1} -->
### Issue 1: Inconsistent URL option references in JDBC Connector (BLOCKER)
**Location**:
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-mysql/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/mysql/source/MySqlIncrementalSource.java:88`
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-postgres/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/postgres/source/PostgresIncrementalSource.java:85`
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-sqlserver/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/sqlserver/source/SqlServerIncrementalSource.java:76`
**Context Code**:
```java
// MySqlIncrementalSource.java line 88
JdbcUrlUtil.UrlInfo urlInfo =
JdbcUrlUtil.getUrlInfo(config.get(JdbcCommonOptions.URL));
```
**But defined in Factory**:
```java
// MySqlIncrementalSourceFactory.java
.required(
MySqlIncrementalSourceOptions.USERNAME,
MySqlIncrementalSourceOptions.PASSWORD,
MySqlIncrementalSourceOptions.URL) // 使用新的URL
```
**Issue Description**:
The PR adds a new `URL` option (withFallbackKeys "base-url") in
`SourceOptions`, and references `XxxIncrementalSourceOptions.URL` (inherited
from `SourceOptions.URL`) in the `optionRule()` of each Connector's Factory.
However, in the actual `IncrementalSource` implementation, it still uses
`JdbcCommonOptions.URL` to read the configuration.
Although `JdbcCommonOptions.URL` and `SourceOptions.URL` have the same
definition (both use key "url" withFallbackKeys "base-url"), this inconsistency
will lead to:
1. Code confusion - developers are unclear which URL option should be used
2. If `JdbcCommonOptions.URL` and `SourceOptions.URL` definitions become out
of sync in the future, configuration read errors will occur
3. Violates the "Single Source of Truth" principle
**Potential Risks**:
- If one URL definition is modified in the future without synchronizing the
other, it will result in configuration validation passing but runtime reading
failing
- Difficult code maintenance, confusing for newcomers
**Impact Scope**:
- Direct impact: MySQL CDC, PostgreSQL CDC, SQL Server CDC
- Indirect impact: All users relying on these Connectors
- Affected area: Multiple Connectors
**Severity**: BLOCKER
**Improvement Suggestions**:
Option A: Uniformly use `SourceOptions.URL`
```java
// MySqlIncrementalSource.java line 88
JdbcUrlUtil.UrlInfo urlInfo =
JdbcUrlUtil.getUrlInfo(config.get(SourceOptions.URL));
```
Option B: Re-export URL in each `XxxIncrementalSourceOptions`
```java
// 在MySqlIncrementalSourceOptions中添加
public static final Option<String> URL = SourceOptions.URL;
// 然后使用
JdbcUrlUtil.UrlInfo urlInfo =
JdbcUrlUtil.getUrlInfo(config.get(MySqlIncrementalSourceOptions.URL));
```
**Rationale**: Ensure consistency between Option definition and usage, avoid
future maintenance issues.
---
### Issue 2: Oracle IncrementalSource uses inconsistent URL reference (MAJOR)
**Location**:
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-oracle/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/oracle/source/OracleIncrementalSource.java:92`
**Context Code**:
```java
// OracleIncrementalSource.java line 92
configFactory.originUrl(config.get(JdbcCommonOptions.URL));
```
**But defined in Factory**:
```java
// OracleIncrementalSourceFactory.java
.optional(OracleIncrementalSourceOptions.URL) // 使用继承的URL
```
**Issue Description**:
Similar to other JDBC Connectors, Oracle references
`OracleIncrementalSourceOptions.URL` in Factory but uses
`JdbcCommonOptions.URL` in Source implementation. However, Oracle's special
characteristic is that unlike other Connectors, it doesn't need to parse the
URL, but directly passes originUrl.
**Potential Risks**:
- Same risks as Issue 1
- Oracle's URL handling approach is inconsistent with other Connectors,
increasing maintenance complexity
**Impact Scope**:
- Direct impact: Oracle CDC Connector
- Indirect impact: Oracle CDC users
- Affected area: Single Connector
**Severity**: MAJOR
**Improvement Suggestions**:
```java
// OracleIncrementalSource.java line 92
configFactory.originUrl(config.get(OracleIncrementalSourceOptions.URL));
```
**Rationale**: Same as Issue 1, consistency needs to be maintained.
---
### Issue 3: Incomplete MongoDB constant migration (MINOR)
**Location**:
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-mongodb/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/mongodb/config/MongodbSourceOptions.java`
(deleted)
**Issue Description**:
The PR creates `MongodbSourceConstants` class and moves constants into it,
but needs confirmation:
1. Were all constants migrated?
2. Is there other code directly referencing the old
`MongodbSourceOptions.常量`?
From the diff, references in the utils class have been updated, but need to
confirm nothing is missing.
**Potential Risks**:
- If there are missing references, compilation errors will occur
- Constant visibility may change
**Impact Scope**:
- Direct impact: MongoDB CDC Connector
- Affected area: Single Connector
**Severity**: MINOR
**Improvement Suggestions**:
Need to add checks in CI to ensure no missing references.
---
### Issue 4: Postgres/Opengauss Factory using FQCN indicates architectural
issues (MINOR)
**Location**:
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-postgres/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/postgres/source/PostgresIncrementalSourceFactory.java:50,84,109`
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-opengauss/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/opengauss/OpengaussIncrementalSourceFactory.java:86,111`
**Context Code**:
```java
// PostgresIncrementalSourceFactory.java line 50
return org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source
.PostgresIncrementalSource.IDENTIFIER;
// line 84
return org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source
.PostgresIncrementalSource.class;
// line 109
new org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source
.PostgresIncrementalSource<>(context.getOptions(), catalogTables);
```
**Issue Description**:
Using fully qualified class names (FQCN) instead of simple imports usually
indicates:
1. Circular dependencies exist
2. Package structure is unreasonable
3. Class name conflicts
This reduces code readability.
**Potential Risks**:
- Decreased code readability
- May exist deeper design issues
**Impact Scope**:
- Direct impact: PostgreSQL CDC, OpenGauss CDC
- Affected area: Multiple Connectors
**Severity**: MINOR
**Improvement Suggestions**:
Refactor package structure to eliminate circular dependencies, allow using
normal imports.
---
### Issue 5: Missing test validation (MAJOR)
**Location**: Entire PR
**Issue Description**:
The PR performs large-scale refactoring, but lacks tests:
1. No validation that the new Options class inheritance relationship is
correct
2. No validation that URL configuration actually reads normally
3. No validation of configuration backward compatibility
4. Only modified `ConnectorOptionCheckTest` to remove whitelist, didn't add
new validation
**Potential Risks**:
- Configuration read failures may occur at runtime
- May break existing user configurations
- Difficult to discover regression issues
**Impact Scope**:
- Direct impact: All CDC Connectors
- Indirect impact: All CDC users
- Affected area: Multiple Connectors
**Severity**: MAJOR
**Improvement Suggestions**:
Add unit tests:
```java
@Test
public void testMySqlIncrementalSourceOptionsInheritance() {
// 验证继承关系
assertTrue(JdbcSourceOptions.class.isAssignableFrom(MySqlIncrementalSourceOptions.class));
assertTrue(SourceOptions.class.isAssignableFrom(MySqlIncrementalSourceOptions.class));
// 验证URL选项可用
ReadonlyConfig config = ReadonlyConfig.fromMap(Map.of("url",
"jdbc:mysql://localhost:3306/test"));
assertEquals("jdbc:mysql://localhost:3306/test",
config.get(MySqlIncrementalSourceOptions.URL));
}
@Test
public void testMongodbIncrementalSourceOptionsInheritance() {
// 验证继承关系
assertTrue(SourceOptions.class.isAssignableFrom(MongodbIncrementalSourceOptions.class));
assertFalse(JdbcSourceOptions.class.isAssignableFrom(MongodbIncrementalSourceOptions.class));
}
```
**Rationale**: Ensure correctness and stability of refactoring.
---
### Issue 6: Inaccurate PR description (MINOR)
**Location**: PR description
**Issue Description**:
The PR description states "Does this PR introduce any user-facing change?
no", but actually:
1. Multiple public classes were deleted (`MySqlSourceOptions`, etc.)
2. MongoDB constants were moved to a new class
3. These are API changes and will affect external code
**Potential Risks**:
- Users may encounter compilation errors after upgrade
- Inaccurate documentation
**Impact Scope**:
- Impact: All code directly using CDC API
- Affected area: API users
**Severity**: MINOR
**Improvement Suggestions**:
Update PR description to:
```
Does this PR introduce any user-facing change?
Yes. This is a breaking change for code that directly references:
- MySqlSourceOptions (renamed to MySqlIncrementalSourceOptions)
- OracleSourceOptions (renamed to OracleIncrementalSourceOptions)
- PostgresOptions (renamed to PostgresIncrementalSourceOptions)
- SqlServerSourceOptions (renamed to SqlServerIncrementalSourceOptions)
- MongodbSourceOptions (renamed to MongodbIncrementalSourceOptions)
- MongodbSourceOptions constants (moved to MongodbSourceConstants)
User configurations are not affected.
```
And document this change in `incompatible-changes.md`.
---
### Issue 7: MongoDB splitMetaGroupSize configuration option deleted (MAJOR)
**Location**:
-
`seatunnel-connectors-v2/connector-cdc/connector-cdc-mongodb/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/mongodb/MongodbIncrementalSource.java:89-91`
**Code Before Change**:
```java
Optional.ofNullable(config.get(MongodbSourceOptions.HEARTBEAT_INTERVAL_MILLIS))
.ifPresent(builder::heartbeatIntervalMillis);
Optional.ofNullable(config.get(MongodbSourceOptions.HEARTBEAT_INTERVAL_MILLIS))
.ifPresent(builder::splitMetaGroupSize);
```
**Code After Change**:
```java
Optional.ofNullable(config.get(MongodbIncrementalSourceOptions.HEARTBEAT_INTERVAL_MILLIS))
.ifPresent(builder::heartbeatIntervalMillis);
// splitMetaGroupSize配置行被删除
```
**Issue Description**:
The `splitMetaGroupSize` configuration option was removed, but from the diff:
1. `MongodbSourceConfigProvider.Builder` still has `splitMetaGroupSize`
method
2. Deleting this line means users can no longer set this parameter through
configuration
3. Need to confirm if this change was an intentional feature removal
**Potential Risks**:
- If `splitMetaGroupSize` has a default value and users could override it
before, they cannot override it now
- May affect performance tuning
**Impact Scope**:
- Direct impact: MongoDB CDC Connector
- Indirect impact: Users using this parameter for performance tuning
- Affected area: Single Connector
**Severity**: MAJOR
**Improvement Suggestions**:
1. If intentionally deleted, need to explain in PR description and changelog
2. If accidentally deleted, need to restore:
```java
Optional.ofNullable(config.get(MongodbIncrementalSourceOptions.SPLIT_META_GROUP_SIZE))
.ifPresent(builder::splitMetaGroupSize);
```
And define this Option in `MongodbIncrementalSourceOptions`.
**Rationale**: Need to clarify if this is a breaking change or a bug.
---
## Part 4: Overall Assessment
### Can this be merged
**Conclusion**: ❌ **Not recommended to merge in current state**
**Rationale**:
1. **BLOCKER issues**: Issue 1 (URL option inconsistency) must be fixed
before merging, otherwise will leave serious technical debt
2. **MAJOR issues**: Issues 2, 5, 7 need to be resolved or explicitly
explained before merging
3. **Insufficient testing**: Lacks validation tests for the new architecture
### Improvement Priority
**P0 (Must Fix)**:
- Issue 1: Unify URL option references
- Issue 7: Confirm or restore splitMetaGroupSize configuration
**P1 (Strongly Recommended to Fix)**:
- Issue 2: Oracle's URL reference consistency
- Issue 5: Add test validation
- Issue 6: Update PR description and documentation
**P2 (Recommended to Fix)**:
- Issue 3: Verify MongoDB constant migration completeness
- Issue 4: Refactor to eliminate FQCN usage
### Architecture Recommendations
In the long term, recommend:
1. Unify URL handling approach for all JDBC CDC Connectors
2. Consider migrating `JdbcCommonOptions.URL` to `JdbcSourceOptions.URL` and
deprecating the former
3. Re-examine PostgreSQL/Opengauss package structure to eliminate circular
dependencies
### Summary
This is a PR with **correct direction but incomplete execution**. The idea
of establishing a CDC Options inheritance system is good, but there are
critical inconsistency issues during implementation, especially the URL option
dual reference problem. Recommend the author fix the above issues before
merging.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]