DanielCarter-stack commented on PR #10419:
URL: https://github.com/apache/seatunnel/pull/10419#issuecomment-3831140300
<!-- code-pr-reviewer -->
<!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10419", "part": 1,
"total": 1} -->
### Issue 1: Deleting public constants causes compilation failure
**Location**:
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-13-starter/src/main/java/org/apache/seatunnel/core/starter/flink/FlinkStarter.java:26`
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-20-starter/src/main/java/org/apache/seatunnel/core/starter/flink/FlinkStarter.java:26`
**Related context**:
```java
// FlinkStarter.java (FLINK13/FLINK20) - After modification
public class FlinkStarter extends AbstractFlinkStarter {
// public static final String APP_JAR_NAME =
EngineType.FLINK13.getStarterJarName(); // Deleted
FlinkStarter(String[] args) {
super(args, EngineType.FLINK13);
}
}
// FlinkStarter.java (FLINK15) - Unmodified
public class FlinkStarter extends AbstractFlinkStarter {
public static final String APP_JAR_NAME =
EngineType.FLINK15.getStarterJarName();
// ...
}
// FlinkExecution.java - Caller
new File(
Common.appStarterDir()
.resolve(FlinkStarter.APP_JAR_NAME) // ← References the deleted
constant
.toString())
```
**Problem description**:
The PR deleted the `public static final String APP_JAR_NAME` constant in
FLINK13 and FLINK20, but this constant is referenced by `FlinkExecution.java`.
This leads to:
1. Compilation error: `FlinkStarter.APP_JAR_NAME` cannot be resolved
2. FLINK15 version retains the constant, causing inconsistent behavior
across the three Flink versions
**Potential risks**:
- Risk 1: Code cannot compile
- Risk 2: Even if compilation succeeds, runtime behavior of FLINK13/FLINK20
is inconsistent with FLINK15
- Risk 3: Violates the principle of minimal changes; this deletion is
unrelated to the PR's stated objective
**Impact scope**:
- Direct impact: Task execution paths for Flink 1.13 and 1.20
- Indirect impact: All user jobs using Flink 1.13/1.20
- Impact area: Core framework - Blocking issue
**Severity**: **CRITICAL**
**Improvement suggestions**:
Two fix approaches:
**Approach A**: Restore the deleted constants (recommended)
```java
// FlinkStarter.java (FLINK13/FLINK20) - Restored
public class FlinkStarter extends AbstractFlinkStarter {
public static final String APP_JAR_NAME =
EngineType.FLINK13.getStarterJarName();
FlinkStarter(String[] args) {
super(args, EngineType.FLINK13);
}
}
```
**Approach B**: Refactor FlinkExecution.java dependency injection
```java
// FlinkExecution.java - Modify constructor
public FlinkExecution(Config config, EngineType engineType) {
try {
jarPaths = new ArrayList<>(
Collections.singletonList(
new File(
Common.appStarterDir()
.resolve(engineType.getStarterJarName()) // Use
engineType parameter
.toString())
.toURI()
.toURL()));
} catch (MalformedURLException e) {
throw new SeaTunnelException("load flink starter error.", e);
}
// ...
}
// Also modify all places that call FlinkExecution, pass in engineType
parameter
```
**Rationale**:
- Approach A has minimal risk and maintains backward compatibility
- Approach B better follows the dependency inversion principle but requires
more extensive refactoring
- Given the PR's goal is to fix the runtime.tar.gz path issue, unrelated
major refactoring should not be introduced
- **Must restore the deleted constants**
**Confidence**: High - This is a definitive compilation error
---
### Issue 2: Incomplete error handling in shell scripts
**Location**:
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-13-starter/src/main/bin/start-seatunnel-flink-13-connector-v2.sh:47`
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-15-starter/src/main/bin/start-seatunnel-flink-15-connector-v2.sh:47`
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-20-starter/src/main/bin/start-seatunnel-flink-20-connector-v2.sh:47`
**Related context**:
```bash
if [ ! -f "${APP_DIR}/runtime.tar.gz" ];then
cd "${APP_DIR}" # ← 如果失败,脚本会因 set -e 终止
directories=("connectors" "lib" "plugins")
# ...
tar -zcvf runtime.tar.gz "${existing_dirs[@]}"
cd - # ← 如果失败,工作目录会被改变
fi
```
**Problem description**:
Although `set -eu` is used (exit immediately on error), edge cases exist:
1. If `cd "${APP_DIR}"` fails, the script terminates (this is correct
behavior)
2. But if `cd -` fails (rare, but possible under restricted permissions),
the user's working directory will be permanently changed
**Potential risks**:
- Risk 1: User's working directory is unintentionally changed
- Risk 2: May cause subsequent commands to execute in the wrong directory in
edge cases
**Impact scope**:
- Direct impact: Users running the startup script
- Indirect impact: Subsequent script executions
- Impact area: Individual user session
**Severity**: **MINOR**
**Improvement suggestions**:
```bash
if [ ! -f "${APP_DIR}/runtime.tar.gz" ];then
cd "${APP_DIR}" || {
echo "ERROR: Failed to change directory to ${APP_DIR}" >&2
exit 1
}
directories=("connectors" "lib" "plugins")
existing_dirs=()
for dir in "${directories[@]}"; do
if [ -d "$dir" ]; then
existing_dirs+=("$dir")
fi
done
if [ ${#existing_dirs[@]} -eq 0 ]; then
echo "[${directories[@]}] not existed, skip generate runtime.tar.gz"
else
tar -zcvf runtime.tar.gz "${existing_dirs[@]}" || {
echo "ERROR: Failed to create runtime.tar.gz" >&2
cd - >/dev/null 2>&1
exit 1
}
fi
cd - >/dev/null 2>&1 || echo "WARNING: Failed to restore working
directory" >&2
fi
```
**Rationale**:
- Explicitly handle errors and provide clear error messages
- Ensure `cd -` does not fail silently when it fails
- Use `>/dev/null 2>&1` to suppress normal output of `cd -`
**Confidence**: Medium - This is an edge case, but the risk does exist
---
### Issue 3: Inconsistent path concatenation methods
**Location**:
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-starter-common/src/main/java/org/apache/seatunnel/core/starter/flink/AbstractFlinkStarter.java:69-70`
**Related context**:
```java
// Modified code
command.add(
String.format(
"-Dyarn.ship-archives=\"%s/%s\"",
Common.getSeaTunnelHome(), Common.FLINK_YARN_APPLICATION_PATH));
// Compare path concatenation methods in other places (Common.java)
public static Path pluginRootDir() {
return Paths.get(getSeaTunnelHome(), "plugins");
}
public static Path connectorDir() {
return Paths.get(getSeaTunnelHome(), "connectors");
}
```
**Problem description**:
1. Uses `String.format` to manually concatenate paths (`"%s/%s"`)
2. While other places in the project use `Paths.get()` or `Path.resolve()`
for path concatenation
3. This leads to inconsistent path concatenation methods, and:
- May cause issues on Windows (different path separators)
- Loses the type safety and cross-platform compatibility of the `Path` API
**Potential risks**:
- Risk 1: Windows platform path separator incompatibility (`/` vs `\`)
- Risk 2: If `getSeaTunnelHome()` returns a path ending with `/`, it will
result in double slashes
- Risk 3: Inconsistent code style increases maintenance costs
**Impact scope**:
- Direct impact: YARN Application mode
- Indirect impact: Windows users (if any)
- Impact area: Core framework - Cross-platform compatibility
**Severity**: **MAJOR**
**Improvement suggestions**:
```java
// Option 1: Use Path.resolve() (Recommended)
command.add(
String.format(
"-Dyarn.ship-archives=\"%s\"",
Paths.get(Common.getSeaTunnelHome(),
Common.FLINK_YARN_APPLICATION_PATH)));
// Option 2: Define a helper method
public abstract class AbstractFlinkStarter implements Starter {
private static String buildArchivePath() {
return Paths.get(Common.getSeaTunnelHome(),
Common.FLINK_YARN_APPLICATION_PATH)
.toString();
}
// Then use in buildCommands()
command.add(
String.format(
"-Dyarn.ship-archives=\"%s\"",
buildArchivePath()));
}
```
**Rationale**:
- Using `Paths.get()` is the standard Java NIO practice
- Automatically handles cross-platform path separator issues
- Maintains consistency with path handling methods elsewhere in the project
- Type-safe, avoids string concatenation errors
**Confidence**: High - Cross-platform compatibility issues do exist
---
### Issue 4: Constant access modifier change not documented
**Location**:
-
`seatunnel-common/src/main/java/org/apache/seatunnel/common/config/Common.java:41`
**Related context**:
```java
// Before modification
private static final String FLINK_YARN_APPLICATION_PATH = "runtime.tar.gz";
// After modification
public static final String FLINK_YARN_APPLICATION_PATH = "runtime.tar.gz";
```
**Problem description**:
Changing an `private` constant to `public` is an API change, but:
1. The PR description does not mention this change
2. `incompatible-changes.md` was not updated (explicitly required in the PR
template)
3. No explanation of why this constant needs to be exposed
**Potential risks**:
- Risk 1: External code may start depending on this constant, increasing
future refactoring costs
- Risk 2: Violates the principle of information hiding
- Risk 3: Users don't know this is a public API and may be inadvertently
broken in future versions
**Impact scope**:
- Direct impact: All code that can access the `Common` class
- Indirect impact: Future API evolution
- Impact area: Public API
**Severity**: **MINOR**
**Improvement suggestions**:
**Approach A**: If exposure is necessary, add documentation
```java
/**
* The relative path of the runtime tarball for Flink YARN Application mode.
* This file contains connectors, lib, and plugins directories.
*
* @since 2.3.5
*/
public static final String FLINK_YARN_APPLICATION_PATH = "runtime.tar.gz";
```
**Approach B**: Don't expose the constant, provide an accessor method
(recommended)
```java
// Keep private
private static final String FLINK_YARN_APPLICATION_PATH = "runtime.tar.gz";
// Add accessor method
public static Path getFlinkYarnApplicationPath() {
return Paths.get(getSeaTunnelHome(), FLINK_YARN_APPLICATION_PATH);
}
// Use in AbstractFlinkStarter
command.add(
String.format(
"-Dyarn.ship-archives=\"%s\"",
Common.getFlinkYarnApplicationPath()));
```
**Rationale**:
- Approach A exposes internal implementation details
- Approach B provides better encapsulation
- The method name `getFlinkYarnApplicationPath()` is clearer than
`FLINK_YARN_APPLICATION_PATH`
**Confidence**: Medium - This is an API design issue, but impact is limited
---
### Issue 5: Missing unit tests
**Location**: N/A (missing files)
**Problem description**:
The PR modifies core path construction logic but does not add unit tests to
verify:
1. Shell script logic for `cd` and `tar`
2. `AbstractFlinkStarter.buildCommands()` behavior in YARN_APPLICATION mode
3. Correctness of path construction
**Potential risks**:
- Risk 1: Future modifications may break this fix
- Risk 2: Cannot automatically verify whether the fix is effective
- Risk 3: Regression issues may be introduced in subsequent commits
**Impact scope**:
- Direct impact: Code quality and maintainability
- Indirect impact: Stability of future versions
- Impact area: Test coverage
**Severity**: **MAJOR**
**Improvement suggestions**:
**Test 1**: Java unit tests
```java
// AbstractFlinkStarterTest.java
@Test
public void testBuildCommandsWithYarnApplicationMode() {
// Arrange
String[] args = {"--config", "/path/to/config.conf", "--target",
"yarn-application"};
FlinkStarter starter = new FlinkStarter(args);
// Act
List<String> commands = starter.buildCommands();
// Assert
assertTrue(commands.contains("-Dyarn.ship-archives=\"" +
Common.getSeaTunnelHome() + "/runtime.tar.gz\""));
}
```
**Test 2**: Shell script tests (BATS)
```bash
@test "runtime.tar.gz created in APP_DIR" {
cd /tmp
run /opt/soft/seatunnel/bin/start-seatunnel-flink-13-connector-v2.sh
--config /opt/soft/seatunnel/config/test.conf
[ -f "/opt/soft/seatunnel/runtime.tar.gz" ]
[ ! -f "/tmp/runtime.tar.gz" ]
}
```
**Rationale**:
- Apache top-level projects should have complete test coverage
- Core path logic must have automated tests
- Prevent future regression issues
**Confidence**: High - Missing tests is unacceptable
---
### Issue 6: Inconsistent shell script output
**Location**:
-
`seatunnel-core/seatunnel-flink-starter/seatunnel-flink-13-starter/src/main/bin/start-seatunnel-flink-13-connector-v2.sh:59`
- The same issue exists in FLINK15 and FLINK20 versions
**Problem description**:
Although the PR improves log output, it introduces a minor issue:
```bash
# Before
echo "[connectors,lib,plugins] not existed, skip generate runtime.tar.gz"
# After
echo "[${directories[@]}] not existed, skip generate runtime.tar.gz"
# Expanded becomes:
echo "[connectors lib plugins] not existed, skip generate runtime.tar.gz"
```
**Issue**:
- Before modification, comma separation was used, more readable
- After modification, space separation is used (default behavior of bash
array expansion)
- While not a serious issue, it may affect log parsing tools
**Potential risks**:
- Risk 1: Log parsing tools may depend on comma separators
- Risk 2: Output format is less clear than before
**Impact scope**:
- Direct impact: Log output
- Indirect impact: Log monitoring tools
- Impact area: Shell scripts
**Severity**: **MINOR**
**Improvement suggestions**:
```bash
# Keep comma-separated
if [ ${#existing_dirs[@]} -eq 0 ]; then
echo "[connectors, lib, plugins] not existed, skip generate
runtime.tar.gz"
fi
# Or use IFS to temporarily modify
if [ ${#existing_dirs[@]} -eq 0 ]; then
(IFS=','; echo "[${directories[*]}] not existed, skip generate
runtime.tar.gz")
fi
```
**Rationale**:
- Maintain backward compatibility of output format
- More readable
**Confidence**: Low - This is just a formatting issue and does not affect
functionality
---
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]