DanielCarter-stack commented on PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#issuecomment-3832468582
<!-- code-pr-reviewer -->
<!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10431", "part": 1,
"total": 1} -->
### Issue 1: Inconsistent URL Construction Leading to Path Format Mismatch
**Location**:
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:75`
and `:132`
**Problem Description**:
`replacePathWithEnv` and `resolvePathEnv` use different methods to construct
URLs, resulting in inconsistent URL formats:
- `replacePathWithEnv`: `new URL(url.getProtocol(), url.getHost(),
url.getPort(), newPath)`
- For `file:` protocol, generates `file:$SEATUNNEL_HOME/lib/xxx.jar`
- `resolvePathEnv`: `fullPath.toUri().toURL()`
- Generates `file:///opt/seatunnel/lib/xxx.jar` (three slashes)
**Potential Risks**:
- When URLs travel between client and server multiple times (e.g., task
retry, recovery), path formats may be inconsistent
- Some code that depends on URL format (such as `jar.toURI().getPath()`) may
behave inconsistently under different formats
**Impact Scope**:
- Direct impact: Two methods of `PathResolver`
- Indirect impact: All code using `action.getJarUrls()` and
`DefaultClassLoaderService`
- Affected area: Core framework (ClassLoader)
**Severity**: MAJOR
**Improvement Suggestions**:
```java
// In replacePathWithEnv, keep URL format consistent
public static URL replacePathWithEnv(URL url) {
// ... existing logic ...
if (normalizedPath.startsWith(normalizedHome)) {
String relativePath =
normalizedPath.substring(normalizedHome.length());
// Consistent with resolvePathEnv: use URI constructor
String newPath = SEATUNNEL_HOME_VAR + "/" +
relativePath.replace(File.separatorChar, '/');
try {
// Use URI constructor to build URL, ensure format consistency
return new java.net.URI(url.getProtocol(), url.getHost(),
newPath, null).toURL();
} catch (MalformedURLException | URISyntaxException e) {
throw new RuntimeException("Failed to create logical URL for: "
+ url, e);
}
}
return url;
}
```
**Rationale**: Uniformly use URI to construct URLs, ensuring URL formats
generated by `replacePathWithEnv` and `resolvePathEnv` are consistent.
---
### Issue 2: Redundant Path Prefix Handling Logic in resolvePathEnv
**Location**:
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:116-128`
**Problem Description**:
The code removes leading slashes from paths twice, resulting in redundant
logic and prone to errors:
1. First time: Remove leading slash from `path`
2. After replacing `$SEATUNNEL_HOME`
3. Second time: Remove leading slash from `relativePath`
**Potential Risks**:
- If the input path format does not meet expectations (e.g.,
`file:/$SEATUNNEL_HOME/lib/xxx.jar`), it may produce incorrect results
- Poor code readability and difficult to maintain
**Impact Scope**:
- Direct impact: `PathResolver.resolvePathEnv()`
- Indirect impact: Server-side ClassLoader creation
- Affected area: Core framework
**Severity**: MINOR
**Improvement Suggestions**:
```java
public static URL resolvePathEnv(URL url) {
String path = url.getPath();
if (!path.contains(SEATUNNEL_HOME_VAR)) {
return url;
}
String home = Common.getSeaTunnelHome();
if (StringUtils.isBlank(home)) {
return url;
}
// Handle path prefix uniformly
// Normalize path: remove leading slashes, replace variables
String cleanPath = path.startsWith("/") ? path.substring(1) : path;
String relativePath = cleanPath.replace(SEATUNNEL_HOME_VAR, "");
relativePath = relativePath.replaceAll("^/+", ""); // Remove all
leading slashes
Path fullPath = Paths.get(home, relativePath);
try {
return fullPath.toUri().toURL();
} catch (MalformedURLException e) {
throw new RuntimeException("Failed to resolve logical URL for: " +
url, e);
}
}
```
**Rationale**: Simplify logic, unify path prefix handling, improve code
readability and maintainability.
---
### Issue 3: PathResolver's Collection Modification Methods May Cause
Concurrency Issues
**Location**:
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:43-51`
and `:89-97`
**Problem Description**:
`replacePathWithEnv(Collection<URL>)` and `resolvePathEnv(Collection<URL>)`
modify the passed-in collections:
```java
urls.clear();
urls.addAll(replaced);
```
**Potential Risks**:
1. If the caller passes a shared collection, it may lead to unexpected
modifications
2. If called in a multi-threaded environment, it may lead to
`ConcurrentModificationException`
3. In current code, `ClientJobExecutionEnvironment` uses `new HashSet<>()`
to create a new collection, so it is safe
4. But `replaceActionJarUrls(action.getJarUrls())` directly modifies the
internal collection of `Action`
**Impact Scope**:
- Direct impact: Collection methods of `PathResolver`
- Indirect impact: `ClientJobExecutionEnvironment` and `DAGUtils`
- Affected area: Client and server
**Severity**: MAJOR
**Improvement Suggestions**:
```java
/**
* Replaces the absolute path of SEATUNNEL_HOME in the given URLs with a
logical variable.
* Note: This method modifies the collection in-place. If you need to
preserve the original
* collection, create a copy before calling this method.
*
* @param urls The collection of absolute URLs (will be modified)
*/
public static void replacePathWithEnv(Collection<URL> urls) {
if (urls == null || urls.isEmpty()) {
return;
}
List<URL> replaced =
urls.stream().map(PathResolver::replacePathWithEnv).collect(Collectors.toList());
urls.clear();
urls.addAll(replaced);
}
// Or, provide a version that does not modify the collection
public static Collection<URL> replacePathWithEnvCopy(Collection<URL> urls) {
if (urls == null || urls.isEmpty()) {
return urls;
}
return urls.stream()
.map(PathResolver::replacePathWithEnv)
.collect(Collectors.toList());
}
```
**Rationale**:
1. Clearly state in JavaDoc that the method modifies the collection
2. Provide alternative methods that do not modify the collection, allowing
the caller to choose
3. Or, change to not modify the collection, letting the caller handle the
return value
---
### Issue 4: DefaultClassLoaderService's Error Message Is Misleading
**Location**:
`seatunnel-engine/seatunnel-engine-core/src/main/java/org/apache/seatunnel/engine/core/classloader/DefaultClassLoaderService.java:78-86`
**Problem Description**:
When a JAR file does not exist, the error message mentions:
```
please ensure that the deployment paths of SeaTunnel on different nodes are
consistent.
```
But in the new design, client and server paths can be different (decoupled
through SEATUNNEL_HOME).
**Potential Risks**:
- Users seeing the error message will try to make all nodes' paths consistent
- But the root cause of the problem may be:
1. Server-side SEATUNNEL_HOME is configured incorrectly
2. Server is missing certain JAR files
3. Path resolution failure
**Impact Scope**:
- Direct impact: `DefaultClassLoaderService`
- Indirect impact: User efficiency in troubleshooting issues
- Affected area: Operations and debugging
**Severity**: MINOR
**Improvement Suggestions**:
```java
if (!file.exists()) {
String host = ((NodeEngineImpl)
nodeEngine).getNode().getThisAddress().getHost();
String jarPath = jar.toString();
String errorMsg = String.format(
"The jar file %s cannot be found on node %s. " +
"Current SEATUNNEL_HOME: %s. " +
"Please ensure: " +
"1) SEATUNNEL_HOME is correctly configured on this node, " +
"2) The connector JARs are installed in $SEATUNNEL_HOME/connectors
or $SEATUNNEL_HOME/plugins, " +
"3) Run 'bin/install-plugin.sh' to install connectors if needed.",
jarPath, host, Common.getSeaTunnelHome());
throw new ClassLoaderException(ClassLoaderErrorCode.NOT_FOUND_JAR,
errorMsg);
}
```
**Rationale**: Provide more accurate and helpful error messages to help
users quickly locate problems.
---
### Issue 5: PathResolver's Exception Handling Is Too Broad
**Location**:
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:77`
and `:134`
**Problem Description**:
Wrapping `MalformedURLException` as `RuntimeException`, error messages are
not detailed enough.
**Potential Risks**:
- Callers cannot obtain sufficient context information to diagnose problems
- Cannot distinguish which URL processing failed
- Cannot provide targeted fix suggestions
**Impact Scope**:
- Direct impact: `PathResolver`
- Indirect impact: All callers
- Affected area: Core framework
**Severity**: MINOR
**Improvement Suggestions**:
```java
public static URL replacePathWithEnv(URL url) {
// ... existing logic ...
try {
return new URL(url.getProtocol(), url.getHost(), url.getPort(),
newPath);
} catch (MalformedURLException e) {
throw new RuntimeException(
String.format(
"Failed to create logical URL. Original URL: %s, New path:
%s, " +
"SEATUNNEL_HOME: %s. Please check if the URL format is
valid.",
url, newPath, Common.getSeaTunnelHome()),
e);
}
}
public static URL resolvePathEnv(URL url) {
// ... existing logic ...
try {
return fullPath.toUri().toURL();
} catch (MalformedURLException e) {
throw new RuntimeException(
String.format(
"Failed to resolve logical URL. Logical URL: %s, Resolved
path: %s, " +
"SEATUNNEL_HOME: %s. Please check if SEATUNNEL_HOME is
correctly configured.",
url, fullPath, Common.getSeaTunnelHome()),
e);
}
}
```
**Rationale**: Provide more detailed error information, including original
URL, processed path, SEATUNNEL_HOME value, etc., to help diagnose problems.
---
### Issue 6: Missing Handling for SEATUNNEL_HOME Being Empty or Containing
Special Characters
**Location**:
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java`
**Problem Description**:
The code checks for `StringUtils.isBlank(home)`, but does not handle the
following cases:
1. SEATUNNEL_HOME contains spaces (e.g., `C:/Program Files/SeaTunnel`)
2. SEATUNNEL_HOME contains special characters (e.g., Chinese, URL-encoded
characters)
3. SEATUNNEL_HOME is a relative path (e.g., `../seatunnel`)
**Potential Risks**:
- On Windows systems, if SEATUNNEL_HOME contains spaces, the constructed URL
may be invalid
- If the path contains non-ASCII characters, it may cause
serialization/deserialization failures
**Impact Scope**:
- Direct impact: `PathResolver`
- Indirect impact: Cross-platform compatibility
- Affected area: Windows users and non-English environment users
**Severity**: MINOR
**Improvement Suggestions**:
```java
public static URL resolvePathEnv(URL url) {
String path = url.getPath();
if (!path.contains(SEATUNNEL_HOME_VAR)) {
return url;
}
String home = Common.getSeaTunnelHome();
if (StringUtils.isBlank(home)) {
return url;
}
// Normalize home path
try {
Path normalizedHome = Paths.get(home).normalize();
if (!Files.exists(normalizedHome)) {
throw new RuntimeException(
String.format("SEATUNNEL_HOME does not exist: %s", home));
}
home = normalizedHome.toAbsolutePath().toString();
} catch (InvalidPathException e) {
throw new RuntimeException(
String.format("Invalid SEATUNNEL_HOME: %s. Error: %s", home,
e.getMessage()), e);
}
// ... continue processing ...
}
```
**Rationale**: Validate and normalize SEATUNNEL_HOME before processing
paths, detecting configuration issues early.
---
### Issue 7: Incomplete Test Coverage
**Location**:
`seatunnel-common/src/test/java/org/apache/seatunnel/common/utils/PathResolverTest.java`
**Problem Description**:
Although existing tests cover basic functionality, the following edge case
tests are missing:
1. Path exactly equals SEATUNNEL_HOME (`relativePath` is empty)
2. JAR file is in SEATUNNEL_HOME root directory (e.g.,
`$SEATUNNEL_HOME/xxx.jar`)
3. SEATUNNEL_HOME contains special characters (spaces, Chinese, etc.)
4. Abnormal URL format (e.g., `file:/` followed by no path)
5. Windows long paths (e.g., `\\?\C:\...`)
**Potential Risks**:
- Under certain edge conditions, code may produce unexpected results
- Users may encounter issues in production when facing untested scenarios
**Impact Scope**:
- Direct impact: Test coverage
- Indirect impact: Code quality
- Affected area: All code using PathResolver
**Severity**: MINOR
**Improvement Suggestions**:
```java
@Test
public void testJarInRootOfSeaTunnelHome() throws MalformedURLException {
String home = "/opt/seatunnel";
System.setProperty("SEATUNNEL_HOME", home);
Common.setSeaTunnelHome(home);
String jarPath = home + "/plugin.jar";
URL absoluteUrl = new File(jarPath).toURI().toURL();
URL logicalUrl = PathResolver.replacePathWithEnv(absoluteUrl);
Assertions.assertEquals("$SEATUNNEL_HOME/plugin.jar",
logicalUrl.getPath());
}
@Test
public void testSeaTunnelHomeWithSpaces() throws MalformedURLException {
String home = "/opt/sea tunnel/home";
System.setProperty("SEATUNNEL_HOME", home);
Common.setSeaTunnelHome(home);
String jarPath = home + "/lib/test.jar";
URL absoluteUrl = new File(jarPath).toURI().toURL();
URL logicalUrl = PathResolver.replacePathWithEnv(absoluteUrl);
Assertions.assertTrue(logicalUrl.getPath().contains("$SEATUNNEL_HOME"));
}
@Test
public void testEmptyRelativePath() throws MalformedURLException {
// Test when path exactly equals SEATUNNEL_HOME
String home = "/opt/seatunnel";
System.setProperty("SEATUNNEL_HOME", home);
Common.setSeaTunnelHome(home);
// This case is not very practical, but test boundary conditions
URL absoluteUrl = new File(home).toURI().toURL();
URL logicalUrl = PathResolver.replacePathWithEnv(absoluteUrl);
// Should return the original URL, because JAR file should not be a
directory
Assertions.assertEquals(absoluteUrl, logicalUrl);
}
```
**Rationale**: Improve test coverage to ensure code works correctly under
various edge conditions.
---
### Issue 8: Missing Documentation
**Location**: PR overall
**Problem Description**:
1. `PathResolver` class lacks class-level JavaDoc, does not explain its
purpose and usage scenarios
2. PR does not include user documentation updates (e.g., configuration
instructions for separate deployment)
3. Code comments mention "Handle Windows backslashes", but do not explain
the complete cross-platform support solution
**Potential Risks**:
- Developers do not know when to use `replacePathWithEnv` and when to use
`resolvePathEnv`
- Users do not know how to configure SEATUNNEL_HOME for separate deployment
- Maintainers find it difficult to understand the design intent of the code
**Impact Scope**:
- Direct impact: Maintainability
- Indirect impact: User adoption rate
- Affected area: All users and developers
**Severity**: MINOR
**Improvement Suggestions**:
```java
/**
* Utility class for resolving SeaTunnel paths between client and server.
*
* <p>In a distributed deployment where the client and server have different
* SEATUNNEL_HOME paths, JAR file paths must be translated between the two
environments.
* This class provides methods to:
* <ul>
* <li>Replace absolute paths with logical variables (on the client
side)</li>
* <li>Resolve logical variables back to absolute paths (on the server
side)</li>
* </ul>
*
* <h3>Usage Example</h3>
* <pre>
* // Client side (before serialization)
* URL clientJar = new URL("file:/opt/client/lib/connector.jar");
* URL logicalJar = PathResolver.replacePathWithEnv(clientJar);
* // logicalJar is now: file:$SEATUNNEL_HOME/lib/connector.jar
*
* // Server side (after deserialization)
* URL serverJar = PathResolver.resolvePathEnv(logicalJar);
* // serverJar is now: file:/opt/server/lib/connector.jar
* </pre>
*
* @since 2.3.5
*/
public class PathResolver {
// ...
}
```
**Rationale**: Provide clear class-level documentation explaining purpose,
usage scenarios, and examples.
---
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]