Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-02-04 Thread via GitHub


LiJie20190102 commented on PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#issuecomment-3850657638

   The latest code test results:
   https://github.com/user-attachments/assets/474ae692-781a-4fea-b841-3980154cfb44";
 />
   https://github.com/user-attachments/assets/b9fbd986-17a9-44f6-9bc6-56a96c9d4ba8";
 />
   https://github.com/user-attachments/assets/7d7144c5-85ec-47ea-a124-5613472fc698";
 />
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-02-03 Thread via GitHub


LiJie20190102 commented on code in PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#discussion_r2758943273


##
seatunnel-engine/seatunnel-engine-core/src/main/java/org/apache/seatunnel/engine/core/classloader/DefaultClassLoaderService.java:
##
@@ -91,8 +97,8 @@ public synchronized ClassLoader getClassLoader(long jobId, 
Collection jars)
 } else {
 log.debug("Run the test class without file checking");
 }
-ClassLoader classLoader = new SeaTunnelChildFirstClassLoader(jars);
-log.info("Create classloader for job {} with jars {}", jobId, 
jars);
+ClassLoader classLoader = new 
SeaTunnelChildFirstClassLoader(resolvedJars);
+log.info("Create classloader for job {} with resolvedJars {}", 
jobId, resolvedJars);

Review Comment:
   Based on the suggestions you provided, I made corresponding adjustments and 
conducted a test to verify them.
   https://github.com/user-attachments/assets/82937df1-20b0-4c59-a68e-0c28de7d1d6d";
 />
   https://github.com/user-attachments/assets/2723c29f-bec8-4a6a-b1ce-158e45b942f6";
 />
   https://github.com/user-attachments/assets/cf0941a7-3dd7-4e9b-9efb-c61678ab5033";
 />
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-02-02 Thread via GitHub


LiJie20190102 commented on code in PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#discussion_r2757150725


##
seatunnel-engine/seatunnel-engine-core/src/main/java/org/apache/seatunnel/engine/core/classloader/DefaultClassLoaderService.java:
##
@@ -91,8 +97,8 @@ public synchronized ClassLoader getClassLoader(long jobId, 
Collection jars)
 } else {
 log.debug("Run the test class without file checking");
 }
-ClassLoader classLoader = new SeaTunnelChildFirstClassLoader(jars);
-log.info("Create classloader for job {} with jars {}", jobId, 
jars);
+ClassLoader classLoader = new 
SeaTunnelChildFirstClassLoader(resolvedJars);
+log.info("Create classloader for job {} with resolvedJars {}", 
jobId, resolvedJars);

Review Comment:
   Thanks for the question. In my implementation I intentionally work on a copy 
and keep the original collection unchanged, to avoid side effects if the 
original set is shared or reused elsewhere. `resolvedJars` is only used in this 
local flow and is not written back to the original jars , so the copy is still 
useful for isolation. If we can guarantee the collection is not shared, we 
could resolve in-place, but I prefer minimizing side effects.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-02-02 Thread via GitHub


LiJie20190102 commented on code in PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#discussion_r2757150725


##
seatunnel-engine/seatunnel-engine-core/src/main/java/org/apache/seatunnel/engine/core/classloader/DefaultClassLoaderService.java:
##
@@ -91,8 +97,8 @@ public synchronized ClassLoader getClassLoader(long jobId, 
Collection jars)
 } else {
 log.debug("Run the test class without file checking");
 }
-ClassLoader classLoader = new SeaTunnelChildFirstClassLoader(jars);
-log.info("Create classloader for job {} with jars {}", jobId, 
jars);
+ClassLoader classLoader = new 
SeaTunnelChildFirstClassLoader(resolvedJars);
+log.info("Create classloader for job {} with resolvedJars {}", 
jobId, resolvedJars);

Review Comment:
   Thanks for the question. In my implementation I intentionally work on a copy 
and keep the original collection unchanged, to avoid side effects if the 
original set is shared or reused elsewhere. resolvedJars is only used in this 
local flow and is not written back to the original jars , so the copy is still 
useful for isolation. If we can guarantee the collection is not shared, we 
could resolve in-place, but I prefer minimizing side effects.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-02-02 Thread via GitHub


corgy-w commented on code in PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#discussion_r2754850980


##
seatunnel-engine/seatunnel-engine-core/src/main/java/org/apache/seatunnel/engine/core/classloader/DefaultClassLoaderService.java:
##
@@ -91,8 +97,8 @@ public synchronized ClassLoader getClassLoader(long jobId, 
Collection jars)
 } else {
 log.debug("Run the test class without file checking");
 }
-ClassLoader classLoader = new SeaTunnelChildFirstClassLoader(jars);
-log.info("Create classloader for job {} with jars {}", jobId, 
jars);
+ClassLoader classLoader = new 
SeaTunnelChildFirstClassLoader(resolvedJars);
+log.info("Create classloader for job {} with resolvedJars {}", 
jobId, resolvedJars);

Review Comment:
   `resolvedJars` this name can be written back to the jars, I feel it is not 
necessary to call `resolvedJars` `PathResolver. ResolvePathEnv (resolvedJars);` 
 It can be used as a decorator, right? Why make a copy? Is it still useful here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-02-01 Thread via GitHub


DanielCarter-stack commented on PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#issuecomment-3832468582

   
   
   ### Issue 1: Inconsistent URL Construction Leading to Path Format Mismatch
   
   **Location**: 
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:75`
 and `:132`
   
   **Problem Description**:
   `replacePathWithEnv` and `resolvePathEnv` use different methods to construct 
URLs, resulting in inconsistent URL formats:
   - `replacePathWithEnv`: `new URL(url.getProtocol(), url.getHost(), 
url.getPort(), newPath)`
 - For `file:` protocol, generates `file:$SEATUNNEL_HOME/lib/xxx.jar`
   - `resolvePathEnv`: `fullPath.toUri().toURL()`
 - Generates `file:///opt/seatunnel/lib/xxx.jar` (three slashes)
   
   **Potential Risks**:
   - When URLs travel between client and server multiple times (e.g., task 
retry, recovery), path formats may be inconsistent
   - Some code that depends on URL format (such as `jar.toURI().getPath()`) may 
behave inconsistently under different formats
   
   **Impact Scope**:
   - Direct impact: Two methods of `PathResolver`
   - Indirect impact: All code using `action.getJarUrls()` and 
`DefaultClassLoaderService`
   - Affected area: Core framework (ClassLoader)
   
   **Severity**: MAJOR
   
   **Improvement Suggestions**:
   ```java
   // In replacePathWithEnv, keep URL format consistent
   public static URL replacePathWithEnv(URL url) {
   // ... existing logic ...
   
   if (normalizedPath.startsWith(normalizedHome)) {
   String relativePath = 
normalizedPath.substring(normalizedHome.length());
   // Consistent with resolvePathEnv: use URI constructor
   String newPath = SEATUNNEL_HOME_VAR + "/" + 
relativePath.replace(File.separatorChar, '/');
   try {
   // Use URI constructor to build URL, ensure format consistency
   return new java.net.URI(url.getProtocol(), url.getHost(), 
newPath, null).toURL();
   } catch (MalformedURLException | URISyntaxException e) {
   throw new RuntimeException("Failed to create logical URL for: " 
+ url, e);
   }
   }
   return url;
   }
   ```
   
   **Rationale**: Uniformly use URI to construct URLs, ensuring URL formats 
generated by `replacePathWithEnv` and `resolvePathEnv` are consistent.
   
   ---
   
   ### Issue 2: Redundant Path Prefix Handling Logic in resolvePathEnv
   
   **Location**: 
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:116-128`
   
   **Problem Description**:
   The code removes leading slashes from paths twice, resulting in redundant 
logic and prone to errors:
   1. First time: Remove leading slash from `path`
   2. After replacing `$SEATUNNEL_HOME`
   3. Second time: Remove leading slash from `relativePath`
   
   **Potential Risks**:
   - If the input path format does not meet expectations (e.g., 
`file:/$SEATUNNEL_HOME/lib/xxx.jar`), it may produce incorrect results
   - Poor code readability and difficult to maintain
   
   **Impact Scope**:
   - Direct impact: `PathResolver.resolvePathEnv()`
   - Indirect impact: Server-side ClassLoader creation
   - Affected area: Core framework
   
   **Severity**: MINOR
   
   **Improvement Suggestions**:
   ```java
   public static URL resolvePathEnv(URL url) {
   String path = url.getPath();
   if (!path.contains(SEATUNNEL_HOME_VAR)) {
   return url;
   }
   
   String home = Common.getSeaTunnelHome();
   if (StringUtils.isBlank(home)) {
   return url;
   }
   
   // Handle path prefix uniformly
   // Normalize path: remove leading slashes, replace variables
   String cleanPath = path.startsWith("/") ? path.substring(1) : path;
   String relativePath = cleanPath.replace(SEATUNNEL_HOME_VAR, "");
   relativePath = relativePath.replaceAll("^/+", "");  // Remove all 
leading slashes
   
   Path fullPath = Paths.get(home, relativePath);
   try {
   return fullPath.toUri().toURL();
   } catch (MalformedURLException e) {
   throw new RuntimeException("Failed to resolve logical URL for: " + 
url, e);
   }
   }
   ```
   
   **Rationale**: Simplify logic, unify path prefix handling, improve code 
readability and maintainability.
   
   ---
   
   ### Issue 3: PathResolver's Collection Modification Methods May Cause 
Concurrency Issues
   
   **Location**: 
`seatunnel-common/src/main/java/org/apache/seatunnel/common/utils/PathResolver.java:43-51`
 and `:89-97`
   
   **Problem Description**:
   `replacePathWithEnv(Collection)` and `resolvePathEnv(Collection)` 
modify the passed-in collections:
   ```java
   urls.clear();
   urls.addAll(replaced);
   ```
   
   **Potential Risks**:
   1. If the caller passes a shared collection, it may lead to unexpected 
modifications
   2. If called in a multi-threaded environment, it may lead to 
`ConcurrentModificationException`
   3. In current cod

Re: [PR] [Feature] Decouple SeaTunnel Client and Server SEATUNNEL_HOME [seatunnel]

2026-01-31 Thread via GitHub


davidzollo commented on PR #10431:
URL: https://github.com/apache/seatunnel/pull/10431#issuecomment-3830426400

   Thanks for the contribution! This feature is very valuable for flexible 
deployment scenarios where client and server paths differ.
   
   However, after a detailed review, I found a **critical bug** in the 
ClassLoader cache management that will cause memory leaks, and a robustness 
issue in the path resolver.
   
   ### 1.  Critical: Key Mismatch in ClassLoader Cache (Memory Leak)
   
   There is an inconsistency in how the cache `key` is calculated between 
getting and releasing the classloader.
   
   *   In `getClassLoader` (Line 71), the key is calculated using the primitive 
`jars` (which contain the **logical path** with `$SEATUNNEL_HOME`).
   *   In `releaseClassLoader` (Lines 112-123), you resolve the paths to 
**absolute paths** *before* calculating the key.
   
   ```java
   // DefaultClassLoaderService.java
   
   // In getClassLoader:
   String key = covertJarsToKey(jars); // Key is based on 
"file:$SEATUNNEL_HOME/..."
   
   // In releaseClassLoader:
   Collection resolvedJars = new ArrayList<>(jars);
   PathResolver.resolvePathEnv(resolvedJars); 
   // ...
   String key = covertJarsToKey(resolvedJars); // Key is based on 
"file:/opt/seatunnel/..."
   ```
   
   **Consequence:** The keys will never match. `releaseClassLoader` will fail 
to look up the entry in `classLoaderCache`. The reference count will never 
decrement, and the ClassLoader will **never be released**, leading to a 
Permanent Generation memory leak for long-running processes.
   
   **Fix Suggestion:** In `releaseClassLoader`, do **not** use `resolvedJars` 
to generate the key. Use the original `jars` collection to ensure the key 
matches the one put into the map during `getClassLoader`.
   
   ### 2. NPE Risk in `PathResolver`
   
   In `PathResolver.java`, `Common.getSeaTunnelHome()` can return `null` if the 
environment is not set up correctly.
   
   ```java
   // PathResolver.java
   String home = Common.getSeaTunnelHome();
   // ...
   String normalizedHome = new File(home).getAbsolutePath(); // This throws NPE 
if home is null
   ```
   
   **Fix Suggestion:** Please add a null check for `home`. If `SEATUNNEL_HOME` 
is not found, it should either skip the replacement logic safely or throw a 
descriptive exception rather than a raw NullPointerException.
   
   ### 3. Testing
   
   I highly recommend adding a test case (perhaps in `PathResolverTest` or an 
IT) that explicitly sets `SEATUNNEL_HOME` to `null` to verify robustness.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]