klaudworks opened a new issue, #515:
URL: https://github.com/apache/flink-agents/issues/515
## Summary
When `flink-agents-dist.jar` is deployed in `/opt/flink/lib` (which is
required), user-defined resource classes (e.g., custom `ChatModel`
implementations) cannot be loaded from user JARs uploaded via the REST API,
resulting in `ClassNotFoundException`.
## Error Message
```
java.lang.ClassNotFoundException: com.example.AzureOpenAIChatModelSetup
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Unknown Source)
    at org.apache.flink.agents.plan.resourceprovider.JavaResourceProvider.provide(JavaResourceProvider.java:40)
```
## Root Cause
The framework code in `/opt/flink/lib` is loaded by the **System
ClassLoader**. User JARs uploaded at runtime are loaded by Flink's **User
ClassLoader** (a child of the System ClassLoader).
The existing code uses `Class.forName(className)`, which resolves the class
through the caller's classloader (here, the System ClassLoader). Class loading
only delegates upward from child to parent, so the System ClassLoader cannot
see classes defined by its child classloaders.
Affected locations:
- `JavaResourceProvider.java` - main resource instantiation
- `JavaSerializableResourceProvider.java` - serializable resource
deserialization
- `AgentPlan.java` - PythonResourceWrapper class checks
- `ActionJsonDeserializer.java` - parameter type and config deserialization
- `FunctionToolJsonDeserializer.java` - parameter type deserialization
- `EventLogRecordJsonDeserializer.java` - event class deserialization
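
To illustrate the direction of delegation described under Root Cause, here is a minimal sketch. The class `DelegationDemo` and its helpers are hypothetical and not part of flink-agents; they only use standard `java.lang.ClassLoader` calls. Walking `getParent()` from a child loader reaches the System ClassLoader, but there is no link in the opposite direction.

```java
final class DelegationDemo {
    private DelegationDemo() {}

    /** Returns true if 'ancestor' is reachable from 'loader' by following getParent(). */
    static boolean delegatesTo(ClassLoader loader, ClassLoader ancestor) {
        for (ClassLoader l = loader; l != null; l = l.getParent()) {
            if (l == ancestor) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if 'loader' can resolve 'className' (without initializing the class). */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

With a child loader created as `new URLClassLoader(new URL[0], ClassLoader.getSystemClassLoader())`, `delegatesTo(child, system)` holds while `delegatesTo(system, child)` does not, which is why a lookup that starts at the System ClassLoader never reaches classes in a user JAR.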
## Solution
Use the **Thread Context ClassLoader (TCCL)** instead:
```java
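// Resolve the class through the thread context classloader instead of the caller's.
Class<?> clazz =
        Class.forName(className, true, Thread.currentThread().getContextClassLoader());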
```
Flink sets the TCCL to the User ClassLoader before executing user code,
making user-defined classes accessible to framework code.
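
Since the same lookup is needed in several places, it could be centralized in one helper. The following is a minimal sketch, not the code from the actual fix: the class name `UserClassLoading` and the fallback to the helper's own classloader when no TCCL is set are assumptions made for illustration.

```java
final class UserClassLoading {
    private UserClassLoading() {}

    /**
     * Loads a class by name, preferring the thread context classloader (TCCL).
     * Falls back to this class's own classloader when no TCCL is set
     * (an assumption for this sketch, not confirmed behavior of the fix).
     */
    static Class<?> loadClass(String className) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = UserClassLoading.class.getClassLoader();
        }
        // 'true' initializes the class, matching Class.forName(String).
        return Class.forName(className, true, loader);
    }
}
```

A call site would then replace `Class.forName(className)` with `UserClassLoading.loadClass(className)`, so the classloader choice lives in one place.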
## Workaround
Place user-defined resource classes in `/opt/flink/lib` alongside
`flink-agents-dist.jar`. However, this is impractical for deployments where
the platform cannot anticipate what users will run, because every custom
resource would require rebuilding the Docker image.
## Fix
PR #514