JNSimba opened a new pull request, #63898: URL: https://github.com/apache/doris/pull/63898
## What

Two improvements to the BE-managed cdc_client lifecycle:

### 1. `cdc_client_java_opts` BE config

New `be.conf` entry to pass JVM options to the BE-forked cdc_client process. Default in shipped `conf/be.conf`:

```
cdc_client_java_opts = -XX:+ExitOnOutOfMemoryError
```

`-XX:+ExitOnOutOfMemoryError` makes the JVM exit on OOM so BE can detect the dead child and re-fork a healthy one (previously the JVM survived OOM in an unresponsive state, and BE would keep reporting "CDC client X unresponsive" without restarting it).

User can override / extend in `be.conf`, e.g. `-XX:+ExitOnOutOfMemoryError -Xmx2g`. Values are whitespace-tokenized and inserted before `-jar`. The startup now uses `execv` instead of `execlp` to support variable-length argv.

### 2. Adopt externally-managed cdc_client

`start_cdc_client()` now probes `127.0.0.1:cdc_client_port/actuator/health` before forking. If a healthy cdc_client is already listening (e.g. one started manually for debug / hotfix), BE adopts it and skips the fork instead of fork-looping against a port it cannot bind.

Edge cases handled:

- Forked child binds the port, runs normally: unchanged (BE manages it).
- BE-forked child died and user manually started a replacement on the same port: next RPC adopts the external instance.
- User stops their external cdc_client: next RPC's probe fails, BE falls back to fork.
- fork() returns success and health check passes but the new child has already exited (port held by an external process answering health): treated as adoption rather than masking the dead PID as "Start success".

A `_adopted_external` atomic edge-triggered flag throttles the "Adopting external cdc client" log so each mode transition prints exactly once.

## Tests

- Existing `cdc_client_mgr_test.cpp` cases unchanged (all new logic lives behind `#ifndef BE_TEST`).
- Two new tests covering the `_adopted_external` flag default value and setter/getter round-trip.
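
For reviewers, here is a rough sketch of the two mechanisms described above: tokenizing `cdc_client_java_opts` on whitespace and placing the tokens before `-jar` in an argv suitable for `execv`, and an edge-triggered flag that lets a log line fire only when the adopted/forked mode changes. It is not the code in this PR. The helper names `build_cdc_argv`, `to_execv_argv`, and `AdoptionLogGate` are hypothetical and chosen only for illustration, and the real implementation may differ in structure and error handling.

```cpp
#include <atomic>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): build the java command line with
// whitespace-separated JVM options placed before "-jar".
std::vector<std::string> build_cdc_argv(const std::string& java_bin,
                                        const std::string& java_opts,
                                        const std::string& jar_path) {
    std::vector<std::string> args{java_bin};
    std::istringstream iss(java_opts);
    std::string token;
    // operator>> skips any run of whitespace, so repeated spaces or tabs
    // between options do not produce empty tokens.
    while (iss >> token) {
        args.push_back(token);
    }
    args.push_back("-jar");
    args.push_back(jar_path);
    return args;
}

// Convert to the NULL-terminated char* array that execv() expects.
// The pointers refer into `args`, which must outlive the returned vector.
std::vector<char*> to_execv_argv(std::vector<std::string>& args) {
    assert(!args.empty());  // argv[0] must exist
    std::vector<char*> argv;
    argv.reserve(args.size() + 1);
    for (auto& arg : args) {
        argv.push_back(&arg[0]);
    }
    argv.push_back(nullptr);
    return argv;
}

// Hypothetical wrapper around an atomic flag like `_adopted_external`:
// should_log() returns true only when the mode actually changes.
class AdoptionLogGate {
public:
    bool should_log(bool adopted_now) {
        // exchange() stores the new mode and returns the previous one atomically.
        return _adopted_external.exchange(adopted_now) != adopted_now;
    }
    bool adopted() const { return _adopted_external.load(); }

private:
    std::atomic<bool> _adopted_external{false};
};
```

In this sketch the child process would call `execv(argv[0], argv.data())` after `fork()`. Because `execv` does not search `PATH` the way `execlp` does, `java_bin` is assumed to be a full path. Plain whitespace tokenization also means a quoted option value containing spaces would be split into separate arguments.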
## Test plan

- [ ] Unit: `cdc_client_mgr_test`
- [ ] Manual: kill BE-forked cdc_client, `nohup java -jar cdc-client.jar ...` on the same host; verify BE adopts it without fork-looping (`be.INFO` shows one-time `Adopting external cdc client on port 9096`).
- [ ] Manual: trigger OOM in cdc_client; verify JVM exits and BE forks a healthy replacement.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
