gortiz opened a new pull request, #17851:
URL: https://github.com/apache/pinot/pull/17851

   ## Summary
   
   This PR reworks Pinot's continuous JFR integration to use the JVM 
DiagnosticCommand MBean (`JFR.configure`, `JFR.start`, `JFR.stop`) instead of 
the `jdk.jfr.Recording` object API, while keeping cluster-level dynamic control 
through `ContinuousJfrStarter`.
   
   ## Why this change is needed
   
   The previous implementation had correctness and operability issues:
   
   - It used `Recording.setDestination(...)`, which does not match the intended 
long-running "continuous recording + operator-managed dumps" behavior.
   - It mixed concerns by trying to handle dump-file lifecycle/cleanup in Pinot 
code.
   - It did not provide a reliable way to control JFR repository location 
through Pinot in the way operators need.
   - It made semantics around dump boundaries/rotation unclear.
   
   The new approach aligns Pinot with JFR's actual control model and removes 
behavior that was easy to misinterpret.
   
   ## Why we decided not to keep using `Recording`
   
   We moved away from `Recording` API for control flow because important 
operational knobs we need are in the global JFR control plane, not the 
per-recording API:
   
   - repository path (`repositorypath`)
   - dump path (`dumppath`)
   - unified runtime command semantics for start/stop/configure
   
   Using DiagnosticCommand MBean gives one coherent mechanism for all required 
controls and maps directly to `jcmd` behavior.
   
   ## Why we still keep `ContinuousJfrStarter` (instead of only `jcmd` or 
`JAVA_OPTS`)
   
   We intentionally keep `ContinuousJfrStarter` as Pinot's control plane 
because it gives cluster-wide operability that `JAVA_OPTS`/manual `jcmd` do not:
   
   - Dynamic reconfiguration without restart via Pinot cluster config updates.
   - Single cluster-level control surface for enable/disable and JFR parameters.
   - No need to manually run commands on many nodes during incidents.
   - Better consistency with Pinot's existing config propagation model across 
cluster roles.
   
   `JAVA_OPTS` and direct `jcmd` remain valid tools, but they are not 
sufficient as the primary Pinot control plane for large clusters.
   
   ## What changed
   
   ### `ContinuousJfrStarter` behavior
   
   - Uses DiagnosticCommand MBean commands:
     - `jfrConfigure` for runtime options (`repositorypath`, `dumppath`)
     - `jfrStart` for recording start
     - `jfrStop` for stop by recording name
   - Supports cluster config mapping:
     - `pinot.jfr.directory` -> `repositorypath` (kept for backward 
compatibility)
     - `pinot.jfr.dumpPath` -> `dumppath`
     - existing keys still supported: `enabled`, `name`, `configuration`, 
`toDisk`, `maxSize`, `maxAge`, `dumpOnExit`
   - Keeps one continuous recording model; no dump rotation/cleanup thread 
logic.
   
   ### Reliability hardening
   
   - If DiagnosticCommand MBean is unavailable, Pinot logs warnings and safely 
no-ops instead of failing startup/runtime paths.
   - Static initialization is safe even if `ObjectName` creation fails (logs 
warning; does not throw).
   
   ### Tests
   
   - Unit-style tests validate command generation and control flow.
   - Added integration test that uses the real DiagnosticCommand MBean and 
verifies recording visibility via `jfrCheck`.
   - Added coverage for graceful behavior when MBean is unavailable.
   
   ### Docs (on pinot-contrib/pinot-docs)
   
   - Updated continuous JFR operator docs to reflect:
     - MBean-based control model
     - `pinot.jfr.directory` compatibility key and `pinot.jfr.dumpPath`
     - graceful degradation when MBean is unavailable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to