github-actions[bot] commented on PR #61253: URL: https://github.com/apache/doris/pull/61253#issuecomment-4044481194
## Code Review Summary **PR: [fix](streaming) Fix NPE in StreamingInsertJob when MetricRepo is not initialized during replay** This PR fixes a `NullPointerException` during FE replay when `MetricRepo` is not yet initialized. The fix adds `MetricRepo.isInit` guards and introduces an `isReplay` boolean parameter to skip metric updates during replay paths. The approach follows the well-established pattern used elsewhere in the codebase (e.g., `RoutineLoadJob.java`). ### Critical Checkpoint Conclusions 1. **Goal and correctness**: The goal is to prevent NPE during replay. The fix correctly addresses all metric call sites on the replay path by (a) guarding with `MetricRepo.isInit` and (b) passing `isReplay=true` from replay callers. All call sites are updated consistently. 2. **Modification scope**: The change is minimal and focused — single file, single concern. +33/-18 lines, all in `StreamingInsertJob.java`. 3. **Concurrency**: No new concurrency concerns. `MetricRepo.isInit` is declared `volatile`, ensuring cross-thread visibility. The existing locking patterns are unchanged. 4. **Lifecycle management**: No lifecycle changes. N/A. 5. **Configuration items**: No new config. N/A. 6. **Incompatible changes / format changes**: None. N/A. 7. **Parallel code paths**: All 13 `MetricRepo.COUNTER_STREAMING_JOB_*` usages in the file are now guarded. The two replay paths (`replayOnCommitted` and `replayOnCloudMode`) correctly pass `isReplay=true`. The non-replay path (`afterCommitted`) correctly passes `isReplay=false`. 8. **Special conditional checks**: The `MetricRepo.isInit && !isReplay` pattern is clear and follows the convention in `RoutineLoadJob`. 9. **Test coverage**: No new tests added. This is acceptable for a defensive NPE guard — the bug manifests only during FE startup replay when `MetricRepo` is uninitialized, which is difficult to reproduce in unit tests. The fix is straightforward enough to verify by inspection. 10. **Observability**: The change skips metrics during replay, which is correct — replay should not double-count metrics that were already recorded on the master. 11. **EditLog / persistence**: No changes to EditLog or persistence logic. The `updateJobStatisticAndOffset` methods still correctly update in-memory statistics and offsets during replay; only the metric counter updates are skipped. 12. **Performance**: No performance impact. The `isInit` check is a single volatile boolean read. ### Minor Observations (Non-blocking) - `updateCloudJobStatisticAndOffset` is only ever called with `isReplay=true` (from `replayOnCloudMode`), making the `!isReplay` guard always evaluate to `true` (skip metrics). This is not a bug — just slightly redundant defensiveness. If a non-replay cloud call path is added in the future, the guard will correctly kick in. - `updateJobStatisticAndOffset(CommitOffsetRequest)` (line 666) appears to be dead code — no callers exist. Not introduced by this PR. **Verdict: No issues found. The fix is correct, focused, and follows established patterns.** -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
