noob-se7en opened a new issue, #16779: URL: https://github.com/apache/pinot/issues/16779
**Problem:**

1. **Performance impact of metric calculation**
   Currently, realtime ingestion metrics such as delay are computed by the consumer thread during consumption. This requires fetching the latest upstream offset/record, which introduces an extra call in the critical consumption loop. These calls degrade performance and should be avoided.

2. **Irrelevant or misleading metrics**
   Not all metrics are applicable to every stream type. For example, ingestion_delay_offset is meaningful only for Kafka, but not for Kinesis. Similarly, ingestion_delay_ms can produce misleading values for Kafka. Meters for such metrics should not be created if they are not relevant for the stream.

3. **Race conditions due to shared stream metadata provider**
   The consumer thread uses the StreamMetadataProvider from RealtimeSegmentDataManager to fetch the latest upstream offset. However, the same method is also invoked by other components such as FreshnessBasedChecker, debug APIs, and resource APIs, leading to race conditions.

4. **Dependency on consumer thread and state transitions**
   Metric reporting is tightly coupled with the consumer thread and its state transitions. If the consumer thread is blocked, metrics are not updated. Since thread creation depends on factors like state transition timing and consumer lock acquisition, reporting becomes unreliable. Metric computation and reporting should be decoupled from the consumer thread to ensure consistency.

5. **Incomplete metric cleanup**
   Metrics are not removed cleanly during scenarios such as server rebalancing or segment relocation by the controller. This often results in metrics being emitted even when the server no longer hosts the corresponding stream partition. Current cleanup relies on secondary solutions (e.g., periodic async threads that check for stale metrics, manually calling the remove ingestion metrics API), which is brittle. Metric lifecycle management needs to be addressed at the source.
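
To make the direction in points 1, 3, 4 and 5 concrete, here is a minimal Java sketch of one possible shape for a decoupled tracker. It is an illustration only, not Pinot code: `IngestionLagTracker`, `register`, `recordConsumedOffset`, `unregister`, and `refreshAll` are hypothetical names, and the `LongSupplier` stands in for a metadata fetcher dedicated to the tracker rather than shared with the consumer. The consumer thread only publishes its last consumed offset, a background scheduler does the upstream lookup, and unregistering a partition drops its metric state immediately.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch; none of these names are existing Pinot APIs. */
class IngestionLagTracker implements AutoCloseable {

  /** Per-partition state owned by the tracker. */
  private static final class PartitionState {
    final AtomicLong consumedOffset = new AtomicLong(-1);
    // Assumed to be a fetcher dedicated to the tracker, not shared with the consumer.
    final LongSupplier latestUpstreamOffset;
    volatile long lastLag = -1;

    PartitionState(LongSupplier latestUpstreamOffset) {
      this.latestUpstreamOffset = latestUpstreamOffset;
    }
  }

  private final Map<Integer, PartitionState> partitions = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "ingestion-lag-tracker");
        t.setDaemon(true);
        return t;
      });

  IngestionLagTracker(long periodMs) {
    scheduler.scheduleWithFixedDelay(this::refreshAll, periodMs, periodMs, TimeUnit.MILLISECONDS);
  }

  /** Register only for stream types where an offset-based lag is meaningful. */
  void register(int partitionId, LongSupplier latestUpstreamOffset) {
    partitions.put(partitionId, new PartitionState(latestUpstreamOffset));
  }

  /** Called from the consumer thread: a single atomic write, no upstream call. */
  void recordConsumedOffset(int partitionId, long offset) {
    PartitionState state = partitions.get(partitionId);
    if (state != null) {
      state.consumedOffset.set(offset);
    }
  }

  /** Called when the server stops hosting the partition; the metric state goes with it. */
  void unregister(int partitionId) {
    partitions.remove(partitionId);
  }

  /** Runs on the scheduler thread; package-visible so it can also be triggered directly. */
  void refreshAll() {
    partitions.forEach((partitionId, state) -> {
      long consumed = state.consumedOffset.get();
      if (consumed < 0) {
        return; // nothing consumed yet for this partition
      }
      try {
        state.lastLag = Math.max(0, state.latestUpstreamOffset.getAsLong() - consumed);
      } catch (RuntimeException e) {
        // Keep the previous value; an uncaught exception would cancel the schedule.
      }
    });
  }

  /** Empty if the partition is not registered or no lag has been computed yet. */
  OptionalLong lag(int partitionId) {
    PartitionState state = partitions.get(partitionId);
    return (state == null || state.lastLag < 0)
        ? OptionalLong.empty()
        : OptionalLong.of(state.lastLag);
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

In this sketch the lag keeps refreshing even if the consumer thread is blocked, because the scheduler does not depend on the consumer's state transitions. Point 2 would be handled by the caller, which simply would not register a partition for a stream type where the metric does not apply.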
