noob-se7en opened a new issue, #16779:
URL: https://github.com/apache/pinot/issues/16779

   **Problem:**
   1. **Performance impact of metric calculation**
   Currently, realtime ingestion metrics such as delay are computed by the 
consumer thread during consumption. This requires fetching the latest upstream 
offset/record, which introduces an extra call in the critical consumption loop. 
These calls degrade performance and should be avoided.
   
   2. **Irrelevant or misleading metrics**
   Not all metrics are applicable to every stream type. For example, 
ingestion_delay_offset is meaningful for Kafka but not for Kinesis. 
Similarly, ingestion_delay_ms can produce misleading values for Kafka. Meters 
for such metrics should not be created if they are not relevant for the stream.
   
   3. **Race conditions due to shared stream metadata provider**
   The consumer thread uses the StreamMetadataProvider from 
RealtimeSegmentDataManager to fetch the latest upstream offset. However, the 
same provider is also used by other components such as FreshnessBasedChecker, 
debug APIs, and resource APIs, leading to race conditions.
   
   4. **Dependency on consumer thread and state transitions**
   Metric reporting is tightly coupled with the consumer thread and its state 
transitions. If the consumer thread is blocked, metrics are not updated. Since 
thread creation depends on factors like state transition timing and consumer 
lock acquisition, reporting becomes unreliable. Metric computation and 
reporting should be decoupled from the consumer thread to ensure consistency.
   
   5. **Incomplete metric cleanup**
   Metrics are not removed cleanly during scenarios such as server rebalancing 
or segment relocation by the controller. This often results in metrics being 
emitted even when the server no longer hosts the corresponding stream 
partition. Current cleanup relies on secondary mechanisms (e.g., periodic async 
threads that check for stale metrics, or manually calling the remove ingestion 
metrics API), which are brittle. Metric lifecycle management needs to be 
addressed at the source.
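
   To make the direction concrete, here is a minimal sketch of how problems 1, 2, 
4, and 5 could be addressed together. It is an illustration under stated 
assumptions, not existing Pinot code: `IngestionLagTracker`, `PartitionState`, 
and every method name below are hypothetical, and the `LongSupplier` stands in 
for a dedicated (not shared) metadata client per partition. The consumer thread 
only records its last consumed offset; a separate periodic task calls 
`refresh()` to fetch the upstream offset and compute the lag.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch: none of these names are real Pinot APIs. */
final class IngestionLagTracker {

  /** All per-partition state lives in one object so removal is a single map operation. */
  private static final class PartitionState {
    final LongSupplier latestUpstreamOffset; // assumed to wrap a dedicated metadata client
    volatile long consumedOffset = -1;       // written by the consumer thread
    volatile long offsetLag = -1;            // written by the refresh task

    PartitionState(LongSupplier latestUpstreamOffset) {
      this.latestUpstreamOffset = latestUpstreamOffset;
    }
  }

  private final Map<Integer, PartitionState> partitions = new ConcurrentHashMap<>();
  // False for stream types where an offset-based lag is not meaningful.
  private final boolean offsetLagSupported;

  IngestionLagTracker(boolean offsetLagSupported) {
    this.offsetLagSupported = offsetLagSupported;
  }

  /** Called when the server starts hosting a partition. */
  void register(int partition, LongSupplier latestUpstreamOffset) {
    partitions.put(partition, new PartitionState(latestUpstreamOffset));
  }

  /** Called when the partition leaves the server; the lag value disappears with it. */
  void unregister(int partition) {
    partitions.remove(partition);
  }

  /** Called from the consumption loop: a map lookup and a volatile write, no upstream call. */
  void recordConsumedOffset(int partition, long offset) {
    PartitionState state = partitions.get(partition);
    if (state != null) {
      state.consumedOffset = offset;
    }
  }

  /** Called periodically from a non-consumer thread, e.g. a ScheduledExecutorService task. */
  void refresh() {
    if (!offsetLagSupported) {
      return; // never compute a metric that is irrelevant for this stream type
    }
    for (PartitionState state : partitions.values()) {
      long consumed = state.consumedOffset;
      if (consumed >= 0) {
        long latest = state.latestUpstreamOffset.getAsLong();
        state.offsetLag = Math.max(0L, latest - consumed);
      }
    }
  }

  /** Empty when the partition is not hosted, not yet refreshed, or the metric is unsupported. */
  OptionalLong offsetLag(int partition) {
    PartitionState state = partitions.get(partition);
    if (state == null || state.offsetLag < 0) {
      return OptionalLong.empty();
    }
    return OptionalLong.of(state.offsetLag);
  }
}
```

Because each partition's state is held in a single map entry, `unregister` 
removes the metric in one step, and a `refresh()` that races with it only 
updates an object that is no longer reachable through the tracker.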



