Vanlightly commented on pull request #10534:
URL: https://github.com/apache/pulsar/pull/10534#issuecomment-899386900


   @equanz Let me know if I understand correctly. We publish aggregated stats, 
where we aggregate by partitioned producer, and optionally we also publish 
stats per partition with no aggregation. So the issue is that currently the 
stats collection code expects each partition to have the same number of 
producers on each partition, and uses that as a way of aggregating the stats 
for each partitioned producer. 
   
   So we need another way of aggregating by partitioned producer. Moreover, it 
looks to me like the current strategy is already broken seeing as the numbers 
won't always match (neither numbers nor order) such as when producers come and 
go.
   
   Using producer name as an aggregation key is not good because if the 
producer name is not set by the user, then each internal producer will have a 
different producer name (generated by the broker).
   
   So the fix is to add another producer identity (producerStatsKey) that is 
shared across all producers of the same partitioned producer. Then aggregate on 
that. 
   
   The producerStatsKey is set by the broker, to ensure it is globally unique. 
The partitioned producer creates a single producer when the partitioned 
producer is started. Then when additional internal producers for other 
partitions are required, they are created with the same  producerStatsKey as 
the first producer. That way they all share the same key.
   
   Is that summary correct?
   
   Additional question: 
   - Could we not use the producerName for aggregation and use the same 
strategy of starting a single producer, then making all others inherit their 
name from the first (when the user doesn't set the name)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to