[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/4935 Hi, @tzulitai After i read kafkaConsumer code again, i found that the per partition kafka lag metric is register in method `FetchManagerMetrics#recordPartitionLag` But the when the client get the num equal to `max.poll.records ` at once poll, it will return the record it polls in advance left some partition haven't not been `sendFetches` to. So some partition will be lost. In test , if we just poll once , then register kafka metric , if i have many partition like about(100), some partition lag metric will be losed. So i think, with a configurable property, users can choose to when they have too many partition, and will do little harmless to the performance . Please let me know your idea ,thanks ---
[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/4935 Hi @Aitozi, sorry for the long delay in relaying back to this PR. I'm still not convinced that this is a sane solution. For example, what is a "good" setting for the `KEY_REGISTER_TIMES` property? Isn't 1 enough, since you mentioned that the missing metric is registered in Kafka after the first poll. Making this configurable seems unnecessary to me. I wonder if we can try the following two approaches: 1) Manually register the metrics that we know would be missing before the first poll, or 2) Poll once first outside the loop just to make sure that all Kafka metrics are existent, perform the metrics registration, and then start the regular fetch loop. What do you think? ---
[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/4935 ping @tzulitai ~ ---
[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/4935 Hi @tzulitai , could you take look at this again :-) ? ---
[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/4935 update the code according to the comment. ping @tzulitai ---
[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/4935 One other side note: Please squash your commits into a single one, with an appropriate commit message (you can refer to the other commits in the codebase as a starter). `[FLINK-7945][Metrics]Fix per partition-lag metric lost in kafka connector` is good, but we would like that to be the commit message also. ---
[GitHub] flink issue #4935: [Flink-7945][Metrics]Fix per partition-lag metr...
Github user Aitozi commented on the issue: https://github.com/apache/flink/pull/4935 cc @zentol @tzulitai please help review the code. ---