[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3106 @StephanEwen I'll add the default initialization of ```counter``` while merging. Typing the counters to a ```SimpleCounter``` would affect a lot of other classes though (especially since we can extend this to the ```OperatorIOMetricGroup```), so I would like to do that as part of another issue. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3106 +1 to this approach. Some suggestions for small performance improvements, not specific to this case, but also applicable to other cases: - It may make sense to type the counters at that level to the "SimpleCounter", if we anyways provide them. The TaskIOMetricGroup could type its counters to "SimpleCounter" as well. Since that class it not final, and future mocks/checks can be implemented on that class as well - We can make sure that the field `counter` is always initialized by initially assigning a standalone SimpleCounter. That way we could drop the null checks in the code. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3106 The code checks for null since there is no *technical* contract that the returned value is null. They aren't strictly necessary, and are only meant to guard against programming errors in the metrics system. Using a ```NullCounter``` would indeed do the same, would however introduce effectively dead code. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user xhumanoid commented on the issue: https://github.com/apache/flink/pull/3106 @zentol I asked because you always check on null when you try writing to Counter or is it prevent uninitialized state? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3106 @xhumanoid The metrics returned by the TaskIOMetricGroup can't actually be null, so I wouldn't put too much thought into dealing with that case. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user xhumanoid commented on the issue: https://github.com/apache/flink/pull/3106 @zentol what do you think about remove if (numBytesOut != null) { and replace numBytesOut = metrics.getNumBytesOutCounter(); with + if (metrics.getNumBytesOutCounter() != null) { +numBytesOut = metrics.getNumBytesOutCounter(); + } else { +numBytesOut = new NullCounter(); + } where NullCounter have empty implementation for every method, prof: we do null check in one place, because sometime we may forget to do it cons: sometimes we broke devirtualization and inlining for counter.inc(..) method --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3106 @uce Could you take another look? I moved the bytesOut counter into the RecordWriter. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user uce commented on the issue: https://github.com/apache/flink/pull/3106 Yeah, I think counting the buffers is the better way to go. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3106 Then we could count per buffer. I'd rather not count per record since the deserializer are kind of complicated :( --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user zentol commented on the issue: https://github.com/apache/flink/pull/3106 Instead of counting bytesOut in the serializer we could do it one level up in the RecordWriter. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #3106: [FLINK-5118] Fix inconsistent numBytesIn/Out metrics
Github user uce commented on the issue: https://github.com/apache/flink/pull/3106 Good catch. I was wondering whether it is better to count in a consistent way, too, e.g. count the size of each buffer or each record on both A and B (now we have per record on the output side and per buffer on the input side). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---