[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834460#comment-15834460
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3106


> Inconsistent records sent/received metrics
> --
>
> Key: FLINK-5118
> URL: https://issues.apache.org/jira/browse/FLINK-5118
> Project: Flink
> Issue Type: Bug
> Components: Metrics, Webfrontend
> Affects Versions: 1.2.0, 1.3.0
> Reporter: Ufuk Celebi
> Assignee: Chesnay Schepler
> Fix For: 1.2.0, 1.3.0
>
>
> In 1.2-SNAPSHOT, when running a large-scale job, the counts for 
> sent/received records are inconsistent; e.g., in a simple word count job we 
> see more received records/bytes than sent. This is a regression from 
> 1.1, where everything works as expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826057#comment-15826057
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3106
  
@StephanEwen I'll add the default initialization of ```counter``` while 
merging.

Typing the counters to a ```SimpleCounter``` would affect a lot of other 
classes though (especially since we can extend this to the 
```OperatorIOMetricGroup```), so I would like to do that as part of another 
issue. 




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823968#comment-15823968
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3106
  
+1 to this approach.

Some suggestions for small performance improvements, not specific to this 
case but applicable to other cases as well:
  - It may make sense to type the counters at that level to 
"SimpleCounter", since we provide them anyway. The TaskIOMetricGroup could type 
its counters to "SimpleCounter" as well. Since that class is not final, 
future mocks/checks can still be implemented on top of it.
  - We can make sure that the field `counter` is always initialized by 
initially assigning a standalone SimpleCounter. That way we could drop the null 
checks in the code.
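
The second suggestion could look roughly like the sketch below. It is only an illustration: `Counter`, `SimpleCounter`, and `Writer` here are hypothetical stand-ins defined inside the example, not Flink's actual classes, and `setMetricCounter` is an invented name for whatever hook later hands over the registered counter.

```java
import java.util.Objects;

public class DefaultCounterSketch {

    /** Hypothetical stand-in for a metrics counter interface. */
    interface Counter {
        void inc(long n);
        long getCount();
    }

    /** Hypothetical stand-in for a simple, non-thread-safe counter. */
    static class SimpleCounter implements Counter {
        private long count;

        @Override
        public void inc(long n) {
            count += n;
        }

        @Override
        public long getCount() {
            return count;
        }
    }

    /** Hypothetical writer whose counter field is never null. */
    static class Writer {
        // Start with a standalone counter so inc() is always safe to call.
        private Counter counter = new SimpleCounter();

        /** Swaps in the counter registered with the metrics system (assumed hook). */
        void setMetricCounter(Counter registered) {
            this.counter = Objects.requireNonNull(registered);
        }

        void emit(int serializedLength) {
            counter.inc(serializedLength); // no null check needed on the hot path
        }

        long bytesOut() {
            return counter.getCount();
        }
    }

    public static void main(String[] args) {
        Writer writer = new Writer();
        writer.emit(10); // counted by the standalone default counter
        System.out.println(writer.bytesOut());
    }
}
```

Counts recorded before the registered counter is swapped in stay on the standalone instance, so this sketch assumes the swap happens before any records are emitted.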




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823662#comment-15823662
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3106
  
The code checks for null since there is no *technical* contract that the 
returned value is non-null. The checks aren't strictly necessary, and are only 
meant to guard against programming errors in the metrics system.

Using a ```NullCounter``` would indeed do the same, but would introduce 
effectively dead code.




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823272#comment-15823272
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user xhumanoid commented on the issue:

https://github.com/apache/flink/pull/3106
  
@zentol I asked because you always check for null before writing to the 
Counter. Or is that meant to guard against uninitialized state?




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823265#comment-15823265
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3106
  
@xhumanoid The metrics returned by the TaskIOMetricGroup can't actually be 
null, so I wouldn't put too much thought into dealing with that case.




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823250#comment-15823250
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user xhumanoid commented on the issue:

https://github.com/apache/flink/pull/3106
  
@zentol 
What do you think about removing
if (numBytesOut != null) {

and replacing
 numBytesOut = metrics.getNumBytesOutCounter();

with

+ if (metrics.getNumBytesOutCounter() != null) {
+     numBytesOut = metrics.getNumBytesOutCounter();
+ } else {
+     numBytesOut = new NullCounter();
+ }

where NullCounter has an empty implementation for every method?

Pros:
We do the null check in one place, so we cannot forget it at individual call sites.

Cons:
We may sometimes break devirtualization and inlining of the counter.inc(..) method.
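
For illustration, the proposal above is the null-object pattern. A minimal, self-contained sketch follows; the `Counter` interface, `SimpleCounter`, and the `orNoOp` helper are hypothetical names defined only for this example and are not taken from Flink.

```java
public class NullCounterSketch {

    /** Hypothetical stand-in for a metrics counter interface. */
    interface Counter {
        void inc(long n);
        long getCount();
    }

    /** Counter that ignores all updates, as described in the proposal. */
    static final class NullCounter implements Counter {
        @Override
        public void inc(long n) {
            // intentionally empty
        }

        @Override
        public long getCount() {
            return 0L;
        }
    }

    /** Hypothetical real counter used for comparison. */
    static final class SimpleCounter implements Counter {
        private long count;

        @Override
        public void inc(long n) {
            count += n;
        }

        @Override
        public long getCount() {
            return count;
        }
    }

    /** Performs the null check once, at initialization time. */
    static Counter orNoOp(Counter candidate) {
        return candidate != null ? candidate : new NullCounter();
    }

    public static void main(String[] args) {
        Counter missing = orNoOp(null);
        missing.inc(42); // safe: silently dropped
        Counter real = orNoOp(new SimpleCounter());
        real.inc(42);
        System.out.println(missing.getCount() + " " + real.getCount());
    }
}
```

The helper reads the candidate once instead of calling the getter twice. The stated con still applies: call sites of `inc` may now see more than one `Counter` implementation.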




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821625#comment-15821625
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3106
  
@uce Could you take another look? I moved the bytesOut counter into the 
RecordWriter.





[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821239#comment-15821239
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user uce commented on the issue:

https://github.com/apache/flink/pull/3106
  
Good catch. I was wondering whether it is better to count in a consistent 
way, too, e.g. count the size of each buffer or each record on both A and B 
(now we have per record on the output side and per buffer on the input side).




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821232#comment-15821232
 ] 

ASF GitHub Bot commented on FLINK-5118:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/3106#discussion_r95812448
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpanningRecordSerializer.java
 ---
@@ -99,7 +99,7 @@ public SerializationResult addRecord(T record) throws IOException {
         this.lengthBuffer.putInt(0, len);
 
         if (numBytesOut != null) {
-            numBytesOut.inc(len);
+            numBytesOut.inc(len + 4);
--- End diff --

I think this warrants both a comment and a test.
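
As a rough illustration of why the `+ 4` matters, the sketch below assumes (as the `lengthBuffer.putInt(0, len)` call in the diff suggests) that each record is written behind a 4-byte length field. The class and the `frame` method are hypothetical and only mimic that layout; they are not the serializer's real code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthPrefixSketch {

    /** Size of the assumed length header: one int. */
    static final int LENGTH_HEADER_BYTES = Integer.BYTES;

    /** Writes a 4-byte length header followed by the payload. */
    static ByteBuffer frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(LENGTH_HEADER_BYTES + payload.length);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip(); // switch the buffer to read mode
        return buf;
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        ByteBuffer framed = frame(payload);

        long countedPayloadOnly = payload.length;                     // sender counts len
        long countedWithHeader = payload.length + LENGTH_HEADER_BYTES; // sender counts len + 4
        long seenByReceiver = framed.remaining();                      // receiver counts buffer bytes

        System.out.println("payload only: " + countedPayloadOnly);
        System.out.println("with header:  " + countedWithHeader);
        System.out.println("received:     " + seenByReceiver);
    }
}
```

Under that assumption, a sender that counts only `len` reports four bytes fewer per record than a receiver that counts whole buffers, which matches the direction of the mismatch reported in the issue.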




[jira] [Commented] (FLINK-5118) Inconsistent records sent/received metrics

2017-01-12 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821166#comment-15821166
 ] 

Chesnay Schepler commented on FLINK-5118:
-

PR at https://github.com/apache/flink/pull/3106
