[GitHub] storm issue #2399: Track network data metrics STORM-2793

2017-11-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/2399
  
And please change the title to 'STORM-2793 Track network data metrics' to 
conform to 
https://github.com/apache/storm/blob/master/DEVELOPER.md#create-a-pull-request




---


[GitHub] storm issue #2399: Track network data metrics STORM-2793

2017-11-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/2399
  
Looks like the patch addresses a bit different point rather than issue 
description in STORM-2793. 

Below is the issue description in STORM-2793:

> Existing Storm metrics track tuple counts, but don't account for the size 
of the tuple payload.  If a task always gets large tuples, it can be slower 
than a task that gets small tuples without showing any reason why.  It would be 
good to track the byte counts as well so data skew can be observed directly.

from the description I imagined that the issue would want to track the 
tuples in bytes for every tasks (maybe with sampling), but this patch only 
addresses tracking `transferred` tuples, which ignores local tasks.

I can still see the benefit with measuring transferred tuples, but just 
like to check the patch fits origin intention for the issue.

In addition, currently we allow to put map to value in metric and let 
metrics collector handles it in any way. It should be changed, but we could 
resolve it along with other metrics later.

Overall looks good to me, and it would be much better if we could see the 
performance impact and confirm it's small enough.


---