However, the feature may not be available in other metrics services/libraries, so it is a valid concern, especially for heavy topologies.
The new metrics could be very useful though. I am wondering if it might be useful to have two sets of metrics: one exposed to external observability, another used for internal processes.

On Wed, Apr 18, 2018 at 9:17 PM, Karthik Ramasamy <[email protected]> wrote:

> I like the idea of both the metrics and it might be great to include them.
>
> Prometheus can aggregate metrics downstream by component-id/source-task etc.
> It is a nice tool.
>
> cheers
> /karthik
>
> On Wed, Apr 18, 2018 at 8:32 PM, Fu Maosong <[email protected]> wrote:
>
> > One concern is that it will significantly increase the number of metrics,
> > potentially leading to performance concerns.
> >
> > 2018-04-18 18:58 GMT-07:00 Thomas Cooper <[email protected]>:
> >
> > > Hi All,
> > >
> > > This started out as a quick Slack post, then a reasonably sized email,
> > > and now it has headings!
> > >
> > > *Introduction*
> > >
> > > I am working on a performance modeling system for Heron. Hopefully this
> > > system will be useful for checking whether proposed plans will meet
> > > performance targets, and also for checking whether currently running
> > > physical plans will have back-pressure issues at higher traffic rates.
> > >
> > > To do this I need to know what proportion of tuples are routed from each
> > > upstream instance to its downstream instances, which is a metric that
> > > Heron does not provide by default.
> > >
> > > *Proposal*
> > >
> > > I have implemented a custom metric to do what I need in my test
> > > topologies. It is a simple multi-count metric called "__receive-count",
> > > where the key now includes the "sourceTaskId" value (which you can get
> > > from the tuple instance) as well as the source component name and
> > > incoming stream name.
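[Editor's note: the per-source counting described in the quoted proposal could be sketched roughly as below. This is a simplified stand-in using a plain map rather than Heron's actual metric classes; the class and method names here are illustrative, not Heron's API.]

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a multi-count metric: one counter per string key.
// It only illustrates the proposed keying scheme for "__receive-count",
// where the key combines source component, source task ID and stream.
class ReceiveCount {
    private final Map<String, Long> counts = new HashMap<>();

    // Called once per received tuple.
    void record(String sourceComponent, int sourceTaskId, String streamId) {
        String key = sourceComponent + "/" + sourceTaskId + "/" + streamId;
        counts.merge(key, 1L, Long::sum);
    }

    long get(String sourceComponent, int sourceTaskId, String streamId) {
        return counts.getOrDefault(
            sourceComponent + "/" + sourceTaskId + "/" + streamId, 0L);
    }
}

class ReceiveCountSketch {
    public static void main(String[] args) {
        ReceiveCount metric = new ReceiveCount();
        // Two tuples from task 3 of "splitter", one from task 5: the
        // per-task keys let us see the skew between upstream instances.
        metric.record("splitter", 3, "default");
        metric.record("splitter", 3, "default");
        metric.record("splitter", 5, "default");
        System.out.println(metric.get("splitter", 3, "default")); // 2
        System.out.println(metric.get("splitter", 5, "default")); // 1
    }
}
```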
> > >
> > > This is basically the same as the default "__execute-count" metric, but
> > > the metric name format is
> > > "__receive-count/<source-component>/<source-task-ID>/<incoming-stream>"
> > > instead of "__execute-count/<source-component>/<incoming-stream>".
> > >
> > > So I see two options:
> > >
> > >    1. Create a new "__receive-count" metric and leave "__execute-count"
> > >    alone.
> > >    2. Alter "__execute-count" to include the source task ID.
> > >
> > > *Questions*
> > >
> > > My first question is whether the metric name is parsed anywhere further
> > > down the line, such as when aggregating component metrics in the metrics
> > > manager, so that changing the name would break things?
> > >
> > > My second is: if we do change "__execute-count", should we also add the
> > > source task ID to other bolt metrics like "__execute-latency"? It would
> > > be nice to see how latency changes by source instance --- this is a
> > > particular issue in two consecutive fields-grouped components, as
> > > instances will receive very different key distributions, which could
> > > lead to very different processing latencies.
> > >
> > > *Implementation*
> > >
> > > Adding this to the default metrics (or changing "__execute-count") seems
> > > like it would be reasonably straightforward (famous last words). We
> > > would need to modify the `FullBoltMetric` class to include the new
> > > metrics (if required) and edit the `FullBoltMetric.executeTuple` method
> > > to accept the "sourceTaskId" (which is already available in the
> > > "BoltInstance.readTuplesAndExecute" method) as a 4th argument.
> > >
> > > Obviously, we will need to do the same with the Python implementation.
> > > Will this also need to be changed in the Storm compatibility layer?
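[Editor's note: the naming concern raised in the quoted questions can be made concrete with the two formats given above. The sketch below builds both names and shows why any downstream consumer that splits the name on "/" and expects a fixed number of segments would break if the task ID were inserted into the existing "__execute-count" name. The helper names are illustrative only, not Heron code.]

```java
class MetricNameFormats {
    // Proposed: __receive-count/<source-component>/<source-task-ID>/<incoming-stream>
    static String receiveCountName(String component, int taskId, String stream) {
        return "__receive-count/" + component + "/" + taskId + "/" + stream;
    }

    // Existing: __execute-count/<source-component>/<incoming-stream>
    static String executeCountName(String component, String stream) {
        return "__execute-count/" + component + "/" + stream;
    }

    // A consumer parsing the name by splitting on '/' sees a different
    // number of segments for each format -- the compatibility risk of
    // option 2 (altering "__execute-count" in place).
    static String[] segments(String metricName) {
        return metricName.split("/");
    }

    public static void main(String[] args) {
        System.out.println(receiveCountName("splitter", 7, "default"));
        // Existing format: 3 segments; proposed format: 4 segments.
        System.out.println(segments(executeCountName("splitter", "default")).length);
        System.out.println(segments(receiveCountName("splitter", 7, "default")).length);
    }
}
```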
> > >
> > > *Conclusion*
> > >
> > > Having the information on where tuples are flowing is really important
> > > if we want to be able to do more intelligent routing and adaptive
> > > auto-scaling in the future, and hopefully this one small change/extra
> > > metric won't add any significant processing overhead.
> > >
> > > I look forward to hearing what you think.
> > >
> > > Cheers,
> > >
> > > Tom Cooper
> > > W: www.tomcooper.org.uk | Twitter: @tomncooper
> > > <https://twitter.com/tomncooper>
> >
> > --
> > With my best Regards
> > ------------------
> > Fu Maosong
> > Twitter Inc.
> > Mobile: +001-415-244-7520
