However, the feature may not be available in other metrics services/libraries,
so it is a valid concern, especially for heavy topologies.

The new metrics could be very useful, though. I am wondering whether it might
be useful to have two sets of metrics: one exposed for external
observability, and another used for internal processes.
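For reference, the metric under discussion in the thread below can be sketched
as a simple multi-count keyed by source component, source task ID, and
incoming stream. This is a minimal illustrative sketch, not Heron's actual
`MultiCountMetric` implementation; the class, method, and key names here are
assumptions modelled on the format Tom describes:

```python
# Sketch of the proposed "__receive-count" metric: a multi-count keyed by
# source component, source task ID, and incoming stream. Hypothetical names;
# not Heron's real implementation.
from collections import defaultdict


class MultiCountMetric:
    """Counts tuples under per-key buckets, in the style of a multi-count metric."""

    def __init__(self):
        self._counts = defaultdict(int)

    def incr(self, key, amount=1):
        self._counts[key] += amount

    def get_value_and_reset(self):
        # Metrics are typically snapshotted and reset on each collection interval.
        snapshot = dict(self._counts)
        self._counts.clear()
        return snapshot


def receive_count_key(source_component, source_task_id, incoming_stream):
    # Key format proposed in the thread:
    # "__receive-count/<source-component>/<source-task-ID>/<incoming-stream>"
    return "/".join([source_component, str(source_task_id), incoming_stream])


metric = MultiCountMetric()
metric.incr(receive_count_key("splitter", 3, "default"))
metric.incr(receive_count_key("splitter", 3, "default"))
metric.incr(receive_count_key("splitter", 4, "default"))
print(metric.get_value_and_reset())
# prints {'splitter/3/default': 2, 'splitter/4/default': 1}
```

The per-task key is what lets a downstream system recover the proportion of
tuples each upstream instance sends to a given downstream instance.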

On Wed, Apr 18, 2018 at 9:17 PM, Karthik Ramasamy <[email protected]>
wrote:

> I like the idea of both metrics, and it would be great to include them.
>
> Prometheus can aggregate metrics downstream by component-id, source-task,
> etc. It is a nice tool.
>
> cheers
> /karthik
>
> On Wed, Apr 18, 2018 at 8:32 PM, Fu Maosong <[email protected]> wrote:
>
> > One concern is that it will significantly increase the number of metrics,
> > potentially leading to performance concerns.
> >
> > 2018-04-18 18:58 GMT-07:00 Thomas Cooper <[email protected]>:
> >
> > > Hi All,
> > >
> > > This started out as a quick Slack post, then became a reasonably sized
> > > email, and now it has headings!
> > >
> > > *Introduction*
> > >
> > > I am working on a performance modeling system for Heron. Hopefully this
> > > system will be useful for checking whether proposed plans will meet
> > > performance targets, and also for checking whether currently running
> > > physical plans will suffer back pressure issues at higher traffic rates.
> > >
> > > To do this, I need to know what proportion of tuples are routed from
> > > each upstream instance to its downstream instances, which is a metric
> > > that Heron does not provide by default.
> > >
> > > *Proposal*
> > >
> > > I have implemented a custom metric to do what I need in my test
> > > topologies. It is a simple multi-count metric called "__receive-count",
> > > where the key now includes the "sourceTaskId" value (which you can get
> > > from the tuple instance) as well as the source component name and the
> > > incoming stream name.
> > >
> > > This is basically the same as the default "__execute-count" metric, but
> > > the metric name format is
> > > "__receive-count/<source-component>/<source-task-ID>/<incoming-stream>"
> > > instead of "__execute-count/<source-component>/<incoming-stream>".
> > >
> > > So I see two options:
> > >
> > >    1. Create a new "__receive-count" metric and leave "__execute-count"
> > >       alone, or
> > >    2. Alter "__execute-count" to include the source task ID.
> > >
> > > *Questions*
> > >
> > > My first question is whether the metric name is parsed anywhere further
> > > down the line, such as when aggregating component metrics in the Metrics
> > > Manager, so that changing the name would break things?
> > >
> > > My second is whether, if we do change "__execute-count", we should also
> > > add the source task ID to other bolt metrics like "__execute-latency".
> > > It would be nice to see how latency varies by source instance; this is a
> > > particular issue with two consecutive fields-grouped components, as
> > > instances will receive very different key distributions, which could
> > > lead to very different processing latencies.
> > >
> > > *Implementation*
> > >
> > > Adding this to the default metrics (or changing "__execute-count") seems
> > > like it would be reasonably straightforward (famous last words). We
> > > would need to modify the `FullBoltMetric` class to include the new
> > > metrics (if required) and edit the `FullBoltMetric.executeTuple` method
> > > to accept the "sourceTaskId" (which is already available in the
> > > "BoltInstance.readTuplesAndExecute" method) as a fourth argument.
> > >
> > > Obviously, we will need to do the same for the Python implementation.
> > > Will this also need to be changed in the Storm compatibility layer?
> > >
> > > *Conclusion*
> > >
> > > Having information on where tuples are flowing is really important if
> > > we want to be able to do more intelligent routing and adaptive
> > > auto-scaling in the future, and hopefully this one small change/extra
> > > metric won't add any significant processing overhead.
> > >
> > > I look forward to hearing what you think.
> > >
> > > Cheers,
> > >
> > > Tom Cooper
> > > W: www.tomcooper.org.uk  | Twitter: @tomncooper
> > > <https://twitter.com/tomncooper>
> > >
> >
> >
> >
> > --
> > With my best Regards
> > ------------------
> > Fu Maosong
> > Twitter Inc.
> > Mobile: +001-415-244-7520
> >
>
