Hi devs,

I'm trying to push Storm metrics into the Ambari Metrics Service (AMS).
StormTimelineMetricsSink is available, but I found multiple issues with it,
so I'm trying to fix them.
(I know StormTimelineMetricsSink is off by default due to AMBARI-13237
<https://issues.apache.org/jira/browse/AMBARI-13237>. I'm also trying to
improve the metrics feature of Storm to get past that.)

The other issues are being fixed, so I can push metrics to the AMS collector
at the task level, but I'm stuck on aggregation.
I'm not sure whether the current AMS can handle this or whether I should use
a workaround.
The problem is explained below:

- Storm publishes metrics at the task level, and the 'metric name' is not
unique in Storm. Different topologies, components (Spout / Bolt), and tasks
can have the same metric name.
- Users normally want to see metrics at the component level, not the task
level. To achieve this, we need to aggregate metric values by applying sum
or avg.
- Graphite supports wildcards in its query API, so the task part of the name
could be replaced with a wildcard. Other time-series DBs support tags, so the
task can be placed there. I couldn't find a comparable feature in AMS.
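
To make the aggregation concrete, here is a minimal Python sketch of what I
mean by rolling task-level values up to the component level. The tuple layout,
the function name, and the sample values are all made up for illustration;
none of this is an actual Storm or AMS API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical task-level datapoints:
# (topology, component, task_id, metric_name, value)
points = [
    ("wordcount", "split-bolt", 3, "execute-count", 120),
    ("wordcount", "split-bolt", 4, "execute-count", 80),
    ("wordcount", "count-bolt", 5, "execute-count", 200),
]

def aggregate_by_component(points, how="sum"):
    """Collapse task-level values into one value per (topology, component, metric)."""
    grouped = defaultdict(list)
    for topology, component, _task_id, metric, value in points:
        # The task id is dropped from the key, so tasks of the same
        # component end up in the same bucket.
        grouped[(topology, component, metric)].append(value)
    reduce_fn = sum if how == "sum" else mean
    return {key: reduce_fn(values) for key, values in grouped.items()}
```

With a wildcard or tag query this grouping would happen at read time in the
metrics store; without one, something on the write path has to do it.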

I'm trying to have the Storm sink aggregate metric values by component + task +
metric name. But since this is done on the sink side, the workaround has two
downsides:

- The parallelism hint of the Storm sink must be 1 in order to aggregate.
- Aggregation has to be done on the sink side. This means the Storm sink needs
complicated configuration specifying which metric name patterns should use
sum and which should use avg.
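
As a rough idea of the configuration burden in the second point, the sink
would need something like the rule table below. This is only a sketch: the
rule list, the glob patterns, and pick_aggregator are hypothetical names, not
existing options of StormTimelineMetricsSink.

```python
import fnmatch
from statistics import mean

# Hypothetical sink configuration: the first matching pattern wins.
AGGREGATION_RULES = [
    ("*-latency", mean),  # latencies make sense as an average across tasks
    ("*-count", sum),     # counters make sense as a total across tasks
]

def pick_aggregator(metric_name, rules=AGGREGATION_RULES, default=sum):
    """Return the function used to combine task-level values for a metric name."""
    for pattern, fn in rules:
        if fnmatch.fnmatchcase(metric_name, pattern):
            return fn
    # Falling back to sum is an arbitrary choice made for this sketch.
    return default
```

Every user-defined metric would need a matching rule, which is why I'd rather
not push this into the sink.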

So I would really like to solve the aggregation problem without a workaround.
Is there a way to aggregate task-level values so they can be shown at the
component level?

Thanks in advance!

Best Regards,
Jungtaek Lim (HeartSaVioR)
