The scheduler does not actually have access to those stats, even though Nimbus
has access to them. There is no API to get to them. I would be a bit nervous
about exposing them, because the metrics as they are now are very noisy.
complete_ms_avg is only set if acking is enabled. All of the metrics cover
only workers that are currently up and running. A feedback loop to the
scheduler for metrics is important, but the current way we collect metrics is,
in my opinion, just way too noisy to be useful.
- Bobby
On Friday, February 26, 2016 3:12 AM, Anirudh Jayakumar
<[email protected]> wrote:
Hi,
Are the latency and throughput metrics of a topology available in Nimbus?
I'm working on a dynamic load-balancer running within Nimbus. To make
intelligent decisions, I need these two metrics along with a few others.
From digging into the code, I found "SpoutStats" made available through
TopologyInfo. I would like to know if my understanding of the fields in
SpoutStats is correct.
acked - This is an aggregate number of the spout-tuples that completed the
tuple-tree.
failed - This is an aggregate number of the spout-tuples that failed to
complete the tuple-tree.
complete_ms_avg - The avg time from when the tuple was emitted by the spout
to the time when the tuple-tree is completed.
1. By periodically getting the acked value I can calculate the throughput
of the topology for that period.
2. complete_ms_avg gives an accurate representation of the latency.
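A minimal sketch of the calculation in point 1, written in plain Python rather than against the Storm API. It assumes the acked value can be read as a single cumulative counter at known times; AckedSample and throughput are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AckedSample:
    """One periodic reading (hypothetical type, not part of Storm)."""
    timestamp_s: float  # time the stats were read, in seconds
    acked: int          # assumed cumulative count of acked spout-tuples


def throughput(prev: AckedSample, curr: AckedSample) -> Optional[float]:
    """Acked tuples per second between two samples, or None if unusable."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    delta = curr.acked - prev.acked
    # A non-positive interval or a counter that went backwards (for example
    # after a worker restart) cannot yield a meaningful rate.
    if elapsed <= 0 or delta < 0:
        return None
    return delta / elapsed


# Two readings taken 10 seconds apart
rate = throughput(AckedSample(0.0, 100), AckedSample(10.0, 600))
print(rate)
```

Returning None instead of a negative rate keeps a restarted worker from being mistaken for a drop in throughput.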
It would be great if someone could validate my understanding.
Thanks,
Anirudh