Is there a good way to measure how contended a cluster is in terms of
inbound/outbound queues?

I'm using Storm 1.0.2 and have noticed that at times tuples flowing
through a topology slow down considerably.

Load on each of the 5 nodes in the cluster is low, and the network doesn't
appear to be a bottleneck. Sometimes, if I redeploy or rebalance the
topology, throughput increases dramatically for a day or so.

I have topology.max.spout.pending set to 30, with 8 spouts feeding 40
"writer" bolts. The capacity metric for the busiest bolt is around 0.780,
which seems to indicate that the bolts aren't the bottleneck.
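
To make those numbers concrete, here is a rough sketch of the arithmetic
as I understand it. It assumes each of the 8 spouts is a single task (the
pending limit applies per spout task), that capacity is computed as
executed count times average execute latency divided by the window
length, and that throughput can be bounded with Little's law. The 500 ms
complete latency and the executed/latency figures are made-up
illustrative values, not measurements from my cluster.

```python
def max_in_flight(max_spout_pending: int, spout_tasks: int) -> int:
    """Upper bound on un-acked tuples, since the limit is per spout task."""
    return max_spout_pending * spout_tasks


def throughput_ceiling(in_flight: int, complete_latency_ms: float) -> float:
    """Little's law: tuples/sec cannot exceed in-flight / complete latency."""
    return in_flight / (complete_latency_ms / 1000.0)


def capacity(executed: int, execute_latency_ms: float, window_ms: float) -> float:
    """Fraction of the window a bolt executor spent executing tuples."""
    return executed * execute_latency_ms / window_ms


# Settings from this topology: 30 pending per spout, 8 spouts (assumed 1 task each).
in_flight = max_in_flight(30, 8)

# Hypothetical complete latency of 500 ms, for illustration only.
ceiling = throughput_ceiling(in_flight, 500.0)

# Hypothetical executor: 46,800 tuples at 10 ms each over a 10-minute window.
busiest = capacity(46_800, 10.0, 600_000.0)

print(in_flight, ceiling, round(busiest, 3))
```

If this reasoning holds, a rise in complete latency would lower the
throughput ceiling directly even when no bolt is saturated, because at
most 240 tuples can be in flight at once.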

topology.message.timeout.secs is set to 120 seconds, but I'm not seeing
failures.

Additionally, I'm using tick tuples to flush the accumulated data at each
bolt to the database every 5 minutes. Between those cycles, the bolt
accumulates aggregated data and only writes if cache misses occur.
However, the cache hit rate is almost always 100%.
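
In case the flush pattern isn't clear, this is a simplified model of what
each writer bolt does, written as plain Python rather than the actual bolt
code. The class and method names are hypothetical, the list stands in for
the database, and the cache-miss write path is left out. In the real
topology the tick would come from topology.tick.tuple.freq.secs set to 300.

```python
from collections import defaultdict

TICK_FREQ_SECS = 300  # 5 minutes, mirrors the tick tuple frequency


class AggregatingWriter:
    """Hypothetical stand-in for a writer bolt: aggregate, flush on tick."""

    def __init__(self, sink):
        self.sink = sink                  # list standing in for the database
        self.pending = defaultdict(int)   # in-memory aggregates since last flush

    def on_tuple(self, key, value):
        # Normal tuples only update the in-memory aggregate.
        self.pending[key] += value

    def on_tick(self):
        # A tick triggers one batched write, then the aggregates are reset.
        if self.pending:
            self.sink.append(dict(self.pending))
            self.pending.clear()


db = []
writer = AggregatingWriter(db)
writer.on_tuple("a", 1)
writer.on_tuple("a", 2)
writer.on_tuple("b", 5)
writer.on_tick()   # one batched write containing both keys
writer.on_tick()   # nothing pending, so nothing is written
print(db)
```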

-Tom
