I have a topology in which one of the bolts read from HBase. That bolt is
setup to have one task per executor, and it got tuples from shuffle
grouping so every executor of the bolt will have the same number of tuples.

The problem is that some executors will take longer than others (because of
hot-spotting in HBase region servers). For example, 5% of the executors
have latency of 100ms while the rest 95% have around 25ms. Now with the
guarantee of equal number of tuples per executors, my understanding is that
95% of the fast executors will have to wait. Thus it brings down the
throughput. Please correct me if I mis-understand.

Ideally, I would love to have a load-balance-biased shuffle grouping so
that the 95% of the fast executors would get more tuples.

Is this something I can leverage other existing groupings or patterns to
implement? In an earlier post entitled "long running bolts", Mike Thomsen
and Nathan Leung discussed a nice idea of taking long running tuples
elsewhere (paraphrase below):

"*create tasks for the bolt using the same class but different name in the
topology... route long running bolts (without acks) to the separate
instances and they will not affect your normal processing*"

Since my long running executors are not taking that long (just around
100ms), this may not be worth the effort to take the tuples elsewhere.

I would appreciate any comment and suggestions. Many thanks.

BH

Reply via email to