Thanks Jungtaek and Matthias.

On Sun, Jun 14, 2015 at 8:15 AM, Matthias J. Sax <
[email protected]> wrote:

> Hi,
>
> the idea from Mike and Nathan does not apply to your problem because in
> your case the different execution times do not depend on the tuples but
> on the executors. Thus, on the producer side you cannot separate "slow"
> tuples from "fast" tuples.
>
> If you can identify the "slow" executors, you can implement a custom
> grouping (a CustomStreamGrouping) -- or use .directGrouping() instead of
> .shuffleGrouping() -- to send fewer tuples to the slow executors.
>
> -Matthias
>
>
>
> On 06/13/2015 06:38 PM, Banias H wrote:
> > I have a topology in which one of the bolts reads from HBase. That bolt
> > is set up to have one task per executor, and it receives tuples via
> > shuffle grouping, so every executor of the bolt gets the same number of
> tuples.
> >
> > The problem is that some executors take longer than others (because
> > of hot-spotting in HBase region servers). For example, 5% of the
> > executors have a latency of 100ms while the remaining 95% are around
> > 25ms. With an equal number of tuples per executor, my understanding is
> > that the 95% of fast executors will have to wait, which brings down the
> > throughput. Please correct me if I misunderstand.
> >
> > Ideally, I would love to have a load-balance-biased shuffle grouping so
> > that the fast 95% of executors would get more tuples.
> >
> > Can I leverage other existing groupings or patterns to implement this?
> > In an earlier post entitled "long running bolts", Mike
> > Thomsen and Nathan Leung discussed a nice idea of taking long running
> > tuples elsewhere (paraphrase below):
> >
> > "/create tasks for the bolt using the same class but different name in
> > the topology... route long running bolts (without acks) to the separate
> > instances and they will not affect your normal processing/"
> >
> > Since my long running executors are not taking that long (just around
> > 100ms), this may not be worth the effort to take the tuples elsewhere.
> >
> > I would appreciate any comment and suggestions. Many thanks.
> >
> > BH
>
>
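The custom-grouping suggestion above can be sketched without any Storm dependency: the core of a latency-aware grouping is a weighted pick over task indices, where each task's weight is the inverse of its measured average latency. The class and method names below (LatencyBiasedChooser, record, choose) are illustrative only, not Storm API; inside a real CustomStreamGrouping this logic would live in chooseTasks(), fed by latency measurements from acks.

```java
import java.util.Arrays;
import java.util.Random;

// Sketch: pick a task with probability inversely proportional to its
// measured average latency, so slow executors receive fewer tuples.
public class LatencyBiasedChooser {
    private final double[] avgLatencyMs;    // EWMA latency per task
    private static final double ALPHA = 0.1; // EWMA smoothing factor
    private final Random rng;

    public LatencyBiasedChooser(int numTasks, long seed) {
        avgLatencyMs = new double[numTasks];
        Arrays.fill(avgLatencyMs, 1.0); // optimistic start until measurements arrive
        rng = new Random(seed);
    }

    // Update the moving average when a latency measurement comes back.
    public void record(int task, double latencyMs) {
        avgLatencyMs[task] = ALPHA * latencyMs + (1 - ALPHA) * avgLatencyMs[task];
    }

    // Weighted pick: weight_i = 1 / avgLatency_i.
    public int choose() {
        double total = 0;
        for (double l : avgLatencyMs) total += 1.0 / l;
        double r = rng.nextDouble() * total;
        for (int i = 0; i < avgLatencyMs.length; i++) {
            r -= 1.0 / avgLatencyMs[i];
            if (r <= 0) return i;
        }
        return avgLatencyMs.length - 1;
    }

    public static void main(String[] args) {
        LatencyBiasedChooser c = new LatencyBiasedChooser(4, 42);
        // Simulate the scenario above: task 0 is a hot-spotted slow
        // executor (100 ms), the other three run at 25 ms.
        for (int i = 0; i < 100; i++) {
            c.record(0, 100);
            for (int t = 1; t < 4; t++) c.record(t, 25);
        }
        int[] counts = new int[4];
        for (int i = 0; i < 100_000; i++) counts[c.choose()]++;
        // The slow task should end up with roughly a quarter of the
        // traffic each fast task gets (weights 1/100 vs 1/25).
        System.out.println(Arrays.toString(counts));
    }
}
```

With a 4x latency gap the slow task's weight is a quarter of each fast task's, so it receives about 1/13 of the tuples instead of 1/4, while every executor still stays busy.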
