Hi John,

I think it *may* make sense, but without more details, such as code or sample 
data, it is hard to say.

Whenever you use a fields grouping, key distribution can come into play and 
affect scaling.
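To illustrate the key-distribution concern, here is a minimal, self-contained sketch. A fields grouping routes each tuple by hashing the grouping field modulo the consumer's task count, so a skewed key distribution concentrates load on a few tasks. The hot-key frequencies below are hypothetical, not taken from John's data:

```java
import java.util.HashMap;
import java.util.Map;

public class FieldsGroupingSkew {
    // fieldsGrouping-style routing: hash the grouping field modulo the
    // consumer's task count, so every tuple with the same key lands on
    // the same task.
    static int route(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int boltBTasks = 50;  // Bolt B parallelism from the thread
        long[] tuplesPerTask = new long[boltBTasks];

        // Hypothetical skewed workload: one hot key carries half the traffic.
        Map<String, Long> keyFreq = new HashMap<>();
        keyFreq.put("hot-key", 500_000L);
        for (int i = 0; i < 1000; i++) keyFreq.put("key-" + i, 500L);

        for (Map.Entry<String, Long> e : keyFreq.entrySet())
            tuplesPerTask[route(e.getKey(), boltBTasks)] += e.getValue();

        long max = 0, total = 0;
        for (long n : tuplesPerTask) { max = Math.max(max, n); total += n; }
        // With this distribution, one Bolt B task receives at least half
        // of all tuples, no matter how many executors you add.
        System.out.printf("hottest task handles %.0f%% of all tuples%n",
                100.0 * max / total);
    }
}
```

If profiling shows most Bolt B executors idle while one or two are saturated, skew like this, rather than raw parallelism, is the bottleneck.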

-Taylor

> On Dec 9, 2015, at 9:31 PM, John Yost <[email protected]> wrote:
> 
> Hi Everyone,
> 
> I have a large fan-in within my topology where I go from 1000 Bolt A 
> executors to 50 Bolt B executors via fieldsGrouping.  When I profile with 
> jvisualvm, it shows that each Bolt A thread spends 99% of its time in the 
> com.lmax.disruptor.BlockingWaitStrategy.waitFor method.
> 
> The topology details are as follows:
> 
> 200 workers
> 20 KafkaSpout executors
> 1000 Bolt A executors
> 50 Bolt B executors
> 
> I use fieldsGrouping from Bolt A -> Bolt B because I am caching in Bolt B, 
> building up large key/value pairs for HFile import into HBase.
> 
> I am thinking that if I add an extra bolt between Bolt A and Bolt B, using a 
> localOrShuffleGrouping to go from 1000 -> 200 locally followed by a 
> fieldsGrouping to go from 200 -> 50, it will lessen network I/O wait time.
> 
> Please let me know whether this makes sense, or if there are better ideas.
> 
> Thanks
> 
> --John
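One way to reason about the proposed re-wiring is to count the distinct executor-to-task streams that cross workers. A localOrShuffleGrouping prefers a consumer task in the same worker, so the 1000 -> 200 hop can stay in-process, and only the intermediate tier fans in over the wire. The sketch below is back-of-the-envelope stream counting under that assumption, not a measurement of the actual topology (Storm also multiplexes streams onto per-worker-pair connections, so real connection counts are lower):

```java
public class StreamCount {
    public static void main(String[] args) {
        int boltA = 1000;         // Bolt A executors
        int intermediate = 200;   // proposed intermediate bolt, one per worker
        int boltB = 50;           // Bolt B executors

        // Direct fieldsGrouping: every Bolt A executor may send to every
        // Bolt B task, so up to boltA * boltB distinct streams fan in.
        int directStreams = boltA * boltB;

        // Proposed staging: the A -> intermediate hop is kept local by
        // localOrShuffleGrouping (no network), and only the intermediate
        // executors hold fieldsGrouping streams to Bolt B.
        int stagedStreams = intermediate * boltB;

        System.out.println("direct network streams: " + directStreams);
        System.out.println("staged network streams: " + stagedStreams);
    }
}
```

Fewer, busier senders also batch better, which is where much of the network I/O saving would come from; per-tuple bytes on the fieldsGrouping hop are unchanged.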
