Hi John, I think it *may* make sense, but without more details like code/sample data, it is hard to say.
Whenever you use a fields grouping, key distribution can come into play and affect scaling.

-Taylor

> On Dec 9, 2015, at 9:31 PM, John Yost <[email protected]> wrote:
>
> Hi Everyone,
>
> I have a large fan in within my topology where I go from 1000 Bolt A
> executors to 50 Bolt B executors via fieldsGrouping. When I profile via
> jvisualvm, it shows that the Bolt A thread spends 99% of it's time in the
> com.lmax.disruptor.BlockingWaitStrategy.waitFor method.
>
> The topology details are as follows:
>
> 200 workers
> 20 KafkaSpout executors
> 1000 Bolt A executors
> 50 Bolt B executors
>
> fieldsGrouping from Bolt A -> Bolt B because I am caching in Bolt B, building
> up large Key/Value pairs for HFile import into HBase.
>
> I am thinking if I add an extra bolt between Bolt A and Bolt B where I do a
> localOrShuffleGrouping to go from 1000 -> 200 locally followed by
> fieldsGrouping to go from 200 -> 50 will lessen Network I/O wait time.
>
> Please confirm if this makes sense or if there are any other better ideas.
>
> Thanks
>
> --John
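
P.S. To make the key-distribution point concrete, here is a rough sketch of how you might check for skew offline before changing the topology. It is not Storm code: it assumes a simple hash-modulo assignment (CRC32 here, picked only because it is deterministic) as a stand-in for how a fields grouping maps a key to one of the 50 Bolt B tasks, and the function names and sample keys are made up for illustration. Feeding it a sample of your real keys would be more telling.

```python
import zlib
from collections import Counter


def assign_task(key: str, num_tasks: int) -> int:
    """Map a key to a task index via hash modulo num_tasks.

    Assumption: this only mimics the general idea of a fields grouping
    (same key -> same task); it is not Storm's actual hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def load_per_task(keys, num_tasks: int):
    """Count how many tuples each task would receive for the given keys."""
    counts = Counter(assign_task(k, num_tasks) for k in keys)
    return [counts.get(t, 0) for t in range(num_tasks)]


def skew_ratio(loads) -> float:
    """Busiest task's load divided by the mean load (1.0 means perfectly even)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


if __name__ == "__main__":
    num_tasks = 50  # matches the 50 Bolt B executors in the question

    # Hypothetical key samples: one with many distinct keys, one dominated
    # by a single hot key that accounts for half of all tuples.
    distinct_keys = [f"key-{i}" for i in range(100_000)]
    hot_keys = ["hot-key"] * 50_000 + [f"key-{i}" for i in range(50_000)]

    for name, keys in (("distinct", distinct_keys), ("hot key", hot_keys)):
        loads = load_per_task(keys, num_tasks)
        print(f"{name}: max={max(loads)} skew={skew_ratio(loads):.2f}")
```

If the skew ratio on real keys comes out well above 1, a handful of Bolt B tasks are doing most of the work, and an intermediate bolt will not change that, since the final fieldsGrouping still routes each key to the same task.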
