Hi everyone, I am currently prototyping fieldsGrouping at the KafkaSpout level rather than at the Bolt level. I am curious whether anyone else has tried this and, if so, how well it worked.
The reason I am attempting fieldsGrouping in the KafkaSpout is that I moved from fieldsGrouping to localOrShuffleGrouping between Bolt 1 and Bolt 2 in my topology. The 4-to-1 fan-in from Bolt 1 to Bolt 2 (for example, 200 Bolt 1 executors and 50 Bolt 2 executors) was dramatically slowing throughput under fieldsGrouping. It is still highly preferable to do fieldsGrouping one way or another, so that all values for a given key go to the same Bolt 2 executor; that is the impetus for attempting it in the KafkaSpout. If anyone has any thoughts on this approach, I'd very much like to hear them.
Thanks
--John
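
P.S. To make the goal concrete, here is a minimal plain-Java sketch of the property I'm after: every value for a given key ends up at the same downstream executor index. This is only an illustration, not Storm's or Kafka's actual implementation. The class and method names (KeyRouting, taskFor, route) are hypothetical, the use of String.hashCode() modulo the task count is an assumption chosen for simplicity, and 50 is just the Bolt 2 executor count from the example above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: hypothetical names, not part of Storm or Kafka.
final class KeyRouting {

    // Maps a key to one of numTasks downstream executors.
    // Assumption: String.hashCode() modulo numTasks stands in for whatever
    // hash the real grouping or partitioner uses.
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Routes {key, value} pairs and collects the values seen by each task index.
    static Map<Integer, List<String>> route(List<String[]> keyValuePairs, int numTasks) {
        Map<Integer, List<String>> byTask = new HashMap<>();
        for (String[] kv : keyValuePairs) {
            int task = taskFor(kv[0], numTasks);
            byTask.computeIfAbsent(task, t -> new ArrayList<>()).add(kv[1]);
        }
        return byTask;
    }

    public static void main(String[] args) {
        List<String[]> tuples = new ArrayList<>();
        tuples.add(new String[] {"user-1", "a"});
        tuples.add(new String[] {"user-2", "b"});
        tuples.add(new String[] {"user-1", "c"});

        // 50 = number of Bolt 2 executors in the example above.
        // Both "user-1" values are collected under the same task index.
        System.out.println(route(tuples, 50));
    }
}
```

The point is only that the key-to-executor mapping is deterministic, so wherever the grouping happens (spout or bolt), values sharing a key stay together.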
