Hi,
I am worried about scaling issues regarding my aggregation bolt.
My scenario is pretty simple:
I am sending message to Storm topology. I got one bolt which executing http
request to an external webservice and result is returned with
messageGroupId .
When I get the result I want to aggregate(Counting by messageGroupId)
And every 3 mins to persist the current map(ill use Tick bolt for that
mission).
So for example my aggregation would be simple as that:
Map<String, Integer> messageMaps;
...
@Override
public void execute(Tuple tuple..)
{
String messageGroupId=tuple.getStringByField("messageGroupId)';
String messageGroupIdKey=messageMaps.get(messageGroupId);
if(messageGroupIdKey==null)
{
messageMaps.put(messageGroupId,1);
}
else
{
int counter=messageMaps.get(messageGroupIdKey);
messageMaps.put(messageGroupIdKey,counter++);
}
Now I understand that in order to bring all executions in my topology to
this instance(to use this map) I gotta use Field-Grouping.
But how i am going to scale that bolt? else I am overloading it
if I have 800,000 messages with the same messageGroupId they all will go
to a specific bolt..
Thank you all.