Hello,

I have a simple WordCount topology that uses only BaseBasicBolts and
BaseRichSpouts.
When running with multiple workers, I see steadily increasing Heap usage until
the Heap is eventually full, which causes the workers to restart.
Increasing the Heap size does not solve the problem permanently; it only
delays it.

I have read about TOPOLOGY_MAX_SPOUT_PENDING and Guaranteed Message Processing
[1], both of which seem to be mandatory for running a stable multi-worker
topology.
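
For reference, this is roughly how I enable it (a minimal sketch, not my exact
code; the topology name, worker count, and parallelism values are placeholders,
and I'm using the pre-1.0 backtype.storm packages that match the jmap output
below):

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class Submit {
        public static void main(String[] args) throws Exception {
            Config conf = new Config();
            conf.setNumWorkers(2);        // multiple workers, as described above
            conf.setMaxSpoutPending(10);  // topology.max.spout.pending, applied per Spout task

            TopologyBuilder builder = new TopologyBuilder();
            // ... setSpout()/setBolt() wiring omitted here ...
            StormSubmitter.submitTopology("wordcount", conf, builder.createTopology());
        }
    }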

However, even though my workers are now "throttled" with this parameter, they
still run out of Heap at some point.
I am aware that "topology.max.spout.pending" is applied to each Spout instance
individually, meaning that the total number of pending tuples is this value
times the number of Spout instances.
But there is no way that a _single_ Spout instance that tolerates only 10
tuples "in flight" at a time can almost instantly fill a 1.5 GB Heap.

$ jmap -histo:live 9476

 num     #instances         #bytes  class name
----------------------------------------------
   1:       5648304      989149784  [B
   2:       5647555      135541320  backtype.storm.messaging.TaskMessage


How can 5.6 million TaskMessages be possible? Only _one single_ tuple can be in
flight at a time. Each input tuple is split into 50 tuples by the first Bolt,
and each of those is then reduced by a final CounterBolt.
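
For illustration, the first Bolt looks roughly like this (a simplified sketch;
the class and field names are placeholders for my actual code):

    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    // One input tuple fans out into ~50 output tuples. BaseBasicBolt anchors
    // every emit to the input tuple and acks the input when execute() returns.
    public class SplitBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            for (String word : input.getString(0).split(" ")) {
                collector.emit(new Values(word));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }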

My questions are the following:

1. How can I query the number of tuples currently "in flight" in my topology?
Storm must track this number itself one way or another, but how can I access
it externally?

2. How can I further troubleshoot my Out of Memory problems?
I need to know which Bolt is the bottleneck. Restarting the topology and
observing it is incredibly slow and cumbersome.

3. Do I really _not_ need to call "ack()" anywhere in my code if I'm only
using BaseBasicBolts and BaseRichSpouts? (See the sketch after this list for
what I assume happens under the hood.)
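
For question 3, my current understanding (which may well be wrong, so please
correct me) is that BaseBasicBolt acks on my behalf, i.e. it behaves like the
following hand-written BaseRichBolt:

    import java.util.Map;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    // What I assume BaseBasicBolt does under the hood: anchor each emit to
    // the input tuple, then ack the input explicitly after processing it.
    public class ManualAckBolt extends BaseRichBolt {
        private OutputCollector collector;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple input) {
            collector.emit(input, new Values(input.getString(0))); // anchored emit
            collector.ack(input);                                  // explicit ack
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }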

Thanks

[1] http://storm.apache.org/releases/0.10.1/Guaranteeing-message-processing.html
