Thanks for all the replies so far. I am profiling the topology in local
mode with VisualVM and I do not see this problem. I am still running into
this problem when the topology is deployed on the cluster, even with
max.spout.pending = 1.
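For reference, max.spout.pending can be capped programmatically when submitting the topology. A minimal sketch, assuming the pre-1.0 backtype.storm package layout (newer releases use org.apache.storm); the worker count and the commented-out topology name are placeholders:

```java
import backtype.storm.Config;
import backtype.storm.StormSubmitter;

// Sketch: limit the number of un-acked tuples each spout task may have
// in flight, so downstream bolts cannot be flooded faster than they ack.
Config conf = new Config();
conf.setMaxSpoutPending(1);   // at most one pending (un-acked) tuple per spout task
conf.setNumWorkers(4);        // placeholder worker count
// StormSubmitter.submitTopology("pagerank-dummy", conf, builder.createTopology());
```

Note that max.spout.pending only takes effect in a reliable topology, i.e. when the spout emits tuples with message IDs and the bolts ack them; unanchored tuples are not throttled by it.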

On Wed, Jan 13, 2016 at 10:38 PM, John Yost <[email protected]> wrote:

> +1 for Andrew, I definitely agree that profiling with jvisualvm (or a
> similar tool) is something to do if you have not done so already
>
> On Wed, Jan 13, 2016 at 3:30 PM, Andrew Xor <[email protected]>
> wrote:
>
>> Hey,
>>
>>  Care to give the Storm/JVM versions? Does this happen only on cluster
>> execution, or also when running the topology in local mode? Unfortunately,
>> probably the best way to find out what's really going on is to profile your
>> topology... if you can run the topology locally, this will make things quite
>> a bit easier, as profiling Storm topologies on a live cluster can be quite
>> time-consuming.
>>
>> Regards.
>>
>> On Wed, Jan 13, 2016 at 10:06 PM, Nikolaos Pavlakis <
>> [email protected]> wrote:
>>
>>> Hello,
>>>
>>> I am implementing a distributed algorithm for pagerank estimation using
>>> Storm. I have been having memory problems, so I decided to create a dummy
>>> implementation that does not explicitly save anything in memory, to
>>> determine whether the problem lies in my algorithm or my Storm structure.
>>>
>>> Indeed, even though the only thing the dummy implementation does is
>>> message-passing (a lot of it), the memory usage of each worker process keeps
>>> rising until the pipeline is clogged. I do not understand why this might be
>>> happening.
>>>
>>> My cluster has 18 machines (some with 8g, some 16g and some 32g of
>>> memory). I have set the worker heap size to 6g (-Xmx6g).
>>>
>>> My topology is very simple:
>>> One spout
>>> One bolt (with parallelism).
>>>
>>> The bolt receives data from the spout (fieldsGrouping) and also from
>>> other tasks of itself.
>>>
>>> My message-passing pattern is based on random walks with a certain
>>> stopping probability. More specifically:
>>> The spout generates a tuple.
>>> One specific task from the bolt receives this tuple.
>>> Based on a certain probability, this task generates another tuple and
>>> emits it again to another task of the same bolt.
>>>
>>>
>>> I have been stuck on this problem for quite a while, so any help would be
>>> much appreciated.
>>>
>>> Best Regards,
>>> Nick
>>>
>>
>>
>
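The random-walk pattern Nick describes (each tuple forwarded to another bolt task until a stop decision is made) can be sketched without Storm as a plain-Java simulation. The stop probability of 0.5 and the class name are placeholders:

```java
import java.util.Random;

// Plain-Java sketch (no Storm dependency) of the random-walk
// message-passing pattern: starting from one spout tuple, the tuple is
// re-emitted to another task of the same bolt until a stop decision.
// The walk length is geometrically distributed with mean 1/p, so each
// spout tuple fans out into about 1/p bolt emits on average.
public class RandomWalkSketch {

    // Number of emits a single walk generates before stopping.
    static int walkLength(Random rng, double stopProbability) {
        int hops = 1; // the initial emit from the spout to the first bolt task
        while (rng.nextDouble() >= stopProbability) {
            hops++;   // tuple re-emitted to another task of the same bolt
        }
        return hops;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed for reproducibility
        double p = 0.5;              // placeholder stop probability
        int walks = 100_000;
        long totalEmits = 0;
        for (int i = 0; i < walks; i++) {
            totalEmits += walkLength(rng, p);
        }
        // Mean walk length should be close to 1/p (= 2.0 here).
        System.out.println("mean emits per walk = " + (double) totalEmits / walks);
    }
}
```

Because the mean walk length is 1/p, a fast spout multiplied by this fan-out can keep a large number of tuples in flight even in a stateless topology, which is consistent with worker memory rising until the pipeline clogs unless the spout is throttled (e.g. via max.spout.pending in a reliable topology).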
