Hey,

Could you share which Storm and JVM versions you are running? Does this
happen only on the cluster, or also when running the topology in local mode?
Unfortunately, the best way to find out what is really going on is probably
to profile your topology. If you can run it locally, that will make things
quite a bit easier, as profiling Storm topologies on a live cluster can be
quite time consuming.
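
While you set up profiling, a rough back-of-the-envelope check may also be
useful. The sketch below assumes, as in your description, that each bolt task
ends the walk independently with a fixed stopping probability p. Under that
assumption the number of bolt executions per spout tuple follows a geometric
distribution with mean 1/p. The class and method names (RandomWalkLoad,
walkLength, averageHops) and the value p = 0.05 are made up for illustration;
none of this is Storm API.

```java
import java.util.Random;

/** Hypothetical helper, not part of Storm: estimates bolt executions per spout tuple. */
public class RandomWalkLoad {

    /**
     * Simulates one walk. The first bolt task always receives the tuple; after
     * that, each task forwards it with probability (1 - stopProbability).
     * Returns the total number of bolt executions for this walk.
     */
    public static int walkLength(Random rng, double stopProbability) {
        int hops = 1;
        // nextDouble() is uniform in [0, 1), so the walk continues with
        // probability (1 - stopProbability).
        while (rng.nextDouble() >= stopProbability) {
            hops++;
        }
        return hops;
    }

    /** Averages walkLength over many simulated walks, using a fixed seed. */
    public static double averageHops(double stopProbability, int walks, long seed) {
        if (stopProbability <= 0.0 || stopProbability > 1.0) {
            throw new IllegalArgumentException("stopProbability must be in (0, 1]");
        }
        if (walks <= 0) {
            throw new IllegalArgumentException("walks must be positive");
        }
        Random rng = new Random(seed);
        long total = 0;
        for (int i = 0; i < walks; i++) {
            total += walkLength(rng, stopProbability);
        }
        return (double) total / walks;
    }

    public static void main(String[] args) {
        double p = 0.05; // assumed stopping probability, for illustration only
        System.out.printf(
            "p = %.2f: simulated mean %.2f bolt executions per spout tuple, analytic mean %.2f%n",
            p, averageHops(p, 100_000, 1L), 1.0 / p);
    }
}
```

If the spout emits tuples faster than the bolt tasks can work through that
many executions per tuple, the backlog has to sit in the workers' queues, and
that could show up as steadily growing heap. This is only a hypothesis to test
against the profiler output, not a diagnosis.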

Regards.

On Wed, Jan 13, 2016 at 10:06 PM, Nikolaos Pavlakis <
[email protected]> wrote:

> Hello,
>
> I am implementing a distributed algorithm for pagerank estimation using
> Storm. I have been having memory problems, so I decided to create a dummy
> implementation that does not explicitly save anything in memory, to
> determine whether the problem lies in my algorithm or my Storm structure.
>
> Indeed, while the only thing the dummy implementation does is
> message-passing (a lot of it), the memory of each worker process keeps
> rising until the pipeline is clogged. I do not understand why this might be
> happening.
>
> My cluster has 18 machines (some with 8g, some 16g and some 32g of
> memory). I have set the worker heap size to 6g (-Xmx6g).
>
> My topology is very simple:
> One spout
> One bolt (with parallelism).
>
> The bolt receives data from the spout (fieldsGrouping) and also from other
> tasks of itself.
>
> My message-passing pattern is based on random walks with a certain
> stopping probability. More specifically:
> The spout generates a tuple.
> One specific task from the bolt receives this tuple.
> Based on a certain probability, this task generates another tuple and
> emits it again to another task of the same bolt.
>
>
> I have been stuck on this problem for quite a while, so any help would be
> much appreciated.
>
> Best Regards,
> Nick
>
