Hi Nikolaos,

Maybe try experimenting with topology.max.spout.pending. If it is unset or
set too high, tuples can build up faster than your bolts drain them. I would
check the capacity of each bolt in the Storm UI, find which one(s) are close
to 1.0 (i.e. saturated), add more executors for those, and see how things
look then.
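
For intuition on why unthrottled emission clogs the pipeline: with your
random-walk pattern, each spout tuple fans out into a geometric number of
bolt-to-bolt tuples, roughly 1/stop_probability on average. A quick
back-of-the-envelope sketch (plain Python, not Storm code; the stop
probability of 0.1 is a made-up example value):

```python
import random

def walk_length(stop_prob, rng):
    """Number of tuples one spout tuple spawns before its walk stops."""
    n = 1  # the initial tuple from the spout
    while rng.random() >= stop_prob:
        n += 1  # the bolt re-emits to another task of itself
    return n

rng = random.Random(42)
stop_prob = 0.1  # hypothetical value; plug in your own
lengths = [walk_length(stop_prob, rng) for _ in range(100_000)]
avg = sum(lengths) / len(lengths)
# Geometric distribution: expected walk length is 1 / stop_prob, i.e. ~10
# tuples per spout tuple at stop_prob = 0.1.
print(round(avg, 1))
```

So if, say, the spout emits 1,000 tuples/s at a stop probability of 0.1,
the bolt tasks have to absorb roughly 10,000 tuples/s between them; capping
topology.max.spout.pending bounds how many of those walks are in flight at
once instead of letting them queue up in worker memory.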

--John

On Wed, Jan 13, 2016 at 3:06 PM, Nikolaos Pavlakis <
[email protected]> wrote:

> Hello,
>
> I am implementing a distributed algorithm for pagerank estimation using
> Storm. I have been having memory problems, so I decided to create a dummy
> implementation that does not explicitly save anything in memory, to
> determine whether the problem lies in my algorithm or my Storm structure.
>
> Indeed, while the only thing the dummy implementation does is
> message-passing (a lot of it), the memory of each worker process keeps
> rising until the pipeline is clogged. I do not understand why this might be
> happening.
>
> My cluster has 18 machines (some with 8g, some 16g and some 32g of
> memory). I have set the worker heap size to 6g (-Xmx6g).
>
> My topology is very simple:
> One spout.
> One bolt (with parallelism).
>
> The bolt receives data from the spout (fieldsGrouping) and also from other
> tasks of itself.
>
> My message-passing pattern is based on random walks with a certain
> stopping probability. More specifically:
> The spout generates a tuple.
> One specific task from the bolt receives this tuple.
> Based on a certain probability, this task generates another tuple and
> emits it again to another task of the same bolt.
>
>
> I have been stuck on this problem for quite a while, so any help would be
> much appreciated.
>
> Best Regards,
> Nick
>
