Thanks for all the replies so far. I profiled the topology in local mode with VisualVM and I do not see this problem there. However, I still run into it when the topology is deployed on the cluster, even with max.spout.pending = 1.
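For reference, this is roughly how I am applying the cap (a minimal sketch; whether your release uses the backtype.storm or org.apache.storm package prefix depends on the Storm version):

```java
import backtype.storm.Config;

public class PendingCapExample {
    public static Config buildConf() {
        Config conf = new Config();
        // Allow at most one unacked tuple tree in flight per spout task.
        conf.setMaxSpoutPending(1);
        // conf is then passed to StormSubmitter.submitTopology(...) as usual.
        return conf;
    }
}
```

One thing worth double-checking: as far as I understand, max.spout.pending only throttles the spout when acking is enabled, i.e. the spout emits tuples with message IDs and the bolt anchors its emits. Unanchored tuples are not counted against the cap, so if the dummy topology emits unanchored tuples the setting would have no effect and messages could pile up unbounded.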
On Wed, Jan 13, 2016 at 10:38 PM, John Yost <[email protected]> wrote:

> +1 for Andrew, definitely agree profiling with jvisualvm or whatever is
> definitely something to do if you have not done so already
>
> On Wed, Jan 13, 2016 at 3:30 PM, Andrew Xor <[email protected]> wrote:
>
>> Hey,
>>
>> Care to give the version of storm/jvm? Does this happen on cluster
>> execution only, or also when running the topology in local mode?
>> Unfortunately, probably the best way to find out what's really going on
>> is to profile your topology... if you can run the topology locally this
>> will make things quite a bit easier, as profiling storm topologies on a
>> live cluster can be quite time consuming.
>>
>> Regards.
>>
>> On Wed, Jan 13, 2016 at 10:06 PM, Nikolaos Pavlakis <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I am implementing a distributed algorithm for pagerank estimation using
>>> Storm. I have been having memory problems, so I decided to create a dummy
>>> implementation that does not explicitly save anything in memory, to
>>> determine whether the problem lies in my algorithm or my Storm structure.
>>>
>>> Indeed, while the only thing the dummy implementation does is
>>> message-passing (a lot of it), the memory of each worker process keeps
>>> rising until the pipeline is clogged. I do not understand why this might
>>> be happening.
>>>
>>> My cluster has 18 machines (some with 8g, some with 16g, and some with
>>> 32g of memory). I have set the worker heap size to 6g (-Xmx6g).
>>>
>>> My topology is very simple:
>>> One spout.
>>> One bolt (with parallelism).
>>>
>>> The bolt receives data from the spout (fieldsGrouping) and also from
>>> other tasks of itself.
>>>
>>> My message-passing pattern is based on random walks with a certain
>>> stopping probability. More specifically:
>>> The spout generates a tuple.
>>> One specific task from the bolt receives this tuple.
>>> Based on a certain probability, this task generates another tuple and
>>> emits it again to another task of the same bolt.
>>>
>>> I have been stuck on this problem for quite a while, so it would be very
>>> helpful if someone could help.
>>>
>>> Best Regards,
>>> Nick
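The random-walk pattern quoted above can be modeled without Storm at all. This is a minimal sketch in plain Java (the continuation probability p = 0.9 is a made-up value, not taken from the topology) showing how the expected number of messages per spout tuple grows as p approaches 1, which is one way such a topology can flood itself:

```java
import java.util.Random;

public class RandomWalkLoad {
    // Simulate the message volume of the topology described above:
    // each spout tuple starts a walk through the bolt, and every hop
    // continues with probability p (i.e. stops with probability 1 - p).
    static long simulate(int walks, double p, Random rng) {
        long messages = 0;
        for (int i = 0; i < walks; i++) {
            messages++;                      // spout -> bolt delivery
            while (rng.nextDouble() < p) {   // bolt -> bolt hop
                messages++;
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        double p = 0.9;                      // hypothetical continuation probability
        int walks = 100_000;
        long messages = simulate(walks, p, new Random(42));
        // Hop counts are geometric, so the expected number of messages per
        // walk is 1 / (1 - p): 10 for p = 0.9, 100 for p = 0.99, and so on.
        System.out.printf("avg messages per walk: %.2f (expected %.2f)%n",
                (double) messages / walks, 1.0 / (1.0 - p));
    }
}
```

If p is close to 1, each spout tuple fans out into a long chain of bolt-to-bolt tuples, and with only one spout throttled those chains can still accumulate in the bolt's queues; it may be worth checking whether the in-flight tuple count (rather than anything your code stores) accounts for the heap growth.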
