Thank you very much! The suggestion to use VisualVM was very useful for exploring how my Java resources were being used, which was really the larger issue I was running into. I re-architected my code to run as many threads instead of many separate Java processes. That alleviated all of my memory issues, which I suspect were really just the overhead of each separate Java process, not the Flume client code.
Thanks again!
Matt

On Wed, Mar 25, 2015 at 11:18 PM, Ashish <[email protected]> wrote:
> Is the memory usage of all these clients in the same range? If yes, then
> taking a heap dump would reveal what is consuming memory.
>
> As Hari said, the batch is kept in memory, meaning Event size would
> matter. Here is what I would do to debug this:
>
> 1. See the memory usage of all clients.
> 2. If they are in the same range, use VisualVM to get a heap dump of
>    any one of the processes; otherwise take heap dumps of a few
>    processes (max usage, min usage, etc.).
> 3. Use Eclipse MAT or another tool to see what's consuming the memory.
>
> You can also try tweaking the batch size to see if it makes any
> difference in memory usage.
>
> On Thu, Mar 26, 2015 at 8:33 AM, Matt Fair <[email protected]> wrote:
> > I have seen it both on my machine with 16 GB and on one with 60 GB of
> > memory, running about 40 clients and ~4k clients respectively, with
> > memory usage reaching 100% in both cases. If I run without the Flume
> > client I have no memory problems, but when I instantiate a Flume
> > RPCClient, I run into memory problems.
> >
> > Thanks,
> > Matt
> >
> > On Wed, Mar 25, 2015 at 6:42 PM, Hari Shreedharan
> > <[email protected]> wrote:
> >>
> >> How much memory are you talking about? The RPC client will hold on to
> >> the batch of events you sent, plus some additional threading overhead.
> >> Under the hood, it uses a Netty client, which should not really have a
> >> big memory footprint.
> >>
> >> Thanks,
> >> Hari
> >>
> >>
> >> On Wed, Mar 25, 2015 at 3:27 PM, Matt Fair <[email protected]> wrote:
> >>>
> >>> I have an application that launches a bunch of processes (40+) on the
> >>> same machine, each of which connects to Flume using the default Flume
> >>> RPCClient. However, I have noticed that each RPCClient takes up a
> >>> decent amount of memory, and when you create as many clients as I do,
> >>> it adds up to a lot of memory.
> >>>
> >>> One thought I had to avoid creating all of the clients was to create
> >>> only a single RPCClient and then have my other processes connect to it
> >>> via a socket, but that seems a little redundant since that is what the
> >>> RPCClient is supposed to do anyway. Have others found themselves in
> >>> this same situation? Is there a way to handle memory more efficiently,
> >>> or is there another RPCClient implementation that doesn't take up as
> >>> much memory?
> >>>
> >>> Thanks,
> >>> Matt
> >>
> >>
> >
> >
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
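
For anyone weighing the same change, here is a minimal sketch of the threads-instead-of-processes layout mentioned at the top of this thread. It is not the code that was actually used. It assumes a single sender thread owns the client, so nothing is assumed about the client's thread safety. The names SharedSenderSketch and BatchSink, the queue capacity, and the plain string events are all hypothetical stand-ins; a real BatchSink would wrap the Flume RPC client, which is left out so the example runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class SharedSenderSketch {

    /** Hypothetical stand-in for whatever ships a batch, e.g. a wrapper around an RPC client. */
    public interface BatchSink {
        void send(List<String> batch) throws Exception;
    }

    // Unique sentinel object, compared by identity, that tells the sender to stop.
    private static final String STOP = new String("__stop__");

    /** Runs the workers as threads in this JVM and returns how many events were handed to the sink. */
    public static int run(int workers, int eventsPerWorker, int batchSize, BatchSink sink)
            throws Exception {
        if (workers < 1 || batchSize < 1) {
            throw new IllegalArgumentException("workers and batchSize must be >= 1");
        }
        // Bounded queue (capacity is an arbitrary choice): producers block instead of growing the heap.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);
        AtomicInteger sent = new AtomicInteger();
        AtomicReference<Exception> failure = new AtomicReference<>();

        // The only thread that ever touches the sink.
        Thread sender = new Thread(() -> {
            List<String> batch = new ArrayList<>(batchSize);
            try {
                while (true) {
                    String event = queue.take();
                    boolean stop = (event == STOP);
                    if (!stop) {
                        batch.add(event);
                    }
                    if (batch.size() >= batchSize || (stop && !batch.isEmpty())) {
                        if (failure.get() == null) {
                            try {
                                sink.send(new ArrayList<>(batch));
                                sent.addAndGet(batch.size());
                            } catch (Exception ex) {
                                // Remember the first failure; keep draining so producers never hang.
                                failure.set(ex);
                            }
                        }
                        batch.clear();
                    }
                    if (stop) {
                        return;
                    }
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        // Each former process becomes a task on a thread pool.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            final int id = w;
            pool.submit(() -> {
                for (int i = 0; i < eventsPerWorker; i++) {
                    queue.put("worker-" + id + "-event-" + i);
                }
                return null;
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            pool.shutdownNow();
        }
        queue.put(STOP);
        sender.join();
        if (failure.get() != null) {
            throw failure.get();
        }
        return sent.get();
    }

    public static void main(String[] args) throws Exception {
        int total = run(4, 25, 10,
                batch -> System.out.println("sending " + batch.size() + " events"));
        System.out.println("total sent: " + total);
    }
}
```

The bounded queue is a deliberate choice here: if the sender falls behind, the workers block rather than piling events up in memory. Whether one client can keep up with every worker depends on event size and batch size, so treat the numbers above as placeholders.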
