Hi there,

I will package the problem simulation soon, but in the meantime I also want to port the bots to MINA as well, replacing the current home-grown blocking IO architecture.

I say this because I find it odd that my Flash clients stay connected during this disconnection process and only the Java bots die.

Thanks,
Gerrit

> -----Original Message-----
> From: "Sangjin Lee" <[EMAIL PROTECTED]>
> Sent: Saturday 26 July 2008 18:04
> To: [email protected]
> CC:
> Subject: Re: byte[] usage - not GCd
>
> Would you be able to create simple code that demonstrates the memory leak
> behavior? I haven't played extensively with server-side load, but in my
> experience, at least, client-side load shows no such memory
> behavior, and we push it pretty hard. A code sample would be very helpful.
> Thanks,
> Sangjin
>
> On Sat, Jul 26, 2008 at 8:04 AM, Gerrit Grobbelaar <[EMAIL PROTECTED]> wrote:
> > What I get from this is that MINA (out of the box) might not be the solution
> > for me in a production environment. 2.0.0 looks to perform a bit better, but
> > the 50 clients, 1 message per client every 2 seconds, also get
> > disconnected after a while.
> >
> > Granted, all my clients are simulated from one host machine on a LAN, but
> > surely that can't make a difference?
> >
> > I've checked src out there on the net, and my IoSession/IoBuffer handling
> > seems to be correct. The area where my problem lies is outside of my
> > domain anyway, namely the NIO handling of the MINA library.
> >
> > I've been reading such good reviews about this framework, but it is just
> > frustrating that I can't get the same out of it due to the buffer handling
> > (byte[] allocations).
> >
> > I've stumbled onto this:
> > =========>
> > However, there's a case that allocation of heap buffers becomes bottleneck if
> > the allocation occurs too fast. It's because any allocation of byte
> > arrays requires filling up the content of the array with '0', which
> > consumes memory bandwidth.
> > CachedBufferAllocator is provided to address this concern, but you
> > won't need to change the default SimpleBufferAllocator in most cases.
> > <=========
> > Which confirms to me that if I'm reading and writing on these buffers at
> > ridiculous amounts, then MINA will not be of any benefit to me out of the
> > box.
> >
> > I'll appreciate any 2 cents added.
> >
> > Thanks,
> > Gerrit
> >
> > > -----Original Message-----
> > > From: Gerrit Grobbelaar <[EMAIL PROTECTED]>
> > > Sent: Saturday 26 July 2008 16:33
> > > To: [email protected]
> > > CC:
> > > Subject: Re: byte[] usage - not GCd
> > >
> > > I have MUCH better results with MINA 2.0.0 M2, but after a short while
> > > all clients/bots disconnect with:
> > >
> > > An error occurred : null
> > > org.apache.mina.core.write.WriteTimeoutException
> > >
> > > Any ideas anyone?
> > >
> > > Thanks,
> > > Gerrit
> > >
> > > > -----Original Message-----
> > > > From: Gerrit Grobbelaar <[EMAIL PROTECTED]>
> > > > Sent: Saturday 26 July 2008 14:14
> > > > To: [email protected]
> > > > CC:
> > > > Subject: Re: byte[] usage - not GCd
> > > >
> > > > MINA 1.1.4 performs even better than 1.1.5.
> > > >
> > > > I still believe I'm doing something wrong in my code though, as I
> > > > read only good things about MINA in the reviews, but for me it just
> > > > seems to crumble underneath heavy load, the buffers not clearing
> > > > quickly enough to be GCd.
> > > >
> > > > ==============>
> > > > MINA 1.1.4 - 30 bots
> > > > 1. 30 bots - each sending one message every 2 seconds
> > > > 1.1. after 10 minutes
> > > > 1.1.1. Heap size - 7MB
> > > > 1.1.2. byte[] allocation:
> > > > 1.1.2.1. no. of objects - 3,791
> > > > 1.1.2.2. shallow size - 996,320
> > > > 1.2. after killing 30 bots
> > > > 1.2.1. Heap size - 2,7MB
> > > > 1.2.2. byte[] allocation:
> > > > 1.2.2.1. no. of objects - 3,051
> > > > 1.2.2.2. shallow size - 659,344
> > > > <==============
> > > >
> > > > ==============>
> > > > MINA 1.1.4 - 100 bots
> > > > 2. 100 bots - each sending one message every 2 seconds
> > > > 2.1. after 2 minutes
> > > > 2.1.1. Heap size - 329MB - JVM unresponsive
> > > > 2.1.2. byte[] allocation:
> > > > 2.1.2.1. no. of objects - 423,512
> > > > 2.1.2.2. shallow size - 201,053,744
> > > >
> > > > 2.2. after killing 100 bots
> > > > 2.2.1. no effect - JVM unresponsive
> > > >
> > > > Interesting part: after about 1 minute this time, the server starts
> > > > spewing out logs again, seems to have some action on the buffers
> > > > again, processing messages still sent a while back, obviously
> > > > printing out that the sessions are closed already. Very soon all bots
> > > > disconnected, freeing up all the heap mem again.
> > > > <=============
> > > >
> > > > > -----Original Message-----
> > > > > From: Gerrit Grobbelaar <[EMAIL PROTECTED]>
> > > > > Sent: Saturday 26 July 2008 12:03
> > > > > To: [email protected]
> > > > > CC:
> > > > > Subject: Re: byte[] usage - not GCd
> > > > >
> > > > > Some more test results (I definitely have better results with MINA
> > > > > 1.1.5, but still not ideal under load). Seeing that I get
> > > > > different, and somewhat better, results with an older version, I'm
> > > > > going to try even older versions than 1.1.5 now.
> > > > >
> > > > > In the meantime: how come these different results though, and why will
> > > > > 1.1.5 perform better with Buffer (byte[]) handling/allocations/flushing
> > > > > than 1.1.6 and 1.1.7? Could be that I'm definitely doing something
> > > > > wrong with my MINA integration, but then I should get consistent
> > > > > bad performances across all versions?
> > > > >
> > > > > MINA 1.1.5 results as follows:
> > > > >
> > > > > ==============>
> > > > > MINA 1.1.5 - 30 bots
> > > > > 1. 30 bots - each sending one message every 2 seconds
> > > > > 1.1. after 10 minutes
> > > > > 1.1.1. Heap size - 20MB
> > > > > 1.1.2. byte[] allocation:
> > > > > 1.1.2.1. no. of objects - 12,104
> > > > > 1.1.2.2. shallow size - 4,207,440
> > > > > 1.2. after killing 30 bots
> > > > > 1.2.1. Heap size - 1,5MB
> > > > > 1.2.2. byte[] allocation:
> > > > > 1.2.2.1. no. of objects - 3,794
> > > > > 1.2.2.2. shallow size - 722,696
> > > > > <==============
> > > > >
> > > > > ==============>
> > > > > MINA 1.1.5 - 100 bots
> > > > > 2. 100 bots - each sending one message every 2 seconds
> > > > > 2.1. after 3 minutes
> > > > > 2.1.1. Heap size - 374MB
> > > > > 2.1.2. byte[] allocation:
> > > > > 2.1.2.1. no. of objects - 455,736
> > > > > 2.1.2.2. shallow size - 216,055,400
> > > > >
> > > > > 2.2. after killing 30 bots
> > > > > 2.2.1. no effect - JVM unresponsive
> > > > >
> > > > > Interesting part: after about 2 - 3 minutes, the server starts spewing
> > > > > out logs again, seems to have some action on the buffers again,
> > > > > processing messages still sent a while back, obviously printing out
> > > > > that the sessions are closed already. Only after about 5 minutes it is
> > > > > starting to detect the client disconnections. And then hangs
> > > > > again, after disconnecting only 14 of the 100 bots. There it
> > > > > continues again, still trying to send chat messages that were still
> > > > > in the buffers. Now detected 40 disconnections after another 2
> > > > > minutes. Do note, Heap mem down to 40MB from 374MB. There we go: all
> > > > > disconnected now, perform GC, 2MB heap used.
> > > > >
> > > > > What could I be doing wrong that it seems that these buffers do not
> > > > > flush quickly enough, getting bogged down with the data?
> > > > > <=============
> > > > >
> > > > > ==============>
> > > > > MINA 1.1.5 - 100 bots
> > > > > 2. 100 bots - each sending one message every 5 minutes, and flash
> > > > > client sending message every 1 - 3 seconds
> > > > > 2.1. after 10 minutes
> > > > > 2.1.1. Heap size - 165MB
> > > > > 2.1.2. byte[] allocation:
> > > > > 2.1.2.1. no. of objects - 180,352
> > > > > 2.1.2.2. shallow size - 103,458,904
> > > > >
> > > > > 1.2. after killing 100 bots
> > > > > 1.2.1. Heap size - 1,5MB
> > > > > 1.2.2. byte[] allocation:
> > > > > 1.2.2.1. no. of objects - 3,890
> > > > > 1.2.2.2. shallow size - 792,536
> > > > > <=============
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Gerrit Grobbelaar <[EMAIL PROTECTED]>
> > > > > > Sent: Saturday 26 July 2008 10:32
> > > > > > To: [email protected]
> > > > > > CC:
> > > > > > Subject: Re: byte[] usage - not GCd
> > > > > >
> > > > > > Yes, 1.1.7 but 1.1.6 as well.
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Emmanuel Lecharny <[EMAIL PROTECTED]>
> > > > > > > Sent: Saturday 26 July 2008 10:20
> > > > > > > To: [email protected]
> > > > > > > CC:
> > > > > > > Subject: Re: byte[] usage - not GCd
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > MINA 1.1.7 ?
> > > > > > >
> > > > > > > Gerrit Grobbelaar wrote:
> > > > > > > > YourKit shows the Class list as follows:
> > > > > > > > - java.nio.HeapByteBuffer
> > > > > > > > - org.apache.mina.common.SimpleByteBufferAllocator$SimpleByteBuffer
> > > > > > > > - org.apache.mina.filter.codec.ProtocolCodecFilter$HiddenByteBuffer
> > > > > > > > - org.apache.mina.common.IoFilter$WriteRequest
> > > > > > > > - java.util.concurrent.ConcurrentLinkedQueue$Node
> > > > > > > >
> > > > > > > > Note the above contributes 99%+ for the byte[] allocations,
> > > > > > > > and the byte[] allocations are causing the OOM.
> > > > > > > >
> > > > > > > >> -----Original Message-----
> > > > > > > >> From: Gerrit Grobbelaar <[EMAIL PROTECTED]>
> > > > > > > >> Sent: Saturday 26 July 2008 09:48
> > > > > > > >> To: [email protected]
> > > > > > > >> CC:
> > > > > > > >> Subject: Re: byte[] usage - not GCd
> > > > > > > >>
> > > > > > > >> Another symptom:
> > > > > > > >> - 100 Java client bots connected, sending NO messages.
> > > > > > > >> - 1 Flash client, typing in a message every 2 seconds
> > > > > > > >> - memory consumption (byte[] allocations) increases and a Mark
> > > > > > > >> Sweep with JConsole doesn't do anything (stays on 100s of MBs).
> > > > > > > >> - kill off all bots
> > > > > > > >> - Mark sweep
> > > > > > > >> - Heap mem usage drops to about 2 - 5MB
> > > > > > > >>
> > > > > > > >>> -----Original Message-----
> > > > > > > >>> From: Gerrit Grobbelaar <[EMAIL PROTECTED]>
> > > > > > > >>> Sent: Saturday 26 July 2008 09:33
> > > > > > > >>> To: [email protected]
> > > > > > > >>> CC:
> > > > > > > >>> Subject: byte[] usage - not GCd
> > > > > > > >>>
> > > > > > > >>> Hi,
> > > > > > > >>>
> > > > > > > >>> I'm having an issue with OOM - here is my story:
> > > > > > > >>>
> > > > > > > >>> Criteria:
> > > > > > > >>> --------
> > > > > > > >>> - concurrent clients connecting to a chatserver, need to
> > > > > > > >>> broadcast the messages
> > > > > > > >>> - clients keep same connection (each one having its own
> > > > > > > >>> IoSession during lifetime)
> > > > > > > >>>
> > > > > > > >>> Symptoms:
> > > > > > > >>> --------
> > > > > > > >>> - massive amounts of byte[] allocations that just grow
> > > > > > > >>> and grow, until OOM.
> > > > > > > >>> - still OOM even if I have 100 bots
> > > > > > > >>> connected, sending one message each every 5 minutes
> > > > > > > >>>
> > > > > > > >>> When I have 1 - 5 clients connected (all simulated with
> > > > > > > >>> bots) that send one message per bot every 2 - 5 seconds,
> > > > > > > >>> heap space stays under control with GC.
> > > > > > > >>>
> > > > > > > >>> Configs/Tools:
> > > > > > > >>> -------------
> > > > > > > >>> - JVM: 1.6u7
> > > > > > > >>> - VM arguments: -Xms128m -Xmx384m
> > > > > > > >>> - Operating System: Linux 2.6.15.7
> > > > > > > >>> - Architecture: i386
> > > > > > > >>> - Number of processors: 2
> > > > > > > >>> - Model: Intel(R) Pentium(R) 4 CPU 2.80GHz
> > > > > > > >>> - Committed virtual memory: 681,564 kbytes
> > > > > > > >>> - Heap buffer allocation (Direct is very insufficient)
> > > > > > > >>> - SimpleByteBufferAllocator (do note I have same result
> > > > > > > >>> with PooledByteBufferAllocator)
> > > > > > > >>> - YourKit and JProfiler
> > > > > > > >>>
> > > > > > > >>> Both YourKit and JProfiler point fingers at:
> > > > > > > >>> byte[] allocation due to
> > > > > > > >>> org.apache.mina.common.IoSession.write() calls
> > > > > > > >>>
> > > > > > > >>> What could I possibly be doing wrong that I will have the above
> > > > > > > >>> scenario where the byte[] allocations never GC properly?
> > > > > > > >>>
> > > > > > > >>> Thanks,
> > > > > > > >>> Gerrit
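
For reference, the load figures quoted in this thread can be put side by side with a small calculation. The sketch below is plain Java and does not use MINA. It assumes (the thread does not say so explicitly) that the chat server writes every received message once to every connected client, the sender included. The class and method names are made up for illustration. It also derives the average byte[] size from the profiler totals reported for MINA 1.1.4. It only shows how the write rate scales with the number of bots; it is not a diagnosis, and it does not account for the runs where the bots send few or no messages.

```java
// Back-of-the-envelope sketch, not MINA code.
// Assumption: each inbound message is written once to every connected client
// (sender included). Names here are hypothetical, chosen for illustration only.
final class BroadcastFanOut {

    /** Messages arriving at the server per second. */
    static double inboundPerSecond(int clients, double secondsBetweenMessages) {
        return clients / secondsBetweenMessages;
    }

    /** Outbound writes per second when each inbound message goes to all clients. */
    static double writesPerSecond(int clients, double secondsBetweenMessages) {
        return inboundPerSecond(clients, secondsBetweenMessages) * clients;
    }

    /** Average size in bytes of one byte[] instance (integer division). */
    static long averageArrayBytes(long shallowSizeBytes, long objectCount) {
        return shallowSizeBytes / objectCount;
    }

    public static void main(String[] args) {
        // Bot counts mentioned in the thread, each bot sending 1 message every 2 s.
        for (int bots : new int[] {30, 50, 100}) {
            System.out.printf("%d bots: %.0f writes/s%n",
                    bots, writesPerSecond(bots, 2.0));
        }
        // Profiler totals quoted in the thread for MINA 1.1.4.
        System.out.println("avg byte[] size, 30 bots:  "
                + averageArrayBytes(996_320L, 3_791L) + " bytes");
        System.out.println("avg byte[] size, 100 bots: "
                + averageArrayBytes(201_053_744L, 423_512L) + " bytes");
    }
}
```

Under that assumption the write rate grows with the square of the client count: 450 writes per second at 30 bots against 5,000 at 100 bots, so a receiver that drains its socket slowly leaves far more queued write requests behind in the second case.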
