Moved this whole thread to the wiki: http://cwiki.apache.org/confluence/display/ESME/Performance+test+2009-11-25
D.

On Thu, Nov 26, 2009 at 2:22 PM, Markus Kohler <[email protected]> wrote:

> Hi Michael,
> No problem :-)
>
> Regards,
> Markus
>
> "The best way to predict the future is to invent it" -- Alan Kay
>
>
> On Thu, Nov 26, 2009 at 2:12 PM, Bechauf, Michael
> <[email protected]> wrote:
>
>> Thanks Markus. That certainly sounds much better. I was already
>> confused yesterday, because 23 GByte of memory would be a little
>> difficult to allocate when not even the operating system can handle
>> such a size. I should have asked right away. Blame it on jetlag.
>>
>> -Michael
>>
>> -----Original Message-----
>> From: Markus Kohler [mailto:[email protected]]
>> Sent: Thursday, Nov 26, 2009 1:04 AM
>> To: [email protected]
>> Subject: Re: Further analysis of the GC issue
>>
>> Hi Michael,
>> Good to see you here!
>>
>> "Memory Analyzer"? That's me ;-)
>>
>> The 23 Gbyte are not "retained" at one point in time; they are the sum
>> of all temporarily allocated objects. Most of that memory (or all of
>> it, as there doesn't seem to be an obvious memory leak) is gone within
>> a millisecond. I'm confident that this value can be decreased to
>> 90 Mbyte and further improved down to a few MByte (or even less). We
>> already know that the 90 Mbyte are mostly caused by an inefficient
>> Textile parser.
>>
>> I also used the Memory Analyzer to look at how much memory is retained,
>> i.e. still in use/referenced after the user interaction has finished.
>> The report is here:
>> http://cwiki.apache.org/confluence/display/ESME/Performance+test+-+2009-11-22
>> Although there's room for improvement, potentially caused by the same
>> bug that turned 90 Mbyte into 23 Gbyte, I don't see any major issues
>> yet with regard to memory usage.
>>
>> This is also related to the stateless versus stateful discussion. ATM
>> the amount of state needed for one user is already very low (a few
>> hundred kByte), at least compared to what I'm used to with enterprise
>> applications. It is at least an order of magnitude lower, which can
>> only partially be explained by ESME being less complex than the
>> typical enterprise app. So far I don't see any major roadblock from
>> the design perspective that would stop us from scaling very well.
>>
>> In my experience, it's quite normal that as soon as someone with a
>> little experience in performance takes a closer look at a piece of
>> software, a few dramatic improvements can be made. That makes working
>> as a performance analysis expert so gratifying. You suggest a few
>> improvements, which have a dramatic impact, and then you walk away
>> before it gets too complicated ;-)
>> No, that's not my intention here :-)
>>
>>
>> Markus
>>
>> "The best way to predict the future is to invent it" -- Alan Kay
>>
>>
>> On Thu, Nov 26, 2009 at 6:04 AM, Bechauf, Michael
>> <[email protected]> wrote:
>>
>> > David,
>> >
>> > well, "dead wrong" is a strong expression; hopefully I'm still
>> > breathing. I don't want to judge without having looked at the code
>> > myself, but I have no idea how a massive multi-user system could
>> > possibly be designed with state, where per-user information is kept
>> > in memory for a certain time. I mean, 23 GB allocated - that's tough
>> > for an SAP transaction server that is not multithreaded and where
>> > the memory management is highly optimized, based on shared memory
>> > that the work processes can attach to, or rolled out to a file if
>> > unused for a while. It is, however, deadly for a VM that was never
>> > designed for such memory consumption and where a GC run can halt the
>> > server.
>> >
>> > Anyway, I'll study this a bit more, particularly the Scala
>> > architecture.
>> > I heard many good things about Scala, but in the end it's all
>> > translated to things a VM can understand, and I hope Scala does a
>> > good enough job managing this load in a transparent way.
>> >
>> > -Michael
>> >
>> >
>> > ----- Original Message -----
>> > From: David Pollak <[email protected]>
>> > To: [email protected] <[email protected]>
>> > Sent: Wed Nov 25 23:00:20 2009
>> > Subject: Re: Further analysis of the GC issue
>> >
>> > On Wed, Nov 25, 2009 at 7:16 PM, Bechauf, Michael
>> > <[email protected]> wrote:
>> >
>> > > Wasn't this exactly the kind of stuff that the Eclipse Memory
>> > > Analyzer - donated by SAP - was supposed to fix? A heap of that
>> > > size for a still moderate number of 300 users is crazy, so either
>> > > there is stuff like circular references that hog memory, or the
>> > > design model is fundamentally flawed. I don't understand why ESME
>> > > needs "sessions". How can a scalable server be created if each
>> > > user allocates memory until some timeout? In a world of stateless
>> > > browser-based UIs that's not going to work.
>> > >
>> >
>> > You're actually dead wrong about this. "Stateless" is not... it's
>> > just pushing state and cache someplace else (the RDBMS, memcached,
>> > etc.). "Stateless" will lead to radical performance problems.
>> > "Stateless" merely moves the caching decisions into code you don't
>> > control. I dealt with this issue first-hand while helping a popular
>> > micro-blogging site migrate from a "stateless" to a Scala-based
>> > backend. I'm dealing with this issue first-hand helping another
>> > popular site that's experiencing exponential growth migrate away
>> > from "push everything back to the RDBMS and hope for the best."
>> >
>> > My original design for ESME is stateful.
>> > My original design for ESME is based on lessons learned in this very
>> > space and was oriented to have things intelligently cached so that
>> > the caching is not based on RDBMS indexes. I'm not sure what happened
>> > to cause the particular issues, but it seems like folks are loading
>> > messages from the RDBMS rather than asking the message cache for
>> > them.
>> >
>> >
>> > >
>> > > Time for me to look at that code ...
>> > >
>> > > -Michael
>> > >
>> > >
>> > > ----- Original Message -----
>> > > From: Markus Kohler <[email protected]>
>> > > To: [email protected] <[email protected]>
>> > > Sent: Wed Nov 25 12:14:58 2009
>> > > Subject: Further analysis of the GC issue
>> > >
>> > > Hi all,
>> > > the Garbage Collector issue I was talking about is reproducible.
>> > > I've uploaded an annotated GC graph to
>> > >
>> > > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
>> > >
>> > > I think the "LOGON" phase, where I log on all 300 users, looks OK
>> > > (given that Textile formatting is probably involved), but the phase
>> > > where just one user sends one message is certainly not looking
>> > > good.
>> > >
>> > > I took the profiler and the result is a bit shocking. For that one
>> > > message, 881,000,000 objects weighing 23.2 Gbyte were allocated
>> > > (and reclaimed afterwards). My former record was 2 Gbyte ;-)
>> > >
>> > > Fortunately I have a theory about what happens, without having
>> > > looked into the code yet, so take it with a grain of salt. It seems
>> > > that the public timeline for all users is re-rendered, because 99%
>> > > of the allocations come from
>> > > org.apache.esme.comet.PublicTimeline.render(). I guess all the
>> > > actors for all the users are sitting there, not knowing that the
>> > > user has closed the browser, because the user session has not yet
>> > > expired.
>> > >
>> > > I wonder how we get around this with a real "push" model. If the
>> > > browser asked for updates, this rendering could be done lazily. Or
>> > > can we "ping" the browser and check whether it responds?
>> > > On the other hand, it should not be necessary to re-render the
>> > > message again and again, because the result will be the same.
>> > >
>> > > I will send Richard some attachments. Not sure whether you will
>> > > need them; they look very similar to the ones we already have.
>> > >
>> > > BTW, we should definitely check the use of
>> > > scala.xml.XML$.loadString(java.lang.String).
>> > > It creates a new parser each time, which is a bit costly because
>> > > it allocates a new buffer each time and also hits the disk when
>> > > searching for the name of the Java class.
>> > >
>> > > Greetings,
>> > > Markus
>> > >
>> > >
>> > >
>> > > "The best way to predict the future is to invent it" -- Alan Kay
>> > >
>> >
>> >
>> >
>> > --
>> > Lift, the simply functional web framework http://liftweb.net
>> > Beginning Scala http://www.apress.com/book/view/1430219890
>> > Follow me: http://twitter.com/dpp
>> > Surf the harmonics
>> >
>> >
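
A minimal sketch of the point made in the original message, that a message should not have to be re-rendered again and again when the result is the same. The rendered output is memoized per message id, so 300 listeners cause one render instead of 300. This is written in Python purely for brevity and is not ESME code: RenderCache, fake_textile_to_html and the message id 42 are hypothetical names, and the sketch assumes a message's text never changes after it has been posted.

```python
from typing import Callable, Dict


class RenderCache:
    """Memoizes rendered output per message id (hypothetical helper, not ESME code)."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._cache: Dict[int, str] = {}
        self.render_calls = 0  # how often the expensive renderer actually ran

    def rendered(self, message_id: int, text: str) -> str:
        # Assumption: a message is immutable once posted, so its id is a safe key.
        if message_id not in self._cache:
            self.render_calls += 1
            self._cache[message_id] = self._render(text)
        return self._cache[message_id]


def fake_textile_to_html(text: str) -> str:
    # Stand-in for an expensive markup parser; it only strips '*' markers.
    return "<p>" + text.replace("*", "") + "</p>"


cache = RenderCache(fake_textile_to_html)

# Simulate 300 per-user listeners all asking for the same public message.
outputs = [cache.rendered(42, "hello *world*") for _ in range(300)]
```

A real cache would also need an eviction policy and thread safety; both are left out here to keep the idea visible.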
