For some reason the wiki page about the 11-25 performance test was lost. I'll have to create it once again.
On Fri, Nov 27, 2009 at 5:47 AM, Richard Hirsch <[email protected]> wrote:
> Moved this whole thread to the wiki:
> http://cwiki.apache.org/confluence/display/ESME/Performance+test+2009-11-25
>
> D.
>
> On Thu, Nov 26, 2009 at 2:22 PM, Markus Kohler <[email protected]> wrote:
>> Hi Michael,
>> No problem :-)
>>
>> Regards,
>> Markus
>>
>> "The best way to predict the future is to invent it" -- Alan Kay
>>
>> On Thu, Nov 26, 2009 at 2:12 PM, Bechauf, Michael <[email protected]> wrote:
>>
>>> Thanks Markus. That certainly sounds much better. I was confused
>>> yesterday already because 23 GByte memory would be a little difficult
>>> to create when not even the operating system can handle such size. I
>>> should have asked right away. Blame it on jetlag.
>>>
>>> -Michael
>>>
>>> -----Original Message-----
>>> From: Markus Kohler [mailto:[email protected]]
>>> Sent: Thursday, Nov 26, 2009 1:04 AM
>>> To: [email protected]
>>> Subject: Re: Further analysis of the GC issue
>>>
>>> Hi Michael,
>>> Good to see you here!
>>>
>>> "Memory Analyzer"? that's me ;-)
>>>
>>> The 23 Gbyte are not "retained" at one point in time, but they are the
>>> sum of all temporarily allocated objects; most of that memory (or all
>>> of it, there doesn't seem to be an obvious memory leak) is gone within
>>> a millisecond.
>>> I'm confident that this value can be decreased to 90Mbyte and can be
>>> further improved down to a few MByte (or even less). We already know
>>> that the 90Mbyte are mostly caused by an inefficient textile parser.
>>>
>>> I also used the Memory Analyzer to look at how much memory is retained,
>>> i.e. still in use/referenced after the user interaction has been
>>> finished.
>>> The report is here
>>> http://cwiki.apache.org/confluence/display/ESME/Performance+test+-+2009-11-22
>>> Although there's room for improvement (potentially caused by the same
>>> bug that turned 90Mbyte into 23Gbyte), I don't see any major issues yet
>>> with regards to memory usage.
>>>
>>> This is also related to the stateless versus stateful discussion. ATM
>>> the amount of state needed for one user is already very low (a few
>>> hundred kByte), at least compared to what I'm used to with Enterprise
>>> Applications. It is at least an order of magnitude lower, which can
>>> only partially be explained by ESME being less complex than the typical
>>> Enterprise app.
>>> So far I don't see any major road block from the design perspective
>>> that would stop us from scaling very well.
>>>
>>> In my experience, it's quite normal that as soon as someone with a
>>> little bit of experience in performance takes a closer look at a piece
>>> of software, a few dramatic improvements can be made. That makes
>>> working as a performance analysis expert so gratifying. You suggest a
>>> few improvements, which have a dramatic impact, and then you walk away
>>> before it gets too complicated ;-)
>>> No, that's not my intention here :-)
>>>
>>> Markus
>>>
>>> "The best way to predict the future is to invent it" -- Alan Kay
>>>
>>> On Thu, Nov 26, 2009 at 6:04 AM, Bechauf, Michael <[email protected]> wrote:
>>>
>>> > David,
>>> >
>>> > well, "dead wrong" is a strong expression; hopefully I'm still
>>> > breathing. I don't want to judge without having looked at the code
>>> > myself, but I have no idea how a massive multi-user system could
>>> > possibly be designed with state where per-user information is kept in
>>> > memory for a certain time.
>>> > I mean, 23 GB allocated - that's tough for an SAP transaction server
>>> > that is not multithreaded and where the memory management is highly
>>> > optimized based on shared memory that the work processes can attach
>>> > to, or rolled out to a file if unused for a while. It is, however,
>>> > deadly for a VM that was never designed for such memory consumption
>>> > and where a GC run can halt the server.
>>> >
>>> > Anyway, I'll study this a bit more, particularly the Scala
>>> > architecture. I heard many good things about Scala, but in the end
>>> > it's all translated to things a VM can understand, and I hope Scala
>>> > does a good enough job managing this load in a transparent way.
>>> >
>>> > -Michael
>>> >
>>> > ----- Original Message -----
>>> > From: David Pollak <[email protected]>
>>> > To: [email protected] <[email protected]>
>>> > Sent: Wed Nov 25 23:00:20 2009
>>> > Subject: Re: Further analysis of the GC issue
>>> >
>>> > On Wed, Nov 25, 2009 at 7:16 PM, Bechauf, Michael <[email protected]> wrote:
>>> >
>>> > > Wasn't this exactly the kind of stuff that the Eclipse Memory
>>> > > Analyzer - donated by SAP - was supposed to fix? A heap of that
>>> > > size for a still moderate number of 300 users is crazy, so either
>>> > > there is stuff like circular references that hog memory, or the
>>> > > design model is fundamentally flawed. I don't understand why ESME
>>> > > needs "sessions". How can a scalable server be created if each
>>> > > user will allocate memory until some timeout? In a world of
>>> > > stateless browser-based UIs that's not going to work.
>>> > >
>>> >
>>> > You're actually dead wrong about this. "Stateless" is not... it's
>>> > just pushing state and cache someplace else (the RDBMS, memcached,
>>> > etc.). "Stateless" will lead to radical performance problems.
>>> > "Stateless" merely moves the caching decisions into code you don't
>>> > control.
>>> > I dealt with this issue first-hand while helping a popular
>>> > micro-blogging site migrate from a "stateless" to a Scala-based
>>> > backend. I'm dealing with this issue first-hand helping another
>>> > popular site that's experiencing exponential growth migrate away from
>>> > "push everything back to the RDBMS and hope for the best."
>>> >
>>> > My original design for ESME is stateful. My original design for ESME
>>> > is based on lessons learned in this very space and was oriented to
>>> > have things intelligently cached so that the caching is not based on
>>> > RDBMS indexes. I'm not sure what happened to cause the particular
>>> > issues, but it seems like folks are loading messages from the RDBMS
>>> > rather than asking the message cache for them.
>>> >
>>> > >
>>> > > Time for me to look at that code ...
>>> > >
>>> > > -Michael
>>> > >
>>> > > ----- Original Message -----
>>> > > From: Markus Kohler <[email protected]>
>>> > > To: [email protected] <[email protected]>
>>> > > Sent: Wed Nov 25 12:14:58 2009
>>> > > Subject: Further analysis of the GC issue
>>> > >
>>> > > Hi all,
>>> > > the Garbage Collector issue I was talking about is reproducible.
>>> > > I've uploaded an annotated GC graph to
>>> > >
>>> > > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
>>> > >
>>> > > I think the "LOGON" phase where I log on all the 300 users looks ok
>>> > > (given that probably textile formatting is involved) but the phase
>>> > > where just one user sends one message is certainly not looking
>>> > > good.
>>> > >
>>> > > I took the profiler and the result is a bit shocking. For that one
>>> > > message, 881.000.000 objects weighing 23,2 Gbyte were allocated
>>> > > (and reclaimed afterwards).
>>> > > My former record was 2Gbyte ;-)
>>> > >
>>> > > Fortunately I have a theory about what happens, without having
>>> > > looked into the code yet, so take it with a grain of salt. It seems
>>> > > that the public time line for all users is re-rendered, because 99%
>>> > > of the allocations come from
>>> > > org.apache.esme.comet.PublicTimeline.render(). I guess all the
>>> > > actors for all the users are sitting there, not knowing that the
>>> > > user has closed the browser, because the user session has not yet
>>> > > expired.
>>> > >
>>> > > I wonder how we get around this with a real "push" model. If the
>>> > > browser would ask for updates this rendering could be done lazily.
>>> > > Or can we "ping" the browser and check whether it responds?
>>> > > On the other hand, it should also not be necessary to re-render the
>>> > > message again and again because the result will be the same.
>>> > >
>>> > > I will send Richard some attachments. Not sure whether you will
>>> > > need them, they look very similar to the ones we already have.
>>> > >
>>> > > BTW, we should definitely check the use of
>>> > > scala.xml.XML$.loadString(java.lang.String)
>>> > > It's creating a new Parser each time, which is a bit costly because
>>> > > it allocates a new Buffer each time and also hits the disk when
>>> > > searching for the name of the Java class.
>>> > >
>>> > > Greetings,
>>> > > Markus
>>> > >
>>> > > "The best way to predict the future is to invent it" -- Alan Kay
>>> > >
>>> >
>>> > --
>>> > Lift, the simply functional web framework http://liftweb.net
>>> > Beginning Scala http://www.apress.com/book/view/1430219890
>>> > Follow me: http://twitter.com/dpp
>>> > Surf the harmonics
>>> >
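
Markus's remark at the end of the thread, that it should not be necessary to re-render the same message again and again because the result will be the same, can be illustrated with a small render-once cache. This is only a sketch in Java (ESME itself is written in Scala on Lift). The class RenderedMessageCache, its methods, and the stand-in renderer are hypothetical names made up for illustration and are not taken from the ESME code base.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache: renders each message at most once and reuses the result. */
final class RenderedMessageCache {
    // Rendered markup keyed by message id; safe for concurrent timelines.
    private final Map<Long, String> rendered = new ConcurrentHashMap<>();
    // Stand-in for the expensive step (for example Textile to XHTML).
    private final Function<String, String> renderer;
    // Counts how often the expensive renderer actually ran.
    private final AtomicInteger renderCount = new AtomicInteger();

    RenderedMessageCache(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the cached markup for messageId, rendering only on first use. */
    String render(long messageId, String body) {
        return rendered.computeIfAbsent(messageId, id -> {
            renderCount.incrementAndGet();
            return renderer.apply(body);
        });
    }

    int renderCount() {
        return renderCount.get();
    }

    public static void main(String[] args) {
        RenderedMessageCache cache =
            new RenderedMessageCache(body -> "<p>" + body + "</p>");
        // 300 simulated user timelines all ask for the same new message.
        for (int user = 0; user < 300; user++) {
            cache.render(42L, "hello");
        }
        // The renderer ran once, not 300 times.
        System.out.println(cache.renderCount()); // prints 1
    }
}
```

With a cache like this, each per-user timeline would look up the already rendered markup instead of invoking the parser itself. A real implementation would also need an eviction policy so the cache does not grow without bound, which this sketch leaves out.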
