Thanks, I will try that.

Markus
"The best way to predict the future is to invent it" -- Alan Kay

On Thu, Nov 26, 2009 at 10:09 AM, Richard Hirsch <[email protected]> wrote:

> @Markus It would be interesting to remove the Textile parser and do
> the tests again.
>
> This would confirm whether it is the culprit or not. If I remember
> correctly, it was just a change in one line of code.
>
> Just found the change
> (http://svn.apache.org/viewvc/incubator/esme/trunk/server/src/main/scala/org/apache/esme/model/Message.scala?r1=804817&r2=819509&diff_format=h)
> You could change the code to the older version and try it again
>
> D.
>
> On Thu, Nov 26, 2009 at 10:03 AM, Markus Kohler <[email protected]> wrote:
> > Hi Michael,
> > Good to see you here!
> >
> > "Memory Analyzer"? that's me ;-)
> >
> > The 23 Gbyte are not "retained" at one point in time, but they are the sum
> > of all temporarily allocated objects; most of that memory (or all of it,
> > there doesn't seem to be an obvious memory leak) is gone within a
> > millisecond.
> > I'm confident that this value can be decreased to 90Mbyte and can be further
> > improved down to a few MByte (or even less). We already know that the
> > 90Mbyte are mostly caused by an inefficient textile parser.
> >
> > I also used the Memory Analyzer to look at how much memory is retained, i.e.
> > still in use/referenced after the user interaction has finished. The
> > report is here
> > http://cwiki.apache.org/confluence/display/ESME/Performance+test+-+2009-11-22
> > Although there's room for improvement, potentially caused by the same bug
> > that turned 90Mbyte into 23Gbyte, I don't see any major issues yet with
> > regard to memory usage.
> >
> > This is also related to the stateless versus stateful discussion. ATM the
> > amount of state needed for one user is already very low (a few hundred
> > kByte), at least compared to what I'm used to with Enterprise Applications.
> > It is at least an order of magnitude lower, which can only partially be
> > explained by ESME being less complex than the typical Enterprise app.
> > So far I don't see any major roadblock from the design perspective that
> > would stop us from scaling very well.
> >
> > In my experience, it's quite normal that as soon as someone with a little
> > bit of experience in performance takes a closer look at a piece of software,
> > a few dramatic improvements can be made. That makes working as a performance
> > analysis expert so gratifying. You suggest a few improvements, which have a
> > dramatic impact, and then you walk away before it gets too complicated ;-)
> > No, that's not my intention here :-)
> >
> >
> > Markus
> >
> > "The best way to predict the future is to invent it" -- Alan Kay
> >
> >
> > On Thu, Nov 26, 2009 at 6:04 AM, Bechauf, Michael
> > <[email protected]> wrote:
> >
> >> David,
> >>
> >> well, "dead wrong" is a strong expression; hopefully I'm still breathing. I
> >> don't want to judge without having looked at the code myself, but I have no
> >> idea how a massive multi-user system could possibly be designed with state
> >> where per-user information is kept in memory for a certain time. I mean, 23
> >> GB allocated - that's tough for an SAP transaction server that is not
> >> multithreaded and where the memory management is highly optimized based on
> >> shared memory that the work processes can attach to, or rolled out to a file
> >> if unused for a while. It is, however, deadly for a VM that was never
> >> designed for such memory consumption and where a GC run can halt the server.
> >>
> >> Anyway, I'll study this a bit more, particularly the Scala architecture. I
> >> heard many good things about Scala, but in the end it's all translated to
> >> things a VM can understand, and I hope Scala does a good enough job managing
> >> this load in a transparent way.
> >>
> >> -Michael
> >>
> >>
> >> ----- Original Message -----
> >> From: David Pollak <[email protected]>
> >> To: [email protected] <[email protected]>
> >> Sent: Wed Nov 25 23:00:20 2009
> >> Subject: Re: Further analysis of the GC issue
> >>
> >> On Wed, Nov 25, 2009 at 7:16 PM, Bechauf, Michael
> >> <[email protected]> wrote:
> >>
> >> > Wasn't this exactly the kind of stuff that the Eclipse Memory Analyzer -
> >> > donated by SAP - was supposed to fix? A heap of that size for a still
> >> > moderate number of 300 users is crazy, so either there is stuff like
> >> > circular references that hog memory, or the design model is fundamentally
> >> > flawed. I don't understand why ESME needs "sessions". How can a scalable
> >> > server be created if each user will allocate memory until some timeout?
> >> > In a world of stateless browser-based UIs that's not going to work.
> >> >
> >>
> >> You're actually dead wrong about this. "Stateless" is not... it's just
> >> pushing state and cache someplace else (the RDBMS, memcached, etc.).
> >> "Stateless" will lead to radical performance problems. "Stateless" merely
> >> moves the caching decisions into code you don't control. I dealt with this
> >> issue first-hand while helping a popular micro-blogging site migrate from a
> >> "stateless" to a Scala-based backend. I'm dealing with this issue
> >> first-hand helping another popular site that's experiencing exponential
> >> growth migrate away from "push everything back to the RDBMS and hope for
> >> the best."
> >>
> >> My original design for ESME is stateful. My original design for ESME is
> >> based on lessons learned in this very space and was oriented to have
> >> things intelligently cached so that the caching is not based on RDBMS
> >> indexes. I'm not sure what happened to cause the particular issues, but it
> >> seems like folks are loading messages from the RDBMS rather than asking the
> >> message cache for them.
> >>
> >>
> >> >
> >> > Time for me to look at that code ...
> >> >
> >> > -Michael
> >> >
> >> >
> >> > ----- Original Message -----
> >> > From: Markus Kohler <[email protected]>
> >> > To: [email protected] <[email protected]>
> >> > Sent: Wed Nov 25 12:14:58 2009
> >> > Subject: Further analysis of the GC issue
> >> >
> >> > Hi all,
> >> > the Garbage Collector issue I was talking about is reproducible.
> >> > I've uploaded an annotated GC graph to
> >> >
> >> > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
> >> >
> >> > I think the "LOGON" phase where I log on all the 300 users looks OK (given
> >> > that textile formatting is probably involved), but the phase where just one
> >> > user sends one message is certainly not looking good.
> >> >
> >> > I took the profiler and the result is a bit shocking. For that one message,
> >> > 881,000,000 objects weighing 23.2 Gbyte were allocated (and reclaimed
> >> > afterwards). My former record was 2Gbyte ;-)
> >> >
> >> > Fortunately I have a theory about what happens, without having looked into
> >> > the code yet, so take it with a grain of salt. It seems that the public
> >> > timeline for all users is re-rendered, because 99% of the allocations come
> >> > from org.apache.esme.comet.PublicTimeline.render(). I guess all the actors
> >> > for all the users are sitting there, not knowing that the user has closed
> >> > the browser, because the user session has not yet expired.
> >> >
> >> > I wonder how we get around this with a real "push" model. If the browser
> >> > asked for updates, this rendering could be done lazily. Or can we "ping"
> >> > the browser and check whether it responds?
> >> > On the other hand, it should also not be necessary to re-render the
> >> > message again and again, because the result will be the same.
> >> >
> >> > I will send Richard some attachments.
> >> > Not sure whether you will need them,
> >> > they look very similar to the ones we already have.
> >> >
> >> > BTW, we should definitely check the use
> >> > of scala.xml.XML$.loadString(java.lang.String).
> >> > It's creating a new Parser each time, which is a bit costly because it
> >> > allocates a new Buffer each time and also hits the disk when searching for
> >> > the name of the Java class.
> >> >
> >> > Greetings,
> >> > Markus
> >> >
> >> >
> >> >
> >> > "The best way to predict the future is to invent it" -- Alan Kay
> >> >
> >>
> >>
> >>
> >> --
> >> Lift, the simply functional web framework http://liftweb.net
> >> Beginning Scala http://www.apress.com/book/view/1430219890
> >> Follow me: http://twitter.com/dpp
> >> Surf the harmonics
> >>
> >
>
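
The point made in the original message, that a message should not need to be re-rendered again and again because the result will be the same, can be illustrated with a small render cache. This is only a minimal sketch under stated assumptions: `Msg`, `RenderCache`, and the render function passed in are hypothetical names invented for illustration, not ESME or Lift classes, and nothing here reflects how ESME's message cache or `PublicTimeline.render()` is actually implemented.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for a message; not the real ESME model class.
final case class Msg(id: Long, body: String)

// Caches the rendered form of each message by id, so the (assumed expensive)
// render function is not re-run for every user timeline that shows the message.
// Under heavy contention the render function may run more than once for the
// same id, but every caller still sees a consistent cached value.
final class RenderCache(render: Msg => String) {
  private val cache = TrieMap.empty[Long, String]

  def rendered(m: Msg): String =
    cache.getOrElseUpdate(m.id, render(m))
}

object RenderCacheDemo {
  def main(args: Array[String]): Unit = {
    val renderCalls = new AtomicInteger(0)

    // Stand-in for an expensive Textile-to-XHTML conversion.
    val cache = new RenderCache(m => {
      renderCalls.incrementAndGet()
      "<p>" + m.body + "</p>"
    })

    val msg = Msg(1L, "hello")

    // Simulate 300 per-user timelines asking for the same message.
    val outputs = (1 to 300).map(_ => cache.rendered(msg))

    println(s"render calls: ${renderCalls.get}, distinct outputs: ${outputs.distinct.size}")
  }
}
```

Whether a cache like this belongs at the message level or the timeline level depends on where the profiler shows the allocations; the sketch only shows the general shape of rendering once and reusing the result.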
