Re: Further analysis of the GC issue

Richard Hirsch Thu, 26 Nov 2009 20:47:50 -0800

Moved this whole thread to the wiki:
http://cwiki.apache.org/confluence/display/ESME/Performance+test+2009-11-25


D.

On Thu, Nov 26, 2009 at 2:22 PM, Markus Kohler <[email protected]> wrote:
> Hi Michael,
> No problem :-)
>
>
>
> Regards,
> Markus
>
> "The best way to predict the future is to invent it" -- Alan Kay
>
>
> On Thu, Nov 26, 2009 at 2:12 PM, Bechauf, Michael
> <[email protected]> wrote:
>
>> Thanks Markus. That certainly sounds much better. I was confused
>> yesterday already because 23 GByte memory would be a little difficult to
>> create when not even the operating system can handle such size. I should
>> have asked right away. Blame it on jetlag.
>>
>> -Michael
>>
>> -----Original Message-----
>> From: Markus Kohler [mailto:[email protected]]
>> Sent: Thursday, Nov 26, 2009 1:04 AM
>> To: [email protected]
>> Subject: Re: Further analysis of the GC issue
>>
>> Hi Michael,
>> Good to see you here!
>>
>> "Memory Analyzer"? that's me ;-)
>>
>> The 23 GByte are not "retained" at one point in time; they are the sum
>> of all temporarily allocated objects. Most of that memory (or all of
>> it, as there doesn't seem to be an obvious memory leak) is gone within
>> a millisecond.
>> I'm confident that this value can be decreased to 90 MByte and can be
>> further improved down to a few MByte (or even less). We already know
>> that the 90 MByte are mostly caused by an inefficient textile parser.
>>
>> I also used the Memory Analyzer to look at how much memory is retained,
>> i.e. still in use/referenced after the user interaction has finished.
>> The report is here:
>> http://cwiki.apache.org/confluence/display/ESME/Performance+test+-+2009-11-22
>> There is also room for improvement there, potentially caused by the
>> same bug that turned 90 MByte into 23 GByte, but I don't see any major
>> issues yet with regard to memory usage.
>>
>> This is also related to the stateless versus stateful discussion. ATM
>> the amount of state needed for one user is already very low (a few
>> hundred kByte), at least compared to what I'm used to with Enterprise
>> Applications. It is at least an order of magnitude lower, which can
>> only partially be explained by ESME being less complex than the typical
>> Enterprise app.
>> So far I don't see any major roadblock from the design perspective that
>> would stop us from scaling very well.
>>
>> In my experience, it's quite normal that as soon as someone with a
>> little bit of experience in performance takes a closer look at a piece
>> of software, a few dramatic improvements can be made. That makes
>> working as a performance analysis expert so gratifying. You suggest a
>> few improvements, which have a dramatic impact, and then you walk away
>> before it gets too complicated ;-)
>> No, that's not my intention here :-)
>>
>>
>> Markus
>>
>> "The best way to predict the future is to invent it" -- Alan Kay
>>
>>
>> On Thu, Nov 26, 2009 at 6:04 AM, Bechauf, Michael
>> <[email protected]> wrote:
>>
>> > David,
>> >
>> > well, "dead wrong" is a strong expression; hopefully I'm still
>> > breathing. I don't want to judge without having looked at the code
>> > myself, but I have no idea how a massive multi-user system could
>> > possibly be designed with state where per-user information is kept in
>> > memory for a certain time. I mean, 23 GB allocated - that's tough for
>> > an SAP transaction server that is not multithreaded and where the
>> > memory management is highly optimized based on shared memory that the
>> > work processes can attach to, or rolled out to a file if unused for a
>> > while. It is, however, deadly for a VM that was never designed for
>> > such memory consumption and where a GC run can halt the server.
>> >
>> > Anyway, I'll study this a bit more, particularly the Scala
>> > architecture. I heard many good things about Scala, but in the end
>> > it's all translated to things a VM can understand, and I hope Scala
>> > does a good enough job managing this load in a transparent way.
>> >
>> > -Michael
>> >
>> >
>> > ----- Original Message -----
>> > From: David Pollak <[email protected]>
>> > To: [email protected] <[email protected]>
>> > Sent: Wed Nov 25 23:00:20 2009
>> > Subject: Re: Further analysis of the GC issue
>> >
>> > On Wed, Nov 25, 2009 at 7:16 PM, Bechauf, Michael
>> > <[email protected]> wrote:
>> >
>> > > Wasn't this exactly the kind of stuff that the Eclipse Memory
>> > > Analyzer - donated by SAP - was supposed to fix? A heap of that size
>> > > for a still moderate number of 300 users is crazy, so either there
>> > > is stuff like circular references that hog memory, or the design
>> > > model is fundamentally flawed. I don't understand why ESME needs
>> > > "sessions". How can a scalable server be created if each user will
>> > > allocate memory until some timeout? In a world of stateless
>> > > browser-based UIs that's not going to work.
>> > >
>> >
>> > You're actually dead wrong about this.  "Stateless" is not... it's
>> > just pushing state and cache someplace else (the RDBMS, memcached,
>> > etc.).  "Stateless" will lead to radical performance problems.
>> > "Stateless" merely moves the caching decisions into code you don't
>> > control.  I dealt with this issue first-hand while helping a popular
>> > micro-blogging site migrate from a "stateless" to a Scala-based
>> > backend.  I'm dealing with this issue first-hand helping another
>> > popular site that's experiencing exponential growth migrate away from
>> > "push everything back to the RDBMS and hope for the best."
>> >
>> > My original design for ESME is stateful.  My original design for ESME
>> > is based on lessons learned in this very space and was oriented to
>> > have things intelligently cached so that the caching is not based on
>> > RDBMS indexes.  I'm not sure what happened to cause the particular
>> > issues, but it seems like folks are loading messages from the RDBMS
>> > rather than asking the message cache for them.
>> >
>> >
>> > >
>> > > Time for me to look at that code ...
>> > >
>> > > -Michael
>> > >
>> > >
>> > > ----- Original Message -----
>> > > From: Markus Kohler <[email protected]>
>> > > To: [email protected] <[email protected]>
>> > > Sent: Wed Nov 25 12:14:58 2009
>> > > Subject: Further analysis of the GC issue
>> > >
>> > > Hi all,
>> > > the Garbage Collector issue I was talking about is reproducible.
>> > > I've uploaded an annotated GC graph to
>> > >
>> > > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
>> > >
>> > > I think the "LOGON" phase where I log on all the 300 users looks OK
>> > > (given that textile formatting is probably involved), but the phase
>> > > where just one user sends one message is certainly not looking good.
>> > >
>> > > I took the profiler and the result is a bit shocking. For that one
>> > > message, 881,000,000 objects weighing 23.2 GByte were allocated (and
>> > > reclaimed afterwards). My former record was 2 GByte ;-)
>> > >
>> > > Fortunately I have a theory about what happens, without having
>> > > looked into the code yet, so take it with a grain of salt. It seems
>> > > that the public timeline for all users is re-rendered, because 99%
>> > > of the allocations come from
>> > > org.apache.esme.comet.PublicTimeline.render(). I guess all the
>> > > actors for all the users are sitting there, not knowing that the
>> > > user has closed the browser, because the user session has not yet
>> > > expired.
>> > >
>> > > I wonder how we get around this with a real "push" model. If the
>> > > browser asked for updates, this rendering could be done lazily. Or
>> > > can we "ping" the browser and check whether it responds?
>> > > On the other hand, it should also not be necessary to re-render the
>> > > message again and again, because the result will be the same.
>> > >
>> > > I will send Richard some attachments. Not sure whether you will need
>> > > them; they look very similar to the ones we already have.
>> > >
>> > > BTW, we should definitely check the use of
>> > > scala.xml.XML$.loadString(java.lang.String).
>> > > It's creating a new Parser each time, which is a bit costly because
>> > > it allocates a new Buffer each time and also hits the disk when
>> > > searching for the name of the Java class.
>> > >
>> > > Greetings,
>> > > Markus
>> > >
>> > >
>> > >
>> > > "The best way to predict the future is to invent it" -- Alan Kay
>> > >
>> >
>> >
>> >
>> > --
>> > Lift, the simply functional web framework http://liftweb.net
>> > Beginning Scala http://www.apress.com/book/view/1430219890
>> > Follow me: http://twitter.com/dpp
>> > Surf the harmonics
>> >
>>
>
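
Markus's remark in the original message, that it should not be necessary
to re-render the same message again and again because the result will be
the same, can be illustrated with a small cache. This is only a minimal
Java sketch under stated assumptions, not ESME code: RenderCacheSketch,
the renderer function and the message id are hypothetical names, and the
lambda merely stands in for the expensive textile rendering.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;

/** Hypothetical sketch: render each message once, then serve the cached markup. */
final class RenderCacheSketch {
    // Rendered markup keyed by message id.
    private final Map<Long, String> rendered = new ConcurrentHashMap<>();
    // Counts how often the expensive renderer actually ran.
    private final AtomicInteger renderCalls = new AtomicInteger();
    // Stand-in for the real (expensive) rendering step; an assumption of this sketch.
    private final LongFunction<String> renderer;

    RenderCacheSketch(LongFunction<String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the cached markup, rendering only on the first request for an id. */
    String render(long messageId) {
        return rendered.computeIfAbsent(messageId, id -> {
            renderCalls.incrementAndGet();
            return renderer.apply(id);
        });
    }

    int renderCalls() {
        return renderCalls.get();
    }

    public static void main(String[] args) {
        RenderCacheSketch cache = new RenderCacheSketch(id -> "<p>message " + id + "</p>");
        // 300 timelines ask for the same message; the renderer runs once.
        for (int user = 0; user < 300; user++) {
            cache.render(42L);
        }
        System.out.println("renders performed: " + cache.renderCalls());
    }
}
```

A real cache would also need an eviction policy so that old messages do
not accumulate; that is left out here to keep the idea visible.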

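Markus also notes that scala.xml.XML$.loadString creates a new parser on
every call. I have not checked the Scala library source, so the following
is only a sketch of the general remedy in plain Java (JAXP), assuming the
cost comes from looking up the parser factory and building a fresh parser
each time. ParserReuseSketch and countElements are hypothetical names
chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: look up the factory once and reuse one parser per thread. */
final class ParserReuseSketch {
    // The factory lookup (which may search the classpath) happens once per JVM.
    private static final SAXParserFactory FACTORY = SAXParserFactory.newInstance();

    // SAXParser instances are not safe for concurrent use, so keep one per thread.
    private static final ThreadLocal<SAXParser> PARSER = ThreadLocal.withInitial(() -> {
        synchronized (FACTORY) { // the factory itself is not guaranteed thread-safe
            try {
                return FACTORY.newSAXParser();
            } catch (Exception e) {
                throw new IllegalStateException("cannot create SAX parser", e);
            }
        }
    });

    /** Parses the string with the thread's reused parser and counts start tags. */
    static int countElements(String xml) {
        int[] count = {0};
        try {
            PARSER.get().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attributes) {
                    count[0]++;
                }
            });
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Both calls run on the same thread and therefore share one parser.
        System.out.println(countElements("<msg><b>hello</b> world</msg>"));
        System.out.println(countElements("<msg/>"));
    }
}
```

Whether this helps in ESME depends on what the profiler shows after the
timeline re-rendering itself is fixed; it is a candidate, not a verified
improvement.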