Re: Further analysis of the GC issue

Markus Kohler Wed, 25 Nov 2009 10:11:23 -0800

BTW, you don't need 10 users; 2 or 3 would be enough.
You just have to take care that the session does not expire.
The issue disappears if you wait long enough; I have already verified that.



Regards,
Markus

"The best way to predict the future is to invent it" -- Alan Kay


On Wed, Nov 25, 2009 at 6:35 PM, David Pollak <[email protected]> wrote:

> Markus,
>
> I'll have to look into how the code is currently implemented.  The original
> design called for a message to hang out for quite a while (not be reloaded
> from RDBMS for each user) and lazily render different pieces of itself.
> Put another way, for each message, no matter where it's viewed in the system,
> there should be a single, cached instance of the message that contains the
> rendered Textile markup, etc.
>
> In other words, a message that appears in 300 timelines should be the same
> message instance and should have its complex values calculated once.  Sure,
> if you go back and look at a message that's a week old, it'll have fallen
> out of memory and need to be re-rendered, but in the normal case, a message
> that's viewed by 300 people as part of their timeline or the public timeline
> should only be materialized/rendered once.
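
A minimal sketch of the "render once, share everywhere" idea described
above. It assumes nothing about the real ESME classes: CachedMessage and
MessageCache are hypothetical names, the render function is passed in
instead of being the real Textile renderer, and eviction of old messages
is left out.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a message; not the real ESME model class.
final class CachedMessage(val id: Long, val body: String, render: String => String) {
  // lazy val: computed on first access, then reused by every timeline.
  lazy val rendered: String = render(body)
}

// Hypothetical cache holding one shared instance per message id.
// No eviction here; a real cache would let old messages fall out.
object MessageCache {
  private val cache = mutable.Map.empty[Long, CachedMessage]

  // Returns the cached instance, calling load only on a cache miss.
  def get(id: Long)(load: Long => CachedMessage): CachedMessage =
    synchronized { cache.getOrElseUpdate(id, load(id)) }
}
```

With something like this, 300 timelines asking for the same id would share
one instance, and the render function would run once for that message.
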
>
> Can you point me to a place (maybe even a VMware instance) where I can
> reproduce your tests?  I'd love to cycle on making 23GB -> 100MB (1/300th
> the size).
>
> Thanks,
>
> David
>
> On Wed, Nov 25, 2009 at 9:14 AM, Markus Kohler <[email protected]> wrote:
>
> > Hi all,
> > the Garbage Collector issue I was talking about is reproducible.
> > I've uploaded an annotated GC graph to
> >
> > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
> >
> > I think the "LOGON" phase, where I log on all 300 users, looks OK (given
> > that Textile formatting is probably involved), but the phase where just
> > one user sends one message certainly does not look good.
> >
> > I ran the profiler and the result is a bit shocking. For that one
> > message, 881,000,000 objects weighing 23.2 GB were allocated (and
> > reclaimed afterwards). My former record was 2 GB ;-)
> >
> > Fortunately I have a theory about what happens, though I have not
> > looked into the code yet, so take it with a grain of salt. It seems that
> > the public timeline for all users is re-rendered, because 99% of the
> > allocations come from org.apache.esme.comet.PublicTimeline.render().
> > I guess the actors for all the users are still sitting there, not knowing
> > that the user has closed the browser, because the user session has not
> > yet expired.
> >
> > I wonder how we get around this with a real "push" model. If the browser
> > asked for updates, this rendering could be done lazily. Or can we "ping"
> > the browser and check whether it responds?
> > On the other hand, it should not be necessary to re-render the message
> > again and again, because the result will be the same.
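
One way the lazy variant could look, as a sketch only: LazyTimeline is a
hypothetical class, not the PublicTimeline actor, and renderAll stands in
for whatever produces the markup. A new message only invalidates the
cached markup; rendering happens when a client actually asks.

```scala
// Hypothetical per-user timeline view; not ESME's PublicTimeline.
final class LazyTimeline(renderAll: Seq[String] => String) {
  private var messages = Vector.empty[String]
  private var cached: Option[String] = None

  // A new message only drops the cached markup; nothing is rendered here.
  def push(msg: String): Unit = synchronized {
    messages = msg +: messages // newest first
    cached = None
  }

  // Rendering happens only when a client asks for the timeline,
  // and the result is reused until the next push.
  def poll(): String = synchronized {
    cached.getOrElse {
      val html = renderAll(messages)
      cached = Some(html)
      html
    }
  }
}
```

A timeline whose browser has gone away never calls poll(), so it would
cost one cheap push per message instead of a full render.
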
> >
> > I will send Richard some attachments. I'm not sure whether you will need
> > them; they look very similar to the ones we already have.
> >
> > BTW, we should definitely check the use of
> > scala.xml.XML$.loadString(java.lang.String).
> > It creates a new parser each time, which is a bit costly because it
> > allocates a new buffer each time and also hits the disk when searching
> > for the name of the Java class.
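
A sketch of how the per-call parser setup might be avoided, assuming the
cost comes from building a new JAXP factory and parser on every call. It
uses only the JDK's javax.xml.parsers API; ReusableSax and countElements
are hypothetical names, not part of scala.xml or Lift. Whether the
scala.xml version in use offers a hook for supplying such a parser (for
example XML.withSAXParser) would need checking.

```scala
import java.io.StringReader
import javax.xml.parsers.{SAXParser, SAXParserFactory}
import org.xml.sax.{Attributes, InputSource}
import org.xml.sax.helpers.DefaultHandler

// Hypothetical helper; not part of scala.xml or Lift.
object ReusableSax {
  // Look up the factory once instead of on every parse.
  private val factory: SAXParserFactory = SAXParserFactory.newInstance()

  // SAXParser is not thread-safe, so keep one instance per thread.
  private val parsers = new ThreadLocal[SAXParser] {
    override def initialValue(): SAXParser =
      factory.synchronized { factory.newSAXParser() }
  }

  // The calling thread's parser; created on first use, then reused.
  def parser: SAXParser = parsers.get()

  // Example use: count the elements in an XML string.
  def countElements(xml: String): Int = {
    var count = 0
    val handler = new DefaultHandler {
      override def startElement(uri: String, localName: String,
                                qName: String, attrs: Attributes): Unit =
        count += 1
    }
    parser.parse(new InputSource(new StringReader(xml)), handler)
    count
  }
}
```
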
> >
> > Greetings,
> > Markus
> >
> >
> >
> > "The best way to predict the future is to invent it" -- Alan Kay
> >
>
>
>
> --
> Lift, the simply functional web framework http://liftweb.net
> Beginning Scala http://www.apress.com/book/view/1430219890
> Follow me: http://twitter.com/dpp
> Surf the harmonics
>
