Re: Further analysis of the GC issue

David Pollak Wed, 25 Nov 2009 09:35:56 -0800

Markus,

I'll have to look into how the code is currently implemented.  The original
design called for a message to hang out for quite a while (not be reloaded
from the RDBMS for each user) and lazily render different pieces of itself.  Put
another way, for each message, no matter where it's viewed in the system,
there should be a single, cached instance of the message that contains the
rendered Textile markup, etc.


So a message that appears in 300 timelines should be the same
message instance and should have its complex values calculated once.  Sure,
if you go back and look at a message that's a week old, it'll have fallen
out of memory and need to be re-rendered, but in the normal case, a message
that's viewed by 300 people as part of their timeline or the public timeline
should only be materialized/rendered once.
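
A minimal sketch of that idea, assuming a hypothetical Message class and
MessageCache object (neither name is taken from the ESME code base, and the
"rendering" is only a placeholder for the real Textile step). It shows one
shared instance per message id whose rendered form is computed on first
access:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an ESME message; not the real class.
final class Message(val id: Long, val body: String) {
  // Computed on first access only, then reused by every timeline
  // that holds this instance.
  lazy val rendered: String = {
    Message.renderCount.incrementAndGet()
    "<p>" + body + "</p>" // placeholder for the real Textile rendering
  }
}

object Message {
  // Counts how often rendering actually ran (for illustration only).
  val renderCount = new AtomicInteger(0)
}

// Hypothetical cache: hands out one shared instance per message id.
object MessageCache {
  private val cache = new ConcurrentHashMap[Long, Message]()

  // `load` stands in for the RDBMS lookup and is called only on a miss.
  def find(id: Long)(load: Long => Message): Message = {
    val hit = cache.get(id)
    if (hit != null) hit
    else {
      val fresh = load(id)
      // If another thread stored an instance first, keep that one.
      val raced = cache.putIfAbsent(id, fresh)
      if (raced != null) raced else fresh
    }
  }
}
```

With this shape, 300 timelines calling MessageCache.find for the same id get
the same instance, and reading rendered on each of them runs the rendering
once. The sketch has no eviction, so old messages would never fall out of
memory; a real cache would need a size or age limit.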

Can you point me to a place (maybe even a VMware instance) where I can
reproduce your tests?  I'd love to iterate on getting 23GB down to about
100MB (roughly 1/300th the size).

Thanks,

David

On Wed, Nov 25, 2009 at 9:14 AM, Markus Kohler <[email protected]> wrote:

> Hi all,
> the Garbage Collector issue I was talking about is reproducible.
> I've uploaded an annotated GC graph to
>
> http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
>
> I think the "LOGON" phase where I log on all 300 users looks OK (given
> that Textile formatting is probably involved), but the phase where just one
> user sends one message is certainly not looking good.
>
> I took the profiler and the result is a bit shocking. For that one message,
> 881,000,000 objects weighing 23.2 GB were allocated (and reclaimed
> afterwards). My former record was 2 GB ;-)
>
> Fortunately I have a theory about what happens. I have not looked into the
> code yet, so take it with a grain of salt. It seems that the public timeline
> for all users is re-rendered, because 99% of the allocations come from
> org.apache.esme.comet.PublicTimeline.render(). I guess all the actors
> for all the users are sitting there, not knowing that the user has closed
> the browser, because the user session has not yet expired.
>
> I wonder how we get around this with a real "push" model. If the browser
> asked for updates, this rendering could be done lazily. Or can we "ping"
> the browser and check whether it responds?
> On the other hand, it should not be necessary to re-render the message
> again and again, because the result will be the same.
>
> I will send Richard some attachments. Not sure whether you will need them,
> they look very similar to the ones we already have.
>
> BTW, we should definitely check the use of
> scala.xml.XML$.loadString(java.lang.String).
> It creates a new parser on each call, which is a bit costly because it
> allocates a new buffer each time and also hits the disk when searching for
> the name of the Java class.
>
> Greetings,
> Markus
>
>
>
> "The best way to predict the future is to invent it" -- Alan Kay
>



-- 
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics
