Hi David,

The nifty graphs are easy to produce. The old-school way I used is the -Xloggc:<file> option (http://www.tagtraum.com/gcviewer-vmflags.html#sun.loggc), then loading the resulting file into GCViewer (http://www.tagtraum.com/gcviewer.html). An alternative for Java 1.6 is VisualVM (https://visualvm.dev.java.net/). It can do more, even memory allocation tracing, I think.
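If a quick number is enough and no graph is needed, the JVM's cumulative GC counters can also be read from inside the process. This is only a rough sketch and a different technique from the -Xloggc log: it uses the standard java.lang.management beans, the class name `GcSnapshot` and the allocation loop are made up for illustration, and the loop merely stands in for the phase being measured.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reads cumulative GC counters from the running JVM. */
class GcSnapshot {

    /** Total collections across all collectors; "undefined" (-1) values count as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Stand-in for the phase under test: allocate short-lived garbage.
        long allocated = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            allocated += junk.length;
        }

        System.out.println("requested bytes: " + allocated);
        System.out.println("collections during phase: " + (totalCollections() - countBefore));
        System.out.println("GC time during phase (ms): " + (totalCollectionMillis() - millisBefore));
    }
}
```

The counters only give totals per phase, so the GC log plus GCViewer is still the better choice when the shape of the curve matters.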
YourKit is good (the developers are amazing), though it always did a heap dump at the end of the memory allocation phase, which is not always needed.

I will try to package the Selenium tests, but I will probably not have enough time today. I also don't want to check in any sources yet, because permission from my company is still pending.

Markus

"The best way to predict the future is to invent it" -- Alan Kay

On Wed, Nov 25, 2009 at 7:51 PM, David Pollak <[email protected]> wrote:
> On Wed, Nov 25, 2009 at 9:58 AM, Markus Kohler <[email protected]> wrote:
> > Hi David,
> > see my comments below...
> >
> > On Wed, Nov 25, 2009 at 6:35 PM, David Pollak <[email protected]> wrote:
> > > Markus,
> > >
> > > I'll have to look into how the code is currently implemented. The original design called for a message to hang out for quite a while (not be reloaded from RDBMS for each user) and lazily render different pieces of itself. Put another way, for each message, no matter where it's viewed in the system, there should be a single, cached instance of the message that contains the rendered Textile markup, etc.
> > >
> > > Put another way, a message that appears in 300 timelines should be the same message instance and should have its complex values calculated once. Sure, if you go back and look at a message that's a week old, it'll have fallen out of memory and need to be re-rendered, but in the normal case, a message that's viewed by 300 people as part of their timeline or the public timeline should only be materialized/rendered once.
> >
> > Yes, that's how it should be done.
> >
> > > Can you point me to a place (maybe even a VMware instance) where I can reproduce your tests? I'd love to cycle on making 23GB -> 100MB (1/300th the size).
> >
> > I see 2 options (I guess you need to debug):
> >
> > 1. You do it manually. I think it would be enough to have, say, 10 users. Log on a user with the browser, clear the browser cache (no logout), then log on the next user, and so on. Then send a message with the last user.
> >
> > 2. I pack the Selenium tests into one fat jar file and you run it locally on your machine. I could also provide you my esme_db directory with the 300 users, though I'm not sure whether it would work.
>
> Having the Selenium tests would be great.
>
> I can use YourKit to analyze the memory usage, but I'm not likely to get the nifty graphs you've been showing us.
>
> > Which option do you prefer?
> >
> > Regards,
> > Markus
> >
> > > Thanks,
> > >
> > > David
> > >
> > > On Wed, Nov 25, 2009 at 9:14 AM, Markus Kohler <[email protected]> wrote:
> > > > Hi all,
> > > > the Garbage Collector issue I was talking about is reproducible. I've uploaded an annotated GC graph to
> > > > http://picasaweb.google.com/lh/photo/wB-RRtb0wIVfpxJkTJPNuw?authkey=Gv1sRgCOve7LThpfvXsQE&feat=directlink
> > > >
> > > > I think the "LOGON" phase, where I log on all 300 users, looks OK (given that Textile formatting is probably involved), but the phase where just one user sends one message is certainly not looking good.
> > > >
> > > > I took the profiler and the result is a bit shocking. For that one message, 881,000,000 objects weighing 23.2 GB were allocated (and reclaimed afterwards). My former record was 2 GB ;-)
> > > >
> > > > Fortunately I have a theory about what happens, without having looked into the code yet, so take it with a grain of salt. It seems that the public timeline for all users is re-rendered, because 99% of the allocations come from org.apache.esme.comet.PublicTimeline.render(). I guess all the actors for all the users are sitting there, not knowing that the user has closed the browser, because the user session has not yet expired.
> > > >
> > > > I wonder how we get around this with a real "push" model. If the browser asked for updates, this rendering could be done lazily. Or can we "ping" the browser and check whether it responds?
> > > > On the other hand, it should also not be necessary to re-render the message again and again, because the result will be the same.
> > > >
> > > > I will send Richard some attachments. Not sure whether you will need them; they look very similar to the ones we already have.
> > > >
> > > > BTW, we should definitely check the use of scala.xml.XML$.loadString(java.lang.String). It's creating a new Parser each time, which is a bit costly because it allocates a new Buffer each time and also hits the disk when searching for the name of the Java class.
> > > >
> > > > Greetings,
> > > > Markus
> > >
> > > --
> > > Lift, the simply functional web framework http://liftweb.net
> > > Beginning Scala http://www.apress.com/book/view/1430219890
> > > Follow me: http://twitter.com/dpp
> > > Surf the harmonics
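For what it's worth, the render-once behaviour discussed in the thread (one cached rendering per message, shared by every timeline that shows it) could look roughly like the sketch below. It is written in Java rather than ESME's Scala and is not the project's actual code: `RenderedMessageCache`, its methods and the stand-in renderer are all hypothetical. The cache here is also unbounded, whereas the design described above lets old messages fall out of memory, so a real version would need some eviction.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache: renders each message at most once, keyed by message id. */
class RenderedMessageCache {
    private final ConcurrentMap<Long, String> rendered = new ConcurrentHashMap<>();
    private final AtomicInteger renderCalls = new AtomicInteger();
    private final Function<String, String> renderer;

    RenderedMessageCache(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the cached markup, running the (expensive) renderer only on first use. */
    String render(long messageId, String source) {
        return rendered.computeIfAbsent(messageId, id -> {
            renderCalls.incrementAndGet();
            return renderer.apply(source);
        });
    }

    /** Number of times the renderer actually ran. */
    int renderCalls() {
        return renderCalls.get();
    }

    public static void main(String[] args) {
        // Stand-in for the Textile renderer; the real one is far more expensive.
        RenderedMessageCache cache = new RenderedMessageCache(s -> "<p>" + s + "</p>");

        // 300 timelines all display the same message.
        for (int timeline = 0; timeline < 300; timeline++) {
            cache.render(42L, "hello");
        }

        System.out.println("render calls: " + cache.renderCalls()); // prints "render calls: 1"
    }
}
```

With something like this in place, the cost of one new message would scale with the number of distinct messages rather than with the number of logged-on users.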
