org.tartarus.snowball.ext.PorterStemmer comes from the Compass search. Maybe we can configure it so that it is not retained after use.
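
To make the idea concrete, here is a minimal Java sketch of what "not retained after usage" could look like if it has to be done in our own code rather than through configuration. HeavyStemmer and StemmedMessage are hypothetical names made up for illustration. HeavyStemmer is a toy stand-in with a toy stemming rule, not the real PorterStemmer or any Compass API, and how Compass actually wires the stemmer in is not covered here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a heavyweight stemmer such as
// org.tartarus.snowball.ext.PorterStemmer. NOT the real Snowball or Compass API.
final class HeavyStemmer {
    // Simulates roughly 2 KB of internal state (1024 chars * 2 bytes).
    private final char[] workBuffer = new char[1024];

    String stem(String word) {
        // Toy rule for illustration only: lower-case and strip one trailing "s".
        String w = word.toLowerCase(Locale.ROOT);
        return (w.length() > 3 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;
    }
}

// Hypothetical message class: stores the text and the computed stems,
// but holds no field that references the stemmer.
final class StemmedMessage {
    private final String text;
    private final List<String> stems;

    private StemmedMessage(String text, List<String> stems) {
        this.text = text;
        this.stems = stems;
    }

    static StemmedMessage of(String text) {
        // The stemmer is a local variable, so it becomes unreachable
        // (and collectable) as soon as this method returns.
        HeavyStemmer stemmer = new HeavyStemmer();
        List<String> stems = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                stems.add(stemmer.stem(word));
            }
        }
        return new StemmedMessage(text, Collections.unmodifiableList(stems));
    }

    String text() { return text; }

    List<String> stems() { return stems; }
}

final class StemOnceSketch {
    public static void main(String[] args) {
        StemmedMessage m = StemmedMessage.of("Messages take too much space");
        System.out.println(m.stems());
    }
}
```

The point is only that each message keeps the stemming result and not the object that produced it. Whether the same effect can be had by configuring Compass is still the open question.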
D.

On Tue, Dec 8, 2009 at 1:03 AM, Markus Kohler <[email protected]> wrote:
> Hi all,
> I've been busy otherwise, and therefore didn't find much time for ESME last
> week.
> I tried a few things with regard to performance.
> As you all noticed, the performance on the performance instance is currently
> excellent.
> I tried various approaches to measure it, but most failed due to the comet
> requests, which the tools I usually use don't like.
> The best I could get are some numbers from the Firebug Firefox plugin. It
> seems that the response time from entering a message until it appears in the
> user's timeline is around 350 ms, which is really excellent. It will be even
> harder to measure (using the browser) how long it takes for a message from
> one user to reach another user. I'm not sure how to do that yet. I tested
> manually sending messages from Chrome to Firefox and it's really fast.
>
> I also let one of the 300+x users send 1000 messages and did some heap
> dumps.
> I'm not yet fully through them, but it's already clear that messages take up
> too much space.
> Around 1400 messages need 9.3 million bytes, which means that on
> average one message needs more than 6 KB!
> OK, there were probably also a lot of relatively long update status messages,
> but I still think this is too much.
>
> The reason seems to be that the messages still retain a reference to the
> Stemmer (org.tartarus.snowball.ext.PorterStemmer), which alone takes 2 KB.
> Do we really need this Stemmer after we have run it?
>
> Another reason is that scala.xml.Elem is referenced in the toXML field. I
> guess this is the result of parsing XML. Not sure whether this is still
> needed after it's done, but storing DOM-like structures is for sure not
> memory efficient. originalXML looks similar.
>
> It would be important to get these numbers down, otherwise we will be killed
> by memory usage as soon as we get a lot of messages sent.
>
> I also asked on the Scala list about the loadXML function accessing the
> filesystem, but someone claimed this would not be the case in trunk and
> asked for the version. So maybe they can backport a fix for this.
> I seem to remember from some profiling that this function is still used.
>
> Haven't had any time to draft a blog post, but I hope I can start with that on
> Wednesday or Thursday.
>
> Regards,
> Markus
>
>
> "The best way to predict the future is to invent it" -- Alan Kay
