On Sun, Nov 29, 2009 at 5:03 AM, Vassil Dichev <[email protected]> wrote:
> > I haven't looked in detail into last night's profiling results, but it seems we are down to 4.6 Mbyte! That's a 5000x improvement!
> >
> > I plan to document the details on Monday night. If there's time left I will also start with drafting a longer blog post. It would be great if Vassil would provide a short description/explanation of his changes.
>
> OK, this is going to be embarrassing for me, but this is not actually an improvement, but a return to the performance capabilities of ESME from several months ago.

Vassil, I am sorry you feel embarrassed. While bugs happen, your effective code-to-bug ratio is quite excellent... and this is the kind of bug we like to have: easily identifiable (thanks Markus for your excellent tests) and easily fixable.

So, in the future, we definitely need more tests (both as part of the development process and integration/performance tests). We also need to work together to address the results of the tests. Results of a single test should not be viewed as a repudiation of a design. Tests should be invitations either to fix a bug (as in this case) or, in the event that a bug cannot be fixed without significant refactoring, to have a reasoned discussion of the merits and likely performance implications of another design.

I am happy that you found the root cause of the problem and that you fixed it quickly. That's what counts.

Thanks,

David

> I'm not surprised that it was 5000x worse, because every time the public/friends' timeline was displayed for any user, every message was fetched from the database, converted to XML, and transformed into XHTML and JSON... Not only that, but every time a new message was received, it would force the timelines of all users receiving the message to be rerendered, which means reloading from the DB and the same XML acrobatics for all 20 messages in each of the 2 timelines, i.e. 40 messages processed for each user.
>
> To top it off, when the Textile parser was activated, its overhead was multiplied 40 times per user, which for 300 users means 12000 messages rerendered, just because one user decided to send a message! Yes, this sounds horrible. David was indeed correct that the Textile parser itself was not the main culprit, but was just magnifying the effects of a more serious bug.
>
> The problem was the Message.findMessages method. It is supposed to cache messages based on an LRU strategy. When I introduced access pools, messages had to be access-controlled not only when loading them from the DB, but also when serving them from the cache. So I discarded the messages from the temporary structure which was to be returned to the user. The discarded messages would then go through the finder method, whose query made sure only messages from valid pools were returned (inefficiency one). Furthermore, I introduced a bug where messages from the public pool would also always be discarded from the cache and fetched from the DB (inefficiency two). So everything still worked, but the cache wasn't used in practice.
>
> In conclusion, this is just one more argument for keeping messages in memory, instead of fetching them from the DB.
>
> Another important conclusion is that performance tests are just as important as unit and integration tests and can uncover functional problems too, especially with caches.
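To make the fix concrete, here is a minimal sketch of the intended behaviour: a pool-aware LRU cache where access control is applied to the in-memory copy, so only genuine cache misses fall through to the database. All names here (Msg, MessageCache, loadFromDb) are illustrative, not the actual ESME Message/Mapper API.

    import scala.collection.mutable

    case class Msg(id: Long, poolId: Long, text: String)

    class MessageCache(maxSize: Int, loadFromDb: Long => Option[Msg]) {
      // A LinkedHashMap keeps insertion order; re-inserting on every hit turns
      // it into a simple LRU structure (oldest entry = least recently used).
      private val cache = new mutable.LinkedHashMap[Long, Msg]

      def findMessages(ids: Seq[Long], visiblePools: Set[Long]): Seq[Msg] = synchronized {
        ids.flatMap { id =>
          // 1. Try the cache first and refresh the LRU order on a hit.
          val cached = cache.remove(id).map { m => cache.put(id, m); m }

          // 2. Only a genuine miss falls through to the database. (The buggy
          //    version also refetched pool-restricted and public-pool messages.)
          val msg = cached.orElse {
            loadFromDb(id).map { m =>
              cache.put(id, m)
              if (cache.size > maxSize) cache.remove(cache.head._1) // evict oldest
              m
            }
          }

          // 3. Pool-based access control is applied to the in-memory copy, so a
          //    cached message never needs a second DB round trip.
          msg.filter(m => visiblePools.contains(m.poolId))
        }
      }
    }

The point of the sketch is only that pool filtering happens on the cached objects rather than by discarding them and re-querying; it does not claim to match the real findMessages signature.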
--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics
