I have never worked on a project before which can show a 5000x improvement in a matter of days :)
Dick:
I'm really curious to see how that impacts performance - maybe we should try another Stax test at some point...
<< as in next week? Or after Boston?
/Anne
On 29. nov. 2009, at 15.35, David Pollak wrote:
On Sun, Nov 29, 2009 at 5:03 AM, Vassil Dichev <[email protected]> wrote:
I haven't looked in detail into last night's profiling results, but it seems we are down to 4.6 Mbyte! That's a 5000x improvement!
I plan to document the details on Monday night. If there's time left, I will also start drafting a longer blog post. It would be great if Vassil would provide a short description/explanation of his changes.
OK, this is going to be embarrassing for me, but this is not actually an improvement; it is a return to the performance capabilities of ESME from several months ago.
Vassil,
I am sorry you feel embarrassed. While bugs happen, your effective code-to-bug ratio is quite excellent... and this is the kind of bug we like to have: easily identifiable (thanks, Markus, for your excellent tests) and easily fixable.
So, in the future, we definitely need more tests (both as part of the development process and as integration/performance tests). We also need to work together to address the results of the tests. Results of a single test should not be viewed as a repudiation of a design. Tests should be invitations either to fix a bug (as in this case) or, when a bug cannot be fixed without significant refactoring, to have a reasoned discussion of the merits and likely performance implications of another design.
I am happy that you found the root cause of the problem and that you fixed it quickly. That's what counts.
Thanks,
David
I'm not surprised that it was 5000x worse, because every time the public/friends' timeline was displayed for any user, every message was fetched from the database, converted to XML, and transformed into XHTML and JSON. Not only that, but every time a new message was received, it would force the timelines of all users who received the message to be re-rendered, which meant again reloading from the DB and doing the same XML acrobatics for all 20 messages of the 2 timelines, so 40 messages were processed for each user.
To top it off, when the Textile parser was activated, its overhead was multiplied 40 times per user, which for 300 users means 12,000 messages re-rendered, just because one user decided to send a message! Yes, this sounds horrible. David was indeed correct that the Textile parser itself was not the main culprit; it was just magnifying the effects of a more serious bug.
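To put rough numbers on that fan-out, here is a minimal, self-contained sketch of the re-rendering pattern described above. All names here are hypothetical stand-ins, not the actual ESME code; render() just uppercases where the real code did the XML/XHTML/JSON and Textile work.

case class Message(id: Long, text: String)

object NaiveFanOut {
  val messagesPerTimeline = 20
  val timelinesPerUser = 2 // public + friends

  // Stand-in for the DB fetch that happened on every timeline display.
  def loadFromDb(user: Long, timeline: Int): List[Message] =
    (1L to messagesPerTimeline).map(i => Message(i, "msg " + i)).toList

  // Stand-in for the per-message XML -> XHTML/JSON conversion and Textile parsing.
  def render(m: Message): String = m.text.toUpperCase

  // One new message forced both timelines of every recipient to be rebuilt.
  def onNewMessage(recipients: Seq[Long]): Int = {
    var rendered = 0
    for (user <- recipients; t <- 1 to timelinesPerUser)
      loadFromDb(user, t).foreach { m => render(m); rendered += 1 }
    rendered
  }

  def main(args: Array[String]): Unit =
    // 20 messages x 2 timelines x 300 recipients = 12000 renders per send
    println(onNewMessage(1L to 300L))
}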
The problem was in the Message.findMessages method. It is supposed to cache messages based on an LRU strategy. When I introduced access pools, messages had to be access-controlled not only when loading them from the DB, but also when serving them from the cache. So I discarded the messages from the temporary structure which had to be returned to the user. The discarded messages would go to the finder method, where the constructed query would make sure only messages from valid pools would be returned (inefficiency one). Furthermore, I allowed a bug where messages from the public pool would also always be discarded from the cache and fetched from the DB (inefficiency two). So stuff would work, but the cache wasn't used in practice.
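As a rough illustration of the fix direction, here is a sketch with my own names, a LinkedHashMap-based LRU, and a simplified pool model (not the actual findMessages code): the access-control filter runs over the cached copies, so only genuine cache misses fall through to the DB.

case class Message(id: Long, pool: Option[Long], text: String)

class MessageCache(capacity: Int) {
  // An access-ordered LinkedHashMap gives simple LRU eviction.
  private val lru = new java.util.LinkedHashMap[Long, Message](16, 0.75f, true) {
    override def removeEldestEntry(e: java.util.Map.Entry[Long, Message]): Boolean =
      size() > capacity
  }
  def get(id: Long): Option[Message] = Option(lru.get(id))
  def put(m: Message): Unit = lru.put(m.id, m)
}

object MessageFinder {
  val cache = new MessageCache(1000)
  var dbLoads = 0 // instrumentation for the test sketch further down

  // Instead of discarding cached messages that need a pool check (and all
  // public-pool messages), filter the cached copies themselves.
  def findMessages(ids: Seq[Long], validPools: Set[Long]): Seq[Message] = {
    val cached = ids.flatMap(id => cache.get(id)).toList
    val hitIds = cached.map(_.id).toSet
    val fromDb = loadFromDb(ids.filterNot(hitIds)) // only true misses hit the DB
    fromDb.foreach(cache.put)
    // pool = None models a public message; pooled ones must be in validPools
    (cached ++ fromDb).filter(m => m.pool.forall(validPools.contains))
  }

  def loadFromDb(ids: Seq[Long]): Seq[Message] = {
    dbLoads += ids.size
    ids.map(id => Message(id, None, "msg " + id)) // stand-in for the real query
  }
}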
In conclusion, this is just one more argument for keeping messages in memory instead of fetching them from the DB.
Another important conclusion is that performance tests are just as important as unit and integration tests, and they can uncover functional problems too, especially with caches.
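On that last point, a cache regression like this one does not even need a load-test rig to show up: a functional check that counts DB loads across a cold and a warm call would have caught it. A sketch against the hypothetical MessageFinder above:

object CacheRegressionCheck {
  def main(args: Array[String]): Unit = {
    val ids = (1L to 20L).toList
    MessageFinder.findMessages(ids, Set.empty[Long]) // cold call fills the cache
    val warmStart = MessageFinder.dbLoads
    MessageFinder.findMessages(ids, Set.empty[Long]) // warm call should not touch the DB
    assert(MessageFinder.dbLoads == warmStart,
      "cache bypassed: the warm call still hit the DB")
  }
}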
--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Surf the harmonics