Regarding the public timeline, it is on a different page on the new UI, so this might solve some of our problems.
D.

On Thu, Nov 26, 2009 at 3:41 PM, Vassil Dichev <[email protected]> wrote:
>> We're not looking at the root cause of the problem. The Textile stuff is a
>> hit if we run it on each message for each user. This is no different than
>> having an SQL query in the code that's a Cartesian product and throwing out
>> SQL because of it.
>>
>> Let's find out where and why we keep loading the same message from the RDBMS
>> rather than going to the message cache.
>>
>> Let's find out why we're hitting the RDBMS in general... there are
>> abstractions in the system (or at least were) that make RDBMS access a local
>> thing rather than a global thing.
>>
>> I'll have time on Monday to look at this, but running around chopping off
>> pieces of code and changing functionality isn't going to get us any closer
>> to solving the problem... it's just going to cause the problem to be
>> manifest elsewhere.
>
> I did not remove the Textile parser only because it potentially causes
> problems. I think it doesn't fit very well and it's a bit of an
> overkill. First of all, for messages headings, tables and paragraphs
> are not such a good fit conceptually.
>
> Second, some elements from MsgParser clash with the Textile parser
> ones. For instance, links to images cannot be parsed because MsgParser
> takes turn first and converts it to an URL element first.
>
> Third, the way parsing with Textile is done is inefficient currently
> anyway. I parse every separate text element. Since text can be
> separated by urls, tags and usernames, that means I could invoke the
> Textile parser several times per message. For instance, this message
> has 4 text elements => 4 Textile invocations:
>
> message with #tag and @username and http://blog.esme.us url in text
>
> Yes, if the performance analysis is correct, the Textile parser is not
> the cause of the problem. It might be easier to solve the problem
> without it. We even intended to include pluggable parser
> implementations some day.
>
> AFAICT, the problem was not that the RDBMS is queried every time
> (although that's how the PublicTimeline has worked from day 1 if I
> remember correctly). The problem, as explained by Markus, was that the
> message was formatted from the raw string every time it's accessed for
> rendering a timeline. The RDBMS was mentioned tangentially by Michael
> Bechauf (or someone else?). Markus, did I get this correctly?
>
> I still don't see how the message could be parsed several times, since
> digestedXHTML is lazy and so will be cached (this alone should make it
> *way* easier for Scala to write efficient implementations over Java).
>
> I want to profile the stacktrace where most strings are allocated.
> This should answer some questions.
>
> I also plan to remove rendering the public timeline on each user's
> timeline page. First of all because it's not cached, and second
> because it's not updated in real-time like the friends' timeline, but
> only after an explicit refresh of the browser. So the public timeline
> is not only slow, but might be confusing for the user, as they will
> expect it to work similarly to the personal timeline (as the layout is
> the same).
>
> Vassil
>
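
For reference, a minimal sketch of the "digestedXHTML is lazy and so will be cached" point from the quoted message. Only the name digestedXHTML comes from the thread; the Message class, the format helper and the parseCount counter are hypothetical stand-ins, not the actual ESME code. The sketch also shows that a Scala lazy val is cached per instance, so a message that is re-created (for example by reloading it from the RDBMS instead of taking it from the message cache) would be formatted again.

```scala
// Hypothetical stand-in for a message; not the real ESME class.
final class Message(val raw: String) {
  // Counts how often the expensive formatting step actually runs.
  var parseCount: Int = 0

  // Placeholder for the real parsing/formatting work (assumption).
  private def format(text: String): String = {
    parseCount += 1
    "<p>" + text + "</p>"
  }

  // Evaluated on first access; the result is then stored on this instance.
  lazy val digestedXHTML: String = format(raw)
}

object LazyDemo {
  def main(args: Array[String]): Unit = {
    val cached = new Message("message with #tag and @username")
    cached.digestedXHTML
    cached.digestedXHTML
    println(cached.parseCount) // 1: the second access reuses the stored value

    // A fresh instance for the same text has its own, still unevaluated, lazy val.
    val reloaded = new Message("message with #tag and @username")
    reloaded.digestedXHTML
    println(reloaded.parseCount) // 1 for this instance, so 2 formatting runs overall
  }
}
```

If the profile shows repeated formatting of the same message, one thing worth checking under this reading is whether the timeline renders from cached Message instances or from freshly loaded ones.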
