> I'm still trying to get the tests against my giant (1000 namespace)
> corpus running with the perf tool. I suspect we'll find more of those
> more-than-twice-as-slow pages in a small number of cases.
>
> The real question to me is how we feel about those cases. If 98% of
> the pages are "generally as fast as 1.8" but 2% are so slow as to be
> "unusable" (because they are >10x slower), what do we think about
> that? My view, of course, is that those are a pretty serious problem:
> if people have them, then they're probably part of the all-up solution
> for them, and if some of it stops working, then some of it stops
> working. It's a little bit like asking what we would think if 95% of
> the dialog boxes in Visual Studio were usable but 5% were super
> slow/unusable.
I have to say I generally agree with this assessment, although I'm not sure Visual Studio should be our quality metric... and yes, I realize there are two meanings in that statement: I meant it in both senses. :)

Here's my general thinking as to why we're where we are. When I added security, that was bound to slow things down a bit, because it does extra checking on just about every core operation, and there are generally several core operations per page. So that probably accounts for some of the slowness.

But I think the bigger deal is what I did to the WikiTalk engine. When I went to add security, the biggest motivator for the rearchitecture was the caching that was present. There was a fair amount of it, and it was deeply intertwined with other aspects of the code. Obviously, caching and security are related: I don't want to serve David a copy of a page that was rendered for Craig if David isn't supposed to see it. As a result, the pipelined architecture does this (more or less; there's a rough sketch at the end of this message):

NamespaceManager => Security => Caching => Property Parsing => Built-in topics => Filesystem Store

So we check permissions against what comes back from the cache, not the other way around, and that works well. The issue is that the whole content pipeline sits a full level below the WikiTalk stuff, which interacts with NamespaceManager and Federation, not with the content pipeline. So incorporating *content* caching into the WikiTalk engine would result in code that cuts across layers, which is exactly the mess I was trying to avoid in the first place.

One option is to add another layer of caching up at the web layer - specifically, *output* caching. We'd render a page once, and then, as long as nothing has changed, just spew it out again on subsequent requests. This would make just about every page way, way faster once it had been rendered the first time. The problem is that it's hard to get right: the caching itself is pretty easy, but the cache expiration is not. For example, a WikiTalk script that scans the whole namespace looking for "Summary" properties and displays them in a table can only be cached until anything at all in the namespace changes. And any change whatsoever to _ContentBaseDefinition invalidates everything else, in case there was a security change at the namespace level. (There's a sketch of this below, too.) I can see how to do all that, but it's not a small job. And I'm not sure it would really solve the problem of "unusably slow pages": infrequently accessed pages, or pages that simply fall out of the cache, will still take just as long to render as they do in the current code.

On top of that, we've got a *scripting engine* in the mix. Cache expiration based on changes to the content is all well and good when the content deterministically produces the output, but anything else can't be cached at all. If WikiTalk had the equivalent of DateTime.Now, you obviously couldn't reliably cache that. Now, maybe it's the case that WikiTalk output is determined entirely by the content. I don't know - does anyone? If so, it makes life easier.

But even if not, the other option we can look at is making the WikiTalk execution engine smarter. I believe this is what was in 1.8 - there were certainly a bunch of annotations on the WikiTalk code about what could and could not be cached. Presumably, that let the engine cache the results of particular operations, like an Array.Sort of the topics in a namespace.
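To make the pipeline layering concrete, here's a minimal sketch of the shape I'm describing. All the type names (IContentProvider, CachingProvider, and so on) are invented for the example - they're not necessarily what's in the actual tree:

    // Illustrative sketch only - names are made up, not from the FlexWiki source.
    using System;
    using System.Collections.Generic;

    // Each stage in the content pipeline implements the same interface
    // and delegates to the stage below it.
    public interface IContentProvider
    {
        string GetParsedTopic(string topic, string user);
    }

    // Bottom of the stack: reads topics from disk.
    public class FileSystemStore : IContentProvider
    {
        public string GetParsedTopic(string topic, string user)
        {
            return System.IO.File.ReadAllText(topic + ".wiki");
        }
    }

    // Caches whatever the layers below produce. Note that this sits
    // *below* security, so the cache holds unauthorized content...
    public class CachingProvider : IContentProvider
    {
        private readonly IContentProvider _next;
        private readonly Dictionary<string, string> _cache =
            new Dictionary<string, string>();

        public CachingProvider(IContentProvider next) { _next = next; }

        public string GetParsedTopic(string topic, string user)
        {
            string content;
            if (!_cache.TryGetValue(topic, out content))
            {
                content = _next.GetParsedTopic(topic, user);
                _cache[topic] = content;
            }
            return content;
        }
    }

    // ...and security sits *above* the cache, so permissions are checked
    // against what comes back from the cache, never the other way around.
    public class SecurityProvider : IContentProvider
    {
        private readonly IContentProvider _next;

        public SecurityProvider(IContentProvider next) { _next = next; }

        public string GetParsedTopic(string topic, string user)
        {
            string content = _next.GetParsedTopic(topic, user);
            if (!UserCanRead(user, topic))
                throw new UnauthorizedAccessException(topic);
            return content;
        }

        // Stand-in for the real permission check.
        private bool UserCanRead(string user, string topic) { return true; }
    }

    // Wiring it up the way the pipeline stacks:
    // var provider = new SecurityProvider(
    //     new CachingProvider(new FileSystemStore()));

The point of stacking it this way is that each concern stays in its own layer - which is exactly why content caching can't easily reach up into WikiTalk without cutting across layers.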
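And here's roughly what I have in mind for the output-caching option. Again, the names are invented; the interesting part is the invalidation rules, which are exactly the part that's easy to get wrong:

    // Rough sketch of web-layer output caching with the expiration rules
    // described above. Names are invented for the example.
    using System.Collections.Generic;

    public class OutputCache
    {
        // Keyed on (namespace, topic, user) - pages are cached per user
        // because the same topic can render differently under different
        // permissions.
        private readonly Dictionary<string, string> _cache =
            new Dictionary<string, string>();

        private static string Key(string ns, string topic, string user)
        {
            return ns + "/" + topic + "/" + user;
        }

        public bool TryGet(string ns, string topic, string user, out string html)
        {
            return _cache.TryGetValue(Key(ns, topic, user), out html);
        }

        public void Put(string ns, string topic, string user, string html)
        {
            _cache[Key(ns, topic, user)] = html;
        }

        // Because a WikiTalk script on any page may scan the whole
        // namespace (e.g. collecting "Summary" properties), the only safe
        // rule on any change is to drop every cached page in that namespace.
        public void InvalidateNamespace(string ns)
        {
            var stale = new List<string>();
            foreach (string key in _cache.Keys)
                if (key.StartsWith(ns + "/"))
                    stale.Add(key);
            foreach (string key in stale)
                _cache.Remove(key);
        }

        // A change to _ContentBaseDefinition may mean a security change at
        // the namespace level, so it has to invalidate everything.
        public void OnTopicChanged(string ns, string topic)
        {
            if (topic == "_ContentBaseDefinition")
                _cache.Clear();
            else
                InvalidateNamespace(ns);
        }
    }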
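Finally, for the "smarter execution engine" option, here's my guess at the general shape of the 1.8 annotation scheme - and I want to stress it's a guess, since as I say below, I never really understood that code:

    // Sketch of annotation-driven caching of WikiTalk operation results.
    // This is a guess at how the 1.8 scheme might have worked, not a
    // description of the actual code.
    using System;
    using System.Collections.Generic;

    // Marks an operation whose result depends only on wiki content, so the
    // engine can cache it until the relevant content changes.
    [AttributeUsage(AttributeTargets.Method)]
    public class CacheableAttribute : Attribute { }

    // Marks an operation whose result can change between evaluations (a
    // DateTime.Now equivalent); the engine must never cache it, and any
    // expression computed from it is uncacheable too.
    [AttributeUsage(AttributeTargets.Method)]
    public class VolatileResultAttribute : Attribute { }

    public static class WikiTalkOps
    {
        [Cacheable]
        public static List<string> SortedTopics(List<string> topics)
        {
            // The Array.Sort-of-the-namespace case: deterministic given
            // the content, so safe to cache.
            var sorted = new List<string>(topics);
            sorted.Sort();
            return sorted;
        }

        [VolatileResult]
        public static DateTime Now()
        {
            // Poisons the cacheability of anything it feeds into.
            return DateTime.Now;
        }
    }

    // The engine would consult the annotations at evaluation time: cache
    // the results of [Cacheable] operations keyed on their inputs, and mark
    // any expression containing a [VolatileResult] call as uncacheable.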
I never really understood how that annotation scheme worked, which didn't much matter at the time, since it had to go to accommodate the new architecture. But maybe it could be added back in. David - perhaps you could fill me in a bit on how it worked? In the meantime, I'll take a look at the old code and see if I can figure it out myself. I don't really have any other ideas at this point: my attempts at profiling the existing bottlenecks with the free tools I have available have failed.