> I'm still trying to get the tests against my giant (1000 namespace)
> corpus running with the perf tool. I suspect we'll find more of those
> more-than-twice-as-slow pages in a small number of cases.
>
> The real question to me is how we feel about those cases. If 98% of
> the pages are "generally as fast as 1.8" but 2% are so slow as to be
> "unusable" (because they are >10x slower), what do we think about
> that? My view, of course, is that those are a pretty serious problem:
> if people have them, then they're probably part of the all-up solution
> for them, and if some of it stops working, then some of it stops
> working. It's a little bit like asking what we would think if 95% of
> the dialog boxes in Visual Studio were usable but 5% were super
> slow/unusable.
I have to say I generally agree with this assessment, although I'm not sure Visual Studio should be our quality metric... and yes, I realize there are two meanings in that statement: I meant it in both senses. :)

Here's my general thinking as to why we're where we are. When I added security, that was bound to slow things down a bit, because it does extra checking on just about every core operation, and there are generally several core operations per page. So that probably accounts for some of the slowness.

But I think the bigger deal is what I did to the WikiTalk engine. When I went to add security, the biggest motivator for the rearchitecture was the caching that was present. There was a fair amount of it, and it was deeply intertwined with other aspects of the code. Obviously, caching and security are related: I don't want to serve David a copy of a page that was rendered for Craig if David isn't supposed to see it. As a result, the pipelined architecture does this (more or less; there's a rough sketch at the end of this message):

NamespaceManager => Security => Caching => Property Parsing => Built-in topics => Filesystem Store

So we check permissions against what comes back from the cache, not the other way around, and that works well. The issue is that the whole content pipeline sits a full level below the WikiTalk stuff, which interacts with NamespaceManager and Federation, not with the content pipeline. So incorporating *content* caching into the WikiTalk engine would result in code that cuts across layers, which is exactly the mess I was trying to avoid in the first place.

One option is to add another layer of caching up at the web layer - specifically, *output* caching. We'd render a page once, and then, as long as nothing has changed, just spew it out again on subsequent requests. This would make just about every page way, way faster once it had been rendered the first time. The problem is that it's hard to get right: the caching itself is pretty easy, but the cache expiration is not. For example, a WikiTalk script that scans the whole namespace looking for "Summary" properties and displays them in a table can only be cached until anything at all in the namespace changes. And any change whatsoever to _ContentBaseDefinition invalidates everything else, in case there was a security change at the namespace level. (There's a sketch of this below, too.) I can see how to do all that, but it's not a small job. And I'm not sure it would really solve the problem of "unusably slow pages": infrequently accessed pages, or pages that simply fall out of the cache, will still take just as long to render as they do in the current code.

On top of that, we've got a *scripting engine* in the mix. Cache expiration based on changes to the content is all well and good when the content deterministically produces the output, but anything else can't be cached at all. If WikiTalk had the equivalent of DateTime.Now, you obviously couldn't reliably cache that. Now, maybe it's the case that WikiTalk output is determined entirely by the content. I don't know - does anyone? If so, it makes life easier.

But even if not, the other option we can look at is making the WikiTalk execution engine smarter. I believe this is what was in 1.8 - there were certainly a bunch of annotations on the WikiTalk code about what could and could not be cached. Presumably, that let the engine cache the results of particular operations, like an Array.Sort of the topics in a namespace.
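To make the pipeline layering concrete, here's a minimal sketch of the shape I'm describing. All the type names (IContentProvider, CachingProvider, and so on) are invented for the example - they're not necessarily what's in the actual tree:

    // Illustrative sketch only - names are made up, not from the FlexWiki source.
    using System;
    using System.Collections.Generic;

    // Each stage in the content pipeline implements the same interface
    // and delegates to the stage below it.
    public interface IContentProvider
    {
        string GetParsedTopic(string topic, string user);
    }

    // Bottom of the stack: reads topics from disk.
    public class FileSystemStore : IContentProvider
    {
        public string GetParsedTopic(string topic, string user)
        {
            return System.IO.File.ReadAllText(topic + ".wiki");
        }
    }

    // Caches whatever the layers below produce. Note that this sits
    // *below* security, so the cache holds unauthorized content...
    public class CachingProvider : IContentProvider
    {
        private readonly IContentProvider _next;
        private readonly Dictionary<string, string> _cache =
            new Dictionary<string, string>();

        public CachingProvider(IContentProvider next) { _next = next; }

        public string GetParsedTopic(string topic, string user)
        {
            string content;
            if (!_cache.TryGetValue(topic, out content))
            {
                content = _next.GetParsedTopic(topic, user);
                _cache[topic] = content;
            }
            return content;
        }
    }

    // ...and security sits *above* the cache, so permissions are checked
    // against what comes back from the cache, never the other way around.
    public class SecurityProvider : IContentProvider
    {
        private readonly IContentProvider _next;

        public SecurityProvider(IContentProvider next) { _next = next; }

        public string GetParsedTopic(string topic, string user)
        {
            string content = _next.GetParsedTopic(topic, user);
            if (!UserCanRead(user, topic))
                throw new UnauthorizedAccessException(topic);
            return content;
        }

        // Stand-in for the real permission check.
        private bool UserCanRead(string user, string topic) { return true; }
    }

    // Wiring it up the way the pipeline stacks:
    // var provider = new SecurityProvider(
    //     new CachingProvider(new FileSystemStore()));

The point of stacking it this way is that each concern stays in its own layer - which is exactly why content caching can't easily reach up into WikiTalk without cutting across layers.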
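And here's roughly what I have in mind for the output-caching option. Again, the names are invented; the interesting part is the invalidation rules, which are exactly the part that's easy to get wrong:

    // Rough sketch of web-layer output caching with the expiration rules
    // described above. Names are invented for the example.
    using System.Collections.Generic;

    public class OutputCache
    {
        // Keyed on (namespace, topic, user) - pages are cached per user
        // because the same topic can render differently under different
        // permissions.
        private readonly Dictionary<string, string> _cache =
            new Dictionary<string, string>();

        private static string Key(string ns, string topic, string user)
        {
            return ns + "/" + topic + "/" + user;
        }

        public bool TryGet(string ns, string topic, string user, out string html)
        {
            return _cache.TryGetValue(Key(ns, topic, user), out html);
        }

        public void Put(string ns, string topic, string user, string html)
        {
            _cache[Key(ns, topic, user)] = html;
        }

        // Because a WikiTalk script on any page may scan the whole
        // namespace (e.g. collecting "Summary" properties), the only safe
        // rule on any change is to drop every cached page in that namespace.
        public void InvalidateNamespace(string ns)
        {
            var stale = new List<string>();
            foreach (string key in _cache.Keys)
                if (key.StartsWith(ns + "/"))
                    stale.Add(key);
            foreach (string key in stale)
                _cache.Remove(key);
        }

        // A change to _ContentBaseDefinition may mean a security change at
        // the namespace level, so it has to invalidate everything.
        public void OnTopicChanged(string ns, string topic)
        {
            if (topic == "_ContentBaseDefinition")
                _cache.Clear();
            else
                InvalidateNamespace(ns);
        }
    }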
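Finally, for the "smarter execution engine" option, here's my guess at the general shape of the 1.8 annotation scheme - and I want to stress it's a guess, since as I say below, I never really understood that code:

    // Sketch of annotation-driven caching of WikiTalk operation results.
    // This is a guess at how the 1.8 scheme might have worked, not a
    // description of the actual code.
    using System;
    using System.Collections.Generic;

    // Marks an operation whose result depends only on wiki content, so the
    // engine can cache it until the relevant content changes.
    [AttributeUsage(AttributeTargets.Method)]
    public class CacheableAttribute : Attribute { }

    // Marks an operation whose result can change between evaluations (a
    // DateTime.Now equivalent); the engine must never cache it, and any
    // expression computed from it is uncacheable too.
    [AttributeUsage(AttributeTargets.Method)]
    public class VolatileResultAttribute : Attribute { }

    public static class WikiTalkOps
    {
        [Cacheable]
        public static List<string> SortedTopics(List<string> topics)
        {
            // The Array.Sort-of-the-namespace case: deterministic given
            // the content, so safe to cache.
            var sorted = new List<string>(topics);
            sorted.Sort();
            return sorted;
        }

        [VolatileResult]
        public static DateTime Now()
        {
            // Poisons the cacheability of anything it feeds into.
            return DateTime.Now;
        }
    }

    // The engine would consult the annotations at evaluation time: cache
    // the results of [Cacheable] operations keyed on their inputs, and mark
    // any expression containing a [VolatileResult] call as uncacheable.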
I never really understood how that annotation scheme worked, which didn't much matter at the time, since it had to go to accommodate the new architecture. But maybe it could be added back in. David - perhaps you could fill me in a bit on how it worked? In the meantime, I'll take a look at the old code and see if I can figure it out myself. I don't really have any other ideas at this point: my attempts at profiling the existing bottlenecks with the free tools I have available have failed.