To continue the story: I spent this week trying to do a more surgical analysis of performance in the problematic cases we've uncovered. First, I had to switch to something less broken than CLRProfiler for profiling. I like NProf, but it suffers from one major problem: there's no way to say "start profiling now", which means there's no way to ignore the bit where the cache is being populated. Although interesting in its own right, that data obscures the part I'm actually trying to optimize for.
To make a long story medium, I wound up doing two things:

1) Writing a driver console program that loads the FlexWiki engine and renders a single topic (there's a rough sketch of the shape of it below). It does it twice: once to warm up the cache, and once to measure perf against the populated cache.

2) Modifying NProf to discard the data from the warmup run.

So I now have a way to profile single topics, which is nice, because it gives me data in about a minute rather than three hours, not to mention the fact that I can drill down and see exactly which methods are contributing most to execution time.

Based on this new and excellent data, I was able to add another pretty major performance optimization. It turns out I wasn't caching the existence of a namespace, which meant that every time you asked the Federation for a NamespaceManager, it went all the way down to the FileSystemStore. Obviously, any file I/O (let alone dozens of reads per request) is a real performance killer.

The end result of all this is that I can get DarrenSQLIS to render quite a bit faster. Here are some stats to give you an idea:

  2.0.0.138-uncached: 61s
  2.0.0.138-cached:   38s
  Latest-uncached:    52s
  Latest-cached:      28s

where "Latest" is the engine with the changes I've made but not yet checked in. So you can see it's a pretty big improvement over the current 2.0 code. There's a fair amount of noise in there, too, so it varies a bit from run to run, but it's definitely better.

Of course, it's still pretty darn slow, especially compared to the 11 seconds that page takes to render the *first* time under 1.8, to say nothing of the 338 milliseconds it manages in the best case.

The issue now is that the performance is becoming much harder to optimize without architectural changes. It used to be the case that I could look at the perf numbers and see that one function was responsible for 75% of the time. As I've hacked it faster and faster, the bottlenecks have disappeared one by one, and the slowness is sort of...diffusing. Now there are lots of places that each contribute a little bit.

I still have a few ideas about how to get some pretty big wins. One place that still contributes a good portion of the time to process a request is the AuthorizationProvider. Given that it has to examine properties at the namespace, wiki, and topic level, that's not too surprising. The big question is whether there's any reasonable way to cache some of the calls to HasPermission, maybe just within a single request.

Another thing I want to look at is the implementation of Sort in WikiTalk. Our performance killer right now is Sort against large datasets. I believe that's because quicksort (the algorithm apparently in use) does O(n log n) comparisons, and the cost is dominated by the comparison operation, which appears to be a dynamic evaluation of the WikiTalk objects in the collection being sorted. That's a lot of evaluations, and they're each individually pretty slow (Reflection, lots of lookups, etc.). My hope is that I can do something clever like evaluating the collection objects once ahead of time and then sorting that, rather than dynamically evaluating on every comparison. Yes, that's a bit like the caching that David talked about, but the difference is that we can keep it localized to the WikiTalk code, and specific to one request. We'll see: I haven't looked at the code yet to see how easy that's going to be, or even if it's possible.
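To make the idea concrete, here's the general shape of what I mean, written against plain .NET types rather than the real WikiTalk classes -- EvaluateSortKey and the rest are made-up stand-ins for the expensive dynamic evaluation, not actual engine methods:

    using System;
    using System.Collections.Generic;

    class SortSketch
    {
        // Stand-in for the expensive per-object dynamic evaluation
        // (Reflection, member lookups, etc.). Not a real WikiTalk method.
        static IComparable EvaluateSortKey(object wikiTalkObject)
        {
            return wikiTalkObject.ToString();
        }

        // Evaluate each element's key exactly once, then sort the key/element
        // pairs. The O(n log n) comparisons only touch the precomputed keys,
        // so they stay cheap.
        static List<object> SortByPrecomputedKey(IEnumerable<object> items)
        {
            List<KeyValuePair<IComparable, object>> decorated =
                new List<KeyValuePair<IComparable, object>>();

            foreach (object item in items)
            {
                decorated.Add(new KeyValuePair<IComparable, object>(
                    EvaluateSortKey(item), item));
            }

            decorated.Sort(delegate(KeyValuePair<IComparable, object> a,
                                    KeyValuePair<IComparable, object> b)
            {
                return a.Key.CompareTo(b.Key);
            });

            List<object> sorted = new List<object>(decorated.Count);
            foreach (KeyValuePair<IComparable, object> pair in decorated)
            {
                sorted.Add(pair.Value);
            }
            return sorted;
        }
    }

That would trade roughly n log n expensive evaluations for n of them, which ought to matter a lot on big result sets -- assuming the WikiTalk comparison can actually be factored into a one-shot key like that.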
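As an aside, the driver program I mentioned at the top is nothing fancy; conceptually it's just the pattern below. RenderTopic here is a placeholder for whatever engine calls actually load the federation and format the topic -- it's not the real FlexWiki API.

    using System;
    using System.Diagnostics;

    class RenderDriver
    {
        // Placeholder for the engine call that loads the federation and
        // renders one topic to HTML. The real FlexWiki calls look different.
        static string RenderTopic(string ns, string topic)
        {
            return "";
        }

        static void Main(string[] args)
        {
            string ns = args[0];
            string topic = args[1];

            // First pass: populate the caches. This timing is deliberately
            // ignored (and the modified NProf throws its samples away too).
            RenderTopic(ns, topic);

            // Second pass: the number we actually care about -- rendering
            // against a warm cache.
            Stopwatch watch = Stopwatch.StartNew();
            RenderTopic(ns, topic);
            watch.Stop();

            Console.WriteLine("Cached render: {0} ms", watch.ElapsedMilliseconds);
        }
    }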
Also, I should probably try to optimize the AuthorizationProvider since that'll benefit every request anyway. The big question on that one is what assumptions I can make about the Thread.CurrentPrincipal object, since it will effectively be the cache key.

So, to summarize: work continues. :)
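P.S. To make the Thread.CurrentPrincipal idea a little more concrete, here's roughly the shape of per-request cache I'm picturing. All the names are made up -- this isn't the real AuthorizationProvider interface -- and it assumes the principal's Identity.Name is stable for the whole request, which is exactly the assumption I still need to verify.

    using System.Collections.Generic;
    using System.Threading;

    class RequestPermissionCache
    {
        // Placeholder for the existing slow path: walking the wiki, namespace,
        // and topic properties to answer a permission question.
        public delegate bool PermissionEvaluator();

        private readonly Dictionary<string, bool> _results =
            new Dictionary<string, bool>();

        public bool HasPermission(string topic, string action,
                                  PermissionEvaluator evaluate)
        {
            // The open question: is Identity.Name a safe, stable cache key for
            // the lifetime of a request? If the principal can change
            // mid-request, this key (or the whole idea) needs rethinking.
            string principal = Thread.CurrentPrincipal.Identity.Name;
            string key = principal + "|" + action + "|" + topic;

            bool allowed;
            if (!_results.TryGetValue(key, out allowed))
            {
                allowed = evaluate();
                _results[key] = allowed;
            }
            return allowed;
        }
    }

An instance of something like that would live in a request-scoped spot (HttpContext.Current.Items, say) so it gets thrown away when the request ends, which keeps it from ever serving stale answers across users or edits.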