A few more things (I decided to investigate a bit more tonight)...

I am now much more concerned about the underlying data.

(1) One example that I just looked at is a page we have that uses WikiTalk to 
list all namespaces on the server (about 1000).  The log data I have for this 
page shows that under 2.x it took over 20 seconds for each of the three 
iterations.  But when I hit that page manually it seems much, much faster (2-3 
seconds), including successive hits.  And the data I have for 1.8 actually 
shows zero seconds for all three iterations.  Apparently there was an exception 
on this page for all three iterations under 1.8.

(2) I have found a sample page that claims to have taken consistently about 3.5 
seconds for all three iterations under 2.0.  However, when I access this page 
directly via the browser (on the same box the tests ran on), it comes up in the 
blink of an eye (as do successive refreshes).
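
For what it's worth, here's roughly how I've been sanity-checking the log 
numbers by hand -- a minimal Python sketch (the URL is a placeholder, and the 
real harness presumably does something more sophisticated):

```python
# Minimal sketch: time n successive fetches of a page so the wall-clock
# numbers can be compared against what the test log claims.
import time
import urllib.request

def time_fetches(url, n=3):
    """Return a list of n wall-clock fetch times (in seconds) for url."""
    times = []
    for _ in range(n):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()  # pull the whole body, like a browser would
        times.append(time.perf_counter() - start)
    return times

# e.g. times = time_fetches("http://ourserver/wiki/default.aspx/SomeTopic")
```

If the first fetch is slow and the refreshes are fast, that points at caching; 
if the log says 3.5 seconds on every iteration, something else is going on.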

(3) Looking closer at the question of exceptions, I now see 11000 under 1.x and 
7000 under 2.x.  Divide by 3 (iterations) and this suggests almost 4000 topics 
(roughly 15% of the 25000) got exceptions under 1.x and almost 3000 under 2.x.  
Clearly these need to be addressed, as they are apparently skewing the data.  
Several observations so far:

        a) Some of these are pages that don't really exist at all; it's my 
mistake that they are in the list of URLs the tests run against.  I'll try to 
weed those out.
        b) At least some of the pages that gave exceptions seem to work just 
fine when run manually.  I assume the failures during the automated run are 
somehow due to the state of the web application (e.g., memory issues), but 
I'm not sure.
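
Once I know which topics threw, cleaning my end up is mostly mechanical: drop 
the exception records before averaging.  A rough sketch of what I mean -- this 
assumes a simple comma-separated log of topic, iteration, seconds, status, 
which is hypothetical (the real log layout differs):

```python
# Sketch: split a run log into clean timing records and failed topics,
# so exception rows don't distort the per-topic averages.
# Assumed (hypothetical) record format: "topic,iteration,seconds,status"

def split_log(lines):
    """Return (timings, failed_topics) from an iterable of log lines."""
    timings = []
    failed = set()
    for line in lines:
        topic, iteration, seconds, status = line.strip().split(",")
        if status == "exception":
            failed.add(topic)
        else:
            timings.append((topic, int(iteration), float(seconds)))
    # Drop every record for a topic that failed even once, since its
    # other iterations are suspect too.
    timings = [t for t in timings if t[0] not in failed]
    return timings, failed
```

That way a topic like the namespace-listing page in (1), which threw on every 
1.8 iteration and logged zero seconds, can't drag the 1.8 averages down.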

(4) How do you handle redirects in your test harness?  There are a fair number 
of pages on our internal site that redirect to an Internet site.  I assume 
you're just processing the response coming back from the web app, so the 
measurement is just the time required to get back the redirect response -- and 
that you don't follow the redirect to the external web site and count the time 
to retrieve that.  Is that right?
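
To make the question concrete, here's the behavior I'm assuming, sketched in 
Python with urllib (the class name and URL are mine, purely illustrative): the 
harness times only the redirect response itself and never fetches the external 
target.

```python
# Sketch: time a request WITHOUT following redirects, so a 302 to an
# external site only costs the time of the redirect response itself.
import time
import urllib.error
import urllib.request

class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # refuse to follow; urllib then raises HTTPError

_opener = urllib.request.build_opener(_NoRedirect)

def time_without_redirect(url):
    """Return (status_code, seconds) for url, never following Location."""
    start = time.perf_counter()
    try:
        with _opener.open(url) as resp:
            status = resp.status
    except urllib.error.HTTPError as e:
        status = e.code  # a 301/302 lands here once we refuse to follow it
    return status, time.perf_counter() - start

# e.g. status, secs = time_without_redirect("http://ourserver/wiki/SomePage")
```

If the harness instead follows the Location header, the numbers for those 
pages include external-site latency, which would be another data problem.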


> -----Original Message-----
> From: David Ornstein
> Sent: Sunday, September 30, 2007 8:10 PM
> To: 'FlexWiki Users Mailing List'
> Subject: RE: [Flexwiki-users] Performance analysis of 25000 pages
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:flexwiki-
> > [EMAIL PROTECTED] On Behalf Of Craig Andera
> > Sent: Sunday, September 30, 2007 4:39 AM
> > To: 'FlexWiki Users Mailing List'
> > Subject: Re: [Flexwiki-users] Performance analysis of 25000 pages
> >
> > > OK.  I finally got the spreadsheet working and have results.  Below is
> > > the data based on a run from the build before the most recent.  I'll
> > > have to go get the latest build this week and rerun overnight to
> > > generate data but the analysis will be very very fast since I can now
> > > just dump log file records into the spreadsheet and get the data below.
> >
> > Very cool! I really like the analysis - I think you just set the gold
> > standard for comparison tests.
>
> When I actually get time to devote to things I'm usually happy with the
> results. :-)  I'm glad you liked it.
>
> The one thing that concerns me about the underlying data is that the
> iterations run through all 25000 pages and then come back and run
> again.  After thinking about it last night, I think this might be a
> very bad analog for the real world.  Specifically, this means that if I
> hit page1 and then 24999 other pages, by the time I hit page1 again in
> the second iteration the odds of anything being cached for page1 are
> very, very low :-)  This would explain why the numbers don't seem to
> show much improvement from iteration 1 to iteration 2 under 1.8.  How
> easily do you think you could change the harness so that iterations run
> depth first rather than breadth first?  If you had that I'd be able to
> include that when I do my updated run (with the latest bits since I'm
> one build behind).
>
> > > I am still chewing on what the information below means... ;-)
> >
> > Have you had a chance to look at the contents of any of the really slow
> > pages?
>
> I haven't.  I will try to pick out a few of them this week and see if I
> can repro the performance in a one-off case.  If I can, I'll put these
> up on flexwiki.com and I'll take a look at what they're doing to see if
> I can suggest any areas for exploration.
>
>
> > Bad Case 2 (was slow but it was acceptable, now it's not) is
> > particularly interesting because of the small number of pages in this
> > category. Here's my profiling technique [1] if you want to get out the
> > microscope. If you can provide the data, I can do the analysis, too.
>
> Realistically, I'm unlikely to be able to get a dev environment set up
> to do any of the profiling for the next couple of weeks.  Let's see
> what happens when I look at the offending pages (and a new run of
> data).
>
> > I'm interested to hear what you have to say. I have been thinking very
> > hard about how to add output caching to FlexWiki in the new design. It's
> > decidedly nontrivial, but I'm increasingly convinced that there's a
> > reasonable way to do it, where "reasonable" is defined as "does not
> > produce spaghetti code". However, if we can ship 2.0 without it, that
> > would be good - output caching could be a 2.2 feature. *If* I can pull
> > it off, I think we'll find that it makes an *enormous* difference in
> > performance in cases where WikiTalk can be cached (e.g. doesn't call
> > DateTime.Now).
> >
> > If we don't think we can ship 2.0 without output caching, then I think
> > maybe we should ship the current bits as Beta 3, and that www.flexwiki.com
> > should be upgraded as well. I just really think we need to keep putting
> > out "official" releases every three months or less - it's apparent from
> > the questions that we get here that many people never visit
> > http://builds.flexwiki.com, and a SourceForge "last updated" date way in
> > the past is the kiss of death.
>
> I agree that if we don't think we can ship 2.0 without output caching
> that shipping as beta 3 is a very good idea -- as is upgrading
> flexwiki.com.
> >
> > [1]
> > - Download NProf
> > - Build RenderDriver from the FlexWiki tools directory
> > - In NProf, set RenderDriver.exe to be the executable
> > - In NProf set the arguments to be "\path\to\flexwiki.config
> > Namespace.TopicName n" where n is a number. I suggest 2 as a good
> > start.
> > - Run by pressing F5
> > - After run completes, drill down into Main->Run to see where the time
> > went. Or Main->Warmup if you want to see what happened before the cache
> > was populated.
> >
> >
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Microsoft
> > Defy all challenges. Microsoft(R) Visual Studio 2005.
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > _______________________________________________
> > Flexwiki-users mailing list
> > Flexwiki-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/flexwiki-users

