On Wed, May 16, 2007 at 12:47:03PM -0500, Patrick R. Michaud wrote: > On Wed, May 16, 2007 at 07:32:14PM +0200, Christophe David wrote: > > I made some more tests with the following pagelist : > > > > (:pagelist group=RCB name=RCB?* list=normal order=$:_city fmt=#RCB_city:) > ... > This will display various timings at the bottom of each page, similar > to what is done for pages in the Test group on pmwiki.org. > ... > I'll post an example in a followup message to this one.
Here's the followup example. For this example I'm using the page at http://www.pmwiki.org/wiki/Test/PageListCache , which contains the markup (:pagelist group=PITS name=[0-9]* list=normal order=$:Summary fmt=#default :) This is very similar to the pagelist markup that Christophe is using on his site. The #default template is given as: [[#default]] (:if ! equal {=$Group} {<$Group}:) :[[{=$Group}/]] /: (:if:) : :[[{=$Group}/{=$Name}]] [-{=$:Summary}-] [[#defaultend]] The "stopwatch" results appear at the bottom of the Test.PageListCache page output, and the rest of this message will relate that output to what's happening internally. If you aren't interested in pagelist caching or the gory PmWiki internals, it's safe to skip this message. :-) In the stopwatch output, each line represents an "event" of some sort in PmWiki's overall processing. For example, the first three lines of the stopwatch output read: 00.00 00.00 config start 00.00 00.00 config end 00.09 00.08 MarkupToHTML begin The first number of each line is the cumulative "wall clock time" since the first event, and the second number is the cumulative CPU seconds used. I'll just use the wall clock timings in this example. The "config start" and "config end" lines above are specific to pmwiki.org, and indicate the amount of time needed to process local customization files. As you can see, it doesn't take much (00.00 seconds). The "MarkupToHTML begin" line marks the point where PmWiki begins processing the markup for Test.PageListCache, 0.09 seconds after we logged "config start". 00.10 00.09 FPLTemplate begin 00.11 00.10 MakePageList pre 00.11 00.10 PageListSources begin 00.11 00.10 PageStore::ls begin wiki.d/{$FullName} 00.16 00.13 PageStore::ls merge wiki.d/{$FullName} 00.27 00.23 PageStore::ls end wiki.d/{$FullName} 00.27 00.23 PageStore::ls begin $FarmD/wikilib.d/{$FullName} 00.27 00.23 PageStore::ls merge $FarmD/wikilib.d/{$FullName} 00.28 00.23 PageStore::ls end $FarmD/wikilib.d/{$FullName} 00.28 00.24 PageListSources end count=922 The "FPLTemplate begin" indicates the point where the (:pagelist:) directive is being processed via a pagelist template. So, we can base the overall time spent on pagelist processing from here. The first thing that happens is that PmWiki scans the wiki.d/ and wikilib.d/ directories (over 5,600 pages) for pagenames matching the group=PITS and name=[0-9]* patterns. When I ran the above it took pmwiki.org 0.18 seconds to perform this scan, and we end up with 922 matching pages (count=922). 00.28 00.24 PageListSort pre ret=4 order=$:Summary 00.28 00.24 MakePageList items count=922, filters= 00.30 00.25 MakePageList post count=922, readc=0 00.30 00.25 PageListCache begin save key=b1924c551f2a4a9203fbdd780ccb9339 00.30 00.26 PageListCache end save Next, we loop through the list of 922 pages, building the internal pagelist structure as we go. If we were doing any search term filters, or if $EnablePageListProtect is enabled, then we'd be checking search terms and page permissions here. Here it doesn't appear that any additional checking was required that would involve reading pages (readc=0), so we complete this step in 0.02 seconds. This also shows where the resulting pagelist is saved to the pagelist cache, so that it can be recalled later without having to scan the directory and repeat any searches. The pagelist is stored in the cache under the key "b1924c551f2a4a9203fbdd780ccb9339". 00.30 00.26 PageListSort begin 02.53 02.33 PageListSort end 02.53 02.33 MakePageList end This is where we start to see some time usage. Since we specified order=$:Summary (a page text variable), PmWiki now has to read the text of all 922 pages to extract the corresponding $:Summary values. It takes 2.23 seconds to read those pages and perform the corresponding sort. We've now computed our final pagelist, and we can return this to the pagelist template formatter for converting the list to HTML. The total time to required to do the search/sort was 2.42 seconds (the time elapsed between "MakePageList begin" and "MakePageList end"). 02.89 02.69 MarkupToHTML begin 04.02 03.80 ReadApprovedUrls Site.ApprovedUrls begin 04.03 03.80 ReadApprovedUrls Site.ApprovedUrls end 05.78 05.54 MarkupToHTML end 05.78 05.54 FPLTemplate end The FPLTemplate routine next applies the #default template to all 922 pages, concatenating the results of each (0.36 seconds). Then that markup is passed to the MarkupToHTML function to be converted into HTML, requring another 2.89 seconds (time elapsed from "MarkupToHTML begin" and "MarkupToHTML end"). We're then done with the (:pagelist:) command and can return to processing the rest of the page. In other words, processing this pagelist required a total of 5.67 seconds, 2.42 was spent determining the pages that belong in the list, and 3.25 was spent formatting that list into HTML. It's worth highlighting at this point that any pagelist caching can only improve the first part of this total -- it doesn't help us at all with the rendering. The remainder of the stopwatch entries deal with the rest of the page output... 05.80 05.55 MarkupToHTML end This finishes the rendering of the main page. 05.82 05.57 MarkupToHTML begin 05.86 05.61 MarkupToHTML end 05.86 05.61 MarkupToHTML begin 05.88 05.63 MarkupToHTML end 06.17 05.64 now The first pair of MarkupToHTML entries are for the SideBar (0.04 seconds) and the second pair are for Site.PageActions (0.02 seconds). Finally we send everything to the browser, and report the total time elapsed up to the point of generating the stopwatch output. Okay, let's see the timings when we manage to "hit" the pagelist cache: 00.00 00.00 config start 00.00 00.00 config end 00.08 00.08 MarkupToHTML begin 00.09 00.08 FPLTemplate begin 00.09 00.09 MakePageList pre 00.09 00.09 PageListCache begin load key=b1924c551f2a4a9203fbdd780ccb9339 00.09 00.09 PageListCache end load 00.09 00.09 PageListSources begin 00.10 00.09 PageListSources end count=922 00.10 00.09 PageListSort pre ret=4 order=$:Summary 00.10 00.09 MakePageList items count=922, filters= 00.11 00.10 MakePageList post count=922, readc=0 Because PmWiki determined it was safe to load pages from the cache, we have our set 922 pages very quickly -- only 0.10 second from the time we started PmWiki until we got the set of 922. Thus we got to avoid the 0.18 seconds required to scan the directories, find matching pages, and check for search terms. But from here on out, the steps are as slow as before. 00.11 00.11 PageListSort begin 02.42 02.22 PageListSort end 02.42 02.22 MakePageList end Because our sort criteria is $:Summary, we have to read all of the 922 pages to get their $:Summary value and perform the sort. As before, this takes a little over 2 seconds to achieve. At this point I hear everyone yelling out... "WAIT! We shouldn't have to do the sort over again... the caching algorithm should be saving the pagelist AFTER the sort, not before!!" Actually, in this case (and in many/most cases) caching the sorted output turns out to not save us much time overall. Yes, we could cache the sort and avoid reading the 922 pages at this step, but our pagelist template is going to be looking for {$:Summary} for each of the 922 pages anyway, so we end up reading them at some point anyway. We'd just be deferring the reads until later in the processing, and not really speeding up the overall pagelist. We also have to be concerned about things like order=random and order=$SomePageVariable. Some sort criteria aren't constant for a given set of pages, so we can't always rely on caching a sorted version of the list. (There are some special cases where there _could_ be an advantage to caching a sorted version of the list... but I think they're pretty rare overall, and possibly not worth the effort needed to detect and code for them. PmWikiPhilosophy #3 applies.) So, even with pagelist caching enabled, our total time to get the set of pages is 2.33 seconds (versus 2.42 seconds when we weren't doing the cache. The remainder of the stopwatch for the cached case is quite similar to the earlier one: 02.78 02.58 MarkupToHTML begin 03.92 03.68 ReadApprovedUrls Site.ApprovedUrls begin 03.92 03.69 ReadApprovedUrls Site.ApprovedUrls end 05.70 05.44 MarkupToHTML end 05.70 05.44 FPLTemplate end 05.72 05.45 MarkupToHTML end 05.74 05.47 MarkupToHTML begin 05.78 05.51 MarkupToHTML end 05.78 05.51 MarkupToHTML begin 05.80 05.53 MarkupToHTML end 06.07 05.54 now We still end up using 3.28 seconds to apply the pagelist template to the list of 922 pages, for a total pagelist processing time of 5.61 seconds. For this particular query, the cache provided a 0.06 second increase. Okay, at this point I can hear people saying "Well, what good is page caching?" And you'd be somewhat correct -- for *this* particular query, pagelist caching isn't providing much benefit at all. But on sites where $EnablePageListProtect is enabled, or where pagelist is filtering pages based on search terms, page variables, page text variables, link=, if=, or other criteria, the pagelist cache can provide a _huge_ speedup by reducing the number of filter checks needed. All of this also goes back to what I've been indicating all along, which is that although pagelist templates are extremely powerful, they can also be very slow, especially on large pagelists or complex templates. This is also why I've recently been focusing my attention on improvements to the efficiency of pagelist templates as opposed to pagelist generation itself -- that's where the bigger payoffs seem to be. For those who managed to make it to the bottom of this message, I hope it was worth it. :-) Pm _______________________________________________ pmwiki-users mailing list [email protected] http://www.pmichaud.com/mailman/listinfo/pmwiki-users
