Got curious about performance on live.gnome.org; the observed macro pattern for system performance was:
- Load is very spiky, sometimes low, sometimes quite high
- When it's high, there are httpd processes running at high CPU
  utilization or spending most of their time in syscalls.
- Bottleneck seems to be CPU rather than disk - disk utilization is quite
  low and is principally writes from httpd logging.

Stracing the high-cpu and high-disk-wait httpd processes indicated that
they were doing "strange things" - e.g., stat'ing through the page
hierarchy looking for attachments for every page in the Wiki, so I wanted
to know what requests they were processing.

To try and figure this out, I temporarily modified the httpd configuration
to include the processing time for each page in the log files and grep'ed
out long-running page requests for half an hour of usage. There were 73
requests that took more than 10 seconds to process.

 16 requests for /TitleIndex, min=20s, max=190s
 18 requests for /WordIndex, min=35s, max=234s
 25 requests for attachments, min=11s, max=38s
    Most, though not all, of these were for large images, 500k or more
  6 misc. POSTs (newaccount, login, edit, AttachFile), min=13s, max=66s
  4 requests for wiki pages: min=16s, max=120s
  2 requests for Category pages: min=12s, max=40s
  1 request for AdvancedSearch: 11s

I think it's fair to assume that the long times for attachments, and in
some cases for random pages, are due to network issues - clients getting
data slowly and tying up an httpd process.

So, the thing that really stands out here is the /TitleIndex and
/WordIndex requests - why are we getting all these requests for these
expensive pages that aren't obviously linked to?
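
The log filtering described above could be reproduced with something like the following sketch. It assumes the httpd log uses the standard "combined" layout with the processing time in seconds appended as a final field; that layout, the 10-second default threshold parameter, and the function names are my assumptions for illustration, not the actual configuration used on the server.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) line layout: Apache "combined" format followed by
# the request processing time in seconds as the last field.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<seconds>\d+(?:\.\d+)?)$'
)


def slow_requests(lines, threshold=10.0):
    """Yield parsed fields for requests slower than `threshold` seconds."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and float(m.group("seconds")) > threshold:
            yield m.groupdict()


def summarize(requests):
    """Map each path (query string dropped) to (count, min, max) seconds."""
    times = defaultdict(list)
    for req in requests:
        path = req["url"].split("?", 1)[0]
        times[path].append(float(req["seconds"]))
    return {path: (len(t), min(t), max(t)) for path, t in times.items()}


# Usage, with a hypothetical log file name:
#   with open("access_log") as f:
#       for path, (n, lo, hi) in summarize(slow_requests(f)).items():
#           print(f"{n:3d} requests for {path}, min={lo:.0f}s, max={hi:.0f}s")
```

Lines that do not match the assumed layout are silently skipped, so the pattern would need adjusting to whatever LogFormat is actually in use.
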

So, let's look at the first four requests for /WordIndex:

 IP: 195.27.20.2
 Time: 10/Dec/2009:16:01:25 +0000
 Request: GET /WordIndex?action=print HTTP/1.0
 Bytes: 1865069
 Referrer: "-"
 User agent: "Mozilla/4.0 (compatible;)"
 Time: 168.302989

 IP: 195.27.20.2
 Time: 10/Dec/2009:16:01:25 +0000
 Request: GET /WordIndex HTTP/1.0
 Bytes: 1867640
 Referrer: "http://live.gnome.org/Tomboy/PluginList"
 User agent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;
   .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.04506.648;
   .NET CLR 3.5.21022; MS-RTC LM 8; .NET CLR 3.0.4506.2152;
   .NET CLR 3.5.30729)"
 Time: 179.532250

 IP: 93.174.145.75
 Time: 10/Dec/2009:16:03:39 +0000
 Request: GET /WordIndex HTTP/1.1
 Bytes: 1867640
 User agent: "Mozilla/4.0 (compatible;)"
 Time: 52.201152

 IP: 192.196.142.21
 Time: 10/Dec/2009:16:03:36 +0000
 Request: GET /WordIndex HTTP/1.1
 Bytes: 1867640
 User agent: "Mozilla/4.0 (compatible;)"
 Time: 62.462147

So, the thing that stands out here is the consistent User Agent for three
out of the four, and the fact that the fourth request, while it has a
different agent, comes from the same IP at the same time as the first.

If you do a web search, you'll find that this user agent is attributed to
"Blue Coat" proxy server products, which apparently do speculative
prefetching based on page contents.

What page contents are they prefetching on?

If you look at the source of one of the wiki pages, we see (e.g., for
/GnomeShell):

 <link rel="Start" href="/Home">
 <link rel="Alternate" title="Wiki Markup" href="/GnomeShell?action=raw">
 <link rel="Alternate" media="print" title="Print View" href="/GnomeShell?action=print">
 <link rel="Search" href="/FindPage">
 <link rel="Index" href="/TitleIndex">
 <link rel="Glossary" href="/WordIndex">
 <link rel="Help" href="/HelpOnFormatting">

There are in fact *no* obvious links to /TitleIndex and /WordIndex or to
the printable versions of pages anywhere in the page, and I'm not aware of
any current browsers that present these links in the user interface.

So to summarize: Our performance on live.gnome.org is being killed by
speculative prefetching on URLs that are added because they seemed like a
good idea but have no actual purpose on the page.

Possible fixes:

- Block /TitleIndex and /WordIndex entirely - they aren't useful pages
- Block the Blue Coat fetches by User Agent (this, however, apparently
  doesn't get all the prefetches; sometimes it uses the user agent of the
  requesting client.)
- Use apache's mod_cache facilities to cache /TitleIndex, /WordIndex
- Patch Moin to omit this section of the pages

I don't have a strong opinion on which one of these, or which combination
of them, is best - the last one makes some sense to me.

- Owen
_______________________________________________
gnome-infrastructure mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnome-infrastructure
