On Thu, Dec 10, 2009 at 10:47 AM, Owen Taylor <[email protected]> wrote:
> On Thu, 2009-12-10 at 10:30 -0800, Jeff Schroeder wrote:
>
>> > Possible fixes:
>> >
>> >  - Block /TitleIndex and /WordIndex entirely - they aren't useful pages
>> >  - Block the Blue Coat fetches by User Agent (this, however, apparently
>> >    doesn't get all the prefetches, sometimes it uses the user agent
>> >    of the requesting client.)
>> >  - Use apache's mod_cache facilities to cache /TitleIndex, /WordIndex
>> >  - Patch Moin to omit this section of the pages
>> >
>> > Don't have a lot of opinion which one of these or combination of these
>> > is best - the last one makes some sense to me.
>> >
>> > - Owen
>>
>> Sorry Owen I forgot to reply all the first time.
>>
>> The last one makes a lot of sense however it will require updating the
>> patch as we upgrade moinmoin. What are the downsides of just blocking
>> both of those URLS with a shiney gnome 403 page? Besides it being
>> nifty to see those pages, is there any value add in keeping them?
>
> Downsides I could see:
>
>  - These pages are linked to from http://live.gnome.org/HelpForBeginners
>    and might have some small utility
>
>  - Just blocking the /TitleIndex and /WordIndex won't keep Blue Coat
>    from predictively scraping other URLs in that section.
>
>    From a rough grep, 10% of the page hits on live.gnome.org are
>    for action=raw or action=print.
>
>    (Since there is no UI for getting to action=raw or action=print
>    I can find, we could also possibly just block those as well.)
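
For anyone who wants to re-check the "rough grep" figure quoted above, here is a minimal sketch of that count. It assumes the access log is in Apache's common or combined format; the function name `action_share` and the regex are illustrative only and are not taken from the live.gnome.org setup.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches the request part of an Apache common/combined log line,
# e.g. "GET /TitleIndex?action=raw HTTP/1.1". The log format is an assumption.
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[0-9.]+"')


def action_share(lines, actions=("raw", "print")):
    """Return the fraction of parsed requests whose ?action= value is in `actions`."""
    total = 0
    matched = 0
    for line in lines:
        m = REQUEST_RE.search(line)
        if m is None:
            continue  # skip lines that do not look like the assumed format
        total += 1
        query = parse_qs(urlsplit(m.group(1)).query)
        if any(value in actions for value in query.get("action", [])):
            matched += 1
    return matched / total if total else 0.0


if __name__ == "__main__":
    import sys

    # Usage (hypothetical log path): python action_share.py < access.log
    print(f"{action_share(sys.stdin):.1%} of requests use action=raw or action=print")
```

Parsing the query string, rather than grepping for the literal text, avoids counting page names that merely contain "action=raw".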
Would it be possible to programmatically generate these specific pages
from a cron job and have Moin serve a static page? If Blue Coat is lying
about its user agent, there isn't much of a way to stop it without
occasionally blocking legitimate users. The problem is that those two
pages, helpful or not, are killing the user experience for everyone else.

-- 
Jeff Schroeder

Don't drink and derive, alcohol and analysis don't mix.
http://www.digitalprognosis.com
_______________________________________________
gnome-infrastructure mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnome-infrastructure
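
A minimal sketch of the cron-job idea above, not a tested deployment: fetch the rendered page once per run and atomically replace a static copy that the web server could serve instead of the live page. The function names, the example URL and the destination path are all hypothetical, and how Apache or Moin would be pointed at the static file is left out.

```python
import os
import tempfile
import urllib.request


def fetch_url(url, timeout=60):
    """Download a page body. The URL is supplied by the caller."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def write_snapshot(body, dest):
    """Atomically replace `dest` with `body` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
        os.chmod(tmp, 0o644)   # mkstemp creates the file as 0600; make it world-readable
        os.replace(tmp, dest)  # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def snapshot(url, dest, fetch=fetch_url):
    """Fetch `url` and store it at `dest`; keep the old copy if the fetch is empty."""
    body = fetch(url)
    if not body:
        raise ValueError("empty response, keeping the old snapshot")
    write_snapshot(body, dest)


if __name__ == "__main__":
    # Hypothetical values; run from cron, for example once an hour.
    snapshot("http://live.gnome.org/TitleIndex", "/var/www/static/TitleIndex.html")
```

Writing to a temporary file in the same directory and renaming it means a slow or failed render never leaves a half-written page in place; the previous snapshot simply stays until the next successful run.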
