On 3/1/06, Jack Diederich <[EMAIL PROTECTED]> wrote:
> On Wed, Mar 01, 2006 Brett Cannon wrote:
> > On 2/28/06, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> > > Thomas Wouters wrote:
> > > >
> > > > I added webstats for all subsites of python.org:
> > > >
> > > > http://www.python.org/webstats/
> > >
> > > what's that "Java/1.4.2_03" user agent doing? (it's responsible for
> > > 10% of all hits in january/february, and 20% of the hits today...)
> >
> > Most likely a crawler.
>
> Youch, if I'm reading it right it consumed fully half of the bandwidth
> for today on python.org. And what 1.6 million pages did it spider on
> the site last month? Something smells broken.

Well, here's a hint. The file almost all of them are retrieving is
/topics/xml/dtds/xbel-1.0.dtd. They're all being redirected to
pyxml.sf.net, though. It's a lot of hits, but www.python.org doesn't
serve any actual pages for those requests, so the actual load is not
that big (at least, not for us :-) It skews the statistics somewhat;
maybe I should ignore the whole /topics/xml tree in the stats.

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!