On 3/1/06, Jack Diederich <[EMAIL PROTECTED]> wrote:
> On Wed, Mar 01, 2006 Brett Cannon wrote:
> > On 2/28/06, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> > > Thomas Wouters wrote:
> > >
> > > > I added webstats for all subsites of python.org:
> > > >
> > > > http://www.python.org/webstats/
> > >
> > > what's that "Java/1.4.2_03" user agent doing?  (it's responsible for
> > > 10% of all hits in january/february, and 20% of the hits today...)
> >
> > Most likely a crawler.
> >
> Youch, if I'm reading it right it consumed fully half of the bandwidth
> for today on python.org.  And what 1.6 million pages did it spider on
> the site last month?  Something smells broken.

Well, here's a hint. The file almost all of them are retrieving is
/topics/xml/dtds/xbel-1.0.dtd. They're all being redirected to
pyxml.sf.net, though. It's a lot of hits, but www.python.org doesn't
serve any actual pages, so the actual load is not that big (at least,
not for us :-). It skews the statistics somewhat; maybe I should
ignore the whole /topics/xml tree in the stats.
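
For illustration, here is a rough sketch of what ignoring that tree could
look like when counting hits per user agent. It assumes the server writes
Apache-style "combined" log lines; the function name, the sample lines and
the "ExampleBrowser/1.0" agent are made up for the example, and this is not
how the actual webstats are generated.

```python
import re
from collections import Counter

# Assumed format: Apache "combined" log lines, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_agents(lines, ignore_prefix="/topics/xml"):
    """Count hits per user agent, skipping everything under ignore_prefix.

    count_agents is a hypothetical helper, not part of any stats package.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        path = match.group("path")
        # Skip the prefix itself and anything below it in the tree.
        if path == ignore_prefix or path.startswith(ignore_prefix + "/"):
            continue
        counts[match.group("agent")] += 1
    return counts


# Made-up sample lines for demonstration only.
sample = [
    '192.0.2.1 - - [01/Mar/2006:10:00:00 +0000] '
    '"GET /topics/xml/dtds/xbel-1.0.dtd HTTP/1.1" 302 0 "-" "Java/1.4.2_03"',
    '192.0.2.2 - - [01/Mar/2006:10:00:01 +0000] '
    '"GET /doc/ HTTP/1.1" 200 512 "-" "ExampleBrowser/1.0"',
]

if __name__ == "__main__":
    for agent, hits in count_agents(sample).most_common():
        print(hits, agent)
```

Filtering on the request path rather than on the user agent keeps any other
requests from the same client in the numbers, so only the redirected DTD
fetches drop out.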

--
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!