Gerard Beekmans wrote:
> Hi guys,
>
> After reviewing logs I ended up having to block the wget user agent in
> Apache for the time being. Pages such as
> http://www.linuxfromscratch.org/lfs/downloads/stable/ are causing issues
> with wget.
>
> The name, last modified, size and description headers are clickable
> links that change the sorting of the page. This tricks wget's recursive
> mode into thinking they are different pages. Each page ends up with 8
> copies, and each file is downloaded at least 8 times. Looking back over
> previous months, this has wasted a lot of bandwidth toward our
> monthly cap. The above page is just one example of a few areas where
> this occurs. The automated downloads also increase the load per unit of
> time, so blocking them for now means the bandwidth usage is spread over
> longer periods of time.
>
> This is just another stop-gap while the migration is under way. The new
> provider offers 1.6 TB/month with a soft cap, so a few GB over here and
> there won't cause problems. We'll have more breathing room.
>
> Until then, this will cause some issues with the wget-list file provided
> for easy patch downloads. Hopefully not too many people will find
> themselves stranded.
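For context, blocking a user agent in Apache is typically done with mod_setenvif. The snippet below is only a sketch of that approach; the actual directives and paths used on the LFS server are not shown in the thread, and the directory path here is hypothetical.

```apache
# Hypothetical sketch (Apache 2.2-style syntax): flag any request whose
# User-Agent header starts with "Wget", then deny flagged requests to
# the downloads area. Requires mod_setenvif.
BrowserMatchNoCase ^Wget block_wget

<Directory "/srv/www/lfs/downloads">
    Order Allow,Deny
    Allow from all
    Deny from env=block_wget
</Directory>
```

A blanket user-agent block like this also stops well-behaved wget users, which is why it is described above as a stop-gap.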
Would an appropriate /robots.txt help things out?

-- 
Bruce
-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
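A robots.txt along the lines below could in principle help, since wget's recursive mode honours robots.txt by default. This is only a sketch: the `*` wildcard in Disallow rules is a common extension to the original robots exclusion standard, and not every client (including older wget versions) matches query strings this way, so it may not catch the sort links reliably.

```
# Hypothetical /robots.txt sketch: disallow the directory-index sort
# links (the ?C=N;O=A style query strings) for all robots.
User-agent: *
Disallow: /*?C=
```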