Gerard Beekmans wrote:
> Hi guys,
>
> After reviewing logs I ended up having to block the wget user agent in
> Apache for the time being. Pages such as
> http://www.linuxfromscratch.org/lfs/downloads/stable/ are causing issues
> with wget.
>
> The name, last modified, size and description headers are clickable
> links that change the sorting of the page. This tricks wget's recursive
> mode into treating them as different pages: each page ends up with 8
> copies, and each file is downloaded at least 8 times. Looking back over
> previous months, this has wasted a lot of bandwidth toward our monthly
> cap. The above page is just one example of a few areas where this
> occurs. The automated downloads also concentrate the hits into short
> bursts, so blocking them for now spreads the bandwidth load over
> longer periods of time.
>
> This is just another stop-gap while the migration is under way. The new
> provider offers 1.6 TB/month with a soft cap, so going a few GB over
> here and there won't cause problems. We'll have more breathing room.
>
> Until then, this will cause some issues with the wget-list file provided
> for easy patch downloads. Hopefully not too many people will find
> themselves stranded.
>
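For reference, a user-agent block like the one described might look
something like this in httpd.conf (just a sketch using Apache 2.x
mod_setenvif directives; Gerard's message doesn't show the actual
configuration, and the path is made up):

```apache
# Hypothetical sketch, not the actual config used on the server.
# Tag any request whose User-Agent contains "wget" (case-insensitive)
# and deny it access to the downloads tree.
<Directory "/srv/www/lfs/downloads">
    SetEnvIfNoCase User-Agent "wget" block_wget
    Order Allow,Deny
    Allow from all
    Deny from env=block_wget
</Directory>
```

The downside, as noted above, is that it also blocks legitimate
one-shot wget fetches of individual files, not just the recursive runs.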

Would an appropriate /robots.txt help things out?
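Something along these lines, perhaps (a sketch only; the path is
hypothetical). One caveat: the original robots.txt protocol matches
Disallow entries as plain path prefixes, with no wildcard support, so
the mod_autoindex sort links (the ?C=N;O=D style query strings) can't
be excluded by pattern. The best a standard-conforming crawler like
wget can be told is to stay out of the directory tree entirely:

```
# Hypothetical /robots.txt sketch. Classic wget honors robots.txt in
# recursive mode, so this would stop the 8-fold re-crawl of sorted
# index pages -- but only by excluding the whole tree.
User-agent: *
Disallow: /lfs/downloads/
```

That would still leave direct, non-recursive downloads of individual
files working, which the user-agent block does not.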

   -- Bruce

-- 
http://linuxfromscratch.org/mailman/listinfo/lfs-dev
FAQ: http://www.linuxfromscratch.org/faq/
Unsubscribe: See the above information page
