I would like to crawl several websites and limit the total number of
bytes per downloaded file to 5 MB, just in case I run into some files
that are really large.
From what I understand after reading through the wget manual, the
--quota option limits the total number of bytes downloaded over an
entire retrieval, but there is no option that caps the number of bytes
per individual file, correct? Would such an option be useful to anyone
else who uses wget? Someone else asked about this back in Oct 2005:
http://www.mail-archive.com/[email protected]/msg08355.html
I noticed the Heritrix crawler (http://crawler.archive.org/) supports
the file size limit option.
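In the meantime, one possible workaround (just a sketch, assuming a
POSIX shell on Linux) is to run wget under a per-process file-size
resource limit: `ulimit -f` takes 512-byte blocks, so 10240 blocks is
5 MB. The demonstration below uses dd in place of wget so it runs
without network access; the hostname in the comment is a placeholder.

```shell
# Cap file size at 5 MB for everything started in this subshell.
# ulimit -f counts 512-byte blocks: 10240 * 512 = 5 * 1024 * 1024 bytes.
(
  ulimit -f 10240
  # In a real crawl this would be something like:
  #   wget --recursive http://example.com/
  # Here dd tries to write 10 MB; the kernel stops it at the 5 MB cap.
  dd if=/dev/zero of=big.bin bs=1M count=10 2>/dev/null
) || true   # dd is killed with SIGXFSZ, so ignore the nonzero exit

wc -c big.bin   # the file is truncated at the 5 MB cap
```

One caveat: exceeding the limit kills the writing process rather than
skipping to the next URL, so with a single recursive wget run the whole
crawl would stop at the first oversized file; invoking wget once per
URL would work around that, at the cost of some overhead.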
Thanks,
Frank