I would like to crawl several websites and limit the total number of bytes per downloaded file to 5 MB, just in case I run into some files that are really large.

From what I understand after reading through the wget manual, the --quota option limits the total number of bytes downloaded across a whole retrieval, but there is no option that caps the size of each individual file, correct? Would such an option be useful to anyone else who uses wget? Someone else asked about this back in Oct 2005:

http://www.mail-archive.com/[email protected]/msg08355.html

I noticed the Heritrix crawler (http://crawler.archive.org/) supports the file size limit option.
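In the meantime, one workaround that might approximate a per-file cap (just a sketch using the shell, not a wget feature) is to run wget under a file-size resource limit: ulimit -f caps file size in 512-byte blocks, and any process writing past the cap is stopped with SIGXFSZ, leaving the file truncated at the limit. Demonstrated here with dd so it runs without a network:

```shell
# Sketch: emulate a per-file size cap via the shell's resource limit.
# ulimit -f takes 512-byte blocks; a write past the cap delivers SIGXFSZ
# and the file is truncated at exactly the limit.
(
  ulimit -f 20                                              # cap files at 20 * 512 = 10 KB
  dd if=/dev/zero of=big.bin bs=1024 count=100 2>/dev/null  # attempt to write 100 KB
) || true                                                   # dd is killed at the cap; carry on
wc -c < big.bin                                             # file stops at 10240 bytes
```

The same trick wrapped around a crawl would be something like `( ulimit -f 10240; wget -r http://example.com/ )` for a 5 MB cap (5 * 1024 * 1024 / 512 = 10240 blocks), though it kills the whole wget process at the first oversized file rather than skipping it, so it is a blunt instrument compared to a real per-file option.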

Thanks,
Frank