What you did is one way ;-)

On Mon, Mar 19, 2018 at 03:27:25PM +0100, Laura Arjona Reina wrote:
> Hello
> I've been doing some tests and this is what I have, for now:
>
> * the current script is in:
> https://anonscm.debian.org/cgit/debwww/cron.git/tree/parts/1ftpfiles
> (and it works, not sure when/if it will stop working...).
...
> This seems to work (I've run the script locally and later checked that the
> files were downloaded in /srv/www.debian.org/cron/ftpfiles), but it needs
> improvements, because:
>
> * I've worked around the robots.txt with "-e robots=off", but I guess this
> is not the correct/elegant/respectful way?
>
> * wget downloads all the files and then removes the ones that don't match
> the pattern specified with -A. Maybe there is a more efficient way to do
> this?
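Regarding the second point, one possibly more efficient pattern (a sketch only, untested against the real archive; the base URL and the ".txt" pattern are illustrative placeholders, not the cron job's actual targets) is to fetch the directory index once, extract the matching links, and download only those, instead of letting wget recurse and prune with -A:

```shell
#!/bin/sh
# Extract hrefs ending in .txt from an HTML directory index on stdin.
extract_txt_links() {
    grep -oE 'href="[^"]*\.txt"' | sed -e 's/^href="//' -e 's/"$//'
}

# Usage (needs network, so commented out here; BASE is a placeholder):
# BASE=http://deb.debian.org/debian/doc
# wget -q -O- "$BASE/" | extract_txt_links | while read -r f; do
#     wget -q -N "$BASE/$f"
# done
```

This also sidesteps the robots.txt question for the recursive crawl, since only the index page and the wanted files are requested.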
Most wgetfiles are downloading the latest unstable binary packages. Why
re-invent the binary package downloader when we have "apt-get"?

apt-get can be pointed at a non-standard sources.list via the
Dir::Etc::SourceList configuration item (or a private configuration file
passed with -c). For unstable, that sources.list would contain:

  deb http://deb.debian.org/debian/ sid main contrib non-free

We also need to use non-standard directories so we don't contaminate the
system ones:

  /var/cache/apt/archives/
  /var/lib/apt/lists/

This can be done by setting the items

  Dir::Cache::Archives
  Dir::State::Lists

via the -o option.

By setting all these and a few more options as needed, we should be able to
download the latest binary packages from the archive using a proven tool.

Of course, the obsolete dpkg-doc from snapshot should keep using the
original wgetfiles.

Installation guide: I need to check how it should be handled.

What do you think?

Regards,

Osamu
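The setup above could be sketched roughly as follows (a sketch only, assuming a Debian host with apt-get; the sandbox paths are temporary and the package name "hello" is just a placeholder):

```shell
#!/bin/sh
# Sandboxed apt-get download: private sources.list plus private
# cache/state directories, so the system's own files are untouched.
set -e

SANDBOX=$(mktemp -d)
# apt expects the "partial" subdirectories to exist.
mkdir -p "$SANDBOX/archives/partial" "$SANDBOX/lists/partial"

# Private sources.list listing unstable, as proposed above:
cat > "$SANDBOX/sources.list" <<'EOF'
deb http://deb.debian.org/debian/ sid main contrib non-free
EOF

# The actual fetch (commented out here: it needs network access and
# a Debian system with apt-get installed):
# apt-get -o Dir::Etc::SourceList="$SANDBOX/sources.list" \
#         -o Dir::Cache::Archives="$SANDBOX/archives" \
#         -o Dir::State::Lists="$SANDBOX/lists" \
#         update
# apt-get -o Dir::Etc::SourceList="$SANDBOX/sources.list" \
#         -o Dir::Cache::Archives="$SANDBOX/archives" \
#         -o Dir::State::Lists="$SANDBOX/lists" \
#         download hello
```

The .deb files would then land in the current directory (apt-get download writes there), and the sandbox can simply be removed afterwards.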