Hi,

My original purpose was to keep a local copy of all the remote server
files using FTP mirroring.
While googling for a patch, I came across this discussion,
http://www.mail-archive.com/[email protected]/msg06759.html
where the demand for such an option goes back to 2004.

-- aneeskA

On Fri, Mar 5, 2010 at 6:22 PM, Keisial <[email protected]> wrote:
> Micah Cowan wrote:
>> The page you linked to said (in Japanese) that it's foolish to have to
>> rm -r the directory before each mirror attempt with wget. I don't really
>> agree: since wget's going to have to download each and every file
>> _anyway_, just in order to ensure that it finds all the available links,
>> it hardly seems useful to leave the previous files around, just so they
>> get overwritten.
>
> I have to disagree with you, Micah.
> Consider the case where you are mirroring a site exposed via a
> web-server-generated directory index.
> wget won't download all files, only all html pages (which is good, just
> clarifying your answer).
> The user knows (believes) that all existing files are reachable there,
> so deleting could make sense.
> Moreover, the "download all pages to follow their links" approach fails
> here, since wget will download the same index many times, sorted by every
> field, even when the user knows they are redundant and has --reject-ed
> them.
>
>> If wget were made to parse the local files when it
>> realizes it doesn't have to re-download them, then that would help a
>> lot, but it doesn't currently, and trying to make it do so has some
>> potential problems (though it still might be worth it).
>
> It could be a nice addition; I'm not sure what problems you are thinking
> of. Pages with odd timestamps?
>
>> Such a feature might be better for FTP, which does impart sufficient
>> knowledge to Wget, but the patch you linked doesn't provide that.
>
> That patch only deletes files which wget tries to download (reachable /
> on a URL list). It can be useful in some cases, but it's too specific a
> functionality. I'd go for find-like options --exec <command> ; /
> --exec-if-status <n> <command> ; so that the user could delete the
> missing files, but also move them to an archive, AV-check files on
> download, etc.
>
>> Anyway, Wget currently lacks a maintainer (as of January), so I'm afraid
>> no one's going to add new features until that changes. I'm the former
>> maintainer, and am occasionally willing to apply easy bugfix patches,
>> but nothing beyond that.
>
> I think it will take some time to get a new maintainer.
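For what it's worth, the behaviour requested here can be approximated
without patching wget. A rough sketch follows, assuming a GNU userland
and a server that returns Unix-style listings; the host and paths are
hypothetical, and filenames with spaces or symlink entries would need
more care. The idea: wget's --no-remove-listing option keeps the raw
.listing file the FTP server returned in each mirrored directory, and
any local entry absent from that listing can then be deleted.

    # Mirror, keeping the per-directory FTP listings around.
    wget --mirror --no-remove-listing ftp://ftp.example.org/pub/

    # In each mirrored directory, remove entries the server no longer
    # lists. The last field of a Unix-style listing line is the name;
    # dotfiles are skipped by the glob.
    find ftp.example.org/pub -name .listing | while read -r listing; do
        dir=$(dirname "$listing")
        for path in "$dir"/*; do
            [ -e "$path" ] || continue   # empty directory: glob did not match
            awk '{ print $NF }' "$listing" | grep -qxF -- "${path##*/}" \
                || rm -rf -- "$path"
        done
    done

lftp's "mirror --delete" implements the same behaviour natively, for
those able to switch tools.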

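The --exec idea can likewise be emulated with an external wrapper until
something like it lands in wget. A minimal sketch, again with a
hypothetical host, and with clamscan standing in purely as an example of
the AV check mentioned above; find's -cnewer test is used because wget's
timestamping sets each file's mtime to the server's, so only the ctime
reliably reflects when the local copy was written:

    touch .mirror-started                 # mark the start of this run
    wget --mirror ftp://ftp.example.org/pub/
    # Run a command on every file (re)written during this run.
    find ftp.example.org/pub -type f -cnewer .mirror-started \
        -exec clamscan {} \;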