I recently found that during a wget "mirror" run, not all of the files are downloaded (wget v1.8.2 on Debian). For example:
    wget --mirror http://www.jeannette.hu

downloads some files, but ./saj_elemei, for example, will only contain filelist.xml, with the following content:

    <xml xmlns:o="urn:schemas-microsoft-com:office:office">
     <o:MainFile HRef="../saj.htm"/>
     <o:File HRef="image001.jpg"/>
     <o:File HRef="image002.gif"/>
     <o:File HRef="image003.gif"/>
     <o:File HRef="image004.gif"/>
     <o:File HRef="filelist.xml"/>
    </xml>

However, if I issue

    wget [--no-parent] --mirror http://www.jeannette.hu/saj_elemei

then the following files also get downloaded:

    -rw-r--r-- 1 root root   257 Oct 29  2001 filelist.xml
    -rw-r--r-- 1 root root  2506 Oct 29  2001 image001.jpg
    -rw-r--r-- 1 root root 23343 Oct 29  2001 image001.png
    -rw-r--r-- 1 root root  4959 Oct 29  2001 image002.gif
    -rw-r--r-- 1 root root  1053 Oct 29  2001 image003.gif
    -rw-r--r-- 1 root root  4246 Oct 29  2001 image004.gif
    -rw-r--r-- 1 root root 27068 Oct 29  2001 image004.wmz
    -rw-r--r-- 1 root root 17627 Oct 29  2001 image006.gif
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?D=A
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?D=D
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?M=A
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?M=D
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?N=A
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?N=D
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?S=A
    -rw-r--r-- 1 root root  1447 Aug 15 16:33 index.html?S=D

My goal is to get as many files as possible (i.e. a full retrieval) of a site, preferably with a single command. I tried several other FTP mirroring programs, but they're racing each other for the "crappiest program on earth" title.

Is it wget's fault, or am I the dumb one who missed something somewhere?

Thanks,
Gabor
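P.S. To cross-check what a mirror of that directory should at least contain, something like the following could list the URLs named in filelist.xml. It is only a rough sketch: the function name `listed_urls` is made up here, the trailing slash on the base URL is an assumption, and nothing is actually downloaded.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Content of saj_elemei/filelist.xml as quoted in this post.
FILELIST = """<xml xmlns:o="urn:schemas-microsoft-com:office:office">
 <o:MainFile HRef="../saj.htm"/>
 <o:File HRef="image001.jpg"/>
 <o:File HRef="image002.gif"/>
 <o:File HRef="image003.gif"/>
 <o:File HRef="image004.gif"/>
 <o:File HRef="filelist.xml"/>
</xml>"""


def listed_urls(xml_text, base_url):
    """Return absolute URLs for every HRef attribute in the file list.

    `listed_urls` is a hypothetical helper, not part of wget.
    `base_url` is assumed to be the directory URL, ending in '/'.
    """
    root = ET.fromstring(xml_text)
    urls = []
    for elem in root.iter():
        href = elem.get("HRef")  # the attribute itself carries no namespace
        if href is not None:
            urls.append(urljoin(base_url, href))
    return urls


if __name__ == "__main__":
    for url in listed_urls(FILELIST, "http://www.jeannette.hu/saj_elemei/"):
        print(url)
```

Note that the directory listing above also contains files (image001.png, image004.wmz, image006.gif) that filelist.xml does not mention, so this list is a lower bound, not a full inventory.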
