I recently found that a wget "mirror" run does not download all the files
of a site (wget v1.8.2 on Debian). For example:

wget --mirror http://www.jeannette.hu
downloads some files, but the ./saj_elemei directory, for example, ends up
containing only filelist.xml, with the following content:

<xml xmlns:o="urn:schemas-microsoft-com:office:office">
 <o:MainFile HRef="../saj.htm"/>
 <o:File HRef="image001.jpg"/>
 <o:File HRef="image002.gif"/>
 <o:File HRef="image003.gif"/>
 <o:File HRef="image004.gif"/>
 <o:File HRef="filelist.xml"/>
</xml>

However, if I run
wget [--no-parent] --mirror http://www.jeannette.hu/saj_elemei
then the following files get downloaded as well:

-rw-r--r--    1 root     root          257 Oct 29  2001 filelist.xml
-rw-r--r--    1 root     root         2506 Oct 29  2001 image001.jpg
-rw-r--r--    1 root     root        23343 Oct 29  2001 image001.png
-rw-r--r--    1 root     root         4959 Oct 29  2001 image002.gif
-rw-r--r--    1 root     root         1053 Oct 29  2001 image003.gif
-rw-r--r--    1 root     root         4246 Oct 29  2001 image004.gif
-rw-r--r--    1 root     root        27068 Oct 29  2001 image004.wmz
-rw-r--r--    1 root     root        17627 Oct 29  2001 image006.gif
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?D=A
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?D=D
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?M=A
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?M=D
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?N=A
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?N=D
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?S=A
-rw-r--r--    1 root     root         1447 Aug 15 16:33 index.html?S=D
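
To see which of these files filelist.xml actually names, the two listings can be compared with a small script. This is only a rough sketch: Python is an arbitrary choice here, the names manifest_files, FILELIST_XML and DOWNLOADED are made up for illustration, and the data is copied by hand from the listings above (the index.html?X=Y variants are left out).

```python
import xml.etree.ElementTree as ET

# Namespace used by the o: prefix in filelist.xml.
OFFICE_NS = "urn:schemas-microsoft-com:office:office"

# Content of ./saj_elemei/filelist.xml, copied from above.
FILELIST_XML = """<xml xmlns:o="urn:schemas-microsoft-com:office:office">
 <o:MainFile HRef="../saj.htm"/>
 <o:File HRef="image001.jpg"/>
 <o:File HRef="image002.gif"/>
 <o:File HRef="image003.gif"/>
 <o:File HRef="image004.gif"/>
 <o:File HRef="filelist.xml"/>
</xml>"""

# Files fetched by the second wget run (hand-copied, index.html?X=Y omitted).
DOWNLOADED = [
    "filelist.xml", "image001.jpg", "image001.png", "image002.gif",
    "image003.gif", "image004.gif", "image004.wmz", "image006.gif",
    "index.html",
]


def manifest_files(xml_text):
    """Return the HRef of every o:File element in the manifest."""
    root = ET.fromstring(xml_text)
    return [el.get("HRef") for el in root.iter("{%s}File" % OFFICE_NS)]


listed = manifest_files(FILELIST_XML)
# Downloaded, but not named by the manifest.
unlisted = sorted(set(DOWNLOADED) - set(listed))
# Named by the manifest, but not downloaded.
missing = sorted(set(listed) - set(DOWNLOADED))

print("not in filelist.xml:", unlisted)
print("listed but not downloaded:", missing)
```

With the data above, image001.png, image004.wmz, image006.gif and index.html are the files that the manifest does not mention. My guess (an assumption, not something I verified) is that wget found them only through the server-generated directory index, which would also explain the index.html?N=A style files.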

My goal is to retrieve as many files of a site as possible (i.e. a full
retrieval), preferably with a single command. I tried several other FTP
mirroring programs, but they're racing each other for the "crappiest program
on earth" title. Is it wget's fault, or am I the dumb one who missed
something somewhere?

Thanks, Gabor
