Follow-up Comment #1, bug #20522 (group wget):

I had a look at this, and it seems to be due to a limitation of FTP; a fix
is perhaps quite impractical to implement.

Files are discovered by listing contents of a directory and parsing the
output. Therefore, we can only get the path of a symlink's target by parsing
`ls' output.
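
As a rough illustration of what that parsing involves (a minimal sketch in
Python, not wget's actual code), assuming the server produces Unix `ls -l'
style listings, which FTP does not guarantee. The function name and the
sample line are made up for the example:

```python
def parse_symlink(list_line):
    """Return (name, target) if a listing line describes a symlink, else None.

    Assumes Unix `ls -l` style output; other listing formats are not handled.
    """
    if not list_line.startswith("l"):
        return None
    # Fields: mode, link count, owner, group, size, month, day, time/year, name
    fields = list_line.split(None, 8)
    if len(fields) < 9:
        return None
    name, sep, target = fields[8].partition(" -> ")
    if not sep:
        return None
    return name, target


# Hypothetical listing line, for illustration only
line = "lrwxrwxrwx   1 ftp  ftp        11 Jan 01  2007 latest.tar.gz -> wget-1.10.tar.gz"
print(parse_symlink(line))
```

Note that the listing only yields the target's path; the date on that line
belongs to the link itself, not to the file it points to.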

Some FTP servers provide an MDTM command, which returns the modification
timestamp of a file given its path:
https://www.solarwinds.com/serv-u/tutorials/mdtm-mkd-mode-noop-sscn-xcrc-ftp-command

This is however not always available.
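
For illustration, a minimal Python sketch of querying MDTM with the standard
ftplib module and falling back when the server rejects the command. The
helper names are made up for the example, and whether MDTM on a symlink path
reports the target's timestamp depends on the server:

```python
from datetime import datetime, timezone
from ftplib import error_perm


def parse_mdtm_reply(reply):
    """Parse a '213 YYYYMMDDHHMMSS[.sss]' reply into a UTC datetime."""
    code, _, stamp = reply.strip().partition(" ")
    if code != "213":
        raise ValueError("unexpected MDTM reply: %r" % reply)
    # Drop optional fractional seconds
    stamp = stamp.split(".")[0]
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def remote_mtime(ftp, path):
    """Return the mtime of `path`, or None if the server rejects MDTM.

    `ftp` is expected to be a connected ftplib.FTP instance.
    """
    try:
        return parse_mdtm_reply(ftp.sendcmd("MDTM " + path))
    except error_perm:
        # 5xx reply: command not supported, or the path is not a plain file
        return None
```

A caller would still need a fallback for the None case, which is exactly the
situation the rest of this comment is about.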

To provide support for resolving the timestamp of the target of a symlink, I
see 2 ways off the top of my head:

* maintain a data structure to
        * populate given the paths of all files to download
        * traverse to link symlinks to their file nodes
        * traverse again to download

* or, for each symlink encountered
        * cd into the directory of the linked-to file
        * list contents again
        * parse the contents again and compare file by file to find the target file
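
To make the first option a bit more concrete, here is a minimal Python sketch
(not a proposal for the actual wget code). It assumes all listings have
already been parsed into a dict keyed by absolute remote path; the dict
layout, the 'mtime' and 'target' keys, and the function name are assumptions
made up for the example:

```python
import posixpath


def resolve_symlinks(entries):
    """Map each symlink path to the mtime of its final target.

    `entries` maps absolute remote path -> dict with 'mtime' and, for
    symlinks only, 'target'. The result is None for links whose target was
    not listed or that form a cycle.
    """
    resolved = {}
    for path, entry in entries.items():
        if "target" not in entry:
            continue
        seen = set()
        current = path
        # Follow the chain of links until reaching a non-link, an unknown
        # path, or a path we have already visited.
        while (current in entries and "target" in entries[current]
               and current not in seen):
            seen.add(current)
            link = entries[current]["target"]
            base = posixpath.dirname(current)
            current = posixpath.normpath(posixpath.join(base, link))
        final = entries.get(current)
        if final is not None and "target" not in final:
            resolved[path] = final["mtime"]
        else:
            resolved[path] = None
    return resolved
```

With such an index each link is resolved by dictionary lookups rather than by
another directory listing, but it only works for targets that lie inside the
tree being downloaded.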

The latter might be okay for one file, although I would argue it is not a
very clean solution. It also has the potential to explode in computational
complexity once we do a recursive download with many symlinks, since every
symlink triggers another directory listing and another parse.

I don't know what is being done about this in V2, but perhaps the best way to
circumvent this is to use --retr-symlinks=no. This will recreate symlinks
locally as symlinks instead of downloading the linked-to files as normal
files.

As long as the source is trusted, dangling symlinks that resolve to outside
the local mirror might not be a practical security concern.


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?20522>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
