Tim Ruehsen <tim.rueh...@gmx.de> writes:
>> Perhaps we do not want to have --no-parent suppressed by
>> --page-requisites. It seems that --no-parent is intended as a security
>> measure, and the existing code (as well as this proposal) violate its
>> fundamental premise.
>
> --no-parent seems to be intended as a bandwidth limiter together with -r.
> When talking about security, what realistic scenario do you have in mind?
>
> Anyways, we definitely don't want to change the default behavior.

What I see in the manual page (admittedly, an old one, 1.16.1) is:

    -np
    --no-parent
        Do not ever ascend to the parent directory when retrieving
        recursively. This is a useful option, since it guarantees that
        only the files below a certain hierarchy will be downloaded.

In the Info page, I see more:

In 2.11, "Recursive Accept/Reject Options":

    '-np'
    '--no-parent'
        Do not ever ascend to the parent directory when retrieving
        recursively. This is a useful option, since it guarantees that
        only the files _below_ a certain hierarchy will be downloaded.
        *Note Directory-Based Limits::, for more details.

In 4.3, "Directory-Based Limits":

    '-np'
    '--no-parent'
    'no_parent = on'
        The simplest, and often very useful way of limiting directories
        is disallowing retrieval of the links that refer to the
        hierarchy "above" than the beginning directory, i.e. disallowing
        ascent to the parent directory/directories.

        The '--no-parent' option (short '-np') is useful in this case.
        Using it guarantees that you will never leave the existing
        hierarchy. Supposing you issue Wget with:

            wget -r --no-parent http://somehost/~luzer/my-archive/

        You may rest assured that none of the references to
        '/~his-girls-homepage/' or '/~luzer/all-my-mpegs/' will be
        followed. Only the archive you are interested in will be
        downloaded. Essentially, '--no-parent' is similar to
        '-I/~luzer/my-archive', only it handles redirections in a more
        intelligent fashion.

        *Note* that, for HTTP (and HTTPS), the trailing slash is very
        important to '--no-parent'. HTTP has no concept of a
        "directory"--Wget relies on you to indicate what's a directory
        and what isn't. In 'http://foo/bar/', Wget will consider 'bar'
        to be a directory, while in 'http://foo/bar' (no trailing
        slash), 'bar' will be considered a filename (so '--no-parent'
        would be meaningless, as its parent is '/').

The text "You may rest assured that none of the references to
'/~his-girls-homepage/' or '/~luzer/all-my-mpegs/' will be followed."
suggests that --no-parent can be relied upon as a type of security
feature.

I am not personally deeply concerned about this. But I want to see the
issue discussed on the mailing list, as the current default behavior
differs from the documentation in a way that might be important.

Dale
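
For anyone who wants the documented rule in concrete form, here is a minimal Python sketch of the behavior the Info page describes: the starting directory is everything up to the last slash of the starting URL's path, and a link is acceptable only if its path lies under that directory. This is an illustration of the documentation text only, not Wget's actual code. The function names start_directory and allowed_by_no_parent are made up for this example, and the sketch ignores hosts, redirections, ".." segments, and --page-requisites.

```python
from urllib.parse import urlsplit


def start_directory(url: str) -> str:
    """Directory part of the URL path, per the trailing-slash rule.

    'http://foo/bar/' gives '/bar/' (bar is a directory);
    'http://foo/bar'  gives '/'     (bar is a filename, its parent is '/').
    """
    path = urlsplit(url).path or "/"
    # Keep everything up to and including the last slash.
    return path[: path.rfind("/") + 1]


def allowed_by_no_parent(start_url: str, candidate_url: str) -> bool:
    """True if candidate_url's path is below the starting directory.

    Hypothetical helper: compares paths only, with no normalization.
    """
    base = start_directory(start_url)
    candidate_path = urlsplit(candidate_url).path or "/"
    return candidate_path.startswith(base)


if __name__ == "__main__":
    start = "http://somehost/~luzer/my-archive/"
    for link in (
        "http://somehost/~luzer/my-archive/2019/index.html",
        "http://somehost/~luzer/all-my-mpegs/",
        "http://somehost/~his-girls-homepage/",
    ):
        print(link, allowed_by_no_parent(start, link))
```

With the starting URL from the manual's example, only the first link passes the check. Dropping the trailing slash from the starting URL makes the starting directory '/~luzer/', so the 'all-my-mpegs' link passes as well, which is the pitfall the Info page warns about.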