"Martin Koniczek" <[EMAIL PROTECTED]> writes: > my wget (GNU Wget 1.10) on a crux-based system simply truncates the > # and everything after [...]
The part after the "#" in HTTP URLs is what some call a "fragment
identifier". Browsers use it to position the page at the <a> element
whose NAME attribute matches the fragment name. In other words, when
you type http://www.server.com/file.html#bla, a browser will request
"/file.html" from www.server.com and position the page at the "bla"
anchor, if one exists. It will *not* ask for "/file.html#bla".

Since Wget doesn't display the page, it tries to be compatible with
the browsers by also not sending the part after the #.

> in contrast to the faq (http://www.gnu.org/software/wget/faq.html):
>
> 3.3 How do I download a URL with funny characters in it?
[...]

The FAQ is very imprecise here with its use of the term "funny
characters". There are characters that are specially processed by the
shell, and then there are characters with special meanings in URLs.
The former can be protected by shell quoting and the latter by URL
quoting.
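
To make the distinction concrete, here is a rough sketch in Python (not
Wget's own code) of what a client ends up requesting. The helper name
request_target and the file name "file#bla.html" are made up for the
illustration; only the standard urllib.parse module is used.

```python
from urllib.parse import quote, urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query) a client would send for `url`.

    Hypothetical helper for illustration: the fragment is simply dropped,
    which mirrors the browser behaviour described above.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


# The fragment never reaches the server.
print(request_target("http://www.server.com/file.html#bla"))  # /file.html

# A literal "#" in a file name must be URL-quoted as %23 to be sent.
name = "file#bla.html"  # assumed file name, not taken from the original report
quoted = quote(name)    # file%23bla.html
print(request_target("http://www.server.com/" + quoted))  # /file%23bla.html
```

So if the "#" really is part of the file name, it should be written as
%23 in the URL. Shell quoting is a separate layer: putting the whole URL
in single quotes only protects it from the shell and does not change
what the "#" means inside the URL.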
