henry luo wrote:
> i find a problem at GNU Wget 1.9.1, but i dont know it is a new
> function or a bug;
> the old version(1.8.2) download a link ,for example:
>
> wget 'http://www.expekt.com/odds/eventsodds.jsp?range=100&sortby=date&active=betting&betcategoryId=SOC%25'
>
>
> save file name is
>
> "eventsodds.jsp?range=100&sortby=date&active=betting&betcategoryId=SOC%25"
>
> but the new version(1.9.1) save name is
>
> "eventsodds.jsp?range=100&sortby=date&active=betting&betcategoryId=SOC%"

It is a feature. The latest version of wget converts %nn to the appropriate
character *if* that character is valid in a filename on the target system.
In this case, %25 converts to "%", which can appear in a filename.
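
To make the decoding step concrete, here is a small Python sketch. It is not wget's code; it only uses the standard library's percent-decoder to show what "%25 converts to %" means for the name in question, and that the decoded name can no longer be told apart from a remote name that really ended in a bare "%".

```python
from urllib.parse import unquote

# File-name part of the URL from the original question.
remote = "eventsodds.jsp?range=100&sortby=date&active=betting&betcategoryId=SOC%25"

# %25 is the percent-encoding of "%" itself, so decoding yields a bare "%".
local = unquote(remote)
print(local)

# A remote name that already ends in a bare "%" decodes to the same string,
# so the original spelling cannot be recovered from the local name.
print(unquote("SOC%25") == unquote("SOC%"))
```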
The --restrict-file-names option gives you some control over which
characters are escaped, but it does not appear to provide the functionality
you're looking for:
`--restrict-file-names=MODE'
Change which characters found in remote URLs may show up in local
file names generated from those URLs. Characters that are
"restricted" by this option are escaped, i.e. replaced with `%HH',
where `HH' is the hexadecimal number that corresponds to the
restricted character.
By default, Wget escapes the characters that are not valid as part
of file names on your operating system, as well as control
characters that are typically unprintable. This option is useful
for changing these defaults, either because you are downloading to
a non-native partition, or because you want to disable escaping of
the control characters.
When mode is set to "unix", Wget escapes the character `/' and the
control characters in the ranges 0-31 and 128-159. This is the
default on Unix-like OS'es.
When mode is set to "windows", Wget escapes the characters `\',
`|', `/', `:', `?', `"', `*', `<', `>', and the control characters
in the ranges 0-31 and 128-159. In addition to this, Wget in
Windows mode uses `+' instead of `:' to separate host and port in
local file names, and uses `@' instead of `?' to separate the
query portion of the file name from the rest. Therefore, a URL
that would be saved as `www.xemacs.org:4300/search.pl?input=blah'
in Unix mode would be saved as
`www.xemacs.org+4300/search.pl@input=blah' in Windows mode. This
mode is the default on Windows.
If you append `,nocontrol' to the mode, as in `unix,nocontrol',
escaping of the control characters is also switched off. You can
use `--restrict-file-names=nocontrol' to turn off escaping of
control characters without affecting the choice of the OS to use
as file name restriction mode.
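
The rules quoted above can be modelled roughly in Python as follows. This is a sketch of the documented behaviour only, not wget's implementation: the function names, the `nocontrol` parameter, the upper-case hex digits, and the choice to decode the query part the same way as the path are all my own assumptions for illustration.

```python
from urllib.parse import unquote

# Characters the manual lists as restricted in each mode.
WINDOWS_CHARS = set('\\|/:?"*<>')
UNIX_CHARS = {"/"}


def escape_component(text, mode="unix", nocontrol=False):
    """Decode %nn sequences, then re-escape whatever the mode restricts."""
    restricted = WINDOWS_CHARS if mode == "windows" else UNIX_CHARS
    # latin-1 maps each decoded byte to one character, so codes 128-159 survive.
    decoded = unquote(text, encoding="latin-1")
    out = []
    for ch in decoded:
        code = ord(ch)
        is_control = code <= 31 or 128 <= code <= 159
        if ch in restricted or (is_control and not nocontrol):
            out.append("%{:02X}".format(code))  # the manual's `%HH' form
        else:
            out.append(ch)
    return "".join(out)


def local_file_name(path, query, mode="unix", nocontrol=False):
    """Join path and query with `?' (unix) or `@' (windows)."""
    sep = "@" if mode == "windows" else "?"
    name = escape_component(path, mode, nocontrol)
    if query:
        name += sep + escape_component(query, mode, nocontrol)
    return name


print(local_file_name("search.pl", "input=blah", mode="unix"))
print(local_file_name("search.pl", "input=blah", mode="windows"))
print(local_file_name("eventsodds.jsp", "betcategoryId=SOC%25"))
```

Note that "%" is in neither restricted set, which is why no mode in this model keeps `SOC%25' as it was in the URL.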