Micah Cowan wrote:
> Nichlas wrote:
>> Hi, i'm new to the list.
> 
>> I'm currently trying to download about 600 pdf's linked to from
>> individual HTML pages on a site.
> 
>> Problem is, that when the PDFs get downloaded, they get names like
> 
>> "B111009.pdf?rguid=C26A0D99-F4AB-4C2B-B918-F94B51EE7C3C&rnr=20053"
> 
>> Is there any way to get wget to save them just as "B111009.pdf?"
> 
> Not really. You can, of course, rename them after the fact. A
> Bourne-style shell script to do this might be:
> 
>   for f in *.pdf\?*
>   do
>     newfname="$(echo "$f" | sed 's/\(\.pdf\).*$/\1/')"
>     (
>       set -x
>       mv "$f" "$newfname"
>     )
>   done
> 
> It's possible that the server sends Content-Disposition headers for
> these files, suggesting a cleaner name for them. In that case, you might
> consider downloading the latest development version of Wget
> (http://wget.addictivecode.org/FrequentlyAskedQuestions#download), and
> using the --content-disposition option.
> 
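Just to illustrate, the sed expression in the quoted script strips everything after ".pdf"; run against the example name from earlier in the thread it gives:

```shell
# Example name from the thread; the sed call drops the query-string suffix
f='B111009.pdf?rguid=C26A0D99-F4AB-4C2B-B918-F94B51EE7C3C&rnr=20053'
newfname="$(printf '%s\n' "$f" | sed 's/\(\.pdf\).*$/\1/')"
printf '%s\n' "$newfname"   # prints: B111009.pdf
```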
Maybe also
---
        -O file
        --output-document=file
           The documents will not be written to the appropriate files,
but all will be concatenated together and written to file.  If - is used
as file, documents will be printed to standard output, disabling link
conversion.  (Use ./- to print to a file literally named -.)

        Note that a combination with -k is only well-defined for downloading a
single document.
---
may be of some interest to you, since it seems you are batch-fetching and
may know a good name in advance.
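One way -O could be used per URL is to derive the clean name from the URL itself. A dry-run sketch (the here-doc and example.com host are placeholders for your real URL list; remove the leading "echo" to actually download):

```shell
# Derive a clean file name from each URL and hand it to wget via -O.
# Replace the here-doc with your real URL list (e.g. < urls.txt).
while read -r url; do
  name="$(basename "${url%%\?*}")"   # drop the query string, keep the base name
  echo wget -O "$name" "$url"        # dry run; remove 'echo' to download
done <<'EOF'
http://example.com/B111009.pdf?rguid=C26A0D99-F4AB-4C2B-B918-F94B51EE7C3C&rnr=20053
EOF
```

Since -O concatenates everything into one file, invoking wget once per URL like this sidesteps the single-document restriction noted above.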

Greetings

Matthias
