Because you are doing a recursive download, you can manipulate the robots.txt file.

The easiest way to get it going is:
wget -r url/image.png    # or a .gif, etc.

This will build the folder structure and should fetch the server's robots.txt file.
Edit that file to exclude the unwanted URLs,
then do your actual recursive mirror.
Just Google an example robots.txt to see how to write the rules you need.
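The steps above can be sketched roughly as follows (the intranet/mywiki URL is taken from Tobias's mail; the image name and Disallow paths are hypothetical, and note that wget normally re-fetches robots.txt over HTTP, so the edited file has to end up back on the server for the exclusions to take effect):

```shell
# 1. Recursively fetch a single file; this creates the local folder
#    structure and pulls down the server's robots.txt.
wget -r http://intranet/mywiki/image.png

# 2. Edit the downloaded robots.txt to disallow the unwanted URLs,
#    e.g. the edit/upload pages with "?" in them (paths hypothetical):
cat > intranet/robots.txt <<'EOF'
User-agent: *
Disallow: /mywiki/index.php?
EOF
# ...then place this robots.txt back on the server, since wget reads
# it over HTTP rather than from the local mirror.

# 3. Run the actual recursive mirror; leave out -e robots=off so the
#    robots.txt exclusions are honoured.
wget -r -k -p -E -N -l inf http://intranet/mywiki/
```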

Paul

On Mon, Mar 19, 2012 at 3:25 AM, Tobias Krais <[email protected]> wrote:
> Hi together,
>
> I want to mirror a wiki. For this I use the command
> wget -e robots=off -r -k -p -E -N -l inf intranet/mywiki/
>
> The request takes a long time, because each page of the wiki has an
> edit, upload, ... function. All these "unwanted" pages have one thing
> in common: the URL contains a "?".
>
> My question: Is it possible to exclude pages from the download? If yes,
> how can I do it?
>
> Your help is highly appreciated!
>
> Greetings,
>
> Tobias
>
