Re: [Bug-wget] sudo wget --accept sandwich

Tim Ruehsen Wed, 10 Feb 2016 06:57:43 -0800

Hi Marcel,

wget downloads
"http://www.mfd.mw.tu-dresden.de/mfd/index.php/lehre/wintersemester/videos-winter",
parses it for URLs, and in the end removes the document because it doesn't
match your settings.


Your '-A asx' means: Just download/keep *.asx files
Your '--accept-regex videos' means: Just download/keep *video* files

That means, you only want to download *video*.asx files.
Is that what you want?
Did you try -A '*.pdf' -R '*.asx'? (Quotes to avoid shell wildcard
expansion.)
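
To illustrate why the start page gets removed, here is a minimal sketch of the
combined effect of '-A asx' and '--accept-regex videos' as described above. It
is a toy model, not wget's actual code: the function name is_kept and the
example.org URLs are made up for illustration, and it assumes that an -A entry
without wildcards is compared against the end of the file name while the regex
is searched in the complete URL.

```python
import re
from fnmatch import fnmatch
from urllib.parse import urlparse


def is_kept(url, accept_list, accept_regex):
    """Toy model (hypothetical helper): keep a URL only if BOTH filters pass."""
    # Last path component, e.g. "videos-winter" or "lecture01.asx".
    name = urlparse(url).path.rsplit("/", 1)[-1]

    def matches(entry):
        # Assumption: entries containing wildcards act as patterns,
        # plain entries act as file name suffixes.
        if any(ch in entry for ch in "*?["):
            return fnmatch(name, entry)
        return name.endswith(entry)

    suffix_ok = any(matches(entry) for entry in accept_list)
    regex_ok = re.search(accept_regex, url) is not None
    return suffix_ok and regex_ok


start_page = ("http://www.mfd.mw.tu-dresden.de/mfd/index.php/"
              "lehre/wintersemester/videos-winter")

# The start page matches the regex but has no "asx" suffix.
print(is_kept(start_page, ["asx"], "videos"))  # False

# Hypothetical URLs, only to show the two filters interacting.
print(is_kept("http://example.org/videos/lecture01.asx", ["asx"], "videos"))  # True
print(is_kept("http://example.org/files/lecture01.asx", ["asx"], "videos"))   # False
```

In this model the start page passes the regex but fails the suffix test, which
would explain the "Removing ... since it should be rejected" message.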

Sorry, but I can only reach the login page; otherwise I could take a closer
look. Right now I can only guess.
Maybe you could give us an example HTML page.

Tim

On Wednesday 10 February 2016 09:01:20 Marcel Partap wrote:
> Dear wget devs,
> failed to resolve the following issue on my own, thus asking for
> assistance. I want to download all .asx files :-( linked from a
> uni-internal page. The page doesn't have a suffix.
> 
> > wget --load-cookies cookies.txt -A asx -mk -np -l 1 --accept-regex videos
> > -A asx
> > http://www.mfd.mw.tu-dresden.de/mfd/index.php/lehre/wintersemester/videos-winter
> This will throw
> 
> > Removing
> > www.mfd.mw.tu-dresden.de/mfd/index.php/lehre/wintersemester/videos-winter
> > since it should be rejected.
> and exit there.
> --accept-regex videos-winter, on the other hand, cannot be combined with
> --accept asx, it seems.
> So how do I get around this? This should be a common scenario, i.e. get
> all linked PDFs etc. from a web page, skipping all the links to
> dynamically generated CMS pages in the menu often placed before the page
> content.
> & IMHO, wget should simply never filter out URLs explicitly given to it
> on the command line.
> #Best Regards/MPartap
