Re: trouble with -p

2008-07-20 Thread Matthias Vill
Hi Brian, maybe this helps: --html-extension If a file of type application/xhtml+xml or text/html is downloaded and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option will cause the suffix .html to be appended to the local filename. This is useful, for instance, when
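The suffix rule quoted from the manual can be sketched as a small shell function (`add_html_suffix` is a hypothetical helper, not part of wget, mirroring the `\.[Hh][Tt][Mm][Ll]?` check):

```shell
# Hypothetical helper mimicking wget's --html-extension rule:
# append ".html" unless the name already ends in .htm or .html
# (any letter case), per the regexp \.[Hh][Tt][Mm][Ll]? above.
add_html_suffix() {
  case "$1" in
    *.[Hh][Tt][Mm]|*.[Hh][Tt][Mm][Ll]) printf '%s\n' "$1" ;;
    *) printf '%s.html\n' "$1" ;;
  esac
}

add_html_suffix "article.php"   # prints article.php.html
add_html_suffix "index.HTM"     # prints index.HTM
```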

Re: Wget 1.11.3 - case sensitivity and URLs

2008-06-12 Thread Matthias Vill
Hi list! Sadly I couldn't find the e-mail address of Allan (maybe because I'm attached via the news gateway), so this is a list-only post. Micah Cowan wrote: Hi Allan, You'll generally get better results if you post to the mailing list (wget@sunsite.dk). I've added it to the recipients list. Coombe,

Re: wget running in windows Vista

2008-01-31 Thread Matthias Vill
Liz Labbe wrote: I just downloaded wget and am trying to get it to work under the Windows Vista operating system. I cannot get it to connect: for example, I tried wget http://www.yahoo.com/ wget force_html=on http://www.yahoo.com/ etc. I consistently get the message connecting to

Re: Not all files downloaded for a web site

2008-01-27 Thread Matthias Vill
Alexandru Tudor Constantinescu wrote: I have the feeling wget is not really able to figure out which files to download from some web sites when CSS files are used. That's right. Up to and including wget 1.11 (released yesterday) there is no support for parsing links out of CSS files.

Re: Skip certain includes

2008-01-24 Thread Matthias Vill
Wayne Connolly wrote: Micah, thanks mate, I know we chatted on IRC but just thought someone else may be able to provide some insight. Cheers and thanks, Wayne. Well, it's just impossible to do this with wget, because wget only gets the output a client would get when visiting your page with no

Re: Filename trouble

2008-01-11 Thread Matthias Vill
Micah Cowan schrieb: Nichlas wrote: Hi, i'm new to the list. I'm currently trying to download about 600 pdf's linked to from individual HTML pages on a site. Problem is, that when the PDFs get downloaded, they get names like

Re: Content disposition question

2007-12-03 Thread Matthias Vill
Hi, we know this. It was recently discussed on the mailing list and I agree with you. But there are two arguments why this is not the default: a) It's a quite new feature for wget and would therefore break compatibility with prior versions, so any old script would need to be rewritten. b) It's

Re: wget default behavior [was Re: working on patch to limit to percent of bandwidth]

2007-10-17 Thread Matthias Vill
Tony Godshall wrote: If it was me, I'd have it default to backing off to 95% and have options for more aggressive behavior, like multiple connections, etc. I don't like a default back-off rule. I often encounter downloads with frequently changing speeds. The idea that the

Re: --limit-percent N versus --limit-rate N% ?

2007-10-15 Thread Matthias Vill
Micah Cowan wrote: I'm kinda leaning toward the idea that we change the parser for --limit-rate to something that takes a percentage, instead of adding a new option. While it probably means a little extra coding, it handily deals with broken cases like people specifying both --limit-rate and
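The proposed parser change could look roughly like this in shell (`parse_limit` is a made-up name sketching the idea, not wget code): a trailing % marks a percentage, anything else is treated as an absolute rate.

```shell
# Hypothetical parser for a --limit-rate value that may end in '%'
# (the thread's proposal); plain values pass through as rates.
parse_limit() {
  case "$1" in
    *%) printf 'percent %s\n' "${1%\%}" ;;
    *)  printf 'rate %s\n' "$1" ;;
  esac
}

parse_limit 50%    # prints: percent 50
parse_limit 100k   # prints: rate 100k
```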

Re: --limit-percent N versus --limit-rate N% ?

2007-10-15 Thread Matthias Vill
Micah Cowan schrieb: Matthias Vill wrote: I would appreciate having a --limit-rate N% option. So now about those broken cases: you could do some least-of-both policy (which would of course still need time to measure, and can only cut afterwards). Or otherwise you could use a non

Re: subscribing from this list

2007-10-15 Thread Matthias Vill
Josh Williams schrieb: On 10/15/07, Micah Cowan [EMAIL PROTECTED] wrote: Note that this doesn't help him much if he's lost his registration e-mail. Patrick, you'll probably have to go bug the staff at www.dotsrc.org, who hosts this list; send an email to [EMAIL PROTECTED] E-mail *address*

Re: -R and HTML files

2007-08-23 Thread Matthias Vill
Micah Cowan schrieb: Josh Williams wrote: On 8/22/07, Micah Cowan [EMAIL PROTECTED] wrote: What would be the appropriate behavior of -R then? I think the default option should be to download the html files to parse the links, but it should discard them afterwards if they do not match the

Re: -R and HTML files

2007-08-23 Thread Matthias Vill
Matthias Vill schrieb: Micah Cowan schrieb: Josh Williams wrote: On 8/22/07, Micah Cowan [EMAIL PROTECTED] wrote: What would be the appropriate behavior of -R then? I think the default option should be to download the html files to parse the links, but it should discard them afterwards

Re: -R and HTML files

2007-08-23 Thread Matthias Vill
Barnett, Rodney schrieb: -Original Message- From: Matthias Vill [mailto:[EMAIL PROTECTED] Sent: Thursday, August 23, 2007 1:54 AM To: wget@sunsite.dk Subject: Re: -R and HTML files Micah Cowan schrieb: Josh Williams wrote: On 8/22/07, Micah Cowan [EMAIL PROTECTED] wrote

Re: --spider requires --recursive

2007-08-17 Thread Matthias Vill
Hi Josh, the manual reads: --spider When invoked with this option, Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there. For example, you can use Wget to check your bookmarks: wget --spider --force-html -i

Re: text/html assumptions, and slurping huge files

2007-07-31 Thread Matthias Vill
exceed memory. I also guess that loading the whole 15M of a max-sized HTML file at once looks ugly in memory and can lead to problems when you have multiple wget processes at once. Maybe wget should be optimized for HTML files with a max size of 4M and parse in chunks afterwards. Greetings Matthias Vill

Re: How to get File names

2007-07-25 Thread Matthias Vill
Hi Karthik, Is there any way to find the file name of the downloaded file? Of course there is. Normally wget will transform the last segment of the URL into the filename, changing characters not supported by your filesystem into escaped equivalents; e.g. ? is stored as %3F, which is the
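The transformation described can be sketched like this (`url_to_filename` is a stand-in, not wget's actual code; wget's real escaping rules depend on platform and version):

```shell
# Stand-in for the rule described above: take the last URL segment
# and store unsupported characters escaped, e.g. '?' as %3F.
url_to_filename() {
  name=${1##*/}                      # keep only the last path segment
  printf '%s\n' "$name" | sed 's/?/%3F/g'
}

url_to_filename 'http://example.com/dir/page.php?id=7'
# prints: page.php%3Fid=7
```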

Re: Maximum 20 Redirections HELP!!!

2007-07-16 Thread Matthias Vill
Hi Jaymz, Jaymz Goktug YUKSEL wrote: process.php?no=0 and when this executes itself and finishes, it redirects itself to process.php?no=1, and this is supposed to go on until it reaches 650. However when it comes to 19 it stops and says Maximum redirection (20) reached. So it cannot
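One way around the redirect ceiling (an assumption on my part, not advice from the thread; the URL scheme is taken from the quoted message, the host is made up) is to request each numbered page directly so no redirect chain is ever followed:

```shell
# Request each page directly instead of following 650 redirects.
# 'echo' keeps the sketch network-free; drop it to actually download.
for n in $(seq 0 650); do
  echo wget "http://example.com/process.php?no=$n"
done | tail -n 1
# prints: wget http://example.com/process.php?no=650
```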

Re: --base does not consider references to root directory

2007-07-14 Thread Matthias Vill
So you would suggest handling it in such a way that when I use wget --base=/some/serverdir http://server/serverdir/, /.* will be interpreted as /some/.*, so if you have a link like /serverdir/ it would go back to /some/serverdir, right? I guess this would be OK. Just one question: if there is a link back

Re: --base does not consider references to root directory

2007-07-14 Thread Matthias Vill
I think I got your point: all in all this is still a matter of comparing the first URL against the current one and counting the common dirs from the left side. Then you compare that number (a) to the depth of the first URL (b) and add b-a times ../ so you get to the right position inside your base. By that

Re: wget bug?

2007-07-09 Thread Matthias Vill
Mauro Tortonesi schrieb: On Mon, 9 Jul 2007 15:06:52 +1200 [EMAIL PROTECTED] wrote: With wget under Win2000/WinXP I get No such file or directory error messages when using the following command line: wget -s --save-headers http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1class=Arc; %1 = 212BI

Re: Mirroring

2007-07-09 Thread Matthias Vill
Hi Miroslaw, AFAIK there is no option that makes wget delete files it did not download in the current run. So if you want to use wget for mirroring you would need to re-download the whole server to a local dir and then do something like --- CUT here --- cd new_dir; find . > ../new_files; cd ..; cp -af
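The idea in that snippet can be made concrete as a runnable sketch (the directory names old_dir/new_dir are assumptions; in practice both would come from wget runs): diff the two file listings and treat old-only entries as deletion candidates.

```shell
# Sketch: old_dir stands for the previous mirror, new_dir for the
# fresh download. The touch lines just fabricate a tiny example.
mkdir -p old_dir new_dir
touch old_dir/stale.html old_dir/keep.html new_dir/keep.html

( cd old_dir && find . -type f | sort ) > old_files
( cd new_dir && find . -type f | sort ) > new_files

# Lines only in the old listing, i.e. candidates for deletion:
comm -23 old_files new_files    # prints ./stale.html
```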

Re: bug storing cookies with wget

2007-06-03 Thread Matthias Vill
Mario Ander schrieb: Hi everybody, I think there is a bug storing cookies with wget. See this command line: C:\Programme\wget\wget --user-agent=Opera/8.5 (X11; U; en) --no-check-certificate --keep-session-cookies --save-cookies=cookie.txt --output-document=- --debug

Re: bug storing cookies with wget

2007-06-03 Thread Matthias Vill
Matthias Vill schrieb: Mario Ander schrieb: Hi everybody, I think there is a bug storing cookies with wget. See this command line: C:\Programme\wget\wget --user-agent=Opera/8.5 (X11; U; en) --no-check-certificate --keep-session-cookies --save-cookies=cookie.txt --output-document

Re: rapidshare

2007-05-23 Thread Matthias Vill
I don't know rapidshare, but I would guess it's either that you need to supply wget with a rapidshare login cookie (have a look at your browser's stored cookies for the rapidshare website), or you can try adding user:password@ to your URL, like http://user:[EMAIL PROTECTED]/file.ext. Hope this helps

Re: Feature request: Disable following redirections

2007-04-13 Thread Matthias Vill
Hi there, this is a post from November 2004, and yesterday I stumbled over a situation myself where I don't want to get redirected. I tried to script-download worksheets with incremental numbering. Easy, you would say: 404 or success, so check $?. The server does however try to be nice to me and always
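The $?-check the post has in mind looks like this in shell. Here `url_exists` is a stand-in for something like `wget --spider -q "$url"` so the sketch runs without a network, under the assumption that wget exits non-zero when the page is absent:

```shell
# Stand-in for `wget --spider -q "$url"`: succeed unless the URL
# contains "missing", mimicking a 404 without touching the network.
url_exists() {
  case "$1" in
    *missing*) return 1 ;;
    *)         return 0 ;;
  esac
}

url_exists http://example.com/sheet1.pdf  && echo "sheet1: found"
url_exists http://example.com/missing.pdf || echo "missing.pdf: 404"
```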