Re: wget and asp

2002-09-11 Thread Thomas Lussnig
To invoke HTML examples they use calls like (just the first example): http://www.w3schools.com/html/tryit.asp?filename=tryhtml_basic What filename did you expect for this? - tryit.asp - tryit.asp?filename=tryhtml_basic - tryhtml_basic Wget saves a file and a directory with this very

Re: wget and asp

2002-09-11 Thread Dominique
What filename did you expect for this? - tryit.asp - tryit.asp?filename=tryhtml_basic - tryhtml_basic Once again: the location is: http://www.w3schools.com/html/tryit.asp?filename=tryhtml_basic It is a frameset which requires frames. One of them is a problem, because it has special
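The three candidate names in the question correspond to different pieces of the URL. A minimal Python sketch (the URL is from the message; the splitting logic is my illustration, not wget's actual C code) shows where each candidate comes from:

```python
from urllib.parse import urlsplit

url = "http://www.w3schools.com/html/tryit.asp?filename=tryhtml_basic"
parts = urlsplit(url)

# Last path component only:
path_name = parts.path.rsplit("/", 1)[-1]     # "tryit.asp"
# Path component plus the query string:
full_name = path_name + "?" + parts.query     # "tryit.asp?filename=tryhtml_basic"
# Just the value of the "filename" parameter:
query_value = parts.query.split("=", 1)[1]    # "tryhtml_basic"

print(path_name, full_name, query_value)
```

wget normally keeps the query string as part of the local name (the second candidate), which is why a literal `?` ends up in saved filenames.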

wget hangs on ftp

2002-09-11 Thread Dominique
Hi wget wizards I noticed that wget hangs when I try to download something through ftp: wget ftp://ftp.reed.edu/pub/src/html-helper-mode.tar.gz --12:12:58-- ftp://ftp.reed.edu/pub/src/html-helper-mode.tar.gz => `html-helper-mode.tar.gz' Resolving ftp.reed.edu... done. Connecting to

Re: wget hangs on ftp

2002-09-11 Thread Noel Koethe
On Wed, 11 Sep 2002, Dominique wrote: Hello, wget ftp://ftp.reed.edu/pub/src/html-helper-mode.tar.gz ==> PORT ... done. ==> RETR html-helper-mode.tar.gz ... maybe you have to use --passive-ftp? -- Noèl Köthe
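Noel's suggestion can also be made permanent in a wgetrc file, so every ftp download uses passive mode (a sketch using the documented wgetrc command name):

```
# ~/.wgetrc -- equivalent to passing --passive-ftp on every invocation;
# useful behind firewalls/NAT where the PORT (active) data connection hangs
passive_ftp = on
```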

Re: wget and asp

2002-09-11 Thread Max Bowsher
Dominique wrote: tryit_edit.asp?filename=tryhtml_basic&referer=http://www.w3schools.com/html/html_examples.asp and just this one is truncated. I think some regexp or pattern or explicit list of where_not_to_break_a_string characters would solve the problem. Or maybe it is already possible,

Re: wget and asp

2002-09-11 Thread Dominique
Is it something I can do myself, or does the code have to be changed? Domi I think that some URL encoding has not happened somewhere. Whether wget or the web server is at fault, I don't know, but the solution would be to URL encode the slashes. Max.
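Max's suggestion of URL-encoding the slashes can be illustrated with Python's standard library (an illustration of the idea only; wget itself is written in C):

```python
from urllib.parse import quote

referer = "http://www.w3schools.com/html/html_examples.asp"

# safe="" forces every reserved character, including '/', to be
# percent-encoded, so the value can sit inside a query string
# (or a filename) without any ambiguity.
encoded = quote(referer, safe="")
print(encoded)
# -> http%3A%2F%2Fwww.w3schools.com%2Fhtml%2Fhtml_examples.asp
```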

directory prefix and domains

2002-09-11 Thread Scott D. Anderson
Hi Hrvoje (or whoever's reading this), Thanks for all your work on wget! It looks like just what I'll need. (I'm helping a colleague in the Political Science department and he wants to archive the web sites of political candidates.) However, I did some preliminary experiments and ran into some

RE: directory prefix and domains

2002-09-11 Thread Scott D. Anderson
Okay, I re-read the info files and found: `-k' `--convert-links' Convert the non-relative links to relative ones locally. Only the references to the documents actually downloaded will be converted; the rest will be left unchanged. Note that only at the end of the download
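For reference, the manual excerpt above maps onto these wgetrc settings (a sketch; same effect as passing -r -k on the command line):

```
# ~/.wgetrc -- mirror recursively and, once the download finishes,
# rewrite links in the saved pages so they work for local browsing
recursive = on
convert_links = on
```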

Re: wget and asp

2002-09-11 Thread Thomas Lussnig
tryit_edit.asp?filename=tryhtml_basic&referer=http://www.w3schools.com/html/html_examples.asp and just this one is truncated. I think some regexp or pattern or explicit list of where_not_to_break_a_string characters would solve the problem. Or maybe it is already possible, but I don't know

WGET and the robots.txt file...

2002-09-11 Thread Jon W. Backstrom
Dear Gnu Developers, We just ran into a situation where we had to spider a site of our own on an outsourced service because the company was going out of business. Because wget respects the robots.txt file, however, we could not get an archive made until we had the outsourced company delete their

Re: WGET and the robots.txt file...

2002-09-11 Thread Max Bowsher
-e robots=off Jon W. Backstrom wrote: Dear Gnu Developers, We just ran into a situation where we had to spider a site of our own on an outsourced service because the company was going out of business. Because wget respects the robots.txt file, however, we could not get an archive made
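Max's one-liner in wgetrc form, for anyone who wants the setting to persist (the robots command is documented in the wget manual):

```
# ~/.wgetrc -- same effect as `wget -e robots=off ...`;
# makes wget ignore robots.txt when spidering your own site
robots = off
```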

bug?

2002-09-11 Thread Mats Andrén
I found this problem when fetching files recursively: What if the filenames of linked files from a www-page contain the []-characters? They are treated as glob patterns instead of literal characters. Clearly not desirable! Since wget just fetches the filenames from the www-page,
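The [] problem Mats describes is the classic glob-pattern clash. A small Python illustration, with fnmatch standing in for wget's matcher (an assumption: I'm treating wget's matching as ordinary shell-style globbing):

```python
from fnmatch import fnmatch
from glob import escape

name = "page[1].html"

# As a pattern, "[1]" is a character class matching just "1",
# so a filename containing literal brackets no longer matches itself:
print(fnmatch(name, name))          # False
print(fnmatch("page1.html", name))  # True -- the wrong file matches!

# Escaping the brackets restores a literal match:
print(fnmatch(name, escape(name)))  # True
```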

Re: wget and asp

2002-09-11 Thread Max Bowsher
Thomas Lussnig wrote: Why should there be URL encoding? '/' is a legal character in a URL and in the GET string. It is used, for example, for path-to-query translation. The main problem is that wget needs to translate a URL into a filesystem name. Yes, you are right, I wasn't thinking clearly.
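The translation problem Thomas names can be sketched in Python: percent-encode only the characters that are unsafe in a local filename, so a slash inside the query string cannot be mistaken for a directory separator (an illustration of the idea, not wget's actual algorithm):

```python
from urllib.parse import quote

url_tail = ("tryit_edit.asp?filename=tryhtml_basic"
            "&referer=http://www.w3schools.com/html/html_examples.asp")

# '/' is legal in a URL query, but in a local path it would create
# directories; leave the other query punctuation alone and encode
# the slashes (quote also encodes '%' itself, keeping this reversible).
local_name = quote(url_tail, safe="?&=:._-")
print(local_name)
```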

Re: bug?

2002-09-11 Thread Thomas Lussnig
Mats Andrén wrote: I found this problem when fetching files recursively: What if the filenames of linked files from a www-page contain the []-characters? They are treated as glob patterns instead of literal characters. Clearly not desirable! Since wget just fetches the