Re: Maximum recursion depth handling bug

2001-06-23 Thread Ian Abbott
, but is all in one file (recur.c). It's mostly a complete reorganization of the recursive_retrieve function (which is no longer called recursively). Regards, Ian Abbott. Index: src/recur.c === RCS file: /pack/anoncvs/wget/src/recur.c,v

Re: Missing escape character

2001-06-26 Thread Ian Abbott
On Tue, 26 Jun 2001 10:45:42 -0400, Bill Bumgarner [EMAIL PROTECTED] wrote: On Tuesday, June 26, 2001, at 04:27 AM, Hrvoje Niksic wrote: Bill Bumgarner [EMAIL PROTECTED] writes: Certainly: [localhost:/tmp] bbum% cc -v Reading specs from /usr/libexec/gcc/darwin/ppc/2.95.2/specs Apple

Re: Missing escape character

2001-06-26 Thread Ian Abbott
On 26 Jun 2001 21:02:21 +0200, Hrvoje Niksic [EMAIL PROTECTED] wrote: Ian Abbott [EMAIL PROTECTED] writes: ch == '\'' || ch == ''== ch == '\\'' || ch == '\' ch == '\'' || ch == '\' == ch == '\\'' || ch == '\\\' Also, both versions of the character constant '' and '\' are valid

Re: Restarting wget after crash ??

2001-06-27 Thread Ian Abbott
On 27 Jun 2001, at 9:43, Bazuka [EMAIL PROTECTED] wrote: I have modified Wget slightly so it writes the URL and file info to a database and then deletes the actual file (with --delete-after). Can I use the -nc option in that case? Yes, but it will download the deleted files again.

Re: Q: (problem) wget on dos/win: question marks in url

2001-06-29 Thread Ian Abbott
On 29 Jun 2001, at 8:13, Rick Palazola [EMAIL PROTECTED] wrote: On Fri, 29 Jun 2001, Reto Kohli wrote: consider the following wget call (and think dos:) wget http://mydomain.org/index.html?foo=bar well? -- dos (and your average windows, too) does of course not allow you to write

Re: xhtml? Re: Standalone html parser

2001-07-02 Thread Ian Abbott
On 29 Jun 2001, at 17:40, Anees Shaikh [EMAIL PROTECTED] wrote: Henrik says that xhtml probably doesn't require a space before / to close the tag. But just for anecdotal evidence, all of the sites I've had this problem with do in fact put a space before the / . I guess the requirement of

Re: wget 1.7 fails to follow link

2001-07-05 Thread Ian Abbott
On 4 Jul 2001, at 23:20, Jacob Burckhardt wrote: I run wget on this file: ! -- A HREF=a.htmla/a ! -- A HREF=b.htmlb/a It downloads b.html, but it does not download a.html. This is

Re: wget 1.7 fails to follow link

2001-07-06 Thread Ian Abbott
On 5 Jul 2001, at 22:20, Jacob Burckhardt wrote: Ian Abbott writes: On 4 Jul 2001, at 23:20, Jacob Burckhardt wrote: I run wget on this file: ! -- A HREF=a.htmla

Re: More Domain/Directory Acceptance

2001-07-06 Thread Ian Abbott
On 6 Jul 2001, at 19:58, Jens Roesner wrote: Hi again! I am trying to start from http://members.tripod.de/jroes/test.html (have a look) The first link goes to a site I do not want. The second link goes to a site that should be retrieved. wget -r -l0 -nh -d -o test.log -H -I/bmaj*/

Re: More Domain/Directory Acceptance

2001-07-06 Thread Ian Abbott
On 6 Jul 2001, at 22:24, Jens Roesner wrote: Hi Ian, hi wgetters! Thanks for your help! It didn't work for me either, but the following variation did: wget -r -l0 -nh -d -o test.log -H -I'bmaj*' http://members.tripod.de/jroes/test.html Hm, did not for me :( neither in 1.4.5 nor in

Re: Problem with a site.

2001-07-06 Thread Ian Abbott
On 5 Jul 2001, at 10:45, Philippe Grimaldi wrote: Hello to all: This adress: (#1) http://195.64.57.234/asp/gatepage/bluewin.asp?CompetitionID=80 seems have problems with wget. I want obtain the 2 links (for example) on this page : (#2)

Re: More Domain/Directory Acceptance

2001-07-09 Thread Ian Abbott
On 7 Jul 2001, at 12:29, Jens Roesner wrote: Well if you're running it from a DOS-style shell, get rid of the single quotes I put in there, i.e. try -Ibmaj* Oh, I guess that was rather stupid of me. Not really if you've never used a UNIX-style shell. However, the windows version will

Re: [Question] What's the problem?

2001-07-10 Thread Ian Abbott
On 10 Jul 2001, at 10:16, ÀÌÀç·É wrote: When we execute wget, we can get this message... Connecting to www.chosun.com:80... connected! HTTP request sent, awaiting response... 206 Partial Content What's the problem??? What command line parameters did you use? And What is the 206

VIRUS ALERT! hmp123

2001-08-01 Thread Ian Abbott
The message with subject hmp123 sent from Liaoyq's account [EMAIL PROTECTED] is infected by the TROJ_SIRCAM.A email virus, as are messages with the subjects Ó¦Ó÷þÎñÆ÷µÄ°²×° ¼°ÅäÖà (sorry that probably doesn't render correctly!) and hmp123.

VIRUS ALERT! RecommendationMot

2001-08-01 Thread Ian Abbott
The message with subject RecommendationMot sent from Liaoyq's account [EMAIL PROTECTED] is infected by the TROJ_SIRCAM.A email virus, as are messages with the subjects hmp123 and Ó¦Ó÷þÎñÆ÷µÄ°²×° ¼°ÅäÖà (sorry that probably doesn't render correctly!).

Re: wget timestamping (-N) bug/feature?

2001-08-04 Thread Ian Abbott
On 4 Aug 2001, at 3:25, Bao, Jiangcheng wrote: Suppose I have page a.html, which has a link to b.html. If a is not changed, and b is changed. When I process a, I have no way to check a so that I can process b too, without downloading a. -N will cause a not to be downloaded, but not processed

Re: refused?

2001-08-08 Thread Ian Abbott
On 7 Aug 2001, at 14:25, Huseyin Ozdemir wrote: Do you have any idea why it is refused? Is it a kind of security? Is it possible to protect a site from a wget? It is possible for a web-site to check the information supplied to it in the headers of the request and refuse the connection. This

Re: wget 1.7 page requisites bug?

2001-08-10 Thread Ian Abbott
On 10 Aug 2001, at 2:13, Joakim Verona wrote: I have found 2 problems with wget. 1) when i use the page requisites flag with sites that use valid xhtml syntax for image links, wget doesn't understand it and doesn't download the page requisite. example: <img src=xxx.gif /> as opposed to <img

Re: Virtual sites problem

2001-08-13 Thread Ian Abbott
On 13 Aug 2001, at 16:28, Pawel Tobis wrote: Wget does not retrieve pages located on virtual sites. For example, when I want to download page from address http://www.fwbuilder.com/ which is a virtual site on sourceforge.net, wget tries to download file http://www.fwbuilder.com/index.html

Re: Incorrect calculating

2001-08-16 Thread Ian Abbott
On 15 Aug 2001, at 19:22, Andreas Heck wrote: And here is the bug: after continuing the download from another ftp Server wget says that it has downloaded 337% of the file instead of 77%. It counts up to 430% when the file is finished. I don't think the fact that the second half of the

Re: Size bug in wget-1.7

2001-08-21 Thread Ian Abbott
On 17 Aug 2001, at 11:41, Dave Turner wrote: On Fri, 17 Aug 2001, Dave Turner wrote: By way of a hack I have used the SIZE command, not supported by RFC959 but still accepted by many of the servers I use, to get the size of the file. If that fails then it falls back on the old method.

Re: wget -k crashes when converting a specific url

2001-08-23 Thread Ian Abbott
On 23 Aug 2001, at 3:01, Nathan J. Yoder wrote: Please fix this soon, ***COMMAND*** wget -k http://reality.sgi.com/fxgovers_houst/yama/panels/panelsIntro.html [snip] 02:30:05 (23.54 KB/s) - `panelsIntro.html' saved [3061/3061] Converting panelsIntro.html... zsh: segmentation fault (core

Re: wget -k crashes when converting a specific url

2001-08-24 Thread Ian Abbott
... zsh: segmentation fault (core dumped) Ian Abbott replied: I cannot reproduce this failure on my RedHat 7.1 box. I was able to reproduce this pretty easily on both Irix 6.5.2 and Digital Unix 4.0d, using gcc 2.95.2. (I bet Linux's glibc has code to protect against fwrite() calls

Re: wget1.7: Compilation Error (please Cc'ed to me :-)

2001-09-01 Thread Ian Abbott
On 31 Aug 2001, at 12:48, Edward J. Sabol wrote: Zefiro encountered the following compilation error: utils.c: In function `read_file': utils.c:980: `MAP_FAILED' undeclared (first use this function) Ian Abbott suggested: Try this patch: Which is exactly what's in sysdep.h, which is

Re: segmentation fault on powerpc

2001-09-10 Thread Ian Abbott
On 8 Sep 2001, at 3:49, Nick Ryan wrote: I am getting a seg fault when running wget as: wget -r -l2 -k http://www.caucho.com I have tried this with a pre-compiled binary for the Mandrake 8.0 PPC distribution, a version of the current stable wget 1.7 source and a version of the current

Re: 20010909 bug?

2001-09-12 Thread Ian Abbott
On 12 Sep 2001, at 8:07, NAKAJI Hiroyuki wrote: I found that the timestamp of the files which are downloaded with wget-1.7 are all set to 'Jan 1 1970'. Is this a '20010909 bug'? No. $ wget -N http://www.sophos.com/downloads/ide/metys-l.ide $ ls -l metys-l.ide -rw-rw-r-- 1 nakaji

Re: sigsegv on debian ppc

2001-09-24 Thread Ian Abbott
On 24 Sep 2001, at 12:55, Daniel Saakes wrote: i'm using wget 1.7 on a debian ppc system. When i try to get a long filename (77 characters or more) i get a sigsegv. I don't get this error on a i386 system nor if i route the log to a file. Have a look at this message from the archives

Re: Wget on DG/UX

2001-10-03 Thread Ian Abbott
On 3 Oct 2001, at 10:00, Sebastien Mougey wrote: I've compiled WGET on Data General DG/UX (R4.11), and it works fine, but DG/UX doesn't know the define MAP_FAILED, used in utils.c, defined in sys/mman.h. So it should be replaced by (-1) on this unix... Yes, the current CVS version (a.k.a.

Re: Mulitple-site question

2001-10-04 Thread Ian Abbott
On 3 Oct 2001, at 16:01, CJ Kucera wrote: The closest I've come is (and there's lots of extraneous stuff in there): wget -r -l inf -k -p --wait=1 -H --domains=theonion.com,graphics.theonion.com,www.theonion.com,theonionavclub.com,www.theonionavclub.com http://www.theonion.com The domains

Re: emulate a browser

2001-10-04 Thread Ian Abbott
On 4 Oct 2001, at 11:35, Brian Harvell wrote: I want to emulate a browser by downloading all of the objects on a single page. The -p option is close however it doesn't work completely when there are frames on the page. It downloads the frame but not the images etc on the frame. Anyone

Re: Can't make wget-1.7. Sed wants me to input something.

2001-10-26 Thread Ian Abbott
On 26 Oct 2001, at 19:19, Alexey Aphanasyev wrote: cd doc make CC='gcc' CPPFLAGS='' DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\~/etc/wgetrc\ -DLOCALEDIR=\~/share/locale\' CFLAGS='-g -O2' LDFLAGS='' LIBS='' prefix='~' exec_prefix='~' bindir='~/bin' infodir='~/info' mandir='~/man' manext='1'

Re: Compile problem (and possible fix)

2001-11-08 Thread Ian Abbott
On 7 Nov 2001, at 23:07, Hack Kampbjørn wrote: Ed Powell wrote: I had to change: assert (ch == '\'' || ch == '"'); to: assert (ch == '\'' || ch == '\"'); Otherwise, it would not compile... it was, I think, interpreting the ", rather than using it literally.

Re: wget chokes on http redirects to # anchors

2001-11-16 Thread Ian Abbott
On 15 Nov 2001, at 14:39, Jamie Zawinski wrote: The log says it all; it's treating the # as part of the URL instead of stripping it. It's not just redirects that fail to strip the # part of the URL either. E.g.: $ wget http://www.dnalounge.com/backstage/log/2001/11.html#8-nov-2001

Re: 134% ready

2001-11-20 Thread Ian Abbott
On 17 Nov 2001, at 14:24, Hrvoje Niksic wrote: Ferenc VERES [EMAIL PROTECTED] writes: My download stopped, then next day I continued with -c, from 34%. At the end I saw: 134% downloaded, in end of the lines ;-) (it was FTP:// transfer, a 680MB iso image) GNU Wget 1.5.3 Try it

RE: bug?

2001-11-22 Thread Ian Abbott
On 22 Nov 2001, at 14:49, Tomas Hjelmberg wrote: Thanks! I see, but then, how to exclude from being downloaded per file-basis? Put the following in the /robots.txt on your website User-agent: * Disallow: /tomas.html See http://www.robotstxt.org/wc/exclusion-admin.html for more info.
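Flattened by the archive, the quoted robots.txt fragment is two separate lines in the actual file:

```text
User-agent: *
Disallow: /tomas.html
```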

Re: building wget 1.7 on Darwin 1.4 (Mac OS X 10.1)

2001-09-28 Thread Ian Abbott
On 27 Sep 2001 at 14:45, Kareila wrote: Since wget is no longer included in the standard Mac OS X distribution, I tried to compile my own from the 1.7 sources and got stuck right about here: cc -I. -I.-DHAVE_CONFIG_H - DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -

Re: wget-1.7 bug with -p -k and -nd

2001-09-28 Thread Ian Abbott
On 27 Sep 2001 at 22:23, Mark D. Roth wrote: Now, if you run wget -p -k http://www.foo.com/basepath/foo.html, the IMG tags are left as-is. However, if you add -nd, the URL in the IMG tag gets set to http://www.foo.com/basepath/foo1.gif. For some reason, it's being set to an absolute URL, and

Re: newbie: how to allow wget to overwrite files

2001-09-28 Thread Ian Abbott
On 28 Sep 2001 at 9:48, Hubert Ming wrote: dear wget gurus i'm a little new in this business so excuse my beginner question on my suse-linux (wget 1.5.x) I download every night virus-signature-files. these files always have the same file-names even if they are updated by the

Re: (Fwd) Re: Maximum recursion depth handling bug

2001-11-24 Thread Ian Abbott
On 23 Nov 2001, at 20:05, Hrvoje Niksic wrote: You seem to be maintaining a stack of URLs to download. Maybe it's just a terminology mismatch, but wouldn't it make more sense to keep them in a queue? The links resulting from parsing an HTML file would be transformed into URLs and appended

Re: HAVE_RANDOM ?

2001-11-27 Thread Ian Abbott
On 27 Nov 2001, at 15:16, Hrvoje Niksic wrote: So, does anyone know about the portability of rand()? It's in the ANSI/ISO C spec (ISO 9899). It's always been in UNIX (or at least it's been in there since UNIX 7th Edition), and I should think it's always been in the MS-DOS compilers, but I

wget-1.8-dev Segmentation fault when retrieving from file

2001-11-27 Thread Ian Abbott
I got a segmentation fault when retrieving URLs from a file. 2001-11-27 Ian Abbott [EMAIL PROTECTED] * retr.c (retrieve_from_file): Initialize `new_file' to NULL to prevent seg fault. Index: src/retr.c === RCS

Re: Does the -Q quota command line argument work?

2001-11-27 Thread Ian Abbott
On 27 Nov 2001 at 13:07, John Masinter wrote: It seems that wget will download an entire large file regardless of what I specify for the quota. For example I am trying to download only the first 100K of a 800K file. I specify this: wget -Q 100K http://url-goes-here It then proceeds to

Re: wget1.7.1: Compilation Error (please Cc'ed to me :-)

2001-11-28 Thread Ian Abbott
On 28 Nov 2001 at 18:08, Hrvoje Niksic wrote: Daniel Stenberg [EMAIL PROTECTED] writes: On Wed, 28 Nov 2001, zefiro wrote: ld: Undefined symbol _memmove Do you have any suggestion ? SunOS 4 is known to not have memmove. May I suggest adding the following (or

Re: windows patch and problem

2001-11-29 Thread Ian Abbott
On 29 Nov 2001 at 12:48, Herold Heiko wrote: --12:27:26-- http://www.cnn.com/ (try: 3) = `www.cnn.com/index.html' Found www.cnn.com in host_name_addresses_map (008D01B0) Releasing 008D01B0 (new refcount 1). Retrying. (ecc.) Same with other hosts Could somebody please confirm if

Re: wget1.7.1: Compilation Error (please Cc'ed to me :-)

2001-11-29 Thread Ian Abbott
On 29 Nov 2001 at 13:14, Daniel Stenberg wrote: On Thu, 29 Nov 2001, Maciej W. Rozycki wrote: On Wed, 28 Nov 2001, Ian Abbott wrote: However, the Linux man page for bcopy(3) does not say the strings can overlap Presumably the man page is incorrect Yes, I think so. Well, can we

Re: wget1.7.1: Compilation Error (please Cc'ed to me :-)

2001-11-29 Thread Ian Abbott
On 29 Nov 2001 at 14:40, Hrvoje Niksic wrote: Ian, can you clarify what you meant by BSD man pages? Which BSD? NetBSD: http://www.tac.eu.org/cgi-bin/man-cgi?bcopy+3 OpenBSD: http://www.openbsd.org/cgi-bin/man.cgi?query=bcopysektion=3 FreeBSD:

Re: Recursive retrieval of page-requisites

2001-10-09 Thread Ian Abbott
On 9 Oct 2001, at 10:54, Mikko Kurki-Suonio wrote: I want to download a subtree of HTML documents from a foreign site complete with page-requisites for offline viewing. I.e. all HTML pages from this point downwards, PLUS all the images(etc.) they refer to -- no matter where they are in

Re: Recursive retrieval of page-requisites

2001-10-09 Thread Ian Abbott
On 9 Oct 2001, at 17:01, [EMAIL PROTECTED] wrote: On 09/10/2001 14:25:57 Andre Pang wrote: Try this patch. It should make -p _always_ get pre-requisites, even if you have -np on (which was the reason why i wrote the patch). [snip] Actually, a case can be made for both ways. Sometimes you

Re: xhtml page requisites patch?

2001-10-10 Thread Ian Abbott
On 10 Oct 2001, at 14:44, Joakim Verona wrote: i found a patch to fix the problem that wget doesn't download images specified with xhtml syntax as page requisites, on this list. i tried to apply the patch to wget 1.7.1 pre, but it didn't apply. is this patch applied to the current cvs?

Re: Some html wierdness

2001-10-15 Thread Ian Abbott
On 14 Oct 2001, at 20:06, [EMAIL PROTECTED] wrote: Anyway, in her index file 3 urls have a CR in them for the file name, which causes the website to fail to send back the file to wget because wget is sending the string unfiltered. For example: a href=HW 02_Sol.htmlSolution to HW02

Re: A request to admin

2001-10-15 Thread Ian Abbott
On 15 Oct 2001, at 14:53, Mikko Kurki-Suonio wrote: Could we please block postings by non-subscribers? I'd rather not have the spam, thankyouverymuch. I'm not sure if the human list administrators at sunsite.dk actually read the lists they host - probably not. However they should be

Re: Make -p work with framed pages.

2001-12-03 Thread Ian Abbott
On 1 Dec 2001 at 4:04, Hrvoje Niksic wrote: As a TODO entry summed up: * -p should probably go _two_ more hops on FRAMESET pages. More generally, I think it probably needs to be made to work for nested framesets too.

Re: log errors

2001-12-11 Thread Ian Abbott
On 11 Dec 2001 at 16:09, Hrvoje Niksic wrote: Summer Breeze [EMAIL PROTECTED] writes: Here is a sample entry: 66.28.29.44 - - [08/Dec/2001:18:21:20 -0500] GET /index4.html%0A HTTP/1.0 403 280 - Wget/1.6 /index4.html%0A looks like a page is trying to link to /index4.html, but the

Re: Is wget --timestamping URL working on Windows 2000?

2001-12-12 Thread Ian Abbott
On 11 Dec 2001 at 18:40, [EMAIL PROTECTED] wrote: It seems to me that if an output_document is specified, it is being clobbered at the very beginning (unless always_rest is true). Later in http_loop stat() comes up with zero length. Hence there's always a size mismatch when --output-document

Re: A small bug

2001-12-14 Thread Ian Abbott
On 14 Dec 2001 at 14:49, Peng GUAN wrote: Maybe a bug in file fnmatch.c, line 54: (n == string || (flags & FNM_PATHNAME) && n[-1] == '/')) the n[-1] should be changed to *(n-1). I like the easy ones. Those are equivalent in C. As to which of the two looks nicer is a matter of aesthetics

Wget 1.8+CVS not passing referer for recursive retrieval

2001-12-18 Thread Ian Abbott
have the Referer set to that set by the --referer option or nothing at all, and not necessarily the URL of the referring page. src/ChangeLog entry: 2001-12-18 Ian Abbott [EMAIL PROTECTED] * recur.c (retrieve_tree): Pass on referring URL when retrieving recursed URL. Index: src

Wget 1.8.1-pre2 Problem with -i, -r and -l

2001-12-18 Thread Ian Abbott
I don't have time to look at this problem today, but I thought I'd mention it now to defer the 1.8.1 release. If I have a website http://somesite/ with three files on it: index.html, a.html and b.html, such that index.html links only to a.html and a.html links only to b.html then the following

Re: Wget 1.8.1-pre2 Problem with -i, -r and -l

2001-12-19 Thread Ian Abbott
On 18 Dec 2001 at 23:13, Hrvoje Niksic wrote: Ian Abbott [EMAIL PROTECTED] writes: If I have a website http://somesite/ with three files on it: index.html, a.html and b.html, such that index.html links only to a.html and a.html links only to b.html then the following command

Re: Error while compiling Wget 1.8.1-pre2+cvs.

2001-12-19 Thread Ian Abbott
On 19 Dec 2001 at 17:40, Alexey Aphanasyev wrote: Hrvoje Niksic wrote: The `gnu-md5.o' object is missing. Can you show us the output from `configure'? Yes, sure. Please find it attached below. Have you tried running make distclean before ./configure? It is possible that some of your

Re: [no subject]

2002-01-04 Thread Ian Abbott
On 3 Jan 2002 at 13:58, Henric Blomgren wrote: Wget-bug: GNU Wget 1.8 [...] [root@MAGI .temporary]# wget: progress.c:673: create_image: Assertion `p - bp->buffer <= bp->width' failed. Please use Wget 1.8.1. That bug has already been fixed!

Re: Asseertion failed in wget

2002-01-07 Thread Ian Abbott
On 7 Jan 2002 at 11:52, Jan Starzynski wrote: for GNU Wget 1.8 I get the following assertion failed message: wget: progress.c:673: create_image: Zusicherung »p - bp->buffer <= bp->width« nicht erfüllt. (snip) In the changelogs of 1.8.1 I could not find a hint that this has been fixed until

Re: wget does not treat urls starting with // correctly

2002-01-07 Thread Ian Abbott
/ChangeLog entry: 2002-01-07 Ian Abbott [EMAIL PROTECTED] * url.c (uri_merge_1): Deal with net path relative URL (one that starts with //). And the actual patch: Index: src/url.c === RCS file: /pack/anoncvs/wget/src/url.c

Re: Simplest logfile ?

2002-01-08 Thread Ian Abbott
On 8 Jan 2002 at 20:31, Mike wrote: What I'm looking for is something like the way FTP_Lite operates, Can I nominate a single log file in the wgetrc for use by all the wget processes that spawn off from my bash ? There is the -a FILE (--append-output=FILE) option to append to a logfile. A

Re: 2 Gb limitation

2002-01-11 Thread Ian Abbott
On 10 Jan 2002 at 17:09, Matt Butt wrote: I've just tried to download a 3Gb+ file (over a network using HTTP) with WGet and it died at exactly 2Gb. Can this limitation be removed? In principle, changes could be made to allow wget to be configured for large file support, by using the

Re: Using -pk, getting wrong behavior for frameset pages...Suggestions?

2002-01-11 Thread Ian Abbott
On 11 Jan 2002 at 10:51, Picot Chappell wrote: Thanks for your response. I tried the same command, using your URL, and it worked fine. So I took a look at the site I was retrieving for the failed test. It's a ssl site (didn't think about it before) and I noticed 2 things. The Frame

Re: Passwords and cookies

2002-01-15 Thread Ian Abbott
On 15 Jan 2002 at 0:27, Hrvoje Niksic wrote: Brent Morgan [EMAIL PROTECTED] writes: The -d debug option crashes wget just after it reads the input file. Huh? Ouch! Wget on Windows is much less stable than I imagined. Can you run it under a debugger and see what causes the crash? I

Mapping URLs to filenames

2002-01-15 Thread Ian Abbott
This is an initial proposal for naming the files and directories that Wget creates, based on the URLs of the retrieved documents. At the moment there are many complaints about Wget failing to save documents which have '?' in their URLs when running under Windows, for example. In general, the set

RE: Mapping URLs to filenames

2002-01-16 Thread Ian Abbott
On 16 Jan 2002 at 8:02, David Robinson (AU) wrote: In the meantime, however, '?' is problematic for Win32 users. It stops WGET from working properly whenever it is found within a URL. Can we fix it please. My proposal for using escape sequences in filenames for problem characters is up for

Re: Passwords and cookies

2002-01-16 Thread Ian Abbott
On 15 Jan 2002 at 14:48, Brent Morgan wrote: Thanks to everyone for looking at this problem. I am not a developer and at my wits end with this problem. I did determine with a different cookie required site that it is still not working. Could you change line 1017 of cmpt.c to read as

A strange bit of HTML

2002-01-16 Thread Ian Abbott
I came across this extract from a table on a website: td ALIGN=CENTER VALIGN=CENTER WIDTH=120 HEIGHT=120a href=66B27885.htm msover1('Pic1','thumbnails/MO66B27885.jpg'); onMouseOut=msout1('Pic1','thumbnails/66B27885.jpg');img SRC=thumbnails/66B27885.jpg NAME=Pic1 BORDER=0 /a/td Note the string

Re: Passwords and cookies

2002-01-17 Thread Ian Abbott
On 16 Jan 2002 at 17:50, Hrvoje Niksic wrote: Wget's strptime implementation comes from an older version of glibc. Perhaps we should simply sync it with the latest one from glibc, which is obviously capable of handling it? That sounds like a good plan.

Re: Passwords and cookies

2002-01-17 Thread Ian Abbott
On 16 Jan 2002 at 17:45, Hrvoje Niksic wrote: Aside from google, ~0UL is Wget's default value for the expiry time, meaning the cookie is non-permanent and valid throughout the session. Since Wget sets the value, Wget should be able to print it in DEBUG mode. Do you think this patch would

Re: Passwords and cookies

2002-01-18 Thread Ian Abbott
On 17 Jan 2002 at 18:17, Hrvoje Niksic wrote: Ian Abbott [EMAIL PROTECTED] writes: I'm also a little worried about the (time_t *)cookie-expiry_time cast, as cookie-expiry time is of type unsigned long. Is a time_t guaranteed to be the same size as an unsigned long? It's not, but I have

Re: Bug report: 1) Small error 2) Improvement to Manual

2002-01-21 Thread Ian Abbott
On 17 Jan 2002 at 2:15, Hrvoje Niksic wrote: Michael Jennings [EMAIL PROTECTED] writes: WGet returns an error message when the .wgetrc file is terminated with an MS-DOS end-of-file mark (Control-Z). MS-DOS is the command-line language for all versions of Windows, so ignoring the

Re: Bug report: 1) Small error 2) Improvement to Manual

2002-01-21 Thread Ian Abbott
On 21 Jan 2002 at 14:56, Thomas Lussnig wrote: Why not just open the wgetrc file in text mode using fopen(name, "r") instead of "rb"? Does that introduce other problems? I think it has to do with comments because the definition is that starting with '#' the rest of the line is ignored. And

Re: Downloading all files by http:

2002-01-31 Thread Ian Abbott
On 31 Jan 2002 at 9:25, Fred Holmes wrote: wget -N http://www.karenware.com/progs/*.* fails with a not found whether the filespec is * or *.* The * syntax works just fine with ftp Is there a syntax that will get all files with http? You could try wget -m -l 1 -n

Re: timestamping content-length --ignore-length

2002-01-31 Thread Ian Abbott
On 31 Jan 2002 at 8:41, Bruce BrackBill wrote: The problem is, that my web pages are served up by php and the content lengh is not defined. So as the manual states I use --ignore-length. But when wget retrieves an image it slows right down, possibly because it is ignoring the

Re: timestamping content-length --ignore-length

2002-01-31 Thread Ian Abbott
On 31 Jan 2002 at 9:48, Bruce BrackBill wrote: Thanks for your response Ian. When I use it without the --ignore-length option it appears that wget SOMETIMES ignores the last_modified_date OR wget says to itself ( hey, I see the file is older than the local copy, but hey, since the server isn't

HTTP/1.1 (was Re: timestamping content-length --ignore-length)

2002-02-01 Thread Ian Abbott
On 1 Feb 2002 at 8:17, Daniel Stenberg wrote: You may count this mail as advocating for HTTP 1.1 support, yes! ;-) I did write down some minimal requirements for HTTP/1.1 support on a scrap of paper recently. It's probably still buried under the more recent strata of crap on my desk somewhere!

Re: @ sign in username

2002-02-04 Thread Ian Abbott
On 4 Feb 2002 at 15:21, Christian Busch wrote: Hello, i have a question. On a ftp-site that we need to mirror, our login is wget -cm ftp://christian.busch%40brainjunction.de:**xx**@esd.intraware.com/ as you see I tried to encode the @ as %40 as described in the manual. This does

Re: KB or kB

2002-02-08 Thread Ian Abbott
On 8 Feb 2002 at 4:26, Fred Holmes wrote: At 02:54 AM 2/8/2002, Hrvoje Niksic wrote: Wget currently uses KB as abbreviation for kilobyte. In a Debian bug report someone suggested that kB should be used because it is more correct. The reporter however failed to cite the reference for this,

Re: wget 1.8.x proxies

2002-02-12 Thread Ian Abbott
On 12 Feb 2002 at 12:30, Holger Pfaff wrote: I'm having trouble using wget 1.8.[01] over a (squid24-) proxy to mirror a ftp-directory: # setenv ftp_proxy http://139.21.68.25: # wget181 -r -np -l0 ftp://ftp.funet.fi/pub/Linux/mirrors/redhat/redhat/linux/updates --12:06:58--

Re: wget 1.8.x proxies

2002-02-12 Thread Ian Abbott
On 12 Feb 2002 at 7:54, Winston Smith wrote: # wget181 -r -np -l0 ftp://ftp.funet.fi/pub/Linux/mirrors/redhat/redhat/linux/updates ummm... looks like the -l0 might be limiting your recursion level to 0 levels No. '-l0' is the same as '-l inf'.

Re: wget crash

2002-02-14 Thread Ian Abbott
On 14 Feb 2002 at 10:41, Steven Enderle wrote: assertion percentage <= 100 failed: file progress.c, line 552 zsh: abort (core dumped) wget -m -c --tries=0 ftp://ftp.scene.org/pub/music/artists/nutcase/mp3/timeofourlives.mp3 hope this helps in any way. Thanks for the report. That's a

Re: wget crash

2002-02-15 Thread Ian Abbott
On 14 Feb 2002 at 16:02, Steven Enderle wrote: Sorry for not including any version information. This is version 1.8.1, which I am using. Sorry for not reading your bug report properly. I should have realised that this was a different bug to the hundreds (it seems!) of other reports about

Re: wget bug?!

2002-02-18 Thread Ian Abbott
[The message I'm replying to was sent to [EMAIL PROTECTED]. I'm continuing the thread on [EMAIL PROTECTED] as there is no bug and I'm turning it into a discussion about features.] On 18 Feb 2002 at 15:14, TD - Sales International Holland B.V. wrote: I've tried -w 30 --waitretry=30 --wait=30

Re: wget info page

2002-02-20 Thread Ian Abbott
On 20 Feb 2002 at 12:54, Noel Koethe wrote: wget 1.8.1 is shipped with the files in doc/ wget.info wget.info-1 wget.info-2 wget.info-3 wget.info-4 They are build out of wget.texi if I remove them and makeinfo is installed. The files are removed when runing make realclean. I think

No clobber and .shtml files

2002-02-20 Thread Ian Abbott
Here is a patch for a potential feature change. I'm not sending it to the wget-patches list yet, as I'm not sure if it should be applied as is, or at all. The feature change is a minor amendment to the (bogus) test for whether an existing local copy of a file is text/html when the or not when

Re: retr.c:253: calc_rate: Assertion `msecs >= 0' failed.

2002-03-06 Thread Ian Abbott
On 6 Mar 2002 at 12:43, Mats Palmgren wrote: I have a cron job that downloads Mozilla every night using wget. Last night I got: wget: retr.c:253: calc_rate: Assertion `msecs >= 0' failed. I think this can happen if the system time is reset backwards while wget is downloading stuff.

Re: reading HTML input-files (WITH ATTACHMNT!)

2002-03-07 Thread Ian Abbott
On 7 Mar 2002 at 17:50, Mathias Kratzer wrote: While calling Wget 1.5.2 by wget -F -O 69_4_522_Ref.res -i 69_4_522_Ref.mrq on the attached file 69_4_522_Ref.mrq has worked very well I am left with the error message No URLs found in 69_4_522_Ref.mrq whenever I try the same

(Fwd) Proposed new --unfollowed-links option for wget

2002-03-08 Thread Ian Abbott
This seems more appropriate for the main Wget list. The wget-patches list is for patches! --- Forwarded message follows --- From: Tony Lewis [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject:Proposed new --unfollowed-links option for

(Fwd) Processing of JavaScript

2002-03-08 Thread Ian Abbott
--- Forwarded message follows --- From: Tony Lewis [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject:Processing of JavaScript Date sent: Fri, 8 Mar 2002 00:04:43 -0800 Some web sites include URL references within

(Fwd) Automatic posting to forms

2002-03-08 Thread Ian Abbott
--- Forwarded message follows --- From: Tony Lewis [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject:Automatic posting to forms Date sent: Thu, 7 Mar 2002 23:43:28 -0800 As promised in my earlier note, there is a second

Re: reading HTML input-files (WITH ATTACHMNT!)

2002-03-08 Thread Ian Abbott
On 8 Mar 2002 at 10:50, Mathias Kratzer wrote: I admit that the lines in my original file contain a really stupid syntax error. As an absolute beginner with the Markup Languages I have just tried to learn from some hyperlink examples but obviously misunderstood their formal

Re: wget1.8.1's patches for using the free Borland C++Builder compile r

2002-03-18 Thread Ian Abbott
On 12 Mar 2002 at 3:18, sr111 wrote: I have to modify some files in order to build win32 port of wget using the free Borland C++Builder compiler. Please refer to the attachment file for the details. I've modified Chin-yuan Kuo's patch for the current CVS. It builds fine with the

Re: Wget and Symantec Web Security

2002-03-21 Thread Ian Abbott
On 19 Mar 2002 at 22:53, Löfstrand Thomas wrote: I use wget to get files from a FTP server. The proxy server is Symantecs web security 2.0 product for solaris which has a antivirus function. I have used wget with -d option to see what is going on, and it seems like the proxyserver returns

Re: OK, time to moderate this list

2002-03-22 Thread Ian Abbott
On 22 Mar 2002 at 4:08, Hrvoje Niksic wrote: The suggestion of having more than one admin is good, as long as there are people who volunteer to do it besides me. I'd volunteer too, but don't want to be the only person moderating the lists for the same reasons as yourself. (I'm also completely

Re: wget parsing JavaScript

2002-03-26 Thread Ian Abbott
On 26 Mar 2002 at 7:05, Tony Lewis wrote: Csaba Ráduly wrote: I see that wget handles SCRIPT with tag_find_urls, i.e. it tries to parse whatever it's inside. Why was this implemented ? JavaScript is most used to construct links programmatically. wget is likely to find bogus URLs

Re: spanning hosts: 2 Problems

2002-03-26 Thread Ian Abbott
On 26 Mar 2002 at 19:01, Jens Rösner wrote: I am using wget to parse a local html file which has numerous links into the www. Now, I only want hosts that include certain strings like -H -Daudi,vw,online.de It's probably worth noting that the comparisons between the -D strings and the

Re: wget parsing JavaScript

2002-03-27 Thread Ian Abbott
On 26 Mar 2002 at 19:33, Tony Lewis wrote: I wrote: wget is parsing the attributes within the script tag, i.e., script src=url. It does not examine the content between script and /script. and Ian Abbott responded: I think it does, actually, but that is mostly harmless. You're
