Bug report: wget with multiple CNAMEs in SSL certificate
Hi,

If I connect with wget 1.10.2 (Debian Etch / Ubuntu Feisty Fawn) to a secure host whose certificate carries multiple CNAMEs, I get the following error:

[EMAIL PROTECTED]:~$ wget https://host.domain.tld
--10:18:55--  https://host.domain.tld/
           => `index.html'
Resolving host.domain.tld... xxx.xxx.xxx.xxx
Connecting to host.domain.tld|xxx.xxx.xxx.xxx|:443... connected.
ERROR: certificate common name `host0.domain.tld' doesn't match requested host name `host.domain.tld'.
To connect to host.domain.tld insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.

If I do the same with wget 1.9.1 (Debian Sarge), I do not get that error.

Kind regards,
Alex Antener

--
Alex Antener
Dipl. Medienkuenstler FH
[EMAIL PROTECTED] // http://lix.cc // +41 (0)44 586 97 63
GPG Key: 1024D/14D3C7A1 https://lix.cc/gpg_key.php
Fingerprint: BAB6 E61B 17D7 A9C9 6313 5141 3A3C DAA3 14D3 C7A1
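For context: wget 1.10 verifies the server certificate and compares the requested host name against the certificate's commonName only, so a certificate whose additional names live in the subjectAltName extension is rejected, while wget 1.9.1 (which did not check certificate names) accepted it silently. Below is a hedged sketch, not wget's actual code, of what a SAN-aware check looks like with the OpenSSL API:

    #include <string.h>
    #include <strings.h>
    #include <openssl/x509v3.h>

    /* Return 1 if HOST appears among the certificate's dNSName
       subjectAltName entries.  Exact match only -- no wildcard or
       embedded-NUL handling in this sketch. */
    static int
    cert_san_matches (X509 *cert, const char *host)
    {
      int i, matched = 0;
      GENERAL_NAMES *sans = (GENERAL_NAMES *)
        X509_get_ext_d2i (cert, NID_subject_alt_name, NULL, NULL);

      if (!sans)
        return 0;                   /* no SAN extension present */
      for (i = 0; i < sk_GENERAL_NAME_num (sans); i++)
        {
          GENERAL_NAME *name = sk_GENERAL_NAME_value (sans, i);
          if (name->type == GEN_DNS)
            {
              const char *dns =
                (const char *) ASN1_STRING_data (name->d.dNSName);
              if (strcasecmp (dns, host) == 0)
                {
                  matched = 1;
                  break;
                }
            }
        }
      GENERAL_NAMES_free (sans);
      return matched;
    }

Per RFC 2818 a client should prefer the subjectAltName dNSName entries and fall back to the commonName comparison only when no SAN extension is present, so a caller would try cert_san_matches() first and keep the existing CN check as the fallback.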
Bug report: backup files missing when using wget -K
Hi,

When calling wget -k -K, the backup files (.orig) are missing. In one case (LOG.Linux.short) one backup file is missing (two files were converted). In another case (LOG.IRIX64.short) all backup files are missing. This is also true when using recursive retrieval (LOG.IRIX64.recursive.short). See the attached files for details. The script calling wget is WGET. There was no .wgetrc file.

You probably know the bug described at
http://www.mail-archive.com/wget@sunsite.dk/msg07686.html
Remove the two ./CLEAN commands in the script to test recursive re-download with it. I cannot reproduce that bug, since the backup files are missing.

Kind regards,
Ken

LOG.Linux.short:
DEBUG output created by Wget 1.8.2 on linux.
Linux linux 2.4.21-99-default #1 Wed Sep 24 13:30:51 UTC 2003 i686 i686 i386 GNU/Linux
Converted 2 files in 0.07 seconds.
Backup files: 1

LOG.IRIX64.short:
DEBUG output created by Wget 1.10.1 on irix6.5.
IRIX64 Komma 6.5 07010238 IP27
Converted 2 files in 0.204 seconds.
Backup files: 0

LOG.IRIX64.recursive.short:
DEBUG output created by Wget 1.10.1 on irix6.5.
IRIX64 Komma 6.5 07010238 IP27
Converted 55 files in 7.616 seconds.
Backup files: 0

Attachment: wget-bug.tar.bz2
Bug report
Hi,

I don't really know if this is a Wget bug or some problem with my web site, but either way, maybe you can help.

I have a web site (www.BuildItSolar.com) with perhaps a few hundred pages (260 MB of storage total). Someone ran Wget on my site and managed to log 111,000 hits and 58,000 page views (using more than a GB of bandwidth). I am wondering how this can happen, since the number of page views is about 200 times the number of pages on my site. Is there something I can do to prevent this? Is there something about the organization of my web site that causes Wget to get stuck in a loop? I've never used Wget, but I am guessing that this person really did not want 50,000+ pages -- do you provide some way for Wget to shut itself down when it reaches some reasonable limit?

My web site is non-commercial and provides a lot of information that people find useful in building renewable energy projects. It generates zero income, and I can't really afford to have a lot of people come in and burn up GBs of bandwidth to no useful end. Help!

Gary Reysa
Bozeman, MT
[EMAIL PROTECTED]
Re: Bug report
Gary Reysa wrote:
> Hi, I don't really know if this is a Wget bug, or some problem with
> my website, but, either way, maybe you can help. [...]

Hello Gary,

From a quick look at your site, it appears to be mainly static HTML, which would not generate a lot of extra crawls. If you have some dynamic portion on your site, like a calendar, that could make wget go into an infinite loop. It would be much easier to tell if you could look at the server logs that show which pages were requested; they would easily tell you what wget was getting hung up on.

One problem I did notice is that your site is generating soft 404s. In other words, it sends back an HTTP 200 response when it should send back a 404. So if wget tries to access http://www.builditsolar.com/blah, your web server tells wget that the page actually exists. This *could* cause more crawls than necessary, but that's not likely; the problem should be fixed regardless.

It's possible the wget user did not know what they were doing and ran the crawler several times. You could block traffic from that particular IP address, or create a robots.txt file that tells crawlers to stay away from your site, or just from certain pages. Wget respects robots.txt. For more info: http://www.robotstxt.org/wc/robots.html

Regards,
Frank
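A minimal robots.txt along the lines Frank suggests, served from the site root (the paths are illustrative):

    # Keep all well-behaved crawlers away from the whole site:
    User-agent: *
    Disallow: /

    # ...or fence off only selected areas:
    # User-agent: *
    # Disallow: /calendar/

wget checks for /robots.txt during recursive retrievals and obeys it by default; a user has to opt out explicitly (e.g. with -e robots=off) to ignore it.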
wget bug report
Sorry for the crosspost, but the wget web site is a little confusing on the point of where to send bug reports/patches.

I just installed wget 1.10 on Friday. Over the weekend, my scripts failed with the following error (once for each wget run):

Assertion failed: wget_cookie_jar != NULL, file http.c, line 1723
Abort - core dumped

All of my command lines are similar to this:

/home/programs/bin/wget -q --no-cache --no-cookies -O /home/programs/etc/alte_seiten/xsr.html 'http://www.enterasys.com/download/download.cgi?lib=XSR'

After taking a look at it, I implemented the following change to http.c and tried again. It works for me, but I don't know what other implications my change might have.

--- http.c.orig Mon Jun 13 08:04:23 2005
+++ http.c      Mon Jun 13 08:06:59 2005
@@ -1715,6 +1715,7 @@
   hs->remote_time = resp_header_strdup (resp, "Last-Modified");

   /* Handle (possibly multiple instances of) the Set-Cookie header. */
+  if (opt.cookies)
   {
     char *pth = NULL;
     int scpos;

Kind regards,
MVV Energie AG, Abteilung AI.C
Andrew Jones
Telefon: +49 621 290-3645, Fax: +49 621 290-2677
E-Mail: [EMAIL PROTECTED], Internet: www.mvv.de
MVV Energie · Luisenring 49 · 68159 Mannheim
Bug report: two spaces between file size and month
Hello!

I just found a quirk in an embedded system (no source available) with an FTP server: in its listings there are two spaces between the file size and the month. As a consequence, wget always thinks the size is 0, because the procedure ftp_parse_unix_ls steps back exactly one blank before cur.size is calculated. My quick hack is just to add one more pointer and an atoi(), but maybe a nicer solution can be found.

Case from .listing:

-rw-rw-rw- 0 0 0  68065  Apr 16 08:00 A20040416.0745
-rw-rw-rw- 0 0 0    781  Apr 20 07:45 A20040420.0730
-rw-rw-rw- 0 0 0  59606  Apr 16 08:15 A20040416.0800
-rw-rw-rw- 0 0 0    781  Apr 23 12:15 A20040423.1200
-rw-rw-rw- 0 0 0   2130  Feb  3 12:00 A20040203.1145
-rw-rw-rw- 0 0 0  33440  Apr 14 12:15 A20040414.1200

BR,
Iztok
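Iztok's fix, sketched as a self-contained helper (names are hypothetical; wget's real ftp_parse_unix_ls uses its own tokenizer): instead of stepping back exactly one blank from the month field, skip any run of blanks and then back up over the size digits, so a two-space gap parses the same as a one-space gap.

    #include <ctype.h>
    #include <stdlib.h>

    /* Given the start of the line and a pointer to the month field,
       walk back over any number of blanks, then over the digits of
       the size field, and parse the size. */
    static long
    parse_size_before (const char *line, const char *month_start)
    {
      const char *p = month_start;
      while (p > line && isspace ((unsigned char) p[-1]))
        --p;                     /* tolerate one, two, or more blanks */
      while (p > line && isdigit ((unsigned char) p[-1]))
        --p;                     /* back up over the size digits */
      return strtol (p, NULL, 10);
    }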
wget bug report
I sent this message to [EMAIL PROTECTED] as directed in the wget man page, but it bounced and said to try this email address.

This bug report is for GNU Wget 1.8.2, tested on both Red Hat Linux 7.3 and 9:

rpm -q wget
wget-1.8.2-9

When I use wget with -S to show the HTTP headers, and I use the spider switch as well, it gives me a 501 error on some servers. The main example I have found was running it against a server running ntop (http://www.ntop.org/). You can find an RPM for it at:
http://rpm.pbone.net/index.php3/stat/4/idpl/586625/com/ntop-2.2-0.dag.rh90.i386.rpm.html
You can search with other parameters at rpm.pbone.net to get ntop for other versions of Linux.

So here is the command and output:

wget -S --spider http://SERVER_WITH_NTOP:3000
HTTP request sent, awaiting response...
 1 HTTP/1.0 501 Not Implemented
 2 Date: Sat, 27 Mar 2004 07:08:24 GMT
 3 Cache-Control: no-cache
 4 Expires: 0
 5 Connection: close
 6 Server: ntop/2.2 (Dag Apt RPM Repository) (i686-pc-linux-gnu)
 7 Content-Type: text/html
21:11:56 ERROR 501: Not Implemented.

I get a 501 error, and echoing $? shows an exit status of 1. When I don't use the spider switch, I get the following:

wget -S http://SERVER_WITH_NTOP:3000
HTTP request sent, awaiting response...
 1 HTTP/1.0 200 OK
 2 Date: Sat, 27 Mar 2004 07:09:31 GMT
 3 Cache-Control: max-age=3600, must-revalidate, public
 4 Connection: close
 5 Server: ntop/2.2 (Dag Apt RPM Repository) (i686-pc-linux-gnu)
 6 Content-Type: text/html
 7 Last-Modified: Mon, 17 Mar 2003 20:27:49 GMT
 8 Accept-Ranges: bytes
 9 Content-Length: 1214
100%[==>] 1,214 1.16M/s ETA 00:00
21:13:04 (1.16 MB/s) - `index.html' saved [1214/1214]

The exit status was 0 and the index.html file was downloaded.

If this is a bug, please fix it in your next release of wget. If it is not a bug, I would appreciate a brief explanation as to why.

Thank you,
Corey Henderson
Chief Programmer, GlobalHost.com
Bug report
Hello. This is a report on some wget bugs. My wgetdir command looks like the following (wget 1.9.1):

wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@

Bugs:

Command: wgetdir http://www.directfb.org
Problem: In the file www.directfb.org/index.html, hrefs of the type /screenshots/index.xml were not converted to relative with the -k option.

Command: wgetdir http://threedom.sourceforge.net
Problem: In the file threedom.sourceforge.net/index.html, the hrefs were not converted to relative with the -k option.

Command: wgetdir http://liarliar.sourceforge.net
Problem: Files are named content.php?content.2, content.php?content.3, content.php?content.4, which are interpreted, e.g., by Nautilus as manual pages and displayed as plain text. Could the files, and the links to them, be renamed as follows?

content.php?content.2.html
content.php?content.3.html
content.php?content.4.html

After all, are those pages still PHP files or generated HTML files? If they are HTML files produced by the PHP files, then it could be a good idea to add a new extension to the files.

Command: wgetdir http://www.newtek.com/products/lightwave/developer/lscript2.6/index.html
Problem: Images are not downloaded. Perhaps because the image links look like the following:

<image src="v26_2.jpg">

Regards,
Juhana
Re: Bug report
Juhana Sadeharju [EMAIL PROTECTED] writes:

> Command: wgetdir http://liarliar.sourceforge.net
> Problem: Files are named content.php?content.2, content.php?content.3,
> content.php?content.4, which are interpreted, e.g., by Nautilus as
> manual pages and displayed as plain text. Could the files and the
> links to them be renamed content.php?content.2.html etc.?

Use the option `--html-extension' (-E).

> After all, are those pages still PHP files or generated HTML files?
> If they are HTML files produced by the PHP files, then it could be a
> good idea to add a new extension to the files.

They're the latter -- HTML files produced by the server-side PHP code.

> Command: wgetdir http://www.newtek.com/products/lightwave/developer/lscript2.6/index.html
> Problem: Images are not downloaded. Perhaps because the image links
> are of the form <image src="v26_2.jpg">.

I've never seen this tag, but it seems to be the same as IMG. Mozilla seems to grok it, and its DOM inspector thinks it has seen IMG. Is this tag documented anywhere? Does IE understand it too?
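For instance, adding -E to Juhana's wgetdir options (invocation illustrative) makes wget append .html to every file served with a text/html content type:

    wget -r -k -E -np http://liarliar.sourceforge.net/

With that, content.php?content.2 is saved as content.php?content.2.html, and the -k link conversion points at the renamed files, so browsers and file managers recognize them as HTML.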
bug report
Hi again,

I found something that can be called a bug. The command line and the (shortened) output:

$ wget -k www.seznam.cz
--14:14:28--  http://www.seznam.cz/
           => `index.html'
Resolving www.seznam.cz... done.
Connecting to www.seznam.cz[212.80.76.18]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 19,975 3.17M/s
14:14:28 (3.17 MB/s) - `index.html' saved [19975]
Converting index.html... 5-123
Converted 1 files in 0.01 seconds.

That is, the newly created file really is link-converted. Now I run:

$ wget -k -O myfile www.seznam.cz
--14:16:07--  http://www.seznam.cz/
           => `myfile'
Resolving www.seznam.cz... done.
Connecting to www.seznam.cz[212.80.76.3]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 19,980 3.18M/s
14:16:07 (3.18 MB/s) - `myfile' saved [19980]
index.html.1: No such file or directory
Converting index.html.1... nothing to do.
Converted 1 files in 0.00 seconds.

Now myfile is created, and then wget tries to convert index.html.1, i.e. the file it normally *would* have created had there been no -O option. When I want the content sent to stdout (-O -), this postponed converting function again runs on index.html.1. That is totally wrong: all the content has already been sent to stdout, so my content ends up not link-converted. Worse, isn't there a possibility that wget could inadvertently garble files on disk that it has nothing to do with?

Vlada Macek
Bug report: 302 server response forces host spanning even without -H
If wget receives a 302 (Moved Temporarily) redirection to *another site*, that site is crawled!

wget -r http://original/index.html
Server replies: 302, Location: http://redirect/index.html
wget goes and downloads from redirect.

I also tried adding the -D flag, but it doesn't help:

wget -r -Doriginal -nh http://original/

wget still crawls the redirect site. And by the way, multiple dependency files are downloaded from the redirect site as well -- so I think this is a major bug.
bug report
1/ (serious) #include <config.h> needs to be replaced by #include "config.h" in several source files. The same applies to strings.h.

2/ #ifdef WINDOWS should be replaced by #ifdef _WIN32.

With these two changes it is even possible to compile wget with MSVC[++] and Intel C[++]. :-)

Jirka
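Side by side, the two changes Jirka describes (a sketch; the surrounding code is illustrative):

    /* 1/ The quoted form makes the preprocessor search the source
       directory first, which is what MSVC and Intel C need in order
       to find the in-tree headers: */
    #include "config.h"     /* was: #include <config.h> */

    /* 2/ _WIN32 is predefined by all Win32 compilers, whereas WINDOWS
       has to be defined explicitly by the build files: */
    #ifdef _WIN32           /* was: #ifdef WINDOWS */
    /* Windows-specific code here */
    #endif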
bug report about running wget in BSDI 3.1
Hello,

I've downloaded wget-1.5.3 from http://ftp.gnu.org/gnu/wget onto our BSDI version 3.1 OS and used the following commands:

% gunzip wget-1.5.3.tar.gz
% tar -xvf wget-1.5.3.tar
% cd wget-1.5.3
% ./configure
% ./make -f Makefile
% ./make install

But the following error message was displayed when I ran

% ./src/wget http://www.osdpd.noaa.gov/COB/poltbus.asc

--12:53:33-- http://www.osdpd.noaa.gov:80/COB/poltbus.asc
           => `poltbus.asc'
Connecting to www.osdpd.noaa.gov:80... www.osdpd.noaa.gov: Host not found.

Could you please give me your advice about this error message? Thank you very much.

I.P.S. Julian
Bug report / feature request
Hi!

Wget 1.5.3 uses /robots.txt to skip some parts of a web site, but it doesn't use the <META NAME="ROBOTS" CONTENT="NOFOLLOW"> tag, which serves the same purpose. I believe that Wget should also parse and obey <META NAME="ROBOTS" ...> tags.

WBR,
Stas
mailto:[EMAIL PROTECTED]
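For reference, the tag in question sits in a page's head (markup illustrative); a crawler that honours it still fetches the page itself but follows none of the links found in it:

    <html>
      <head>
        <meta name="robots" content="nofollow">
      </head>
      <body>
        <a href="private/">this link would not be followed</a>
      </body>
    </html>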
Re: bug report and patch, HTTPS recursive get
In message "Re: bug report and patch, HTTPS recursive get", Ian Abbott wrote:

> Thanks again for the bug report and the proposed patch. I thought
> some of the scheme tests in recur.c were getting messy, so I propose
> the following patch, which uses a function to check for similar
> schemes.

Thanks for the rewrite. Your patch solves the problem. Thank you!

---
Doumae Kiyotaka
Internet Initiative Japan Inc.
Technical Planning Division
Re: bug report and patch, HTTPS recursive get
On Wed, 15 May 2002 18:44:19 +0900, Kiyotaka Doumae [EMAIL PROTECTED] wrote:

> I found a bug in wget with HTTPS recursive get, and propose a patch.

Thanks for the bug report and the proposed patch. The current scheme-comparison checks are getting messy, so I'll write a function to check schemes for similarity (when I can spare the time later today).
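A sketch of what such a similarity check can look like (the enum is condensed here for self-containment; wget's actual helper may differ in detail): http and https count as the same scheme when deciding whether a recursive fetch stays in scope, which is exactly what the HTTPS recursive-get case needs.

    enum url_scheme { SCHEME_HTTP, SCHEME_HTTPS, SCHEME_FTP, SCHEME_INVALID };

    /* Are two schemes "similar enough" that recursion may cross from
       one to the other?  http <-> https is the interesting pair. */
    static int
    schemes_are_similar_p (enum url_scheme a, enum url_scheme b)
    {
      if (a == b)
        return 1;
      if ((a == SCHEME_HTTP && b == SCHEME_HTTPS)
          || (a == SCHEME_HTTPS && b == SCHEME_HTTP))
        return 1;
      return 0;
    }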
Re: Bug report
On Fri, 3 May 2002 18:37:22 +0200, Emmanuel Jeandel [EMAIL PROTECTED] wrote:

> ejeandel@yoknapatawpha:~$ wget -r a:b
> Segmentation fault

Patient: Doctor, it hurts when I do this.
Doctor: Well, don't do that then!

Seriously, this is already fixed in CVS.
Bug report
ejeandel@yoknapatawpha:~$ wget -r a:b
Segmentation fault
ejeandel@yoknapatawpha:~$

I encountered this bug when I wanted to do wget ftp://a:b@c/ and forgot the ftp://. The bug is not present without -r (a:b: Unsupported scheme).

Emmanuel
GNU wget 1.8.1 - Bug report: excessive memory use
Hello specialists,

I used wget 1.8.1 on my system to mirror the site www.europa.eu.int. The transfer ran through a proxy, over DSL, overnight. After about 12-13 hours I found the following situation: about 1.8 GB of data downloaded in total, and the wget process had grown to occupy approximately 75 MB of RAM! This growth was fatal for the system, because the Intel 486 machine only has 32 MB of RAM. The download rate dropped dramatically, but the system was still running; I killed the process with Ctrl-C.

Everything seemed to be OK at first. But after examining the data I found that the re-linking was not right everywhere. Files were placed in the right directories, but the links rewritten inside the pages were often wrong. They should read:

http://myroot/europa.eu.int/individual_dir

but instead read:

http://europa.eu.int/individual_dir

The problem seems to be that wget drops the part of the directory path leading to the downloaded one. My calling conditions for wget were:

wget -m http://europa.eu.int/

All parameters were left at their defaults, unchanged except for the address of the proxy I had to use. wget was compiled with standard features under a Linux 2.4.13 kernel.

Questions: Did I make a configuration mistake? If not, can you correct the re-linking? How can I keep wget from using so much RAM? And is there a chance to correct the wrong links? (Not by hand -- there are thousands.)

Kind regards,
Dipl. Ing. Hermann Rugen
Rugen Consulting
Max-Planck-Straße 7, 49767 Twist
Tel.: 05931 4099 151, Fax: 05931 4099 152
eMail: [EMAIL PROTECTED], Internet: www.rugen-consulting.com
bug report
I found a serious bug in wget; all versions are affected.

Description: it is highly addictive.
Solution: you should include a warning about this somewhere in the product. :)

a windows user
Re: Bug report: 1) Small error 2) Improvement to Manual
On 17 Jan 2002 at 2:15, Hrvoje Niksic wrote:

> Michael Jennings [EMAIL PROTECTED] writes:
> > WGet returns an error message when the .wgetrc file is terminated
> > with an MS-DOS end-of-file mark (Control-Z). MS-DOS is the
> > command-line language for all versions of Windows, so ignoring the
> > end-of-file mark would make sense.
>
> Ouch, I never thought of that. Wget opens files in binary mode and
> handles the line termination manually -- but I never thought to
> handle ^Z.

Why not just open the wgetrc file in text mode, using fopen(name, "r") instead of "rb"? Does that introduce other problems? In the Windows C compilers I've tried (Microsoft and Borland ones), "r" causes the file to be opened in text mode by default (there are ways to override that at compile time and/or run time), and this causes the ^Z to be treated as an EOF (there might be ways to override that too).
Re: Bug report: 1) Small error 2) Improvement to Manual
> > WGet returns an error message when the .wgetrc file is terminated
> > with an MS-DOS end-of-file mark (Control-Z). MS-DOS is the
> > command-line language for all versions of Windows, so ignoring the
> > end-of-file mark would make sense.
>
> Why not just open the wgetrc file in text mode, using fopen(name, "r")
> instead of "rb"? Does that introduce other problems? In the Windows C
> compilers I've tried (Microsoft and Borland ones), "r" causes the file
> to be opened in text mode by default, and this causes the ^Z to be
> treated as an EOF.

I think it has to do with comments, because the definition is that from a '#' the rest of the line is ignored, and a line ends with '\n' or at the end of the file -- not at some special character like '\0'. That means, to me, that aborting the reading of a text file when a zero is found would be incorrect parsing.

Cu,
Thomas Lußnig
Re: Bug report: 1) Small error 2) Improvement to Manual
On 21 Jan 2002 at 14:56, Thomas Lussnig wrote:

> > Why not just open the wgetrc file in text mode using fopen(name, "r")
> > instead of "rb"? Does that introduce other problems?
>
> I think it has to do with comments, because the definition is that
> starting with '#' the rest of the line is ignored, and a line ends
> with '\n' or at the end of the file, not at a special character like
> '\0'.

(N.B. the Control-Z character would be '\032', not '\0'.)

So maybe just mention in the documentation that the wgetrc file is considered to be a plain text file, whatever that means for the system Wget is running on, and perhaps mention the peculiarities of DOS/Windows, etc.

In general, it is more portable to read or write native text files in text mode, as that performs whatever local conversions are necessary to make reads and writes of text files look like UNIX, i.e. each line of text terminated by a newline ('\n'). In binary mode, what you get depends on the system: Mac text files have lines terminated by a carriage return ('\r'), for example, and some systems (VMS?) don't even have line-termination characters as such.

In the case of Wget, log files are already written in text mode. I think wgetrc needs to be read in text mode, and that's an easy change. In the case of the --input-file option, ideally the input file should be read in text mode unless the --force-html option is used, in which case it should be read in the same mode as other locally stored HTML files being parsed.

Wget stores retrieved files in binary mode, but the mode used when reading those locally stored files back is less clear-cut (not that it makes much difference on UNIX). It uses open() (not fopen()) and read() to read those files into memory (or mmap() to map them into memory space, if supported). The DOS/Windows version of open() allows you to specify text or binary mode, defaulting to text mode, so it looks like the Windows version of Wget saves HTML files in binary mode and reads them back in text mode! Well, whatever -- the HTML parser still seems to work okay on Windows, probably because HTML isn't that fussy about line endings anyway.

So to support --input-file portably (not the --force-html variant), the get_urls_file() function in url.c should probably call a new function, read_text_file(), instead of read_file() as it does at the moment. For UNIX-type systems, that could just fall back to calling read_file(). The local HTML-file parsing should probably be left well alone, though some #ifdef code could be added for Windows to open the file in binary mode; there may be differences between compilers there.
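A minimal sketch of the proposed read_text_file() (the function name and the read_file() fallback are as described above; the implementation here is illustrative): slurp a file opened in *text* mode, so on Windows the C runtime maps \r\n to \n and stops at a trailing ^Z, while on UNIX "r" and "rb" behave identically.

    #include <stdio.h>
    #include <stdlib.h>

    /* Read NAME in text mode into a NUL-terminated buffer; store the
       length in *LEN.  Returns NULL on error. */
    static char *
    read_text_file (const char *name, long *len)
    {
      FILE *fp = fopen (name, "r");   /* text mode, not "rb" */
      char *buf = NULL;
      long size = 0, used = 0;

      if (!fp)
        return NULL;
      for (;;)
        {
          size_t n;
          if (used == size)
            {
              char *tmp;
              size = size ? 2 * size : 4096;
              tmp = realloc (buf, size + 1);
              if (!tmp)
                {
                  free (buf);
                  fclose (fp);
                  return NULL;
                }
              buf = tmp;
            }
          n = fread (buf + used, 1, size - used, fp);
          if (n == 0)
            break;
          used += (long) n;
        }
      fclose (fp);
      buf[used] = '\0';
      *len = used;
      return buf;
    }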
RE: Bug report: 1) Small error 2) Improvement to Manual
On 17/01/2002 07:34:05, Herold Heiko wrote:

[proper order restored]

> -----Original Message-----
> From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 17, 2002 2:15 AM
> To: Michael Jennings
> Cc: [EMAIL PROTECTED]
> Subject: Re: Bug report: 1) Small error 2) Improvement to Manual
>
> Michael Jennings [EMAIL PROTECTED] writes:
> > 1) There is a very small bug in WGet version 1.8.1. The bug occurs
> > when a .wgetrc file is edited using an MS-DOS text editor: WGet
> > returns an error message when the .wgetrc file is terminated with
> > an MS-DOS end-of-file mark (Control-Z).
>
> Ouch, I never thought of that. Wget opens files in binary mode and
> handles the line termination manually -- but I never thought to
> handle ^Z. As much as I'd like to be helpful, I must admit I'm loath
> to encumber the code with support for this particular thing. I have
> never seen it before; is it only an artifact of DOS editors, or is
> it used on Windows too?

[snip "copy con file.txt"]

> However, in this case (at least when I just tried) the file won't
> contain the ^Z. OTOH, some DOS programs still work on NT4, Windows
> 2000 and XP, could be used, and would create files ending with ^Z.
> But do they really belong here, and should wget be bothered? What we
> really need to know is: is ^Z still a valid, recognized character
> indicating end-of-file (for text-mode files) for command-shell
> programs on Windows NT 4/2000/XP? Could somebody with access to the
> *Windows standards* shed more light on this question?
>
> My personal idea is: as a matter of fact, no *Windows* text editor I
> know of, not even the supplied ones (Notepad, WordPad), will AFAIK
> add the ^Z at the end of file.txt. Wget is a *Windows* program
> (although running in console mode), not a *DOS* program (except for
> the real DOS port, which I know exists but never tried out).

I don't think there's a distinction between DOS and Windows programs in this regard. The C runtime library is most likely to play the significant role here. For a file fopen-ed in "rt" mode, the RTL would convert \r\n -> \n and silently eat the _first_ ^Z, returning EOF at that point. When writing, it goes the other way round w.r.t. \n -> \r\n; I'm unsure about whether it writes a ^Z at the end, though.

> So personally I'd say it would not really be necessary to add support
> for the ^Z, even in the win32 port; except possibly for the DOS port,
> if the porter of that beast thinks it would be useful.

The problem could be solved by opening .wgetrc in "rt" mode. However, the "t" is a non-standard extension.

In any case, this is not wget's problem IMO. Different editors may behave differently. Example: on OS/2 (which isn't a DOS shell, but can run DOS programs), the system editor (e.exe) *does* append a ^Z at the end of every file it saves. People have patched the binary to remove this feature :-) AFAIK no other OS/2 editor does this.

--
Csaba Ráduly, Software Engineer, Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US Support: +1 888 SOPHOS 9  UK Support: +44 1235 559933
Re: Bug report: 1) Small error 2) Improvement to Manual
Herold Heiko [EMAIL PROTECTED] writes:

> My personal idea is: as a matter of fact, no *Windows* text editor I
> know of, not even the supplied ones (Notepad, WordPad), will AFAIK
> add the ^Z at the end of file.txt. Wget is a *Windows* program
> (although running in console mode), not a *DOS* program (except for
> the real DOS port, which I know exists but never tried out). So
> personally I'd say it would not really be necessary to add support
> for the ^Z, even in the win32 port;

That was my line of thinking too.
Re: Bug report: 1) Small error 2) Improvement to Manual
Obviously, this is completely your decision. You are right: only DOS editors make the mistake. (It should be noted that DOS is MS Windows' only command-line language. It isn't going away; even Microsoft supplies command-line utilities with all versions of its OSs. Yes, Windows will probably eventually go away, but not soon.)

However, I have a comment: there is simple logic that would solve this problem. WGet, when it reads a line of the configuration file, probably already strips off trailing spaces (hex 20, decimal 32). I suggest that it strip off both trailing spaces and trailing control characters (characters with hex values of 1F or less, decimal values of 31 or less). This is a simple change that would work in all cases.

Regards,
Michael

Hrvoje Niksic wrote:
> Herold Heiko [EMAIL PROTECTED] writes:
> > My personal idea is: [...] So personally I'd say it would not
> > really be necessary to add support for the ^Z, even in the win32
> > port;
>
> That was my line of thinking too.
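Michael's "simple logic", sketched as a hedged C helper (the name and its placement are hypothetical; wget's actual wgetrc reader may differ):

    #include <string.h>

    /* After reading one line of .wgetrc, drop every trailing byte
       that is a space (0x20) or a control character (<= 0x1F) --
       a single comparison against 0x20 covers both, including \r,
       \n, and the DOS end-of-file mark ^Z (0x1A). */
    static void
    strip_trailing_junk (char *line)
    {
      size_t len = strlen (line);
      while (len > 0 && (unsigned char) line[len - 1] <= 0x20)
        line[--len] = '\0';
    }

Note that a ^Z in the *middle* of a file would still reach the parser; this only handles the common DOS-editor case of a trailing mark.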
RE: Bug report: 1) Small error 2) Improvement to Manual
From: Michael Jennings [mailto:[EMAIL PROTECTED]]
> Obviously, this is completely your decision. You are right, only DOS
> editors make the mistake. (It should be noted that DOS is MS Windows'
> only command-line language. It isn't going away; even Microsoft
> supplies command-line utilities with all versions of its OSs. Yes,
> Windows will probably eventually go

Please note the difference: all Windows versions include a command line. However, that command line AFAIK is not DOS -- it is able to run DOS programs, either because it is based on DOS (Win 9x) or because it can tell the difference between Win32 command-line programs and DOS programs and start the necessary DOS *emulation*. But it is not DOS, and the behaviour is not like DOS. As far as I know, Windows command-line programs do not use ^Z as an end-of-file terminator (although some do honour it for emulation/compatibility); only real DOS programs do. (Does anybody know whether there is an -- MS -- standard for this?)

If this is true, should wget on Windows really emulate the behaviour of DOS programs -- of an environment Windows was originally based on but where it (wget, I mean) is *not*running*anymore*? From a purist's point of view, no. From an end-user point of view, possibly, in order to ease the changeover.

On the other hand, your report is the first one I ever saw. Considering Hrvoje's reaction and the lack of support in the original Windows port, I'd say this is not a problem generally felt to be important, so personally I'm in favor of not cluttering up the port any further with special behaviour. But it is Hrvoje's decision, as always. If you feel it is important, write a patch and submit it; it shouldn't be a major piece of work.

Heiko

--
PREVINET S.p.A.            [EMAIL PROTECTED]
Via Ferretto, 1            ph  x39-041-5907073
I-31021 Mogliano V.to (TV) fax x39-041-5907087
ITALY
Bug report
Hello bug-wget,

$ wget --version
GNU Wget 1.8

$ wget ftp://password:[EMAIL PROTECTED]:12345/Dir%20One/This.Is.Long.Name.Of.The.Directory/*
Warning: wildcards not supported in HTTP.

Oops! But this is an FTP URL, not HTTP! Please fix it. Thank you.

--
Best regards from future,
HillDale. Pavel
mailto:[EMAIL PROTECTED]
Re: Bug report
Pavel Stepchenko [EMAIL PROTECTED] writes:

> $ wget --version
> GNU Wget 1.8
>
> $ wget ftp://password:[EMAIL PROTECTED]:12345/Dir%20One/This.Is.Long.Name.Of.The.Directory/*
> Warning: wildcards not supported in HTTP.
>
> Oooops! But this is FTP url, not HTTP!

Are you using a proxy?
Re: Re[2]: Bug report
Pavel Stepchenko [EMAIL PROTECTED] writes:

> > > Warning: wildcards not supported in HTTP.
> > > Oooops! But this is FTP url, not HTTP!
> HN> Are you using a proxy?
>
> Yes.

This means that HTTP is being used for the retrieval, and '*' won't work -- which is what Wget is trying to warn you about.

> --17:26:58-- ftp://1.2.3.4:12345/Dir%20One/This.Is.Long.Name.Of.The.Directory/*
>            => `*'
> Connecting to 2.2.2.2:3128... connected!
> Proxy request sent, awaiting response... ^C
>
> 1.7.1 doesn't say a single word about "Warning: wildcards not
> supported in HTTP."

Instead, it just silently doesn't work.
RE: WGET 1.8 bug report
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]

> Herold Heiko [EMAIL PROTECTED] writes:
> > I put up the current CVS, mainly since there have been those patches
> > to ftp-ls.c and the signal handler. OK?
>
> Please don't do that. Although all changes in the current CVS *should*
> be stable, mistakes are possible. Please provide a binary that is 1.8
> plus the most critical patches -- currently only the progress.c patch.

Correct, sorry. Site updated.

> > quite a userbase which does take the zipped CVS sources I put up in
> > order to use them on unix platforms. Don't ask me why.
>
> Well, possibly folks behind firewalls who can't use cvs but can
> download with a proxy or something. We should have daily source
> snapshots for such people.

I agree. With a minimal bit of logic this shouldn't even load the server much: before tarring, check whether there have been commits since last time; it shouldn't be difficult to parse a cvs history, or somesuch. Or check out and do a find -newer. Possibly (if the general setup and the sysop permit it) the checked-out files could even live directly in the sunsite wget ftp directory, for easy access to changelogs or single files.

Heiko

--
PREVINET S.p.A.            [EMAIL PROTECTED]
Via Ferretto, 1            ph  x39-041-5907073
I-31021 Mogliano V.to (TV) fax x39-041-5907087
ITALY
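The snapshot logic Heiko outlines, as a hedged shell sketch (all paths and file names are invented for illustration):

    #!/bin/sh
    # Nightly cron job: re-roll the source snapshot only when the CVS
    # checkout has actually changed since the last run.
    cd /home/wget/checkout || exit 1
    cvs -q update -d -P
    if find . -newer ../last-snapshot -type f | grep -q .; then
        tar czf /home/ftp/pub/wget/wget-cvs-snapshot.tar.gz .
        touch ../last-snapshot
    fi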