Re: ** Nigerian Scam variation (Re: Co-operation Needed!)
On 16/07/2002 16:36:15 Fernando Cassia wrote:
> FYI, and if someone has been living in a bottle: this is a variation
> of the Nigerian scam.
> http://www.secretservice.gov/alert419.shtml
> http://www.fdic.gov/consumers/consumer/news/cnwin0102/TooGood.html
> Don't even bother contacting them.
> Regards, Fernando
>
> Jesse Ndoro. wrote:
> > Dear Sir,
> > [snip Nigerian scam quoted in its entirety]

Please don't do that.

1) This is a mailing list where the subscribers can actually think.
2) FYI, you should not top-post and quote the original *in its
   entirety*. This is a mailing list, where discussions are frequent.
   Replying at the top makes it difficult to follow who said what and
   in reply to whom.
3) We already received two copies of the scam. There was no need to
   send a third, unabridged copy.

--
Csaba Ráduly, Software Engineer, Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US Support: +1 888 SOPHOS 9  UK Support: +44 1235 559933
Re: speed units
On 10/06/2002 23:07:47 Joonas Kortesalmi wrote:
> Wget seems to report speeds with the wrong units. It uses, for
> example, KB/s rather than kB/s, which would be correct. Any
> possibility to fix that? :)
> K = Kelvin, k = kilo. Probably you want to use a small k with
> download speeds, right?

Let's not go there again, lest wget have to report downloads in
kibibytes (ISTR wget using 1024 to divide). k = kilo is reserved for
dividing by 1000.
Re: Can't get remote files - what am I doing wrong?
On 03/06/2002 14:56:47 dale wrote:
> [snip]
> wget ftp://user:[EMAIL PROTECTED]/folder1/folder2/*s.csv
> I get an error message of "no match", and if I use:
> wget --glob=on ftp://user:[EMAIL PROTECTED]/folder1/folder2/*s.csv
> I also get "no match".

In the future, please post the output with the -d switch added.
(Did you read the instructions?)

> [snip]
> The Mac machine I am using for testing is behind our firewall, but
> there is a hole opened to allow my internal IP to reach the specific
> remote IP.
> [snip]

Because you didn't include the output with the -d switch, I'm
guessing. Do you use a proxy to go through the firewall? A lot of
proxies issue HTTP requests even for FTP. HTTP cannot glob.

> p.s. The reply-to address has been anti-spammed (I hope anyway),
> please post any replies to the list.

Somebody at Ultimate Search (the owner of nospam.net) will be mightily
surprised. What you did can be interpreted as email address forgery.
Please, in the future, use addresses which end in .invalid (this
top-level domain is guaranteed to always be, err, invalid), e.g.
[EMAIL PROTECTED]
Re: Can't get remote files - what am I doing wrong?
On 05/06/2002 13:08:05 drt - lists wrote:
> Thanks for no help. If this is typical of how you reply to your
> customers

I do *not* reply to customers. I am a developer, and post here as a
private individual. Perhaps I should unsubscribe altogether.

> > [snip]
> > > The Mac machine I am using for testing is behind our firewall,
> > > but there is a hole opened to allow my internal IP to reach the
> > > specific remote IP.
> > [snip]
> > Because you didn't include the output with the -d switch, I'm
> > guessing. Do you use a proxy to go through the firewall? A lot of
> > proxies issue HTTP requests even for FTP. HTTP cannot glob.
>
> Yes we do,

So there is a proxy after all.

> and no, it doesn't issue an ftp request as I have an opening for this
> specific request - which if you had bothered to read my message
> instead of trying to attack you would know that. Here is the part
> that you ignored which addresses the accusation above.
                                      ^^
Huh? I described a scenario which could have caused the failure you
described. I did not *accuse* you of using a proxy!

> ---
> The Mac machine I am using for testing is behind our firewall, but
> there is a hole opened to allow my internal IP to reach the specific
> remote IP. And using the first example above it does connect, so I
> know I am getting through the firewall.
> ---

Note that if wget is set up to use the proxy by default (env. var,
wgetrc) then it'll use the proxy even if it could connect directly
through the hole in the firewall. The first example (which I snipped)
did not use globbing. That would succeed regardless of whether wget
connected directly or through an HTML-ized proxy.

We're not getting any closer to a solution. Please post the output of
the failed request in debugging mode (be careful to obscure any
possible passwords).

> [ad hominem attack snipped]

I apologise. Although I consider what I've written to be valid, the
tone was not. I claim temporary loss of diplomatic abilities.
Re: ? gets translated to @
On 24/05/2002 13:39:29 ladislav.gaspar wrote:
> Hi, I do the following:
> wget http://killefiz.de/zaurus/showdetail.php?app=221
> but the file is saved as
> http://killefiz.de/zaurus/showdetail.php@app=221
> (*.php?app gets translated to *.php@app)
> Why is that, and is there a workaround?

That *is* the workaround :-)
'?' is an invalid character for filenames on FAT, FAT32 and NTFS.
Instead of giving an error message like this:

  Cannot open killefiz.de/zaurus/showdetail.php?app=221

wget actually tries to do what you want (i.e. download the file).

You can run wget on another platform (Linux, some Unix, etc.). The
filesystems there usually don't have this restriction.
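A minimal sketch of the kind of substitution described above (not
wget's actual code; the function name and the exact character set are
illustrative assumptions): characters that are invalid in FAT/FAT32/
NTFS filenames get replaced with '@'.

```c
#include <string.h>

/* Hypothetical helper: replace filename characters that FAT/FAT32/
 * NTFS reject with '@', in place. */
static void sanitize_filename(char *name)
{
    static const char invalid[] = "?*<>|\"";
    for (char *p = name; *p; ++p)
        if (strchr(invalid, *p))
            *p = '@';
}
```

Running it over "showdetail.php?app=221" yields exactly the saved name
the poster observed.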
Re: crawling servlet based urls
On 16/05/2002 17:06:31 Steve Mestdagh wrote:
> Hi, I'm trying to crawl intranet URLs of the form:
> [snip; wget will try to save to a filename like this:]
> `WKCCommand?command=getLesson&LessonId=137'
> [snip]

The filename above is invalid on many filesystems used by Micros~1.
(It's the '?' causing the problem.) This is corrected for sure in a
newer version, either 1.8.1 or the current CVS. Heiko Herold provides
a new CVS binary for Windows at http://space.tin.it/computer/hherold
Re: apache irritations
On 22/04/2002 16:38:15 Maciej W. Rozycki wrote:
> On Mon, 22 Apr 2002, Hrvoje Niksic wrote:
> > How about using the -R option of wget?
>
> A brief test proves -R '*\?[A-Z]=[A-Z]' works as it should. Or maybe
> the default system wgetrc should ship with something like:
>
> reject = *?[A-Z]=[A-Z]
>
> Note the difference between strings! -- the backslash before the
> question mark is essential, as otherwise it's a glob character.

[A-Z] is a bit extreme, IMHO. How about

reject = *\?[NMSD]=[AD]
          ^^ literal '?' needed here

> Well, I don't think it's sane, but adding a *commented-out* reject
> line with an appropriate annotation to the default system wgetrc
> looks like a good idea to me.

A good idea.
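The escaping point above can be checked with POSIX fnmatch(), whose
glob syntax is close to what wget's accept/reject lists use: an
unescaped '?' matches any single character, while '\?' matches only a
literal question mark.

```c
#include <fnmatch.h>

/* Returns 1 if the glob pattern matches the filename, 0 otherwise.
 * With flags = 0, fnmatch treats backslash as an escape character. */
static int matches(const char *pattern, const char *filename)
{
    return fnmatch(pattern, filename, 0) == 0;
}
```

With the escaped pattern, "index.html?N=A" is rejected but
"index.htmlXN=A" is not; with the unescaped pattern, both are.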
Re: Goodbye and good riddance
On 12/04/2002 19:21:41 James C. McMaster (Jim) wrote:
> My patience has reached an end. Perhaps, now that you have (for the
> first time) indicated you will do something to fix the problem, the
> possible light at the end of the tunnel will convince others to stay.

The light at the end of the tunnel is just the explosion around the
Pu239 :-)
Re: HTTP 1.1
On 12/04/2002 21:37:31 hniksic wrote:
> Tony Lewis [EMAIL PROTECTED] writes:
> > Hrvoje Niksic wrote:
> > > > Is there any way to make Wget use HTTP/1.1?
> > > Unfortunately, no.
> > In looking at the debug output, it appears to me that wget is
> > really sending HTTP/1.1 headers, but claiming that they are
> > HTTP/1.0 headers. For example, the Host header was not defined in
> > RFC 1945, but wget is sending it.
>
> Yes. That is by design -- HTTP was meant to be extended in that way.
> Wget is also requesting and accepting `Keep-Alive', using `Range',
> and so on. Csaba Raduly's patch would break Wget because it doesn't
> support the chunked transfer-encoding. Also, its understanding of
> persistent connections might not be compliant with HTTP/1.1.

IT WAS A JOKE! Serves me right. I need to put bigger smilies :-(
Re: qestio
On 05/04/2002 12:44:22 Varga Gabor wrote:
> Hi, I am Gabor from Hungary. I have a question. I have an URL ending
> like this: */show.php?id=843. I know how it works (correct me if I am
> wrong): the *.php (gets or posts) the arg. ID and the server returns
> page 843. But why can't wget mirror these pages?

Because it'll try to save with the filename show.php?id=843, and '?'
is invalid in a filename on DOS/Windows/OS2.

What version of wget are you using? What platform (operating system)?
What does the debug log say? (run wget with the -d switch added)

CC'd to wget, not bug-wget.
wget parsing JavaScript
wget stumbled upon the following HTML file:

--- 8< ---
<html>
<head>
<title>foo</title>
</head>
<body>
<SCRIPT language="JavaScript1.2">
var sitems=new Array()
var sitemlinks=new Array()

////Edit below////
//extend or shorten this list
sitems[0]="15.html"
sitems[1]="16.html"
sitems[2]="17.html"
sitems[3]="18.html"
sitems[4]="19.html"
sitems[5]="20.html"
sitems[6]="21.html"
sitems[7]="22.html"
sitems[8]="23.html"
sitems[9]="24.html"
sitems[10]="25.html"
sitems[11]="26.html"
sitems[12]="27.html"

//These are the links pertaining to the above text.
sitemlinks[0]="31.html"
sitemlinks[1]="32.html"
sitemlinks[2]="33.html"
sitemlinks[3]="34.html"
sitemlinks[4]="35.html"
sitemlinks[5]="36.html"
sitemlinks[6]="37.html"
sitemlinks[7]="38.html"
sitemlinks[8]="39.html"
sitemlinks[9]="40.html"
sitemlinks[10]="41.html"
sitemlinks[11]="42.html"
sitemlinks[12]="43.html"

//If you want the links to load in another frame/window, specify name of
//target (ie: target="_new")
var target=""

for (i=0;i<=sitems.length-1;i++)
  document.write('<a href="'+sitemlinks[i]+'" target="'+target+'">'+sitems[i]+'</a><br>')
</SCRIPT>
<NOSCRIPT>
Congratulations, you have turned off JavaScript.
</NOSCRIPT>
</body>
</html>
--- 8< ---

I see that wget handles SCRIPT with tag_find_urls, i.e. it tries to
parse whatever is inside. Why was this implemented? JavaScript is
mostly used to construct links programmatically. wget is likely to
find bogus URLs until it can properly parse JavaScript.
Re: OK, time to moderate this list
On 22/03/2002 07:06:13 Daniel Stenberg wrote:
> On Fri, 22 Mar 2002, Hrvoje Niksic wrote:
> [snip]
> > I think I agree with this. The amount of spam is staggering. I have
> > no explanation as to why this happens on this list, and not on
> > other lists which are *also* open to non-subscribers.
>
> Spammers work in mysterious ways. ;-)

No, they work in fairly predictable ways. The wget mailing list
address is advertised on the wget homepage. According to empirical
observations, if you publish a brand-new email address on a web page,
it'll receive spam within eight *hours* of it being published.
Re: KB or kB
On 08/02/2002 08:30:59 Henrik van Ginhoven wrote:
> On Fri, Feb 08, 2002 at 08:54:06AM +0100, Hrvoje Niksic wrote:
> > Wget currently uses "KB" as abbreviation for kilobyte. In a Debian
> > bug report someone suggested that "kB" should be used because it is
> > more correct.

This is the kind of stuff that leads to month-long flamewars :-)

> kB rather than KB? I think whoever filed that bug report got it
> wrong; as far as I know kB would always mean 1000 (bytes), since
> k = thousand, and never ever 1024. If he'd said KiB I'd agree with
> him to a certain degree, but kB simply can't be right.

Note that we can claim the distinction that k=1000 and K=1024. That
won't work with 1E6 vs 2**20, because SI uses uppercase M for 1E6.

> Rather than me trying to sum it up and risk typing something wrong,
> this page seems to address the issue well:
> http://www.romulus2.com/articles/guides/misc/bitsbytes.shtml

Please, no kibibytes :-)
Maybe wget should just count 512-byte blocks, a la df. That would
improve the understandability of the display ... NOT. But it would
keep the terminally anal-retentive at bay :-)

Seriously, just ignore it. I can certainly live with 5% experimental
error ( 2**20 = 1.0486E6 ) at the megabyte level.
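The "5% experimental error" quoted above is just the relative gap
between a binary megabyte (2**20 bytes) and an SI megabyte (1E6
bytes); a one-liner makes the figure concrete:

```c
/* Relative difference between 2**20 and 1E6, in percent: about 4.86,
 * i.e. the "5% experimental error" at the megabyte level. */
static double binary_mega_error_percent(void)
{
    return ((double)(1 << 20) / 1e6 - 1.0) * 100.0;
}
```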
Re: wget not working
On 08/02/2002 15:34:53 Martin Schöneberger wrote:
> At 14:37 08.02.2002 +0000, Henderson, Daniel wrote:
> > #wget www.sophos.com/downloads/ide/ides.zip
> > --14:32:57-- http://www.sophos.com/downloads/ide/ides.zip
> >            => `ides.zip'
> > Connecting to www.sophos.com:80...
> > www.sophos.com: Host not found.
> > Is there something else I should configure in Solaris to allow this
> > to work?
>
> First of all you should find out why you can't connect to sophos.com.
> 1) sophos is down - try later

Solution: get the file from another server.

> 2) DNS lookup failed - try if you can connect to other hosts like
> google.com or anything else, or if you can only connect to IP
> addresses.
> Solution 1: try another DNS server
> Solution 2: reconfigure your DNS settings or even your DNS server
> (if you are running one)

Try
  nslookup www.sophos.com
  ping www.sophos.com
  telnet www.sophos.com 80
If these work, it's wget's fault. If they don't, it's a connectivity
problem.

> 4) user root not allowed to connect to the internet (standard on BSD
> if I remember correctly) - try if you can download the file using
> another user.
> Solution: change the user database or the firewall settings, or just
> don't connect to the internet as root :-)

Good point. Look at the prompt...

[snip]

> Last but not least: try the -d switch with wget and have a look at
> the debug output. Perhaps you'll find further information on why you
> can't connect. If you don't, send it to this list; perhaps we'll
> find something :-)

Very good advice indeed.

HTH,
Re: KB or kB
On 08/02/2002 13:58:55 Andre Majorel wrote:
> On 2002-02-08 08:54 +0100, Hrvoje Niksic wrote:
> > Wget currently uses "KB" as abbreviation for kilobyte. In a Debian
> > bug report someone suggested that "kB" should be used because it is
> > more correct. The reporter however failed to cite the reference for
> > this, and a search of the web has proven inconclusive. Does someone
> > understand the spelling issues involved enough to point out the
> > correct spelling and back it up with arguments?
>
> The applicable standard is the SI (Système International)
> [snip SI prefixes]
> Capital K is not a prefix; it's the SI abbreviation for the
> temperature unit, the kelvin (note: lower case k) named after Lord
> Kelvin. So it's definitely kB for kilobyte.

As long as it means 1000 and NOT 1024.

> Whether that means 1000 bytes or 1024 bytes is another issue.

Not while claiming to conform to SI.

Csaba
Re: BUG https + index.html
On 01/02/2002 12:10:59 Mr.Fritz wrote:
> After the https/robots.txt bug, doing a recursive wget to an
> https-only server gives me this error: it searches for
> http://servername/index.html but there is no server on port 80, so
> wget receives a "Connection refused" error and quits. It should
> search for https://servername/index.html

Are you sure this was an SSL-enabled wget? Please provide a debug log
by running wget with the -d parameter.
Re: mirroring vs -m
On 29/01/2002 15:54:17 Andre Majorel wrote:
[snip debate about following links in HTML retrieved by FTP]
> > I'm inclined to think that recursive retrieval without parsing is a
> > feature. HTML content is normally served over HTTP. If you want to
> > retrieve HTML through FTP, it's likely because you do *not* want to
> > follow the links.
>
> I (the client) don't get the choice. If the document at
> http://foo.bar/index.html has all its links like this:
>
> <A HREF="ftp://foo.bar/welcome.html">welcome</A>
>
> the client has no choice but to retrieve them via FTP. It would be
> nice if wget was able to follow all those links.
>
> > If Wget always parsed HTML, even over FTP, it would be impossible
> > to make a complete mirror of a tree that has broken href links or
> > hidden files.

Perhaps if wget started with FTP, it should mirror FTP-like (.listing
and all that). If it started via HTTP, it should follow links,
regardless of future retrieval modes.
[snip]
RE: Bug report: 1) Small error 2) Improvement to Manual
On 17/01/2002 07:34:05 Herold Heiko wrote: [proper order restored]
> -----Original Message-----
> From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 17, 2002 2:15 AM
> To: Michael Jennings
> Cc: [EMAIL PROTECTED]
> Subject: Re: Bug report: 1) Small error 2) Improvement to Manual
>
> Michael Jennings [EMAIL PROTECTED] writes:
> > 1) There is a very small bug in WGet version 1.8.1. The bug occurs
> > when a .wgetrc file is edited using an MS-DOS text editor: WGet
> > returns an error message when the .wgetrc file is terminated with
> > an MS-DOS end-of-file mark (Control-Z). MS-DOS is the command-line
> > language for all versions of Windows, so ignoring the end-of-file
> > mark would make sense.
>
> Ouch, I never thought of that. Wget opens files in binary mode and
> handles the line termination manually -- but I never thought to
> handle ^Z. As much as I'd like to be helpful, I must admit I'm loath
> to encumber the code with support for this particular thing. I have
> never seen it before; is it only an artifact of DOS editors, or is it
> used on Windows too?
>
> [snip copy con file.txt]
> However in this case (at least when I just tried) the file won't
> contain the ^Z. OTOH some DOS programs will still work on NT4, W2k
> and XP, and could be used, and would create files ending with ^Z. But
> do they really belong here, and should wget be bothered? What we
> really need to know is: is ^Z still a valid, recognized character
> indicating end-of-file (for text-mode files) for command shell
> programs on Windows NT4/2k/XP? Could somebody with access to the
> *Windows standards* shed more light on this question? My personal
> idea is: as a matter of fact, no *Windows* text editor I know of,
> even the supplied Windows ones (notepad, wordpad), AFAIK will add the
> ^Z at the end of file.txt. Wget is a *Windows* program (although
> running in console mode), not a *DOS* program (except for the real
> DOS port, which I know exists but never tried out).

I don't think there's a distinction between DOS and Windows programs
in this regard.
The C runtime library is most likely to play a significant role here.
For a file fopen-ed in "rt" mode, the RTL would convert \r\n -> \n and
silently eat the _first_ ^Z, returning EOF at that point. When
writing, it goes the other way 'round WRT \n -> \r\n. I'm unsure about
whether it writes ^Z at the end, though.

> So personally I'd say it would not be really necessary adding support
> for the ^Z, even in the win32 port; except possibly for the DOS port,
> if the porter of that beast thinks it would be useful.

The problem could be solved by opening .netrc in "rt" mode. However,
the "t" is a non-standard extension. However, this is not wget's
problem IMO.

Different editors may behave differently. Example: on OS/2 (which
isn't a DOS shell, but can run DOS programs), the system editor
(e.exe) *does* append a ^Z at the end of every file it saves. People
have patched the binary to remove this feature :-) AFAIK no other OS/2
editor does this.
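The binary-mode approach discussed above can be sketched portably: open
the file in binary mode and handle the line termination (and, if
desired, the DOS ^Z mark) by hand. This is an illustrative sketch, not
wget's actual init-file reader.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from a file opened in binary mode, stripping \r\n or
 * \n endings and truncating at a DOS end-of-file mark (^Z, 0x1A).
 * Returns 1 if a line was read, 0 at end of file. (Sketch only.) */
static int read_config_line(FILE *fp, char *buf, size_t size)
{
    if (!fgets(buf, (int)size, fp))
        return 0;

    char *z = strchr(buf, '\x1a');  /* DOS EOF mark: cut the line here */
    if (z)
        *z = '\0';

    size_t len = strlen(buf);
    while (len && (buf[len - 1] == '\n' || buf[len - 1] == '\r'))
        buf[--len] = '\0';
    return 1;
}
```

A caller that treats an empty read after ^Z as end-of-input gets the
same effect as the MSVC "rt" mode, without relying on the non-standard
"t" flag.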
Re: Is wget --timestamping URL working on Windows 2000?
On 11/12/2001 14:03:54 Adrian Aichner wrote:
> Hi Wgeteers!
> Is
>   -N, --timestamping  don't retrieve files if older than local.
> supposed to work on Windows 2000?
> [snip]
> cd c:\Hacking\SunSITE.dk\xemacsweb\Download\win32\
> %TEMP%\wget.wip\src\wget.exe --debug --timestamping --output-document=setup.exe http://ftp.xemacs.org/windows/setup.exe
> Compilation started at Tue Dec 11 14:53:07 2001 +0100 (W. Europe Standard Time)
> DEBUG output created by Wget 1.8 on Windows.
> --14:53:07-- http://ftp.xemacs.org/windows/setup.exe
>            => `setup.exe'
> Resolving ftp.xemacs.org... done.
> Caching ftp.xemacs.org => 207.96.122.9
> Connecting to ftp.xemacs.org[207.96.122.9]:80... connected.
> Created socket 420.
> Releasing 007D1C00 (new refcount 1).
> ---request begin---
> [snip HEAD request and response]
> Found ftp.xemacs.org in host_name_addresses_map (007D1C00)
> Registered fd 420 for persistent reuse.
> Length: 181,760 [application/octet-stream]
> Closing fd 420
> Releasing 007D1C00 (new refcount 1).
> Invalidating fd 420 from further reuse.
> The sizes do not match (local 0) -- retrieving.
                          ^^^^^^^
Something is wrong there. Try it without --output-document; it should
put the file in the current dir anyway.

> --14:53:08-- http://ftp.xemacs.org/windows/setup.exe
>            => `setup.exe'
> Found ftp.xemacs.org in host_name_addresses_map (007D1C00)
> Connecting to ftp.xemacs.org[207.96.122.9]:80... connected.
> Created socket 420.
> Releasing 007D1C00 (new refcount 1).
> ---request begin---
> GET /windows/setup.exe HTTP/1.0
> [snip]
> 14:53:47 (6.14 KB/s) - `setup.exe' saved [181760/181760]
> Compilation finished at Tue Dec 11 14:53:47
Re: log errors
On 11/12/2001 15:09:25 hniksic wrote:
> Summer Breeze [EMAIL PROTECTED] writes:
> > I want to know if Wget is a program similar to Mozilla, and if so,
> > is there any way to make my pages available to Wget? I use Netscape
> > to create my web pages.
>
> Wget is a command-line downloading utility; it allows you to download
> a page or a part of the site without further user interaction.
>
> > Here is a sample entry:
> > 66.28.29.44 - - [08/Dec/2001:18:21:20 -0500] "GET /index4.html%0A HTTP/1.0" 403 280 "-" "Wget/1.6"
>
> /index4.html%0A looks like a page is trying to link to /index4.html,
> but the link contains a trailing newline.

That IP address is assigned to Road Runner (big cable ISP, I think).
Is /index4.html%0A the *first* error line in the log from 66...44?

Wget will try to download a URL in two cases: either because it was
told to explicitly, or because it was doing a recursive download and
found that link in a page downloaded earlier. /index4.html%0A looks
like something somewhere was misparsed. It might conceivably be wget
(unlikely, as this sort of problem would've surfaced long ago). If
/index4.html%0A *is* the first URL requested by that IP address, then
the blame is clearly elsewhere (unless -i was used). If not, can you
search your site for a link to /index4.html that might be badly
formatted HTML? (although wget should be able to defend itself against
bad HTML)

(Please don't CC me; I'm on the list)
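The diagnosis above can be made concrete with a small percent-decoder
(a hypothetical helper, not wget's actual code): "%0A" is the
URL-encoding of a newline (0x0A), so /index4.html%0A is /index4.html
with a trailing newline that some link extractor failed to strip.

```c
#include <ctype.h>
#include <stdlib.h>

/* Decode %XX escapes in a URL path into out (which must be at least
 * as large as in). Illustrative sketch: no validation beyond checking
 * that both hex digits are present. */
static void percent_decode(const char *in, char *out)
{
    while (*in) {
        if (in[0] == '%' && isxdigit((unsigned char)in[1])
                         && isxdigit((unsigned char)in[2])) {
            char hex[3] = { in[1], in[2], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}
```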
Re: Uncoupling translations from source
On 10/12/2001 08:10:12 Martin v. Loewis wrote:
> > Maybe you wanted to say that many Europeans speak English so well
> > that they do not need translations?
>
> It is my observation as well: some users are hostile towards the
> notion of translated software. Those are typically not native English
> speakers, but people who found, at one time or the other, reason to
> complain about translations. They do so for all operating systems,
> making fun of erroneous translations (such as the infamous "Pfeife
> zerbrochen" of SINIX, or translations that an MS employee came up
> with).

From an ancient DR-DOS (version 3.something):

  Nicht breit __reading__ laufwerk A:

This was clearly an oversight (the message was probably pasted
together from various places).

My native language is Hungarian, and I don't remember using ANY
software in Hungarian (with the possible exception of Recognita, which
is written by Hungarians). For the few I tried, I found the Hungarian
translation incredibly awkward (this is exacerbated by the fact that
Hungarian is neither Germanic nor Latinate), even if not at the level
of "all your base are belong to us" :-) It was easier to use the
English version (this was all commercial software).

Complaining about the *presence* of a translation is silly, IMO.
Presumably gettext has a way to decide what language to use (the LANG
environment variable, or suchlike; LANG=en_gb should do). Decoupling
translations is a good idea, if the logistics can be sorted out.

Csaba
Re: Wget 1.8-beta1 now available
On 01/12/2001 19:44:44 John Poltorak wrote:
> On Sat, Dec 01, 2001 at 04:30:47PM +0100, Hrvoje Niksic wrote:
> > John Poltorak [EMAIL PROTECTED] writes:
> > > Is it possible to include OBJEXT in Makefile.in to make this more
> > > cross-platform?
> > I suppose so. I mean, "o" is already defined to ".@U@o", but I'm
> > not exactly sure what the U is supposed to stand for.
>
> It looks to me as though @U@ is set up for some variable
> substitution, but I can't work out what for... Maybe it's getting
> replaced by NULL.

I know next to nothing about how Auto* is (supposed to be) working,
but I've seen lots of sed commands in configure. If @U@ is doing a
variable substitution, then it'll expand to something _before_ "o"
(if @U@ -> "bar", then this will result in a dependency involving
".baro").

(looking through configure) Wget's configure contains this towards the
end:

  s%@U@%$U%g

U seems to be related to ansi2knr:

  if (can use prototypes)
    U=   ANSI2KNR=
  else
    U=_  ANSI2KNR=./ansi2knr
  endif

This will result in dependencies written as "._o" if ansi2knr was run
over the sources. This forces me to conclude that using @U@ _CAN_NOT_
and _WILL_NOT_ change .o to .obj. I think .@U@o might need to be
replaced with .@U@@objext@ (if there is such a beast, by analogy with
@exeext@).

Csaba
Re: wget1.7.1: Compilation Error (please Cc'ed to me :-)
On 28/11/2001 10:28:44 Daniel Stenberg wrote:
> On Wed, 28 Nov 2001, zefiro wrote:
> > ld: Undefined symbol _memmove
> > Do you have any suggestion?
>
> SunOS 4 is known to not have memmove.

Isn't configure supposed to notice that?
Re: wget mirroring busted
On 14/11/2001 16:27:34 jwz wrote:
> [EMAIL PROTECTED] wrote:
> > Can you post the entire debug log (on a web/ftp site, of course,
> > not the list)?
>
> Done -- http://www.jwz.org/wget-log.gz
> Does this mean you can't reproduce this when you run wget the same
> way I did?

No, I just wanted to take a look at the surrounding lines in the log.

> wget -nv -m -nH -np \
>     http://www.dnalounge.com/flyers/ http://www.dnalounge.com/gallery/

I may try that myself.

P.S. Please *don't* CC me in the future; I'm on the list.
Re: A tricky download
On 12/10/2001 16:49:07 Edward J. Sabol wrote:
[snip question about downloading a site with JavaScript-only links]
> Probably not. If the only links to the other chapters are in
> JavaScript commands, then there's no way wget can do it. Wget does
> not interpret JavaScript and most likely never will.

Implementing it is left as an exercise for the reader. ;-)
Re: Recursive retrieval of page-requisites
On 09/10/2001 14:25:57 Andre Pang wrote:
> On Tue, Oct 09, 2001 at 03:46:52PM +0300, Mikko Kurki-Suonio wrote:
> > > To me that sounds like a logical combination of -r -np -p? Any
> > > correction appreciated.
> >
> > Doesn't work, apparently because -np overrides -p. I.e. with -np
> > set, no document outside the selected subtree will be loaded,
> > whether it is referred to through regular link traversal or as a
> > page-requisite element. My guess is that -p adds those links to the
> > list of documents to load, but -np later rejects them because
> > they're not within the selected subtree. What I'd basically like is
> > a setting that loads page requisites REGARDLESS OF ALL OTHER
> > SETTINGS. I.e. you use the myriad of settings to fine-tune the
> > exact set of pages requested, and then request all requisites for
> > the selected set of pages.
>
> Try this patch. It should make -p _always_ get page requisites, even
> if you have -np on (which was the reason why I wrote the patch).
> [snip]

Actually, a case can be made for both ways. Sometimes you might want
-p to only get images conforming to -np, perhaps to skip (advertising)
banners (those are usually served by another server, and thus ignored
anyway unless --span-hosts). Perhaps make -p override -np, but have an
alternative -p (e.g. -pnp) which obeys -np. I didn't see Andre's
patch, so I cannot comment on it (it was stripped by my mail
system :-(

It modifies existing (admittedly confusing) behaviour; my suggestion
would permit getting the old behaviour back. Another possibility would
be to keep the existing behaviour (i.e. -np overrides -p) and have a
stronger -p (e.g. -pp) which ignores -np.

Csaba
Re: Bus errors and recursion
[about alloca vs malloc]

If you allocate with malloc and then accidentally overwrite it, you
get a corrupted heap. If you allocate with alloca and then
accidentally overwrite it, you get a corrupted stack. Guess which is
easier to notice :-)

Besides, alloca is a GCC builtin (IIRC), so you don't have to worry
about its implementation (the GCC folks do :). As long as you have the
stack to allocate from, it's as transparent as declaring automatic
arrays with variable length, e.g.

  p = alloca( strlen(s) );

is almost the same as

  char a[ strlen(s) ], *p = a;  /* this is a legal GCC extension */
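The variable-length automatic array mentioned above was standardised in
C99, so the same stack-allocation pattern can be shown without alloca
at all. A small sketch (the function itself is just an illustrative
example):

```c
#include <string.h>

/* Count the spaces in s, working on a stack-allocated copy. The C99
 * variable-length array plays the role of alloca: the buffer is
 * released automatically when the function returns, with no free(). */
static size_t count_spaces(const char *s)
{
    char copy[strlen(s) + 1];   /* VLA: stack allocation, like alloca */
    strcpy(copy, s);

    size_t n = 0;
    for (const char *p = copy; *p; ++p)
        if (*p == ' ')
            ++n;
    return n;                   /* copy vanishes here */
}
```

Note the +1 for the terminating NUL, which the email's shorthand
`alloca( strlen(s) )` glosses over.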
Re: WGET for OS/2 and Proxy-Server
On 15/05/01 13:00, Hrvoje Niksic <hniksic@arsdigita.com> wrote (to the
Wget List, cc: Thomas Bohn):
> Thomas Bohn [EMAIL PROTECTED] writes:
> > Hello, I tried to use WGET for OS/2 (tested V 1.5.3 and 1.6) with a
> > proxy server. Without a proxy server all works fine. But with... In
> > an OS/2 command-line session I type the following commands:
> > SET HTTP_PROXY=62.52.17.1:80
>
> Your proxy setting gets ignored. Try using lower-case `http_proxy'.
> It seems to me that getenv has some issues on OS/2. Workaround: use
> .wgetrc commands instead.

All environment variable names (i.e. the part before the '=') are
uppercase on OS/2. wget uses getenv("http_proxy"); the implementation
of getenv seems to be scanning _environ and doing a strncmp (i.e. a
case-sensitive comparison). If getproxy in url.c is changed to
getenv("HTTP_PROXY") then it does pick up the environment setting.

Could we postulate that *ALL* environment vars influencing WGET be
uppercase? These are the places where getenv is used (excluding
getopt.c):

init.c:237: tmp = getenv ("no_proxy");
init.c:259: char *home = getenv ("HOME");
init.c:292: env = getenv ("WGETRC");
url.c:1292: proxy = opt.http_proxy ? opt.http_proxy : getenv ("http_proxy");
url.c:1294: proxy = opt.ftp_proxy ? opt.ftp_proxy : getenv ("ftp_proxy");
url.c:1297: proxy = opt.https_proxy ? opt.https_proxy : getenv ("https_proxy");
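An alternative to forcing every variable name to one case would be a
case-insensitive environment lookup. This is a hypothetical workaround
sketch (the function name is invented; it is not wget's code), scanning
environ the way the OS/2 getenv scans _environ, but comparing without
case sensitivity:

```c
#include <string.h>
#include <strings.h>   /* strcasecmp / strncasecmp */

extern char **environ;

/* Look up an environment variable by name, ignoring case, so that
 * getenv_nocase("http_proxy") also finds HTTP_PROXY. Returns the
 * value after '=', or NULL if not set. (Illustrative sketch.) */
static char *getenv_nocase(const char *name)
{
    size_t len = strlen(name);
    for (char **e = environ; *e; ++e) {
        const char *eq = strchr(*e, '=');
        if (eq && (size_t)(eq - *e) == len
               && strncasecmp(*e, name, len) == 0)
            return (char *)(eq + 1);
    }
    return NULL;
}
```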
Yet another Makefile.watcom :-)
An hour of careful debugging can save you five minutes of reading the documentation.

(See attached file: Makefile.watcom)

This version gets rid of the ugly double list of object files (one for the linker, one for the dependencies).

EXPLANATION target=Watcom users

WLINK expects the object files to be specified like this:

  wlink FILE 1.obj,2.obj,etc_etc,n.obj NAME program.exe ...

This is the format auto-generated by their IDE, BTW. However, wlink also accepts an alternate way:

  wlink FILE { 1.obj 2.obj etc_etc n.obj } NAME program.exe ...

What's more, this is actually present in the documentation (gasp)!

/EXPLANATION

-- Csaba Ráduly, Software Engineer Sophos Anti-Virus email: [EMAIL PROTECTED] http://www.sophos.com US support: +1 888 SOPHOS 9 UK Support: +44 1235 559933
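The brace form is what makes the single object list possible: one make macro can feed both the dependency line and the linker command. A hypothetical fragment in wmake syntax (the macro name and the object files listed are made up for illustration; `$^@` is wmake's full target name, as seen in the `copy $[@ $^@` rule elsewhere in this thread):

```make
# hypothetical Makefile.watcom fragment -- OBJS drives both
# the dependency list and the wlink command
OBJS = cmpt.obj connect.obj ftp.obj http.obj main.obj

wget.exe : $(OBJS)
        wlink FILE { $(OBJS) } NAME $^@ SYSTEM nt
```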
RE: New and improved Makefile.watcom
On 14/05/01 12:05, Herold Heiko <Heiko.Herold@previnet.it> wrote to Hrvoje Niksic and the Wget List:

Hrvoje Niksic [EMAIL PROTECTED] writes:

[EMAIL PROTECTED] writes: This is a rewrite of Makefile.watcom

Thanks; I've put it in the repository.

# Copy this file to the ..\src directory (maybe rename to Makefile). Also: # copy config.h.ms ..\src\config.h

Maybe we should provide a win-build script (or something) that does this automatically?

How about this ?

config.h : ..\windows\config.h.ms
        copy $[@ $^@

(this would be "copy $< $@" for GNU make) Yup, it works (for me ! :-)

Isn't this what configure.bat is for ?

In theory, but... Default to VC (or use VC if --msvc is given), otherwise if env var BORPATH is present (or --borland is given) use borland, otherwise error. I see no Watcom here :-) configure.bat doesn't know about Watcom C.

Hrvoje also wrote:

#disabled for faster compile
LFLAGS=sys nt op st=32767 op vers=1.7 op map op q op de 'GNU wget 1.7dev' de all
CFLAGS=/zp4 /d1 /w4 /fpd /5s /fp5 /bm /mf /os /bt=nt
[snip]
# /zp4 = pack structure members with this alignment
# /d1  = line number debug info
# /w4  = warning level
# /fpd = ??? no such switch !
# /5s  = Pentium stack-based calling
# /fp5 = Pentium floating point
# /bm  = build multi-threaded
# /mf  = flat memory model
# /os  = optimize for size
# /bt  = build target (nt)

One thing I don't understand: why do you optimize for size? Doesn't it almost always make sense to optimize for speed instead?

Because I like small and sleek executables :-) Are there any processor-intensive bits in wget ? Most of the time it'll wait for the Internet anyway.
BTW, compiling with DEBUG_MALLOC reveals three memory leaks:

0x13830432: mswindows.c:72  - *exec_name = xstrdup (*exec_name); in windows_main_junk
0x13830496: mswindows.c:168 - wspathsave = (char*) xmalloc (strlen (buffer) + 1); in ws_mypath
0x13830848: utils.c:1525    - (struct wget_timer *)xmalloc (sizeof (struct wget_timer));

Here's another edition of Makefile.watcom (See attached file: Makefile.watcom)

-- Csaba Ráduly, Software Engineer Sophos Anti-Virus email: [EMAIL PROTECTED] http://www.sophos.com US support: +1 888 SOPHOS 9 UK Support: +44 1235 559933
New and improved Makefile.watcom
This is a rewrite of Makefile.watcom. It puts an end to the two separate OBJ file lists (one for dependencies, the other for the linker command) which needed to be kept in sync. The explicit dependency list is also gone (Watcom C can pass dependencies to Watcom Make when using .AUTODEPEND). wget/windows/(See attached file: Makefile.watcom) -- Csaba Ráduly, Software Engineer Sophos Anti-Virus email: [EMAIL PROTECTED] http://www.sophos.com US support: +1 888 SOPHOS 9 UK Support: +44 1235 559933
Re: windows, continue bug
You mean this ?

---8---
DEBUG output created by Wget 1.7-dev on Windows.

parseurl (http://turtle.power.org/) - host turtle.power.org - opath - dir - file - ndir
newpath: /
Checking for turtle.power.org in host_name_address_map.
Checking for turtle.power.org in host_slave_master_map.
First time I hear about turtle.power.org by that name; looking it up.
Caching turtle.power.org - 10.1.1.9
Checking again for turtle.power.org in host_slave_master_map.
--10:35:49--  http://turtle.power.org/
           => `turtle.power.org/index.html'
Connecting to turtle.power.org:80...
Found turtle.power.org in host_name_address_map: 10.1.1.9
Created fd 88. connected!
---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.7-dev
Host: turtle.power.org
Accept: */*
Connection: Keep-Alive

HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 04 May 2001 09:35:48 GMT
Server: Apache/1.3.14 (Unix) PHP/4.0.4pl1
X-Powered-By: PHP/4.0.4pl1
Connection: close
Content-Type: text/html

The server does not support continued download; refusing to truncate `turtle.power.org/index.html'.

FINISHED --10:35:49--
Downloaded: 0 bytes in 0 files
---8---

It's not just on Windows; it happens on OS/2 (compiled with GCC) too. Debugging it suggests that hstat.no_truncate doesn't get initialized (it contains a dodgy, random-looking value): http_loop calls gethttp() at line 1539, but the following comes only at line 1554:

if( opt.always_rest )
    hstat.no_truncate = file_exists_p(locf);

After moving these two lines *above* the call to gethttp() at line 1539, the file was downloaded correctly.

-- Csaba Ráduly, Software Engineer Sophos Anti-Virus email: [EMAIL PROTECTED] http://www.sophos.com US support: +1 888 SOPHOS 9 UK Support: +44 1235 559933
Re: wget bug - after closing control connection
Which version of wget do you use ? Are you aware that wget 1.6 has been released and 1.7 is in development (and they contain a workaround for the "Lying FTP server syndrome" you are seeing) ? -- Csaba Ráduly, Software Engineer Sophos Anti-Virus email: [EMAIL PROTECTED] http://www.sophos.com US support: +1 888 SOPHOS 9 UK Support: +44 1235 559933
Re: Wget
I'm confused. I thought 1.5.3 *did* display the dots, but I could be wrong. Please send queries like this to the list ( [EMAIL PROTECTED] ), not to me personally.

-- Csaba Ráduly, Software Engineer Sophos Anti-Virus email: [EMAIL PROTECTED] http://www.sophos.com US support: +1 888 SOPHOS 9 UK Support: +44 1235 559933

:-( sorry for the top-posting )-:

On 06/03/01 10:58, [EMAIL PROTECTED] (Timo Maier) wrote:

Hi! The newest wget is 1.6 release and 1.7 developer. I have GNU Wget 1.5.3 which doesn't display the dots; it looks like this:

---
Connecting to www.telekom.de:80... connected!
HTTP request sent, awaiting response... 206 Partial content
Length: 4,509,742 (4,267,794 to go) [application/octet-stream]
3.05Mb (236.28kb) done at 5.19 KB/s. time: 0:09:16 (0:04:05 left)
---

Is it possible to implement this in new versions, too?

TAM -- OS/2 Warp4, Ducati 750SS '92 You still have the freedom to learn and say what you wanna say http://tam.belchenstuermer.de