Re: ** Nigerian Scam variation (Re: Co-operation Needed!)

2002-07-17 Thread csaba . raduly


On 16/07/2002 16:36:15 Fernando Cassia wrote:

FYI, and in case someone has been living in a bottle: this is a variation of
the Nigerian scam.

http://www.secretservice.gov/alert419.shtml
http://www.fdic.gov/consumers/consumer/news/cnwin0102/TooGood.html

Don't even bother contacting them.

Regards
Fernando

Jesse Ndoro. wrote:

 Dear Sir,
[snip Nigerian scam quoted in its entirety]

Please don't do that.
1) This is a mailing list where the subscribers can actually think
2) FYI, you should not top-post and quote the original *in its entirety*
   This is a mailing list, where discussions are frequent. Replying
   at the top makes it difficult to follow who said what and in reply
   to whom.
3) We already received two copies of the scam. There was no need
   to send a third, unabridged copy.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US Support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




Re: speed units

2002-06-11 Thread csaba . raduly


On 10/06/2002 23:07:47 Joonas Kortesalmi wrote:

Wget seems to report speeds with the wrong units. It uses, for example, KB/s
rather than kB/s, which would be correct. Any possibility of fixing that? :)

K = Kelvin
k = Kilo

Probably you want to use a small k with download speeds, right?


Let's not go there again, lest wget will have to report download in
kibibytes (ISTR wget using 1024 to divide).
k = kilo is reserved for dividing by 1000.
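To make the two conventions concrete, here is a small Python sketch. It is not wget's code; the function name format_rate and its binary flag are made up for illustration. It only shows that the label has to follow the divisor: 1024 goes with KiB, 1000 goes with kB.

```python
def format_rate(bytes_per_second, binary=True):
    """Format a transfer rate with a unit label that matches the divisor."""
    # 1024 -> kibibytes (KiB); 1000 -> SI kilobytes (kB, lowercase k).
    divisor, unit = (1024, "KiB/s") if binary else (1000, "kB/s")
    return f"{bytes_per_second / divisor:.2f} {unit}"


# The same rate, labelled both ways (2048 bytes per second is an
# arbitrary illustrative value).
print(format_rate(2048, binary=True))
print(format_rate(2048, binary=False))
```

Whichever divisor a program picks, printing "kB" next to a value divided by 1024 is the combination that causes the argument.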





Re: Can't get remote files - what am I doing wrong?

2002-06-05 Thread csaba . raduly


On 03/06/2002 14:56:47 dale wrote:

[snip]
wget ftp://user:[EMAIL PROTECTED]/folder1/folder2/*s.csv

I get an error message of no match and if I use:

wget --glob=on ftp://user:[EMAIL PROTECTED]/folder1/folder2/*s.csv

I also get no match


In the future, please post the output with the -d switch added.
(did you read the instructions ?)

[snip]
The Mac machine I am using for testing is behind our firewall, but there
is a hole opened to allow my internal IP to reach the specific remote IP.
[snip]

Because you didn't include the output with the -d switch, I'm guessing.
Do you use a proxy to go through the firewall ? A lot of proxies issue
HTTP requests even for FTP. HTTP cannot glob.


p.s. The reply-to address has been anti-spammed (I hope anyway), please
post any replies to the list.


Somebody at Ultimate Search (the owner of nospam.net) will be mightily
surprised. What you did can be interpreted as email address forgery.
Please in the future use addresses which end in .invalid (this top level
domain is guaranteed to always be, err, invalid), e.g.
[EMAIL PROTECTED]





Re: Can't get remote files - what am I doing wrong?

2002-06-05 Thread csaba . raduly


On 05/06/2002 13:08:05 drt - lists wrote:

Thanks for no help.

If this is typical of how you reply to your customers

I do *not* reply to customers. I am a developer, and post here as a private
individual. Perhaps I should unsubscribe altogether.

[snip]
 The Mac machine I am using for testing is behind our firewall, but
 there is a hole opened to allow my internal IP to reach
 the specific remote IP.
 [snip]

 Because you didn't include the output with the -d switch, I'm guessing.
 Do you use a proxy to go through the firewall ? A lot of proxies issue
 HTTP requests even for FTP. HTTP cannot glob.

Yes we do,

So there is a proxy after all.

and no, it doesn't issue an ftp request as I have an opening for
this specific request - which if you had bothered to read my message
instead of trying to attack you would know that.


Here is the part that you ignored which addresses the accusation above.
   ^^
Huh ? I described a scenario which could have caused the failure you
described. I did not *accuse* you of using a proxy !

---
The Mac machine I am using for testing is behind our firewall, but there
is a hole opened to allow my internal IP to reach the specific remote
IP.
And using the first example above it does connect so I know I am getting
through the firewall.
---


Note that if wget is set up to use the proxy by default (env. var, wgetrc)
then it'll use the proxy even if it could connect directly through
the hole in the firewall. The first example (which I snipped) did not
use globbing. That would succeed regardless of whether wget connected
directly or through an HTTP-ized proxy.
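The scenario described above can be modelled in a few lines of Python. This is a simplified sketch, not wget's logic: the helper names proxy_for and can_glob are hypothetical, and it assumes (as the scenario does) that any configured FTP proxy speaks HTTP to the client, so no FTP directory listing is available to match a glob against.

```python
from urllib.parse import urlsplit


def proxy_for(url, environ):
    """Return the proxy configured for the URL's scheme, or None."""
    scheme = urlsplit(url).scheme
    return environ.get(f"{scheme}_proxy") or None


def can_glob(url, environ):
    """Globbing needs a direct FTP session; an HTTP-speaking proxy has none."""
    return urlsplit(url).scheme == "ftp" and proxy_for(url, environ) is None


url = "ftp://host.example/folder1/folder2/*s.csv"
print(can_glob(url, {}))
print(can_glob(url, {"ftp_proxy": "http://proxy.example:8080/"}))
```

The point of the model is that the hole in the firewall does not matter if a proxy setting is picked up first; the debug output would show which route was actually taken.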

We're not getting any closer to a solution. Please post the output
of the failed request in debugging mode (the -d switch), and
be careful to obscure any passwords.


[ad hominem attack snipped]

I apologise. Although I consider what I've written to be valid, the tone
was not. I claim temporary loss of diplomatic abilities.






Re: ? gets translated to @

2002-05-24 Thread csaba . raduly


On 24/05/2002 13:39:29 ladislav.gaspar wrote:

Hi

I do the following:
wget http://killefiz.de/zaurus/showdetail.php?app=221

but the file is saved as http://killefiz.de/zaurus/showdetail.php@app=221

(*.php?app gets translated to *.php@app)

Why is that and is there a workaround?


That *is* the workaround :-)
'?' is an invalid character for filenames on FAT, FAT32, NTFS.
Instead of giving an error message like this:
Cannot open killefiz.de/zaurus/showdetail.php?app=221
wget actually tries to do what you want (i.e. download the file).

You can run wget on another platform (Linux, some Unix, etc.).
The filesystems there usually don't have this restriction.
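The substitution itself is trivial. A minimal Python sketch of the idea follows; it is an illustration only, the function name local_filename and the restricted flag are invented here, and wget's real file-naming code handles more cases than this.

```python
def local_filename(url_path, restricted=True):
    """Replace '?' with '@' when the target filesystem cannot store '?'."""
    return url_path.replace("?", "@") if restricted else url_path


# On FAT/NTFS-style filesystems the query separator is rewritten ...
print(local_filename("showdetail.php?app=221"))
# ... while on filesystems without the restriction it can be kept.
print(local_filename("showdetail.php?app=221", restricted=False))
```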





Re: crawling servlet based urls

2002-05-16 Thread csaba . raduly


On 16/05/2002 17:06:31 Steve Mestdagh wrote:

Hi,
I'm trying to crawl intranet URLs of the form:
[snip, wget will try to save to filename like this:]
 `WKCCommand?command=getLessonLessonId=137'
[snip]

The filename above is invalid on many filesystems used by Micros~1.
(It's the '?' causing the problem).

This is corrected for sure in a newer version, either 1.8.1
or the current CVS.


Heiko Herold provides:
New CVS binary for windows at http://space.tin.it/computer/hherold








Re: apache irritations

2002-04-22 Thread csaba . raduly


On 22/04/2002 16:38:15 Maciej W. Rozycki wrote:

On Mon, 22 Apr 2002, Hrvoje Niksic wrote:

   How about using the -R option of wget?  A brief test proves -R
  '*\?[A-Z]=[A-Z]' works as it should.

 Or maybe the default system wgetrc should ship with something like:

 reject = *?[A-Z]=[A-Z]

Note the difference between strings! -- the backslash before the
question mark is essential as otherwise it's a glob character.


[A-Z] is a bit extreme, IMHO. How about

reject = *\?[NMSD]=[AD]
          ^^ literal '?' needed here
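The difference between a literal and an unescaped '?' is easy to check with Python's fnmatch module. This is only a sketch of shell-style matching, not wget's matcher, and one detail differs: Python's fnmatch has no backslash escape, so a literal question mark is spelled "[?]" here instead of "\?".

```python
from fnmatch import fnmatchcase

# "[?]" is fnmatch's spelling of a literal question mark, standing in
# for the "\?" used in the reject line above.
LITERAL = "*[?][NMSD]=[AD]"
# Without the escape, "?" is a wildcard that matches any single character.
UNESCAPED = "*?[NMSD]=[AD]"


def rejected(name, pattern):
    """Return True if the name matches the reject pattern."""
    return fnmatchcase(name, pattern)


# "xM=A" is a made-up file name that only the unescaped pattern rejects.
for name in ("?N=D", "index.html?S=A", "xM=A", "page.html"):
    print(name, rejected(name, LITERAL), rejected(name, UNESCAPED))
```

With the literal form only the Apache sort links are rejected; with the unescaped form an ordinary file whose name happens to end in something like "xM=A" would be rejected as well.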



Well, I don't think it's sane but adding a *commented-out* reject line
with an appropriate annotation to the default system wgetrc looks like a
good idea to me.


A good idea.





Re: Goodbye and good riddance

2002-04-15 Thread csaba . raduly


On 12/04/2002 19:21:41 James C. McMaster (Jim) wrote:

My patience has reached an end.  Perhaps, now that you have (for the first
time) indicated you will do something to fix the problem, the possible
light at the end of the tunnel will convince others to stay.

The light at the end of the tunnel is just the explosion around the
Pu239 :-)





Re: HTTP 1.1

2002-04-15 Thread csaba . raduly


On 12/04/2002 21:37:31 hniksic wrote:

Tony Lewis [EMAIL PROTECTED] writes:

 Hrvoje Niksic wrote:

  Is there any way to make Wget use HTTP/1.1 ?

 Unfortunately, no.

 In looking at the debug output, it appears to me that wget is really
 sending HTTP/1.1 headers, but claiming that they are HTTP/1.0
 headers. For example, the Host header was not defined in RFC 1945,
 but wget is sending it.

Yes.  That is by design -- HTTP was meant to be extended in that way.
Wget is also requesting and accepting `Keep-Alive', using `Range', and
so on.

Csaba Raduly's patch would break Wget because it doesn't support the
chunked transfer-encoding.  Also, its understanding of persistent
connection might not be compliant with HTTP/1.1.

IT WAS A JOKE !
Serves me right. I need to put bigger smilies :-(






Re: qestio

2002-04-05 Thread csaba . raduly


On 05/04/2002 12:44:22 Varga Gabor wrote:

Hi

I am Gabor from Hungary. I have a question.
I have an URL ending like this */show.php?id=843
I know how it works(correct me if I am wrong) the *.php
(gets or posts) the arg. ID
and the server returns the page 843 but why can't wget
mirror these pages ?


Because it'll try to save with the filename show.php?id=843, and '?' is
invalid in a filename on DOS/Windows/OS2

What version of wget are you using ? What platform (operating system) ?
What does the debug log say ? (run wget with the -d switch added)

CC'd to wget, not bug-wget





wget parsing JavaScript

2002-03-26 Thread csaba . raduly

wget stumbled upon the following HTML file:

--- 8< ---
<html>
<head>
<title>foo</title>
</head>

<body>

<SCRIPT language="JavaScript1.2">
var sitems=new Array()
var sitemlinks=new Array()

///Edit below/

//extend or shorten this list
sitems[0]="15.html"
sitems[1]="16.html"
sitems[2]="17.html"
sitems[3]="18.html"
sitems[4]="19.html"
sitems[5]="20.html"
sitems[6]="21.html"
sitems[7]="22.html"
sitems[8]="23.html"
sitems[9]="24.html"
sitems[10]="25.html"
sitems[11]="26.html"
sitems[12]="27.html"


//These are the links pertaining to the above text.
sitemlinks[0]="31.html"
sitemlinks[1]="32.html"
sitemlinks[2]="33.html"
sitemlinks[3]="34.html"
sitemlinks[4]="35.html"
sitemlinks[5]="36.html"
sitemlinks[6]="37.html"
sitemlinks[7]="38.html"
sitemlinks[8]="39.html"
sitemlinks[9]="40.html"
sitemlinks[10]="41.html"
sitemlinks[11]="42.html"
sitemlinks[12]="43.html"

//If you want the links to load in another frame/window, specify name of
//target (ie: target=_new)
var target=""

for (i=0;i<=sitems.length-1;i++)
document.write('<a href='+sitemlinks[i]+' target="'+target+'">'+sitems[i]+'</a><br>')

</SCRIPT>
<NOSCRIPT>
Congratulations, you have turned off JavaScript.
</NOSCRIPT>
</body>

</html>
--- 8< ---

I see that wget handles SCRIPT with tag_find_urls, i.e. it tries to
parse whatever it's inside.
Why was this implemented ? JavaScript is mostly
used to construct links programmatically. wget is likely to find
bogus URLs until it can properly parse JavaScript.
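To illustrate how bogus URLs arise, here is a deliberately naive Python sketch that treats every quoted *.html string inside a script as a link. It is not wget's tag_find_urls logic; the function name naive_script_urls is invented, and the sample assumes the string values in the page were double-quoted.

```python
import re

# A shortened version of the script from the page above (quoting assumed).
SCRIPT = '''
sitems[0]="15.html"
sitemlinks[0]="31.html"
document.write('<a href='+sitemlinks[i]+'>'+sitems[i]+'</a><br>')
'''


def naive_script_urls(text):
    """Treat every double-quoted *.html string as a link (deliberately naive)."""
    return re.findall(r'"([^"]+\.html)"', text)


print(naive_script_urls(SCRIPT))
```

The scan picks up both strings, although only the sitemlinks value ever ends up in an href; the sitems value is just link text. Telling them apart requires actually evaluating the script.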






Re: OK, time to moderate this list

2002-03-22 Thread csaba . raduly


On 22/03/2002 07:06:13 Daniel Stenberg wrote:

On Fri, 22 Mar 2002, Hrvoje Niksic wrote:
[snip]
 I think I agree with this.  The amount of spam is staggering.  I have no
 explanation as to why this happens on this list, and not on other lists
 which are *also* open to non-subscribers.

Spammers work in mysterious ways. ;-)


No, they work in fairly predictable ways.
The wget mailinglist address is advertised on the wget homepage.
According to empirical observations, if you publish a brand new email
address on a web page, it'll receive spam within eight *hours* of it being
published.





Re: KB or kB

2002-02-08 Thread csaba . raduly


On 08/02/2002 08:30:59 Henrik van Ginhoven wrote:

On Fri, Feb 08, 2002 at 08:54:06AM +0100, Hrvoje Niksic wrote:
 Wget currently uses KB as abbreviation for kilobyte.  In a Debian
 bug report someone suggested that kB should be used because it is
 more correct.

This is the kind of stuff that leads to month-long flamewars :-)


kB rather than KB? I think whoever filed that bug report got it wrong; as
far as I know kB would always mean 1000 (bytes), since k = thousand, and
never ever 1024. If he'd said KiB I'd agree with him to a certain degree,
but kB simply can't be right.

Note that we can claim the distinction that k=1000 and K=1024
That won't work with 1E6 vs 2**20 because SI uses uppercase M for 1E6.


Rather than me trying to sum it up and risk typing something wrong, this
page seems to address the issue well:

http://www.romulus2.com/articles/guides/misc/bitsbytes.shtml


Please, no kibibytes :-)
Maybe wget should just count 512-byte blocks, a la df.
That would improve the understandability of the display ... NOT
But it would keep the terminally anal-retentives at bay :-)

Seriously, just ignore it. I can certainly live with 5%
experimental error ( 2**20 = 1.0486E6 ) at megabyte level.
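The size of that error is simple arithmetic, and it grows with each prefix. A short Python sketch (the function name binary_excess_percent is made up for this example):

```python
def binary_excess_percent(power):
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024 ** power / 1000 ** power - 1) * 100


# power 1 = kilo, 2 = mega, 3 = giga
for power, name in ((1, "kilo"), (2, "mega"), (3, "giga")):
    print(f"{name}: {binary_excess_percent(power):.2f}%")
```

At megabyte level the excess is the roughly 5% quoted above (2**20 = 1048576 against 1E6).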






Re: wget not working

2002-02-08 Thread csaba . raduly


On 08/02/2002 15:34:53 Martin Schöneberger wrote:

At 14:37 08.02.2002 +, Henderson, Daniel wrote:
#wget www.sophos.com/downloads/ide/ides.zip
--14:32:57--  http://www.sophos.com/downloads/ide/ides.zip
           => `ides.zip'
Connecting to www.sophos.com:80...
www.sophos.com: Host not found.

Is there something else I should configure in Solaris to allow this to
work?

First of all you should find out why you can't connect to sophos.com.
1) sophos is down - try later
Solution: get the file from another server
2) dns lookup failed - check whether you can connect to other hosts like
google.com or anything else, or whether you can only connect to IP addresses.
Solution1: try another DNS server
Solution2: reconfigure your DNS settings or even your DNS-server (if you
are running one)

Try
nslookup www.sophos.com
ping www.sophos.com
telnet   www.sophos.com 80

If these work, it's wget's fault.
If they don't, it's a connectivity problem.

4) user root not allowed to connect to the internet (standard on BSD if I
remember correctly) - check whether you can download the file as another user
Solution: change the user database or the firewall settings, or just don't
connect to the internet using root :-)

Good point. Look at the prompt...

[snip]

Last but not least: Try the -d switch with wget and have a look at the
debug output of wget. Perhaps you find further information why you can't
connect. If you don't, send it to this list; perhaps we'll find something :-)


Very good advice indeed.
HTH,
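The nslookup / telnet checks suggested earlier in this message can be approximated in Python with the standard socket module. This is a rough sketch, not a replacement for the real tools: the function name diagnose is made up, and the resolve and connect parameters exist only so the steps can be swapped out or tested without a network.

```python
import socket


def diagnose(host, port=80, resolve=socket.getaddrinfo,
             connect=socket.create_connection):
    """Return "dns", "connect" or "ok" depending on which step fails."""
    try:
        resolve(host, port)          # roughly what nslookup checks
    except socket.gaierror:
        return "dns"
    try:
        conn = connect((host, port), timeout=5)   # roughly "telnet host 80"
    except OSError:
        return "connect"
    conn.close()
    return "ok"


if __name__ == "__main__":
    print(diagnose("www.sophos.com", 80))
```

If this reports "ok" from the same machine and account while wget still says "Host not found", the problem is more likely in wget or its configuration than in the network.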






Re: KB or kB

2002-02-08 Thread csaba . raduly


On 08/02/2002 13:58:55 Andre Majorel wrote:

On 2002-02-08 08:54 +0100, Hrvoje Niksic wrote:

 Wget currently uses KB as abbreviation for kilobyte.  In a Debian
 bug report someone suggested that kB should be used because it is
 more correct.  The reporter however failed to cite the reference for
 this, and a search of the web has proven inconclusive.

 Does someone understand the spelling issues involved enough to point
 out the correct spelling and back it up with arguments?

The applicable standard is the SI (Système International)

[snip SI prefixes]

Capital K is not a prefix, it's the SI abbreviation for the
temperature unit, the kelvin (note : lower case k) named after
Lord Kelvin.

So it's definitely kB for kilobyte.

As long as it means 1000 and NOT 1024


Whether that means 1000 bytes or 1024 bytes is another issue.

Not while claiming to conform to SI.

Csaba





Re: BUG https + index.html

2002-02-01 Thread csaba . raduly


On 01/02/2002 12:10:59 Mr.Fritz wrote:

After the https/robots.txt bug, doing a recursive wget to an https-only
server gives me this error: it searches for http://servername/index.html
but there is no server on port 80, so wget receives a Connection refused
error and quits.  It should search for https://servername/index.html


Are you sure this was an SSL-enabled wget ?
Please provide a debug log by running wget with the -d parameter.






Re: mirroring vs -m

2002-01-29 Thread csaba . raduly


On 29/01/2002 15:54:17 Andre Majorel wrote:

[snip debate about following links in HTML retrieved by FTP]

I'm inclined to think that recursive retrieval without parsing
is a feature. HTML content is normally served over HTTP. If you
want to retrieve HTML through FTP, it's likely because you do
*not* want to follow the links.


I (client) don't get the choice. If the document at
http://foo.bar/index.html has all its links like this:

<A HREF="ftp://foo.bar/welcome.html">welcome</A>

the client has no choice but to retrieve them via FTP.
It would be nice if wget was able to follow all those links.


If Wget always parsed HTML, even over FTP, it would be
impossible to make a complete mirror of a tree that has broken href
links or hidden files.

Perhaps if wget started with FTP, it should mirror FTP-like
(.listing and all that). If it started via HTTP, it should follow links,
regardless of future retrieval modes.

[snip]





RE: Bug report: 1) Small error 2) Improvement to Manual

2002-01-17 Thread csaba . raduly


On 17/01/2002 07:34:05 Herold Heiko wrote:
[proper order restored]
 -Original Message-
 From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, January 17, 2002 2:15 AM
 To: Michael Jennings
 Cc: [EMAIL PROTECTED]
 Subject: Re: Bug report: 1) Small error 2) Improvement to Manual


 Michael Jennings [EMAIL PROTECTED] writes:

  1) There is a very small bug in WGet version 1.8.1. The bug occurs
 when a .wgetrc file is edited using an MS-DOS text editor:
 
  WGet returns an error message when the .wgetrc file is terminated
  with an MS-DOS end-of-file mark (Control-Z). MS-DOS is the
  command-line language for all versions of Windows, so ignoring the
  end-of-file mark would make sense.

 Ouch, I never thought of that.  Wget opens files in binary mode and
 handles the line termination manually -- but I never thought to handle
 ^Z.

 As much as I'd like to be helpful, I must admit I'm loath to encumber
 the code with support for this particular thing.  I have never seen it
 before; is it only an artifact of DOS editors, or is it used on
 Windows too?



[snip copy con file.txt]

However in this case (at least when I just tried) the file won't contain
the ^Z. OTOH some DOS programs still will work on NT4, NT2k and XP, and
could be used, and would create files ending with ^Z. But do they really
belong here and should wget be bothered ?

What we really need to know is:

Is ^Z still a valid, recognized character indicating end-of-file (for
textmode files) for command shell programs on windows NT 4/2k/Xp ?
Somebody with access to the *windows standards* could shed more light on
this question ?

My personal idea is:
As a matter of fact no *windows* text editor I know of, even the
supplied windows ones (notepad, wordpad) AFAIK will add the ^Z at the
end of file.txt. Wget is a *windows* program (although running in
console mode), not a *Dos* program (except for the real dos port I know
exists but never tried out).


I don't think there's a distinction between DOS and Windows programs
in this regard. The C runtime library is most likely to play a
significant role here. For a file fopen-ed in "rt" mode, the RTL
would convert \r\n -> \n and silently eat the _first_ ^Z,
returning EOF at that point.

When writing, it goes the other way 'round WRT \n -> \r\n.
I'm unsure about whether it writes ^Z at the end, though.
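The text-mode read behaviour described above can be emulated in a few lines of Python. This is a sketch of that description only, not of any particular C runtime and not of wget's config reader; the function name read_dos_text is hypothetical.

```python
CTRL_Z = b"\x1a"


def read_dos_text(raw: bytes) -> bytes:
    """Emulate a DOS-style text-mode read: stop at the first ^Z, map CRLF to LF."""
    visible = raw.split(CTRL_Z, 1)[0]   # everything from the first ^Z on is dropped
    return visible.replace(b"\r\n", b"\n")


# A .wgetrc saved by a DOS editor, with a trailing ^Z.
raw = b"tries = 3\r\nverbose = off\r\n\x1a"
print(read_dos_text(raw))
```

A reader that opens the file in binary mode and handles line endings itself, as described earlier in the thread, sees the ^Z as ordinary data unless it filters it out in a similar way.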

So personally I'd say it would not be really necessary adding support
for the ^Z, even in the win32 port; except possibly for the Dos port, if
the porter of that beast thinks it would be useful.


Problem could be solved by opening .wgetrc in "rt" mode.
However, the "t" is a non-standard extension.

However, this is not wget's problem IMO. Different editors may behave
differently. Example: on OS/2 (which isn't a DOS shell, but can run
DOS programs), the system editor (e.exe) *does* append a ^Z at the end
of every file it saves. People have patched the binary to remove this
feature :-) AFAIK no other OS/2 editor does this.






Re: Is wget --timestamping URL working on Windows 2000?

2001-12-11 Thread csaba . raduly


On 11/12/2001 14:03:54 Adrian Aichner wrote:

Hi Wgeteers!

Is
  -N,  --timestamping   don't retrieve files if older than local.
supposed to work on windows 2000?

[snip]

cd c:\Hacking\SunSITE.dk\xemacsweb\Download\win32\
%TEMP%\wget.wip\src\wget.exe --debug --timestamping
--output-document=setup.exe
http://ftp.xemacs.org/windows/setup.exe
Compilation started at Tue Dec 11 14:53:07 2001 +0100 (W. Europe Standard
Time)
DEBUG output created by Wget 1.8 on Windows.

--14:53:07--  http://ftp.xemacs.org/windows/setup.exe
           => `setup.exe'
Resolving ftp.xemacs.org... done.
Caching ftp.xemacs.org => 207.96.122.9
Connecting to ftp.xemacs.org[207.96.122.9]:80... connected.
Created socket 420.
Releasing 007D1C00 (new refcount 1).
---request begin---
[snip HEAD request and response]


Found ftp.xemacs.org in host_name_addresses_map (007D1C00)
Registered fd 420 for persistent reuse.
Length: 181,760 [application/octet-stream]
Closing fd 420
Releasing 007D1C00 (new refcount 1).
Invalidating fd 420 from further reuse.
The sizes do not match (local 0) -- retrieving.

 ^^^
 ^^^
Something is wrong there.
Try it without --output-document; it should put it in the current dir
anyway.


--14:53:08--  http://ftp.xemacs.org/windows/setup.exe
           => `setup.exe'
Found ftp.xemacs.org in host_name_addresses_map (007D1C00)
Connecting to ftp.xemacs.org[207.96.122.9]:80... connected.
Created socket 420.
Releasing 007D1C00 (new refcount 1).
---request begin---
GET /windows/setup.exe HTTP/1.0
[snip]

14:53:47 (6.14 KB/s) - `setup.exe' saved [181760/181760]


Compilation finished at Tue Dec 11 14:53:47







Re: log errors

2001-12-11 Thread csaba . raduly


On 11/12/2001 15:09:25 hniksic wrote:

Summer Breeze [EMAIL PROTECTED] writes:

 I want to know if Wget is a program similar to Mozilla, and if so is
 there any way to make my pages available to Wget? I use Netscape to
 create my web pages.

Wget is a command-line downloading utility; it allows you to download
a page or a part of the site without further user interaction.

 Here is a sample entry:

 66.28.29.44 - - [08/Dec/2001:18:21:20 -0500] "GET /index4.html%0A
 HTTP/1.0" 403 280 "-" "Wget/1.6"

/index4.html%0A looks like a page is trying to link to /index4.html,
but the link contains a trailing newline.

That IP address is assigned to Road Runner (big cable ISP, I think)

Is /index4.html%0A the *first* error line in the log from 66...44 ?

Wget will try to download a URL in two cases: either because it was told to
explicitly, or because it was doing a recursive download and found that
link in a page downloaded earlier.

/index4.html%0A looks like something somewhere was misparsed. It might
conceivably be wget (unlikely, as this sort of problem would've surfaced
long ago).

If /index4.html%0A *is* the first URL requested by that IP address, then
the blame is clearly elsewhere (unless -i was used). If not, can you search
your site for a link to /index4.html that might be badly formatted HTML?
(Wget should be able to defend itself against bad HTML, though.)
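How a stray newline turns into %0A can be shown with Python's urllib. This is only a sketch of the encoding step; whether Wget 1.6 takes exactly this path is not confirmed here, and the function names request_path and cleaned are invented for the example.

```python
from urllib.parse import quote


def request_path(href):
    """Percent-encode an href the way it would show up in a server log."""
    return quote(href, safe="/")


def cleaned(href):
    """Drop leading and trailing whitespace before encoding."""
    return request_path(href.strip())


href = "/index4.html\n"      # a link value with a trailing newline
print(request_path(href))
print(cleaned(href))
```

The same thing would happen with a URL list passed via -i if a line ending were not stripped, which is why the -i case is singled out above.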


(Please don't CC me; I'm on the list)





Re: Uncoupling translations from source

2001-12-10 Thread csaba . raduly


On 10/12/2001 08:10:12 Martin v. Loewis wrote:

 Maybe you wanted to say that many Europeans speak English so well,
 that they do not need translations?

It is my observation as well: Some users are hostile towards the
notion of translated software. Those are typically not native English
speakers, but people who found, at one time or the other, reason to
complain about translations. They do so for all operating systems,
making fun of erroneous translations (such as the infamous Pfeife
zerbrochen of SINIX, or translations that an MS employee came up
with).


From an ancient DR-DOS (version 3.something)

Nicht breit __reading__ laufwerk A:

This was clearly an oversight (the message was probably pasted together
from various places).

My native language is Hungarian, and I don't remember using ANY software in
Hungarian (with the possible exception of Recognita, which is written by
Hungarians). For the few I tried, I found the Hungarian translation
incredibly awkward (this is exacerbated by the fact that Hungarian is
neither a Germanic nor a Romance language), even if not at the level of "all
your base are belong to us" :-) It was easier to use the English version
(this was all commercial software).

Complaining about the *presence* of translation is silly, IMO. Presumably
gettext has a way to decide what language to use (LANG environment
variable, or suchlike; LANG=en_GB should do).

Decoupling translations is a good idea, if the logistics can be sorted out.

Csaba





Re: Wget 1.8-beta1 now available

2001-12-03 Thread csaba . raduly


On 01/12/2001 19:44:44 John Poltorak wrote:

On Sat, Dec 01, 2001 at 04:30:47PM +0100, Hrvoje Niksic wrote:
 John Poltorak [EMAIL PROTECTED] writes:

  Is it possible to include OBJEXT in Makefile.in to make this more
  cross-platform?

 I suppose so.  I mean, o is already defined to .@U@o, but I'm not
 exactly sure what the U is supposed to stand for.


It looks to me as though @U@ is set up for some variable substitution,
but I can't work out what for... Maybe it's getting replaced by NULL.



I know next to nothing about how Auto* is (supposed to be) working, but
I've seen lots of sed commands in


If @U@ is doing a variable substitution, then it'll expand to something
_before_ o (if @U@ -> bar, then this will result in a dependency
involving .baro).

(looking through configure)
Wget's configure contains this towards the end:

s%@U@%$U%g

U seems to be related to ansi2knr:

if(can use prototypes)
 U= ANSI2KNR=
else
 U=_ ANSI2KNR=./ansi2knr
endif

This will result in dependencies written as ._o if ansi2knr was run over
the sources.


This forces me to conclude that using @U@ _CAN_NOT_ and _WILL_NOT_ change
.o to .obj
I think .@U@o might need to be replaced with .@U@@objext@ (if there is such
a beast, in analogy with @exeext@)
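The effect of the substitution can be replayed outside configure. The Python sketch below emulates only the s%@U@%$U%g step quoted above; the function name configure_substitute is made up, and the two values of U are the ones from the ansi2knr test.

```python
def configure_substitute(template, U):
    """Emulate the sed command s%@U@%$U%g from configure."""
    return template.replace("@U@", U)


# Prototypes available: U is empty.
print(configure_substitute(".@U@o", ""))
# ansi2knr in use: U is "_".
print(configure_substitute(".@U@o", "_"))
```

Both results still end in "o", so no value of U gets from .o to .obj; that needs a separate substitution for the extension itself.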

Csaba







Re: wget1.7.1: Compilation Error (please Cc'ed to me :-)

2001-11-28 Thread csaba . raduly


On 28/11/2001 10:28:44 Daniel Stenberg wrote:

On Wed, 28 Nov 2001, zefiro wrote:

 ld: Undefined symbol
_memmove

 Do you have any suggestion ?

SunOS 4 is known to not have memmove.


Isn't configure supposed to notice that ?






Re: wget mirroring busted

2001-11-15 Thread csaba . raduly


On 14/11/2001 16:27:34 jwz wrote:

[EMAIL PROTECTED] wrote:

 Can you post the entire debug log (on a web/ftp site, of course, not the
 list).

Done -- http://www.jwz.org/wget-log.gz

Does this mean you can't reproduce this when you run wget the same
way I did?


No, I just wanted to take a look at the surrounding lines in the log.

wget -nv -m -nH -np \
   http://www.dnalounge.com/flyers/ \
   http://www.dnalounge.com/gallery/


I may try that myself.

P.S. Please *don't* CC in the future, I'm on the list.







Re: A tricky download

2001-10-12 Thread csaba . raduly


On 12/10/2001 16:49:07 Edward J. Sabol wrote:

[snip question about downloading a site with Javascript-only links]

Probably not. If the only links to the other chapters are in JavaScript
commands, then there's no way wget can do it. Wget does not interpret
JavaScript and most likely never will.

Implementing it is left as an exercise for the reader.
;-)





Re: Recursive retrieval of page-requisites

2001-10-09 Thread csaba . raduly


On 09/10/2001 14:25:57 Andre Pang wrote:

On Tue, Oct 09, 2001 at 03:46:52PM +0300, Mikko Kurki-Suonio wrote:

  To me that sounds like a logical combination of -r -np -p?
  Any correction appreciated.

 Doesn't work, apparently because -np overrides -p.

 I.e. with -np set, no document outside the selected subtree will be
 loaded, whether it is referred to through regular link-traversal or as a
 page-requisite element.

 My guess is that -p adds those links to the list of documents to load,
 but -np later rejects them because they're not within the selected subtree.

 What I'd basically like is a setting that loads page-requisites
 REGARDLESS OF ALL OTHER SETTINGS. I.e. you use the myriad of settings to
 fine tune the exact set of pages requested, and then request all
 requisites for the selected set of pages.

Try this patch.  It should make -p _always_ get pre-requisites,
even if you have -np on (which was the reason why I wrote the
patch).  [snip]

Actually, a case can be made either way.
Sometimes you might want -p to only get images conforming to -np, perhaps
to skip (advertising) banners (those are usually served by another server,
and thus ignored anyway unless --span-hosts is given).

Perhaps make -p override -np, but have an alternative -p (e.g. -pnp )
which obeys -np.

I didn't see Andre's patch so I cannot comment on it (stripped by my mail
system)-:
It modifies existing (admittedly confusing) behaviour, my suggestion would
permit getting the old behaviour back.

Another possibility would be to keep the existing behaviour (i.e. -np
overrides -p) and have a stronger -p (e.g. -pp ) which ignores -np.
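The two policies differ in a single decision. Here is a hypothetical Python model of it; this is not wget's code, every name below (should_fetch and its parameters) is invented for illustration, and host spanning, depth limits and accept/reject lists are ignored.

```python
def should_fetch(url, root, is_requisite, no_parent, requisites_ignore_np):
    """Decide whether to download url under one of the policies discussed."""
    inside = url.startswith(root)
    if not no_parent or inside:
        return True
    # Outside the subtree with -np in effect: only a page requisite can get
    # through, and only under the policy where -p wins over -np.
    return is_requisite and requisites_ignore_np


root = "http://example.com/docs/"
image = "http://example.com/images/logo.png"

# Existing behaviour (-np overrides -p): the image is skipped.
print(should_fetch(image, root, True, True, False))
# Proposed behaviour (-p overrides -np): the image is fetched.
print(should_fetch(image, root, True, True, True))
```

Either suggestion above amounts to exposing the last parameter as a command-line switch; the only question is which value is the default.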

Csaba






Re: Bus errors and recursion

2001-06-18 Thread csaba . raduly


[about alloca vs malloc]

If you allocate with malloc and then accidentally overwrite it, you get a
corrupted heap.
If you allocate with alloca and then accidentally overwrite it, you get a
corrupted stack.

Guess which is easier to notice :-)

Besides, alloca is a GCC builtin (IIRC), so you don't have to worry
about its implementation (the GCC folks do :). As long as you have the
stack to allocate from, it's as transparent as declaring automatic arrays
with variable length. e.g.

p = alloca( strlen(s) );

is almost the same as

char a[ strlen(s) ], *p=a;
/* this is a legal GCC extension */
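
A slightly fuller sketch of the same comparison, assuming a glibc-style
<alloca.h> and a compiler that accepts C99 variable-length arrays. The two
function names are made up for the example. Unlike the one-liners above,
the buffers here reserve one extra byte for the terminating NUL, since the
copy is used as a string.

```c
#include <alloca.h>   /* glibc; other systems may declare alloca elsewhere */
#include <assert.h>
#include <string.h>

/* Build a reversed copy of S on the stack with alloca and compare. */
int
is_palindrome_alloca (const char *s)
{
  assert (s != NULL);
  size_t n = strlen (s);
  char *p = alloca (n + 1);        /* freed automatically on return */
  for (size_t i = 0; i < n; i++)
    p[i] = s[n - 1 - i];
  p[n] = '\0';
  return strcmp (s, p) == 0;
}

/* Same thing with a variable-length automatic array. */
int
is_palindrome_vla (const char *s)
{
  assert (s != NULL);
  size_t n = strlen (s);
  char a[n + 1], *p = a;           /* size is known only at run time */
  for (size_t i = 0; i < n; i++)
    p[i] = s[n - 1 - i];
  p[n] = '\0';
  return strcmp (s, p) == 0;
}
```

In both versions the storage disappears when the function returns, so
neither pointer may be handed back to the caller.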

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933






Re: WGET for OS/2 and Proxy-Server

2001-05-16 Thread csaba . raduly

   
 
From: Hrvoje Niksic (hniksic@arsdigita.com)
Sent by: [EMAIL PROTECTED]sdigita.de
To: Wget List [EMAIL PROTECTED], Thomas Bohn [EMAIL PROTECTED]
Subject: Re: WGET for OS/2 and Proxy-Server
Date: 15/05/01 13:00

 Thomas Bohn [EMAIL PROTECTED] writes:

  Hello,
 
  I tried to use WGET for OS/2 (tested V 1.5.3 and 1.6) with a proxy
  server. Without proxy server all works fine. But with...
 
  In a OS/2 commandline session I type the following commands:
 
  SET HTTP_PROXY=62.52.17.1:80

 Your proxy setting gets ignored.  Try using lower-case `http_proxy'.


It seems to me that getenv has some issues on OS/2.
Workaround: use .wgetrc commands instead.

All environment variable names (i.e. the part before the '=') are uppercase
on OS/2.

wget uses getenv("http_proxy"); the implementation of getenv seems to be
scanning _environ and doing a strncmp (i.e. case-sensitive comparison). If
getproxy in url.c is changed to getenv("HTTP_PROXY") then it does pick up
the environment setting.

Could we postulate that *ALL* environment vars influencing WGET be
uppercase ?
These are the places where getenv is used (excluding getopt.c)

init.c:237:  tmp = getenv ("no_proxy");
init.c:259:  char *home = getenv ("HOME");
init.c:292:  env = getenv ("WGETRC");
url.c:1292:  proxy = opt.http_proxy ? opt.http_proxy : getenv ("http_proxy");
url.c:1294:  proxy = opt.ftp_proxy ? opt.ftp_proxy : getenv ("ftp_proxy");
url.c:1297:  proxy = opt.https_proxy ? opt.https_proxy : getenv ("https_proxy");
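
A less drastic alternative to renaming everything would be a lookup that
tries the name as given and then its upper-case spelling. The following is
a sketch only; getenv_any_case() is a hypothetical helper, not something
present in wget.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of wget: look up NAME as given, then
   fall back to its upper-case spelling (e.g. http_proxy, HTTP_PROXY). */
char *
getenv_any_case (const char *name)
{
  assert (name != NULL);

  char *value = getenv (name);
  if (value)
    return value;

  /* Build an upper-case copy of the name for the second attempt. */
  size_t len = strlen (name);
  char *upper = malloc (len + 1);
  if (!upper)
    return NULL;
  for (size_t i = 0; i < len; i++)
    upper[i] = (char) toupper ((unsigned char) name[i]);
  upper[len] = '\0';

  value = getenv (upper);
  free (upper);
  return value;
}
```

The getenv calls listed above could then go through this helper, so that
SET HTTP_PROXY=... on OS/2 and http_proxy=... on Unix would both be seen.
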
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933





Yet another Makefile.watcom :-)

2001-05-16 Thread csaba . raduly


An hour of careful debugging can save you five minutes of reading the
documentation

(See attached file: Makefile.watcom)

This version gets rid of the ugly double list of object files (one for the
linker, one for the dependencies ).

<EXPLANATION target="Watcom users">
WLINK expects the object files to be specified like this:

wlink FILE 1.obj,2.obj,etc_etc,n.obj NAME program.exe ...
  ^^

This is the format auto-generated by their IDE, BTW.
However, wlink also accepts an alternate way:

wlink FILE { 1.obj 2.obj etc_etc n.obj } NAME program.exe ...

What's more, this is actually present in the documentation (gasp)!

</EXPLANATION>

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




RE: New and improved Makefile.watcom

2001-05-14 Thread csaba . raduly


   
   
From: Herold Heiko (Heiko.Herold@previnet.it)
To: 'Hrvoje Niksic' [EMAIL PROTECTED], Wget List [EMAIL PROTECTED]
cc:
Subject: RE: New and improved Makefile.watcom
Date: 14/05/01 12:05

 -----Original Message-----
 From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
 Sent: Monday, May 14, 2001 11:23 AM
 To: Wget List
 Subject: Re: New and improved Makefile.watcom
 
 
 [EMAIL PROTECTED] writes:
 
  This is a rewrite of Makefile.watcom
 
 Thanks; I've put it in the repository.
 
  # Copy this file to the ..\src directory (maybe rename to
 Makefile). Also:
  # copy config.h.ms ..\src\config.h
 
 Maybe we should provide a win-build script (or something) that does
 this automatically?
 

How about this ?

config.h : ..\windows\config.h.ms
 copy $[@ $^@

(this would be copy $< $@ for GNU make)

Yup, it works (for me ! :-)


 Isn't this what configure.bat is for ?

In theory, but...

 Default to VC (or use VC if --msvc is given), otherwise if env var
 BORPATH is present (or --borland is given) use borland, otherwise error.


I see no Watcom here :-) configure.bat doesn't know about Watcom C

Hrvoje also wrote:
  #disabled for faster compiler
  LFLAGS=sys nt op st=32767 op vers=1.7 op map op q op de 'GNU wget
1.7dev' de all
  CFLAGS=/zp4 /d1 /w4 /fpd /5s /fp5 /bm /mf /os /bt=nt [snip]
  # /zp4= pack structure members with this alignment
  # /d1 = line number debug info
  # /w4 = warning level
  # /fpd= ??? no such switch !
  # /5s = Pentium stack-based calling
  # /fp5= Pentium floating point
  # /bm = build multi-threaded
  # /mf = flat memory model
  # /os = optimize for size
 ^^^
  # /bt = build target (nt)

 One thing I don't understand: why do you optimize for size?  Doesn't
 it almost always make sense to optimize for speed instead?

Because I like small and sleek executables :-)
Are there any processor-intensive bits in wget ? Most of the time it'll
wait for the Internet anyway.


BTW, compiling with DEBUG_MALLOC reveals three memory leaks :
0x13830432: mswindows.c:72   - *exec_name = xstrdup (*exec_name);
            in windows_main_junk
0x13830496: mswindows.c:168  - wspathsave = (char*) xmalloc (strlen (buffer) + 1);
            in ws_mypath
0x13830848: utils.c:1525     - (struct wget_timer *)xmalloc (sizeof (struct wget_timer));

Here's another edition of Makefile.watcom
(See attached file: Makefile.watcom)
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933



New and improved Makefile.watcom

2001-05-13 Thread csaba . raduly


This is a rewrite of Makefile.watcom
It does away with the two separate OBJ file lists (one for dependencies,
the other for the linker command) which needed to be kept in sync.
The explicit dependency list is also gone (Watcom C can pass dependencies
to Watcom Make when using .AUTODEPEND)


wget/windows/(See attached file: Makefile.watcom)
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




Re: windows, continue bug

2001-05-04 Thread csaba . raduly


You mean this ?

---8---
DEBUG output created by Wget 1.7-dev on Windows.

parseurl ("http://turtle.power.org/") -> host turtle.power.org -> opath  -> dir  -> file  -> ndir
newpath: /
Checking for turtle.power.org in host_name_address_map.
Checking for turtle.power.org in host_slave_master_map.
First time I hear about turtle.power.org by that name; looking it up.
Caching turtle.power.org - 10.1.1.9
Checking again for turtle.power.org in host_slave_master_map.
--10:35:49--  http://turtle.power.org/
   => `turtle.power.org/index.html'
Connecting to turtle.power.org:80... Found turtle.power.org in host_name_address_map: 10.1.1.9
Created fd 88.
connected!
---request begin---
GET / HTTP/1.0

User-Agent: Wget/1.7-dev

Host: turtle.power.org

Accept: */*

Connection: Keep-Alive



HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 04 May 2001 09:35:48 GMT
Server: Apache/1.3.14 (Unix) PHP/4.0.4pl1
X-Powered-By: PHP/4.0.4pl1
Connection: close
Content-Type: text/html



The server does not support continued download;
refusing to truncate `turtle.power.org/index.html'.


FINISHED --10:35:49--
Downloaded: 0 bytes in 0 files
---8---

It's not just on Windows; happens on OS/2 ( compiled with GCC ) too.

Debugging it suggests that hstat.no_truncate doesn't get initialized
(dodgy random-looking value contained in no_truncate) :

http_loop calls gethttp() at line 1539, but the following is only
at line 1554:

if( opt.always_rest )
hstat.no_truncate = file_exists_p(locf);

After moving these two lines *above* the call to gethttp() on line 1539,
the file was downloaded correctly.

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933






Re: wget bug - after closing control connection

2001-03-08 Thread csaba . raduly


Which version of wget do you use ? Are you aware that wget 1.6 has been
released and 1.7 is in development (and they contain a workaround for the
"Lying FTP server syndrome" you are seeing) ?
--
Csaba Ráduly, Software Engineer  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933






Re: Wget

2001-03-06 Thread csaba . raduly


I'm confused. I thought 1.5.3 *did* display the dots, but I could be wrong.

Please send queries like this to the list ( [EMAIL PROTECTED] ), not to me
personally.
--
Csaba Ráduly, Software Engineer  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933
:-( sorry for the top-posting )-:




   

From: [EMAIL PROTECTED] (Timo Maier)
To: [EMAIL PROTECTED]
cc:
Subject: Re: Wget
Date: 06/03/01 10:58

Hi!

The newest wget is 1.6 release and 1.7 developer.

I have GNU Wget 1.5.3 which doesn't display the dots, it looks like
this:

---
Connecting to www.telekom.de:80... connected!
HTTP request sent, awaiting response... 206 Partial content
Length: 4,509,742 (4,267,794 to go) [application/octet-stream]

3.05Mb (236.28kb) done at 5.19 KB/s. time: 0:09:16 (0:04:05 left)
---

Is it possible to implement this in new versions, too?

TAM
--
OS/2 Warp4, Ducati 750SS '92
You still have the freedom to learn and say what you wanna say
http://tam.belchenstuermer.de