bug in wget

2008-06-14 Thread Sir Vision

Hello,

entering the following command results in an error:

--- command start ---
c:\Downloads\wget_v1.11.3b>wget "ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-mozilla1.8-l10n/" -P c:\Downloads\
--- command end ---

wget can't convert the .listing file into an HTML file

regards



Re: bug in wget

2008-06-14 Thread Micah Cowan

Sir Vision wrote:
 Hello,
 
 entering the following command results in an error:
 
 --- command start ---
 c:\Downloads\wget_v1.11.3b>wget "ftp://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-mozilla1.8-l10n/"
 -P c:\Downloads\
 --- command end ---
 
 wget can't convert the .listing file into an HTML file

As this seems to work fine on Unix for me, I'll have to leave it to the
Windows porting guy (hi Chris!) to find out what might be going wrong.

...however, it would really help if you supplied the full output you got
from wget that leads you to believe Wget couldn't do this conversion. In
fact, it wouldn't hurt to supply the -d flag as well, for maximum
debugging messages.

--
Cheers,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/


Re: bug on wget

2007-11-21 Thread Hrvoje Niksic
Micah Cowan [EMAIL PROTECTED] writes:

 The new Wget flags empty Set-Cookie as a syntax error (but only
 displays it in -d mode; possibly a bug).

 I'm not clear on exactly what's possibly a bug: do you mean the fact
 that Wget only calls attention to it in -d mode?

That's what I meant.

 I probably agree with that behavior... most people probably aren't
 interested in being informed that a server breaks RFC 2616 mildly;

Generally, if Wget considers a header to be in error (and hence
ignores it), the user probably needs to know about that.  After all,
it could be the symptom of a Wget bug, or of an unimplemented
extension the server generates.  In both cases I as a user would want
to know.  Of course, Wget should continue to be lenient towards syntax
violations widely recognized by popular browsers.

Note that I'm not arguing that Wget should warn in this particular
case.  It is perfectly fine to not consider an empty `Set-Cookie' to
be a syntax error and to simply ignore it (and maybe only print a
warning in debug mode).
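
A minimal sketch of that lenient policy, with hypothetical names (this is
not Wget's actual header-processing API): ignore the bad header, but say
so when debug output is on.

#include <stdbool.h>
#include <stdio.h>

/* Ignore an empty or malformed header value, but report it in debug
   mode: it may be a server bug, or a symptom of something we fail to
   parse ourselves.  */
static void
handle_header (const char *name, const char *value, bool debug_mode)
{
  if (value == NULL || *value == '\0')
    {
      if (debug_mode)
        fprintf (stderr, "Ignoring empty `%s' header.\n", name);
      return;                   /* lenient: skip it, don't fail */
    }
  /* ... normal header processing ... */
}

int
main (void)
{
  handle_header ("Set-Cookie", "", true);
  return 0;
}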


Re: bug on wget

2007-11-21 Thread Micah Cowan

Hrvoje Niksic wrote:
 Generally, if Wget considers a header to be in error (and hence
 ignores it), the user probably needs to know about that.  After all,
 it could be the symptom of a Wget bug, or of an unimplemented
 extension the server generates.  In both cases I as a user would want
 to know.  Of course, Wget should continue to be lenient towards syntax
 violations widely recognized by popular browsers.
 
 Note that I'm not arguing that Wget should warn in this particular
 case.  It is perfectly fine to not consider an empty `Set-Cookie' to
 be a syntax error and to simply ignore it (and maybe only print a
 warning in debug mode).

That was my thought. I agree with both of your points above: if Wget's
not handling something properly, I want to know about it; but at the
same time, silently ignoring (erroneous) empty headers doesn't seem like
a problem.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



bug on wget

2007-11-20 Thread Diego Campo
Hi,
I got a bug on wget when executing:

wget -a log -x -O search/search-1.html --verbose --wait 3
--limit-rate=20K --tries=3
http://www.nepremicnine.net/nepremicninske_agencije.html?id_regije=1

Segmentation fault (core dumped)


I created directory search. 
The above creates a file search/search-1.html zero-sized.
Logfile log:

Resolviendo www.nepremicnine.net... 212.103.144.204
Conectando a www.nepremicnine.net|212.103.144.204|:80... conectado.
Petición HTTP enviada, esperando respuesta... 200 OK
--18:18:28--
http://www.nepremicnine.net/nepremicninske_agencije.html?id_regije=1
   => `search/search-1.html'

(I hope you understand the Spanish above. If not, the labels are the
usual ones: resolving, connecting, HTTP request sent, awaiting response.)

The same happens when varying the id_regije parameter in the URL, just
in case it helps.

I'm using an Intel Core Duo E6300 with plenty of disk/memory space, on
Ubuntu 7.10.

Should you need any further information, don't hesitate to contact me.
Regards
 Diego



Re: bug on wget

2007-11-20 Thread Micah Cowan

Diego Campo wrote:
 Hi,
 I got a bug on wget when executing:
 
 wget -a log -x -O search/search-1.html --verbose --wait 3
 --limit-rate=20K --tries=3
 http://www.nepremicnine.net/nepremicninske_agencije.html?id_regije=1
 
 Segmentation fault (core dumped)

Hi Diego,

I was able to reproduce the problem above in the release version of
Wget; however, it appears to be working fine in the current development
version of Wget, which is expected to release soon as version 1.11.*

* Unfortunately, it has been expected to release soon for a few months
now; we got hung up with some legal/licensing issues that are yet to be
resolved. It will almost certainly be released in the next few weeks,
though.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: bug on wget

2007-11-20 Thread Hrvoje Niksic
Micah Cowan [EMAIL PROTECTED] writes:

 I was able to reproduce the problem above in the release version of
 Wget; however, it appears to be working fine in the current
 development version of Wget, which is expected to release soon as
 version 1.11.*

I think the old Wget crashed on empty Set-Cookie headers.  That got
fixed when I converted the Set-Cookie parser to use extract_param.
The new Wget flags empty Set-Cookie as a syntax error (but only
displays it in -d mode; possibly a bug).
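
For context, a stand-alone sketch of that style of parsing (a simplified
stand-in, not Wget's actual extract_param): attributes are pulled off the
header one at a time, so an empty Set-Cookie simply yields no attributes
rather than a crash.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return the next ";"-separated attribute of *src in name[],
   or false when the header is exhausted.  */
static bool
next_param (const char **src, char *name, size_t size)
{
  const char *p = *src;
  size_t len;
  while (*p == ';' || *p == ' ')
    p++;
  if (*p == '\0')
    return false;
  len = strcspn (p, ";");
  if (len >= size)
    len = size - 1;
  memcpy (name, p, len);
  name[len] = '\0';
  *src = p + strcspn (p, ";");
  return true;
}

int
main (void)
{
  const char *header = "";      /* the empty Set-Cookie case */
  char attr[128];
  while (next_param (&header, attr, sizeof attr))
    printf ("attribute: %s\n", attr);
  return 0;                     /* no attributes -- and no crash */
}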


Re: bug on wget

2007-11-20 Thread Micah Cowan

Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 I was able to reproduce the problem above in the release version of
 Wget; however, it appears to be working fine in the current
 development version of Wget, which is expected to release soon as
 version 1.11.*
 
 I think the old Wget crashed on empty Set-Cookie headers.  That got
 fixed when I converted the Set-Cookie parser to use extract_param.
 The new Wget flags empty Set-Cookie as a syntax error (but only
 displays it in -d mode; possibly a bug).

I'm not clear on exactly what's possibly a bug: do you mean the fact
that Wget only calls attention to it in -d mode?

I probably agree with that behavior... most people probably aren't
interested in being informed that a server breaks RFC 2616 mildly,
especially if it's not apt to affect the results. Unless of course the
user was expecting the server to send a real cookie, but I'm guessing
that this only happens when the server doesn't have one to send (or
something). But a user in that situation should be using -d (or at least
-S) to find out what the server is sending.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: [bug #20323] Wget issues HEAD before GET, even when the file doesn't exist locally.

2007-07-12 Thread Micah Cowan

Mauro Tortonesi wrote:
 Micah Cowan ha scritto:
 Update of bug #20323 (project wget):

  Status:  Ready For Test => In Progress
 ___

 Follow-up Comment #3:

 Moving back to In Progress until some questions about the logic are
 answered:

 http://addictivecode.org/pipermail/wget-notify/2007-July/75.html
 http://addictivecode.org/pipermail/wget-notify/2007-July/77.html
 
 thanks micah.
 
 i have partly misunderstood the logic behind the preliminary HEAD request.
 in my code, HEAD is skipped if -O or --no-content-disposition are given,
 but if -N is given HEAD is always sent. this is wrong, as HEAD should be
 skipped even if -N and --no-content-disposition are given (no need to
 care about the deprecated -N -O combination). can't think of any other
 case in which HEAD should be skipped, though.

Cc'ing wget ML, as it's probably important to open up discussion of the
current logic.

What about the case when nothing is given on the command line except
--no-content-disposition? What do we need HEAD for then?

Also: I don't believe HEAD should be sent if no options are given on the
command line. What purpose would that serve? If it's to find a possible
Content-Disposition header, we can get that (and more reliably) at GET
time (though I believe we may currently require the file name before we
fetch; if that's true, it should definitely be changed, but not for
1.11, in which case the HEAD will be allowed for the time being); and
since we're not matching against potential accept/reject lists, we don't
really need it.

I think it really makes much more sense to enumerate those few cases
where we need to issue a HEAD, rather than try to determine all the
cases where we don't: if I have to choose a side to err on, I'd rather
not send HEAD in a case or two where we needed it, rather than send it
in a few where we didn't, as any request-response cycle eats up time. I
also believe that the cases where we want a HEAD are/should be fewer
than the cases where we don't want them.
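
A sketch of what enumerating those cases might look like (illustrative
only; the option names mirror the patches quoted later in this digest,
the rest is not Wget's actual code):

#include <stdbool.h>

struct opts { bool spider, recursive, timestamping, always_rest; };

/* Return true only in the few cases that genuinely need a preliminary
   HEAD; everything else goes straight to GET.  */
static bool
head_needed (const struct opts *opt, bool know_local_name)
{
  if (opt->spider && !opt->recursive)
    return true;                /* --spider: don't fetch bodies */
  if (opt->timestamping)
    return true;                /* -N: need Last-Modified first */
  if (opt->always_rest && !know_local_name)
    return true;                /* -c: need the file name to resume */
  return false;
}

int
main (void)
{
  struct opts o = { 0 };
  o.timestamping = true;        /* e.g. wget -N ... */
  return head_needed (&o, false) ? 0 : 1;
}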

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



[Fwd: Bug#281201: wget prints it's progress even when background]

2007-07-11 Thread Micah Cowan

The following bug was submitted to Debian's bug tracker.
I'm curious what people think about this suggestion.

Don't we already check for something like redirected output (and force
the progress indicator to dots)? It seems to me that if that is
appropriate, then a case could be made for this as well.

Perhaps instead of shutting up, though, wget should attempt to redirect
its output to a file? Perhaps with one last message to the terminal
(assuming the terminal doesn't have TOSTOP set--it should ignore SIGTTOU
and handle EIO to cover that case), to indicate that it's doing this.

-Micah


-------- Original Message --------
Subject: Bug#281201: wget prints it's progress even when background
Resent-Date: Tue, 10 Jul 2007 13:57:01 +0000,   Tue, 10 Jul 2007 13:57:02
+0000
Resent-From: Ilya Anfimov [EMAIL PROTECTED]
Resent-To: [EMAIL PROTECTED]
Resent-CC: Noèl Köthe [EMAIL PROTECTED]
Date: Tue, 10 Jul 2007 17:54:51 +0400
From: Ilya Anfimov [EMAIL PROTECTED]
Reply-To: Ilya Anfimov [EMAIL PROTECTED], [EMAIL PROTECTED]
To: Peter Eisentraut [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]


 My suggestion is to stop printing verbose progress messages
when the job is resumed in the background. This could be checked
by (successful) getpgrp() not being equal to (successful) tcgetpgrp(1)
in a SIGCONT signal handler.
 Something like this is used in some console applications,
for example in lftp.
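
A self-contained sketch of the suggested check (illustrative, not lftp's
or Wget's actual code): on SIGCONT, compare our process group with the
terminal's foreground process group, and keep printing progress only
when they match.

#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t in_background;

/* We are backgrounded when our process group is not the terminal's
   foreground process group; both calls must succeed to be sure.  */
static void
sigcont_handler (int sig)
{
  pid_t pgrp = getpgrp ();
  pid_t fg = tcgetpgrp (STDOUT_FILENO);
  (void) sig;
  if (fg != -1)
    in_background = (pgrp != fg);
}

int
main (void)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = sigcont_handler;
  sigaction (SIGCONT, &sa, NULL);
  /* ... the download loop would consult in_background before
     printing each progress update ... */
  return 0;
}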




possible bug in wget-1.10.2 and earlier

2007-05-30 Thread Harrington, Paul
Hi,
wget appears to be confused by FTP servers that leave only one space
between the group name and the file-size field in directory listings. We
only came across this problem today, so I don't know how common it is.
 
pjjH
 




From: Harrington, Paul 
Sent: Thursday, May 31, 2007 12:06 AM
To:  recipient-removed 
Subject: RE: File issue using WGET


Your FTP server must have changed the output of the listing format or,
more precisely, the string representation of some of the components has
changed such that only one space separates the group name from the
file size. The bug is, of course, in wget, but it is one that hitherto
had not been observed when interacting with your FTP server.
 
 
pjjH
 
 
 
[EMAIL PROTECTED] diff -u ftp-ls.c  ~/tmp
--- ftp-ls.c    2005-08-04 17:52:33.000000000 -0400
+++ /u/harringp/tmp/ftp-ls.c    2007-05-31 00:02:07.209955000 -0400
@@ -229,6 +229,18 @@
  break;
}
  errno = 0;
+  /* after the while loop terminates, t may not always
+ point to a space character. In the case when
+ there is only one space between the user/group
+ information and the file-size, the space will
+ have been overwritten by a \0 via strtok().  So,
+ if you have been through the loop at least once,
+ advance forward one character.
+  */
+
+  if (t < ptok)
+  t++;
+
  size = str_to_wgint (t, NULL, 10);
  if (size == WGINT_MAX && errno == ERANGE)
/* Out of range -- ignore the size.   Should



 



Bug-report: wget with multiple cnames in ssl certificate

2007-04-12 Thread Alex Antener
Hi

If I connect with wget 1.10.2 (Debian Etch & Ubuntu Feisty Fawn) to a
secure host that uses multiple CNAMEs in the certificate, I get the
following error:

[EMAIL PROTECTED]:~$ wget https://host.domain.tld
--10:18:55--  https://host.domain.tld/
   => `index.html'
Resolving host.domain.tld... xxx.xxx.xxx.xxx
Connecting to host.domain.tld|xxx.xxx.xxx.xxx|:443... connected.
ERROR: certificate common name `host0.domain.tld' doesn't match
requested host name `host.domain.tld'.
To connect to host.domain.tld insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.

If I do the same with wget 1.9.1 (Debian Sarge), I do not get that error.

Kind regards, Alex Antener

-- 
Alex Antener
Dipl. Medienkuenstler FH

[EMAIL PROTECTED] // http://lix.cc // +41 (0)44 586 97 63
GPG Key: 1024D/14D3C7A1 https://lix.cc/gpg_key.php
Fingerprint: BAB6 E61B 17D7 A9C9 6313  5141 3A3C DAA3 14D3 C7A1



Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-28 Thread Mauro Tortonesi

Jochen Roderburg wrote:


I have now tested the new wget 1.11 beta1 on my Linux system and the above issue
is solved now. The "Remote file is newer" message now only appears when the
local file exists, and most of the other logic with time-stamping and
file-naming works as expected.


excellent.


I meanwhile found, however, another new problem with time-stamping, which mainly
occurs in connection with a proxy cache; I will report that in a new thread.
The same goes for a small problem with the SSL configuration.


thank you very much for the useful bug reports you keep sending us ;-)

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-21 Thread Mauro Tortonesi

Jochen Roderburg wrote:

Quoting Jochen Roderburg [EMAIL PROTECTED]:


 Quoting Hrvoje Niksic [EMAIL PROTECTED]:


Mauro, you will need to look at this one.  Part of the problem is that
Wget decides to save to index.html.1 although -c is in use.  That is
solved with the patch attached below.  But the other part is that
hstat.local_file is a NULL pointer when
 stat (hstat.local_file, &st) is used to determine whether the file
already exists in the -c case.  That seems to be a result of your
changes to the code -- previously, hstat.local_file would get
 initialized in http_loop.


This looks as if it could also be the cause for the problems which I reported
some weeks ago for the timestamping mode
(http://www.mail-archive.com/wget@sunsite.dk/msg09083.html)




Hello Mauro,

The timestamping issues I reported in the above-mentioned message are now also
repaired by the patch you mailed here last week.
Only the small *cosmetic* issue remains that it *always* says:
   "Remote file is newer, retrieving."
even if there is no local file yet.


hi jochen,

i have been working on the problem you reported for the last couple of days. 
i've just committed a patch that should fix it for good. could you please try 
the new HTTP code and tell me if it works properly?


thank you very much for your help.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-20 Thread Jochen Roderburg
Quoting Jochen Roderburg [EMAIL PROTECTED]:

 Quoting Hrvoje Niksic [EMAIL PROTECTED]:

  Mauro, you will need to look at this one.  Part of the problem is that
  Wget decides to save to index.html.1 although -c is in use.  That is
  solved with the patch attached below.  But the other part is that
  hstat.local_file is a NULL pointer when
  stat (hstat.local_file, &st) is used to determine whether the file
  already exists in the -c case.  That seems to be a result of your
  changes to the code -- previously, hstat.local_file would get
  initialized in http_loop.

 This looks as if it could also be the cause for the problems which I reported
 some weeks ago for the timestamping mode
 (http://www.mail-archive.com/wget@sunsite.dk/msg09083.html)


Hello Mauro,

The timestamping issues I reported in the above-mentioned message are now also
repaired by the patch you mailed here last week.
Only the small *cosmetic* issue remains that it *always* says:
   "Remote file is newer, retrieving."
even if there is no local file yet.

J.Roderburg



Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-17 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

Noèl Köthe [EMAIL PROTECTED] writes:



a wget -c problem report with the 1.11 alpha 1 version
(http://bugs.debian.org/378691):

I can reproduce the problem. If I have already 1 MB downloaded wget -c
doesn't continue. Instead it starts to download again:



Mauro, you will need to look at this one.  Part of the problem is that
Wget decides to save to index.html.1 although -c is in use.  That is
solved with the patch attached below.  But the other part is that
hstat.local_file is a NULL pointer when
stat (hstat.local_file, &st) is used to determine whether the file
already exists in the -c case.  That seems to be a result of your
changes to the code -- previously, hstat.local_file would get
initialized in http_loop.

The partial patch follows:

Index: src/http.c
===
--- src/http.c  (revision 2178)
+++ src/http.c  (working copy)
@@ -1762,7 +1762,7 @@
 
   return RETROK;

 }
-  else
+  else if (!ALLOW_CLOBBER)
 {
   char *unique = unique_name (hs->local_file, true);
   if (unique != hs->local_file)


you're right, of course. the patch included in attachment should fix the 
problem. since the new HTTP code supports Content-Disposition and delays the 
decision of the destination filename until it receives the response header, the 
best solution i could find to make -c work is to send a HEAD request to 
determine the actual destination filename before resuming download if -c is given.


please, let me know what you think.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it
Index: http.c
===
--- http.c  (revision 2178)
+++ http.c  (local copy)
@@ -1762,7 +1762,7 @@
 
   return RETROK;
 }
-  else
+  else if (!ALLOW_CLOBBER)
 {
   char *unique = unique_name (hs->local_file, true);
   if (unique != hs->local_file)
@@ -2231,6 +2231,7 @@
 {
   int count;
   bool got_head = false; /* used for time-stamping */
+  bool got_name = false;
   char *tms;
   const char *tmrate;
   uerr_t err, ret = TRYLIMEXC;
@@ -2264,7 +2265,10 @@
   hstat.referer = referer;
 
   if (opt.output_document)
+{
 hstat.local_file = xstrdup (opt.output_document);
+  got_name = true;
+}
 
   /* Reset the counter. */
   count = 0;
@@ -2309,13 +2313,16 @@
   /* Default document type is empty.  However, if spider mode is
  on or time-stamping is employed, HEAD_ONLY commands is
  encoded within *dt.  */
-  if ((opt.spider && !opt.recursive) || (opt.timestamping && !got_head))
+  if ((opt.spider && !opt.recursive)
+      || (opt.timestamping && !got_head)
+      || (opt.always_rest && !got_name))
     *dt |= HEAD_ONLY;
   else
     *dt &= ~HEAD_ONLY;
 
   /* Decide whether or not to restart.  */
   if (opt.always_rest
+      && got_name
       && stat (hstat.local_file, &st) == 0
       && S_ISREG (st.st_mode))
 /* When -c is used, continue from on-disk size.  (Can't use
@@ -2484,6 +2491,12 @@
   continue;
 }
   
+  if (opt.always_rest  !got_name)
+{
+  got_name = true;
+  continue;
+}
+  
   if ((tmr != (time_t) (-1))
       && (!opt.spider || opt.recursive)
       && ((hstat.len == hstat.contlen) ||
Index: ChangeLog
===
--- ChangeLog   (revision 2178)
+++ ChangeLog   (local copy)
@@ -1,3 +1,9 @@
+2006-08-16  Mauro Tortonesi  [EMAIL PROTECTED]
+
+   * http.c: Fixed bug which broke --continue feature. Now if -c is
+   given, http_loop sends a HEAD request to find out the destination
+   filename before resuming download.
+
 2006-08-08  Hrvoje Niksic  [EMAIL PROTECTED]
 
* utils.c (datetime_str): Avoid code repetition with time_str.


Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-17 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 you're right, of course. the patch included in attachment should fix
 the problem. since the new HTTP code supports Content-Disposition
 and delays the decision of the destination filename until it
 receives the response header, the best solution i could find to make
 -c work is to send a HEAD request to determine the actual
 destination filename before resuming download if -c is given.

 please, let me know what you think.

I don't like the additional HEAD request, but I can't think of a
better solution.


Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-17 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

Mauro Tortonesi [EMAIL PROTECTED] writes:



you're right, of course. the patch included in attachment should fix
the problem. since the new HTTP code supports Content-Disposition
and delays the decision of the destination filename until it
receives the response header, the best solution i could find to make
-c work is to send a HEAD request to determine the actual
destination filename before resuming download if -c is given.

please, let me know what you think.


I don't like the additional HEAD request, but I can't think of a
better solution.


same for me. in order to avoid the overhead of the extra HEAD request, i had 
considered disabling Content-Disposition and using url_file_name to determine 
the destination filename in case -c is given. but i really didn't like that 
solution.


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-09 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

Noèl Köthe [EMAIL PROTECTED] writes:



a wget -c problem report with the 1.11 alpha 1 version
(http://bugs.debian.org/378691):

I can reproduce the problem. If I have already 1 MB downloaded wget -c
doesn't continue. Instead it starts to download again:


Mauro, you will need to look at this one. 


i surely will. unfortunately, at the moment i am attending the winsys 
2006 research conference:


http://www.winsys.org

i'll take a look at the problem as soon as i get back to italy.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-08 Thread Noèl Köthe
Hello,

a wget -c problem report with the 1.11 alpha 1 version
(http://bugs.debian.org/378691):

I can reproduce the problem. If I have already 1 MB downloaded wget -c
doesn't continue. Instead it starts to download again:

-------- Forwarded message --------

 [EMAIL PROTECTED]:~$ strace -o wget-strace wget -c
 http://ftp.iasi.roedu.net/100MB
 --14:28:07--  http://ftp.iasi.roedu.net/100MB
 Resolving ftp.iasi.roedu.net... 192.129.4.120
 Connecting to ftp.iasi.roedu.net|192.129.4.120|:80... connected.
 HTTP request sent, awaiting response... 200 OK
 Length: 104857600 (100M) [text/plain]
 Saving to: `100MB.8'
 
 
 The HTTP conversation:
 
 GET /100MB HTTP/1.0
 User-Agent: Wget/1.11-alpha-1
 Accept: */*
 Host: ftp.iasi.roedu.net
 Connection: Keep-Alive
 
 
 HTTP/1.1 200 OK
 Date: Tue, 18 Jul 2006 11:24:14 GMT
 Server: Apache/2.2.2 (Unix)
 Last-Modified: Sat, 03 Dec 2005 09:14:42 GMT
 ETag: "a002e4cb-640-1dbb0480"
 Accept-Ranges: bytes
 Content-Length: 104857600
 Keep-Alive: timeout=5, max=100
 Connection: Keep-Alive
 Content-Type: text/plain
 
 With an older version of wget, same file, same server, it works. This
 version works with FTP.
 
 A strace (attached) shows that it doesn't even try to see if 100MB exists
 before sending the HTTP request.
 
 -- System Information:
 Debian Release: testing/unstable
   APT prefers experimental
   APT policy: (500, 'experimental'), (500, 'unstable'), (500, 'testing')
 Architecture: i386 (i686)
 Shell:  /bin/sh linked to /bin/bash
 Kernel: Linux 2.6.17-1-686
 Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
 
 Versions of packages wget depends on:
 ii  libc62.3.999.2-8 GNU C Library: Shared libraries
 ii  libssl0.9.8  0.9.8b-2SSL shared libraries

-- 
Noèl Köthe noel debian.org
Debian GNU/Linux, www.debian.org




Re: wget 1.11 alpha1 [Fwd: Bug#378691: wget --continue doesn't work with HTTP]

2006-08-08 Thread Jochen Roderburg
Quoting Hrvoje Niksic [EMAIL PROTECTED]:

 Mauro, you will need to look at this one.  Part of the problem is that
 Wget decides to save to index.html.1 although -c is in use.  That is
 solved with the patch attached below.  But the other part is that
 hstat.local_file is a NULL pointer when
 stat (hstat.local_file, &st) is used to determine whether the file
 already exists in the -c case.  That seems to be a result of your
 changes to the code -- previously, hstat.local_file would get
 initialized in http_loop.

This looks as if it could also be the cause for the problems which I reported
some weeks ago for the timestamping mode
(http://www.mail-archive.com/wget@sunsite.dk/msg09083.html)

J.Roderburg



Re: Bug in wget 1.10.2 makefile

2006-07-17 Thread Mauro Tortonesi

Daniel Richard G. wrote:

Hello,

The MAKEDEFS value in the top-level Makefile.in also needs to include 
DESTDIR='$(DESTDIR)'.


fixed, thanks.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Bug in wget 1.10.2 makefile

2006-07-14 Thread Daniel Richard G.
Hello,

The MAKEDEFS value in the top-level Makefile.in also needs to include 
DESTDIR='$(DESTDIR)'.


(build log excerpt)
+ make install DESTDIR=/tmp/wget--1.10.2.build/__dest__
cd src && make CC='cc' CPPFLAGS='-D__EXTENSIONS__ -D_REENTRANT -Dsparc' ...
install.bin
/tg/freeport/src/wget/wget--1.10.2/mkinstalldirs /tg/freeport/arch/sunos64/bin
/tg/freeport/src/wget/wget--1.10.2/install-sh -c wget 
/tg/freeport/arch/sunos64/bin/wget
cp: cannot create /tg/freeport/arch/sunos64/bin/_inst.8_: Read-only file 
system
*** Error code 1
make: Fatal error: Command failed for target `install.bin'
Current working directory /tmp/wget--1.10.2.build/src
*** Error code 1
make: Fatal error: Command failed for target `install.bin'
(end)


--Daniel


-- 
NAME   = Daniel Richard G.   ##  Remember, skunks   _\|/_  meef?
EMAIL1 = [EMAIL PROTECTED]##  don't smell bad---(/o|o\) /
EMAIL2 = [EMAIL PROTECTED]  ##  it's the people who(^),
WWW= http://www.**.org/  ##  annoy them that do!/   \
--
(** = site not yet online)


A bug in wget 1.10.2

2006-06-07 Thread Joaquim Andrade
Hello, I'm using wget 1.10.2 on Windows (the Windows binary version), and it has a bug when downloading with -c and an input file. If the first file of the list is the one to be continued, wget does it fine; if not, wget tries to download the files from the beginning, says that it is downloading the files, but does not replace the existing ones. I'm using -nc instead, but that is not what I want, because with it wget skips existing files even if they are not fully downloaded.


Sorry for my english.
Hope you have understood what i was trying to say.
Keep up the good work.


Re: [Fwd: Bug#366434: wget: Multiple 'Pragma:' headers not supported]

2006-05-19 Thread Mauro Tortonesi

Noèl Köthe wrote:

Hello,

a forwarded report from http://bugs.debian.org/366434

could this behaviour be added to the doc/manpage?


i wonder if it makes sense to add generic support for multiple headers 
in wget, for instance by extending the --header option like this:


wget --header="Pragma: xxx" --header="dontoverride,Pragma: xxx2" someurl

as an alternative, we could choose to support multiple headers only for 
a few header types, like Pragma. however, i don't really like this 
second choice, as it would require hardcoding the above mentioned 
header names in the wget sources, which IMVHO is a *VERY* bad practice.


what do you think?

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: [Fwd: Bug#366434: wget: Multiple 'Pragma:' headers not supported]

2006-05-19 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 Noèl Köthe wrote:
 Hello,
 a forwarded report from http://bugs.debian.org/366434
 could this behaviour be added to the doc/manpage?

 i wonder if it makes sense to add generic support for multiple headers
 in wget, for instance by extending the --header option like this:

Or by adding a `--append-header' with that functionality.  Originally
--header always appended, but the problem was that people sometimes
wanted to change the headers issued by Wget.

The reason I didn't introduce (in fact keep) append was that HTTP
pretty much disallows duplicate headers.  According to HTTP, a
duplicate header field is equivalent to a single header field with
multiple values joined using the "," separator -- which the bug report
mentions.
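
Concretely, RFC 2616 (section 4.2) says that, for headers defined as
comma-separated lists, a request containing

    Pragma: no-cache
    Pragma: xPlayStrm=1

is equivalent to one containing

    Pragma: no-cache,xPlayStrm=1

which is why collapsing the duplicates by hand, as the original bug
report did, is a valid workaround.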


RE: [Fwd: Bug#366434: wget: Multiple 'Pragma:' headers not suppor ted]

2006-05-19 Thread Herold Heiko
 From: Mauro Tortonesi [mailto:[EMAIL PROTECTED]
 i wonder if it makes sense to add generic support for multiple headers
 in wget, for instance by extending the --header option like this:

 wget --header="Pragma: xxx" --header="dontoverride,Pragma: xxx2" someurl

That could be a problem if you need to send a really weird custom header
named "dontoverride,Pragma". The probability is near nil, but with the whole
big bad internet waiting, maybe separate switches (--header and --header-add)
would be better.

 as an alternative, we could choose to support multiple 
 headers only for 
 a few header types, like Pragma. however, i don't really like this 
 second choise, as it would require to hardcode the above mentioned 
 header names in the wget sources, which IMVHO is a *VERY* bad 
 practice.

Same opinion; hard-coding the header list would be ugly and will bite some
user in the nose some time in the future: if you need to add several XXXY
headers, either patch and recompile or use at least version x.y.

Heiko 

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 / +39-041-5917073 ph
-- +39-041-5907472 / +39-041-5917472 fax


Re: [Fwd: Bug#366434: wget: Multiple 'Pragma:' headers not suppor ted]

2006-05-19 Thread Mauro Tortonesi

Herold Heiko wrote:

From: Mauro Tortonesi [mailto:[EMAIL PROTECTED]
i wonder if it makes sense to add generic support for multiple headers
in wget, for instance by extending the --header option like this:

wget --header="Pragma: xxx" --header="dontoverride,Pragma: xxx2" someurl



That could be a problem if you need to send a really weird custom header
named "dontoverride,Pragma". The probability is near nil, but with the whole
big bad internet waiting, maybe separate switches (--header and --header-add)
would be better.


you're right. in fact, i like hrvoje's --append-header proposal better.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


[Fwd: Bug#366434: wget: Multiple 'Pragma:' headers not supported]

2006-05-14 Thread Noèl Köthe
Hello,

a forwarded report from http://bugs.debian.org/366434

could this behaviour be added to the doc/manpage?

thx.

 Package: wget
 Version: 1.10.2-1

   It's meaningful to have multiple 'Pragma:' headers within an http
 request, but wget will silently issue only a single one of them if
 they are specified within separate arguments.  For example,
 
 [EMAIL PROTECTED] /tmp]$ wget -U 'NSPlayer/4.1.0.3856' --header='Pragma: 
 no-cache,rate=1.00,stream-time=0,stream-offset=0:0,request-context=2,max-duration=0'
  --header='Pragma: xClientGUID={c77e7400-738a-11d2-9add-0020af0a3278}' 
 --header='Pragma: xPlayStrm=1' --header='Pragma: stream-switch-count=1' 
 --header='Pragma: stream-switch-entry=:1:0' 
 http://wms.scripps.com:80/knoxville/siler/siler.mp3 
 
   ... doesn't work, and inspection with ethereal reveals that wget is
 only sending the last 'Pragma:' header given.  Compressing all the
 'Pragma' directives into a single header makes the fetch work:
 
 [EMAIL PROTECTED] /tmp]$ wget -U 'NSPlayer/4.1.0.3856' --header='Pragma: 
 no-cache,rate=1.00,stream-time=0,stream-offset=0:0,request-context=2,max-duration=0,xClientGUID={c77e7400-738a-11d2-9add-0020af0a3278},xPlayStrm=1,stream-switch-count=1,stream-switch-entry=:1:0'
  http://wms.scripps.com:80/knoxville/siler/siler.mp3

-- 
Noèl Köthe noel debian.org
Debian GNU/Linux, www.debian.org




bug in wget windows

2005-10-14 Thread Tobias Koeck

done.
==> PORT ... done.    ==> RETR SUSE-10.0-EvalDVD-i386-GM.iso ... done.

[   =  ] -673,009,664  113,23K/s

Assertion failed: bytes >= 0, file retr.c, line 292

This application has requested the Runtime to terminate it in an unusual 
way.

Please contact the application's support team for more information.




Re: bug in wget windows

2005-10-14 Thread Mauro Tortonesi

Tobias Koeck wrote:

done.
==> PORT ... done.    ==> RETR SUSE-10.0-EvalDVD-i386-GM.iso ... done.

[   =  ] -673,009,664  113,23K/s

Assertion failed: bytes >= 0, file retr.c, line 292

This application has requested the Runtime to terminate it in an unusual 
way.

Please contact the application's support team for more information.


you are probably using an older version of wget, without large file 
support. please upgrade to wget 1.10.2.
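
(For reference, the negative byte count is itself the signature of the
problem: if the total is kept in a 32-bit signed integer, a transfer of
2^32 - 673,009,664 = 3,621,957,632 bytes -- about 3.4 GB, a plausible
size for a DVD image -- wraps around and prints as exactly -673,009,664,
which is also what trips the `bytes >= 0' assertion in retr.c.)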


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


a bug about wget

2005-10-04 Thread baidu baidu
That is, there is HTML like this:

<p>Click the following to go to the
 <a href="http://www.something.com/junk.asp?thepageIwant=2">next
 page</a>.</p>



What I need is for wget to understand that stuff following a ? in a URL
 indicates that it's a distinctly different page, and it should go
 recursively retrieve it.  The --recursive option and the -A option
don't seem to help.
I had tried:
wget -r -l2 -A "junk.asp?the*"
and
wget -r -l2 -A "junk.asp%3Fthe*"

Neither command can download the file.

 Any help you can give me is appreciated.
thanks
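
For what it's worth, Wget matches accept/reject lists with shell-style
wildcards (fnmatch-style patterns) against the file name part of the
URL, so the matching can be checked in isolation (hypothetical test
values, not Wget's code):

#include <fnmatch.h>
#include <stdio.h>

int
main (void)
{
  const char *name = "junk.asp?thepageIwant=2";
  const char *pats[] = { "junk.asp?the*", "junk.asp%3Fthe*" };
  int i;
  /* Print whether each accept pattern would match the local name.  */
  for (i = 0; i < 2; i++)
    printf ("%-18s -> %s\n", pats[i],
            fnmatch (pats[i], name, 0) == 0 ? "match" : "no match");
  return 0;
}

The first pattern matches (`?' is itself a wildcard and so also matches
the literal `?'), while the percent-encoded %3F form does not, so it
cannot work as an accept pattern.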


Re: openssl server renogiation bug in wget

2005-08-26 Thread Hrvoje Niksic
Thanks for the report; I've applied this patch:

2005-08-26  Jeremy Shapiro  [EMAIL PROTECTED]

* openssl.c (ssl_init): Set SSL_MODE_AUTO_RETRY.

Index: openssl.c
===
--- openssl.c   (revision 2063)
+++ openssl.c   (working copy)
@@ -225,6 +225,10 @@
  handles them correctly), allow them in OpenSSL.  */
   SSL_CTX_set_mode (ssl_ctx, SSL_MODE_ENABLE_PARTIAL_WRITE);
 
+  /* The OpenSSL library can handle renegotiations automatically, so
+ tell it to do so.  */
+  SSL_CTX_set_mode (ssl_ctx, SSL_MODE_AUTO_RETRY);
+
   return true;
 
  error:


openssl server renogiation bug in wget

2005-08-18 Thread Jeremy Shapiro
I believe I've encountered a bug in wget.  When using https, if the
server does a renegotiation handshake wget fails trying to peek for
the application data.  This occurs because wget does not set the
openssl context mode  SSL_MODE_AUTO_RETRY.  When I added the line:
SSL_CTX_set_mode (ssl_ctx, SSL_MODE_AUTO_RETRY);
just after the line that sets PARTIAL_WRITE mode in ssl_init() in
openssl.c everything worked again.

To reproduce, set up an apache server that only does client
authentication for a protected directory.  When wget does the ssl
connect it negotiates the handshake.  However, when it sends the
request for the restricted directory the server will try to
renegotiate with a client authenticated handshake.  Wget will fail
trying to read the application data, and continually retry.

Jeremy


[Fwd: Bug#319088: wget: don't rely on exactly one blank char between size and month]

2005-07-20 Thread Noèl Köthe
Hello,

giuseppe wrote a patch for 1.10.1.beta1. Full report can be viewed here:
http://bugs.debian.org/319088

-------- Forwarded message --------
 Von: giuseppe bonacci [EMAIL PROTECTED]
 Antwort an: giuseppe bonacci [EMAIL PROTECTED],
 [EMAIL PROTECTED]
 An: Debian Bug Tracking System [EMAIL PROTECTED]
 Betreff: Bug#319088: wget: don't rely on exactly one blank char
 between size and month
 Datum: Wed, 20 Jul 2005 10:26:20 +0200
 
 Package: wget
 Version: 1.10-3+1.10.1beta1
 Followup-For: Bug #319088
 
 
 A better patch is the following, that drops the assumption that there
 is exactly one blank char between size and month (implicit in the
 statement char *t = tok - 2;).
 
 As far as I know, strtok() modifies the string
 "1234  aaa  bbb@" (where @ stands for \0, for clarity)
 so that when tok points to "aaa" the string looks like
 "1234@ aaa@ bbb@",
 and (tok - 2) points to `@', which is not useful for backtracking.
 I think the best way to access the previous token is ... keeping a pointer
 to it.
 g.b.
 
 
 --- wget-1.10/src/ftp-ls.c.orig   2005-05-12 18:24:33.000000000 +0200
 +++ wget-1.10/src/ftp-ls.c    2005-07-20 09:53:30.206791032 +0200
 @@ -110,7 +110,7 @@
struct tm timestruct, *tnow;
time_t timenow;
  
 -  char *line, *tok;  /* tokenizer */
 +  char *line, *tok, *ptok;   /* tokenizer */
struct fileinfo *dir, *l, cur; /* list creation */
  
fp = fopen (file, rb);
 @@ -201,7 +201,9 @@
This tactic is quite dubious when it comes to
internationalization issues (non-English month names), but it
works for now.  */
 -  while ((tok = strtok (NULL, " ")) != NULL)
 +  ptok = line;
 +  while (ptok = tok,
 +   (tok = strtok (NULL, " ")) != NULL)
   {
 --next;
 if (next  0) /* a month name was not encountered */
 @@ -217,9 +219,7 @@
  
 /* Back up to the beginning of the previous token
and parse it with str_to_wgint.  */
 -   char *t = tok - 2;
 -   while (t > line && ISDIGIT (*t))
 - --t;
 +   char *t = ptok;
 if (t == line)
   {
 /* Something has gone wrong during parsing. */
 
 -- System Information:
 Debian Release: testing/unstable
   APT prefers testing
   APT policy: (500, 'testing')
 Architecture: i386 (i686)
 Shell:  /bin/sh linked to /bin/bash
 Kernel: Linux 2.6.8-2-686-smp
 Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
 
 Versions of packages wget depends on:
 ii  libc6   2.3.2.ds1-22 GNU C Library: Shared libraries 
 an
 ii  libssl0.9.7 0.9.7e-3 SSL shared libraries
 
 wget recommends no packages.
 
 -- no debconf information
 

-- 
Noèl Köthe noel debian.org
Debian GNU/Linux, www.debian.org




Re: Small bug in Wget manual page

2005-06-18 Thread Mauro Tortonesi
On Wednesday 15 June 2005 04:57 pm, Ulf Harnhammar wrote:
 On Wed, Jun 15, 2005 at 03:53:40PM -0500, Mauro Tortonesi wrote:
  the web pages (including the documentation) on gnu.org have just been
  updated.

 Nice! I have found some broken links and strange grammar, though:

 * index.html: There are archives of the main GNU Wget list at
 ** fly.cc.fer.hr
 ** www.geocrawler.com
 (neither works)

 * wgetdev.html
 ** Translation Project page
 (doesn't work)

 * faq.html
 ** 3.1 [..]
 Yes, starting from version 1.10, GNU Wget support files larger than 2GB.
 (should be supports)

fixed. thank you very much.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute for Human & Machine Cognition  http://www.ihmc.us
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: Small bug in Wget manual page

2005-06-18 Thread Mauro Tortonesi
On Wednesday 15 June 2005 05:14 pm, Ulf Harnhammar wrote:
 On Wed, Jun 15, 2005 at 11:57:42PM +0200, Ulf Harnhammar wrote:
  * faq.html
  ** 3.1 [..]
  Yes, starting from version 1.10, GNU Wget support files larger than 2GB.
  (should be supports)

 ** 2.0 How I compile GNU Wget?
 (should be How do I)

fixed. thank you very much.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute for Human & Machine Cognition  http://www.ihmc.us
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: Small bug in Wget manual page

2005-06-15 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 this seems to be already fixed in the 1.10 documentation.

Now that 1.10 is released, we should probably update the on-site
documentation.


Re: Small bug in Wget manual page

2005-06-15 Thread Mauro Tortonesi
On Wednesday 15 June 2005 02:05 pm, Hrvoje Niksic wrote:
 Mauro Tortonesi [EMAIL PROTECTED] writes:
  this seems to be already fixed in the 1.10 documentation.

 Now that 1.10 is released, we should probably update the on-site
 documentation.

i am doing it right now.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute for Human & Machine Cognition  http://www.ihmc.us
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: Small bug in Wget manual page

2005-06-15 Thread Mauro Tortonesi
On Wednesday 15 June 2005 02:16 pm, Mauro Tortonesi wrote:
 On Wednesday 15 June 2005 02:05 pm, Hrvoje Niksic wrote:
  Mauro Tortonesi [EMAIL PROTECTED] writes:
   this seems to be already fixed in the 1.10 documentation.
 
  Now that 1.10 is released, we should probably update the on-site
  documentation.

 i am doing it right now.

the web pages (including the documentation) on gnu.org have just been updated.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute for Human & Machine Cognition  http://www.ihmc.us
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: Small bug in Wget manual page

2005-06-15 Thread Ulf Harnhammar
On Wed, Jun 15, 2005 at 03:53:40PM -0500, Mauro Tortonesi wrote:
 the web pages (including the documentation) on gnu.org have just been updated.

Nice! I have found some broken links and strange grammar, though:

* index.html: There are archives of the main GNU Wget list at
** fly.cc.fer.hr
** www.geocrawler.com
(neither works)

* wgetdev.html
** Translation Project page
(doesn't work)

* faq.html
** 3.1 [..]
Yes, starting from version 1.10, GNU Wget support files larger than 2GB.
(should be supports)

// Ulf



Re: Small bug in Wget manual page

2005-06-15 Thread Ulf Harnhammar
On Wed, Jun 15, 2005 at 11:57:42PM +0200, Ulf Harnhammar wrote:
 * faq.html
 ** 3.1 [..]
 Yes, starting from version 1.10, GNU Wget support files larger than 2GB.
 (should be supports)

** 2.0 How I compile GNU Wget?
(should be How do I)

// Ulf



Re: Small bug in Wget manual page

2005-06-07 Thread Mauro Tortonesi
On Thursday 02 June 2005 09:33 am, Herb Schilling wrote:
 Hi,

   On http://www.gnu.org/software/wget/manual/wget.html, the section on
 protocol-directories has a paragraph that is a duplicate of the
 section on no-host-directories. Other than that, the manual is
 terrific! Wget is wonderful also. I don't know what I would do
 without it.



 --protocol-directories
  Use the protocol name as a directory component of local file
 names. For example, with this option, wget -r http://host will save
 to http/host/... rather than just to host/

  Disable generation of host-prefixed directories. By default,
 invoking Wget with -r http://fly.srk.fer.hr/ will create a structure
 of directories beginning with fly.srk.fer.hr/. This option disables
 such behavior.

this seems to be already fixed in the 1.10 documentation.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute for Human & Machine Cognition  http://www.ihmc.us
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Small bug in Wget manual page

2005-06-02 Thread Herb Schilling


Hi,

On http://www.gnu.org/software/wget/manual/wget.html, the section on
protocol-directories has a paragraph that is a duplicate of the section
on no-host-directories. Other than that, the manual is terrific! Wget is
wonderful also. I don't know what I would do without it.



--protocol-directories
 Use the protocol name as a directory component of local file names.
For example, with this option, wget -r http://host will save to
http/host/... rather than just to host/

 Disable generation of host-prefixed directories. By default, invoking
Wget with -r http://fly.srk.fer.hr/ will create a structure of
directories beginning with fly.srk.fer.hr/. This option disables such
behavior.

-- 

Herb Schilling
NASA Glenn Research Center
Brook Park, OH 44135
[EMAIL PROTECTED]

If all our misfortunes were laid in one common heap whence everyone must
take an equal portion, most people would be contented to take their own
and depart. -Socrates (469?-399 B.C.)



Re: Serious retrieval bug in wget 1.9.1 and newer

2005-05-30 Thread Werner LEMBERG

 Wget doesn't recognize the image tag,

Aah, thanks.

 Should Wget support it to be compatible?

IMHO yes.

Thanks for your help.


Werner


Serious retrieval bug in wget 1.9.1 and newer

2005-05-29 Thread Werner LEMBERG

[CVS 2005-05-25]

I tried this command:

  wget -r -L -l1 freetype.freedesktop.org/freetype2/screenshots.html

directly from the build directory, without using a .wgetrc file.  In
the file `screenshots.html' there is a reference to the file

  ../image/ft2-kde-thumb.png

(and others) which wget simply doesn't download -- no error message,
no warning.  My Mozilla browser displays the page just fine.  Since
wget downloads the first thumbnail picture
`../image/ft2-nautilus-thumb.png' without problems I suspect a serious
bug in wget.

I'm running wget on a GNU/Linux box.

BTW, it is not possible for CVS wget to have builddir != srcdir (after
creating the configure script), which is bad IMHO.


Werner


Re: Serious retrieval bug in wget 1.9.1 and newer

2005-05-29 Thread Hrvoje Niksic
Werner LEMBERG [EMAIL PROTECTED] writes:

 directly from the build directory, without using a .wgetrc file.  In
 the file `screenshots.html' there is a reference to the file

   ../image/ft2-kde-thumb.png

The reference looks like this:

  <image width="160" height="120" alt="KDE screenshot"
         src="../image/ft2-kde-thumb.png">

Wget doesn't recognize the image tag, which I've never heard of
before.  It's not mentioned in HTML 4.01, and it seems to be missing from
the various documents listing IE and Netscape extensions to HTML.
Googling for "image tag" reveals a number of hits that really refer to
IMG.

Mozilla and Opera do support it, so there's obviously some history
behind the tag.  Has anyone heard it before?  Should Wget support it
to be compatible?
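
If support were added, it would presumably amount to an alias in the tag
table; a toy illustration (not Wget's actual html-url.c structures):

#include <stdio.h>
#include <string.h>

static void
handle_img (const char *src)
{
  printf ("would enqueue %s\n", src);   /* existing <img> logic */
}

struct known_tag { const char *name; void (*handler) (const char *); };

static const struct known_tag known_tags[] = {
  { "img",   handle_img },
  { "image", handle_img },              /* the proposed alias */
};

int
main (void)
{
  const char *tag = "image", *src = "../image/ft2-kde-thumb.png";
  size_t i;
  for (i = 0; i < sizeof known_tags / sizeof *known_tags; i++)
    if (strcmp (known_tags[i].name, tag) == 0)
      known_tags[i].handler (src);
  return 0;
}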

 (and others) which wget simply doesn't download -- no error message,
 no warning.  My Mozilla browser displays the page just fine.  Since
 wget downloads the first thumbnail picture
 `../image/ft2-nautilus-thumb.png' without problems I suspect a
 serious bug in wget.

ft2-nautilus-thumb.png is referenced using the regular img tag.

 BTW, it is not possible for CVS wget to have builddir != srcdir
 (after creating the configure script), which is bad IMHO.

It seems to work here, except for the case when you build Wget in
srcdir as well.


Is this a bug in wget ? I need an urgent help!

2005-05-06 Thread Will Kuhn
I try to do something like
wget http://website.com/ ...
login=username&domain=hotmail%2ecom&_lang=EN

But when wget sends the URL out, the hotmail%2ecom
becomes hotmail.com! Is this the supposed
behaviour? I saw this on the sniffer. I suppose the
translation of %2e to . is done by wget. Because of
this, wget cannot retrieve the document.

How can I force wget to send out URL as it is without
making any translation ??!






Re: Is this a bug in wget ? I need an urgent help!

2005-05-06 Thread Hrvoje Niksic
Will Kuhn [EMAIL PROTECTED] writes:

 I try to do something like
 wget http://website.com/ ...
 login=username&domain=hotmail%2ecom&_lang=EN

 But when wget sends the URL out, the hotmail%2ecom
 becomes hotmail.com !!! Is this the supposed
 behaviour ?

Yes.

 I saw this on the sniffer. I suppose the
 translation of %2e to . is done by wget.

Actually, %2e is translated to `.'.  Since 2e is the ASCII hex code
corresponding to the `.' character, the two are entirely equivalent.

Are you sure that the download doesn't fail for some other unrelated
reason?

 How can I force wget to send out URL as it is without making any
 translation ??!

Some translation must be done, for example spaces must be converted to
%20, and so on.  During that course Wget translates regular characters
represented by hex codes into regular characters.  If you don't like
it, you can hack url.c:decide_copy_method to always return
CM_PASSTHROUGH upon encountering a %XX sequence.
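
A sketch of that hack (heavily simplified; Wget's real
decide_copy_method handles more cases than shown here):

#include <ctype.h>

enum copy_method { CM_DECODE, CM_ENCODE, CM_PASSTHROUGH };

static enum copy_method
decide_copy_method (const char *p)
{
  if (p[0] == '%')
    {
      if (isxdigit ((unsigned char) p[1])
          && isxdigit ((unsigned char) p[2]))
        /* The hack: pass every valid %XX through verbatim instead of
           returning CM_DECODE for "safe" characters such as %2e.  */
        return CM_PASSTHROUGH;
      return CM_ENCODE;         /* a lone '%' must become %25 */
    }
  return CM_PASSTHROUGH;        /* ordinary character: copy as-is */
}

int
main (void)
{
  /* e.g. the %2e in "hotmail%2ecom" would now survive untouched */
  return decide_copy_method ("%2e") == CM_PASSTHROUGH ? 0 : 1;
}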


Re: Is this a bug in wget ? I need an urgent help!

2005-05-06 Thread Hrvoje Niksic
Hrvoje Niksic [EMAIL PROTECTED] writes:

 Can I have it not do the translation ??!

 Unfortunately, only by changing the source code as described in the
 previous mail.

BTW I've just changed the CVS code to not decode the % sequences.
Wget 1.10 will contain the fix.


[Fwd: Bug#197916: wget: Mutual incompatibility between arguments -k and -O]

2004-08-11 Thread Noèl Köthe
Hello,

here a bugreport:
(http://bugs.debian.org/197916)

-------- Forwarded message --------
 From: Antoni Bella Perez [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Bug#197916: wget: Mutual incompatibility between arguments -k and -O
 Date: Wed, 18 Jun 2003 16:49:22 +0200
 
 Package: wget
 Version: 1.8.2-10
 Severity: important
 
   These are the arguments:
 
   ## ARGUMENTS
`-O FILE'
`--output-document=FILE'
   ##
`-k'
`--convert-links'
   ##
 
 I have created a script following the man and info documentation, and it
 turned out that the command line I specified does not work:
 
 wget -k URL -O file.html
 
 Below I show the output of the command:
 
 ## BUG
 [16:28:43] [EMAIL PROTECTED]:~$ wget -k http://www.terra.es/personal7/bella5/ -O 
 index.html
 --16:28:48--  http://www.terra.es/personal7/bella5/
    => `index.html'
 Resolving www.terra.es... done.
 Connecting to www.terra.es[213.4.130.210]:80... connected.
 HTTP request sent, awaiting response... 200 OK
 Length: unspecified [text/html]
 
 [  =  ] 11,137        26.08K/s
 
 16:29:04 (26.08 KB/s) - `index.html' saved [11137]
 
 index.html.1: No such file or directory
 Converting index.html.1... nothing to do.
 Converted 1 files in 0.00 seconds.
  END
 
 I have lost a lot of time with this, which leads me to consider that
 either it is a bug or it would have to be documented.
 
   Regards
   Toni
 
-- 
Noèl Köthe noel debian.org
Debian GNU/Linux, www.debian.org




[Fwd: Bug#182957: wget: manual page doesn't document type of patterns for --rejlist, --acclist]

2004-08-11 Thread Noèl Köthe
Hello,

maybe someone can document this (http://bugs.debian.org/182957) in one
or two sentences in wget.texi.

thx.

-Weitergeleitete Nachricht-
 From: Daniel B. dsb  smart.net
...
 The wget manual page doesn't document the format of the comma-separated values
 for the --rejlist and --acclist options.
 
 The wget manual page says:
 
-A acclist --accept acclist
-R rejlist --reject rejlist
Specify comma-separated lists of file name suffixes or
patterns to accept or reject.
 
 Particular unanswered questions are:
 
 - Whether pattern means shell (globbing) pattern or regular expression.
 
 - If it means regular expression:
   - Which style of regular expression (basic, extended, Perl 5, other).
   - Whether the expression is anchored or not.
 
 - Whether suffix means xyz in abc.xyz, .xyz in abc.xyz, or
   any string found at the end of the candidate string (e.g.,  yz in abc.xyz
   or in DJayz).
 
 - How a suffix is differentiated from a pattern.


-- 
Noèl Köthe noel debian.org
Debian GNU/Linux, www.debian.org




Re: Bug in wget 1.9.1 documentation

2004-07-12 Thread Hrvoje Niksic
Tristan Miller [EMAIL PROTECTED] writes:

 There appears to be a bug in the documentation (man page, etc.) for
 wget 1.9.1.

I think this is a bug in the man page generation process.



Bug in wget 1.9.1 documentation

2004-07-11 Thread Tristan Miller
Greetings.

There appears to be a bug in the documentation (man page, etc.) for wget 
1.9.1.  Specifically, the section about the command-line option for 
proxies ends abruptly:

   -Y on/off
   --proxy=on/off
   Turn proxy support on or off.  The proxy is on by default if the
   appropriate environment variable is defined.

   For more information about the use of proxies with Wget,

   -Q quota
   --quota=quota
   Specify download quota for automatic retrievals.  The value can 
be
   specified in bytes (default), kilobytes (with k suffix), or
   megabytes (with m suffix).

Regards,
Tristan

-- 
   _
  _V.-o  Tristan Miller [en,(fr,de,ia)]Space is limited
 / |`-'  -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=In a haiku, so it's hard
(7_\\http://www.nothingisreal.com/ To finish what you


Re: Bug in wget: cannot request urls with double-slash in the query string

2004-03-05 Thread Hrvoje Niksic
D Richard Felker III [EMAIL PROTECTED] writes:

 The request log shows that the slashes are apparently respected.

 I retried a test case and found the same thing -- the slashes were
 respected.

OK.

 Then I remembered that I was using -i. Wget seems to work fine with
 the url on the command line; the bug only happens when the url is
 passed in with:

 cat <<EOF | wget -i -
 http://...
 EOF

But I cannot repeat that, either.  As long as the consecutive slashes
are in the query string, they're not stripped.

 Using this method is necessary since it is the ONLY secure way I
 know of to do a password-protected http request from a shell script.

Yes, that is the best way to do it.



Re: Bug in wget: cannot request urls with double-slash in the query string

2004-03-04 Thread D Richard Felker III
On Mon, Mar 01, 2004 at 07:25:52PM +0100, Hrvoje Niksic wrote:
   Removing the offending code fixes the problem, but I'm not sure if
   this is the correct solution. I expect it would be more correct to
   remove multiple slashes only before the first occurrence of ?, but
   not afterwards.
  
  That's exactly what should happen.  Please give us more details, if
  possible accompanied by `-d' output.
 
  If you'd still like details now that you know the version I was
  using, let me know and I'll be happy to do some tests.
 
 Yes please.  For example, this is how it works for me:
 
 $ /usr/bin/wget -d "http://www.xemacs.org/something?redirect=http://www.cnn.com"
 DEBUG output created by Wget 1.8.2 on linux-gnu.
 
 --19:23:02--  http://www.xemacs.org/something?redirect=http://www.cnn.com
 => `something?redirect=http:%2F%2Fwww.cnn.com'
 Resolving www.xemacs.org... done.
 Caching www.xemacs.org => 199.184.165.136
 Connecting to www.xemacs.org[199.184.165.136]:80... connected.
 Created socket 3.
 Releasing 0x8080b40 (new refcount 1).
 ---request begin---
 GET /something?redirect=http://www.cnn.com HTTP/1.0
 User-Agent: Wget/1.8.2
 Host: www.xemacs.org
 Accept: */*
 Connection: Keep-Alive
 
 ---request end---
 HTTP request sent, awaiting response...
 ...
 
 The request log shows that the slashes are apparently respected.

I retried a test case and found the same thing -- the slashes were
respected. Then I remembered that I was using -i. Wget seems to work
fine with the url on the command line; the bug only happens when the
url is passed in with:

cat <<EOF | wget -i -
http://...
EOF

Using this method is necessary since it is the ONLY secure way I know
of to do a password-protected http request from a shell script.
Otherwise the password appears on the command line...

Rich



Re: Bug in wget: cannot request urls with double-slash in the query string

2004-03-01 Thread Hrvoje Niksic
D Richard Felker III [EMAIL PROTECTED] writes:

 The following code in url.c makes it impossible to request urls that
 contain multiple slashes in a row in their query string:
[...]

That code is removed in CVS, so multiple slashes now work correctly.

 Think of something like http://foo/bar/redirect.cgi?http://...
 wget translates this into: [...]

Which version of Wget are you using?  I think even Wget 1.8.2 didn't
collapse multiple slashes in query strings, only in paths.

 Removing the offending code fixes the problem, but I'm not sure if
 this is the correct solution. I expect it would be more correct to
 remove multiple slashes only before the first occurrence of ?, but
 not afterwards.

That's exactly what should happen.  Please give us more details, if
possible accompanied by `-d' output.



Re: Bug in wget: cannot request urls with double-slash in the query string

2004-03-01 Thread D Richard Felker III
On Mon, Mar 01, 2004 at 03:36:55PM +0100, Hrvoje Niksic wrote:
 D Richard Felker III [EMAIL PROTECTED] writes:
 
  The following code in url.c makes it impossible to request urls that
  contain multiple slashes in a row in their query string:
 [...]
 
 That code is removed in CVS, so multiple slashes now work correctly.
 
  Think of something like http://foo/bar/redirect.cgi?http://...
  wget translates this into: [...]
 
 Which version of Wget are you using?  I think even Wget 1.8.2 didn't
 collapse multiple slashes in query strings, only in paths.

I was using 1.8.2 and noticed the problem, so I upgraded to 1.9.1 and
it persisted.

  Removing the offending code fixes the problem, but I'm not sure if
  this is the correct solution. I expect it would be more correct to
  remove multiple slashes only before the first occurrence of ?, but
  not afterwards.
 
 That's exactly what should happen.  Please give us more details, if
 possible accompanied by `-d' output.

If you'd still like details now that you know the version I was using,
let me know and I'll be happy to do some tests.

Rich



Re: Bug in wget: cannot request urls with double-slash in the query string

2004-03-01 Thread Hrvoje Niksic
D Richard Felker III [EMAIL PROTECTED] writes:

  Think of something like http://foo/bar/redirect.cgi?http://...
  wget translates this into: [...]
 
 Which version of Wget are you using?  I think even Wget 1.8.2 didn't
 collapse multiple slashes in query strings, only in paths.

 I was using 1.8.2 and noticed the problem, so I upgraded to 1.9.1
 and it persisted.

OK.

  Removing the offending code fixes the problem, but I'm not sure if
  this is the correct solution. I expect it would be more correct to
  remove multiple slashes only before the first occurrence of ?, but
  not afterwards.
 
 That's exactly what should happen.  Please give us more details, if
 possible accompanied by `-d' output.

 If you'd still like details now that you know the version I was
 using, let me know and I'll be happy to do some tests.

Yes please.  For example, this is how it works for me:

$ /usr/bin/wget -d "http://www.xemacs.org/something?redirect=http://www.cnn.com"
DEBUG output created by Wget 1.8.2 on linux-gnu.

--19:23:02--  http://www.xemacs.org/something?redirect=http://www.cnn.com
   => `something?redirect=http:%2F%2Fwww.cnn.com'
Resolving www.xemacs.org... done.
Caching www.xemacs.org => 199.184.165.136
Connecting to www.xemacs.org[199.184.165.136]:80... connected.
Created socket 3.
Releasing 0x8080b40 (new refcount 1).
---request begin---
GET /something?redirect=http://www.cnn.com HTTP/1.0
User-Agent: Wget/1.8.2
Host: www.xemacs.org
Accept: */*
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
...

The request log shows that the slashes are apparently respected.



Bug in wget: cannot request urls with double-slash in the query string

2004-02-29 Thread D Richard Felker III
The following code in url.c makes it impossible to request urls that
contain multiple slashes in a row in their query string:

else if (*h == '/')
{
  /* Ignore empty path elements.  Supporting them well is hard
  (where do you save "http://x.com///y.html"?), and they
 don't bring any practical gain.  Plus, they break our
 filesystem-influenced assumptions: allowing them would
  make "x/y//../z" simplify to "x/y/z", whereas most people
  would expect "x/z".  */
  ++h;
}

Think of something like http://foo/bar/redirect.cgi?http://...
wget translates this into:

http://foo/bar/redirect.cgi?http:/...

and then the web server of course gives an error. Note that the
problem occurs even if the slashes were url escaped, since wget
unescapes them.

Removing the offending code fixes the problem, but I'm not sure if
this is the correct solution. I expect it would be more correct to
remove multiple slashes only before the first occurrence of ?, but not
afterwards.

Rich
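
A minimal sketch of the fix Rich suggests -- collapse duplicate
slashes only before the first `?' -- might look like this
(illustrative only, not the actual CVS change; BUF is assumed to hold
just the path-plus-query part of the URL, after scheme and host have
been split off):

#include <string.h>

/* Illustrative sketch: squeeze runs of '/' in the path part only, and
   copy the query string (everything from the first '?') through
   untouched.  Modifies BUF in place.  */
static void
collapse_path_slashes (char *buf)
{
  char *q = strchr (buf, '?');
  size_t path_len = q ? (size_t) (q - buf) : strlen (buf);
  char *dst = buf;
  size_t i;

  for (i = 0; i < path_len; i++)
    {
      if (buf[i] == '/' && dst > buf && dst[-1] == '/')
        continue;		/* drop the duplicate slash */
      *dst++ = buf[i];
    }
  /* Move the query string (or just the terminator) up to the end of
     the squeezed path.  */
  memmove (dst, buf + path_len, strlen (buf + path_len) + 1);
}

With that, "/bar//baz?a=http://b" squeezes to "/bar/baz?a=http://b".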



bug in wget 1.8.1/1.8.2

2003-09-16 Thread Dieter Drossmann
Hello,

I use an extra file with a long list of HTTP entries. I included this
file with the -i option.
After 154 downloads I got an error message: Segmentation fault.

With wget 1.7.1 everything works well.

Is there a new limit of lines?

Regards,
Dieter Drossmann




Re: bug in wget 1.8.1/1.8.2

2003-09-16 Thread Hrvoje Niksic
Dieter Drossmann [EMAIL PROTECTED] writes:

 I use an extra file with a long list of HTTP entries. I included this
 file with the -i option.  After 154 downloads I got an error
 message: Segmentation fault.

 With wget 1.7.1 everything works well.

 Is there a new limit of lines?

No, there's no built-in line limit, what you're seeing is a bug.

I cannot see anything wrong inspecting the code, so you'll have to
help by providing a gdb backtrace.  You can get it by doing this:

* Compile Wget with `-g' by running `make CFLAGS=-g' in its source
  directory (after configure, of course.)

* Go to the src/ directory and run that version of Wget the same way
  you normally run it, e.g. ./wget -i FILE.

* When Wget crashes, run `gdb wget core', type `bt' and mail us the
  resulting stack trace.

Thanks for the report.



bug in wget - wget break on time msec=0

2003-09-13 Thread Boehn, Gunnar von
Hello,


I think I found a bug in wget.

My GNU wget version is 1.8.2
My system GNU/Debian unstable


I use wget to replay our apache logfiles to a 
test webserver to try different tuning parameters.


Wget fails to run through the logfile
and gives the error message that the assertion msecs >= 0 failed.

This is the command I run
#time wget -q -i replaylog -O /dev/null


Here is the output of strace
#time strace wget -q -i replaylog -O /dev/null

read(4, "HTTP/1.1 200 OK\r\nDate: Sat, 13 S"..., 4096) = 4096
write(3, "\377\330\377\340\0\20JFIF\0\1\1\1\0H\0H\0\0\377\354\0\21"...,
3792) = 3792
gettimeofday({1063461157, 858103}, NULL) = 0
select(5, [4], NULL, [4], {900, 0}) = 1 (in [4], left {900, 0})
read(4, "\377\0\344=\217\355V\\\232\363\16\221\255\336h\227\361"..., 1435) =
1435
write(3, "\377\0\344=\217\355V\\\232\363\16\221\255\336h\227\361"..., 1435)
= 1435
gettimeofday({1063461157, 858783}, NULL) = 0
time(NULL)  = 1063461157
access("390564.jpg?time=1060510404", F_OK) = -1 ENOENT (No such file or
directory)
time(NULL)  = 1063461157
select(5, [4], NULL, NULL, {0, 1})  = 0 (Timeout)
time(NULL)  = 1063461157
select(5, NULL, [4], [4], {900, 0}) = 1 (out [4], left {900, 0})
write(4, "GET /fotos/4/390564.jpg?time=106"..., 244) = 244
select(5, [4], NULL, [4], {900, 0}) = 1 (in [4], left {900, 0})
read(4, "HTTP/1.1 200 OK\r\nDate: Sat, 13 S"..., 4096) = 4096
write(3, "\377\330\377\340\0\20JFIF\0\1\1\1\0H\0H\0\0\377\333\0C"..., 3792)
= 3792
gettimeofday({1063461157, 880833}, NULL) = 0
select(5, [4], NULL, [4], {900, 0}) = 1 (in [4], left {900, 0})
read(4, "\343P\223\36T\4\203Rc\317\257J\4x\2165\303;o\211\256+\222"..., 817)
= 817
write(3, "\343P\223\36T\4\203Rc\317\257J\4x\2165\303;o\211\256+\222"...,
817) = 817
gettimeofday({1063461157, 874729}, NULL) = 0
time(NULL)  = 1063461157
write(2, "wget: retr.c:262: calc_rate: Ass"..., 60wget: retr.c:262:
calc_rate: Assertion `msecs >=
 0' failed.
) = 60
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
getpid()= 7106
kill(7106, SIGABRT) = 0
--- SIGABRT (Aborted) @ 0 (0) ---
+++ killed by SIGABRT +++


I hope that help.
Keep up the good work

Kind regards

Gunnar


Re: bug in wget - wget break on time msec=0

2003-09-13 Thread Hrvoje Niksic
Boehn, Gunnar von [EMAIL PROTECTED] writes:

 I think I found a bug in wget.

You did.  But I believe your subject line is slightly incorrect.  Wget
handles 0 length time intervals (see the assert message), but what it
doesn't handle are negative amounts.  And indeed:

 gettimeofday({1063461157, 858103}, NULL) = 0
 gettimeofday({1063461157, 858783}, NULL) = 0
 gettimeofday({1063461157, 880833}, NULL) = 0
 gettimeofday({1063461157, 874729}, NULL) = 0

As you can see, the last gettimeofday returned time *preceding* the
one before it.  Your ntp daemon must have chosen that precise moment
to set back the system clock by ~6 milliseconds, to which Wget reacted
badly.

Even so, Wget shouldn't crash.  The correct fix is to disallow the
timer code from ever returning decreasing or negative time intervals.
Please let me know if this patch fixes the problem:


2003-09-14  Hrvoje Niksic  [EMAIL PROTECTED]

* utils.c (wtimer_sys_set): Extracted the code that sets the
current time here.
(wtimer_reset): Call it.
(wtimer_sys_diff): Extracted the code that calculates the
difference between two system times here.
(wtimer_elapsed): Call it.
(wtimer_elapsed): Don't return a value smaller than the previous
one, which could previously happen when system time is set back.
Instead, reset start time to current time and note the elapsed
offset for future calculations.  The returned times are now
guaranteed to be monotonically nondecreasing.

Index: src/utils.c
===
RCS file: /pack/anoncvs/wget/src/utils.c,v
retrieving revision 1.51
diff -u -r1.51 utils.c
--- src/utils.c 2002/05/18 02:16:25 1.51
+++ src/utils.c 2003/09/13 23:09:13
@@ -1532,19 +1532,30 @@
 # endif
 #endif /* not WINDOWS */
 
-struct wget_timer {
 #ifdef TIMER_GETTIMEOFDAY
-  long secs;
-  long usecs;
+typedef struct timeval wget_sys_time;
 #endif
 
 #ifdef TIMER_TIME
-  time_t secs;
+typedef time_t wget_sys_time;
 #endif
 
 #ifdef TIMER_WINDOWS
-  ULARGE_INTEGER wintime;
+typedef ULARGE_INTEGER wget_sys_time;
 #endif
+
+struct wget_timer {
+  /* The starting point in time which, subtracted from the current
+ time, yields elapsed time. */
+  wget_sys_time start;
+
+  /* The most recent elapsed time, calculated by wtimer_elapsed().
+ Measured in milliseconds.  */
+  long elapsed_last;
+
+  /* Approximately, the time elapsed between the true start of the
+ measurement and the time represented by START.  */
+  long elapsed_pre_start;
 };
 
 /* Allocate a timer.  It is not legal to do anything with a freshly
@@ -1577,22 +1588,17 @@
   xfree (wt);
 }
 
-/* Reset timer WT.  This establishes the starting point from which
-   wtimer_elapsed() will return the number of elapsed
-   milliseconds.  It is allowed to reset a previously used timer.  */
+/* Store system time to WST.  */
 
-void
-wtimer_reset (struct wget_timer *wt)
+static void
+wtimer_sys_set (wget_sys_time *wst)
 {
 #ifdef TIMER_GETTIMEOFDAY
-  struct timeval t;
-  gettimeofday (&t, NULL);
-  wt->secs  = t.tv_sec;
-  wt->usecs = t.tv_usec;
+  gettimeofday (wst, NULL);
 #endif
 
 #ifdef TIMER_TIME
-  wt->secs = time (NULL);
+  time (wst);
 #endif
 
 #ifdef TIMER_WINDOWS
@@ -1600,39 +1606,76 @@
   SYSTEMTIME st;
   GetSystemTime (&st);
   SystemTimeToFileTime (&st, &ft);
-  wt->wintime.HighPart = ft.dwHighDateTime;
-  wt->wintime.LowPart  = ft.dwLowDateTime;
+  wst->HighPart = ft.dwHighDateTime;
+  wst->LowPart  = ft.dwLowDateTime;
 #endif
 }
 
-/* Return the number of milliseconds elapsed since the timer was last
-   reset.  It is allowed to call this function more than once to get
-   increasingly higher elapsed values.  */
+/* Reset timer WT.  This establishes the starting point from which
+   wtimer_elapsed() will return the number of elapsed
+   milliseconds.  It is allowed to reset a previously used timer.  */
 
-long
-wtimer_elapsed (struct wget_timer *wt)
+void
+wtimer_reset (struct wget_timer *wt)
 {
+  /* Set the start time to the current time. */
+  wtimer_sys_set (&wt->start);
+  wt->elapsed_last = 0;
+  wt->elapsed_pre_start = 0;
+}
+
+static long
+wtimer_sys_diff (wget_sys_time *wst1, wget_sys_time *wst2)
+{
 #ifdef TIMER_GETTIMEOFDAY
-  struct timeval t;
-  gettimeofday (&t, NULL);
-  return (t.tv_sec - wt->secs) * 1000 + (t.tv_usec - wt->usecs) / 1000;
+  return ((wst1->tv_sec - wst2->tv_sec) * 1000
+  + (wst1->tv_usec - wst2->tv_usec) / 1000);
 #endif
 
 #ifdef TIMER_TIME
-  time_t now = time (NULL);
-  return 1000 * (now - wt->secs);
+  return 1000 * (*wst1 - *wst2);
 #endif
 
 #ifdef WINDOWS
-  FILETIME ft;
-  SYSTEMTIME st;
-  ULARGE_INTEGER uli;
-  GetSystemTime (&st);
-  SystemTimeToFileTime (&st, &ft);
-  uli.HighPart = ft.dwHighDateTime;
-  uli.LowPart = ft.dwLowDateTime;
-  return (long)((uli.QuadPart - wt->wintime.QuadPart) / 10000);
+  return (long)(wst1->QuadPart - wst2->QuadPart) / 10000;
 #endif
+}
+
+/* Return the number of milliseconds
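
The archived copy of the patch is cut off above; the remainder
essentially rebuilds wtimer_elapsed on top of the helpers shown.  A
sketch of the idea, reconstructed rather than quoted literally:

long
wtimer_elapsed (struct wget_timer *wt)
{
  wget_sys_time now;
  long elapsed;

  wtimer_sys_set (&now);
  elapsed = wt->elapsed_pre_start + wtimer_sys_diff (&now, &wt->start);
  if (elapsed < wt->elapsed_last)
    {
      /* The system clock was set back.  Re-anchor the timer at the
         current time and remember how much had elapsed, so the values
         we return stay monotonically nondecreasing.  */
      wt->start = now;
      wt->elapsed_pre_start = wt->elapsed_last;
      elapsed = wt->elapsed_last;
    }
  wt->elapsed_last = elapsed;
  return elapsed;
}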

Maybe a bug in wget?

2003-09-09 Thread n_fujikawa
Dear Sir;

 We are using wget-1.8.2 and it's very convenient for our routine
work.  However, we now have trouble with the return code from wget
when using the -r option: when wget with -r fails on an FTP
connection, it returns code 0, but with no -r option it returns code
1.  We looked over the source and found a suspicious line in ftp.c.

ftp.c

 +1699if ((opt.ftp_glob && wild) || opt.recursive ||
opt.timestamping)
 +1700  {
 +1701/* ftp_retrieve_glob is a catch-all function that gets
called
 +1702   if we need globbing, time-stamping or recursion.  Its
 +1703   third argument is just what we really need.  */
 +1704ftp_retrieve_glob (u, &con,
 +1705   (opt.ftp_glob && wild) ? GLOBALL :
GETONE);
 +1706  }
 +1707else
 +1708  res = ftp_loop_internal (u, NULL, &con);

We guess line 1704 should be the following line, in order to return the
error code back to the main function.

 +1704res = ftp_retrieve_glob (u, &con,
 +1705   (opt.ftp_glob && wild) ? GLOBALL :
GETONE);

Is this right?  If we change ftp.c in this way, would any other problems
occur?

Best Regards,
   Norihisa Fujikawa,
   Programming Section in Numerical Prediction
Division,
   Japan Meteorological Agency



*** Workaround found ! *** (was: Hostname bug in wget ...)

2003-09-05 Thread webmaster
Hi,

I found a workaround for the problem described below.

Using option -nh does the job for me.

As the subdomains mentioned below are on the same IP
as the main domain, wget seems to compare not their
names but only the IPs.

If you need more info please let me know.
Have a nice weekend !

Regards
Klaus
--- Forwarded message follows ---
From:   [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date sent:  Thu, 4 Sep 2003 12:53:39 +0200
Subject:Hostname bug in wget ...
Priority:   normal

... or a silly sleepless webmaster !?

Hi,

Version
==
I use the GNU wget version 1.7 which is found on
OpenBSD Release 3.3 CD.
I use it on i386 architecture.


How to reproduce
==
wget -r coolibri.com
(adding the span hosts option did not help)


Problem category
=
There seems to be a problem with prepending wrong hostnames.


Problem more detailed

Between fine GETs there are lots of 404s caused by prepending
wrong hostnames. That website consists of several parts
distributed on several subdomains.

coolibri.com

cpu-kuehler.coolibri.com
luefter.coolibri.com
etc.


Example:
=
wget tries to get files that are located on cpu-kuehler.coolibri.com
but prepends coolibri.com instead of
cpu-kuehler.coolibri.com.

Instead of (correct)
http://cpu-kuehler.coolibri.com/80_Kuehler_Grafik_Grafikkarte_/80_kuehler_grafik_grafikkarte_.html

it tries (incorrect)
http://coolibri.com/80_Kuehler_Grafik_Grafikkarte_/80_kuehler_grafik_grafikkarte_.html


Tried my best not to waste your time - but some lack
of sleep during last week was not really helpful ;-)

Best regards

Klaus

--- End of forwarded message ---


Re: *** Workaround found ! *** (was: Hostname bug in wget ...)

2003-09-05 Thread Hrvoje Niksic
[EMAIL PROTECTED] writes:

 I found a workaround for the problem described below.

 Using option -nh does the job for me.

 As the subdomains mentioned below are on the same IP
 as the main domain wget seems not to compare their
 names but the IP only.

I believe newer versions of Wget don't do that anymore.  At the time
Wget was originally written, DNS-based virtual hosting was still in
its infancy.  Nowadays almost everyone does it, so what used to be
`-nh' became the default.

Either way, thanks for the report.


Hostname bug in wget ...

2003-09-04 Thread webmaster
... or a silly sleepless webmaster !?

Hi,

Version
==
I use the GNU wget version 1.7 which is found on
OpenBSD Release 3.3 CD.
I use it on i386 architecture.


How to reproduce
==
wget -r coolibri.com
(adding the span hosts option did not help)


Problem category
=
There seems to be a problem with prepending wrong hostnames.


Problem more detailed

Between fine GETs there are lots of 404s caused by prepending
wrong hostnames. That website consists of several parts
distributed on several subdomains.

coolibri.com

cpu-kuehler.coolibri.com
luefter.coolibri.com
etc.


Example:
=
wget tries to get files that are located on cpu-kuehler.coolibri.com
but prepends coolibri.com instead of cpu-kuehler.coolibri.com.

Instead of (correct)
http://cpu-kuehler.coolibri.com/80_Kuehler_Grafik_Grafikkarte_/80_kuehler_grafik_grafikkarte_.html

it tries (incorrect)
http://coolibri.com/80_Kuehler_Grafik_Grafikkarte_/80_kuehler_grafik_grafikkarte_.html


Tried my best not to waste your time - but some lack
of sleep during last week was not really helpful ;-)

Best regards

Klaus



A small bug in wget

2003-02-28 Thread Håvar Valeur
The bug appers if you use another output file and try to convert the url's
at the same time.

If you try to execute the following:

wget -k -O myFile http://www.stud.ntnu.no/index.html

The file will not convert, becuse wget do not locate the file index.html
since the output-file is not index.html but myFile.



Bug in wget version 1.8.1

2003-02-24 Thread Micha Byrecki
Hello.
In version wget 1.8.1 i got a segfault after executing:
$wget -c -r -k http://www.repairfaq.orghttp://www.repairfaq.org

The bug probably has to do with the two http URLs on the command
line. I've attached strace output, but there's nothing useful in it. I
don't have the source code of this version of wget, so I'm not able to
check it out now. If you need any other, additional information from
me, just mail me.

-- 
Regards
Michal Byrecki



wget-out
Description: Binary data


Bug in wget version 1.8.1

2003-02-24 Thread Micha Byrecki
Hello again.
Matter about version wget 1.8.1
I downloaded the source code of wget 1.8.1, so now I can tell you more
about this bug :)

Here's more data:
(gdb) set args -c -r -k http://www.repairfaq.orghttp://www.repairfaq.org
(gdb) run
Starting program: /home/byrek/testy/wget-1.8.1/src/wget -c -r -k
http://www.repairfaq.orghttp://www.repairfaq.org

Program received signal SIGSEGV, Segmentation fault.
0x0805ca6c in retrieve_tree (start_url=0x8077b98
"http://www.repairfaq.orghttp://www.repairfaq.org") at recur.c:201
201   url_enqueue (queue, xstrdup (start_url_parsed->url), NULL, 0);
(gdb) 

*
(gdb) bt
#0  0x0805ca6c in retrieve_tree (start_url=0x8077b98
"http://www.repairfaq.orghttp://www.repairfaq.org") at recur.c:201
#1  0x0805a499 in main (argc=-1073743340, argv=0xbb24) at main.c:812
(gdb) 

Since my clock shows 3:52 AM I'm not able to analyze anything except
the route to my bed, so I didn't figure out what's wrong. I hope
you'll do it.

-- 
Regards
Michal Byrecki




possible bug in wget?

2003-02-08 Thread unicorn76
error-description

wget aborts with a segmentation violation while I try to get some files
recursively.

wget -r -l1 http://somewhere/somewhat.htm

(gdb) where
#0  0x080532a2 in fnmatch ()
#1  0x08065788 in fnmatch ()
#2  0x0805e523 in fnmatch ()
#3  0x08060da7 in fnmatch ()
#4  0x0805c733 in fnmatch ()
#5  0x0804a295 in getsockname ()

I looked into the HTML file and determined that one link appeared there
twice.
At this position wget crashes...

my configuration

FreeBSD 5.0-RELEASE-p1
GNU Wget 1.8.2 (build from bsd ports collection)

thanx
unicorn

-- 




RE: Bug with wget ? I need help.

2002-06-21 Thread Herold Heiko

Try telnet www.sosi.cnrs.fr 80
if it connects, type "GET / HTTP/1.0" followed by two newlines. If you don't
get the output of the webserver, you probably have a routing problem or
something else.
something else.

Heiko 

-- 
-- PREVINET S.p.A.[EMAIL PROTECTED]
-- Via Ferretto, 1ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907472
-- ITALY

 -Original Message-
 From: Cédric Rosa [mailto:[EMAIL PROTECTED]]
 Sent: Friday, June 21, 2002 4:37 PM
 To: [EMAIL PROTECTED]
 Subject: Bug with wget ? I need help.
 
 
 Hello,
 
  First, excuse my English, but I'm French.
  
  When I try with wget (v 1.8.1) to download a URL which is
  behind a router,
  the software waits forever even if I've specified a timeout.
 
 With ethereal, I've seen that there is no response from the 
 server (ACK
 never appears).
 
 Here is the debug output:
 rosa@r1:~/htmlparser1.1/lib$ wget www.sosi.cnrs.fr
 --16:30:54-- http://www.sosi.cnrs.fr/
  => `index.html'
 Resolving www.sosi.cnrs.fr... done.
 Connecting to www.sosi.cnrs.fr[193.55.87.37]:80...
 
 Thanks by advance for your help.
 Cedric Rosa.
 



Fwd: Bug with wget ? I need help.

2002-06-21 Thread Cédric Rosa

It seems to be the default timeout that can't be overridden.

--17:12:19--  http://www.sosi.cnrs.fr/
=> `index.html'
Resolving www.sosi.cnrs.fr... done.
Connecting to www.sosi.cnrs.fr[193.55.87.37]:80... failed: Connection timed 
out.
Giving up.
--17:26--

Can someone reproduce this problem?


Date: Fri, 21 Jun 2002 16:37:02 +0200
To: [EMAIL PROTECTED]
From: Cédric Rosa [EMAIL PROTECTED]
Subject: Bug with wget ? I need help.

Hello,

First, excuse my English, but I'm French.

When I try with wget (v 1.8.1) to download a URL which is behind a router,
the software waits forever even if I've specified a timeout.

With ethereal, I've seen that there is no response from the server (ACK
never appears).

Here is the debug output:
rosa@r1:~/htmlparser1.1/lib$ wget www.sosi.cnrs.fr
--16:30:54-- http://www.sosi.cnrs.fr/
=> `index.html'
Resolving www.sosi.cnrs.fr... done.
Connecting to www.sosi.cnrs.fr[193.55.87.37]:80...

Thanks by advance for your help.
Cedric Rosa.




Re: Bug with wget ? I need help.

2002-06-21 Thread Hack Kampbjørn

Cédric Rosa wrote:
 
 Hello,
 
 First, excuse my English, but I'm French.
 
 When I try with wget (v 1.8.1) to download a URL which is behind a router,
 the software waits forever even if I've specified a timeout.
 
 With ethereal, I've seen that there is no response from the server (ACK
 never appears).
 
This is documented behavior: because of implementation issues, the timeout
does not cover the connection itself, only responses after a connection has
been established. In version 1.9 the timeout option will also cover the
connection.

http://cvs.sunsite.dk/viewcvs.cgi/*checkout*/wget/NEWS?rev=HEAD&content-type=text/plain
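
For the curious, one standard way to implement a connect timeout -- a
generic sketch, not Wget's actual code -- is a non-blocking connect
followed by select():

#include <sys/select.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <errno.h>

/* Generic sketch: start a non-blocking connect, wait for writability
   with select() for at most TIMEOUT seconds, then check SO_ERROR to
   see whether the connect actually succeeded.  */
static int
connect_with_timeout (int fd, const struct sockaddr *sa,
                      socklen_t salen, int timeout)
{
  fd_set wset;
  struct timeval tv;
  int err = 0;
  socklen_t errlen = sizeof err;
  int flags = fcntl (fd, F_GETFL, 0);

  fcntl (fd, F_SETFL, flags | O_NONBLOCK);
  if (connect (fd, sa, salen) < 0 && errno != EINPROGRESS)
    return -1;

  FD_ZERO (&wset);
  FD_SET (fd, &wset);
  tv.tv_sec = timeout;
  tv.tv_usec = 0;
  if (select (fd + 1, NULL, &wset, NULL, &tv) <= 0)
    return -1;			/* timed out, or select() failed */

  if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0 || err)
    return -1;			/* connect failed asynchronously */

  fcntl (fd, F_SETFL, flags);	/* back to blocking mode */
  return 0;
}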

 Here is the debug output:
 rosa@r1:~/htmlparser1.1/lib$ wget www.sosi.cnrs.fr
 --16:30:54-- http://www.sosi.cnrs.fr/
  => `index.html'
 Resolving www.sosi.cnrs.fr... done.
 Connecting to www.sosi.cnrs.fr[193.55.87.37]:80...
 
 Thanks by advance for your help.
 Cedric Rosa.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Bug with wget ? I need help.

2002-06-21 Thread Cédric Rosa

thanks for your help :)
I'm installing version 1.9 to check. I think this update may solve my
problem.

Cedric Rosa.

- Original Message -
From: Hack Kampbjørn [EMAIL PROTECTED]
To: Cédric Rosa [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, June 21, 2002 7:27 PM
Subject: Re: Bug with wget ? I need help.


 Cédric Rosa wrote:
 
  Hello,
 
  First, excuse my English, but I'm French.
 
  When I try with wget (v 1.8.1) to download a URL which is behind a
router,
  the software waits forever even if I've specified a timeout.
 
  With ethereal, I've seen that there is no response from the server (ACK
  never appears).
 
 This is documented behavior: because of implementation issues, the timeout
 does not cover the connection itself, only responses after a connection has
 been established. In version 1.9 the timeout option will also cover the
 connection.


http://cvs.sunsite.dk/viewcvs.cgi/*checkout*/wget/NEWS?rev=HEAD&content-type=text/plain

  Here is the debug output:
  rosa@r1:~/htmlparser1.1/lib$ wget www.sosi.cnrs.fr
  --16:30:54-- http://www.sosi.cnrs.fr/
  => `index.html'
  Resolving www.sosi.cnrs.fr... done.
  Connecting to www.sosi.cnrs.fr[193.55.87.37]:80...
 
  Thanks by advance for your help.
  Cedric Rosa.

 --
 Med venlig hilsen / Kind regards

 Hack Kampbjørn




(fwd) Bug#149075: wget: option for setting tcp window size

2002-06-16 Thread Noel Koethe

Hello,

I got this feature request:

http://bugs.debian.org/149075

- Forwarded message from Erno Kuusela [EMAIL PROTECTED] -
hello,

it would be really useful to be able to set the tcp window size
for wget, since the default window size can be much too small
for long latency links. also setting the window size to less
than the default would in effect work as a rate limiter.

the window size can be set with the SOL_SOCKET/SO_RCVBUF socket
option.
- End forwarded message -
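
For reference, the option being requested boils down to a single
setsockopt call made before connecting; a sketch (hypothetical -- wget
had no such flag at the time):

#include <sys/socket.h>

/* Sketch: set the socket receive buffer, and with it the TCP window
   the kernel will advertise.  Must be done before connect() so the
   window scaling negotiated at handshake time can account for the
   value.  */
static int
set_recv_window (int fd, int bytes)
{
  return setsockopt (fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes);
}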

-- 
Noèl Köthe



Re: small bug in wget manpage: --progress

2002-04-15 Thread Hrvoje Niksic

Noel Koethe [EMAIL PROTECTED] writes:

 the wget 1.8.1 manpage tells me:

--progress=type
Select the type of the progress indicator you wish to
use.  Legal indicators are ``dot'' and ``bar''.

The ``dot'' indicator is used by default.  It traces
the retrieval by printing dots on the screen, each dot
representing a fixed amount of downloaded data.

 But it looks like the default is bar.

Yes.  Thanks for the report; I'm about to apply this fix.


2002-04-15  Hrvoje Niksic  [EMAIL PROTECTED]

* wget.texi (Download Options): Fix the documentation of
`--progress'.

Index: doc/wget.texi
===
RCS file: /pack/anoncvs/wget/doc/wget.texi,v
retrieving revision 1.64
diff -u -r1.64 wget.texi
--- doc/wget.texi   2002/04/13 22:44:16 1.64
+++ doc/wget.texi   2002/04/15 20:52:28
@@ -625,10 +625,15 @@
 Select the type of the progress indicator you wish to use.  Legal
 indicators are ``dot'' and ``bar''.
 
-The ``dot'' indicator is used by default.  It traces the retrieval by
-printing dots on the screen, each dot representing a fixed amount of
-downloaded data.
+The ``bar'' indicator is used by default.  It draws an ASCII progress
+bar graphics (a.k.a ``thermometer'' display) indicating the status of
+retrieval.  If the output is not a TTY, the ``dot'' bar will be used by
+default.
 
+Use @samp{--progress=dot} to switch to the ``dot'' display.  It traces
+the retrieval by printing dots on the screen, each dot representing a
+fixed amount of downloaded data.
+
 When using the dotted retrieval, you may also set the @dfn{style} by
 specifying the type as @samp{dot:@var{style}}.  Different styles assign
 different meaning to one dot.  With the @code{default} style each dot
@@ -639,11 +644,11 @@
 files---each dot represents 64K retrieved, there are eight dots in a
 cluster, and 48 dots on each line (so each line contains 3M).
 
-Specifying @samp{--progress=bar} will draw a nice ASCII progress bar
-graphics (a.k.a ``thermometer'' display) to indicate retrieval.  If the
-output is not a TTY, this option will be ignored, and Wget will revert
-to the dot indicator.  If you want to force the bar indicator, use
-@samp{--progress=bar:force}.
+Note that you can set the default style using the @code{progress}
+command in @file{.wgetrc}.  That setting may be overridden from the
+command line.  The exception is that, when the output is not a TTY, the
+``dot'' progress will be favored over ``bar''.  To force the bar output,
+use @samp{--progress=bar:force}.
 
 @item -N
 @itemx --timestamping



Re: Debian wishlist bug 21148 - wget doesn't allow selectivitybased on mime type

2002-04-10 Thread Hrvoje Niksic

I believe this is already on the todo list.  However, this is made
harder by the fact that, to implement this kind of reject, you have to
start downloading the file.  This is very different from the
filename-based rejection, where the decision can be made at a very
early point in the download process.
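
In other words, the earliest possible decision point is after the
response headers arrive; the check itself would be simple enough -- a
hypothetical sketch, not wget code:

#include <string.h>

/* Hypothetical sketch: match a parsed Content-Type value against a
   pattern such as "text/*" or "text/html".  The hard part is not this
   test but aborting the transfer cleanly after the headers have
   already been read.  */
static int
mime_acceptable (const char *content_type, const char *pattern)
{
  size_t n = strlen (pattern);
  if (n > 0 && pattern[n - 1] == '*')
    return strncmp (content_type, pattern, n - 1) == 0;
  return strcmp (content_type, pattern) == 0;
}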



Re: debian bug 32712 - wget -m sets atimet to remote mtime.

2002-04-08 Thread Hrvoje Niksic

Good point there.  I wonder... is there a legitimate reason to require
atime to be set to the mtime?  If not, we could just make the
change without the new option.  In general I'm careful not to add new
options unless they're really necessary.
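
The change itself would be small; a sketch of setting only mtime while
preserving the file's current atime (illustrative, not wget source):

#include <sys/stat.h>
#include <utime.h>

/* Sketch: apply the remote mtime but keep the file's existing atime,
   instead of setting both timestamps to the remote mtime.  */
static int
touch_mtime_only (const char *path, time_t remote_mtime)
{
  struct stat st;
  struct utimbuf times;

  if (stat (path, &st) < 0)
    return -1;
  times.actime = st.st_atime;	/* preserve access time */
  times.modtime = remote_mtime;	/* remote modification time */
  return utime (path, &times);
}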



Re: Debian bug 55145 - wget gets confused by redirects

2002-04-08 Thread Hrvoje Niksic

Guillaume Morin [EMAIL PROTECTED] writes:

 If wget fetches a url which redirects to another host, wget
 retrieves the file, and there's nothing that can be done to turn
 that off.

 So, if you do wget -r on a machine that happens to have a redirect to
 www.yahoo.com you'll wind up trying to pull down a big chunk of
 yahoo.

Hmm.  Are you sure?  Wget 1.8.1 is trying hard to restrict following
redirections by applying the same rules normally used for following
links.  Downloading a half of Yahoo! because someone redirects to
www.yahoo.com is not intended to happen.

I tried to reproduce it by creating a page that redirects to
www.yahoo.com, but Wget behaved correctly:

$ wget -r -l0 http://muc.arsdigita.com:2005/test.tcl
--19:13:53--  http://muc.arsdigita.com:2005/test.tcl
   => `muc.arsdigita.com:2005/test.tcl'
Resolving muc.arsdigita.com... done.
Connecting to muc.arsdigita.com[212.84.246.68]:2005... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.yahoo.com [following]
--19:13:53--  http://www.yahoo.com/
   => `www.yahoo.com/index.html'
Resolving www.yahoo.com... done.
Connecting to www.yahoo.com[64.58.76.223]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

[   <=>   ] 16,829        22.39K/s 

19:13:55 (22.39 KB/s) - `www.yahoo.com/index.html' saved [16829]


FINISHED --19:13:55--
Downloaded: 16,829 bytes in 1 files

Guillaume, exactly how have you reproduced the problem?



suspected bug in WGET 1.8.1

2002-04-03 Thread Matt Jackson

I'm using the NT port of WGET 1.8.1.

FTP retrieval of files works fine, retrieval of directory listings fails.
The problem happens under certain conditions when connecting to OS/2 FTP
servers.

For example, if the current directory on the FTP server at login time is
e:/abc, the command wget "ftp://userid:password@ipaddr/g:\def\test.doc"
works fine to retrieve the file, but the command wget
"ftp://userid:password@ipaddr/g:\def\" fails to retrieve the directory
listing.

For what it's worth, "g:\def/", "g:/def\" and "g:/def/" also fail.


Matt Jackson
(919) 254-4547
[EMAIL PROTECTED]




small bug in wget manpage: --progress

2002-03-11 Thread Noel Koethe

Hello,

the wget 1.8.1 manpage tells me:

   --progress=type
   Select the type of the progress indicator you wish to
   use.  Legal indicators are ``dot'' and ``bar''.

   The ``dot'' indicator is used by default.  It traces
   the retrieval by printing dots on the screen, each dot
   representing a fixed amount of downloaded data.

But it looks like the default is bar.

thx.

-- 
Noèl Köthe



Debian bug 113281 - wget doesn't wait when retrying

2002-02-06 Thread Guillaume Morin

Hi,

I am forwarding Debian bug 113281
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=113281repeatmerged=yes

It still applies to 1.8.1. I am sure it is a bug though




wget doesn't wait when retrying to connect to an FTP server.  Not sure
if
this affects HTTP downloads.

In the case shown below, I was attempting to download from a server that
had reached its user limit.  wget would retry every second or two,
eventually resulting in my system being temporarily banned from the
server
(the last attempt reflects this change).

I tried this with both `wget -w 40 url' (should be the same as `wget
--wait=40 url') and `wget --waitretry=40 url'


[mike@3po][~/download]$ wget --waitretry=40
ftp://ftp.idsoftware.com/idstuff/wolf/linux/wolfmptest-0.7.16-1.x86.run
--15:02:37-- 
ftp://ftp.idsoftware.com/idstuff/wolf/linux/wolfmptest-0.7.16-1.x86.run
   => `wolfmptest-0.7.16-1.x86.run'
Connecting to ftp.idsoftware.com:21... connected!
Logging in as anonymous ... 
The server refuses login.
Retrying.

--15:02:38-- 
ftp://ftp.idsoftware.com/idstuff/wolf/linux/wolfmptest-0.7.16-1.x86.run
  (try: 2) => `wolfmptest-0.7.16-1.x86.run'
Connecting to ftp.idsoftware.com:21... connected!
Logging in as anonymous ... 
The server refuses login.
Retrying.

--15:02:41-- 
ftp://ftp.idsoftware.com/idstuff/wolf/linux/wolfmptest-0.7.16-1.x86.run
  (try: 3) => `wolfmptest-0.7.16-1.x86.run'
Connecting to ftp.idsoftware.com:21... connected!
Logging in as anonymous ... 
The server refuses login.
Retrying.

--15:02:45-- 
ftp://ftp.idsoftware.com/idstuff/wolf/linux/wolfmptest-0.7.16-1.x86.run
  (try: 4) => `wolfmptest-0.7.16-1.x86.run'
Connecting to ftp.idsoftware.com:21... connected!
Logging in as anonymous ... 
Error in server response, closing control connection.
Retrying.



Please keep [EMAIL PROTECTED] CC'ed.

-- 
Guillaume Morin [EMAIL PROTECTED]

I am the saddest kid in grade number two
(Lisa Simpsons)



Debian wishlist bug 21148 - wget doesn't allow selectivity based on mime type

2002-02-06 Thread Guillaume Morin

Hi,

I am forwarding Debian wishlist bug 21148
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=21148repeatmerged=yes



While wget allows me to include/exclude documents based on their
extension,
it doesn't allow me to do the same based on mime type (for example,
if I only want to save text/* documents).



Please keep [EMAIL PROTECTED] CC'ed.

-- 
Guillaume Morin [EMAIL PROTECTED]

Justice is lost, Justice is raped, Justice is done. (Metallica)



Debian bug 117774 - wget returns 0 even when failing when using wildcards

2002-02-05 Thread Guillaume Morin

Hi,

I am forwarding you this bug. I can reproduce this on 1.8.1

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=117774repeatmerged=yes
---


wget seems to always return 0 as return code even when it fails, but
only 
AFAIK when using some wildcard char in the URL. For example:

spiney:~ $ wget --use-proxy=off
"ftp://this.should.be.enough.spiney.org/README?"
--14:18:18--  ftp://this.should.be.enough.spiney.org/README?
   => `.listing'
Connecting to this.should.be.enough.spiney.org:21... 
this.should.be.enough.spiney.org: Host not found
unlink: No such file or directory
spiney:~ $ echo $?
0
spiney:~ $ wget --use-proxy=off
"ftp://this.should.be.enough.spiney.org/README*"
--14:19:12--  ftp://this.should.be.enough.spiney.org/README*
   => `.listing'
Connecting to this.should.be.enough.spiney.org:21... 
this.should.be.enough.spiney.org: Host not found
unlink: No such file or directory
spiney:~ $ echo $?
0
spiney:~ $ wget --use-proxy=off
"ftp://this.should.be.enough.spiney.org/README"
--14:19:21--  ftp://this.should.be.enough.spiney.org/README
   => `README'
Connecting to this.should.be.enough.spiney.org:21... 
this.should.be.enough.spiney.org: Host not found
spiney:~ $ echo $?
1
spiney:~ $ 


-- 
Guillaume Morin [EMAIL PROTECTED]

  Marry me girl, be my only fairy to the world (RHCP)



Re: bug in wget 1.8

2001-12-17 Thread Hrvoje Niksic

Vladimir Volovich [EMAIL PROTECTED] writes:

 while downloading some file (via http) with wget 1.8, i got an error:
 
 assertion failed: p - bp->buffer <= bp->width, file progress.c, line 673
 Abort (core dumped)

Thanks for the report.  It's a known problem in 1.8, fixed by this
patch.

Index: src/progress.c
===
RCS file: /pack/anoncvs/wget/src/progress.c,v
retrieving revision 1.21
retrieving revision 1.22
diff -u -r1.21 -r1.22
--- src/progress.c  2001/12/09 01:24:40 1.21
+++ src/progress.c  2001/12/09 04:51:40 1.22
@@ -647,7 +647,7 @@
/* Hours not printed: pad with three spaces (two digits and
   colon). */
APPEND_LITERAL ("   ");
-  else if (eta_hrs >= 10)
+  else if (eta_hrs < 10)
/* Hours printed with one digit: pad with one space. */
*p++ = ' ';
   else



bug in wget rate limit feature

2001-12-10 Thread tnetdev



Hi,

 Today I downloaded the new wget release (1.8) (I'm a huge fan of the util
btw ;p ) and have been trying out the rate-limit feature.

When I run:

wget --limit-rate=20k 
http://www.planetmirror.com/pub/debian-cd/2.1_r4/i386/binary-i386-1.iso

I get a core dump with the following output

--10:16:54--
http://www.planetmirror.com/pub/debian-cd/2.1_r4/i386/binary-i386-1.iso
   = `binary-i386-1.iso.1'
Resolving twist... done.
Connecting to twist[167.123.1.1]:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 639,348,736 [application/octet-stream]

 0% [
] 64,306       19.26K/s    ETA 9:00:08 assertion "p - bp->buffer <=
bp->width" failed: file progress.c, line 673
Abort (core dumped) 





twist is our web proxy (running squid)


The funny thing is I can snarf the whole intranet using the -m and rate
limit options with no bugs at all. A huge iso though just makes it fall
over.

I'm running FreeBSD 4.1 (which until it hits ports may be a problem).

I can't test a non-proxied linux pc from here to see if the same thing
happens when grabbing an iso.




keep up the good work with wget!




Re: bug in wget rate limit feature

2001-12-10 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

 Today I downloaded the new wget release (1.8) (I'm a huge fan of the
 util btw ;p ) and have been trying out the rate-limit feature.
[...]
 assertion "p - bp->buffer <= bp->width" failed: file progress.c,
 line 673

Thanks for the report.  The bug shows with downloads whose ETA is 10
or more hours, and is trivially fixed by this patch, already applied
to the CVS:

Index: progress.c
===
RCS file: /pack/anoncvs/wget/src/progress.c,v
retrieving revision 1.21
retrieving revision 1.22
diff -u -r1.21 -r1.22
--- progress.c  2001/12/09 01:24:40 1.21
+++ progress.c  2001/12/09 04:51:40 1.22
@@ -647,7 +647,7 @@
/* Hours not printed: pad with three spaces (two digits and
   colon). */
APPEND_LITERAL ("   ");
-  else if (eta_hrs >= 10)
+  else if (eta_hrs < 10)
/* Hours printed with one digit: pad with one space. */
*p++ = ' ';
   else
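
To spell out the off-by-one: the ETA field is rendered into a
fixed-width buffer, and the padding has to shrink as the hour count
grows.  With the old `>= 10' test, two-digit hours got an extra pad
byte, the buffer ran past its declared width, and the assertion fired.
The corrected branches amount to this (context reconstructed from the
patch, not a literal quote):

if (eta_hrs == 0)
  APPEND_LITERAL ("   ");	/* no hours shown: pad three columns */
else if (eta_hrs < 10)
  *p++ = ' ';			/* one-digit hours: pad one column */
else
  ;				/* two-digit hours: no padding needed */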



Bug in wget 1.7 prev init.c: wgetrc environment var

2001-11-08 Thread Polley Christopher W

In wget 1.7 and 1.6, if the WGETRC environment variable is set but the file
specified is inaccessible, the message:

wget: (null): No such file or directory.

is displayed and the program exits with status 1.

Debugging traces the problem to the following function in init.c (ca. line
261)

/* Return the path to the user's .wgetrc.  This is either the value of
   `WGETRC' environment variable, or `$HOME/.wgetrc'.

   If the `WGETRC' variable exists but the file does not exist, the
   function will exit().  */
static char *
wgetrc_file_name (void)
{
  char *env, *home;
  char *file = NULL;

  /* Try the environment.  */
  env = getenv ("WGETRC");
  if (env && *env)
{
  if (!file_exists_p (env))
{
  fprintf (stderr, "%s: %s: %s.\n", exec_name, file, strerror
(errno));
  exit (1);
}
  return xstrdup (env);
}


where the error message is printed

Firstly, `file' is a null pointer at the time that this error message is
printed; `env' is the correct pointer to use here.  Secondly, there is no
explanation of why the program is looking for this file.

A possible fix is as follows:

278c278
<  fprintf (stderr, "%s: Unable to access WGETRC specified in
environment: %s: %s.\n", exec_name, env, strerror (errno));
---
>  fprintf (stderr, "%s: %s: %s.\n", exec_name, file, strerror
(errno));

the resultant output is now (when WGETRC is set to c:\.wgetrc, and this file
doesn't exist):

wget: Unable to access WGETRC specified in environment: c:\.wgetrc: No such
file or directory.

Note:  debugging and patching was done with version 1.6 source.  I have
upgraded my executable to 1.7 and the bug still exists, but I haven't
obtained the source code to see if there are any changes in this function
between ver 1.6 and 1.7.

Warm Regards,
Chris




Bug in wget 1.7

2001-10-03 Thread Thomas Preymesser

Hello.

I have discovered a bug in wget 1.7

When I try to get thist page: http://www.lehele.de/

this error occurs:
-
wget -d -r -l 1 www.lehele.de 
DEBUG output created by Wget 1.7 on linux-gnu.

parseurl ("www.lehele.de") -> host "www.lehele.de" -> opath "" -> dir "" ->
file "" -> ndir ""
newpath: /
Checking for www.lehele.de in host_name_address_map.
Checking for www.lehele.de in host_slave_master_map.
First time I hear about www.lehele.de by that name; looking it up.
Caching www.lehele.de -> 212.227.118.88
Checking again for www.lehele.de in host_slave_master_map.
--19:59:51--  http://www.lehele.de/
   => `www.lehele.de/index.html'
Verbindungsaufbau zu www.lehele.de:80... Found www.lehele.de in 
host_name_address_map: 212.227.118.88
Created fd 3.
verbunden!
---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.7
Host: www.lehele.de
Accept: */*
Connection: Keep-Alive

---request end---
HTTP Anforderung gesendet, auf Antwort wird gewartet... HTTP/1.1 200 OK
Date: Wed, 03 Oct 2001 18:03:52 GMT
Server: Apache/1.3.14 (Unix)
Connection: close
Content-Type: text/html


Länge: nicht spezifiziert [text/html]

0K ... @ 332.21 B/s

Closing fd 3
20:00:15 (332.21 B/s) - »www.lehele.de/index.html« gespeichert [3830]

parseurl ("www.lehele.de") -> host "www.lehele.de" -> opath "" -> dir "" ->
file "" -> ndir ""
newpath: /
Loaded www.lehele.de/index.html (size 3830).
Speicherzugriffsfehler (core dumped)


The file index.html is saved, and complete, in the directory www.lehele.de.
If I call wget without recursion then everything is OK, but when I try
to go deeper wget crashes.

-Thomas






Re: Size bug in wget-1.7

2001-08-21 Thread Ian Abbott

On 17 Aug 2001, at 11:41, Dave Turner wrote:

 On Fri, 17 Aug 2001, Dave Turner wrote:
 
  By way of a hack I have used the SIZE command, not supported by RFC959 but
  still accepted by many of the servers I use, to get the size of the file.
  If that fails then it falls back on the old method.  The patch is
  attached, in what I hope is an acceptable format.
 
 Guess who forgot the attachment?  Sorry!

Nice patch. I think it can be improved by only sending the SIZE command 
if the file already exists (has a non-zero size, i.e. restval parameter 
in function getftp is non-zero).

I have attached a slightly modified version of Dave Turner 
[EMAIL PROTECTED]'s patch to add the non-zero restval test, 
remove an unused variable 's' from function ftp_size, and I have 
created the patch using the command cvs diff -uR to make it relative to 
the current CVS sources for wget-1.7.1-pre1.

Here is a ChangeLog entry for it:

2001-08-21  Dave Turner [EMAIL PROTECTED]

* ftp-basic.c (ftp_size): New function to send non-standard SIZE
  command to server to request file size.
* ftp.h (ftp_size): Export it.
* ftp.c (getftp): Use new ftp_size function if restoring
  transfer of a file with unknown size.
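
The attachment itself isn't reproduced in this archive, but the core
of an ftp_size-style helper is small; a simplified sketch using stdio
on the control connection (the real patch goes through wget's
ftp-basic.c request/response helpers):

#include <stdio.h>

/* Simplified sketch, not the actual patch: ask the server for the
   file's size with the non-standard SIZE command and parse the
   "213 <bytes>" reply.  Returns -1 if SIZE is unsupported or fails. */
static long
ftp_size_sketch (FILE *ctrl, const char *file)
{
  char reply[256];
  long size;

  fprintf (ctrl, "SIZE %s\r\n", file);
  fflush (ctrl);
  if (!fgets (reply, sizeof reply, ctrl))
    return -1;
  if (sscanf (reply, "213 %ld", &size) != 1)
    return -1;
  return size;
}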



 wget-1.7.1-pre1-size-fix.patch


Size bug in wget-1.7

2001-08-15 Thread Dave Turner

Not sure if this is wget's fault or a broken server, but it happens on a
lot of servers so maybe it should be handled better.

The bug seems to manifest itself when resuming an FTP transfer and the
length is unauthoritative.  The reported total length is in fact the
remaining length (i.e. the total length minus the length downloaded); the
reported remaining length is the total length minus twice the length
downloaded, which goes negative once you've downloaded 50% of the file!
For example, the actual size of kdebase-2.2.tar.bz2 is 10917131 bytes, and
this is what my log says when the transfer was resumed at 56%:

--11:38:44--  
ftp://ftp.sourceforge.net/pub/mirrors/kde/stable/2.2/src/kdebase-2.2.tar.bz2
   => `kdebase-2.2.tar.bz2'
Connecting to ftp.sourceforge.net:21... connected!
Logging in as anonymous ... Logged in!
==> SYST ... done.  ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/mirrors/kde/stable/2.2/src ... done.
==> PORT ... done.  ==> REST 6131968 ... done.
==> RETR kdebase-2.2.tar.bz2 ... done.
Length: 4,785,163 [-1,346,805 to go] (unauthoritative)

  [ skipping 5760K ]
 5760K   131% @   2.07 KB/s

As an artefact of this bug, the percentage downloaded is also incorrect
(also shown here)

Yours,

Dave Turner
[EMAIL PROTECTED]




bug in wget-1.7/doc/Makefile.in

2001-07-26 Thread Trapp, Michael

hi,

I guess there is a bug in the Makefile.in of the doc directory:
wget.1 can't be found if the --srcdir option is used ...

regards
michael

8-

$ diff doc/Makefile.in doc/Makefile.in_new
118c118
<    $(INSTALL_DATA) $(srcdir)/$(MAN)
$(DESTDIR)$(mandir)/man$(manext)/$(MAN)
---
>    $(INSTALL_DATA) $(MAN) $(DESTDIR)$(mandir)/man$(manext)/$(MAN)




Serious bug in Wget/1.6

2001-07-16 Thread Boris 'pi' Piwinger

Hi!

I found the following in the log file of piology.org:
202.108.68.179 - - [15/Jul/2001:10:50:19 +0200] "GET /3.14/ HTTP/1.0"
404 2332 "http://www.go2net.com/useless/useless/pi.html" "Wget/1.6"
202.108.68.179 - - [15/Jul/2001:12:49:38 +0200] "GET /elmi/ HTTP/1.0"
404 2316 "http://piology.org/photo.html" "Wget/1.6"
202.108.68.179 - - [15/Jul/2001:13:08:45 +0200] "GET
/cgi-bin/events.cgi?act=1&Event=3 HTTP/1.0" 404 2347
"http://piology.org/photo.html" "Wget/1.6"

If you look at the referrers you see that none of this is true (all
those pages haven't changed recently).

I guess the problem is that wget ignores virtual hosts. If it notices
that two machines have the same IP, it changes absolute links to
relative ones on the presumed host.

pi