wget 1.8.2 configuration problem on Solaris 7 with GCC 2.95.3

2003-01-20 Thread Paul Eggert
Here are the symptoms during the build of wget 1.8.2 on Solaris 7
with GCC 2.95.3:

gcc -I. -I. -I/usr/local/ssl/include   -DHAVE_CONFIG_H
-DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2
-Wall -Wno-implicit -c ftp.c
In file included from /usr/local/ssl/include/openssl/bio.h:65,
 from /usr/local/ssl/include/openssl/ssl.h:119,
 from rbuf.h:34,
 from ftp.c:50:
/usr/local/lib/gcc-lib/sparc-sun-solaris2.7/2.95.3/include/stdarg.h:170: 
warning: redefinition of `va_list'
/usr/include/stdio.h:118: warning: `va_list' previously declared here

The problem is that Solaris 7 stdio.h plays funny games if you define
_XOPEN_SOURCE: it adds a typedef for va_list in that situation.  The
simplest workaround is to not define _XOPEN_SOURCE on Solaris, as it
isn't needed.

More generally, wget should probably be using AC_CHECK_DECLS instead
of AC_CHECK_FUNC to determine whether a function is declared.  That
way wget shouldn't have to tweak the namespace at all; it could just
adapt to the namespace that configure gives it.  But that's a longer
story.
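
As a rough illustration of the AC_CHECK_DECLS idea, the sketch below shows how C code might adapt to a declaration check instead of tweaking the namespace.  AC_CHECK_DECLS defines HAVE_DECL_<name> to 1 or 0; the choice of strptime, the fallback value, and the helper function are assumptions made only for this example and are not taken from wget.

```c
#include <time.h>

/* Normally config.h, generated by configure after AC_CHECK_DECLS([strptime]),
   would define this to 1 or 0.  The fallback below is an assumption so that
   the sketch compiles on its own.  */
#ifndef HAVE_DECL_STRPTIME
# define HAVE_DECL_STRPTIME 1
#endif

#if !HAVE_DECL_STRPTIME
/* The headers did not declare the function in the default namespace,
   so supply the declaration instead of defining _XOPEN_SOURCE.  */
char *strptime (const char *, const char *, struct tm *);
#endif

/* Hypothetical helper: report which path was taken, for illustration.  */
int strptime_declared_by_headers (void)
{
  return HAVE_DECL_STRPTIME;
}
```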

Here is a patch only for the simple problem.

2003-01-20  Paul Eggert  eggert@whale

* src/config.h.in (NAMESPACE_TWEAKS): Remove.
(__EXTENSIONS__): Define only for Solaris.
(_XOPEN_SOURCE, _SVID_SOURCE, _BSD_SOURCE): Define only for Linux.

===
RCS file: src/RCS/config.h.in,v
retrieving revision 1.8.2.0
retrieving revision 1.8.2.1
diff -pu -r1.8.2.0 -r1.8.2.1
--- src/config.h.in 2002/05/18 03:05:14 1.8.2.0
+++ src/config.h.in 2003/01/20 21:50:40 1.8.2.1
@@ -254,30 +254,19 @@ char *alloca ();
Because of that, we define them only on architectures we know
about.  */
 
-#undef NAMESPACE_TWEAKS
-
 #ifdef solaris
-# define NAMESPACE_TWEAKS
+/* Request Solaris extensions, even if the compiler options ask for
+   ANSI C headers.  */
+# define __EXTENSIONS__
 #endif
 
 #ifdef __linux__
-# define NAMESPACE_TWEAKS
-#endif
-
-#ifdef NAMESPACE_TWEAKS
-
-/* Request the Unix 98 compilation environment. */
-#define _XOPEN_SOURCE 500
+/* Request the Unix 98 compilation environment.  */
+# define _XOPEN_SOURCE 500
 
-/* For Solaris: request everything else that is available and doesn't
-   conflict with the above.  */
-#define __EXTENSIONS__
-
-/* For Linux: request features of 4.3BSD and SVID (System V Interface
-   Definition). */
-#define _SVID_SOURCE
-#define _BSD_SOURCE
-
-#endif /* NAMESPACE_TWEAKS */
+/* Request features of 4.3BSD and SVID (System V Interface Definition).  */
+# define _SVID_SOURCE
+# define _BSD_SOURCE
+#endif
 
 #endif /* CONFIG_H */




Re: wget 1.6 problems with FTP globbing through a Squid firewall

2001-05-31 Thread Paul Eggert

> From: Hrvoje Niksic [EMAIL PROTECTED]
> Date: 31 May 2001 11:40:26 +0200
>
> Paul Eggert [EMAIL PROTECTED] writes:
>
> > I'm using wget 1.6 on Solaris 8 (sparc), and am connected to the
> > Internet via a Squid 2.3.STABLE4 proxy server on a host named
> > 'firewall'.  I can access single files OK, but I can't use FTP
> > globbing.
>
> That's true.  You need to use something like:
>
> wget -rl1 ftp://... -A'glob-pattern-here'

Thanks, I didn't know about that method.  That helps, but it's still a
bit awkward, as it has the following problems:

* A command like wget -rl1 'ftp://elsie.nci.nih.gov/pub/' -A'tz*.tar.gz'
  is less natural than wget 'ftp://elsie.nci.nih.gov/pub/tz*.tar.gz'.

* The -A variant retrieves an index.html file that I don't want.

* The -A variant puts the files into a subdirectory, which I don't want.

Presumably the second and third problems can be fixed, but at the expense
of making the first problem worse.


This problem originally arose because I wanted to give people the following
instructions on a web page:

wget 'ftp://elsie.nci.nih.gov/pub/tz*.tar.gz'
gzip -dc tzcode*.tar.gz | tar -xf -
gzip -dc tzdata*.tar.gz | tar -xf -

I want to keep these instructions simple.  It would be nicer if I
didn't have to tell people to use a longer-winded command that will
work regardless of whether they're behind a Squid proxy.


> > Looking at the code, I don't see a trivial fix.  If you don't see a
> > fix either, perhaps the limitation should be documented, and 'wget'
> > should refuse to attempt to access globbed files via an FTP proxy; I
> > think this would be better than its current behavior, where it
> > silently mishandles globbing.
>
> Yes, but it leads to potential problems with accessing files that are
> named '*' or such.

Sorry, I don't understand this point.  I thought that (in principle,
at least) wget should interpret file name globbing consistently,
regardless of whether it is using a proxy.

In other words, if I do not use a proxy and if I type the command

wget 'ftp://elsie.nci.nih.gov/pub/tz*.tar.gz'

and if there happen to be two files on the server, named
'pub/tzFOO.tar.gz' and 'pub/tz*.tar.gz', then I assume wget will
retrieve both files.  Shouldn't the same thing also occur if I do use
a proxy?


(Please understand that I am not complaining -- I'm just trying to
 help wget get better.)



wget 1.6 problems with FTP globbing through a Squid firewall

2001-05-30 Thread Paul Eggert

I'm using wget 1.6 on Solaris 8 (sparc), and am connected to the
Internet via a Squid 2.3.STABLE4 proxy server on a host named
'firewall'.  I can access single files OK, but I can't use FTP globbing.

Looking at the code, I don't see a trivial fix.  If you don't see a
fix either, perhaps the limitation should be documented, and 'wget'
should refuse to attempt to access globbed files via an FTP proxy; I
think this would be better than its current behavior, where it
silently mishandles globbing.
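
A minimal sketch of the refusal suggested above, assuming a simple wildcard test.  The function names and the wildcard character set are hypothetical and are not wget's actual code.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if PATH contains a shell-style wildcard character.
   The character set "*?[]" is an assumption for this sketch.  */
bool path_has_glob_chars (const char *path)
{
  return strpbrk (path, "*?[]") != NULL;
}

/* Return true if the retrieval should be refused: the path needs
   globbing, but the request would go through a proxy that cannot
   expand the pattern.  */
bool should_refuse_proxied_glob (const char *path, bool via_proxy)
{
  return via_proxy && path_has_glob_chars (path);
}
```

With this check, 'pub/tz*.tar.gz' would be refused when a proxy is in use, while 'pub/pi.shar.gz' would be passed through unchanged.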

A scenario is enclosed below, showing an unsuccessful (globbed)
retrieval, followed by a successful (non-globbed) retrieval.

$ wget -d --passive-ftp 'ftp://elsie.nci.nih.gov/pub/tz*.tar.gz'
DEBUG output created by Wget 1.6 on solaris2.8.

parseurl ("ftp://elsie.nci.nih.gov/pub/tz*.tar.gz") -> host elsie.nci.nih.gov -> opath
pub/tz*.tar.gz -> dir pub -> file tz*.tar.gz -> ndir pub
newpath: /pub/tz*.tar.gz
parseurl ("http://firewall:3128/") -> host firewall -> port 3128 -> opath  -> dir  ->
file  -> ndir
newpath: /
--11:32:10--  ftp://elsie.nci.nih.gov/pub/tz*.tar.gz
           => `tz*.tar.gz'
Connecting to firewall:3128... Created fd 4.
connected!
---request begin---
GET ftp://elsie.nci.nih.gov/pub/tz*.tar.gz HTTP/1.0
User-Agent: Wget/1.6
Host: elsie.nci.nih.gov:21
Accept: */*

---request end---
Proxy request sent, awaiting response... HTTP/1.0 404 Not Found
Server: Squid/2.3.STABLE4
Mime-Version: 1.0
Date: Wed, 30 May 2001 18:32:13 GMT
Content-Type: text/html
Content-Length: 1049
Expires: Wed, 30 May 2001 18:32:13 GMT
X-Squid-Error: ERR_FTP_NOT_FOUND 0
X-Cache: MISS from alioth.twinsun.com
Proxy-Connection: close


Closing fd 4
11:32:13 ERROR 404: Not Found.


$ wget -d --passive-ftp 'ftp://elsie.nci.nih.gov/pub/pi.shar.gz'
DEBUG output created by Wget 1.6 on solaris2.8.

parseurl ("ftp://elsie.nci.nih.gov/pub/pi.shar.gz") -> host elsie.nci.nih.gov -> opath
pub/pi.shar.gz -> dir pub -> file pi.shar.gz -> ndir pub
newpath: /pub/pi.shar.gz
parseurl ("http://firewall:3128/") -> host firewall -> port 3128 -> opath  -> dir  ->
file  -> ndir
newpath: /
--11:31:57--  ftp://elsie.nci.nih.gov/pub/pi.shar.gz
           => `pi.shar.gz'
Connecting to firewall:3128... Created fd 4.
connected!
---request begin---
GET ftp://elsie.nci.nih.gov/pub/pi.shar.gz HTTP/1.0
User-Agent: Wget/1.6
Host: elsie.nci.nih.gov:21
Accept: */*

---request end---
Proxy request sent, awaiting response... HTTP/1.0 200 OK
Server: Squid/2.3.STABLE4
Mime-Version: 1.0
Date: Wed, 30 May 2001 18:08:07 GMT
Content-Type: application/x-shar
Content-Length: 3073
Last-Modified: Wed, 09 Mar 1994 13:37:48 GMT
Content-Encoding: gzip
Age: 1430
X-Cache: HIT from alioth.twinsun.com
Proxy-Connection: close


Length: 3,073 [application/x-shar]

0K - ...[100%]

Closing fd 4
11:31:57 (1.47 MB/s) - `pi.shar.gz' saved [3073/3073]




wget 1.6 inconveniences with FTP access through a FWTK firewall

2001-05-30 Thread Paul Eggert

I'm using wget 1.6 on Solaris 8 (sparc), and am connected to the
Internet via a FWTK FTP proxy <http://www.fwtk.org/main.html>.

If I want to retrieve a file via the standard Solaris 'ftp' command,
without using 'wget', I do something like this:

$ ftp firewall
Connected to alioth.twinsun.com.
220 alioth FTP proxy (Version V2.1) ready.
Name (firewall:eggert): [EMAIL PROTECTED]
331-(GATEWAY CONNECTED TO elsie.nci.nih.gov)
331-(220 elsie.nci.nih.gov FTP server (Version wu-2.6.0(1) Thu Apr 27 22:04:37 
EDT 2000) ready.)
331 Guest login ok, send your complete e-mail address as password.
Password:[EMAIL PROTECTED]

230 Guest login ok, access restrictions apply.
ftp> bin
200 Type set to I.
ftp> cd pub
250 CWD command successful.
ftp> get pi.shar.gz
200 PORT command successful.
150 Opening BINARY mode data connection for pi.shar.gz (3073 bytes).
226 Transfer complete.
local: pi.shar.gz remote: pi.shar.gz
3073 bytes received in 0.61 seconds (4.92 Kbytes/s)
ftp> quit
221-(221-You have transferred 3073 bytes in 1 files.)
221-(221-Total traffic for this session was 3591 bytes in 1 transfers.)
221-(221-Thank you for using the FTP service on elsie.nci.nih.gov.)
221 Goodbye.

If I want to use wget to grab the same file, I have to do something
like this:

$ wget ftp://anonymous%40elsie.nci.nih.gov@firewall/pub/pi.shar.gz
--12:04:24--  ftp://anonymous%40elsie.nci.nih.gov@firewall/pub/pi.shar.gz
           => `pi.shar.gz'
Connecting to firewall:21... connected!
Logging in as [EMAIL PROTECTED] ... Logged in!
==> TYPE I ... done.  ==> CWD pub ... done.
==> PORT ... done.    ==> RETR pi.shar.gz ... done.
Length: 3,073 (unauthoritative)

0K - ...[100%]

12:04:26 (5.45 KB/s) - `pi.shar.gz' saved [3073]

The latter command is less convenient than the former, which removes
some of the advantages of wget.  It would be nicer if I could set an
environment variable or something so that wget users could use URLs
like <ftp://ftp.gnu.org/> instead of
<ftp://anonymous%40ftp.gnu.org@firewall/>.

If I set the ftp_proxy environment variable to be
<http://firewall:3128/>, that causes the above example to work (as
Squid is also running on the same firewall), but as I mentioned in my
previous message Squid doesn't work with globbing, whereas the FWTK
FTP firewall does work with globbing.  It would be nice if 'wget'
would conveniently support FWTK as well as Squid.

Perhaps if ftp_proxy is an FTP URL, wget should assume a FWTK-style
proxy?  (Currently it rejects such a setting.)
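
To make the FWTK convention concrete: as the sessions above show, the proxy expects a login name of the form user@remote-host.  The sketch below builds such a name; the function name and its error convention are hypothetical, and it is only an illustration of what wget might do internally if it supported this proxy style.

```c
#include <stdio.h>

/* Build the "user@host" login name that a FWTK-style FTP proxy expects.
   Returns 0 on success, or -1 if BUF (of SIZE bytes) is too small.  */
int fwtk_login_name (char *buf, size_t size,
                     const char *user, const char *host)
{
  int n = snprintf (buf, size, "%s@%s", user, host);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

For example, user 'anonymous' and host 'elsie.nci.nih.gov' would yield the login name that the URL above has to spell out by hand as anonymous%40elsie.nci.nih.gov.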



wget 1.6 porting problem with snprintf and isdigit on Solaris 2.5.1

2001-04-08 Thread Paul Eggert

When building wget 1.6 on Solaris 2.5.1 with GCC 2.95.3, I ran
into the following porting problem.

snprintf.c: In function `dopr':
snprintf.c:230: warning: subscript has type `char'
snprintf.c:254: warning: subscript has type `char'

This is warning that isdigit doesn't work on negative characters
(which are possible on hosts where characters are signed).
Here is a patch.

2001-04-07  Paul Eggert  [EMAIL PROTECTED]

* snprintf.c (is_digit): New macro.
(dopr): Use it instead of isdigit, to avoid problems with negative
characters and pacify GCC.

===
RCS file: src/snprintf.c,v
retrieving revision 1.6
retrieving revision 1.6.0.1
diff -pu -r1.6 -r1.6.0.1
--- src/snprintf.c  2000/11/04 22:49:46 1.6
+++ src/snprintf.c  2001/04/08 06:33:06 1.6.0.1
@@ -160,6 +160,7 @@ static int dopr_outch (char *buffer, siz
 #define DP_C_LLONG   3
 #define DP_C_LDOUBLE 4
 
+#define is_digit(c) ('0' <= (c) && (c) <= '9')
 #define char_to_int(p) (p - '0')
 #define MAX(p,q) ((p >= q) ? p : q)
 #define MIN(p,q) ((p <= q) ? p : q)
@@ -227,7 +228,7 @@ static int dopr (char *buffer, size_t ma
   }
   break;
 case DP_S_MIN:
-  if (isdigit(ch)) 
+  if (is_digit (ch)) 
   {
min = 10*min + char_to_int (ch);
ch = *format++;
@@ -251,7 +252,7 @@ static int dopr (char *buffer, size_t ma
state = DP_S_MOD;
   break;
 case DP_S_MAX:
-  if (isdigit(ch)) 
+  if (is_digit (ch)) 
   {
if (max < 0)
  max = 0;
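
A small standalone sketch of how the macro behaves when parsing a field width, loosely modeled on the DP_S_MIN case in the patch.  The parse_width function is hypothetical and does no overflow checking; it only shows that the range test is safe for characters with negative values.

```c
#define is_digit(c) ('0' <= (c) && (c) <= '9')
#define char_to_int(c) ((c) - '0')

/* Parse the leading decimal digits of S and return their value.
   *END is set to the first character that is not an ASCII digit.
   Unlike isdigit, the range test needs no cast when char is signed.  */
int parse_width (const char *s, const char **end)
{
  int width = 0;
  while (is_digit (*s))
    width = 10 * width + char_to_int (*s++);
  *end = s;
  return width;
}
```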



Re: wget 1.6 porting problem with snprintf and isdigit on Solaris 2.5.1

2001-04-08 Thread Paul Eggert

> Date: Sun, 8 Apr 2001 12:05:35 +0200
> From: Jan Prikryl [EMAIL PROTECTED]
>
> Wouldn't just an explicit type cast to `(unsigned char)ch' suffice?

That would work for now, but it won't work if wget got properly
internationalized.  That is because isdigit(x) succeeds for non-ASCII
digits in some locales.  Some locales have multiple ways to represent
the decimal digits, and some locales even have non-decimal digits.

It's best to use isdigit only when one wants _all_ the characters that
are digits, not just '0' through '9'.  If you just want '0' through
'9', then you should use the test '0' <= x && x <= '9'; this code is
guaranteed to work in all locales.
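
The two alternatives can be put side by side as follows.  This is only a sketch with hypothetical function names: the first version uses the suggested cast, the second uses the range test.

```c
#include <ctype.h>

/* Cast variant: the cast keeps a negative char from being passed to
   isdigit, but the answer is still whatever the C library and the
   current locale consider a digit.  */
int locale_digit (char ch)
{
  return isdigit ((unsigned char) ch) != 0;
}

/* Range variant: true exactly for '0' through '9', which C guarantees
   to be contiguous, regardless of locale.  */
int ascii_digit (char ch)
{
  return '0' <= ch && ch <= '9';
}
```

In the default "C" locale the two agree on ordinary ASCII input; they can differ only where the library treats additional characters as digits.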



Re: wget 1.6 porting problem with snprintf and isdigit on Solaris 2.5.1

2001-04-08 Thread Paul Eggert

> From: Hrvoje Niksic [EMAIL PROTECTED]
> Date: 08 Apr 2001 23:06:32 +0200
>
> In the general case (other is* macros), such hacked-up code is
> probably slower than table lookups.

For the special case of isdigit, '0'<=x && x<='9' is usually faster
than table lookups.  Decent compilers like GCC optimize that
expression into the equivalent of (x - '0') <= 9u, and subtracting a
constant like '0' is typically faster than table lookup.
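
The equivalence can be checked directly.  In this sketch (function names are hypothetical) the subtraction is done in unsigned arithmetic, so values below '0' wrap around to large numbers and fail the single comparison.

```c
/* Two-comparison form of the digit test.  */
int is_digit_range (int x)
{
  return '0' <= x && x <= '9';
}

/* Single-comparison form: unsigned subtraction wraps for x < '0',
   so one test against 9u covers both bounds.  */
int is_digit_sub (int x)
{
  return (unsigned) x - '0' <= 9u;
}
```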

> It will still work, because Wget doesn't call setlocale() with
> LC_CTYPE.

Yes, Wget doesn't now (on decent hosts), but I thought it conceivable
that it might in the future.  Also, on hosts without LC_MESSAGES, Wget
1.6 invokes setlocale with LC_ALL, which in turn affects LC_CTYPE.
So I thought it safer (as well as faster) for Wget to use '0'<=x && x<='9'.