Missing asprintf()

2008-09-09 Thread Gisle Vanem

Why the need for asprintf() in url.c:903? This function is
missing on DOS/Win32 and nowhere to be found in ./lib.

I suggest we replace with this:

--- hg-latest/src/url.c  Tue Sep 09 12:37:23 2008
+++ url.c   Tue Sep 09 13:01:33 2008
@@ -893,16 +893,18 @@

   if (error_code == PE_UNSUPPORTED_SCHEME)
     {
-      char *error, *p;
+      char *p;
       char *scheme = xstrdup (url);
+      static char error[100];
+
       assert (url_has_scheme (url));

       if ((p = strchr (scheme, ':')))
         *p = '\0';
       if (!strcasecmp (scheme, "https"))
-        asprintf (&error, _("HTTPS support not compiled in"));
+        sprintf (error, _("HTTPS support not compiled in"));
       else
-        asprintf (&error, _(parse_errors[error_code]), quote (scheme));
+        sprintf (error, _(parse_errors[error_code]), quote (scheme));
       xfree (scheme);

       return error;

---

Here 'error' is guaranteed to be big enough.

--gv


Where is program_name?

2008-09-09 Thread Gisle Vanem
'program_name' is used in lib/error.c, but it is not allocated 
anywhere. Should it be added to main.c and initialised to exec_name?


--gv


Re: Missing asprintf()

2008-09-09 Thread Gisle Vanem

Hrvoje Niksic [EMAIL PROTECTED] wrote:


Wget is supposed to use aprintf, which is defined in utils.c, and is
not specific to Unix.

It's preferable to use an asprintf-like function rather than a static
buffer because it supports reentrancy (unlike a static buffer) and
imposes no arbitrary limits on error output.


Fine by me. Here is an adjusted patch:

--- hg-latest/src/url.c  Tue Sep 09 12:37:23 2008
+++ url.c   Tue Sep 09 14:37:39 2008
@@ -900,9 +900,9 @@
       if ((p = strchr (scheme, ':')))
         *p = '\0';
       if (!strcasecmp (scheme, "https"))
-        asprintf (&error, _("HTTPS support not compiled in"));
+        error = aprintf (_("HTTPS support not compiled in"));
       else
-        asprintf (&error, _(parse_errors[error_code]), quote (scheme));
+        error = aprintf (_(parse_errors[error_code]), quote (scheme));
       xfree (scheme);

       return error;

-

--gv
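
For readers without utils.c at hand, here is a minimal sketch of what an
aprintf-style helper can look like. It is not the function from utils.c:
the name my_aprintf is hypothetical, and the sketch assumes a C99-conforming
vsnprintf that reports the required length when given a NULL buffer (older
Windows C runtimes do not behave that way).

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an aprintf-like helper: returns a freshly
   malloc'ed string holding the formatted output, or NULL on failure.
   The caller owns the result and must free it.  */
char *
my_aprintf (const char *fmt, ...)
{
  va_list args;
  int needed;
  char *buf;

  /* First pass: ask vsnprintf how many bytes the output needs
     (assumes C99 semantics).  */
  va_start (args, fmt);
  needed = vsnprintf (NULL, 0, fmt, args);
  va_end (args);
  if (needed < 0)
    return NULL;

  buf = malloc ((size_t) needed + 1);
  if (!buf)
    return NULL;

  /* Second pass: format into the exactly sized buffer.  */
  va_start (args, fmt);
  vsnprintf (buf, (size_t) needed + 1, fmt, args);
  va_end (args);
  return buf;
}
```

Because each call returns its own buffer, two error strings can be alive at
the same time, which a single static char error[100] cannot offer.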


Re: Where is program_name?

2008-09-09 Thread Gisle Vanem

Google for that and you will find the corresponding man page. Like it's
written here 
http://www.tin.org/bin/man.cgi?section=3&topic=PROGRAM_INVOCATION_NAME
These variables are automatically initialised by the glibc run-time
startup code.


I'm on Windows. So glibc is of no help here.

--gv


test

2007-10-05 Thread Gisle Vanem

A loop-test; trouble with my subscription.

--gv


Wget stuck after HEAD

2007-08-23 Thread Gisle Vanem

Recently I've been having problems downloading from some sites over HTTP.
E.g.:

wget -d http://lynx.isc.org/current/lynx2.8.7dev.7.tar.bz2
DEBUG output created by Wget 1.10+devel on Windows-MinGW.

--14:42:04--  http://lynx.isc.org/current/lynx2.8.7dev.7.tar.bz2
Resolving lynx.isc.org... seconds 0.00, 204.152.184.112
Caching lynx.isc.org = 204.152.184.112
Connecting to lynx.isc.org|204.152.184.112|:80... seconds 0.00, connected.
Created socket 1952.
Releasing 0x009d3280 (new refcount 1).

---request begin---
HEAD /current/lynx2.8.7dev.7.tar.bz2 HTTP/1.0
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; Windows 9x 4.0)
Accept: */*
Host: lynx.isc.org
Connection: Keep-Alive
From: Donald Duck [EMAIL PROTECTED]

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Thu, 23 Aug 2007 15:42:04 GMT
Server: Apache/2.0.59 (FreeBSD)
Last-Modified: Fri, 03 Aug 2007 00:04:46 GMT
ETag: 1e6f1c3-2384e0-4e880380
Accept-Ranges: bytes
Content-Length: 2327776
Connection: close
Content-Type: application/x-bzip2

---response end---
200 OK
Length: 2327776 (2.2M) [application/x-bzip2]
Closed fd 1952
14:42:04 (0.00 B/s) - Connection closed at byte 0. Retrying.



The same happens on the 2nd, 3rd, ... retry.
It seems there's an issue with the HEAD request. Maybe some
servers don't like it? Can anybody on Win32 try the above URL?

All this is with a fresh SVN checkout, built with MinGW.

--gv



Re: Wget stuck after HEAD

2007-08-23 Thread Gisle Vanem

Gisle Vanem [EMAIL PROTECTED] wrote:


It seems there's an issue with the HEAD request. Maybe some
servers don't like it? Can anybody on Win32 try the above URL?


It seems to be connected with this change:

2007-07-04  Mauro Tortonesi  [EMAIL PROTECTED]

   * http.c (http_loop): Skip HEAD request and start immediately with GET
   if -O is given.

since if I use -Ofile, everything works as usual.

--gv


Re: Wget stuck after HEAD

2007-08-23 Thread Gisle Vanem

Micah Cowan [EMAIL PROTECTED] wrote:


My strong suspicion is that you're using the wrong repository?


Spot on!


I believe I asked dotsrc to remove the old one, I'll ping them again on
that.



Please do.

--gv



login incorrect

2006-06-26 Thread Gisle Vanem

Consider this command and output:


wget -d ftp://ftp.openssl.org/snapshot/openssl-SNAP-20060626.tar.gz

DEBUG output created by Wget 1.11-alpha-1 on Windows-MinGW.

--14:06:04--  ftp://ftp.openssl.org/snapshot/openssl-SNAP-20060626.tar.gz
  = `openssl-SNAP-20060626.tar.gz'
Resolving ftp.openssl.org... seconds 0.00, 195.30.6.166
Caching ftp.openssl.org = 195.30.6.166
Connecting to ftp.openssl.org|195.30.6.166|:21... seconds 0.00, connected.
Created socket 1948.
Releasing 0x009d3280 (new refcount 1).
Logging in as anonymous ... 220 ftp.openssl.org FTP Server (ProFTPD) ready.

-- USER anonymous

331 Anonymous login ok, send your complete email address as your password.

-- PASS [EMAIL PROTECTED]

530 Sorry, max 20 users allowed -- try again later, please.

Login incorrect.
Closed fd 1948



It's rather misleading that wget prints 'Login incorrect' here. Why couldn't
it just print the 530 message?


wget -V

GNU Wget 1.11-alpha-1

--gv


Re: Trying to use WGET as a poor mans proxy

2006-01-30 Thread Gisle Vanem

Bruso, John [EMAIL PROTECTED] wrote:


I'm trying to get wget to go fetch a url like this:

http://voap.weather.com/weather/oap/82801?template=GENXV&par=null&unit=0&key=63c4c2fa0cd55c5f42d6e20a7e56586f


but, WGET isn't recognizing some of the parameters in the URL.

Is it possible for WGET to grab a URL like this?


You ought to know that '&' is interpreted as a command separator
under cmd. Use wget "url..." to fix.

--gv


Re: Unifying Windows Makefiles

2005-07-08 Thread Gisle Vanem

Hrvoje Niksic wrote:


3. Add the redundant #ifdefs to all compilation-dependent C files to
  make sure that they are ignored when the libraries they require are
  missing.  For example, openssl.c should be wrapped in #ifdef
  HAVE_OPENSSL. 


So should http-ntlm.c. But using GNU make it's pretty easy to avoid;
I use my homebrew makefile with things like:

USE_OPENSSL = 1
USE_GNUTLS = 1
...
ifeq ($(USE_OPENSSL),1)
 CFLAGS  += -DHAVE_OPENSSL -I$(OPENSSL_ROOT)/outinc
 SOURCE   = openssl.c http-ntlm.c
endif

SOURCE +=  cmpt.c connect.c ...

If you adopt this style, I urge you to reconsider the #undef HAVE_OPENSSL in
config.h. Instructing users (via windows/README) to change USE_OPENSSL in
windows/Makefile is so much easier/cleaner.



5. Create windows/Makefile-$compiler for each $compiler we support,
  which contains the necessary CFLAGS, LDFLAGS, etc.  configure.bat
  --$compiler would copy both windows/Makefile and
  windows/Makefile-$compiler to src.


I think a single windows/makefile is enough. Instruct make to put objects for each 
target into separate sub-dirs. Along the lines of:


MSVC_OBJECTS = $(addprefix MSVC_obj/, $(SOURCE))
MINGW_OBJECTS = $(addprefix MingW_obj/, $(SOURCE))

MSVC_obj/%.obj: %.c
	cl -c $(MSVC_CFLAGS) -Fo$@ $<

MingW_obj/%.obj: %.c
	gcc -c $(MINGW_CFLAGS) -o $@ $<

---

And issuing "make -f ../windows/Makefile msvc" would AFAICS remove the
need for a configure.bat file (add the rules for ./doc to the same
windows/Makefile).

--gv


Re: Unifying Windows Makefiles

2005-07-08 Thread Gisle Vanem

MSVC_OBJECTS = $(addprefix MSVC_obj/, $(SOURCE))
MINGW_OBJECTS = $(addprefix MingW_obj/, $(SOURCE))


Should of course be:
MSVC_OBJECTS = $(addprefix MSVC_obj/, $(SOURCE:.c=.obj))
MINGW_OBJECTS = $(addprefix MingW_obj/, $(SOURCE:.c=.o))


MingW_obj/%.obj: %.c
	gcc -c $(MINGW_CFLAGS) -o $@ $<


And:
MingW_obj/%.o: %.c
	gcc -c $(MINGW_CFLAGS) -o $@ $<

--gv


Re: Unifying Windows Makefiles

2005-07-08 Thread Gisle Vanem

Hrvoje Niksic wrote:

Wouldn't you need to have separate targets for linking as well?  


Sure. That target would simply depend on $(MSVC_OBJECTS) etc.:

wget-msvc.exe: $(MSVC_OBJECTS)
   link $(MSVC_LDFLAGS) -out:$@ $^ $(MSVC_EXT_LIBS)

Possibly with an extra mv -f $@ $(INSTALL_DIR)/wget.exe.


And
how would you handle the distinction between compilation flags,
optimization flags, link flags, and so on?


Not sure what you mean; I imagine windows/Makefile having sections
with
 MSVC_CFLAGS =  -nologo -MT -W2 -I../windows ...

...
 MINGW_CFLAGS = -Wall -O2 -I../windows ...

same for $COMPILER_LDFLAGS.

I'm not sure how windows/Makefile should be invoked. Maybe the 'all'
target should tell user to invoke with targets 'msvc', 'mingw', 'watcom'
etc. E.g.
 make -f ../windows/makefile mingw USE_OPENSSL=1

--gv


Re: ftp bug in 1.10

2005-06-25 Thread Gisle Vanem

Hrvoje Niksic [EMAIL PROTECTED] wrote:


It should print a line containing 100.  If it does, it means
we're applying the wrong format.  If it doesn't, then we must find
another way of printing LARGE_INT quantities on Windows.


I don't know what compiler OP used, but Wget only uses
%I64 for MSVC on Windows. Ref sysdep.h line 111-114.

--gv


Large file problem

2005-02-27 Thread Gisle Vanem
It doesn't seem that the patches to support >2GB files work on
Windows. Wget hangs indefinitely at the end of transfer.
E.g.
\wget.exe  ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz
--05:38:54--  ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz
  = `nt.gz'
Resolving ftp.ncbi.nih.gov... 130.14.29.30
Connecting to ftp.ncbi.nih.gov|130.14.29.30|:21... connected.
Logging in as anonymous ... Logged in!
== SYST ... done.== PWD ... done.
== TYPE I ... done.  == CWD /blast/db/FASTA ... done.
== PORT ... done.== RETR nt.gz ... done.
Length: 3,920,316,626 (unauthoritative)
100%[==] 3,920,316,626  
190.04K/sETA 00:00
hangs
I have no idea what's causing this. Windows Task Manager shows that wget is
idle; the number of allocations and page faults is constant. The resulting
.gz file is okay though.

MingW 3,7 + gcc 3.3.1
Gisle V.
# rm /bin/laden 
/bin/laden: Not found


Re: Large file problem

2005-02-27 Thread Gisle Vanem
Hrvoje Niksic wrote:
Gisle Vanem [EMAIL PROTECTED] writes:
It doesn't seem that the patches to support >2GB files work on
Windows. Wget hangs indefinitely at the end of transfer.
Is there a way to trace what syscall Wget is stuck at?  Under Cygwin I
can try to use strace, but I'm not sure if I'll be able to repeat the
bug.
There is strace for Win-NT too. But I dare not install it to find out.
PS. It is quite annoying to get 2 copies of every message. Also,
there should be a Reply-To: header so that replies go to the list.
Just my 0.02.
--gv 



Re: Large file support

2005-02-26 Thread Gisle Vanem
Hrvoje Niksic wrote:
In other words, large files now work on Windows?  I must admit, that
was almost too easy.  :-)
Don't open the champagne bottle just yet :)
Now could someone try this with Borland and/or Watcom and MingW?  I'm
pretty sure I broke them in some places, but it's near impossible to
fix it without having the compilers available for testing.
Patch attached. 

errno/WSAGetLastError() handling and reporting is still broken for all 
Win32 compilers. Search for SET_ERRNO() in the mail-archive. And also
here:
http://www.mail-archive.com/wget%40sunsite.dk/msg06475.html

*_ERRNO() is in my working copy. Drop that until I come up with a use for them.
--gv
diff -u3 -Hb -r CVS-Latest/src/mswindows.h src/mswindows.h
--- CVS-Latest/src/mswindows.h  Fri Feb 25 23:23:21 2005
+++ src/mswindows.h Sat Feb 26 13:54:53 2005
@@ -83,7 +83,11 @@
/* Define a wgint type under Windows. */
typedef __int64 wgint;
#define SIZEOF_WGINT 8
+#ifdef __GNUC__
+#define WGINT_MAX 9223372036854775807LL
+#else
#define WGINT_MAX 9223372036854775807I64
+#endif
#define str_to_wgint str_to_int64
__int64 str_to_int64 (const char *, char **, int);
@@ -99,7 +103,7 @@
# define fstat(fd, buf) _fstati64 (fd, buf)
#endif
-#if defined(_MSC_VER)
+#if defined(_MSC_VER) || defined(__MINGW32__)
# define struct_stat struct _stati64
#elif defined(__BORLANDC__)
# define struct_stat struct stati64
diff -u3 -Hb -r CVS-Latest/src/sysdep.h src/sysdep.h
--- CVS-Latest/src/sysdep.h Wed Feb 23 21:21:04 2005
+++ src/sysdep.hSat Feb 26 13:43:34 2005
@@ -108,6 +108,17 @@
#endif
#endif
+/* Hacks for setting/getting errno / h_errno.  */
+#ifdef WINDOWS
+# define GET_ERRNO()  errno = WSAGetLastError()
+# define SET_ERRNO(e) WSASetLastError (errno = (e))
+# define SET_H_ERRNO(e)   WSASetLastError (e)
+#else
+# define GET_ERRNO()  ((void)0)
+# define SET_ERRNO(e) ((void)(errno = (e)))
+# define SET_H_ERRNO(e)   ((void)(h_errno = (e)))
+#endif
+
/* Define a large integral type useful for storing large sizes that
   exceed sizes of one download, such as when printing the sum of all
   downloads.  Note that this has nothing to do with large file
@@ -127,7 +138,7 @@
typedef long long LARGE_INT;
#  define LARGE_INT_FMT "%lld"
# else
-#  if _MSC_VER
+#  if defined(_MSC_VER) || defined(__MINGW32__) || defined(__WATCOMC__)
/* Use __int64 under Windows. */
typedef __int64 LARGE_INT;
#   define LARGE_INT_FMT "%I64"


Re: O_EXCL under Windows?

2005-02-25 Thread Gisle Vanem
Hrvoje Niksic wrote:
Is there a way to get the functionality of open(..., O_CREAT|O_EXCL)
under Windows?  For those who don't know, O_EXCL opens the file
exclusively, guaranteeing that the file we're opening will not be
overwritten.  (Note that it's not enough to check that the file
doesn't exist before opening it; it can spring into existence between
the check and the open.)
This works with MingW and MSVC. Watcom acts a bit odd if you use
fdopen (fd, "w+") afterwards. A snippet from my contribution to libnet
(works across processes of course):

#include <fcntl.h>
#include <share.h>
#include <io.h>
#include <sys/stat.h>   /* S_IREAD, S_IWRITE */

int open_flags = O_WRONLY | O_CREAT | O_TRUNC;
/* possibly drop the O_WRONLY for r/w */
int fd = sopen (file_name, open_flags | O_BINARY | _O_SEQUENTIAL,
                SH_DENYWR, S_IREAD | S_IWRITE);

_O_SEQUENTIAL is just to tell the cache manager to stop wasting
memory.

--gv


Re: Windows and long long

2005-02-20 Thread Gisle Vanem
Hrvoje Niksic wrote:
Does MSVC support long long?  If not, how does one...
No, it has a '__int64' built-in.
* print __int64 values?  I assume printf("%lld", ...) doesn't work?
Correct, use %I64d for signed 64-bit and %I64u for unsigned.
* retrieve __int64 values from strings?  I assume there is no
  strtoll?
No, but MSVC7 has a
 __int64 __cdecl _strtoi64(const char *str, char **endptr, int base);

MingW and Watcom have strtoll(). For MSVC6 one could use
sscanf (str, "%I64d", &val);
--gv
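
The answers above can be collected into one place. This is only a sketch:
the names big_int, BIG_INT_FMT, str_to_big_int and format_big_int are made
up for illustration, and the compiler checks assume that _MSC_VER >= 1300
identifies MSVC7 or later.

```c
#include <stdio.h>
#include <stdlib.h>

/* Pick a 64-bit type and matching printf format per compiler.  */
#if defined(_MSC_VER) || defined(__MINGW32__)
typedef __int64 big_int;
# define BIG_INT_FMT "%I64d"
#else
typedef long long big_int;
# define BIG_INT_FMT "%lld"
#endif

/* Parse a decimal string into a 64-bit value.  */
big_int
str_to_big_int (const char *str)
{
#if defined(_MSC_VER) && _MSC_VER >= 1300
  return _strtoi64 (str, NULL, 10);     /* MSVC7 and later */
#elif defined(_MSC_VER)
  big_int val = 0;                      /* MSVC6: no strtoll-like call */
  sscanf (str, "%I64d", &val);
  return val;
#else
  return strtoll (str, NULL, 10);       /* MingW, Watcom, Unix */
#endif
}

/* Format VAL into BUF, which must have room for at least 21 bytes
   (sign, 19 digits, terminating NUL).  Returns the length written.  */
int
format_big_int (char *buf, big_int val)
{
  return sprintf (buf, BIG_INT_FMT, val);
}
```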


Re: Back after a while

2005-02-15 Thread Gisle Vanem
Hrvoje Niksic wrote:
For the last several months I've been completely absent from Wget
development, and from the net in general.  Here is why, and the story
is not for the faint of heart.
Glad you're back and hope your health is getting better. 

The TODO list has grown a bit while you've been away. Hope we
can work on some of the items. Most importantly and most requested
is probably support for large files. AFAICS this should be pretty easy.
Look at Leonid's (?) patch.
--gv 


Re: errno and Windows

2004-11-27 Thread Gisle Vanem
Wget incorrectly tests 'errno' after network calls on Windows.
'errno' is *not* set on failure. One must use WSAGetLastError() for
this. I've added a SET/GET_ERRNO() macro to do this more portably.
I sent this patch to Wget-patches a week ago, but heard nothing.
Isn't anybody monitoring that list?

--gv


Re: Cannot pass ampersand in URL?

2004-06-12 Thread Gisle Vanem
Phil Lewis [EMAIL PROTECTED] said:

 I have been trying to wget a URL with an ampersand (e.g.:
 http://search.yahoo.com/search?fr=slv1-&ei=UTF-8&p=wget).
 I substitute %26 for the ampersands, but wget does not download the page.

Judging from your mailer (Outlook), I assume you're running Wget from
Windows' cmd or similar. Both & and % have special meaning to shells
on Windows. Protect the URL by putting backquotes or "" around it.

This works fine here (in 4NT):
wget `http://search.yahoo.com/search?fr=slv1-&ei=UTF-8&p=wget`

--gv




Re: Large Files Support for Wget

2004-05-08 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

#define FILE_OFF_T 32/64 bit unsigned
 
 Fair enough.  But isn't off_t signed?

Oops, yes. A 'long' on Win32.
 
#define FILE_OFF_FMT  "%llu" or "%Lu"
 
 How does gettext cope with that?  For example, this string is what
 worries me:
 
 printf (_("The file is " FILE_OFF_FMT " octets long.\n"), size);
 
I assume Wget needs a msg-entry for each string:
  "The file is %Ld octets long.\n"
  "The file is %lld octets long.\n"

unless the message catalogs are built on the same platform as Wget is.
Not sure this is possible with the msg* tools.

--gv
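
One way to sidestep the per-platform msgid problem is to format the number
into a buffer first and keep only %s in the translatable string, so every
platform shares a single msgid. The sketch below illustrates that idea; it
is not something proposed in this thread. The _() macro is a stand-in so the
code compiles without libintl, the names file_off and format_length_message
are hypothetical, and a C99 snprintf is assumed.

```c
#include <stdio.h>

/* Stand-in for gettext's _() macro; a real build would use libintl.  */
#ifndef _
# define _(s) (s)
#endif

/* The platform-specific format never appears in a translatable string.  */
#if defined(_MSC_VER) || defined(__MINGW32__)
typedef __int64 file_off;
# define FILE_OFF_FMT "%I64d"
#else
typedef long long file_off;
# define FILE_OFF_FMT "%lld"
#endif

/* Write the localized message for a file of N octets into OUT.
   Returns the number of characters the full message needs.  */
int
format_length_message (char *out, size_t size, file_off n)
{
  char num[32];

  /* Step 1: platform-specific number formatting, outside gettext.  */
  sprintf (num, FILE_OFF_FMT, n);

  /* Step 2: one msgid for all platforms, using only %s.  */
  return snprintf (out, size, _("The file is %s octets long.\n"), num);
}
```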



non-ASCII in host names

2004-03-19 Thread Gisle Vanem
Trying to connect to hosts with non-ASCII in the name doesn't
work. E.g.
  wget www.tromsø.no

Resolving www.troms%f8.no... failed: Host not found.

(ø = o with slash, &oslash;) The host does in fact exist.

I have to use the ACE form www.xn--troms-zua.no
which is a bit of a pain.
Ref. http://www.norid.no/domenenavnbaser/ace/?language=en

Why is wget munging the hostname here? It seems to call
reencode_escapes() on the hostname part. Why, I don't know.

If it were not for the Host: header, the name could remain
un-escaped. I don't know what the standard says about this case.
Should the header contain Host:www.xn--troms-zua.no ?

--gv




Re: non-ASCII in host names

2004-03-19 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

  If it were not for the Host: header, the name could remain
  un-escaped. I don't know what the standard says about this case.
  Should the header contain Host:www.xn--troms-zua.no ?

 The Host header is (I think) not URL-escaped, so we can simply send
 the 8-bit characters as we received them.

 Here's a patch; please let me know if it works for you.

It works, kind of; wget resolves the name okay. The problem is that
www.tromsø.no is served by a virtual server that gives you what's specified
in the Host: header. So doing
  wget www.xn--troms-zua.no

gives the correct page while wget www.tromsø.no does not (the
same in IE also).

IMHO for this to work, wget needs to know the ACE encoded name
prior to resolving and building the HTTP header. Not a trivial task.

PS. Windows does not actually support non-ASCII in its
DNS resolver. I had to use the hosts file.

--gv




Windows titlebar fix

2004-03-02 Thread Gisle Vanem
ws_percenttitle() should not be called in quiet mode since ws_changetitle() 
AFAICS is only called in verbose mode. That caused an assert in 
mswindows.c. An easy patch:

--- CVS-latest\src\retr.c   Sun Dec 14 14:35:27 2003
+++ src\retr.c  Tue Mar 02 21:18:55 2004
@@ -311,7 +311,7 @@
   if (progress)
progress_update (progress, ret, wtimer_read (timer));
 #ifdef WINDOWS
-  if (toread > 0)
+  if (toread > 0 && !opt.quiet)
ws_percenttitle (100.0 *
 (startpos + sum_read) / (startpos + toread));
 #endif

--gv




Re: Windows titlebar fix

2004-03-02 Thread Gisle Vanem
 We could also fix this by calling ws_changetitle() unconditionally.  Should the 
 title bar be affected by verbosity?

IMHO yes, Quiet is quiet.

--gv



Re: fork_to_background() on Windows

2003-12-20 Thread Gisle Vanem
 The shell is smart enough to run those 2 commands in series. If wget
 is a GUI app, it runs them in parallel, causing gzip to fail.
 
  wget -O- http://host/index.html | most
 
 works kind of; only some of the stdout data gets displayed.

I've searched google and the only way AFAICS to get redirection
in a GUI app to work is to create 3 pipes. Then use a thread (or
run_with_timeout with infinite timeout) to read/write the console 
handles to put/get data into/from the parent's I/O handles. I don't 
fully understand how yet, but it could get messy. 

Just for the sake of running Wget in the background, it doesn't
seem to be worth it. Unless someone else has a better idea.

--gv



fork_to_background() on Windows

2003-12-19 Thread Gisle Vanem
The fork-to-background on Windows is just a joke (as the comment
in config.h.mingw says). Is anybody using it? It could be a useful feature
if we attach to the console of the calling process (the shell in most cases)
at startup. Then when ^Break is pressed (or '-b' specified), we free
that console and continue running in the background. That way we will
get the shell prompt back (and not, as it is now, with Wget seemingly
hanging idle until finished).

A brief description of how I did accomplish this:

* In makefiles, add
  CFLAGS += -Dmain=wget_main and
  link wget as a GUI app (-Wl,--subsystem,windows or
  /subsystem:windows)

* In mswindows.c, add a WinMain() function that sets up a detached
  console (or for Win-9x/ME/NT allocates one). Reopen stdin, stdout
  and stderr using CONIN$ etc. Call wget_main().
  BTW. We could do the Winsock init stuff here too to avoid cluttering
  main.c

* In fork_to_background(), free the attached or allocated console
  seemingly continuing in background.

This actually works fine, but redirection (e.g. wget -h > foo) doesn't
work. I don't know any way to get at the redirected stdout/stderr handles
from Wget. But then you're not supposed to do that from a GUI app.

I've not tested on anything but Win-XP. I've attached the modified
mswindows.c if anybody could try it on other Win OS'es.
(define 'CTRLBREAK_BACKGND' in config.* or makefiles).

Does anybody see other ways to run detached or in the background? I haven't
checked how Cygwin's fork() does it, but hear it is pretty slow.
And can we live without redirection?

Gisle V.


mswindows.c
Description: Binary data


Re: fork_to_background() on Windows

2003-12-19 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 Making `-b' work would be great.  But is it really desirable for
 ctrl-break to put Wget in background?  I thought ctrl-break was
 supposed to interrupt and abort the program, like ^C on Unix?  Hmm,
 now I see that ctrl-break backgrounds Wget even now, so I guess people
 don't mind; I don't remember receiving a report for this.

ws_handler() now acts on ^C and ^Break depending on a #define.
Maybe Windows could be told to use ^Z as interrupt key?

 redirections on Windows (as far as I know).  But does something like
 `wget -O - > FILE' work now?  How about `wget -O - | command...'?  If
 those currently work, we might want to be careful not to break them.

Using - and redirection works if linking as a console app (the default).
And "wget -O- http://host/index.html | most" works fine, but not as a
GUI app, since a GUI app (i.e. subsystem 2 specified in the PE header)
doesn't get a console when it's started.

If I in my shell (4NT or CMD) do:

 wget http://host/file.tar.gz && gzip -d file.tar.gz

The shell is smart enough to run those 2 commands in series. If wget
is a GUI app, it runs them in parallel, causing gzip to fail.

 wget -O- http://host/index.html | most

works kind of; only some of the stdout data gets displayed.

--gv



Re: Using a proxy from command line under win32?

2003-12-07 Thread Gisle Vanem
Jago Pearce [EMAIL PROTECTED] said:

 The documentation is a bit unclear, an example would help because at the 
 moment I'm trying
 
 wget http://www.foo.com/[EMAIL PROTECTED]:3128
 
 Which isn't working.

Do you *always* need to use a HTTP proxy? If so you can put this
in your '.wgetrc' file:
  useproxy = on
  httpproxy = cache.proxy.com:3128

and use '--no-proxy' on the cmd-line when you don't need it.

PS. '.wgetrc' should be in a directory pointed to by $WGETRC.
Or use a wget.ini in directory of wget.exe.

--gv



Recursive ftp

2003-12-07 Thread Gisle Vanem
Some minor issues with recursive ftp.

If all extensions are rejected (no file in .listing is accepted), Wget
still issues a PORT and an empty RETR command. Is this WAD
(working as designed)?
E.g.
  wget -r -Ahtm ftp://host/foo/
when I really intended -Ahtml

It would be nice if Wget could say what number the current file is in
the total. E.g. "Length: 17400 [file 10 of 68]". It should be possible
for ftp if we get a .listing file but impossible (?) for http.

Gisle V.

# rm /bin/laden 
/bin/laden: Not found



Re: Recursive ftp broken

2003-11-26 Thread Gisle Vanem
 Interestingly, I can't repeat this.  Still, to be on the safe side, I
 added some additional restraints to the code that make it behave more
 like the previous code, that worked.  Please try again and see if it
 works now.  If not, please provide some form of debugging output as
 well.

This ChangeLog entry fixed it:
* ftp.c: Set con->csock to -1 where rbuf_uninitialize was
previously used.

Thanks.

--gv



Recursive ftp broken

2003-11-22 Thread Gisle Vanem
I don't know when it happened, but latest CVS version breaks
recursive ftp download. I tried with this:

wget -rAZIP ftp://ftp.mpoli.fi/pub/software/DOS/NETWORK/

and the result is:

--20:46:02--  ftp://ftp.mpoli.fi/pub/software/DOS/NETWORK/
   = `ftp.mpoli.fi/pub/software/DOS/NETWORK/.listing'
Resolving ftp.mpoli.fi... 80.81.183.82
Connecting to ftp.mpoli.fi|80.81.183.82|:21... connected.
Logging in as anonymous ... Logged in!
== SYST ... done.== PWD ... done.
== TYPE I ... done.  == CWD /pub/software/DOS/NETWORK ... done.
== PORT ... done.== LIST ... done.

[ = ] 2,342 --.--K/s

20:46:03 (73.78 KB/s) - `ftp.mpoli.fi/pub/software/DOS/NETWORK/.listing' saved [2342]

Removed `ftp.mpoli.fi/pub/software/DOS/NETWORK/.listing'.
Rejecting `DLX0_8A.EXE'.
Rejecting `INDEX.HTM'.
Rejecting `MINUARC.EXE'.
Rejecting `_INDEX_'.
--20:46:03--  ftp://ftp.mpoli.fi/pub/software/DOS/NETWORK/BAN-SHIM.ZIP
   = `ftp.mpoli.fi/pub/software/DOS/NETWORK/BAN-SHIM.ZIP'
Connecting to ftp.mpoli.fi|80.81.183.82|:21... connected.
Logging in as anonymous ... Logged in!
== SYST ... done.== PWD ... done.   !   is '/' here
== TYPE I ... done.  == CWD not required.
== PORT ... done.== RETR BAN-SHIM.ZIP ...
No such file `BAN-SHIM.ZIP'.
...

Don't know why Wget is reconnecting for each file; as a result the CWD
becomes '/' each time. Using ftp and mget is no problem.

Gisle V.

# rm /bin/laden 
/bin/laden: Not found



Re: ipv6 patch

2003-11-19 Thread Gisle Vanem
Herold Heiko [EMAIL PROTECTED] said:

 Attached a little patch needed for current cvs in order to compile on
 windows nt 4 (any system without IPV6 really).

FYI, Wget/IPv6 on Windows does work somewhat; getaddrinfo()
is able to resolve a host to its IPv6 address(es). But getnameinfo()
isn't able to convert it back to a presentation address. Same behaviour
as Linux w/o an IPv6 stack AFAICS.

Due to lack of inet_ntop() on Windows, I used this instead:

  struct sockaddr_in6 addr6;
  addr6.sin6_family = AF_INET6;
  memcpy (&addr6.sin6_addr, address, sizeof(addr6.sin6_addr));
  if (getnameinfo ((const struct sockaddr*)&addr6, sizeof(addr6),
                   buf, size, NULL, 0, NI_NUMERICHOST) == 0)
    return (buf);
  return (NULL);

but Wget doesn't check return value of inet_ntop(). Hint hint.
So the trace looks a bit weird:

wget -d6 ftp://ftp.deepspace6.net/
Resolving ftp.deepspace6.net... seconds 0.00, ,
Caching ftp.deepspace6.net = 
Connecting to ftp.deepspace6.net||:21... failed: Address family not supported
Connecting to ftp.deepspace6.net||:21... failed: Address family not supported

Note Wget tries twice; once for each address in the list.
A little refinement would be to stop trying if the address-families
are the same.

--gv



Re: ipv6 patch

2003-11-19 Thread Gisle Vanem
  but Wget doesn't check return value of inet_ntop(). Hint hint.
 
 I wasn't aware that inet_ntop could really fail.  Why did getaddrinfo
 return the address if I can't print it?

getaddrinfo() on Win-XP seems to be a thin wrapper over the DNS
client which resolves AAAA records fine. But getnameinfo() seems
to rely on some deeper IPv6 stuff being installed.

So inet_ntop() can fail if coded using getnameinfo() as I described.
Therefore I adapted Paul Vixie's inet_ntop() which works w/o IPv6
installed.

 getaddrinfo shouldn't even return IPv6 addresses if AF_INET6 is not
 supported.  There is code that tries to handle this case, but it
 obviously fails on Windows.  Are you using the latest CVS?
 
Compiled from yesterday's CVS. Note, I used '-6', so socket_has_inet6()
is bypassed which is okay IMHO.

Your statement "getaddrinfo shouldn't even return IPv6 ..."
contradicts what you wrote earlier:

[EMAIL PROTECTED]
As to why my system resolves IPv6 addresses in the first place -- good
question.  But it's a more or less default Red Hat Linux 9 setting,
I'm sure I won't be the only one with this problem.

So Win-XP and RH9 do pretty much the same.

--gv



Re: Translations for 1.9.1

2003-11-17 Thread Gisle Vanem
Manfred Schwarb [EMAIL PROTECTED] said:

 But on my machine, translations are broken somehow, all special characters
 are scrambled. With wget 1.9 this didn't happen.

 Example from de.po:
 #: src/convert.c:439
 #, c-format
 msgid Cannot back up %s as %s: %s\n
 msgstr Anlegen eines Backups von %s als %s nicht mglich: %s\n

It's normal. de.po is written in UTF-8.
Use e.g
cat de.po | iconv -f UTF-8 -t CP850

to display correctly, but gettext should handle this fine.

--gv




--inet6-only option

2003-11-15 Thread Gisle Vanem
MingW version compiled with ENABLE_IPV6, HAVE_GETADDRINFO etc.
but no HAVE_GETADDRINFO_AI_ADDRCONFIG.

Running wget -6 url.. on a machine with no IPv6 installed silently
uses IPv4. A warning with fallback to IPv4 is IMHO okay. Or an exit?

Not sure about the rationale behind '--inet6-only', but the problem seems 
to be in defaults(); when socket() fails, 'opt.ipv4_only=1' is forced before 
the cmd-line is parsed. So lookup_host() uses AF_UNSPEC in the hints.

My suggestion is to move the socket(AF_INET?) test to host.c and
call it once from lookup_host(). Take action (exit?) if 'opt.ipv?_only' 
doesn't match returned 'res.ai_family'.

Similarly, on a machine with only IPv6 (if they really exist?),
'--inet4-only' should also fail or give a warning.

Gisle V.

# rm /bin/laden 
/bin/laden: Not found 



gettext and charsets

2003-11-04 Thread Gisle Vanem
This should go to the gettext people, but I couldn't find any
mailing list.

I've built Wget with NLS support on Win-XP, but the display 
char-set is wrong. Built with LOCALEDIR=g:/MingW32/share

BTW. This is IMHO so ugly. Shouldn't there be a way to
set this at runtime (as Lynx does). E.g. have a $WGET_LOCALEDIR
and call bindtextdomain() on that. $LANGUAGE doesn't
seem to handle drive letters and ':' on the Win32 version of gettext.

But the main problem I can solve by e.g.
  wget -h | iconv -f ISO-8859-1 -t CP850

Isn't there a better way?

--gv
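
The runtime override suggested above could be sketched as follows. This is
an illustration only: WGET_LOCALEDIR is the environment variable proposed in
the message, not something Wget is known to read, the function name
bind_wget_textdomain and the fallback path are made up, and the text domain
is assumed to be "wget".

```c
#include <libintl.h>
#include <stdlib.h>

/* Assumed compile-time default, normally supplied by the build system.  */
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"
#endif

/* Bind the "wget" text domain to $WGET_LOCALEDIR when that variable is
   set and non-empty, otherwise to the compiled-in LOCALEDIR.  Returns
   the directory now in effect, or NULL on failure.  */
const char *
bind_wget_textdomain (void)
{
  const char *dir = getenv ("WGET_LOCALEDIR");

  if (!dir || !*dir)
    dir = LOCALEDIR;
  return bindtextdomain ("wget", dir);
}
```

Since the directory is passed straight to bindtextdomain(), a value such as
g:/MingW32/share does not have to survive any ':'-separated path parsing.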




Re: gettext and charsets

2003-11-04 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 I'm not sure about the charset issues on Windows.  Does gettext detect
 the presence of GNU iconv?  (I assume you have the latter if you have
 the `iconv' command.)

libintl depends on libiconv:
 cygcheck wget.exe
..
  f:\windows\System32\libintl-2.dll
f:\windows\System32\libiconv-2.dll

Browsing the sources, I found the answer:
  set OUTPUT_CHARSET=CP850

--gv



accept() error

2003-11-01 Thread Gisle Vanem
In some ftp downloads I occasionally see the error
accept: Timed out
Retrying

immediately and then hanging in the PORT command
for a long time. Studying the code, I can understand why:

uerr_t
acceptport (int *sock)
{
  struct sockaddr_storage ss;
  struct sockaddr *sa = (struct sockaddr *)&ss;
  socklen_t addrlen = sizeof (ss);

#ifdef HAVE_SELECT
  if (select_fd (msock, opt.connect_timeout, 0) <= 0)
    return ACCEPTERR;
#endif

AFAICS 'opt.connect_timeout' is set to 0.0 to indicate
indefinite timeout. So select() ends prematurely. Shouldn't
select() use tv == NULL in this case? What am I missing 
here?

Wget 1.9+cvs-dev,  Win-XP.

--gv
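
The "tv == NULL" idea can be sketched like this. It is not Wget's
select_fd(); the name wait_readable is hypothetical, and the convention that
a timeout of 0 means "wait forever" is taken from the description above.

```c
#include <stddef.h>
#ifdef _WIN32
# include <winsock2.h>
#else
# include <sys/select.h>
# include <sys/time.h>
#endif

/* Hypothetical helper: wait until FD is readable.  A MAXTIME of 0 means
   "no timeout", which select() expresses as a NULL timeval pointer; a
   zeroed timeval would make select() return immediately instead.
   Returns 1 if FD is readable, 0 on timeout, -1 on error.  */
int
wait_readable (int fd, double maxtime)
{
  fd_set readfds;
  struct timeval tmout;
  struct timeval *tv = NULL;

  FD_ZERO (&readfds);
  FD_SET (fd, &readfds);

  if (maxtime > 0)
    {
      tmout.tv_sec = (long) maxtime;
      tmout.tv_usec = (long) (1000000 * (maxtime - (long) maxtime));
      tv = &tmout;
    }

  return select (fd + 1, &readfds, NULL, NULL, tv);
}
```

With this shape, acceptport() would block in select() until the server
connects when no connect timeout has been configured.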



Re: errno patches for Windows

2003-10-16 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 OK.  So the whole thing with errno is only necessary when dealing with
 Winsock errors.  For errors from, say, fopen it's fine to use errno?

Yes.
 
 There is another possible approach.  We already #define read and write
 to call Winsock stuff.  We could add some more magic so that they and
 other Winsock invocations automatically set errno to last error value,
 translating Windows errors to errno errors. 

Then all Winsock functions must be wrapped in such a macro.
E.g. (untested):
#define SOCK_SELECT(fd,rd,wr,ex,tv)  ( \
int _rc = select (fd,rd,wr,ex,tv), \
(int)(WSAGetLastError() ? (errno = WSAGetLastError()) : (0)), \
_rc)

which could get messy; hard to return with a value from such a
macro.

 static struct errentry errtable[] = {
   {  ERROR_INVALID_FUNCTION,   EINVAL},  /* 1 */
   {  ERROR_FILE_NOT_FOUND, ENOENT},  /* 2 */

XEmacs is probably using native Win functions (e.g CreateFile
instead of fopen), so it needs to map them to Unix errnos. Wget only 
uses ANSI/Winsock functions, so only WS errors need attention.

Besides, on Windows there is no suitable errno.h value for
e.g. ENOTCONN; we must use the winsock*.h value WSAENOTCONN.
So the XEmacs method wouldn't work.

--gv



Re: errno patches for Windows

2003-10-16 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 #ifdef WINDOWS
 # define select(a, b, c, d) windows_select (a, b, c, d)
 #endif

Okay by me.
 
 #ifndef ENOTCONN
 # define ENOTCONN X_ENOTCONN
 #endif

Except you cannot make Winsock return X_ENOTCONN.
It returns WSAENOTCONN (def'ed to ENOTCONN in
mswindows.h). Winsock errors are in the range
WSABASEERR (10000) to 11031 with some holes in
the range.

 const char *
 windows_strerror (int err)
 {
   /* Leave the standard ones to strerror. */
   if (err < X_ERRBASE)
     return strerror (err);
 
   /* Handle the unsupported ones manually. */
   switch (err)
 {
   case X_ENOTCONN:
       return "Connection refused";

Which AFAICS is pretty much the same as in my patch.

Another thing is that Wget could mask errnos for Unix
too. In connect.c:

 ...
   {
 CLOSE (sock);
 sock = -1;
 goto out;
   }

out:
...
 else
   {
 save_errno = errno;
 if (!silent)
   logprintf (LOG_VERBOSE, "failed: %s.\n", strerror (errno));
 errno = save_errno;
   }

The close() could possibly set errno too, but we want the errno 
from bind() or connect() don't we?
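
One way to keep the interesting errno, sketched with plain POSIX calls. close_keep_errno() is a hypothetical helper, not something in connect.c: it saves errno before closing and restores it afterwards, so the caller still sees the error from bind() or connect().

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: close FD without disturbing errno, so a later
   strerror (errno) still describes the call that failed earlier. */
void
close_keep_errno (int fd)
{
  int saved_errno = errno;

  close (fd);           /* may overwrite errno, e.g. with EBADF */
  errno = saved_errno;
}
```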

--gv



Re: Error in wget-1.9-b5.zip

2003-10-15 Thread Gisle Vanem
 Error in wget-1.9-b5.zip

wget cannot find the host. Turn on -d option and observe:

Location: http://www.yourworstenemy.com?tgpid=008d&refid=393627 [following]
Closing fd 1952
--13:38:35--  http://www.yourworstenemy.com/?tgpid=008d&refid=393627
   => `tmp2/www.yourworstenemy.com/[EMAIL PROTECTED]refid=393627'
Resolving www.yourworstenemy.com... seconds 0.00, failed: Host not found.

--gv





Re: Error in wget-1.9-b5.zip

2003-10-15 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 Note that David's Wget seems to have printed unknown error, not
 Host not found.  Is that an artifact of his version of system
 libraries, or is Wget doing something wrong?

I don't know how/when Windows could print anything other than
what's already in herrmsg(): HOST_NOT_FOUND, NO_RECOVERY,
NO_DATA, NO_ADDRESS or TRY_AGAIN. 
Maybe it should print the error number otherwise.

--gv



Re: Error in wget-1.9-b5.zip

2003-10-15 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 Note that David's Wget seems to have printed unknown error, not
 Host not found.  Is that an artifact of his version of system
 libraries, or is Wget doing something wrong?

That's because wget incorrectly uses strerror() for Winsock
errors or uses 'errno' when that's not set. The correct thing would be 
to set 'errno' to last Winsock error and make a compatible function 
that returns correct string for both sys-errors and WS errors. 
E.g. we put this in mswindows.h:
  #define strerror(err) win_strerror (err)
  extern const char*win_strerror (int err);  

But the problem is that sys_errlist[] always returns English texts, but
Windows's FormatMessage() returns in native language. So to be 
consistent, I suggest we return English also for Winsock errors (easier 
when we receive a bug-report from a user without the proper wget.gmo file.
Who ever uses NLS anyway?).

I could commit a patch if we agree on this.

--gv



touch() on Windows

2003-10-13 Thread Gisle Vanem
It seems touch() is called on an open file and hence
utime() is either silently ignored or causing Access denied on
Watcom.

I added this inside touch():
  DEBUGP (("touching %s to %.24s\n", file, asctime(localtime(&tm))));

And ran:

wget -d -Otcpdump.tgz http://www.tcpdump.org/daily/tcpdump-2003.09.29.tar.gz

---request begin---
GET /daily/tcpdump-2003.09.29.tar.gz HTTP/1.0
User-Agent: Wget/1.9-b5
...
Last-Modified: Mon, 29 Sep 2003 09:05:27 GMT
ETag: 492f4-7e693-3f77f5d7
...

touching tcpdump.tgz to Mon Sep 29 11:05:27 2003
..

dir /mk tcp*

13.10.2003  19:01 517 779  tcpdump.tgz

As you see utime() doesn't do anything.
The code in http.c/ftp.c is hard to follow, so the question is 
when touch() is called. IMHO it should be called after the 
file is closed.

Alternative hack (only for MingW/MSVC) is to use
int _futime (int handle, struct _utimbuf *filetime);

but I guess no one else has this.

--gv




Re: touch() on Windows

2003-10-13 Thread Gisle Vanem
 It seems touch() is called on an open file and hence
 utime() is either silently ignored or causing Access denied on
 Watcom.

Correction; Watcom says Permission denied.

--gv




Re: touch() on Windows

2003-10-13 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 Wget already has code that closes and reopens output document if
 it's a regular file.  Perhaps the same should be done here...

Although IE and other browsers don't seem to do it, I think it would
be a good thing to honour the Last-Modified header on regular output 
files.

--gv





Re: Bug in Windows binary?

2003-10-05 Thread Gisle Vanem
Jens Rösner [EMAIL PROTECTED] said:

 I downloaded
 wget 1.9 beta 2003/09/29 from Heiko
 http://xoomer.virgilio.it/hherold/
...
 wget -d http://www.google.com
 DEBUG output created by Wget 1.9-beta on Windows.

 set_sleep_mode(): mode 0x80000001, rc 0x80000000

 I disabled my wgetrc as well and the output was exactly the same.

 I then tested
 wget 1.9 beta 2003/09/18 (earlier build!)
 from the same place and it works smoothly.

 Can anyone reproduce this bug?

Yes, but the MSVC version crashed on my machine.  But I've found
the cause: my recent change :(

A simple case of wrong calling-convention:

--- mswindows.c.org Mon Sep 29 11:46:06 2003
+++ mswindows.c Sun Oct 05 17:34:48 2003
@@ -306,7 +306,7 @@
 DWORD set_sleep_mode (DWORD mode)
 {
   HMODULE mod = LoadLibrary ("kernel32.dll");
-  DWORD (*_SetThreadExecutionState) (DWORD) = NULL;
+  DWORD (WINAPI *_SetThreadExecutionState) (DWORD) = NULL;
   DWORD rc = (DWORD)-1;

I assume Heiko didn't notice it because he doesn't have that function
in his kernel32.dll. Heiko and Hrvoje, will you correct this ASAP?

--gv




mswindows.h patch

2003-10-03 Thread Gisle Vanem
Regarding my run_with_timeout() patch, I forgot the following 
patch to mswindows.h (which isn't included in util.c).

In my forthcoming patches for IPv6, we need to use the correct 
Winsock headers. To avoid ifdef clutter throughout the .c-files, I've
put them in mswindows.h. So the .c-files should never include it, 
but only need network headers like this:
  #ifndef WINDOWS
  # include <sys/socket.h>
  # include <netdb.h>
   ...
  #endif
  #include "wget.h"

The latter includes sysdep.h, which in turn includes
mswindows.h. 

--- CVS-latest/src/mswindows.h   Tue Sep 30 23:24:36 2003
+++ src/mswindows.h Fri Oct 03 16:57:57 2003
@@ -30,6 +30,37 @@
 #ifndef MSWINDOWS_H
 #define MSWINDOWS_H

+#ifndef WGET_H
+#error Include mswindows.h inside or after wget.h
+#endif
+
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN  /* Prevent inclusion of winsock*.h in windows.h */
+#endif
+
+#include <windows.h>
+
+/* Use the correct winsock header; ws2tcpip.h includes winsock2.h only on
+ * Watcom/MingW. We cannot use winsock.h for IPv6. Using getaddrinfo() requires
+ * ws2tcpip.h
+ */
+#if defined(ENABLE_IPV6) || defined(HAVE_GETADDRINFO)
+# include <winsock2.h>
+# include <ws2tcpip.h>
+#else
+# include <winsock.h>
+#endif
+
+#ifndef EAI_SYSTEM
+#define EAI_SYSTEM -1   /* value doesn't matter */
+#endif
+
+/* Must include sys/stat.h because of 'stat' define below. */
+#include <sys/stat.h>
+
+/* Missing in several .c files. Include here. */
+#include <io.h>
+
 /* Apparently needed for alloca(). */
 #include <malloc.h>

@@ -81,8 +112,6 @@
 # define mkdir(a, b) mkdir(a)
 #endif /* __BORLANDC__ */

-#include <windows.h>
-
 /* Declarations of various socket errors: */
@@ -136,5 +164,21 @@
 char *ws_mypath (void);
 void ws_help (const char *);
 void windows_main_junk (int *, char **, char **);
+
+/* Things needed for IPv6; missing in ws2tcpip.h. */
+#ifdef ENABLE_IPV6
+ #ifndef HAVE_NTOP
+  extern const char *inet_ntop (int af, const void *src, char *dst, size_t size);
+ #endif
+ #ifndef HAVE_PTON
+  extern int inet_pton (int af, const char *src, void *dst);
+ #endif
+#endif /* ENABLE_IPV6 */

-

Defining WIN32_LEAN_AND_MEAN also makes it compile much faster.

I think it would be handy to have 'opt.debug' in levels of verbosity. 
I.e. '-dd' gives a more chatty wget. Or should it be '-vv'? I'm a bit 
confused about the distinction between those options. I propose we 
add this macro to wget.h:

# define DEBUGN(level,x)   do { if (opt.debug >= (level)) \
                             DEBUGP (x); } while (0)

And patch init.c:

@@ -85,6 +85,7 @@
 CMD_DECLARE (cmd_boolean);
 CMD_DECLARE (cmd_bytes);
 CMD_DECLARE (cmd_directory_vector);
+CMD_DECLARE (cmd_increment);
 CMD_DECLARE (cmd_lockable_boolean);
 CMD_DECLARE (cmd_number);
 CMD_DECLARE (cmd_number_inf);
@@ -129,7 +128,7 @@
   { "cookies", &opt.cookies,   cmd_boolean },
   { "cutdirs", &opt.cut_dirs,  cmd_number },
 #ifdef DEBUG
-  { "debug",   &opt.debug, cmd_boolean },
+  { "debug",   &opt.debug, cmd_increment },
 #endif
   { "deleteafter", &opt.delete_after,  cmd_boolean },
   { "dirprefix",   &opt.dir_prefix,cmd_directory },
@@ -632,6 +631,17 @@
 }

   *(int *)closure = bool_value;
+  return 1;
+}
+
+/* Increment a value from VAL to CLOSURE.  COM is ignored,
+   except for error messages.  */
+static int
+cmd_increment (const char *com, const char *val, void *closure)
+{
+  int tmp;
+  if (cmd_boolean(com,val,&tmp))
+ (*(int*)closure)++;
   return 1;
 }


Wadda you think? AFAIK only wget.texi should be updated.
Add this to @item -d:
  To get increased verbosity turn up the debug-level
  by repeating this option. E.g. @samp{-dd} or
  @samp{--debug --debug}.
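
To show how the levels would behave, here is a stand-alone sketch. It does not use Wget's real option table: struct options, debug_printf(), debug_calls and debug_increment() are stand-ins made up for the example, and only the shape of the DEBUGN macro follows the proposal above.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for Wget's option struct; only the field used here. */
struct options { int debug; };
struct options opt = { 0 };

/* Counts emitted messages, only so the behaviour can be checked. */
int debug_calls = 0;

/* Stand-in for Wget's debug logger: print to stderr and count the call. */
void
debug_printf (const char *fmt, ...)
{
  va_list ap;

  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
  debug_calls++;
}

/* Print only when the current debug level is at least LEVEL.  ARGS is a
   parenthesised argument list, as with DEBUGP. */
#define DEBUGN(level, args) \
  do { if (opt.debug >= (level)) { debug_printf args; } } while (0)

/* What each "-d" on the command line would do under the proposal. */
void
debug_increment (void)
{
  opt.debug++;
}
```

With one -d only DEBUGN (1, ...) messages appear; with -dd the level 2 messages show up as well.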

And one last patch (close -> CLOSE):

--- CVS-latest/src/connect.c Mon Sep 22 15:55:22 2003
+++ src/connect.c   Thu Oct 02 16:52:33 2003
@@ -37,9 +37,7 @@
 #endif
 #include <assert.h>

-#ifdef WINDOWS
-# include <winsock.h>
-#else
+#ifndef WINDOWS
 # include <sys/socket.h>
 # include <netdb.h>
 # include <netinet/in.h>
@@ -201,7 +199,7 @@
   wget_sockaddr_set_address (&bsa, ip_default_family, 0, &bind_address);
   if (bind (sock, &bsa.sa, sockaddr_len ()))
{
- close (sock);
+ CLOSE (sock);
  sock = -1;
  goto out;
}
@@ -211,7 +209,7 @@
   if (connect_with_timeout (sock, &sa.sa, sockaddr_len (),
                             opt.connect_timeout) < 0)
 {
-  close (sock);
+  CLOSE (sock);
   sock = -1;
   goto out;
 }
--

--gv




run_with_timeout() for Windows

2003-10-02 Thread Gisle Vanem
I've patched util.c to make run_with_timeout() work on
Windows (better than it does with alarm()!).

In short it creates and starts a thread, then loops querying 
the thread exit-code. It breaks out if the code != STILL_ACTIVE, else
sleeps for 0.1 sec. It uses a wget_timer too for added accuracy.

Tested with --dns-timeout, --connect-timeout, gethostbyname()
and getaddrinfo(). Built and tested with MingW/gcc 3.3.1, OpenWatcom 
1.1 and DMC 8.36, but not MSVC 6. All seems okay. 

I have a problem with run_with_timeout() returning 1 and hence
lookup_host() reporting ETIMEDOUT. Isn't TRY_AGAIN more suited
indicating the caller should try a longer timeout?

Patch against beta-2 (I think):

--- src/utils.c.orig Sun Sep 21 01:12:18 2003
+++ src/utils.c Thu Oct 02 22:04:01 2003
@@ -1965,12 +1965,141 @@
 # endif /* not HAVE_SIGSETJMP */
 #endif /* USE_SIGNAL_TIMEOUT */
 
+
+#if defined(WINDOWS)
+
+/* Wait for thread completion in 0.1s intervals (a tradeoff between 
+ * CPU loading and resolution).
+ */
+#define THREAD_WAIT_INTV   100  
+#define THREAD_STACK_SIZE  4096 
+
+struct thread_data {
+   void (*fun) (void *);
+   void  *arg;
+   DWORD ws_error; 
+};
+
+static DWORD WINAPI 
+thread_helper (void *arg)
+{
+  struct thread_data *td = (struct thread_data *) arg;
+  
+  WSASetLastError (0);
+  td->ws_error = 0;
+  (*td->fun) (td->arg);
+  
+  /* Since run_with_timeout() is only used for Winsock functions and
+   * Winsock errors are per-thread, we must return this to caller.
+   */
+  td->ws_error = WSAGetLastError();
+  return (0); 
+}
+
+#ifdef GV_DEBUG  /* I'll remove this eventually */
+#define DEBUGN(lvl,x)  do { if (opt.verbose >= (lvl)) DEBUGP (x); } while (0)
+#else
+#define DEBUGN(lvl,x)  ((void)0)
+#endif  
+
+/*
+ * Create a thread for 'fun' to run in. Since call-convention of 'fun' is
+ * undefined [1], we must call it via thread_helper() which must be __stdcall/WINAPI.
+ *
+ * Return -1 if illegal timeout or failed to create thread.
+ * Return +1 on thread timeout,
+ * else 0 (okay)
+ *
+ * [1] MSVC can use __fastcall globally (cl /Gr) and on Watcom this is the
+ * default (wcc386 -3r). 
+ */
+static BOOL
+spawn_thread (double seconds, void (*fun) (void *), void *arg)
+{
+  static HANDLE thread_hnd = NULL;
+  struct thread_data thread_arg;
+  struct wget_timer *timer;
+  DWORD  thread_id, exitCode;
+  double elapsed, max_msec;
+  
+  DEBUGN (2, ("seconds %.2f, ", seconds));
+  
+  if (seconds == 0.0)
+return (-1); /* run blocking 'fun' */
+
+  if (seconds < 1.0)
+seconds = 1.0;
+   
+  /* Should never happen, but test for recursivety anyway */
+  assert (thread_hnd == NULL);  
+  thread_arg.arg = arg;
+  thread_arg.fun = fun;
+  thread_hnd = CreateThread (NULL, THREAD_STACK_SIZE,
+                             thread_helper, (void*)&thread_arg, 
+                             0, &thread_id); 
+  if (!thread_hnd)
+  {
+    DEBUGP (("CreateThread() failed; %s\n", strerror(GetLastError())));
+return (-1);  
+  }
+ 
+  exitCode = STILL_ACTIVE;
+  max_msec = 1000.0 * seconds;
+  timer = wtimer_new();  
+  
+  /* Sleep() isn't very accurate, so do a double check in the for-loop */
+  for (elapsed = 0.0; 
+       elapsed < max_msec && wtimer_elapsed(timer) < max_msec;
+   elapsed += (double)THREAD_WAIT_INTV)
+  {
+    GetExitCodeThread (thread_hnd, &exitCode);
+    DEBUGN (2, ("thread exit-code %lu\n", exitCode));
+if (exitCode != STILL_ACTIVE)
+   break;
+Sleep (THREAD_WAIT_INTV);
+  }
+  
+  DEBUGN (2, ("elapsed %.2f, wtimer_elapsed %.2f, ", elapsed, wtimer_elapsed(timer)));
+  
+  wtimer_delete (timer);
+
+  /* If we timed out kill the thread. Normal thread exitCode would be 0.
+   */
+  if (exitCode == STILL_ACTIVE)
+  {
+    DEBUGN (2, ("thread timed out\n"));
+exitCode = 1;
+TerminateThread (thread_hnd, exitCode);
+WSASetLastError (ETIMEDOUT); /* overridden by caller */
+  }  
+  else
+  {
+    DEBUGN (2, ("thread exit-code %lu, WS error %lu\n", exitCode, thread_arg.ws_error));
+exitCode = 0; 
+WSASetLastError (thread_arg.ws_error);
+  }  
+  thread_hnd = NULL;
+  return (exitCode);
+}
+#endif  /* WINDOWS */
+
 int
 run_with_timeout (double timeout, void (*fun) (void *), void *arg)
 {
-#ifndef USE_SIGNAL_TIMEOUT
+#if defined(WINDOWS)
+  int rc = spawn_thread (timeout, fun, arg);
+  
+  if (rc < 0)
+  {
+fun (arg);
+rc = 0;
+  }  
+  return rc;
+  
+#elif !defined(USE_SIGNAL_TIMEOUT)
   fun (arg);
   return 0;
+
 #else
   int saved_errno;



Gisle V.

# rm /bin/laden 
/bin/laden: Not found



Re: run_with_timeout() for Windows

2003-10-02 Thread Gisle Vanem
Forgot this in src/Changelog:

2003-10-02  Gisle Vanem  [EMAIL PROTECTED]

* utils.c (run_with_timeout): For Windows: Run the 'fun' in
  a thread via a helper function. Continually query the 
  thread's exit-code until finished or timed out. 

PS.:

+static DWORD WINAPI 
+thread_helper (void *arg)
+{
+  struct thread_data *td = (struct thread_data *) arg;
+  
+  WSASetLastError (0);
+  td->ws_error = 0;

AFAIK, error-codes are inherited from parent-thread, but
not conveyed back. That's why I clear it in the new thread.

Gisle V.

# rm /bin/laden 
/bin/laden: Not found 



Re: run_with_timeout() for Windows

2003-10-02 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 I've committed this patch, with minor changes, such as moving the code
 to mswindows.c.  Since I don't have MSVC, someone else will need to
 check that the code compiles.  Please let me know how it goes.

I compiled it with MSVC okay, but it crashed somewhere
unrelated. Both before and after my patch.

--gv



Windows titlebar patch

2003-09-29 Thread Gisle Vanem
I've made a patch to show percentage downloaded in addition
to URL in the titlebar. Real handy IMHO on minimised windows 
sitting in the bottom toolbar.
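
The title formatting on its own, as a portable sketch. format_percent_title() is a hypothetical helper and differs from the patch below in that it takes byte counts and uses snprintf() with an explicit buffer size; the title layout "Wget [NN.N%] URL" is the one the patch uses.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "Wget [NN.N%] URL" into BUF.  Returns 1 if
   the whole title fit, 0 if the expected size is unknown, the counts are
   inconsistent, or BUF is too small. */
int
format_percent_title (char *buf, size_t size, long done, long expected,
                      const char *url)
{
  double percent;
  int n;

  if (expected <= 0 || done < 0 || done > expected)
    return 0;

  percent = 100.0 * (double) done / (double) expected;
  n = snprintf (buf, size, "Wget [%.1f%%] %s", percent, url);
  return n >= 0 && (size_t) n < size;
}
```

On Windows the resulting string would then be handed to SetConsoleTitle().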

2003-09-26  Gisle Vanem  [EMAIL PROTECTED]
  * src/mswindows.c: Added ws_percenttitle() showing progress
in the window titlebar. Called from retr.c. 
Secured ws_mypath().

  * windows/config.h.ms: alloca() prototype not needed.
Removed #undef ENABLE_NLS; should be in Makefile IMHO. 
Moved WGET_USE_STDARG from mswindows.h to config.h.ms 
because of #ifdef in log.c. (MSVC's vararg.h and stdarg.h
are incompatible).

diff -u3 -H -B -r src/mswindows.c.orig src/mswindows.c
--- src/mswindows.c.orig Sat Sep 27 02:35:28 2003
+++ src/mswindows.c Tue Sep 30 03:15:37 2003
@@ -37,6 +37,7 @@
 #include <string.h>
 #include <assert.h>
 #include <errno.h>
+#include <math.h>
 
 #ifdef HACK_BCC_UTIME_BUG
 # include <io.h>
@@ -176,22 +178,41 @@
   return TRUE;
 }
 
+static char *title_buf = NULL;
+static char *curr_url  = NULL;
+static int   num_urls  = 0;
+
 void
-ws_changetitle (char *url, int nurl)
+ws_changetitle (const char *url, int nurl)
 {
-  char *title_buf;
   if (!nurl)
 return;
 
-  title_buf = (char *)alloca (strlen (url) + 20);
-  sprintf (title_buf, "Wget %s%s", url, nurl == 1 ? "" : " ...");
-  SetConsoleTitle (title_buf);
+  num_urls = nurl;
+  if (title_buf)
+ xfree(title_buf);
+  if (curr_url)
+ xfree(curr_url);
+  title_buf = (char *)xmalloc (strlen (url) + 20);
+  curr_url = xstrdup(url);
+  sprintf(title_buf, "Wget %s%s", url, nurl == 1 ? "" : " ...");
+  SetConsoleTitle(title_buf);
+}
+
+void
+ws_percenttitle (double percent)
+{
+  if (num_urls == 1 && title_buf && curr_url && fabs(percent) <= 100.0)
+    {
+      sprintf (title_buf, "Wget [%.1f%%] %s", percent, curr_url);
+  SetConsoleTitle (title_buf);
+}
 }
 
 char *
 ws_mypath (void)
 {
-  static char *wspathsave;
+  static char *wspathsave = NULL;
   char buffer[MAX_PATH];
   char *ptr;
 
@@ -200,14 +221,11 @@
   return wspathsave;
 }
 
-  GetModuleFileName (NULL, buffer, MAX_PATH);
-
-  ptr = strrchr (buffer, '\\');
-  if (ptr)
+  if (GetModuleFileName (NULL, buffer, MAX_PATH) &&
+  (ptr = strrchr (buffer, PATH_SEPARATOR)) != NULL)
 {
   *(ptr + 1) = '\0';
-  wspathsave = (char*) xmalloc (strlen (buffer) + 1);
-  strcpy (wspathsave, buffer);
+  wspathsave = xstrdup (buffer);
 }
   else
 wspathsave = NULL;
 
diff -u3 -H -B -r src/mswindows.h.orig src/mswindows.h
--- src/mswindows.h.orig Sat Sep 27 02:35:28 2003
+++ src/mswindows.h Sat Sep 27 06:36:44 2003
@@ -65,10 +65,6 @@
 #endif
 #endif
 
-/* Use ANSI-style stdargs regardless of whether the compiler bothers
-   to define __STDC__.  (Many don't when extensions are enabled.)  */
-#define WGET_USE_STDARG
-
 #define REALCLOSE(x) closesocket (x)
 
 /* read & write don't work with sockets on Windows 95.  */
@@ -135,7 +132,8 @@
 #endif
 
 void ws_startup (void);
-void ws_changetitle (char*, int);
+void ws_changetitle (const char*, int);
+void ws_percenttitle (double);
 char *ws_mypath (void);
 void ws_help (const char *);
 void windows_main_junk (int *, char **, char **);

diff -u3 -H -B -r src/retr.c.orig src/retr.c
--- src/retr.c.orig Mon Sep 22 15:34:55 2003
+++ src/retr.c Sat Sep 27 07:00:25 2003
@@ -238,7 +238,13 @@
 
   if (progress)
  progress_update (progress, res, dltime);
+
   *len += res;
+#ifdef WINDOWS
+  if (use_expected && expected > 0)
+ ws_percenttitle (100.0 * (double)(*len) / (double)expected);
+#endif
+
 }
   if (res < -1)
 res = -1;

diff -u3 -H -B -r windows/config.h.ms.orig windows/config.h.ms
--- windows/config.h.ms.orig Sat Sep 27 02:35:31 2003
+++ windows/config.h.ms Sat Sep 27 05:14:41 2003
@@ -32,11 +32,6 @@
 /* Define if you have the <alloca.h> header file.  */
 #undef HAVE_ALLOCA_H
 
-#if !defined(__GNUC__) && !defined(__DMC__) && !defined(__WATCOMC__)
-/* Microsoft and Watcom libraries have an alloca function. */
-char *alloca ();
-#endif
-
 /* Define to empty if the keyword does not work.  */
 /* #undef const */
 
@@ -53,9 +48,6 @@
significant byte first).  */
 #undef WORDS_BIGENDIAN
 
-/* Define this if you want the NLS support.  */
-#undef ENABLE_NLS
-
 /* Define if you want the FTP support for Opie compiled in.  */
 #define USE_OPIE 1
 
@@ -127,6 +119,11 @@
 
 /* Define if you have the <stdarg.h> header file.  */
 #define HAVE_STDARG_H 1
+
+/* Use ANSI-style stdargs regardless of whether the compiler bothers
+   to define __STDC__.  (Many don't when extensions are enabled.)
+   This define used to be in mswindows.h, but wasn't of any use there */
+#define WGET_USE_STDARG
 
 /* Define if you have the <stdlib.h> header file.  */
 #define HAVE_STDLIB_H 1

---

Gisle V.

# rm /bin/laden 
/bin/laden: Not found



Windows patches

2003-09-26 Thread Gisle Vanem
Some more patches for wget on Windows.

1) config.h.ms: DMC already has usleep() and sleep().
2) mswindows.c: 
- Removed read_registry() as it's not needed.
- Added set_sleep_mode() to prevent Windows entering sleep-mode
  or hibernation on long transfers. Console mode programs don't seem
  to reset the idle-counter (as GUI programs do).

Patches against latest CVS version:

--- orig/windows/config.h.ms Fri Sep 26 00:39:37 2003
+++ windows/config.h.ms  Sat Sep 27 01:12:43 2003
@@ -139,9 +139,10 @@
 #undef HAVE_UNISTD_H
 #endif

-/* None except Digital Mars have usleep function */
+/* None except Digital Mars have sleep/usleep functions */
 #if defined(__DMC__)
 #define HAVE_USLEEP
+#define HAVE_SLEEP
 #endif

diff -u3 -H -B orig/src/mswindows.c ./mswindows.c
--- orig/src/mswindows.c Fri Sep 26 00:39:35 2003
+++ ./mswindows.c Sat Sep 27 01:58:57 2003
@@ -57,13 +57,25 @@
 extern int errno;
 #endif

+#ifndef ES_SYSTEM_REQUIRED
+#define ES_SYSTEM_REQUIRED  0x00000001
+#endif
+
+#ifndef ES_CONTINUOUS
+#define ES_CONTINUOUS   0x80000000
+#endif
+
+
 /* Defined in log.c.  */
 void log_request_redirect_output PARAMS ((const char *));

-static int windows_nt_p;
+static DWORD set_sleep_mode (DWORD mode);

+static DWORD pwr_mode = 0;
+static int windows_nt_p;

 #ifndef HAVE_SLEEP
+
 /* Emulation of Unix sleep.  */

 unsigned int
@@ -92,21 +105,6 @@
 }
 #endif  /* HAVE_USLEEP */

-static char *
-read_registry (HKEY hkey, char *subkey, char *valuename, char *buf, int *len)
-{
-  HKEY result;
-  DWORD size = *len;
-  DWORD type = REG_SZ;
-  if (RegOpenKeyEx (hkey, subkey, 0, KEY_READ, &result) != ERROR_SUCCESS)
-    return NULL;
-  if (RegQueryValueEx (result, valuename, NULL, &type, (LPBYTE)buf, &size) != ERROR_SUCCESS)
-buf = NULL;
-  *len = size;
-  RegCloseKey (result);
-  return buf;
-}
-
 void
 windows_main_junk (int *argc, char **argv, char **exec_name)
 {
@@ -125,6 +123,9 @@
 ws_cleanup (void)
 {
   WSACleanup ();
+  if (pwr_mode)
+ set_sleep_mode (pwr_mode);
+  pwr_mode = 0;
 }

 static void
@@ -170,7 +171,7 @@
 case CTRL_CLOSE_EVENT:
 case CTRL_LOGOFF_EVENT:
 default:
-  WSACleanup ();
+  ws_cleanup ();
   return FALSE;
 }
   return TRUE;
@@ -266,6 +267,7 @@
   exit (1);
 }
   atexit (ws_cleanup);
+  pwr_mode = set_sleep_mode (0);
   SetConsoleCtrlHandler (ws_handler, TRUE);
 }

@@ -295,3 +297,31 @@
   return res;
 }
 #endif
+
+/*
+ * Prevent Windows entering sleep/hibernation-mode while wget is doing a lengthy transfer.
+ * Windows does by default not consider network activity in console-programs as activity !
+ * Works on Win-98/ME/2K and up.
+ */
+static
+DWORD set_sleep_mode (DWORD mode)
+{
+  HMODULE mod = LoadLibrary ("kernel32.dll");
+  DWORD (*_SetThreadExecutionState) (DWORD) = NULL;
+  DWORD rc = (DWORD)-1;
+
+  if (mod)
+    (void*)_SetThreadExecutionState = GetProcAddress ((HINSTANCE)mod, "SetThreadExecutionState");
+
+  if (_SetThreadExecutionState)
+{
+  if (mode == 0)  /* first time */
+ mode = (ES_SYSTEM_REQUIRED | ES_CONTINUOUS);
+  rc = (*_SetThreadExecutionState) (mode);
+}
+  if (mod)
+ FreeLibrary (mod);
+  DEBUGP (("set_sleep_mode(): mode 0x%08lX, rc 0x%08lX\n", mode, rc));
+  return (rc);
+}
+

diff -u3 -H -B orig/src/mswindows.h ./mswindows.h
--- orig/src/mswindows.h Fri Sep 26 00:39:35 2003
+++ ./mswindows.h Sat Sep 27 02:01:13 2003
@@ -125,11 +125,6 @@
 #define ESTALE  WSAESTALE
 #define EREMOTE WSAEREMOTE

-#ifdef __DMC__
-# define HAVE_SLEEP 1
-# define HAVE_USLEEP 1
-#endif
-
 /* Public functions.  */


Gisle V.

# rm /bin/laden 
/bin/laden: Not found



Re: Capture HTML Stream

2003-07-09 Thread Gisle Vanem
Aaron S. Hawley [EMAIL PROTECTED] said:

 but wget could do
 
 wget -O /dev/stdout www.washpost.com

On DOS/Windows too? I think not. There must be a better way.

--gv




Preventing sleep mode on long transfers

2003-01-06 Thread Gisle Vanem
Hi,

How can wget (or some other program) prevent Win-XP from going
into sleep-mode (or hibernation) on long transfers (recursive or mirroring
d/l)? I've set sleep-mode to activate after 15 min, but must disable it
when I suspect the d/l will take longer (--continue doesn't always
work).

Network activity in console-mode programs doesn't seem to reset
the idle-timer in Win-XP. With GUI programs this never happens.

I use wget 1.9-beta (w/OpenSSL).

Could wget perhaps intercept the WM_POWERNOTIFY message
somehow?

Gisle V.