Hi,

I have tried out the wget alpha (1.11-alpha-1) under Linux and found that the
timestamping option -N (which I usually have set) does not work correctly.

The first thing I noticed was that on *every* download I got the line
   Remote file is newer, retrieving.
in the output, even when there was no local file.
At first that looked like a purely cosmetic issue, but further tests showed
that more is going wrong.


First test run, with wgetrc disabled for the test and no local file present beforehand:

wget.111 -d -N http://www.uni-koeln.de

Setting --timestamping (timestamping) to 1
DEBUG output created by Wget 1.11-alpha-1 on linux-gnu.

--20:46:41--  http://www.uni-koeln.de/
Resolving www.uni-koeln.de... 134.95.19.39
Caching www.uni-koeln.de => 134.95.19.39
Connecting to www.uni-koeln.de|134.95.19.39|:80... connected.
Created socket 3.
Releasing 0x08086440 (new refcount 1).

---request begin---
HEAD / HTTP/1.0
User-Agent: Wget/1.11-alpha-1
Accept: */*
Host: www.uni-koeln.de
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Sat, 17 Jun 2006 18:46:41 GMT
Server: Apache/2.0.52
Last-Modified: Wed, 14 Jun 2006 06:47:06 GMT
Accept-Ranges: bytes
Content-Type: text/html; charset=iso-8859-1
Connection: close

---response end---
200 OK
hs->local_file is: index.html (not existing)
TEXTHTML is on.
Length: unspecified [text/html]
Closed fd 3
Remote file is newer, retrieving.

--20:46:41--  http://www.uni-koeln.de/
Found www.uni-koeln.de in host_name_addresses_map (0x8086440)
Connecting to www.uni-koeln.de|134.95.19.39|:80... connected.
Created socket 3.
Releasing 0x08086440 (new refcount 1).

---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.11-alpha-1
Accept: */*
Host: www.uni-koeln.de
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Sat, 17 Jun 2006 18:46:42 GMT
Server: Apache/2.0.52
Last-Modified: Wed, 14 Jun 2006 06:47:06 GMT
Accept-Ranges: bytes
Content-Type: text/html; charset=iso-8859-1
Connection: close

---response end---
200 OK
hs->local_file is: index.html (not existing)
TEXTHTML is on.
Length: unspecified [text/html]
Saving to: `index.html'

    [   <=>
    ] 20,703      27.0K/s   in 0.7s

Closed fd 3
20:46:43 (27.0 KB/s) - `index.html' saved [20703]


The old version just does an HTTP GET in this case and writes to the local file.

Here I see that it does an HTTP HEAD first, *then* says:
   local_file is: index.html (not existing)
which is correct.
Then it says:
   Remote file is newer, retrieving.
which is questionable, as there is no local file yet to compare against.
Then it does an HTTP GET and saves the file, which is correct.
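
As a rough illustration of the decision one would expect from the timestamping
option, here is a minimal sketch. It is not wget's actual code; the names
Action and expected_action are hypothetical, and the sketch compares only
modification times (in seconds since the epoch), ignoring file sizes.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    GET_NEW = "no local file: plain GET, nothing to compare"
    SKIP = "remote is not newer than local: do not retrieve"
    GET_OVERWRITE = "remote is newer: retrieve and overwrite the local file"


def expected_action(local_mtime: Optional[float],
                    remote_mtime: Optional[float]) -> Action:
    """Hypothetical helper: pick the expected action for a timestamped download.

    local_mtime is None when the local file does not exist;
    remote_mtime is None when the server sent no Last-Modified header.
    """
    if local_mtime is None:
        # Nothing to compare against, so "remote file is newer" does not apply.
        return Action.GET_NEW
    if remote_mtime is None:
        # Assumption: without a remote timestamp the file is simply retrieved.
        return Action.GET_OVERWRITE
    if remote_mtime > local_mtime:
        return Action.GET_OVERWRITE
    return Action.SKIP
```

Under this sketch the run above falls into the GET_NEW case, where a
"Remote file is newer" message would not be expected.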


When I do the same request again (now with the file existing locally) I get:


wget.111 -d -N http://www.uni-koeln.de

Setting --timestamping (timestamping) to 1
DEBUG output created by Wget 1.11-alpha-1 on linux-gnu.

--20:49:01--  http://www.uni-koeln.de/
Resolving www.uni-koeln.de... 134.95.19.39
Caching www.uni-koeln.de => 134.95.19.39
Connecting to www.uni-koeln.de|134.95.19.39|:80... connected.
Created socket 3.
Releasing 0x08086440 (new refcount 1).

---request begin---
HEAD / HTTP/1.0
User-Agent: Wget/1.11-alpha-1
Accept: */*
Host: www.uni-koeln.de
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Sat, 17 Jun 2006 18:49:01 GMT
Server: Apache/2.0.52
Last-Modified: Wed, 14 Jun 2006 06:47:06 GMT
Accept-Ranges: bytes
Content-Type: text/html; charset=iso-8859-1
Connection: close

---response end---
200 OK
hs->local_file is: index.html (existing)
TEXTHTML is on.
Length: unspecified [text/html]
Closed fd 3
Remote file is newer, retrieving.

--20:49:01--  http://www.uni-koeln.de/
Found www.uni-koeln.de in host_name_addresses_map (0x8086440)
Connecting to www.uni-koeln.de|134.95.19.39|:80... connected.
Created socket 3.
Releasing 0x08086440 (new refcount 1).

---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.11-alpha-1
Accept: */*
Host: www.uni-koeln.de
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Date: Sat, 17 Jun 2006 18:49:01 GMT
Server: Apache/2.0.52
Last-Modified: Wed, 14 Jun 2006 06:47:06 GMT
Accept-Ranges: bytes
Content-Type: text/html; charset=iso-8859-1
Connection: close

---response end---
200 OK
hs->local_file is: index.html.1 (not existing)
TEXTHTML is on.
Length: unspecified [text/html]
Saving to: `index.html.1'

    [ <=>
    ] 20,703      --.-K/s   in 0.1s

Closed fd 3
20:49:02 (165 KB/s) - `index.html.1' saved [20703]


It starts the same way as in the first case:
  HTTP HEAD
and says:
  local_file is: index.html (existing)
which is correct now.
Then again it says:
  Remote file is newer, retrieving.
which is wrong now, as local and remote are the same.
Then it goes ahead, downloads the file and saves it to
index.html.1, as if the timestamping option were not set at all,
which is wrong again.
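
The "remote is newer" claim can be checked independently of wget by comparing
the Last-Modified header from the log with the modification time of the local
file. The following is a small sketch under that assumption; remote_is_newer
is a hypothetical helper name, and the check looks at timestamps only.

```python
import os
from email.utils import parsedate_to_datetime


def remote_is_newer(last_modified: str, local_path: str) -> bool:
    """Return True if the Last-Modified header value is later than the
    modification time of the local file."""
    # HTTP dates such as "Wed, 14 Jun 2006 06:47:06 GMT" parse to an
    # aware UTC datetime, so timestamp() gives seconds since the epoch.
    remote = parsedate_to_datetime(last_modified).timestamp()
    local = os.stat(local_path).st_mtime
    return remote > local
```

For example, remote_is_newer("Wed, 14 Jun 2006 06:47:06 GMT", "index.html")
would be expected to return False if the local index.html carries the same
modification time as the server's copy.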


Best Regards,

Jochen Roderburg
ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10                    Tel.:   +49-221/478-7024
D-50931 Koeln                          E-Mail: [EMAIL PROTECTED]
Germany




