Using the following file of junk cookies:

--------------------------------------- bugcookies.txt -----------------------
apache.org TRUE / FALSE 1056383818 foo bar
apache.org TRUE / FALSE 1040399818 baz quux
apache.org TRUE / FALSE 1052979156 zorkminster piffle
www.apache.org TRUE / FALSE 1025859542 sniffle cribbage
www.apache.org TRUE / FALSE 1025859542 wiffle persiflage
--------------------------------------- end bugcookies.txt ------------------

When I make a request to www.apache.org (see below for log) only the last 2
of the 5 cookies are used in the request, rather than all 5, which a regular
browser such as Mozilla or IE would send.

I believe the Netscape cookie specification indicates that the domain cookies
should be applicable. The relevant part of the unofficial 'specification'
(http://wp.netscape.com/newsref/std/cookie_spec.html) is:

"domain=DOMAIN_NAME

When searching the cookie list for valid cookies, a comparison of the domain
attributes of the cookie is made with the Internet domain name of the host
from which the URL will be fetched. If there is a tail match, then the cookie
will go through path matching to see if it should be sent. "Tail matching"
means that domain attribute is matched against the tail of the fully qualified
domain name of the host. A domain attribute of "acme.com" would match host
names "anvil.acme.com" as well as "shipping.crate.acme.com".

Only hosts within the specified domain can set a cookie for a domain and
domains must have at least two (2) or three (3) periods in them to prevent
domains of the form: ".com", ".edu", and "va.us". Any domain that fails within
one of the seven special top level domains listed below only require two
periods. Any other domain requires at least three. The seven special top level
domains are: "COM", "EDU", "NET", "ORG", "GOV", "MIL", and "INT".

The default value of domain is the host name of the server which generated the
cookie response."

It appears that the algorithm for cookie matching in wget-1.8.1/src/cookies.c
does not agree with this spec.
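For illustration, here is a minimal sketch of the tail matching described in the quoted specification. It is not taken from wget's cookies.c: the function name domain_tail_match is hypothetical. The sketch makes two assumptions that go beyond the quoted text: a leading dot on the cookie domain is ignored, and the match has to begin at a label boundary. It does not attempt the period-counting rule.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from wget): returns 1 if cookie_domain
   tail-matches host, 0 otherwise.  */
int
domain_tail_match (const char *cookie_domain, const char *host)
{
  size_t dlen, hlen, i;
  const char *tail;

  /* Assumption: ".apache.org" and "apache.org" are treated alike.  */
  if (*cookie_domain == '.')
    ++cookie_domain;

  dlen = strlen (cookie_domain);
  hlen = strlen (host);
  if (dlen == 0 || dlen > hlen)
    return 0;

  /* Compare the last dlen characters of host, ignoring case.  */
  tail = host + (hlen - dlen);
  for (i = 0; i < dlen; i++)
    if (tolower ((unsigned char) tail[i])
        != tolower ((unsigned char) cookie_domain[i]))
      return 0;

  /* Assumption beyond the quoted text: the match must start at a label
     boundary, so "apache.org" does not match "notapache.org".  */
  return hlen == dlen || tail[-1] == '.';
}
```

Under this sketch, all five entries in bugcookies.txt would pass the domain check for the host www.apache.org: the three apache.org entries by tail match and the two www.apache.org entries by exact match.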
Jeff

Log trace:

$ wget -d --load-cookies c:/temp/bugcookies.txt -O c:/temp/trywget.html http://www.apache.org/
DEBUG output created by Wget 1.8.1 on cygwin.

Stored cookie apache.org 80 / permanent 0 Mon Jun 23 08:56:58 2003 foo bar
Stored cookie apache.org 80 / permanent 0 Fri Dec 20 07:56:58 2002 baz quux
Stored cookie apache.org 80 / permanent 0 Wed May 14 23:12:36 2003 zorkminster piffle
Stored cookie www.apache.org 80 / permanent 0 Fri Jul 5 01:59:02 2002 sniffle cribbage
Stored cookie www.apache.org 80 / permanent 0 Fri Jul 5 01:59:02 2002 wiffle persiflage
--10:21:43-- http://www.apache.org/
           => `c:/temp/trywget.html'
Resolving www.apache.org... done.
Caching www.apache.org => 63.251.56.142
Connecting to www.apache.org[63.251.56.142]:80... connected.
Created socket 4.
Releasing 0x100b8a50 (new refcount 1).
---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.apache.org
Accept: */*
Connection: Keep-Alive
Cookie: sniffle=cribbage; wiffle=persiflage

---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Thu, 20 Jun 2002 17:21:39 GMT
Server: Apache/2.0.39 (Unix)
Cache-Control: max-age=86400
Expires: Fri, 21 Jun 2002 17:21:39 GMT
Accept-Ranges: bytes
Content-Length: 7611
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html

Found www.apache.org in host_name_addresses_map (0x100b8a50)
Registered fd 4 for persistent reuse.
Length: 7,611 [text/html]

100%[============================================================================>] 7,611 181.28K/s ETA 00:00

10:21:43 (181.28 KB/s) - `c:/temp/trywget.html' saved [7611/7611]
