Cookie handling problem

2006-02-10 Thread wget . overbored
Hi all, I have a problem archiving a website using wget 1.10.2. The
server sends cookies with incorrect syntax (each Set-Cookie line ends
in an empty Path= attribute). Here is the relevant wget --debug output:

=== BEGIN OUTPUT ===

---response begin---
HTTP/1.1 302 Moved Temporarily
Date: Sat, 11 Feb 2006 03:58:47 GMT
Server: Apache/1.3.28 (Unix) mod_jk/1.2.3-dev mod_ssl/2.8.15 OpenSSL/0.9.7c
Set-Cookie: ok_test=test; Path=
Set-Cookie: ok_iptest=272.14.207.99; Path=
Location: http://myhost/?cookie_test=1
Connection: close
Content-Type: text/plain

---response end---
302 Moved Temporarily
Error in Set-Cookie, field `Path'Syntax error in Set-Cookie:
ok_test=test; Path= at position 19.
Error in Set-Cookie, field `Path'Syntax error in Set-Cookie:
ok_iptest=272.14.207.99; Path= at position 29.

=== END OUTPUT ===

This is a case that all the major browsers seem to be able to handle
fine. Is there any way around this (short of, e.g., debugging wget or
writing my own proxy in Python to modify these bad cookie header
lines)? Thanks in advance!
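
The proxy idea mentioned above could be built around a small filter that rewrites each Set-Cookie value before wget sees it. The following is only a sketch: drop_empty_path is a hypothetical helper, not part of wget or of any library. It removes a Path attribute that carries no value, on the assumption that the client then falls back to its default path. It splits on ";" naively, so quoted cookie values that contain semicolons are out of scope.

```python
def drop_empty_path(set_cookie_value: str) -> str:
    """Return the Set-Cookie value without a valueless Path attribute.

    Hypothetical helper for illustration; not taken from wget.
    """
    parts = [part.strip() for part in set_cookie_value.split(";")]
    kept = []
    for index, part in enumerate(parts):
        name, _, value = part.partition("=")
        # Index 0 is the cookie's own name=value pair and is never dropped.
        is_attribute = index > 0
        if is_attribute and name.strip().lower() == "path" and not value.strip():
            continue  # skip "Path=" (or a bare "Path") with nothing after it
        if part:
            kept.append(part)
    return "; ".join(kept)


if __name__ == "__main__":
    # The two header values from the debug output above.
    for header in ("ok_test=test; Path=", "ok_iptest=272.14.207.99; Path="):
        print(drop_empty_path(header))
```

A proxy would apply this to every Set-Cookie header in the response and pass all other headers through unchanged.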



Re: Wget not resending cookies on Location: in headers

2005-04-26 Thread wget
The obvious problem is that this command lacks --keep-session-cookies,
and the cookie it gets is session-based.
I tried to reproduce the bug in the more generic way.
But there are other problems
as well: if you examine the cookie.txt produced by (the amended
version of) the first command, you'll notice that the cookie's path is
wget/setcookie.php.  For one, the setcookie.php part should have
been stripped (Mozilla does this, I've just checked).  Second, the
path should always begin with a slash.  Either of these problems would
guarantee that no other URL would ever match this cookie.
I've now fixed both bugs in the CVS, along with a third, unrelated
bug.  Please let me know if the latest CVS works for you.  (It works
for me on the example you set up.)
Thanks a lot for your corrections. It's now working like a charm. It's 
also working with session cookies.

Regards,
Pierre
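
The two path problems described in this message (the file name not being stripped, and the missing leading slash) can be illustrated with a short sketch. It follows the default-path and path-match rules later written down in RFC 6265; it is not the code that went into the Wget CVS, and both function names are made up for this example.

```python
def default_cookie_path(request_path: str) -> str:
    """Default cookie path for a request path, per RFC 6265 section 5.1.4."""
    if not request_path.startswith("/"):
        return "/"
    # Drop the last path component: "/wget/setcookie.php" -> "/wget".
    directory = request_path.rsplit("/", 1)[0]
    return directory or "/"


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if a cookie stored with cookie_path applies to request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end on a "/" boundary, so "/wget" does not match "/wgetx".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


if __name__ == "__main__":
    good = default_cookie_path("/wget/setcookie.php")
    print(good, path_matches("/wget/getcookie.php", good))
    # The path described above, with no leading slash and the file name kept:
    print(path_matches("/wget/getcookie.php", "wget/setcookie.php"))
```

With the stored path "wget/setcookie.php", no request path can match, which is consistent with the cookie never being sent to any other URL.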


Wget not resending cookies on Location: in headers

2005-04-25 Thread wget
Hello,
I use Wget version 1.10-alpha2+cvs-dev (because of the availability of 
the --keep-session-cookies option).

I'm trying to wget a member page, where cookies are required for access.
The usual login procedure is:
-----------------------------
- Get a session cookie (PHPSESSID) on http://host.com/index.php
- Get http://host.com/checkuser.php which defines additional cookies. 
checkuser.php requires PHPSESSID, username, and password in POST 
method.
- The server responds with a Location: http://host.com/member.php header. 
Here is the point: member.php requires the cookies defined by 
index.php and checkuser.php. However, these cookies are not resent by 
Wget, so I can't access member.php (for some reason, the 
download of member.php ends with a timeout when cookies aren't set).

Here is how I proceed:
----------------------
- Get the PHPSESSID
$ wget --cookies=on --keep-session-cookies --save-cookies=cookie.txt 
http://host.com/index.php
(I'm then getting the value of PHPSESSID in a variable with a cut -s -f 
7 cookie.txt)
- Authenticate on http://host.com/checkuser.php
$ wget --referer='http://host.com/index.php' --cookies=on 
--load-cookies=cookie.txt --keep-session-cookies 
--save-cookies=cookie.txt 
--post-data='PHPSESSID=$phpsessid&username=$usr&password=$pwd' 
http://host.com/checkuser.php

Wget downloads checkuser.php and sets the new cookies properly, then 
follows the redirect to member.php, but it keeps retrying and eventually 
ends with a timeout.

Considerations
--------------
I can see two ways to avoid this issue:
- Tell Wget not to follow links in Location: field in headers. I 
could then resend the cookies to member.php;
- Tell Wget to resend cookies when following links in headers.

I didn't find anything in the documentation about these workarounds. 
How can I resolve this problem?

Best regards,
Pierre
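
The manual step of pulling PHPSESSID out of cookie.txt and assembling the POST body can be sketched as follows. This assumes the cookie file is in the tab-separated Netscape format that --save-cookies writes, where the name is field 6 and the value is field 7 (the field that cut -f 7 selects). The function names and the sample values are hypothetical.

```python
from urllib.parse import urlencode


def read_cookie(cookie_file_text: str, name: str):
    """Return the value of the named cookie from Netscape-format text, or None."""
    for line in cookie_file_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        # domain, subdomain flag, path, secure, expiry, name, value
        if len(fields) == 7 and fields[5] == name:
            return fields[6]
    return None


def build_post_data(session_id: str, user: str, password: str) -> str:
    """Join the form fields with '&' and percent-encode their values."""
    return urlencode(
        {"PHPSESSID": session_id, "username": user, "password": password}
    )


if __name__ == "__main__":
    sample = "# HTTP cookie file.\nhost.com\tFALSE\t/\tFALSE\t0\tPHPSESSID\tabc123\n"
    session = read_cookie(sample, "PHPSESSID")
    print(build_post_data(session, "pierre", "secret"))
```

The resulting string is the kind of value that would be passed to --post-data. Encoding it this way also protects against characters such as "&" or spaces inside the password.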


Re: Wget not resending cookies on Location: in headers

2005-04-25 Thread wget
Is there a publicly accessible site that exhibits this problem?
I've set up a small example which illustrates the problem. Files can be 
found at http://dev.mesca.net/wget/ (using demo:test as login).

Three files:
setcookie.php:
--------------
<? setcookie("wget", "I love it!"); ?>

getcookie.php:
--------------
<? header('Location: getcookie-redirect.php'); ?>

get-cookie-redirect.php:
------------------------
<?
if (isset($_COOKIE['wget'])) {
    echo "Ok, I can read the cookie: [wget] " . $_COOKIE['wget'];
} else {
    echo "Cookie is not set.";
}
?>

We first set the cookie by wgetting setcookie.php.
Then, we're trying to read the cookie by querying getcookie.php, which 
redirects to get-cookie-redirect.php: wget can't read it.

$ wget --http-user=demo --http-passwd=test --cookies=on 
--save-cookies=cookie.txt http://dev.mesca.net/wget/setcookie.php
$ wget --http-user=demo --http-passwd=test --cookies=on 
--load-cookies=cookie.txt http://dev.mesca.net/wget/getcookie.php

Note: tests were made using the latest version from the CVS 
(1.10-alpha2+cvs-dev).

On 26 Apr 2005, at 00:09, Hrvoje Niksic wrote:
- The server responds with a Location: http://host.com/member.php header.
Here is the point: member.php requires the cookies defined by
index.php and checkuser.php. However, these cookies are not resent by
Wget.
That sounds like a bug.  Wget is supposed to resend the cookies.
Could you provide any kind of debug information?  The contents of the
cookies is not important, but the path parameter and the expiry date
is.
According to my tests, the problem is still reproducible whatever 
Path and Expiry date contain.

Regards,
Pierre


wget appears to not send cookies on POST request

2005-04-07 Thread wget
Hi,

I'm trying to automate a rather long file download process using wget.
The process goes like this:

1) Request main home page (to get first session cookie)
2) Submit login form (to get next authorization cookie)
3) Submit another form (to get file to download)
4) Log out

Each of these steps is a separate call to wget from within the Perl
script that's controlling the whole process. I'm using the
--load-cookies, --save-cookies, and --keep-session-cookies options to
preserve the necessary cookies between steps.

This works fine up to the end of Step 2 -- at the end of Step 2, I'm
logged in and have the appropriate cookies.

But when I call wget for Step 3, wget doesn't send the cookie. I've
turned on wget's debugging output, and I can see that it's reading the
cookie file and importing the right cookies. I can see also that
according to the domain of the cookies read in, one cookie should be
sent to the server I'm posting the download form from. But according to
the debug output, wget isn't sending the cookie.

Because of that, the remote server doesn't recognize I'm logged in, and
I get sent back to the login page.

Does anyone have any suggestions as to how to get around this?

FYI, I'm using the Windows binary of wget 1.9+cvs-dev-200502251532.

Thanks.

Mike
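
When debugging a case like this, it can help to compare wget's behaviour with what another cookie implementation would send for the same cookie file and URL. The sketch below uses Python's standard http.cookiejar for that comparison; wget's own matching rules may differ in detail. The host names, cookie name and value are hypothetical, and the expiry is set far in the future because a cookie saved with expiry 0 is treated as expired by this loader.

```python
import http.cookiejar
import os
import tempfile
import urllib.request

# Hypothetical cookie file in the Netscape format used by --save-cookies.
EXAMPLE_COOKIES = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t2147483647\tauth\txyz\n"
)


def cookie_header_for(cookie_file_text, url, post_data=None):
    """Return the Cookie header Python would send to url, or None."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(cookie_file_text)
        path = handle.name
    try:
        jar = http.cookiejar.MozillaCookieJar(path)
        jar.load(ignore_discard=True)  # keep session cookies as well
    finally:
        os.unlink(path)
    # Supplying data makes this a POST request; no network access happens here.
    request = urllib.request.Request(url, data=post_data)
    jar.add_cookie_header(request)
    return request.get_header("Cookie")


if __name__ == "__main__":
    print(cookie_header_for(EXAMPLE_COOKIES,
                            "http://www.example.com/download.cgi",
                            b"file=report"))
```

If this reports a cookie for the Step 3 URL while wget's debug output shows none being sent, the difference points at the domain, path or expiry handling on the wget side rather than at the cookie file.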


problem downloading images from Zope documentation pages

2003-12-23 Thread rupert . bug-wget
Hi.

I'm trying to download the documentation for the Zope application
server, at http://zope.org/Documentation/Books/ZopeBook/2_6Edition/,
and I'm having problems getting the images.  For example, when I run
the following command:

  /usr/bin/wget -kp \
http://zope.org/Documentation/Books/ZopeBook/2_6Edition/Preface.stx

the only files I get are the HTML page itself and the robots.txt file.
I don't get any of the referenced images.

/usr/bin/wget is version 1.8.1, and is part of the Debian 3.0 distribution
that I'm running.  I downloaded and compiled wget 1.9.1 and tried the
same command with this newer version, and I got exactly the same result.

Can anyone explain why wget isn't fetching the images from this page,
or suggest how I could make wget fetch them?

Regards,

Ronan.
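
Since robots.txt is one of the two files fetched, one thing that might be worth checking is whether its rules exclude the image paths. Nothing in this thread confirms that this is the cause. The sketch below uses Python's standard urllib.robotparser to test a URL against a robots.txt body; the rules and URLs shown are invented for illustration and are not the real zope.org file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, not the actual file served by zope.org.
EXAMPLE_ROBOTS = "User-agent: *\nDisallow: /images/\n"


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.org/Preface.stx",
                "http://example.org/images/fig1.png"):
        print(url, allowed_by_robots(EXAMPLE_ROBOTS, "Wget", url))
```

Running the same check against the downloaded robots.txt and the image URLs referenced by the page would show whether the exclusion rules explain the missing files.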


Get a segmentation fault on this link command

2002-12-05 Thread wget
This is the command I use:
wget -mp -P/downs http://www.cs.wright.edu/people/faculty/agoshtas/tindex.html

I think it might be because tindex.html contains two badly written 
http-refresh calls in its META section, with different delays (2 and 15). 
Or it might be the call to twelcome.html.

Using GNU Wget 1.8.1

I suppose you might say the solution is to upgrade - well tell me if the 
upgraded version is ok with this.

Hal



Wget license and OpenSSL license incompatible

2001-09-06 Thread wget

FYI, the GPL license that wget is shipped with is incompatible with
the OpenSSL license. Below is a mail message I forwarded to the
development mailing list for lftp and a response from
[EMAIL PROTECTED]. As far as I know, this only presents a problem when
wget binaries linked against OpenSSL are *distributed*. The lftp
author has modified the license to allow lftp (distributed under the
GPL) to be linked against OpenSSL.

  [EMAIL PROTECTED] wrote:
   
   I don't know the specifics but the following is included in the
   license for fetchmail 5.8.17 (5.9.0 most recent version):
   
 Specific permission is granted for this code to be linked to OpenSSL
 (this is necessary because the OpenSSL license is not GPL-compatible).
   
   Because lftp is GPL, I presume it is not legal to redistribute
   binaries linked against OpenSSL. Any problems adding the above clause
   to the lftp license to make it legal?
   
   BTW, according to:
 http://www.fsf.org/licenses/license-list.html#TOCSoftwareLicenses
   I believe the incompatibility is the result of the advertising clause
   in the OpenSSL license.


  If you wrote LFTP and do not include or link against code from any 
  other GPL'd source (except fetchmail or others with this permission), 
  there is no problem with adding this to your license.

  If LFTP includes work from other GPL'd software, then you can't do this.  
  In that case, you might want to consider rewriting OpenSSL using a 
  GPL-compatible license.  Or, I hear there's a way to use OpenSSL without
  linking to it.

  --
  -David Novalis Turner,
  Licensing Question Volunteer,
  Free Software Foundation

-- 
albert chin ([EMAIL PROTECTED])



Re: How do I get SSL support to work in 1.7?

2001-06-07 Thread wget

On Thu, Jun 07, 2001 at 11:42:08AM +0200, Hrvoje Niksic wrote:
 [EMAIL PROTECTED] writes:
 
  Not surprising. Neither IRIX 6.5 nor Tru64 UNIX 4.0D have
  /dev/random.  So, you need either EGD/PRNGD to provide a substitute
  for your missing /dev/random. And, the *client* software has to be
  configured to support this. So, if wget doesn't call RAND_egd() from
  OpenSSL, there is *nothing* you can do. And, from a quick perusal of
  wget 1.7, it doesn't. So, 1.7 is useless for https:// on any system
  without /dev/random.
 
 Ouch.  I would be thankful for any patches that allowed the use of
 Wget/SSL on non-Linux systems.  (I know next to nothing about SSL
 myself.)

Is Wget available via CVS somewhere or should patches be against 1.7?

-- 
albert chin ([EMAIL PROTECTED])



Re: wget 1.7, linux, -rpath

2001-06-06 Thread wget

On Wed, Jun 06, 2001 at 06:36:26PM +0200, Jan Prikryl wrote:
 Quoting [EMAIL PROTECTED] ([EMAIL PROTECTED]):
 
  The ssl support is much appreciated in wget 1.7.  But there is a problem
  with the configure support that makes it think ssl can't be used, at
  least with gcc 2.95.2 on my redhat 6.2 system:
 
 Thanks for the report. Unfortunately the SSL test does not work on
 linux at all. Replacing -rpath with -Wl,-rpath will solve part of
 the problems. You may want to try if the attached patch works for
 you. Note that this is an unofficial patch and while it may help
 solving the SSL check problem, it may break other things.

Why don't you steal the --with-ssl option from cURL? It works.

-- 
albert chin ([EMAIL PROTECTED])



Re: How do I get SSL support to work in 1.7?

2001-06-06 Thread wget

On Wed, Jun 06, 2001 at 02:09:12PM -0400, Edward J. Sabol wrote:
 H. I've tried connecting to various sites using https without success.
 I've tried this on both IRIX 6.5.2 and Digital Unix 4.0d.
 
 When I installed OpenSSL 0.9.6a using the default configure options, it
 didn't make any shared libraries, but I have libssl.a and libcrypto.a
 installed, and wget's configure process does find them. (Do I need to install
 the shared libraries?)
 
 For example, I can connect to https://www.apache-ssl.org/ in Netscape just
 fine, but here's what happens when I try with wget 1.7:

 [... debug output removed ...]

Not surprising. Neither IRIX 6.5 nor Tru64 UNIX 4.0D have /dev/random.
So, you need either EGD/PRNGD to provide a substitute for your missing
/dev/random. And, the *client* software has to be configured to
support this. So, if wget doesn't call RAND_egd() from OpenSSL, there
is *nothing* you can do. And, from a quick perusal of wget 1.7, it
doesn't. So, 1.7 is useless for https:// on any system without
/dev/random.

Note that we have added such support to other programs and will
hopefully get time soon to do it to wget. cURL already has support for
EGD/PRNGD so we're just going to steal their solution.

-- 
albert chin ([EMAIL PROTECTED])
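
To see whether a given machine is in the situation described above, a quick check is to look for the random device and ask OpenSSL whether its PRNG considers itself seeded. The sketch below does this through Python's standard ssl module, which wraps OpenSSL's RAND_status(). It is a diagnostic aid only and says nothing about how wget itself initialises OpenSSL; the function name is made up for this example.

```python
import os
import ssl


def entropy_report(device: str = "/dev/random") -> dict:
    """Report whether the random device exists and whether OpenSSL's PRNG is seeded."""
    return {
        "device_present": os.path.exists(device),
        # ssl.RAND_status() is true once OpenSSL has enough seed data.
        "prng_seeded": bool(ssl.RAND_status()),
    }


if __name__ == "__main__":
    print(entropy_report())
```

On a system without /dev/random and without an EGD/PRNGD socket wired into the client, the first value would be expected to be False.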