Re: cookies & proxy server

2001-05-14 Thread Brian Beuning

Getting closer.  The proxy issue is fixed.  Thanks!

Now there is a different problem.  If a site sets cookies like this
foo.bar.com/pub/linux
.bar.com/

When I try 'wget foo.bar.com' (with no path), wget tries to match against
only the first cookie (and fails to match). It should match the second.
If instead I try 'wget foo.bar.com/pub/linux', it should match both cookies.
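
For what it's worth, here is a minimal sketch of the matching rule I would
expect (illustrative only -- this is not wget's actual cookies.c, and the
function names are made up): the cookie domain has to tail-match the request
host, and the cookie path has to be a prefix of the request path.

/* Illustrative sketch, not wget's cookies.c.  A cookie stored for
   domain D and path P should match a request for host H and path R
   when D tail-matches H and P is a prefix of R. */
#include <stdio.h>
#include <string.h>
#include <strings.h>

static int
domain_matches (const char *cookie_domain, const char *host)
{
  size_t dl = strlen (cookie_domain), hl = strlen (host);
  if (dl > hl)
    return 0;
  /* tail match: "foo.bar.com" ends with ".bar.com", or equals it */
  return strcasecmp (host + (hl - dl), cookie_domain) == 0;
}

static int
path_matches (const char *cookie_path, const char *req_path)
{
  /* "/" matches everything; a longer cookie path must be a prefix */
  return strncmp (cookie_path, req_path, strlen (cookie_path)) == 0;
}

int
main (void)
{
  /* 'wget foo.bar.com' -> host "foo.bar.com", path "/" */
  printf ("%d\n", domain_matches ("foo.bar.com", "foo.bar.com")
                  && path_matches ("/pub/linux", "/"));          /* 0: no match */
  printf ("%d\n", domain_matches (".bar.com", "foo.bar.com")
                  && path_matches ("/", "/"));                   /* 1: match */
  /* 'wget foo.bar.com/pub/linux' -> should match both cookies */
  printf ("%d\n", domain_matches ("foo.bar.com", "foo.bar.com")
                  && path_matches ("/pub/linux", "/pub/linux")); /* 1: match */
  return 0;
}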

Brian Beuning

Hrvoje Niksic wrote:

> Brian Beuning <[EMAIL PROTECTED]> writes:
>
> > The new cookies code is working great.
>
> Thanks a lot for testing it!
>
> Did I say that more people should test the cookies code?  Well, I'll
> say it now: the cookies code really needs testing!
>
> > I tried it today with a proxy server and it has a small problem.  It
> > uses the proxy server name when looking for matches in the cookies.
> > Of course it needs to match against the real URL.
>
> Yup.  Does this patch fix the problem?
>
> 2001-05-14  Hrvoje Niksic  <[EMAIL PROTECTED]>
>
> * http.c (gethttp): Use real URL data for cookies, not the cookie
> stuff.
>
> Index: src/http.c
> ===
> RCS file: /pack/anoncvs/wget/src/http.c,v
> retrieving revision 1.59
> diff -u -r1.59 http.c
> --- src/http.c  2001/05/11 12:37:37 1.59
> +++ src/http.c  2001/05/14 09:27:47
> @@ -596,10 +596,6 @@
>keep_alive = 0;
>http_keep_alive_1 = http_keep_alive_2 = 0;
>
> -  if (opt.cookies)
> -cookies = build_cookies_request (u->host, u->port, u->path,
> -u->proto == URLHTTPS);
> -
>/* Initialize certain elements of struct http_stat.  */
>hs->len = 0L;
>hs->contlen = -1;
> @@ -806,6 +802,10 @@
>  request_keep_alive = "Connection: Keep-Alive\r\n";
>else
>  request_keep_alive = NULL;
> +
> +  if (opt.cookies)
> +cookies = build_cookies_request (ou->host, ou->port, ou->path,
> +ou->proto == URLHTTPS);
>
>/* Allocate the memory for the request.  */
>request = (char *)alloca (strlen (command) + strlen (path)




RE: win binary

2001-05-14 Thread Herold Heiko

Build :(
The source is fine.
Heiko

-- 
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907087
-- ITALY



>-Original Message-
>From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
>Sent: Monday, May 14, 2001 5:19 PM
>To: Wget List
>Subject: Re: win binary
>
>
>Herold Heiko <[EMAIL PROTECTED]> writes:
>
>> Most recent binaries compiled by myself for windows didn't have ssl
>> enabled correctly due to a simple bug.
>
>A source bug or a build bug?
>



Re: New and improved Makefile.watcom

2001-05-14 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

> Here's another edition of Makefile.watcom
> (See attached file: Makefile.watcom)

Committed; thanks.



Re: win binary

2001-05-14 Thread Hrvoje Niksic

Herold Heiko <[EMAIL PROTECTED]> writes:

> Most recent binaries compiled by myself for windows didn't have ssl
> enabled correctly due to a simple bug.

A source bug or a build bug?



Re: New and improved Makefile.watcom

2001-05-14 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

> > One thing I don't understand: why do you optimize for size?  Doesn't
> > it almost always make sense to optimize for speed instead?>
> 
> Because I like small and sleek executables :-)

No comment.

> Are there any processor-intensive bits in wget ? Most of the time
> it'll wait for the "Internet" anyway.

That's what I thought.  But that's patently false for downloads of
large sites -- for each page, Wget has to extract all the links, and
do a number of operations for each link.  When wget was using lists
instead of hash tables (this includes 1.6), performance would become a
problem after some time.
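
Very roughly, the difference looks like this (a toy illustration only, not
the actual hash table code): keeping the set of known URLs in a linked list
means every new link triggers a scan of the whole list, so the total work
grows quadratically with the number of documents, while a hash table keeps
each lookup close to constant time.

/* Toy illustration only, not wget's code.  Checking whether a URL has
   already been seen: a linked list scans every node, a hash table
   looks at a single short bucket chain. */
#include <string.h>

struct node { const char *url; struct node *next; };

/* O(n): every link on every page rescans the whole list. */
static int
list_contains (struct node *head, const char *url)
{
  for (; head; head = head->next)
    if (!strcmp (head->url, url))
      return 1;
  return 0;
}

#define NBUCKETS 1024

static unsigned
hash_string (const char *s)
{
  unsigned h = 0;
  while (*s)
    h = h * 31 + (unsigned char) *s++;
  return h % NBUCKETS;
}

/* Roughly O(1): hash the URL, then scan only that one bucket. */
static int
table_contains (struct node *buckets[NBUCKETS], const char *url)
{
  return list_contains (buckets[hash_string (url)], url);
}

int
main (void)
{
  struct node a = { "http://host/one.html", 0 };
  struct node b = { "http://host/two.html", &a };
  struct node *buckets[NBUCKETS] = { 0 };
  buckets[hash_string (a.url)] = &a;   /* ignoring collisions for brevity */
  return !(list_contains (&b, a.url) && table_contains (buckets, a.url));
}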

> BTW, compiling with DEBUG_MALLOC reveals three memory leaks :
> 0x13830432: mswindows.c:72<-   *exec_name = xstrdup (*exec_name); in
> windows_main_junk
> 0x13830496: mswindows.c:168   <-   wspathsave = (char*) xmalloc (strlen
> (buffer) + 1); in ws_mypath

Can't say what these two should be.

> 0x13830848: utils.c:1525  <-   (struct wget_timer *)xmalloc (sizeof
> (struct wget_timer));

I think this is a timer allocated in show_progress.  It's allocated
only once so it's not really a leak.
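
Schematically, something like this (not the actual wget code): the block is
allocated on the first call, kept for the rest of the run and reused, so a
malloc checker reports one unfreed block at exit even though memory use
doesn't grow.

/* Schematic only.  One block, allocated on first use and kept for the
   lifetime of the program: a malloc checker flags it as "not freed at
   exit", but it is not a growing leak. */
#include <stdlib.h>

struct wget_timer { long start_ms; };

static struct wget_timer *
get_progress_timer (void)
{
  static struct wget_timer *timer;      /* allocated once, below */
  if (!timer)
    timer = malloc (sizeof (struct wget_timer));
  return timer;                         /* reused on every later call */
}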



Re: Can't get the -nc option working

2001-05-14 Thread Hrvoje Niksic

"Tim D. Scheibe" <[EMAIL PROTECTED]> writes:

> I have a problem with using the -nc option. No matter what I do, I always
> get the message: "Can't timestamp and not clobber old files at the same
> time."
[...]
> I'm NOT using a '.wgetrc-file'.

Are you absolutely sure about that?  Are you sure that there is no
site-wide wgetrc where the "helpful" admin placed a timestamping
instruction?



win binary

2001-05-14 Thread Herold Heiko

Most recent binaries compiled by myself for windows didn't have ssl
enabled correctly due to a simple bug.
Since I didn't need the ssl feature lately (and apparently nobody else
used it lately either -- thanks, Muller Zsolt), nobody noticed the problem.
The new binary at  should be ok.

Heiko

-- 
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907087
-- ITALY




RE: New and improved Makefile.watcom

2001-05-14 Thread csaba . raduly


   
   
To:      Herold Heiko <[EMAIL PROTECTED]>, Wget List <[EMAIL PROTECTED]>
Date:    14/05/01 12:05
Subject: RE: New and improved Makefile.watcom
> >-Original Message-
> >From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
> >Sent: Monday, May 14, 2001 11:23 AM
> >To: Wget List
> >Subject: Re: New and improved Makefile.watcom
> >
> >
> >[EMAIL PROTECTED] writes:
> >
> >> This is a rewrite of Makefile.watcom
> >
> >Thanks; I've put it in the repository.
> >
> >> # Copy this file to the ..\src directory (maybe rename to
> >Makefile). Also:
> >> # copy config.h.ms ..\src\config.h
> >
> >Maybe we should provide a "win-build" script (or something) that does
> >this automatically?
> >

How about this ?

config.h : ..\windows\config.h.ms
 copy $[@ $^@

(this would be "copy $< $@" for GNU make)

Yup, it works (for me ! :-)

>
> Isn't this what configure.bat is for ?

In theory, but...

> Default to VC (or use VC if --msvc is given), otherwise if env var
> BORPATH is present (or --borland is given) use borland, otherwise error.
>

I see no Watcom here :-) configure.bat doesn't know about Watcom C

Hrvoje also wrote:
> > #disabled for faster compiler
> > LFLAGS=sys nt op st=32767 op vers=1.7 op map op q op de 'GNU wget
1.7dev' de all
> > CFLAGS=/zp4 /d1 /w4 /fpd /5s /fp5 /bm /mf /os /bt=nt [snip]
> > # /zp4= pack structure members with this alignment
> > # /d1 = line number debug info
> > # /w4 = warning level
> > # /fpd= ??? no such switch !
> > # /5s = Pentium stack-based calling
> > # /fp5= Pentium floating point
> > # /bm = build multi-threaded
> > # /mf = flat memory model
> > # /os = optimize for size
> ^^^
> > # /bt = "build target" (nt)
>
> One thing I don't understand: why do you optimize for size?  Doesn't
> it almost always make sense to optimize for speed instead?>

Because I like small and sleek executables :-)
Are there any processor-intensive bits in wget ? Most of the time it'll
wait for the "Internet" anyway.


BTW, compiling with DEBUG_MALLOC reveals three memory leaks :
0x13830432: mswindows.c:72<-   *exec_name = xstrdup (*exec_name); in
windows_main_junk
0x13830496: mswindows.c:168   <-   wspathsave = (char*) xmalloc (strlen
(buffer) + 1); in ws_mypath
0x13830848: utils.c:1525  <-   (struct wget_timer *)xmalloc (sizeof
(struct wget_timer));

Here's another edition of Makefile.watcom
(See attached file: Makefile.watcom)
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933

[attachment: Makefile.watcom]


-H with --no-parent -> URL rewrite bug?

2001-05-14 Thread Andre Pang

hello,

i tried using --no-parent with -H, which seemed to work quite
well.  it would span hosts (and acknowledge a -D option to
restrict the domain) while never traversing upward from the
original directory name.

however, when wget rewrote the URLs, this happened:

http://www.heebie.net/royo/../../www.fantaysia.com/royoimg/secret4t.jpg

instead of:



commandline was thus:

wget -m --no-parent -E -k -K -Dfantaysia.com,heebie.net -H -nv 
http://www.heebie.net/royo/royo1.htm
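
for what it's worth, here's a rough sketch (illustrative only, not wget's
url.c) of how the '..' segments in that rewritten link collapse when a
browser resolves it -- the extra '..' can't climb above the server root, so
the link ends up pointing at a bogus path on www.heebie.net instead of
crossing over to www.fantaysia.com:

/* illustrative only, not wget's url.c: collapse "." and ".." segments
   in an absolute URL path the way RFC-style resolution does.  ".."
   segments that would climb above the root are simply dropped. */
#include <stdio.h>
#include <string.h>

static void
collapse_dots (const char *path, char *out, size_t outsize)
{
  char buf[1024];
  const char *segs[64];
  int n = 0, i;
  char *tok;

  strncpy (buf, path, sizeof (buf) - 1);
  buf[sizeof (buf) - 1] = '\0';

  for (tok = strtok (buf, "/"); tok; tok = strtok (NULL, "/"))
    {
      if (!strcmp (tok, "."))
        continue;
      else if (!strcmp (tok, ".."))
        {
          if (n > 0)
            n--;                        /* ".." above the root is ignored */
        }
      else if (n < 64)
        segs[n++] = tok;
    }

  out[0] = '\0';
  for (i = 0; i < n; i++)
    {
      strncat (out, "/", outsize - strlen (out) - 1);
      strncat (out, segs[i], outsize - strlen (out) - 1);
    }
  if (n == 0)
    strncat (out, "/", outsize - 1);
}

int
main (void)
{
  char result[1024];
  collapse_dots ("/royo/../../www.fantaysia.com/royoimg/secret4t.jpg",
                 result, sizeof (result));
  /* prints http://www.heebie.net/www.fantaysia.com/royoimg/secret4t.jpg,
     which is not where the image actually lives */
  printf ("http://www.heebie.net%s\n", result);
  return 0;
}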

i had a quick look at the sources but figured it would probably
take me a long time to find out where the bug is, since my
programming competence level isn't very high :)

note: wget proceeded to the /royoimg directory even though the
original directory is /royo and --no-parent is specified because
the jpg is in an  tag, which i presume is a page-requisite.

if you need any other information, please let me know!  (debug
output is available if you'd like it.)


-- 
#ozone/algorithm <[EMAIL PROTECTED]>  - trust.in.love.to.save



Can't get the -nc option working

2001-05-14 Thread Tim D. Scheibe

Hi there!

I have a problem with using the -nc option. No matter what I do, I always
get the message: "Can't timestamp and not clobber old files at the same time."
As you can see below, even the example from the GNU manual doesn't work:

$ wget -nc -r http://www.gnu.ai.mit.edu/
Can't timestamp and not clobber old files at the same time.
Usage: wget [OPTION]... [URL]...


Version info :
$ wget -V
GNU Wget 1.6

Copyright (C) 1995, 1996, 1997, 1998, 2000 Free Software Foundation, Inc.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Originally written by Hrvoje Niksic <[EMAIL PROTECTED]>.


I'm NOT using a '.wgetrc-file'.
I also tried version 1.5.3, with the same error message.

With kind regards,

Tim Scheibe





Re: New and improved Makefile.watcom

2001-05-14 Thread Hrvoje Niksic

Herold Heiko <[EMAIL PROTECTED]> writes:

> >Maybe we should provide a "win-build" script (or something) that does
> >this automatically?
> 
> Isn't this what configure.bat is for ?

I guess.  I know next to nothing about the Windows stuff; I assumed
`configure.bat' handles only one build environment.

> Default to VC (or use VC if --msvc is given), otherwise if env var
> BORPATH is present (or --borland is given) use borland, otherwise
> error.

And don't forget Watcom...



RE: New and improved Makefile.watcom

2001-05-14 Thread Herold Heiko

>-Original Message-
>From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
>Sent: Monday, May 14, 2001 11:23 AM
>To: Wget List
>Subject: Re: New and improved Makefile.watcom
>
>
>[EMAIL PROTECTED] writes:
>
>> This is a rewrite of Makefile.watcom
>
>Thanks; I've put it in the repository.
>
>> # Copy this file to the ..\src directory (maybe rename to 
>Makefile). Also:
>> # copy config.h.ms ..\src\config.h
>
>Maybe we should provide a "win-build" script (or something) that does
>this automatically?
>

Isn't this what configure.bat is for ?
Default to VC (or use VC if --msvc is given), otherwise if env var
BORPATH is present (or --borland is given) use borland, otherwise error.

However configure.bat in "borland mode" does copy config.h.bor, not
config.h.ms .

Or did I misunderstand the question ?

Heiko

-- 
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907087
-- ITALY



Re: ftp_bug

2001-05-14 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

> Is it correct?
> Command: wget -m -o log -d ftp://user:passwd@host/
> 
> Problem: the directory's full path is /home/user/utils, but the program
> tries to read the directory utils from the path /utils and gets "No such
> file or directory."

This looks like you're using Wget 1.5.3.  If so, upgrade to 1.6 where
this problem has been partially solved.  A full solution is in the
current CVS sources.



Re: cookies & proxy server

2001-05-14 Thread Hrvoje Niksic

Brian Beuning <[EMAIL PROTECTED]> writes:

> The new cookies code is working great.

Thanks a lot for testing it!

Did I say that more people should test the cookies code?  Well, I'll
say it now: the cookies code really needs testing!

> I tried it today with a proxy server and it has a small problem.  It
> uses the proxy server name when looking for matches in the cookies.
> Of course it needs to match against the real URL.

Yup.  Does this patch fix the problem?

2001-05-14  Hrvoje Niksic  <[EMAIL PROTECTED]>

* http.c (gethttp): Use real URL data for cookies, not the cookie
stuff.

Index: src/http.c
===
RCS file: /pack/anoncvs/wget/src/http.c,v
retrieving revision 1.59
diff -u -r1.59 http.c
--- src/http.c  2001/05/11 12:37:37 1.59
+++ src/http.c  2001/05/14 09:27:47
@@ -596,10 +596,6 @@
   keep_alive = 0;
   http_keep_alive_1 = http_keep_alive_2 = 0;
 
-  if (opt.cookies)
-cookies = build_cookies_request (u->host, u->port, u->path,
-u->proto == URLHTTPS);
-
   /* Initialize certain elements of struct http_stat.  */
   hs->len = 0L;
   hs->contlen = -1;
@@ -806,6 +802,10 @@
 request_keep_alive = "Connection: Keep-Alive\r\n";
   else
 request_keep_alive = NULL;
+
+  if (opt.cookies)
+cookies = build_cookies_request (ou->host, ou->port, ou->path,
+ou->proto == URLHTTPS);
 
   /* Allocate the memory for the request.  */
   request = (char *)alloca (strlen (command) + strlen (path)



Re: New and improved Makefile.watcom

2001-05-14 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

> This is a rewrite of Makefile.watcom

Thanks; I've put it in the repository.

> # Copy this file to the ..\src directory (maybe rename to Makefile). Also:
> # copy config.h.ms ..\src\config.h

Maybe we should provide a "win-build" script (or something) that does
this automatically?

> #disabled for faster compiler
> LFLAGS=sys nt op st=32767 op vers=1.7 op map op q op de 'GNU wget 1.7dev' de all
> CFLAGS=/zp4 /d1 /w4 /fpd /5s /fp5 /bm /mf /os /bt=nt /DWINDOWS /DHAVE_CONFIG_H 
>/I=$(%WATCOM)\h;$(%WATCOM)\h\nt;.
> # /zp4= pack structure members with this alignment
> # /d1 = line number debug info
> # /w4 = warning level
> # /fpd= ??? no such switch !
> # /5s = Pentium stack-based calling
> # /fp5= Pentium floating point
> # /bm = build multi-threaded
> # /mf = flat memory model
> # /os = optimize for size
^^^
> # /bt = "build target" (nt)

One thing I don't understand: why do you optimize for size?  Doesn't
it almost always make sense to optimize for speed instead?



Re: ftp_bug

2001-05-14 Thread Jan Prikryl

Quoting [EMAIL PROTECTED] ([EMAIL PROTECTED]):

> Is it correct?
> Command: wget -m -o log -d ftp://user:passwd@host/

Well. I guess you're using either 1.5.3 or 1.6. This bug has been
removed in the current CVS version.

> P.S. Do you have a FAQ?

No, but have a look at the wget homepage at http://sunsite.dk/wget/ - you
will find links there to searchable archives of this mailing list.

-- jan

---+
  Jan Prikryl  icq | vr|vis center for virtual reality and
  <[EMAIL PROTECTED]>  83242638 | visualisation http://www.vrvis.at
---+



ftp_bug

2001-05-14 Thread exch

Hello

Is it correct?
Command: wget -m -o log -d ftp://user:passwd@host/

Problem: the directory's full path is /home/user/utils, but the program
tries to read the directory utils from the path /utils and gets "No such
file or directory."

log: (host, user,passwd changed by me)
Using `host/utils/.listing' as listing tmp file.
--16:11:09--  ftp://user:passwd@host/%2Futils/
   => `host//utils.listing'
==> CWD /utils ...
--> CWD /utils

550 /utils: No such file or directory.

No such directory `/utils'.

Closing fd 4
Checking for host.
host was already used, by that name.
--16:11:09--  ftp://user:passwd@host/%2Futils/
   => `host/utils/index.html'
Connecting to nkuzora:21... Created fd 4.
connected!
Logging in as user ... 220 host.domain FTP server (wu-2.4.2-academ>

--> USER user

331 Password required for user.
--> PASS passwd

230 User user logged in.
Logged in!
==> TYPE I ...
--> TYPE I

200 Type set to I.
done.  ==> CWD /utils ...
--> CWD /utils

550 /utils: No such file or directory.

No such directory `/utils'.

Closing fd 4


P.S. Do you have a FAQ?



Re: Recursive output fails after first file when writing to stdout

2001-05-14 Thread Jan Prikryl

Quoting Greg Robinson ([EMAIL PROTECTED]):

> I'm having a problem with wget.  I need to have the program (while
> running recursively) output to stdout so that I can pipe the output to a
> separate filter process.  Unfortunately, wget will only download the
> first file from any site I point it at when stdout is specified as the
> file to write to.

The difficulty here is the recursive download: when downloading
recursively, wget requires physical copies of the files to exist in
order to extract URLs from those files. At the moment there is no way
to do this on the fly while downloading, sorry.

> Does that mean it's trying to write to the non-existent www.google.com
> directory on my drive, or does it mean that there's no index.html file
> on any server I want to suck from?

The message probably comes from the URL parser; it means that no
local copy of index.html from www.google.com exists on your drive (and
therefore no URLs can be extracted and the recursive download fails).

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 <[EMAIL PROTECTED]> | http://www.vrvis.at
+--