no-clobber add more suffix

2003-10-06 Thread Sergey Vasilevsky
`--no-clobber' is a very useful option, but I retrieve documents with
suffixes other than .html/.htm.

Please add an option that, like -A/-R, defines accept/reject rules
for -nc.
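
For comparison, a sketch of how the existing -A/-R lists work (the URL
here is hypothetical; -A and -R take comma-separated suffix lists and
control which files get downloaded at all):

    # accept only .jpg and .pdf files during a recursive download
    wget -r -A jpg,pdf http://www.example.com/
    # reject .exe and .zip files
    wget -r -R exe,zip http://www.example.com/

The request is for an analogous accept/reject list controlling which
files -nc applies to.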



Re: subscribe wget

2003-10-06 Thread Hrvoje Niksic
To subscribe to this list, please send mail to
[EMAIL PROTECTED].


Re: can wget disable HTTP Location Forward ?

2003-10-06 Thread Hrvoje Niksic
There is currently no way to disable following redirects.  A patch to
do so has been submitted recently, but I didn't see a good reason why
one would need it, so I didn't add the option.  Your mail is a good
argument, but I don't know how prevalent that behavior is.

What is it with servers that can't be bothered to return 404?  Are
there lots of them nowadays?  Is a new default setting of Apache or
IIS to blame, or are people intentionally screwing up their
configurations?


Re: no-clobber add more suffix

2003-10-06 Thread Jens Rösner
Hi Sergey!

-nc does not apply only to .htm(l) files.
All files are considered,
at least in all wget versions I know of.

I cannot comment on your suggestion to restrict -nc to a
user-specified list of file types.
I personally don't need it, but I could imagine certain situations
where this could indeed be helpful.
Hopefully someone with more knowledge than me
can elaborate a bit more on this :)

CU
Jens



> `--no-clobber' is a very useful option, but I retrieve documents with
> suffixes other than .html/.htm.
>
> Please add an option that, like -A/-R, defines accept/reject rules
> for -nc.
 




RE: Bug in Windows binary?

2003-10-06 Thread Herold Heiko
> From: Gisle Vanem [mailto:[EMAIL PROTECTED]
>
> Jens Rösner [EMAIL PROTECTED] said:
>
> ...
>
> I assume Heiko didn't notice it because he doesn't have that function
> in his kernel32.dll. Heiko and Hrvoje, will you correct this ASAP?
>
> --gv

Probably.
Currently I'm compiling and testing on NT 4.0 only.
Besides that, I'm VERY tight on time at the moment, so testing usually
means: does it run? Does it download one sample http and one https
site? Yes? Put it up for testing!

Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax


New win binary (was: RE: Compilation breakage in html-parse.c)

2003-10-06 Thread Herold Heiko
> From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
>
> This might be one cause for compilation breakage in html-parse.c.
> It's a Gcc-ism/c99-ism/c++-ism, depending on how you look at it, fixed
> by this patch:
>
> 2003-10-03  Hrvoje Niksic  [EMAIL PROTECTED]
>
>   * html-parse.c (convert_and_copy): Move variable declarations
>     before statements.

Either this or another patch resolved it; I didn't have time to track
it down for good. I didn't even read the ChangeLog, just did a quick
export, make, and minimal test, and put it up on the site.
New MSVC binary from current CVS at http://xoomer.virgilio.it/hherold
(yes, the ISP decided to change the URL; old URLs still work).

Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax


Web page source using wget?

2003-10-06 Thread Suhas Tembe
Hello Everyone,

I am new to this wget utility, so pardon my ignorance. Here is a brief
explanation of what I am currently doing:

1). I go to our customer's website every day and log in using a User
Name and Password.
2). I click on 3 links before I get to the page I want.
3). I right-click on the page and choose "view source". It opens up in
Notepad.
4). I save the source to a file and subsequently perform various tasks
on that file.

As you can see, it is a manual process. What I would like to do is
automate this process of obtaining the source of a page using wget. Is
this possible? Maybe you can give me some suggestions.

Thanks in advance.
Suhas


Problems and suggestions

2003-10-06 Thread Bloodflowers [Tuth 10]
I'm a big fan of wget. I've been using it for quite a while now, and am
now testing 1.9beta3 on win2k.

First of all, I'd like to suggest a couple of things:
# it should be possible to tell wget to ignore a couple of errors:
   FTPLOGINC     // FTPs often give out this error when they're full.
                 // I want it to keep trying
   CONREFUSED    // the FTP may be temporarily down
   FTPLOGREFUSED // the FTP may be full
   FTPSRVERR     // freakish errors happen every once in a while

# if I tell it to download files from a list and a download fails, it
should still obey the waitretry timeout, as was pointed out by someone
else earlier (I don't have the time right now to go look for the post)
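
For reference, a sketch of the retry knobs that already exist (assuming
the URL list is in a file named list.txt; --tries=0 means keep trying
forever):

   # list.txt is a hypothetical file of URLs, one per line
   wget -i list.txt --tries=0 --waitretry=10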

and now for the problem:

Apparently, when wget has a problem during a transfer on win32 and
dies, it then starts saying:
failed: No such file or directory

This has happened to me on HTTP transfers; I have to check to see what
the real error is (I'm guessing CONREFUSED), but it shouldn't be giving
this error anyway.

thanks for everything ;)




Re: Web page source using wget?

2003-10-06 Thread Tony Lewis
Suhas Tembe wrote:

> 1). I go to our customer's website every day and log in using a User
> Name and Password.
[snip]
> 4). I save the source to a file and subsequently perform various tasks
> on that file.
>
> What I would like to do is automate this process of obtaining the
> source of a page using wget. Is this possible?

That depends on how you enter your user name and password. If it's via
an HTTP user ID and password, that's pretty easy:

wget http://www.custsite.com/some/page.html --http-user=USER --http-passwd=PASS

If you supply your user ID and password via a web form, it will be tricky
(if not impossible) because wget doesn't POST forms (unless someone added
that option while I wasn't looking. :-)

Tony



Re: Web page source using wget?

2003-10-06 Thread Hrvoje Niksic
Tony Lewis [EMAIL PROTECTED] writes:

> wget http://www.custsite.com/some/page.html --http-user=USER --http-passwd=PASS
>
> If you supply your user ID and password via a web form, it will be
> tricky (if not impossible) because wget doesn't POST forms (unless
> someone added that option while I wasn't looking. :-)

Wget 1.9 can send POST data.
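
For instance, a minimal sketch of submitting a login form with it (the
field names user and password are hypothetical; they must match the
actual form, as must the URL):

    # user/password are placeholder form field names
    wget --post-data='user=USER&password=PASS' http://www.custsite.com/login.html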

But there's a simpler way to handle web sites that use cookies for
authorization: make Wget use the site's own cookie.  Export cookies as
explained in the manual, and specify:

wget --load-cookies=COOKIE-FILE http://...

Here is an excerpt from the manual section that explains how to export
cookies.

`--load-cookies FILE'
 Load cookies from FILE before the first HTTP retrieval.  FILE is a
 textual file in the format originally used by Netscape's
 `cookies.txt' file.

 You will typically use this option when mirroring sites that
 require that you be logged in to access some or all of their
 content.  The login process typically works by the web server
 issuing an HTTP cookie upon receiving and verifying your
 credentials.  The cookie is then resent by the browser when
 accessing that part of the site, and so proves your identity.

 Mirroring such a site requires Wget to send the same cookies your
 browser sends when communicating with the site.  This is achieved
 by `--load-cookies'--simply point Wget to the location of the
 `cookies.txt' file, and it will send the same cookies your browser
 would send in the same situation.  Different browsers keep textual
 cookie files in different locations:

Netscape 4.x.
  The cookies are in `~/.netscape/cookies.txt'.

Mozilla and Netscape 6.x.
  Mozilla's cookie file is also named `cookies.txt', located
  somewhere under `~/.mozilla', in the directory of your
  profile.  The full path usually ends up looking somewhat like
  `~/.mozilla/default/SOME-WEIRD-STRING/cookies.txt'.

Internet Explorer.
  You can produce a cookie file Wget can use by using the File
  menu, Import and Export, Export Cookies.  This has been
  tested with Internet Explorer 5; it is not guaranteed to work
  with earlier versions.

Other browsers.
  If you are using a different browser to create your cookies,
  `--load-cookies' will only work if you can locate or produce a
  cookie file in the Netscape format that Wget expects.

 If you cannot use `--load-cookies', there might still be an
 alternative.  If your browser supports a cookie manager, you can
 use it to view the cookies used when accessing the site you're
 mirroring.  Write down the name and value of the cookie, and
 manually instruct Wget to send those cookies, bypassing the
 official cookie support:

  wget --cookies=off --header "Cookie: NAME=VALUE"




Re: Web page source using wget?

2003-10-06 Thread Hrvoje Niksic
Suhas Tembe [EMAIL PROTECTED] writes:

> Hello Everyone,
>
> I am new to this wget utility, so pardon my ignorance. Here is a
> brief explanation of what I am currently doing:
>
> 1). I go to our customer's website every day and log in using a User
> Name and Password.
> 2). I click on 3 links before I get to the page I want.
> 3). I right-click on the page and choose "view source". It opens up
> in Notepad.
> 4). I save the source to a file and subsequently perform various
> tasks on that file.
>
> As you can see, it is a manual process. What I would like to do is
> automate this process of obtaining the source of a page using
> wget. Is this possible? Maybe you can give me some suggestions.

It's possible; in fact, it's what Wget does in its most basic form.
Disregarding authentication, the recipe would be:

1) Write down the URL.

2) Type `wget URL' and you get the source of the page in a file named
   SOMETHING.html, where SOMETHING is the file name that the URL ends
   with.

Of course, you will also have to supply the credentials for the page,
and Tony explained how to do that.
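
Putting the two together, a sketch (assuming the site uses HTTP
authentication rather than a login form; URL as in Tony's example):

    # works only for HTTP auth, not for form-based logins
    wget --http-user=USER --http-passwd=PASS http://www.custsite.com/some/page.html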



Wget 1.9-beta4 is available for testing

2003-10-06 Thread Hrvoje Niksic
Several bugs fixed since beta3, including a fatal one on Windows.
Includes a working Windows implementation of run_with_timeout.

Get it from:

http://fly.srk.fer.hr/~hniksic/wget/wget-1.9-beta4.tar.gz



-q and -S are incompatible

2003-10-06 Thread Dan Jacobson
-q and -S are incompatible; combining them should perhaps produce an
error, and this should be noted in the docs.

BTW, there seems to be no way to get the -S output without the
progress indicator; -nv and -q kill them both.
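
A quick illustration (any URL will do; -S requests the server's
response headers, but -q silences all output, headers included):

   # prints nothing: -q suppresses the headers -S asked for
   wget -S -q http://www.example.com/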

P.S. one shouldn't have to confirm each bug submission. Once should be enough.