wget suggestion

2007-05-03 Thread Robert La Ferla
There needs to be a way to tell wget to reject all domains EXCEPT
those that are accepted.  This should include subdomains.  I.e., I
just want to download www.mydomain.com and cache.mydomain.com.  I
thought the --domains option would work this way, but it doesn't.





Re: wget suggestion

2007-05-03 Thread Steven M. Schweda
From: Robert La Ferla

 There needs to be a way to tell wget to reject all domains EXCEPT those
 that are accepted. This should include subdomains. I.e., I just want to
 download www.mydomain.com and cache.mydomain.com. I thought the
 --domains option would work this way, but it doesn't.

   Can you provide any evidence that it doesn't?  Useful info might
include the wget version, your OS and version, the command you used, and
the results you got.  Adding -d to the command often reveals more than
not using it.  A real example is usually more useful than a fictional
example.

   If you can't exhibit the actual failure and explain how to reproduce
it, you might do better with a psychic hot-line, as most of us are not
skilled in remote viewing.



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


Re: wget suggestion

2007-05-03 Thread Robert La Ferla

GNU Wget 1.10.2

Capture this sub-site and not the rest of the site so that you can  
view it locally.  i.e.  just www.boston.com and cache.boston.com


http://www.boston.com/ae/food/gallery/cheap_eats/
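
A command along these lines (a sketch based on the documented -H and -D
options, not a verified recipe) would be one way to express this:

  wget -r -np -p -k -H -D www.boston.com,cache.boston.com \
       http://www.boston.com/ae/food/gallery/cheap_eats/

i.e. span hosts (-H) but accept only the listed domains (-D), don't ascend
to the parent (-np), and fetch page requisites (-p) with links converted
for local viewing (-k).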


On May 3, 2007, at 10:34 PM, Steven M. Schweda wrote:


From: Robert La Ferla

There needs to be a way to tell wget to reject all domains EXCEPT those
that are accepted. This should include subdomains. I.e., I just want to
download www.mydomain.com and cache.mydomain.com. I thought the
--domains option would work this way, but it doesn't.


   Can you provide any evidence that it doesn't?  Useful info might
include the wget version, your OS and version, the command you used, and
the results you got.  Adding -d to the command often reveals more than
not using it.  A real example is usually more useful than a fictional
example.

   If you can't exhibit the actual failure and explain how to reproduce
it, you might do better with a psychic hot-line, as most of us are not
skilled in remote viewing.

-- 
--


   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547




Re: wget suggestion

2007-05-03 Thread Steven M. Schweda
From: Robert La Ferla

 GNU Wget 1.10.2

   Ok.  Running on what?

 Capture this sub-site and not the rest of the site so that you can  
 view it locally.  i.e.  just www.boston.com and cache.boston.com
 
 http://www.boston.com/ae/food/gallery/cheap_eats/

   What is a sub-site?  Do you mean this page, or this page and all
the pages to which it links, excluding off-site pages, or what?

   I have a better idea.  Read this again:

 Can you provide any evidence that it doesn't?  Useful info might
  include the wget version, your OS and version, the command you used, and
  the results you got.  Adding -d to the command often reveals more than
  not using it.  A real example is usually more useful than a fictional
  example.

  If you can't exhibit the actual failure and explain how to reproduce
  it, you might do better with a psychic hot-line, as most of us are not
  skilled in remote viewing.

   You might also consider phrasing your demands as polite requests in
future.  Phrases like "I would like to learn how to", or "Can you
explain how to" can be useful for this.  Even better would be, "I tried
this command [insert command here], and I got this result [insert result
here], but I was expecting something more like this [insert expected
result here], and I definitely didn't expect this [insert undesirable
result here]."



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


feature suggestion - make option to use system date instead multiple version number

2007-04-27 Thread Alvydas
Hi

I use wget to download the same file at regular intervals
(a price list).  And I caught myself renaming files programmatically
(even in several projects) after they were downloaded by wget
and named file.1, file.2, etc.

Why I need this: one day I take the downloaded files from downlddir/
and move them into backupdir/.  Two days later I move downlddir/*
to backupdir/* again.  They would overwrite those already in
backupdir/, because wget restarted numbering when it found an
empty downlddir/ on day 1.

I guess it would be relatively easy and quite useful to add an option
to name files file.20070426142800, file.20070426142955, ... instead of just numbers.
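
In the meantime, a rough workaround (the URL and file name here are just
placeholders) is to pick the output name explicitly with -O and a date string:

  wget -O "pricelist.$(date +%Y%m%d%H%M%S)" http://supplier.example.com/pricelist.csv

which sidesteps the file.1, file.2 numbering entirely.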

Thank you; wget is an excellent tool anyway.

-- 
[EMAIL PROTECTED]
SDF Public Access UNIX System - http://sdf.lonestar.org


Re: feature suggestion - make option to use system date instead multiple version number

2007-04-27 Thread Steven M. Schweda
From: Alvydas

 I guess it would be relatively easy and quite useful to add an option
 to name files file.20070426142800, file.20070426142955, ... instead of just numbers.

   The relevant code is in src/utils.c: unique_name(), and should be
easy enough to change.  On a fast system, however, one-second resolution
(or multiple users) could lead to non-unique names, so it would be wise
to do something a little more like the existing code, but with a
date-time string added in.



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


Suggestion: wget -r should fetch images declared in CSS

2007-04-13 Thread J . F . Groff
Hi there,

wget -r is very useful for slurping and archiving entire web sites; I use it all the
time when working with other web designers remotely.  However, images declared
in CSS rules are ignored by the robot if they are not seen elsewhere in the
HTML pages.  Therefore, to mirror a site properly, such images must be fetched
manually with separate commands after poking into each individual CSS file --
a cumbersome and error-prone process.  As web designers increasingly rely on
background images in CSS to cleanly separate presentation from content, wget
ought to accommodate this feature.
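
Until then, one rough way to script the manual step (a sketch that assumes the
style sheets are already mirrored locally and reference their images by
absolute URLs; the paths are placeholders) is to pull the url(...) references
out of the CSS and feed them back to wget:

  grep -hoE "url\([^)]+\)" mirror/*.css \
    | sed -e "s/^url(//" -e "s/)$//" -e "s/[\"']//g" \
    | wget -x -nH -i -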

Happy Friday the 13th,

  JFG




Re: Feature suggestion for WGET

2007-04-11 Thread Steven M. Schweda
From: Daniel Clarke - JAS Worldwide

 I'd like to suggest a feature for WGET:  the ability to download a file
 and then delete it afterwards.

   Assuming that you'd like to delete it on the FTP server, and not
locally, the basics of this seem pretty easy to add:

   0. Documentation.

   1. Some kind of command-line option to control the new
source-delete feature (or whatever you decide to call it).

   2. src/ftp-basic.c: Add a new function, ftp_dele() (very nearly
ftp_retr() converted to send DELE instead of RETR, and to expect a
2xx success response instead of a 1xx).

   3. src/ftp.h: Add function prototype for ftp_dele().

   4. src/ftp.c: In getftp(), if ftp_retr() succeeds, and the new
source-delete option is enabled, call the new ftp_dele().

   5.  src/ftp.c: Add a bunch of new debug and error message code to
deal with ftp_dele() activity and failures.

   I've done steps 2, 3, and 4 in my experimental code, and the basic
functionality seems to be there.  If anyone is eager to do the whole job
and wants to see my rough code, just let me know.



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


Feature suggestion for WGET

2007-04-10 Thread Daniel Clarke - JAS Worldwide
Hi
I'd like to suggest a feature for WGET:  the ability to download a file
and then delete it afterwards. 
At the moment I use this tool as part of a batch script that downloads
all the waiting files from a remote server using wget, then quits wget,
and uses Windows's FTP.exe to delete all the files.  This is not ideal,
because if the process is interrupted -- e.g. the connection is severed
after downloading 99% of the files -- I have to re-download all the files
again.  It also runs the risk of deleting new files placed on the server
between getting the first and second file listing.
 
 
It would be helpful if Wget can delete each file immediately after
downloading it.
 
Thanks
Daniel
 



Daniel Clarke MSci ARCS 
Senior Application Developer
JAS Worldwide Management, LLC
Global Headquarters, Atlanta, USA 
Cell/mobile: +1 4045181127 
Office: +1 4042558230 ext 3023  *** note new office number 
[EMAIL PROTECTED]
Skype: jas-jasww-danielclarke
 


wget - feature suggestion

2007-03-12 Thread Rudolf Martin
First thanks for this great tool.

Perhaps the following features would be helpful (for me they are):

--strict-level   download data only from the given depth level

-include should override -np.  At the moment wget doesn't accept the include directory
when it is at a higher level and -np is set.

- Filtering options for downloaded files.  A minimum and maximum file size would be
very helpful.

Please cc to my address.
-- 


wget - feature suggestion - md5/sha1 signatures on downloaded files

2007-03-04 Thread Alvin Austin

Hello,

How about adding an option to display the md5 and/or sha1 signatures of 
files that wget downloads?


These signatures can be calculated in real-time as each file is 
downloaded, and so would not require much extra I/O or cpu, but having 
the signatures shown right away would help people to verify files easily 
and quickly.
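
As a stop-gap, a digest can already be computed on the fly in a pipeline
(URL and file name below are placeholders):

  wget -q -O - http://downloads.example.com/file.iso | tee file.iso | md5sum

tee writes the file to disk while md5sum prints the checksum of the same stream.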


Keep up the good work on wget!

Alvin


Suggestion

2007-01-24 Thread Nejc Škoberne

Hello,

as far as I can see, wget always prints the final data transfer speed
in autodetected units.  I think it would be useful (and I guess also
simple) to add an option which would tell wget to always print the
speed in bytes per second (for example), so that it is always nicely
parsable no matter what the data transfer speed range is.  Otherwise it
is necessary to also parse the K and the M characters and do some
conditionals ... it's just not nice.

Thanks,
Nejc


wget logging suggestion

2007-01-23 Thread Bruce Holm
I notice that the logging output that wget provides only includes a time
stamp, but not a date.  So when using -a to append output to a log
file, the time of each run is logged, but you have no idea on which dates it
ran.  Seems very odd.  This applies to both the -nv and -v options.
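
A simple workaround in the meantime (the log and URL names are placeholders)
is to stamp the log yourself before each run:

  echo "=== run started: $(date) ===" >> fetch.log
  wget -nv -a fetch.log http://www.example.com/report.html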

--
Bruce Holm
Lattice Semiconductor
[EMAIL PROTECTED]
--


Question / Suggestion for wget

2006-10-13 Thread Mitch Silverstein

If -O output file and -N are both specified, it seems like there should be some
mode where the tests for noclobber apply to the output file, not the filename
that exists on the remote machine.

So, if I run
# wget -N http://www.gnu.org/graphics/gnu-head-banner.png -O foo
and then
# wget -N http://www.gnu.org/graphics/gnu-head-banner.png -O foo
the second wget would not clobber and re-get the file.

Similarly, it seems odd that
# wget http://www.gnu.org/graphics/gnu-head-banner.png
and then
# wget -N http://www.gnu.org/graphics/gnu-head-banner.png -O foo
refuses to write the file named foo.

I realize there are already lots of options and the interactions can be pretty
confusing, but I think what I'm asking for would be of general usefulness.
Maybe I'm sadistic, but -NO amuses me as a way to turn on this behavior.
Perhaps just --no-clobber-output-document would be saner.

Thanks for your consideration,
Mitch





Re: Question / Suggestion for wget

2006-10-13 Thread Steven M. Schweda
From: Mitch Silverstein

 If -O output file and -N are both specified [...]

   When -O foo is specified, it's not a suggestion for a file name to
be used later if needed.  Instead, wget opens the output file (foo)
before it does anything else.  Thus, it's always a newly created file,
and hence tends to be newer than any file existing on any server
(whose date-time is set correctly).

   -O has its uses, but it makes no sense to combine it with -N. 
Remember, too, that wget allows more than one URL to be specified on a
command line, so multiple URLs may be associated with a single -O
output file.  What sense does -N make then?

   It might make some sense to create some positional option which would
allow a URL-specific output file, like, say, -OO, to be used so:

  wget http://a.b.c/d.e -OO not_dd.e http://g.h.i/j.k -OO not_j.k

but I don't know if the existing command-line parser could handle that. 
Alternatively, some other notation could be adopted, like, say,
file=URL, to be used so:

  wget not_dd.e=http://a.b.c/d.e not_j.k=http://g.h.i/j.k

   But that's not what -O does, and that's why you're (or your
expectations are) doomed.



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


Re: Feature suggestion: change detection for wget -c

2006-09-15 Thread Mauro Tortonesi

John McCabe-Dansted wrote:

Wget has no way of verifying that the local file is
  really a valid prefix of the remote file

Couldn't wget redownload the last 4 bytes (or so) of the file?

For a few bytes per file we could detect changes to almost all
compressed files and the majority of uncompressed files.


reliable detection of changes in the resource to be downloaded would be
a very interesting feature. but do you really think that checking the
last X (< 100) bytes would be enough to be reasonably sure the resource
was (not) modified? what about resources which are updated by appending
information, such as log files?


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Re: Feature suggestion: change detection for wget -c

2006-09-15 Thread John McCabe-Dansted

On 9/15/06, Mauro Tortonesi [EMAIL PROTECTED] wrote:

reliable detection of changes in the resource to be downloaded would be
a very interesting feature. but do you really think that checking the
last X (< 100) bytes would be enough to be reasonably sure the resource
was (not) modified? what about resources which are updated by appending
information, such as log files?


In terms of corruption prevention, wget -c is safe if the resources
are updated only by appending.

Two weaknesses I can think of are logs with fixed width repetitive
messages, e.g.

 12:05 Disks not mirrored
 12:10 Disks not mirrored

Then if we did a wget -c on the new log  file

 11:40 Disks not mirrored
 11:45 Disks not mirrored
 11:50 Disks not mirrored

we would get an invalid log file. However I imagine most log files
have at least a few variable length messages, so this technique would
work on a majority of log files (well over 50%).

Another weakness would be uncompressed database files...

However I suspect that comparing the last 4 bytes would catch 90% of
the real world snafus. I can't verify this without doing a survey of
wget users, but I can say that this would have caught 100% of my own
snafus.

There are two problems common enough to be mentioned in the man page:
proxies that append "transfer interrupted" to the end of failed
downloads, and inappropriate use of wget -c -r.  Checking the last 4
bytes would catch ~100% of cases of "transfer interrupted" being
appended. If wget acts recursively on a directory (wget -c -r) there
are many more opportunities for corruption to be detected.

--
John C. McCabe-Dansted
PhD Student
University of Western Australia


Feature suggestion: change detection for wget -c

2006-08-31 Thread John McCabe-Dansted

Wget has no way of verifying that the local file is
  really a valid prefix of the remote file

Couldn't wget redownload the last 4 bytes (or so) of the file?

For a few bytes per file we could detect changes to almost all
compressed files and the majority of uncompressed files.

--
John C. McCabe-Dansted
PhD Student
University of Western Australia


Re: Suggestion

2006-07-13 Thread Mauro Tortonesi

Kumar Varanasi wrote:

Hello there,

I am using WGET in my system to download http files. I see that there is no
option to download the file faster with multiple connections to the server.
Are you planning on a multi-threaded version of WGET to make downloads 
much faster?


no, there is no plan to implement parallel download at the moment.
however, please notice that it is highly unlikely that opening more than 
one connection with the same server will speed up the download process. 
parallel download makes sense only when more than one server is involved.


--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi  http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group http://www.ferrara.linux.it


Suggestion for modification to wget

2006-06-20 Thread Ulf Samuelsson



A lot of packages used to build Linux distributions for the embedded world
rely on "wget".  Typically they are based on a Makefile and a configuration
file included by the Makefile.

If a package is to be built, the package is downloaded to a directory
somewhere.  The Makefile will try to extract the package from this directory
first, and if it does not exist it will try to download the package from an
Internet site using wget.

It is a quite common problem that the file does not exist, because a new
version is available and the original file has been moved to another location.

wget would be significantly improved if it would try one or more alternate
locations for the package.

When wget reads the configuration file(s) ~/.wgetrc.xml, these could contain
information about alternate sites.

Ideally, these files should contain information about:
* a list of files containing associations between package names and
  where they can be found, either on ftp or http sites or on the local disk
* web sites from which new lists of associations can be downloaded.

In addition there should be the files containing the associations...

In order to avoid having to rewrite a lot of scripts, there should ideally
not be a switch in the wget "download" command which indicates this.
It is better if the configuration is retrieved, and the fact that it is there
is enough for wget to try.

Something like
  wget -O http://site.mydomain.com/retreive.php?file=package-version.tar.bz?try=4 | wget -
would of course work if someone set up such a site, but a local solution is better.

-
It would not be a bad idea if wget could report all files at the failing site
which are similar to the one requested.

  "wget --switch site/package-*.tar.gz"
should give a list of all packages that look like package-*.tar.gz/tar.bz2.
wget --spider somehow does this, but results in a lot of extra unnecessary info.
I just want to get the filenames.

I do not subscribe to the wget mailing list, so please reply to
"ulf at atmel dot com".

Best Regards,
Ulf Samuelsson


Suggestion/Question

2006-02-26 Thread Markus Raab

Hello,

yesterday I discovered wget, and I find it a very useful program.  I
am mirroring a big site, more precisely a forum.  Because it is a forum,
under each post you have the action "quote".  Because that forum has
20,000 posts, it would download everything with action=quote, so I rejected it
with -R "*action=quote*".  It works as documented in the manual: the files
aren't stored, but they are downloaded anyway and deleted right after
downloading.  Why can't wget skip these files (resp. URLs)?  That would make
downloading much faster, and the site admin would also be happy because
he has less traffic.  If wget has to fetch these files to
ensure that it doesn't miss anything while downloading, then a switch would
be useful to turn this behaviour off manually, if the user knows that he
doesn't need them or the documents below them.  In a forum, for example, it is
absolutely clear that you can skip analysing these files because they
won't link to any further documents.
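
For reference, the kind of invocation described above would look roughly like
this (the forum URL is a placeholder):

  wget -r -l0 -R "*action=quote*" http://forum.example.com/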

Thanks for your answer.

Markus


Suggestion for documentation

2006-02-17 Thread Frank McCown
It may be useful to add a paragraph to the manual which lets users know 
they can use the --debug option to see why certain URLs are not followed 
(rejected) by wget.  It would be especially useful to mention this in 
9.1 Robot Exclusion.  Something like this:


If you wish to see which URLs are blocked by the robots.txt while wget 
is crawling, use the --debug option.  You will see 2 lines that describe 
why the URL is being rejected:


Rejecting path /abc/bar.html because of rule `/abc'.
Not following http://foo.org/abc/bar.html because robots.txt forbids it.

Thanks,
Frank


A bug or suggestion

2005-10-14 Thread Conrado Miranda
I saw that the option "-k, --convert-links" makes the links point to the root
directory, not to the directory where you downloaded the pages.  For example:
if I download a page whose URL is www.pageexample.com, the pages I download go
in there.  But if I use that option, the links in the pages will point to the
root directory.  For example: if I download into /home and there is a link to
www.pageexample.com/test/index.htm, the link should point to
/home/www.pageexample.com/test/index.htm, but it points to
/www.pageexample.com/test/index.htm.
Well, I haven't tested it on Linux yet, but this problem occurs on cygwin (the
root directory becomes the partition where the program is installed, like C:).

Thank you for your attention,
Conrado

"The difficult gets done now... The impossible is just a matter of time.
Practice makes perfect, except in Russian roulette."
		 

Re: Suggestion for manpage clarification (re --progress)

2005-09-15 Thread Linda Walsh

Bonjourno! :-)

Sigh.  Was hoping that someone who wrote the original
man page format might already have expertise in that area.
It's just arcana (obscure knowledge, not necessarily hard
to learn or use, just not widely known).  Are you saying
that you wrote the original, but aren't familiar with the
even more arcane tbl input language for tables?: Not that
I or anyone would _expect_ knowledge of one or the other -- both
are somewhat obscure source formats (even though widely used
for manpages) these days...(*sigh*)...the sacrificing
we make in the, not unappreciated, grandfathering of the
old ways...:-)

- Linda

p.s. - I think this posting was meant to go to wget@sunsite.dk,
as such, am responding to it there...

Mauro Tortonesi wrote:


At 22:03 on Saturday, 27 August 2005, you wrote:

 Being a computer geek, I tend to like things organized
 in tables so options stand out. I took the time to rewrite
 the text for the --progress section of the manpage, as
 it was always difficult for me to find the values and
 differences for the different subtags. Looking at
 --progress=type, it doesn't quickly stand out what the
 possible values are nor that there are optional subtags. I tended more
 toward a BNF-type specification, but the central change is
 making the style types stand out. So even if you don't
 like the exact wording, I do think the table format presents
 the style options more clearly (i.e. they stand out quickly).
 Note: the output of man was used as a template for the changes, so
 this isn't directly applicable as a patch. I hope that isn't
 a block to the change, as it seems simple enough, but I don't
 currently have a subversion source tree set up, nor do I know
 manpage source syntax by memory (not a frequently used source
 language ;^) ):

  --progress=style
      Legal styles are bar[:force] and dot[:dotsize].

      The bar style is used by default. It draws an
      ASCII progress bar graphic (a.k.a. thermometer
      display) indicating the status of retrieval. If the
      output is not a TTY, the dot style will be used.

      To force bar usage when output is not a TTY, use
      the :force tag (i.e. --progress=bar:force).

      The dot style traces the retrieval by printing
      dots on the screen, each dot representing a fixed
      amount of downloaded data.

      An optional dotsize tag can be specified to change
      the amount of downloaded data per dot, grouping
      and line as follows (K = 1024 bytes; M = 1024 KBytes):

                    size per          dots per
      dotsize       dot      line     group    line
      -------       ---      ----     -----    ----
      default       1K       50K      10       50
      binary        8K       384K     16       48
      mega          64K      3M       8        48

      default is used if no dotsize tag is specified.

      Note that you can set per-user defaults using the
      progress command in .wgetrc. Note: specifying
      an option on the command line overrides .wgetrc
      settings.

i like this change, but there is a small problem. all the
documentation of wget is generated from the same texinfo sources, so
in order to support tex documentation formats we'll have to include
something like:

@ifnottex
your ascii graphics
@end ifnottex
@iftex
a real tex table
@end iftex

in wget.texi.

i've never used tex. anybody knows how to create tables in tex?

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                          http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it



[Suggestion] Suppress host header?

2005-08-31 Thread Yuan LIU
This is no bug.  But we encountered a situation where a server insists on
an accurate FQDN in the Host header, or no header at all.  When we have to
access the server from outside the NAT firewall using port forwarding, wget
cannot retrieve the file.


If there were an option to suppress the Host header altogether, this problem
could be solved.


Yuan Liu




Suggestion for manpage clarification (re --progress)

2005-08-27 Thread Linda Walsh

Being a computer geek, I tend to like things organized
in tables so options stand out.  I took the time to rewrite
the text for the --progress section of the manpage, as
it was always difficult for me to find the values and
differences for the different subtags.  Looking at
--progress=type, it doesn't quickly stand out what the
possible values are nor that there are optional subtags.  I tended more
toward a BNF-type specification, but the central change is
making the style types stand out.  So even if you don't
like the exact wording, I do think the table format presents
the style options more clearly (i.e. they stand out quickly).
Note: the output of man was used as a template for the changes, so
this isn't directly applicable as a patch.  I hope that isn't
a block to the change, as it seems simple enough, but I don't
currently have a subversion source tree set up, nor do I know
manpage source syntax by memory (not a frequently used source
language ;^) ):


  --progress=style
      Legal styles are bar[:force] and dot[:dotsize].

      The bar style is used by default.  It draws an
      ASCII progress bar graphic (a.k.a. thermometer
      display) indicating the status of retrieval.  If the
      output is not a TTY, the dot style will be used.

      To force bar usage when output is not a TTY, use
      the :force tag (i.e. --progress=bar:force).

      The dot style traces the retrieval by printing
      dots on the screen, each dot representing a fixed
      amount of downloaded data.

      An optional dotsize tag can be specified to change
      the amount of downloaded data per dot, grouping
      and line as follows (K = 1024 bytes; M = 1024 KBytes):

                    size per          dots per
      dotsize       dot      line     group    line
      -------       ---      ----     -----    ----
      default       1K       50K      10       50
      binary        8K       384K     16       48
      mega          64K      3M       8        48

      default is used if no dotsize tag is specified.

      Note that you can set per-user defaults using the
      progress command in .wgetrc.  Note: specifying
      an option on the command line overrides .wgetrc
      settings.


Re: A suggestion for configure.in

2005-08-26 Thread Stepan Kasal
Hello,

On Fri, Aug 26, 2005 at 02:07:16PM +0200, Hrvoje Niksic wrote:
 I've applied a slightly modified version of this patch, thanks.
 [...]  I used elif instead.

thank you, also for correcting my mistake.

Actually, I wasn't aware that shell elif is portable.
(I checked, and it appears in the autoconf source several times, so it really
is portable.)

Thank you.

Stepan


Re: A suggestion for configure.in

2005-08-24 Thread Hrvoje Niksic
Stepan Kasal [EMAIL PROTECTED] writes:

 1) I removed the AC_DEFINEs of symbols HAVE_GNUTLS, and HAVE_OPENSSL.
 AC_LIB_HAVE_LINKFLAGS defines HAVE_LIBGNUTLS and HAVE_LIBSSL, which
 can be used instead.  wget.h was fixed to expect these symbols.
 (You might think your defines are more aptly named, but they are used
 only once, in wget.h.)

You're right.  While I do prefer the old names, it's not that big a
deal and it doesn't make sense to needlessly duplicate the defines.

 2) Was it intentional that --without-ssl doesn't switch off OpenSSL
 autodetection?  I hope it wasn't.

Definitely not.

 3) Explicit --with-ssl=gnutls should fail if libgnutls is not found.
 If the user explicitly asked for it, we shouldn't silently ignore the
 request if we cannot fulfill it.
 And likewise with ./configure --with-ssl.

Agreed.

 (I know this is not common practice (yet), but I believe it's
 according to common sense.

Wget 1.10 did this.  The feature got lost when moving to
AC_LIB_HAVE_LINKFLAGS.


A suggestion for configure.in

2005-08-23 Thread Stepan Kasal
Hello,
   attached please find a patch with several suggestions.
(I'm not sending it to wget-patches, as I'm not sure all the suggestions
will be welcome.)

1) I removed the AC_DEFINEs of symbols HAVE_GNUTLS, and HAVE_OPENSSL.
AC_LIB_HAVE_LINKFLAGS defines HAVE_LIBGNUTLS and HAVE_LIBSSL, which
can be used instead.  wget.h was fixed to expect these symbols.
(You might think your defines are more aptly named, but they are used
only once, in wget.h.)

2) Was it intentional that --without-ssl doesn't switch off OpenSSL
autodetection?  I hope it wasn't.

3) Explicit --with-ssl=gnutls should fail if libgnutls is not found.
If the user explicitly asked for it, we shouldn't silently ignore the
request if we cannot fulfill it.
And likewise with ./configure --with-ssl.
(I know this is not common practice (yet), but I believe it's according
to common sense.  This is discussed in the CVS version of Autoconf manual,
you can get it from savannah.)

4) A typo in a comment (a spare dnl).

All these issues are resolved by the combined patch, attached to this mail.
Please cc the replies to me, I'm not subscribed.

Regards,
Stepan Kasal
Index: configure.in
===
--- configure.in(revision 2062)
+++ configure.in(working copy)
@@ -248,15 +248,15 @@
   if test x$LIBGNUTLS != x
   then
 AC_MSG_NOTICE([compiling in support for SSL via GnuTLS])
-AC_DEFINE([HAVE_GNUTLS], 1,
- [Define if support for the GnuTLS library is being compiled in.])
 SSL_OBJ='gnutls.o'
+  else
+AC_MSG_ERROR([--with-ssl=gnutls was given, but GNUTLS is not available.])
   fi
-else
+else if test x$with_ssl != xno; then
   dnl As of this writing (OpenSSL 0.9.6), the libcrypto shared library
   dnl doesn't record its dependency on libdl, so we need to make sure
   dnl -ldl ends up in LIBS on systems that have it.  Most OSes use
-  dnl dlopen(), but HP-UX uses dnl shl_load().
+  dnl dlopen(), but HP-UX uses shl_load().
   AC_CHECK_LIB(dl, dlopen, [], [
 AC_CHECK_LIB(dl, shl_load)
   ])
@@ -274,9 +274,10 @@
   if test x$LIBSSL != x
   then
 AC_MSG_NOTICE([compiling in support for SSL via OpenSSL])
-AC_DEFINE([HAVE_OPENSSL], 1,
- [Define if support for the OpenSSL library is being compiled in.])
 SSL_OBJ='openssl.o'
+  else if -n $with_ssl
+  then
+AC_MSG_ERROR([--with-ssl was given, but OpenSSL is not available.])
   fi
 fi
 
Index: src/wget.h
===
--- src/wget.h  (revision 2062)
+++ src/wget.h  (working copy)
@@ -40,7 +40,8 @@
 # define NDEBUG
 #endif
 
-#if defined HAVE_OPENSSL || defined HAVE_GNUTLS
+/* Is OpenSSL or GNUTLS available? */
+#if defined HAVE_LIBSSL || defined HAVE_LIBGNUTLS
 # define HAVE_SSL
 #endif
 


Re: Suggestion

2005-06-23 Thread Hrvoje Niksic
Matthew J Harms [EMAIL PROTECTED] writes:

 I'm sure you've already had this suggested, and I don't know if it
 will work, due to the complexity of the suggestion, but is there a
 way you could implement the capability of wget to download any file
 that meets a criteria yet use wildcards (i.e. * or ?) to fill in the
 blanks.

You can use wget -rl1 URL -A 200506*.exe.

The problem is that you must have a URL that lists all the available
files in HTML form.  If you don't have such a URL, it's impossible to
guess which files the server may contain.  (Unlike FTP, HTTP doesn't
support producing directory listings.)

 I'm not sure if wget even has the capability right now to do it?

If the problem is what I described above, no generic downloading agent
has the capability to do it.


Suggestion

2005-06-22 Thread Matthew J Harms








I'm sure you've already had this suggested, and
I don't know if it will work, due to the complexity of the suggestion,
but is there a way you could implement the capability for wget to download any
file that meets a criterion, using wildcards (i.e. * or ?) to fill in the
blanks?  For example, I'm trying to download the latest Intelligent
Updater antivirus definitions from Symantec's website, but I don't
want to have to visit the site to figure out what the actual file name is, in
order to download it.  So if I were able to use wildcards I would put
something like "wget http://definitions.symantec.com/defs/200506*.exe
-N" and leave it up to wget to get only the latest file from the
server.  Now I don't know if that is possible, because I don't think
(but I could be wrong) wget has the ability to preview the site and find
out what files it does have, in order to download the latest.  Or, even if
it could download every file that is newer than the last one in the folder,
wget would still download the file to the specified folder.  I've tried
to access the definitions site by itself, but there is nothing to be
seen.

I'm not sure if wget even has the capability right now
to do it?  I've tried reviewing the help that comes with it, but
there is nothing mentioning what I've suggested within the help.  I'll
keep looking around on the inet to see if anyone else has figured it out.



Thanks,

 Matt








wget Question/Suggestion

2005-05-20 Thread Mark Anderson
Is there an option, or could you add one if there isn't,
to specify that I want wget to write the downloaded html
file, or whatever, to stdout so I can pipe it into some
filters in a script?


Re: wget Question/Suggestion

2005-05-20 Thread Hrvoje Niksic
Mark Anderson [EMAIL PROTECTED] writes:

 Is there an option, or could you add one if there isn't, to specify
 that I want wget to write the downloaded html file, or whatever, to
 stdout so I can pipe it into some filters in a script?

Yes, use `-O -'.


Re: suggestion

2005-03-26 Thread Hrvoje Niksic
Stephen Leaf [EMAIL PROTECTED] writes:

 parameter option --stdout
 this option would print the file being downloaded directly to stdout, which
 would also mean that _only_ the file's content is printed: no errors, no
 verbosity.

 usefulness?
 wget --stdout http://server.com/file.bz2 | bzcat > file

Note that you can emulate the proposed `--stdout' by specifying
`-qO-'.
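
For example, reusing the URL from the quoted suggestion:

  wget -qO- http://server.com/file.bz2 | bzcat > file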


suggestion

2005-03-25 Thread Stephen Leaf
parameter option --stdout
this option would print the file being downloaded directly to stdout, which
would also mean that _only_ the file's content is printed: no errors, no
verbosity.

usefulness?
wget --stdout http://server.com/file.bz2 | bzcat > file


Suggestion regarding size

2005-02-21 Thread Baptiste Bullot
Hello all,

Would it be possible to specify minimum size for files
to retrieve?

Please add me in the CC list of your replies as I'm
not a subscriber.

Thanks,
Baptiste







suggestion for wget

2005-02-05 Thread Sorin
hi there ::)
it would be nice to have 2 or more downloads at the same time, because
some files are big and the host limits the speed...

thanks :)
Sorin


Re: suggestion for wget

2005-02-05 Thread Ryan Underwood

On Sat, Feb 05, 2005 at 02:04:26PM +0200, Sorin wrote:
 hi there ::)
 
 it would be nice to have 2 or more downloads at the same time, because
 some files are big and the host limits the speed...

You could use a multithreaded download manager (example: d4x).  Many of
these packages use wget as a backend.  You could also use the screen
utility to run many wgets concurrently, or just background them in the
current shell (but your screen will become a mess ... )
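
For instance, two transfers from different servers can simply be backgrounded
from the shell (the URLs are placeholders):

  wget http://mirror1.example.com/big1.iso &
  wget http://mirror2.example.com/big2.iso &
  wait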

-- 
Ryan Underwood, [EMAIL PROTECTED]


Re: Suggestion, --range

2004-10-01 Thread Alain Bench
Hello Robert,

 On Thursday, September 30, 2004 at 6:36:43 PM +0200, Robert Thomson wrote:

 It would be really advantageous if wget had a --range command line
 argument, that would download a range of bytes of a file, if the
 server supports it.

    You could try the feature patch posted by Rodrigo S. Wanderley last
year on the wget mailing list.  He did the work, and nobody gave any
feedback :-\.  See [EMAIL PROTECTED].


Bye!Alain.
-- 
When you want to reply to a mailing list, please avoid doing so from a
digest. This often builds incorrect references and breaks threads.


Suggestion, --range

2004-09-30 Thread Robert Thomson
G'day,

It would be really advantageous if wget had a --range command line
argument, that would download a range of bytes of a file, if the
server supports it.

I've tried adding it with --header 'Range: bytes=from-to' but wget has
a problem with the 206 return code, and I can't see a way around that
on the command line.  An alternative might be an
--allow-returncode=206 option. ;)

Downloading partial files is really useful when you have a small USB
key and a large ISO. ;)

Thanks,
Rob.



Re: Suggestion to add a switch on timestamps

2004-03-19 Thread Hrvoje Niksic
david-zhan [EMAIL PROTECTED] writes:

 WGET is popular FTP software for UNIX.  But after the files are
 downloaded for the first time, WGET always uses the date and time
 matching those on the remote server for the downloaded files.  If
 WGET is executed in a temporary directory in which files are deleted
 according to their dates, files created seven days ago will be deleted
 automatically as soon as they have finished downloading.  I suggest
 that an option on timestamps be added to WGET so that users can use
 the current date and time for newly downloaded files.

Can't you simply use `touch *' to update the timestamps?
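
For example (the URL is a placeholder):

  wget ftp://ftp.example.com/pub/file.dat && touch file.dat

gives the freshly downloaded file the current date and time, so an age-based
cleanup won't remove it prematurely.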



Suggestion to add a switch on timestamps

2004-03-16 Thread david-zhan
Suggestion to add a switch on timestamps

 

Dear Sir/Madam:

 

WGET is popular FTP software for UNIX.  But after the files are downloaded
for the first time, WGET always uses the date and time matching those on the
remote server for the downloaded files.  If WGET is executed in a temporary
directory in which files are deleted according to their dates, files created
seven days ago will be deleted automatically as soon as they have finished
downloading.  I suggest that an option on timestamps be added to WGET
so that users can use the current date and time for newly
downloaded files.

 

Thank you for your kind attention.



doc suggestion

2003-12-22 Thread Chuck Roberts
Please put in the wget docs, in at least 2 places: "The rc file used
by wget under Windows is actually wgetrc (no prefixed period), not
.wgetrc."

I could not find this info in the docs, and only figured it out by
experimentation.

Chuck

-- 
__
Freezone Freeware: http://freezone.darksoft.co.nz
http://chuckr.freeshell.org
1000+ programs in 40+ categories. Links to 500+ free Delphi controls
in 20+ categories!
Mirrors: http://www.bsdg.org/resources/
http://chuckr.bravepages.com 
http://groups.yahoo.com/group/DelphiOpenSource/files/Links/


wget Suggestion: ability to scan ports BESIDE #80, (like 443) Anyway Thanks for WGET!

2003-12-07 Thread mark . lombardi


Re: wget Suggestion: ability to scan ports BESIDE #80, (like 443) Anyway Thanks for WGET!

2003-12-07 Thread Tony Lewis
- Original Message - 
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, December 07, 2003 8:04 AM
Subject: wget Suggestion: ability to scan ports BESIDE #80, (like 443)
Anyway Thanks for WGET!

What's wrong with wget https://www.somesite.com ?
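
Non-default ports can also be given directly in the URL; the host and port
below are placeholders:

  wget http://www.somesite.com:8080/index.html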



suggestion: rethink retrial and abort behavior

2003-10-11 Thread Jan Tisje
hi,

if I observed correctly, wget behaves this way:

errors are classified into two classes: critical and non-critical errors.

when a non-critical error (e.g. a time out) occurs, wget retries, continuing
at the byte the last transmission stopped at
(if configured that way).

if a critical error (like access denied, file not found) occurred, wget
stops.

I now experienced for the very first time wget OVERWRITING a partly
retrieved file because of a bug.
it was a pain in the ass, because I had already been waiting 1:30 hours
to download the first half of 600 MB.

wget tried to continue, but the server answered with the remaining file
size (I believe), not the complete file size. So wget got confused and
restarted.
no, sorry, I do not have the logfile any more, but I can get the link
(it was a filefront download)

This makes me think about another behaviour:

- wget could be forced to always retry.
since yesterday and in the past I have experienced many false stops because
of a bad server or connection. probably it should delay a retry in the
critical case.

- if wget decides a continue is not possible due to a server limitation,
it should NOT delete the file but create a diff, if that seems appropriate
(is the diff command able to work on binary files?)

Jan







suggestion

2003-09-12 Thread John Joseph Bachir
it would be great if there was a flag that could be used with -q that
would only give output if there was an error.

i use wget a lot in pcs:

  johnjosephbachir.org/pcs

thanks!
john


Re: suggestion

2003-09-12 Thread Aaron S. Hawley
is -nv (non-verbose) an improvement?

$ wget -nv www.johnjosephbachir.org/
12:50:57 URL:http://www.johnjosephbachir.org/ [3053/3053] -> index.html [1]
$ wget -nv www.johnjosephbachir.org/m
http://www.johnjosephbachir.org/m:
12:51:02 ERROR 404: Not Found.

but if you're not satisfied you could use shell redirection and the tail
command:

$ wget -nv www.johnjosephbachir.org/m 2>&1 > /dev/null | tail +2

you could use the error return value to echo whatever you want:
$ wget -q www.johnjosephbachir.org/m || echo Error
Error


On Fri, 12 Sep 2003, John Joseph Bachir wrote:

 it would be great if there was a flag that could be used with -q that
 would only give output if there was an error.

 i use wget a lot in pcs:

   johnjosephbachir.org/pcs

 thanks!
 john


Re: suggestion

2003-09-12 Thread John Joseph Bachir
great, thanks for the suggestions. yeah, i am looking for something that
will be absolutely quiet when there is no error, but i have been using
-nv in the meantime.

john



On Fri, 12 Sep 2003, Aaron S. Hawley wrote:

|is -nv (non-verbose) an improvement?
|
|$ wget -nv www.johnjosephbachir.org/
|12:50:57 URL:http://www.johnjosephbachir.org/ [3053/3053] -> index.html [1]
|$ wget -nv www.johnjosephbachir.org/m
|http://www.johnjosephbachir.org/m:
|12:51:02 ERROR 404: Not Found.
|
|but if you're not satisfied you could use shell redirection and the tail
|command:
|
|$ wget -nv www.johnjosephbachir.org/m 2>&1 > /dev/null | tail +2
|
|you could use the error return value to echo whatever you want:
|$ wget -q www.johnjosephbachir.org/m || echo Error
|Error
|
|
|On Fri, 12 Sep 2003, John Joseph Bachir wrote:
|
| it would be great if there was a flag that could be used with -q that
| would only give output if there was an error.
|
| i use wget a lot in pcs:
|
|   johnjosephbachir.org/pcs
|
| thanks!
| john
|


suggestion

2003-06-17 Thread Roman Dusek
Dear Sirs,

thanks for WGet, it's a great tool.  I would very much appreciate one more
option: the possibility to get an http page using the POST method instead of GET.

Cheers,
Roman


Re: suggestion

2003-06-17 Thread Aaron S. Hawley
it's available in the CVS version..

information at:
http://www.gnu.org/software/wget/
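
For example, the POST support in CVS (the option that later shipped as
--post-data in wget 1.9) is used roughly like this; the URL and form fields
are placeholders:

  wget --post-data="user=demo&lang=en" http://www.example.com/cgi-bin/search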

On Tue, 17 Jun 2003, Roman Dusek wrote:

 Dear Sirs,

 thanks for WGet, it's a great tool.  I would very much appreciate one more
 option: the possibility to get an http page using the POST method instead of GET.

 Cheers,
 Roman

-- 
Women do two-thirds of the work for five percent of the world's income.


feature suggestion -- download small files only

2003-03-06 Thread An Zhu
WGET can download only certain files based on file
type.  Usually the purpose is to avoid wasting time
on unrelated files.  Actually we don't care
much about small files, such as 1K text files.

Can we just limit the file size?  For example, just
take those less than 1M.  Generally we get the real
file size before the downloading starts, except for files
returned by CGI, which could give a wrong
content-length.

Just my idea.  I don't know whether the current version 1.8
has already implemented it in some way.




A suggestion for `man wget'

2002-12-18 Thread Uri Elias
Hello,

This is not a bug, but could you please add to the manual, after the
sentence
"The proxy is on by default if the appropriate environmental
 variable is defined."
that this variable is called http_proxy.  It is not easy to guess.
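
For example (the proxy address is a placeholder):

  http_proxy=http://proxy.example.com:8080/
  export http_proxy
  wget http://www.gnu.org/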

Yours,
U. Elias





suggestion

2002-10-13 Thread Ben Purbrick



could you add to wget an option to force line feeds 
to be 0D0A ?


Re: [Req #1764] Suggestion: Anonymous rsync access to the wget CVS tree.

2002-09-23 Thread Erlend Aasland

Rsync would be very convenient. You've got my vote on that one.


Erlend Aasland

On 09/23/02 10:01, Lars Chr. Hausmann wrote:
  Max == Max Bowsher [EMAIL PROTECTED] writes:
 
  Max As a dial-up user, I find it extremely useful to have access to
  Max the full range of cvs functionality whilst offline. Some other
  Max projects provide read-only rsync access to the CVS repository,
  Max which allows a local copy of the repository to be made, not just
  Max a checkout of a particular version.
 
  Max Since access to xemacs cvs on sunsite.dk is already provided in
  Max this manner, perhaps it would be possible for wget, as well?
 
 Wget list - do you want us to set this up?
 
 /LCH
 -- 
 SunSITE.dk Staff
 http://SunSITE.dk
 



Re: [Req #1764] Suggestion: Anonymous rsync access to the wget CVStree.

2002-09-23 Thread Lars Chr. Hausmann

 Max == Max Bowsher [EMAIL PROTECTED] writes:

 Max As a dial-up user, I find it extremely useful to have access to
 Max the full range of cvs functionality whilst offline. Some other
 Max projects provide read-only rsync access to the CVS repository,
 Max which allows a local copy of the repository to be made, not just
 Max a checkout of a particular version.

 Max Since access to xemacs cvs on sunsite.dk is already provided in
 Max this manner, perhaps it would be possible for wget, as well?

Wget list - do you want us to set this up?

/LCH
-- 
SunSITE.dk Staff
http://SunSITE.dk




Suggestion: Anonymous rsync access to the wget CVS tree.

2002-09-12 Thread Max Bowsher

As a dial-up user, I find it extremely useful to have access to the
full range of cvs functionality whilst offline. Some other projects
provide read-only rsync access to the CVS repository, which allows a
local copy of the repository to be made, not just a checkout of a
particular version. 

Since access to xemacs cvs on sunsite.dk is already provided in this
manner, perhaps it would be possible for wget, as well?

Thanks,

Max.




Suggestion: Anonymous rsync access to the CVS tree.

2002-08-15 Thread Max Bowsher

As a dial-up user, I find it extremely useful to have access to the full range
of cvs functionality whilst offline. Some other projects provide read-only rsync
access to the CVS repository, which allows a local copy of the repository to be
made, not just a checkout of a particular version.

Since access to xemacs cvs on sunsite.dk is already provided in this manner,
perhaps it would be possible for wget, as well?

Thank you,

Max.




Re: Suggestion

2002-07-17 Thread Ced

Hello Danny,

Wednesday, July 17, 2002, 9:19:10 PM, you wrote:

DL interrupt the downloading of a certain file or even
DL branch when downloading a directory tree recursively.

For one file:
Stop a download with Ctrl-C, and resume it with:
wget -c http://pwet/file_you_were_downloading

For a recursive download, the -N option downloads only files that are
newer or that don't exist in the current folder, but it doesn't resume the
big file of the tree you were downloading.

Current solution:
wget -c http://pwet/tree/the_big_file
and then
wget -rN http://pwet/tree/


DL http://www.abcd.com/-index.html-docs.html-snapshots.html
DL [0-4]?

wget -r -l1 http://pwet/tree/

this will only recurse one level deep (only index.html for your example)

-- 
Best regards,
 Cedmailto:[EMAIL PROTECTED]

 





Re: timestamping ( another suggestion)

2002-04-16 Thread Brix Lichtenberg

DCA This isn't a bug, but the offer of a new feature.  The timestamping
DCA feature doesn't quite work for us, as we don't keep just the latest
DCA view of a website and we don't want to copy all those files around for
DCA each update.

Which brings me to mention two features I've been meaning to suggest for
ages.  Probably it means changing some basic things in the core of wget, I
don't know; I'm no programmer.  Maybe it has been thought about already and
it was decided otherwise.

But why does wget have to rename the last file it fetches when it finds
another one with the same name?  Why isn't the previous file that is already
there renamed to .1, .2 and so on if more files are present?

IMO this would be a major advantage for mirroring sites with timestamping
*and* keeping the old files (which you may not want to discard)
*and* keeping the links between newer and older unchanged files intact.

Hm?

The other thing is more or less ripped from the Windows download manager
FlashGet (but why not).  Wouldn't it be useful if wget retrieved a file
to a temporarily renamed filename, for instance with the extension .wg! or
something, and renamed it back to the original name after finishing?  Two
advantages IMO: first, you can easily see at which point
a download broke (so you don't have to look for a file by date or size
or something in a whole lot of them).

The other is the possibility of resuming a broken download with the
option -nc (so the already downloaded files aren't looked up again).
wget needn't check a lot and could determine by the file extension
that this is the one file where it has to continue.

Do I make sense?  Sorry, only raw ideas.

-- Brix




Re[2]: timestamping ( another suggestion)

2002-04-16 Thread Brix Lichtenberg

 The other thing more or less is ripped from the Windows DL-Manager
 FlashGet (but why not). Wouldn't it be useful if wget retrieves a file
 to a temporary renamed filename, for instance with the extension .wg! or
 something and renamed back to the original name after finishing? Two
TL advantages IMO: First you can easily see at which point
 a download broke (so you don't have to look for a file by date or size
 or something in a whole lot of them).

 The other is the possibility to resume a broken download with the
 option -nc (so the already downloaded files aren't looked up again).
 Wget needn't check a lot and could determine by the file extension
 that this is the one file where it has to continue.

TL wget needs to remember a LOT more than simply the last file that was being
TL downloaded. It needs to remember all the files it has looked at, the files
TL that have been downloaded, the files that are in the queue to be downloaded,
TL the command line and .wgetrc options, etc.

TL With some clever planning by someone who knows the internals of the program
TL really well, it might be possible for wget to create a resumption file with
TL the state of the download, but I'm guessing that is a huge task.

Well, I said I don't know what it takes or whether it makes sense
programming-wise.  And actually I thought it wasn't about wget getting
to remember more.  If it creates a resumption file, then when the
broken download has to be repeated it no-clobbers all the complete
downloads (no remembering), doesn't find the current incomplete one
(because of the extension), starts to download it (again with the
resumption extension), finds there is one when it tries to write, and
decides to continue that file at the right point.

Well, the conventional way of finding the broken file, deleting it and
starting again with -nc works too, of course. :-)

-- Brix




Re: New suggestion.

2002-04-10 Thread Ivan Buttinoni

On Monday 08 April 2002 19:18, you wrote:
 Ivan Buttinoni [EMAIL PROTECTED] writes:
  Again I send a suggestion, this time quite easy.  I hope it's not
  already implemented, else I'm sorry in advance.  It would be nice if
  wget could use regexps to evaluate what to accept/refuse to download.
  The regexp would have to work on the whole URL and/or filename and/or
  hostname and/or CGI argument.  Sometimes I find the apache
  directory-sorting links useless, e.g.:
  .../?N=A
  .../?M=D

  Here follows a hypothetical command for the above example:
  wget -r -l0 --reg-exclude '[A-Z]=[AD]$' http://

 The problem with regexps is that their use would make Wget dependent
 on a regexp library.  To make matters worse, regexp libraries come in
 all shapes and sizes, with incompatible APIs and implementing
 incompatible dialects of regexps.

 I'm staying away from regexps as long as I possibly can.

Ok, there exist a lot of regexp implementations and, as a consequence, a lot of
dialects, but don't forget _GNU regexp_ (http://www.gnu.org/directory/rx.html)!

And how difficult would it be to enable regexps at compile time? (e.g. ./configure
--with-gnuregexp)?

Ciao
Ivan

-- 
 =
 BWARE TECHNOLOGIES - http://www.bware.it/ Via S.Gregorio, 3, Milano 20124
 Italy -  Phone: +39 02 2779181 Fax: +39 02 27791828  GSM: +39 335 1280432
 =  



Re: New suggestion.

2002-04-08 Thread Hrvoje Niksic

Ivan Buttinoni [EMAIL PROTECTED] writes:

 Again I send a suggestion, this time quite easy.  I hope it's not
 already implemented, else I'm sorry in advance.  It would be nice if
 wget could use regexps to evaluate what to accept/refuse to download.
 The regexp would have to work on the whole URL and/or filename and/or
 hostname and/or CGI argument.  Sometimes I find the apache
 directory-sorting links useless, e.g.:
 .../?N=A
 .../?M=D

 Here follows a hypothetical command for the above example:
 wget -r -l0 --reg-exclude '[A-Z]=[AD]$' http://

The problem with regexps is that their use would make Wget dependent
on a regexp library.  To make matters worse, regexp libraries come in
all shapes and sizes, with incompatible APIs and implementing
incompatible dialects of regexps.

I'm staying away from regexps as long as I possibly can.



Re: [Feature suggestion] SMIL support

2002-03-18 Thread Alan Eldridge

On Tue, Mar 19, 2002 at 12:06:48AM +0100, Fabrice Bauzac wrote:

Maybe there is an easy way of saying "hey, SMIL files are like HTML"
to wget?

There's an option to set the recognized tag set for html docs. Maybe some
trickery with that, plus --force-html, might do the trick.

-- 
AlanE
When the going gets tough, the weird turn pro. - HST




Re: -H suggestion

2002-01-16 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

 Funny you mention this.  When I first heard about -p (1.7?) I
 thought exactly that it would default to [spanning hosts to retrieve
 page requisites].  I think it would be really useful if the page
 requisites could be wherever they want. I mean, -p is already
 ignoring -np (since 1.8?), which I think is also very useful.

Since 1.8.1.  I considered it a bit more dangerous to allow
downloading from just any host if the user has not allowed it
explicitly.  For example, maybe the user doesn't want to load the
banner ads?  Or maybe he does?

Either way, I was presented with a user interface problem.  I
couldn't quite figure out how to arrange the options to allow for
three cases:

 * -p gets stuff from this host only, including requisites.
 * -p gets stuff from this host only, but requisites may span hosts.
 * everything may span hosts.

Fred's suggestion raises the bar, because to implement it we'd need a
set of options to juggle with the different download depths depending
on whether you're referring to the starting host or to the other
hosts.

  The -i switch provides for a file listing the URLs to be downloaded.
  Please provide for a list file for URLs to be avoided when -H is
  enabled.
 
 URLs to be avoided?  Given that a URL can be named in more than one
 way, this might be hard to do.
 
 Sorry, but does --reject-host (or similar, I don't have the docs here ATM)
 not exactly do this?

The existing rejection switches reject on the basis of host name, and
on the basis of file name.  There is no switch to disallow downloading
a specific URL.
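For illustration, the existing switches look roughly like this (host and pattern
are placeholders):

  # reject by host name and by file name pattern, but not by full URL
  wget -r -H --exclude-domains ads.example.com -R 'banner*' http://www.example.com/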



Re: -H suggestion

2002-01-15 Thread jens . roesner

Hi!

Once again, I think this doesn't belong on the bug list, but there you
go:

 I've toyed with the idea of making a flag to allow `-p' span hosts
 even when normal download doesn't.

Funny you mention this.
When I first heard about -p (1.7?) I thought exactly that it would default
to that behaviour.
I think it would be really useful if the page requisites could be wherever 
they want. I mean, -p is already ignoring -np (since 1.8?), which I think is
also very useful.

  The -i switch provides for a file listing the URLs to be downloaded.
  Please provide for a list file for URLs to be avoided when -H is
  enabled.
 
 URLs to be avoided?  Given that a URL can be named in more than one
 way, this might be hard to do.
 
Sorry, but does --reject-host (or similar, I don't have the docs here ATM)
not exactly do this? I may well be missing the point here.
But with disallowing hosts and dirs you should be able to do this.
Or is the problem loading the lists from an external file?
If so, please ignore my comment; I have no experience with this.

CU
Jens

-- 
GMX - Die Kommunikationsplattform im Internet.
http://www.gmx.net




-H suggestion

2002-01-11 Thread Fred Holmes

WGET suggestion

The -H switch/option sets host-spanning.  Please provide a way to specify a 
different limit on recursion levels for files retrieved from foreign hosts.

-r -l0 -H2

for example would allow unlimited recursion levels on the target host, but 
only 2 [additional] levels when a file is being retrieved from a foreign host.

Second suggestion:

The -i switch provides for a file listing the URLs to be downloaded.

Please provide for a list file for URLs to be avoided when -H is enabled.

Thanks for listening.

And thanks for a marvelous product.

Fred Holmes  [EMAIL PROTECTED]




Suggestion on job size

2002-01-11 Thread Fred Holmes

It would be nice to have some way to limit the total size of any job, and 
have it exit gracefully upon reaching that size, by completing the -k -K 
process upon termination, so that what one has downloaded is useful.  A 
switch that would set the total size of all downloads --total-size=600MB 
would terminate the run when the total bytes downloaded reached 600 MB, and 
process the -k -K.  What one had already downloaded would then be properly 
linked for viewing.

Probably more difficult would be a way of terminating the run manually 
(Ctrl-break??), but then being able to run the -k -K process on the 
already-downloaded files.

Fred Holmes




Re: Suggestion on job size

2002-01-11 Thread Jens Rösner

Hi Fred!

First, I think this belongs on the normal wget list rather than here,
as I cannot see a bug.
Sorry to the bug trackers; I am posting to the normal wget list and
cc-ing Fred, hope that is OK.

To your first request: -Q (Quota) should do precisely what you want.
I used it with -k and it worked very well.
Or am I missing your point here?
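A minimal example (the URL is a placeholder):

  # -Q stops the recursive run once roughly 600 MB have been downloaded,
  # and -k/-K still run so the already-fetched pages are viewable locally
  wget -r -k -K -Q 600m http://www.example.com/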

Your second wish is, AFAIK, not possible now.
Maybe in the future wget could write the record
of downloaded files to the appropriate directory.
After wget exits, this file could then be used
to process all the files mentioned in it.
Just an idea; I would not normally think that
this is an often-requested option.
HOWEVER:
-K works (if I understand it correctly) on the fly, as it decides during
the run whether the server file is newer, whether a previously converted
file exists, and what to do.
So only -k would work after the download, right?

CU
Jens

http://www.JensRoesner.de/wgetgui/

 It would be nice to have some way to limit the total size of any job, and
 have it exit gracefully upon reaching that size, by completing the -k -K
 process upon termination, so that what one has downloaded is useful.  A
 switch that would set the total size of all downloads --total-size=600MB
 would terminate the run when the total bytes downloaded reached 600 MB, and
 process the -k -K.  What one had already downloaded would then be properly
 linked for viewing.
 
 Probably more difficult would be a way of terminating the run manually
 (Ctrl-break??), but then being able to run the -k -K process on the
 already-downloaded files.
 
 Fred Holmes



suggestion

2002-01-09 Thread Mike Jackson

I'm using wget for a watcher script that I run to monitor some servers, and I
was thinking that it'd be handy to have the HTTP response code (200,
404, etc.) as the return value on exit.  Currently having it return 0 for OK
and 1 for not OK is fine, but I can see some instances in the future where I
might want the HTTP response code instead.

Anyway, just a thought: some assembly required, batteries not included, your
mileage may vary.
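Until something like that exists, a hedged workaround sketch is to parse the
status line that -S prints to stderr (the URL is a placeholder):

  # -S prints the server's response headers; pick the last status code seen
  # (the last one matters when redirects are involved)
  code=$(wget -S -O /dev/null http://www.example.com/ 2>&1 \
         | sed -n 's/.*HTTP\/[0-9.]* \([0-9][0-9][0-9]\).*/\1/p' | tail -1)
  echo "$code"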

--mikej
-=-
mike jackson
[EMAIL PROTECTED]




wget suggestion

2002-01-08 Thread Michiel Dethmers


Hi
Just a suggestion. I'm using wget 1.6.
If using FTP, add an option to download files with the same file permissions.

Cheers
Michiel
-- 





Re: suggestion

2001-11-29 Thread Hrvoje Niksic

Jerome Lapous [EMAIL PROTECTED] writes:

 One option that could be interesting is to print the download result
 on standard output instead of to a file.  It would avoid rights problems
 when the same shell is used by multiple users.

Have you tried `-q -O -'?



wget suggestion...

2001-11-20 Thread [Total K]


One small suggestion for a possible later release... a mask for all files..

  wget -m http://localhost/*.txt

for example.
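A hedged note: for recursive HTTP retrievals, the existing accept list already
comes close (the URL and pattern are placeholders):

  # -A restricts which files are saved during a recursive run
  wget -r -l1 -A '*.txt' http://localhost/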

Other than that.. all's good =)




Regards..

  Total K


http://www.oc32.cjb.net ~ OC32 Home
http://www.digiserv.cjb.net ~ Home of [Total K]
http://www.digitaldisorder.cjb.net/php/download.php?sec=pgpsub_sec=f=totalk.asc ~ 
PGP Key


- ---
If mini skirts get any higher, said the Fairy to the Gnome,
We'll have two more cheeks to powder, and a few more hairs to comb.








Re: wget suggestion

2001-11-16 Thread Vladi Belperchinov-Shabanski

[EMAIL PROTECTED] wrote:
 
 hiya!
 
 i'd like to have wget forking into background as default
 (via .wgetrc) but sometimes, eg. in shell scripts, i need
 wget to stay in foreground, so the script knows when the
 file is completely downloaded (well, after wget exits =)
 is it possible to implement such a feature?
 thanks in advance, wget rocks!
 
 greets, alex

you can get wget running in the background by adding `&' at the end, i.e.

wget http://somewhere/file.txt &

if you don't add `&', wget will run in the foreground; then you can still
`ctrl+z' and `bg' to send it to the background, or simply close the
terminal in which wget is running (that will also send wget to the background
and even send all messages to the `wget-log' log file)...

well, all this is written somewhere in the docs I'm sure :)

P! Vladi.
-- 
Vladi Belperchinov-Shabanski [EMAIL PROTECTED] [EMAIL PROTECTED]
Personal home page at http://www.biscom.net/~cade
DataMax Ltd. http://www.datamax.bg
Too many hopes and dreams won't see the light...




wget suggestion

2001-11-05 Thread ap6

hiya!

i'd like to have wget forking into background as default
(via .wgetrc) but sometimes, eg. in shell scripts, i need
wget to stay in foreground, so the script knows when the
file is completely downloaded (well, after wget exits =)
is it possible to implement such a feature?
thanks in advance, wget rocks!

greets, alex




RE: suggestion

2001-07-11 Thread Herold Heiko

Something like that has been suggested already, but is not yet
implemented (at least not in the official source, which is at 1.7,
by the way).

By any chance, are you using a proxy?  Some of those (braindead, IMHO)
insert a string like "connection interrupted" at the end of a failed
download.  If you can get a good copy of a ruined file, try comparing the
two in order to understand exactly where they differ, and try to
match the differences with the offsets at which wget had to continue a
download (from wget -v output).
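For example (the file names are placeholders):

  # cmp -l lists every differing byte with its offset; head shows the first few
  cmp -l good-copy.iso ruined-copy.iso | head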

Heiko

-- 
-- PREVINET S.p.A.[EMAIL PROTECTED]
-- Via Ferretto, 1ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907087
-- ITALY



-Original Message-
From: Luis Yanes [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, July 11, 2001 2:24 AM
To: [EMAIL PROTECTED]
Subject: suggestion


Dear team. First let me thank you for such a great utility.

After retrieving huge ISO files with wget 1.5.3 and finding them
unusable due to checksum failures, I thought about a possible enhancement
for wget.

I think that the most probable transmission errors will occur just
before a disconnect event, giving the last few bytes the greatest
chance of becoming corrupt, even when using TCP.  I don't have any
measurements to back up this claim, though.

Reviewing the wget docs, I haven't found anything related to this.
When using the -c, --continue options for HTTP or FTP, requesting
a few overlapping bytes could solve this potential problem.  A default
overlap of 256 bytes to 1K would have an insignificant impact on data
throughput and may avoid a huge trashed file.

Allow me to suggest the following syntax:

--overlap         overlap (default bytes) while continuing a download
--overlap=BYTES   overlap BYTES while continuing a broken download

--ignore-overlap  If the overlap segments differ, use the new/old data; well,
we are in trouble.  This would need more discussion.

I have no experience with GNU software development, but if you don't
find this interesting enough to work on, I could try to make a patch
against the latest wget code for your review, to include in the
distribution.

-- 
73's de Luis

mail: [EMAIL PROTECTED]
Ampr: eb7gwl.ampr.org
http://www.terra.es/personal2/melus0/ - PCBs for Homebrewed Hardware




suggestion

2001-07-10 Thread Luis Yanes

Dear team. First let me thank you for such a great utility.

After retrieving huge ISO files with wget 1.5.3 and finding them
unusable due to checksum failures, I thought about a possible enhancement
for wget.

I think that the most probable transmission errors will occur just
before a disconnect event, giving the last few bytes the greatest
chance of becoming corrupt, even when using TCP.  I don't have any
measurements to back up this claim, though.

Reviewing the wget docs, I haven't found anything related to this.
When using the -c, --continue options for HTTP or FTP, requesting
a few overlapping bytes could solve this potential problem.  A default
overlap of 256 bytes to 1K would have an insignificant impact on data
throughput and may avoid a huge trashed file.

Allow me to suggest the following syntax:

--overlap         overlap (default bytes) while continuing a download
--overlap=BYTES   overlap BYTES while continuing a broken download

--ignore-overlap  If the overlap segments differ, use the new/old data; well,
we are in trouble.  This would need more discussion.

I have no experience with GNU software development, but if you don't
find this interesting enough to work on, I could try to make a patch
against the latest wget code for your review, to include in the
distribution.

-- 
73's de Luis

mail: [EMAIL PROTECTED]
Ampr: eb7gwl.ampr.org
http://www.terra.es/personal2/melus0/ - PCBs for Homebrewed Hardware




Suggestion...

2001-07-04 Thread Charlie Sorsby

I have wget v1.5.3 -- I don't know if this is the current version or not
but, if so, is there any possibility of a future version that
translates from HTML to text files as Netscape is (usually) able to
do?  It would be nice to be able to retrieve a text version of a
web page from a script, with something lacking the bloat of
Netscape.

If there's a more recent version that already has this ability,
I'd appreciate a pointer to it.

Thanks.

Charlie
[EMAIL PROTECTED]



Re: Suggestion...

2001-07-04 Thread toad

On Wed, Jul 04, 2001 at 01:42:02PM -0600, Charlie Sorsby wrote:
 I have wget v1.5.3 -- I don't know if this is the current version or not
 but, if so, is there any possibility of a future version that
 translates from HTML to text files as Netscape is (usually) able to
 do?  It would be nice to be able to retrieve a text version of a
 web page from a script, with something lacking the bloat of
 Netscape.
It would be even nicer to avoid bloat in wget altogether by taking the
revolutionary step of using an external conversion script.
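A minimal sketch of that approach, assuming lynx is installed (any HTML-to-text
converter would do; the URL is a placeholder):

  # fetch quietly to stdout and let lynx render the page as plain text
  wget -q -O - http://www.example.com/page.html | lynx -stdin -dump > page.txt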
 
 If there's a more recent version that already has this ability,
 I'll appreciate a pointer to it.
 
 Thanks.
 
 Charlie
   [EMAIL PROTECTED]

-- 
Always hardwire the explosives
-- Fiona Dexter quoting Monkey, J. Gregory Keyes, Dark Genesis



Suggestion

2001-07-03 Thread Jan Thonemann

.. or better a question?!?

Hi

Sorry in advance for the bad English :-)

I have a problem and I hope you can help me.

I have tried to download some files from an FTP server using an input
file. The command I used looks like this:

wget -i file

The file looks like this:

ftp://user:[EMAIL PROTECTED]/path1/file1
ftp://user:[EMAIL PROTECTED]/path2/file2

The list is much longer. I don't want to use the -r option, because I don't
need all the files.

My problem is that wget makes a new login for each file, but the files are
all on the same server. The login on the FTP server takes quite a long time,
and that's why I want to ask what I must do so that wget logs in only once
and gets all the files.

I downloaded your new 1.7 version and tried to do what I want with
the --base option, but that doesn't work, or I don't know how.

I hope you can help me, because I would like to keep using wget, which does
good work in all other cases.
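A hedged workaround sketch, assuming the wanted files can be matched by name
(user, password, host and path are placeholders): a single globbing or
recursive invocation needs far fewer logins than one invocation per file.

  # FTP globbing: one URL, one login, many files
  wget 'ftp://user:password@ftp.example.com/path1/file*'

  # or: recurse one level and keep only the wanted names
  wget -r -l1 -nH -A 'file1,file2' 'ftp://user:password@ftp.example.com/path1/'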

Greets Jan Thonemann



WGET suggestion

2001-06-04 Thread Michael Widowitz

Hello,

I'm using wget and prefer it to a number of GUI programs.  It only
seems to me that style sheets (CSS files) aren't downloaded.  Is this
true, or am I doing something wrong?  If they really aren't downloaded, I
would suggest that stylesheets should also be retrieved by wget.

Regards,

Michael

-- 
Michael Widowitz
[EMAIL PROTECTED]
http://widowitz.com - letztes Update 22.4.2001
http://astraxa.net





Re: WGET suggestion

2001-06-04 Thread Jan Prikryl

Quoting Michael Widowitz ([EMAIL PROTECTED]):

 I'm using wget and prefer it to a number of GUI-programs. It only
 seems to me that Style Sheets (css-files) aren't downloaded. Is this
 true, or am I doing something wrong? If not, I would suggest that
 stylesheets should also be retrieved by wget.

Michael,

Which version of wget do you use? I guess (but maybe I'm mistaken)
that versions 1.6 and upwards do download CSS when doing recursive
traversal (or --page-requisites).
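For example (the URL is a placeholder):

  # -p fetches the page requisites (images, stylesheets, ...);
  # -k converts the links for local viewing
  wget -p -k http://www.example.com/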

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 [EMAIL PROTECTED] | http://www.vrvis.at
+--



Re: suggestion for wget

2001-03-18 Thread Jan Prikryl

Quoting Jonathan Nichols ([EMAIL PROTECTED]):

i have a suggestion for the wget program.  would it be possible to
 have a command line option that, when invoked, would tell wget to
 preserve the modification date when transferring the file?

I guess that `-N' (or `--timestamping') is what you're looking for.
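For example (the URL is a placeholder):

  # -N only (re)fetches the file if the remote copy is newer, and sets the
  # local modification time from the remote timestamp
  wget -N http://www.example.com/file.tar.gz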

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 [EMAIL PROTECTED] | http://www.vrvis.at
+--



suggestion for wget

2001-03-15 Thread Jonathan Nichols

hello,

i have a suggestion for the wget program.  would it be possible to
have a command line option that, when invoked, would tell wget to
preserve the modification date when transferring the file??  the
modification time would then reflect the last time the file was modified
on the remote machine, as opposed to the last time it was modified on
the local machine.  i know that the cp command has this option (-p).  is
this reasonable/possible for wget??

thanks,

jon






RE: SUGGESTION: rollback like GetRight

2001-01-10 Thread ZIGLIO Frediano

I suggest two parameters:
- rollback-size
- rollback-check-size
where 0 <= rollback-check-size <= rollback-size.
The first is for calculating the beginning of the range (filesize - rollback-size)
and the second for the check (wget should check the range [filesize -
rollback-size, filesize - rollback-size + rollback-check-size) )
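A hedged sketch of the proposed check (FILE, URL, ROLLBACK and CHECK are
placeholders for the file name, its URL, rollback-size and rollback-check-size;
it assumes the server honours HTTP Range requests):

  SIZE=$(wc -c < "$FILE")
  START=$((SIZE - ROLLBACK))
  # local bytes [START, START+CHECK)
  dd if="$FILE" bs=1 skip=$START count=$CHECK 2>/dev/null > local.chk
  # the same range fetched again from the server
  wget -q -O remote.chk --header="Range: bytes=$START-$((START + CHECK - 1))" "$URL"
  # if the overlap matches, resuming at offset START is safe; otherwise the file changed
  cmp -s local.chk remote.chk && echo "overlap matches" || echo "overlap differs"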

freddy77

 
 Hrvoje Niksic [EMAIL PROTECTED] writes:
  Daniel Stenberg [EMAIL PROTECTED] writes:
  
   Could you elaborate on this and describe in what way, theoretically,
   the errors would sneak into the destination file?
  
  By a silly proxy inserting a "transfer interrupted" string when the
  transfer between the proxy and the actual server gets interrupted.
 
 How awful.  Okay, I added this to the TODO.  I imagine it won't get done
 until someone with one of those broken proxies sends in a patch to
 implement it, though.
 
 ---
 Dan Harkless| To help prevent SPAM contamination,
 GNU Wget co-maintainer  | please do not mention this email
 http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
 







Re: SUGGESTION: rollback like GetRight

2001-01-10 Thread Jan Prikryl

Quoting ZIGLIO Frediano ([EMAIL PROTECTED]):

 I suggest two parameters:
 - rollback-size
 - rollback-check-size
 where 0 <= rollback-check-size <= rollback-size.
 The first is for calculating the beginning of the range (filesize - rollback-size)
 and the second for the check (wget should check the range [filesize -
 rollback-size, filesize - rollback-size + rollback-check-size) )

My understanding of the rollback problem is that there are some broken
proxies that add some additional text garbage after the connection
has timed out, for example.  Then, with `--rollback-size=NUM', after
timing out, wget shall cut the last NUM bytes of the file and try to
resume the download.

Could you elaborate more on the situation where something like
`--rollback-check-size' would be needed?  What should be checked there?

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 [EMAIL PROTECTED] | http://www.vrvis.at
+--




RE: SUGGESTION: rollback like GetRight

2001-01-10 Thread ZIGLIO Frediano

Rollback is useful mainly for checking that the file has not changed.
You check (compare) the downloaded data against your file.

freddy77
 
 Quoting ZIGLIO Frediano ([EMAIL PROTECTED]):
 
  I suggest two parameters:
  - rollback-size
  - rollback-check-size
  where 0 <= rollback-check-size <= rollback-size.
  The first is for calculating the beginning of the range (filesize - rollback-size)
  and the second for the check (wget should check the range [filesize -
  rollback-size, filesize - rollback-size + rollback-check-size) )
 
 My understanding of the rollback problem is that there are some broken
 proxies that add some additional text garbage after the connection
 has timed out, for example.  Then, with `--rollback-size=NUM', after
 timing out, wget shall cut the last NUM bytes of the file and try to
 resume the download.
 
 Could you elaborate more on the situation where something like
 `--rollback-check-size' would be needed?  What should be checked there?
 
 -- jan
 
 +--
  Jan Prikryl| vr|vis center for virtual reality and visualisation
  [EMAIL PROTECTED] | http://www.vrvis.at
 +--
 


