There needs to be a way to tell wget to reject all domains EXCEPT
those that are accepted. This should include subdomains. I.e. I
just want to download www.mydomain.com and cache.mydomain.com. I
thought the --domains option would work this way but it doesn't.
From: Robert La Ferla
Can you
GNU Wget 1.10.2
Capture this sub-site and not the rest of the site so that you can
view it locally. i.e. just www.boston.com and cache.boston.com
http://www.boston.com/ae/food/gallery/cheap_eats/
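A sketch of what usually answers this request (an assumption to verify,
not a confirmed fix: -H enables host spanning, and -D then limits the
spanning to the listed domain suffix, which covers both www. and cache.):
wget -r -l inf -H -D boston.com -p -k \
  http://www.boston.com/ae/food/gallery/cheap_eats/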
On May 3, 2007, at 10:34 PM, Steven M. Schweda wrote:
From: Robert La Ferla
GNU Wget 1.10.2
Ok. Running on what?
What is a sub-site? Do you mean this
Hi
I use wget to download the same file at regular intervals
(a price list). And I caught myself renaming files programmatically
(even in several projects) after they were downloaded by wget
and named file.1, file.2, etc.
Why I need this: one day I take the downloaded files from downlddir/
and move
From: Alvydas
I guess it would relatively easy and quite useful to add an option
to name file.20070426142800 file.20070426142955 ... instead just numbers.
The relevant code is in src/utils.c: unique_name(), and should be
easy enough to change. On a fast system, however, one-second
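A workaround sketch with existing options, while the code is unchanged
(URL and file name are placeholders):
# name each snapshot by download time instead of relying on file.1, file.2
wget -O "pricelist.$(date +%Y%m%d%H%M%S)" http://example.com/pricelist.csv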
Hi there,
wget -r is very useful to slurp and archive entire web sites, I use it all the
time when working with other web designers remotely. However, images declared
in CSS rules are ignored by the robot if they are not seen elsewhere in the
HTML pages. Therefore, to mirror the site properly,
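One workaround sketch (assuming the mirrored CSS sits under one known
base URL; the names below are placeholders):
# pull url(...) references out of the mirrored CSS and fetch them with -i
base=http://site.example/css/
grep -hoE 'url\([^)]+\)' mirror/css/*.css \
  | sed -e 's/^url(//' -e 's/)$//' -e "s/[\"']//g" \
  | awk -v b="$base" '{ print (/^https?:/ ? $0 : b $0) }' > css-urls.txt
wget -x -i css-urls.txt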
From: Daniel Clarke - JAS Worldwide
I'd like to suggest a feature for WGET: the ability to download a file
and then delete it afterwards.
Assuming that you'd like to delete it on the FTP server, and not
locally, the basics of this seem pretty easy to add:
0. Documentation.
1. Some
Hi
I'd like to suggest a feature for WGET: the ability to download a file
and then delete it afterwards.
At the moment I use this tool as part of a batch script that downloads
all the waiting files from a remote server using wget, then quits wget,
and uses Windows' FTP.exe to delete all the
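A sketch of that batch approach with standard Unix tools instead of
FTP.exe (host, user, password, and directory are placeholders):
# fetch everything, then delete it on the server in a scripted ftp session
wget -nd "ftp://user:pass@ftp.example.com/outbox/*"
ftp -inv ftp.example.com <<'EOF'
user user pass
cd outbox
mdelete *
bye
EOF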
First, thanks for this great tool.
Perhaps the following features would be helpful (for me they are):
--strict-level: download data only from the given depth level
-include should override -np. At the moment wget doesn't accept the
include directory when it is at a higher level and -np is set.
- filtering
Hello,
How about adding an option to display the md5 and/or sha1 signatures of
files that wget downloads?
These signatures can be calculated in real-time as each file is
downloaded, and so would not require much extra I/O or cpu, but having
the signatures shown right away would help people
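Until then, a workaround sketch that computes the checksum in the same
pass (URL and file name are placeholders):
# tee the download through md5sum so no second read of the file is needed
wget -q -O - http://example.com/big.iso | tee big.iso | md5sum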
Hello,
as far as I can see, wget always prints the final data transfer speed
in autodetected units. I think it would be useful (and I guess also
simple to add) an option, which would tell wget to always print the
speed in bytes per second (for example) so that it is always nicely
parsable no
I notice that the logging output that wget provides only includes a time
stamp, but not date. So when using the -a for appending output to a log
file, the time of execution is logged but you have no idea which dates it
ran. Seems very odd. This applies to both -nv and -v options.
If -O output file and -N are both specified, it seems like there should be some
mode where
the tests for noclobber apply to the output file, not the filename that exists
on the remote machine.
So, if I run
# wget -N http://www.gnu.org/graphics/gnu-head-banner.png -O foo
and then
# wget -N
From: Mitch Silverstein
If -O output file and -N are both specified [...]
When -O foo is specified, it's not a suggestion for a file name to
be used later if needed. Instead, wget opens the output file (foo)
before it does anything else. Thus, it's always a newly created file,
and hence
John McCabe-Dansted wrote:
Wget has no way of verifying that the local file is
really a valid prefix of the remote file
Couldn't wget redownload the last 4 bytes (or so) of the file?
For a few bytes per file we could detect changes to almost all
compressed files and the majority
On 9/15/06, Mauro Tortonesi [EMAIL PROTECTED] wrote:
reliable detection of changes in the resource to be downloaded would be
a very interesting feature. but do you really think that checking the
last X (< 100) bytes would be enough to be reasonably sure the resource
was (not) modified? what about
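A sketch of the check being proposed, done by hand with existing tools
(file name and URL are placeholders; curl is used for the range request
since wget has trouble with 206 replies, as noted elsewhere in this list):
size=$(stat -c %s file.tar.gz)      # GNU stat
curl -s -r $((size-4))-$((size-1)) -o tail.remote http://example.com/file.tar.gz
tail -c 4 file.tar.gz > tail.local
cmp -s tail.remote tail.local && wget -c http://example.com/file.tar.gz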
Kumar Varanasi wrote:
Hello there,
I am using WGET in my system to download http files. I see that there is no
option to download the file faster with multiple connections to the server.
Are you planning on a multi-threaded version of WGET to make downloads
much faster?
no, there is no
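wget itself opens one connection per file, but independent files can be
fetched in parallel; a sketch (urls.txt is a placeholder list, and -P is
a GNU xargs extension):
xargs -n 1 -P 4 wget -q < urls.txt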
A lot of packages used to build Linux distributions for
the embedded world rely on "wget".
Typically they are based on a Makefile and a
configuration file included by the Makefile.
If a package is to be built, then the package is
downloaded to a directory somewhere.
The makefile will try to
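The pattern such Makefiles typically wrap, as a shell sketch (directory
and package names are placeholders):
DLDIR=dl
test -f "$DLDIR/pkg-1.0.tar.gz" \
  || wget -P "$DLDIR" http://downloads.example.org/pkg-1.0.tar.gz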
Hello,
yesterday I came across wget and I find it a very useful program. I
am mirroring a big site, more precisely a forum. Because it is a forum,
under each post you have a quote action. Because that forum has
20,000 posts it would download them all with action=quote, so I rejected
it with
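The rejection described, as a command sketch (the forum URL is a
placeholder; note that how -R treats query strings has varied between
wget versions):
wget -r -np -R '*action=quote*' http://forum.example.com/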
It may be useful to add a paragraph to the manual which lets users know
they can use the --debug option to see why certain URLs are not followed
(rejected) by wget. It would be especially useful to mention this in
9.1 Robot Exclusion. Something like this:
If you wish to see which URLs are
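The kind of recipe such a paragraph could show, sketched (URL is a
placeholder; the exact debug wording varies by version):
wget --debug -r http://example.com/ 2> wget-debug.log
grep -in robot wget-debug.log    # lines explaining robots.txt rejections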
I saw that the option "-k, --convert-links" makes the links relative to the
root directory, not to the directory where you saved the pages. For example:
if I download a page whose url is www.pageexample.com, the downloaded pages
go in there. But if I use that option, in the pages the links will link to the
Buongiorno! :-)
Sigh. Was hoping that someone who wrote the original
man page format might already have expertise in that area.
It's just arcana (obscure knowledge, not necessarily hard
to learn or use, just not widely known). Are you saying
that you wrote the original, but aren't familiar with
This is not a bug, but we encountered a situation where a server insists on
an accurate FQDN in the Host header, or no header at all. When we have to
access the server from outside the NAT firewall using port forwarding, wget
cannot retrieve the file.
If there's an option to suppress the Host header all
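A workaround sketch: newer wget versions let --header override the Host
header outright (older ones may send it twice); the names and port below
are placeholders:
wget --header "Host: inside.example.com" http://firewall.example.com:8080/file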
Being a computer geek, I tend to like things organized
in tables so options stand out. I took the time to rewrite
the text for the --progress section of the manpage, as
it was always difficult for me to find the values and
differences for the different subtags. Looking at the
--progress=type,
Hello,
On Fri, Aug 26, 2005 at 02:07:16PM +0200, Hrvoje Niksic wrote:
I've applied a slightly modified version of this patch, thanks.
[...] I used elif instead.
thank you, also for correcting my mistake.
Actually, I wasn't aware that shell elif is portable.
(I checked, and it
Stepan Kasal [EMAIL PROTECTED] writes:
1) I removed the AC_DEFINEs of symbols HAVE_GNUTLS, and HAVE_OPENSSL.
AC_LIB_HAVE_LINKFLAGS defines HAVE_LIBGNUTLS and HAVE_LIBSSL, which
can be used instead. wget.h was fixed to expect these symbols.
(You might think your defines are more aptly named,
Hello,
attached please find a patch with several suggestions.
(I'm not sending it to wget-patches, as I'm not sure all the suggestions
will be welcome.)
I'm sure you've already had this suggested, and
I don't know if it will work, due to the complexity of the suggestion,
but is there a way you could implement the capability of wget to download any
file that meets a criterion yet use wildcards (i.e. * or ?) to fill in the
blanks. For example
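For recursive retrieval this already exists: -A/-R accept shell-style
wildcards matched against file names. A sketch (URL and pattern are
placeholders):
wget -r -nd -A 'report-200?-*.pdf' http://example.com/reports/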
Is there an option, or could you add one if there isn't,
to specify that I want wget to write the downloaded html
file, or whatever, to stdout so I can pipe it into some
filters in a script?
Mark Anderson [EMAIL PROTECTED] writes:
Is there an option, or could you add one if there isn't, to specify
that I want wget to write the downloaded html file, or whatever, to
stdout so I can pipe it into some filters in a script?
Yes, use `-O -'.
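For example, feeding a page straight into a filter (URL is a
placeholder; -q keeps wget's own messages out of the pipe):
wget -q -O - http://www.example.com/ | grep -c href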
Stephen Leaf [EMAIL PROTECTED] writes:
parameter option --stdout
this option would print the file being downloaded directly to stdout, which
would also mean that _only_ the file's content is printed: no errors, no
verbosity.
usefulness?
wget --stdout http://server.com/file.bz2 | bzcat
Hello all,
Would it be possible to specify a minimum size for files
to retrieve?
Please add me in the CC list of your replies as I'm
not a subscriber.
Thanks,
Baptiste
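There is no built-in minimum-size filter; a workaround sketch is to ask
for the Content-Length first and fetch only above a threshold (the URL
and the 1 MB limit are placeholders):
url=http://example.com/file.bin
len=$(wget --spider --server-response "$url" 2>&1 \
      | awk 'tolower($1)=="content-length:" {print $2}' | tr -d '\r' | tail -1)
[ "${len:-0}" -ge 1048576 ] && wget "$url"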
hi there :)
it would be nice to have 2 or more downloads at the same time because
some files are big and the host limits the speed...
thanks :)
Sorin
On Sat, Feb 05, 2005 at 02:04:26PM +0200, Sorin wrote:
You could use a multithreaded download manager (example: d4x). Many of
these packages use wget as a
Hello Robert,
On Thursday, September 30, 2004 at 6:36:43 PM +0200, Robert Thomson wrote:
You could try the feature patch posted by
G'day,
It would be really advantageous if wget had a --range command line
argument, that would download a range of bytes of a file, if the
server supports it.
I've tried adding it with --header 'Range: bytes=from-to' but wget has
a problem with the 206 return code, and I can't see a way around
Suggestion to add a switch on timestamps
Dear Sir/Madam:
WGET is popular FTP software for UNIX. But after the files have been downloaded
for the first time, WGET always uses the date and time matching those on the
remote server for the downloaded files. If WGET is executed in a temporary
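A workaround sketch for exactly this (URL and file name are
placeholders): reset the timestamp after wget sets it from the server.
wget http://example.com/data.csv && touch data.csv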
Please put in the wget docs, in at least 2 places: The rc file used
by wget under Windows is actually wgetrc (no prefixed period), not
.wgetrc.
I could not find this info in the docs, and only figured it out by
experimentation.
Chuck
Subject: wget Suggestion: ability to scan ports BESIDE #80, (like 443)
Anyway Thanks for WGET!
What's wrong with wget https://www.somesite.com ?
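Ports other than 80 are already reachable by writing them into the URL;
sketches (the hosts are placeholders):
wget https://www.example.com/             # 443, via HTTPS support
wget http://www.example.com:8080/page     # any explicit port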
hi,
if I observed correctly, wget behaves this way:
errors are classified into two classes: critical and non-critical errors.
when a non-critical error (e.g. a time out) occurs, wget retries, continuing
at the byte the last transmission stopped at
(if configured that way).
if a critical error
it would be great if there was a flag that could be used with -q that
would only give output if there was an error.
i use wget a lot in pcs:
johnjosephbachir.org/pcs
thanks!
john
is -nv (non-verbose) an improvement?
$ wget -nv www.johnjosephbachir.org/
12:50:57 URL:http://www.johnjosephbachir.org/ [3053/3053] - index.html [1]
$ wget -nv www.johnjosephbachir.org/m
http://www.johnjosephbachir.org/m:
12:51:02 ERROR 404: Not Found.
but if you're not satisfied you
great, thanks for the suggestions. yeah i am looking for something that
will be absolutely quiet when there is no error, but i have been using
-nv in the meantime.
john
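Until such a flag exists, a shell-level sketch of "quiet unless error":
capture all output and show it only when wget exits non-zero (URL is a
placeholder).
out=$(wget -nv http://example.com/ 2>&1) || printf '%s\n' "$out" >&2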
Dear Sirs,
thanks for WGet, it's a great tool. I would very much appreciate one more
option: the possibility to get an http page using the POST method instead of GET.
Cheers,
Roman
it's available in the CVS version..
information at:
http://www.gnu.org/software/wget/
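Later releases ship the POST support as --post-data / --post-file; a
usage sketch (URL and form fields are placeholders):
wget --post-data 'user=roman&lang=cz' http://example.com/login.php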
WGET could download only certain files based on file
type. Usually the purpose is to avoid wasting time
on those unrelated files. Actually we don't care
much about small files, such as 1K text files.
Can we just limit the file size? For example, just
take those less than 1M. Generally we get
Hello,
This is not a bug, but could you please add in the manual, after the
sentence
The proxy is on by default if the appropriate environmental
variable is defined.
that this variable is called http_proxy. It is not easy to guess.
Yours,
U. Elias
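For example (the proxy host and port are placeholders):
export http_proxy=http://proxy.example.com:3128/
wget http://www.gnu.org/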
could you add to wget an option to force line feeds
to be 0D0A (CRLF)?
Rsync would be very convenient. You've got my vote on that one.
Erlend Aasland
As a dial-up user, I find it extremely useful to have access to the
full range of cvs functionality whilst offline. Some other projects
provide read-only rsync access to the CVS repository, which allows a
local copy of the repository to be made, not just a checkout of a
particular version.
Hello Danny,
Wednesday, July 17, 2002, 9:19:10 PM, you wrote:
DL interrupt the downloading of a certain file or even
DL branch when downloading a directory tree recursively.
For one file
Stop a download with Ctrl-C, and resume it with :
wget -c http://pwet/file_you_were_downloading
For a
DCA This isn't a bug, but the offer of a new feature. The timestamping
DCA feature doesn't quite work for us, as we don't keep just the latest
DCA view of a website and we don't want to copy all those files around for
DCA each update.
Which brings me to mention two features I've been meaning to
The other thing is more or less ripped from the Windows download manager
FlashGet (but why not). Wouldn't it be useful if wget retrieved a file
under a temporary filename, for instance with the extension .wg! or
something, and renamed it back to the original name after finishing? Two
TL advantages
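A shell-level sketch of the same idea (the names are placeholders):
download under a temporary name and rename only on success.
wget -O file.part http://example.com/file && mv file.part file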
Ivan Buttinoni [EMAIL PROTECTED] writes:
Again I send a suggestion, this time quite easy. I hope it's not
already implemented, else I'm sorry in advance. It would be nice if
wget could use regexps to evaluate what to accept/refuse to download.
The regexp has to work on the whole URL
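Later wget releases gained exactly this as --accept-regex /
--reject-regex, matched against the complete URL; a sketch (URL and
pattern are placeholders):
wget -r --reject-regex '.*action=(quote|edit).*' http://forum.example.com/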
On Tue, Mar 19, 2002 at 12:06:48AM +0100, Fabrice Bauzac wrote:
Maybe there is an easy way of saying hey, SMIL files are like HTML
to wget?
There's an option to set the recognized tag set for html docs. Maybe some
trickery with that, plus --force-html, might do the trick.
--
AlanE
When the `-p' option was designed, I was presented with a user
interface problem. I couldn't quite figure out how to arrange the
options to allow for three cases:
* -p gets stuff from this host only, including requisites.
* -p gets stuff from this host only, but requisites may span hosts.
* everything may span hosts.
Fred's suggestion
Hi!
Once again I think this has nothing to do in the bug list, but, there you
go:
I've toyed with the idea of adding a flag to allow `-p' to span hosts
even when normal download doesn't.
Funny you mention this.
When I first heard about -p (1.7?) I thought exactly that it would default
to that
WGET suggestion
The -H switch/option sets host-spanning. Please provide a way to specify a
different limit on recursion levels for files retrieved from foreign hosts.
-r -l0 -H2
for example would allow unlimited recursion levels on the target host, but
only 2 [additional] levels when a file
It would be nice to have some way to limit the total size of any job, and
have it exit gracefully upon reaching that size, by completing the -k -K
process upon termination, so that what one has downloaded is useful. A
switch that would set the total size of all downloads --total-size=600MB
Hi Fred!
First, I think this would rather belong in the normal wget list,
as I cannot see a bug here.
Sorry to the bug tracers, I am posting to the normal wget List and
cc-ing Fred,
hope that is ok.
To your first request: -Q (Quota) should do precisely what you want.
I used it with -k and it
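A sketch of the combination the reply points at (URL is a placeholder;
-Q accepts k/m suffixes and applies to recursive jobs):
wget -r -k -K -Q 600m http://example.com/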
I'm using wget for a watcher script that I run to monitor some servers and was
thinking that it'd be handy to be able to have the http response code (200,
404, etc) as the return value on exit. Currently having it return 0 for ok
and 1 for not ok is fine, but I can see some instances in the
Hi
Just a suggestion. I'm using wget 1.6.
If using FTP, add an option to download with same file permissions.
Cheers
Michiel
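Later releases gained this for FTP as --preserve-permissions; a usage
sketch (host and path are placeholders):
wget --preserve-permissions ftp://ftp.example.com/pub/file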
Jerome Lapous [EMAIL PROTECTED] writes:
One option that could be interesting is to print the download result
on standard output instead of to a file. It would avoid permission problems
when the same shell is used by multiple users.
Have you tried `-q -O -'?
One small suggestion for a possible later release... a mask for all files..
wget -m http://localhost/*.txt
for example.
Other than that.. all's good =)
Regards..
Total K
hiya!
i'd like to have wget forking into the background as default
(via .wgetrc) but sometimes, e.g. in shell scripts, i need
wget to stay in the foreground, so the script knows when the
file is completely downloaded (well, after wget exits =)
is it possible to implement such a feature?
thanks in
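If the .wgetrc `background = on` setting is used for the default, a
sketch of a per-run override with -e (an assumption to verify, not a
confirmed recipe; URL is a placeholder):
wget -e background=off http://example.com/file    # stays in the foreground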
Dear team. First let me thank you for such a great utility.
After retrieving huge iso files with wget 1.53 and finding them
unusable due to checksum failure, I thought about a possible enhancement
for wget.
I think that the most probable transmission errors will occur just
before a disconnect event,
I have wget v1.5.3 -- don't know if this is the current version or not
but, if so, is there any possibility of a future version that
translates from HTML to text files as netscape is (usually) able to
do? It would be nice to be able to retrieve a text version of a
web page with a script with
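The usual recipe with existing tools, sketched (URL is a placeholder;
lynx must be installed):
wget -q -O - http://www.example.com/ | lynx -dump -stdin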
.. or better, a question?!?
Hi
Sorry for the bad English in advance :-)
I have a problem and I hope you can help me.
I have tried to download some files from an ftp server by using an input
file. The command I used looks like this:
wget -i file
the file looks like this
ftp://user:[EMAIL
Hello,
I'm using wget and prefer it to a number of GUI-programs. It only
seems to me that Style Sheets (css-files) aren't downloaded. Is this
true, or am I doing something wrong? If not, I would suggest that
stylesheets should also be retrieved by wget.
Regards,
Michael
--
Michael Widowitz
Quoting Jonathan Nichols ([EMAIL PROTECTED]):
i guess that `-N' (or `--timestamping') is what you're
hello,
i have a suggestion for the wget program. would it be possible to
have a command line option that, when invoked, would tell wget to
preserve the modification date when transferring the file? the
modification time would then reflect the last time the file was modified
on the remote
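The `-N' option mentioned in the reply, as a usage sketch (URL is a
placeholder):
wget -N http://example.com/file    # local mtime is set from the server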
I suggest two parameters:
- rollback-size
- rollback-check-size
where 0 <= rollback-check-size <= rollback-size
The first is for calculating the beginning of the range (filesize -
rollback-size) and the second is for checking (wget should check the range
[filesize - rollback-size, filesize - rollback-size +
Rollback is useful mainly for checking that the file has not changed:
you check (compare) the downloaded data against your file.
freddy77