Hi,
This is a feature request for Metalink support in wget. Metalink is an
XML format for listing the multiple ways you can get a file along with
checksums. Listing multiple mirrors allows failing over to another
URL. Checksums allow a file to be automatically verified, and repaired
Hi there,
this is a post from November 2004, and yesterday I stumbled over a
situation where I don't want to get redirected.
I tried to script-get worksheets with incremental numbering. Easy, you
would say: 404 or success, so check $?.
The server does however try to be nice to me and always
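For reference, a minimal sketch of the kind of loop being described, assuming a wget new enough to support --max-redirect (the URL pattern and range are made up):

for i in $(seq 1 99); do
    # --max-redirect=0 makes any redirect fail, so $? distinguishes
    # a real worksheet from the server's "helpful" redirect page.
    wget --max-redirect=0 "http://example.com/sheet$i.html" || break
done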
Martin Petricek wrote:
I am using wget 1.10.2 (last time compiled with mingw32 on Windows, but I
use it on other platforms too).
When I download a file from a server, the timestamp is set to match the
file on the server. This may be appropriate for some users, but for me it
is more important to keep the time I downloaded the file,
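A workaround sketch under the current behavior (file name hypothetical): let wget set the server's timestamp, then reset it to the download time with touch.

# touch with no options sets the file's mtime to the current time.
wget http://example.com/report.pdf && touch report.pdf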
It seems pointless to download ALL html content under some
circumstances... esp. if all or most pages contain php links, like
Apache file listings (i.e. ?N=D, ?M=A, etc.). Why not add an option
like --pre-reject or maybe --pre-filter, so we can specify which types
of links to
There seems to be some inconsistency in pattern matching, regarding the
use of the symbol '?'. The manual says "Look up the manual of your shell
for a description of how pattern matching works." Now I understand that
maybe this shouldn't be taken so literally, but I would suggest the
following:
1)
Thanks to Anthony Bryan, from http://www.metalinker.org/, who helped me
give a name to the feature I wanted.
Here is what I said before:
Just to make sure this is received as a feature request...,
it would be nice to enable wget to download parts of files
simultaneously as multiple processes
Here's a refresher on Metalink:
'Metalink makes complex download pages obsolete by replacing long lists of
download mirrors and BitTorrent trackers with a single .metalink file. As
you might have already guessed, a .metalink file is a file that tells a
download manager all the different ways it
Hi,
I think that wget should include a charset declaration in the html
page if one doesn't exist.
The charset of a web page can be found in 2 ways:
- In the http header (example: Content-Type: text/html; charset=ISO-8859-1)
- In the html header (example: <meta http-equiv="Content-Type"
Hi,
I wrote about Metalink a few months ago. It's an XML file that lists
mirrors, p2p links, checksums, and other info for download managers.
aria2 [ http://aria2.sourceforge.net/ ] now supports metalink. aria2 is a
unix command line download utility with resuming and segmented downloading
that also
Hi,
I'm not sure if that is a feature request or a bug.
Wget does not collect all page requisites of a given URL.
Many sites reference components of these sites in cascading style sheets,
but wget does not collect these components as page requisites.
An example:
---
$ wget -q -p -k -nc -x
Hi,
I am impressed with the power of wget and its usefulness is
incredible!
There is one thing it doesn't do quite right, and it could
either be a bug or a needed feature, not sure.
When the -p option is specified, wget happily
downloads htm, css, js, pdf files etc. so that websites are
anthony l. bryan wrote:
Hi,
I realize this may be out of the scope of wget, so I hope I don't offend
anyone.
That's a very interesting proposal, actually. But is metalink a widely
used format to describe resource availability from multiple URLs?
--
Aequam memento rebus in arduis servare
Hi,
I realize this may be out of the scope of wget, so I hope I don't offend
anyone.
I wanted to request support for .metalink files. Metalink is a way to store
ftp/http mirror info along with p2p checksums. (p2p info is dropped if the
client doesn't support it). It's a relatively simple XML
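For readers unfamiliar with the format, a rough sketch of what a Metalink (3.x) file looks like; the file name, hash, and mirror URLs here are invented, and the authoritative schema lives at metalinker.org:

<?xml version="1.0" encoding="UTF-8"?>
<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="example-1.0.iso">
      <verification>
        <hash type="sha1">0123456789abcdef0123456789abcdef01234567</hash>
      </verification>
      <resources>
        <url type="http">http://mirror1.example.com/example-1.0.iso</url>
        <url type="ftp">ftp://mirror2.example.com/example-1.0.iso</url>
      </resources>
    </file>
  </files>
</metalink>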
Hi there,
Here's a nice feature which would help me in a task I'm currently engaged in.
(I'm betting it already exists.)
What I am doing is using mplayer to download a stream .ra file. I would like to
mirror a website and obtain the .ra files as well. 'wget' would be great for this
because it is
I would like the ability to delete downloaded FTP files on the REMOTE
server when complete, just as --delete-after does for local files. I
added the feature to my 1.10.2 source and it seems to work pretty well,
but my coding is not great and I haven't studied the source layout enough
to be
Scott Scriven wrote:
I'd find it useful to guide wget by using regular expressions to
control which links get followed. For example, to avoid
following links based on embedded css styles or link text.
I've needed this several times, but the most recent was when I
wanted to avoid following any
Mauro Tortonesi [EMAIL PROTECTED] writes:
Regex support is planned for the next release of wget. But I was
wondering if we should just extend the existing -A and -R options
instead of creating new ones. What do you think?
It would seriously break backward compatibility. If that is
acceptable,
I'd find it useful to guide wget by using regular expressions to
control which links get followed. For example, to avoid
following links based on embedded css styles or link text.
I've needed this several times, but the most recent was when I
wanted to avoid following any "add to cart" or "buy"
Hi,
We would like to be able to use wget as part of a monitoring script to have
it watch a bunch of clustered web servers. However, we'd like to ignore the
results from libresolv, so we can target wget at a specific IP address, but
we need to fool the web server into returning the proper results
Ray Arachelian [EMAIL PROTECTED] writes:
wget --serverip=192.168.0.10 http://www.blah.com/restofurl server1
With Wget 1.10 and later you can simulate this in a slightly
roundabout way:
wget --header "Host: www.blah.com" http://192.168.0.10/restofurl server1
...
Maybe this should make it into
Hey. It seems that cookie handling doesn't allow using cookies between
sessions.
I want to know if it's possible to change it to save the cookies, and
reuse them on other queries to the same domain?
And I figured that if I get a Location, there's no way to avoid it. Could
there be a feature to
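For the save/reuse half of this, wget already has cookie options; a sketch (file name and URLs are placeholders):

# First session: save cookies, including session-only ones.
wget --save-cookies cookies.txt --keep-session-cookies http://example.com/login
# Later session: replay them against the same domain.
wget --load-cookies cookies.txt http://example.com/data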
I would be happy to see a feature for wget to proceed to download and store
the document sent by the HTTP server even when the status code
isn't 20x. Sometimes, even when an error status is received, the
response body contains some valuable information. I'm especially
interested in getting 500
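For what it's worth, later wget releases grew a flag for exactly this; a sketch assuming a wget new enough to have --content-on-error (URL made up):

# Keep the response body even when the server returns a 4xx/5xx status.
wget --content-on-error http://example.com/failing-page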
On Wednesday 20 April 2005 12:38 am, Oliver Schulze L. wrote:
Hi,
it would be nice if you could tell wget not to download a file if the URL
matches some pattern. The pattern could be shell-like or a regular
expression.
For example, if you do a mirror of an ftp site, and you don't want to
Excellent news!
Thanks Mauro,
will be waiting ;)
Oliver
Mauro Tortonesi wrote:
On Wednesday 20 April 2005 12:38 am, Oliver Schulze L. wrote:
Hi,
it would be nice if you could tell wget not to download a file if the URL
matches some pattern. The pattern could be shell-like or a regular
expression.
Hi,
it would be nice if you could tell wget not to download a file if the URL
matches some pattern. The pattern could be shell-like or a regular
expression.
For example, if you do a mirror of an ftp site, and you don't want to
download
the directories that match the i686 pattern, you should run a
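For the ftp case, wget's existing -X/--exclude-directories comes close already; a sketch (host and path are made up):

# Mirror the site but skip any directory whose path matches *i686*.
wget -m -X '*i686*' ftp://ftp.example.com/pub/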
Hello,
I'm an enthusiastic user of wget, but I miss a feature. If I use wget
like this
wget -N ftp://ftp.server.com/path/to/big_file.txt
time-stamping works great. But if it outputs to somewhere else than
big_file.txt, it won't take the file's time into account. So this
wget -N -O
Would it be possible to add a feature to download shared samba files?
Something like
wget smb://10.132.8.9/Physics/tut1.pdf
Sorry if this has already been requested and rejected for some reason.
The wget manpage says it's a network downloader.
This feature would make wget complete for me.
chetan
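Not wget, but a possible workaround using smbclient from the Samba suite (share and file taken from the example above; -N skips the password prompt):

smbclient //10.132.8.9/Physics -N -c 'get tut1.pdf'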
Hi,
It would be nice to be able to do wget --all-ips http://www.foo.bar and
have wget fetch index.html from each IP address listed in DNS for that
domain. If this feature was extended to only keep unique versions of the
page, that would be even better.
--
Bye,
Pabs
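A sketch of how to approximate --all-ips today, combining dig with the --header Host: trick discussed later in this digest (the domain comes from the example above; output file names are invented):

# Fetch the front page from every A record, keeping one copy per IP.
# Note: dig +short may also emit CNAME targets; filter those if needed.
for ip in $(dig +short www.foo.bar A); do
    wget --header="Host: www.foo.bar" -O "index.$ip.html" "http://$ip/"
done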
Hello,
Probably I am just too lazy, haven't spent enough time to read the man, and
wget can actually do exactly what I want.
If so -- I do apologize for taking your time.
Otherwise: THANKS for your time! :-)
My problem is:
redirects.
I am trying to catch them by using, say, netcat
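One way to catch redirects with wget itself rather than netcat; a sketch assuming a wget with --max-redirect (URL made up):

# Show the server's response headers and refuse to follow the redirect,
# so the Location: header stays visible. -S writes headers to stderr.
wget -S --max-redirect=0 http://example.com/page 2>&1 | grep -i 'Location:'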
The other day I wanted to use wget to create an archive of the entire
www.cpa-iraq.org website. It turns out that http://www.cpa-iraq.org,
http://cpa-iraq.org, http://www.iraqcoalition.org and
http://iraqcoalition.org all contain identical content. Nastily, absolute
links to sub-URLs of all of
Hi Jens,
Thanks for the suggestion. I tried it on a test case. I needed to use -nH
not --cut-dirs to inhibit the hostname directory.
It worked fine for the retrieval phase (i.e. it only retrieved one copy of
each distinct file), but it didn't do what I want when I also used
--convert-links.
Hello, list!
wget's default behavior upon encountering a 404 message is to error
out and simply not fetch that page. It would be useful to have a
flag which, when specified, would actually download the 404 message
given to wget, rather than just failing out. I imagine that the flag
could
It sounds to me like you could do the equivalent with a simple shell
script. For example:
while read url
do
    wget --limit-rate=2k "$url"
    # Your commands go here.
done < URL-LIST-FILE
The feature I would like is:
Execute a script or command only after -i download is complete.
As a dialup user I like the limit speed option and will be using it as
part of some other things I am working on!
I just wrote a little script I call apt-slow which obtains a list of URIs
and downloads
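A minimal sketch of the run-a-command-after -i idea at the shell level (file and script names are placeholders):

# Run the follow-up command only if the whole batch succeeded.
wget -i urls.txt && ./after-download.sh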
Hrvoje Niksic wrote:
Have you seen the rest of the discussion? Would it do for you if Wget
correctly handled something like:
wget --header='Host: jidanni.org' http://216.46.192.85/
I think that is an elegant solution.
Tony
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Dan Jacobson [EMAIL PROTECTED] writes:
But I want a
--second-guess-the-dns=ADDRESS
Aside from `--second-guess-the-dns' being an awful name (sorry), what
is the usage scenario for this kind of option? I.e. why would anyone
want to
Come to think of it, I've had need for this before; the switch makes
at least as much sense as `--bind-address', which I've never needed
myself.
Maybe `--connect-address' would be a good name for the option? It
would nicely parallel `--bind-address'.
Are there any takers to implement it?
Herold Heiko [EMAIL PROTECTED] writes:
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Maybe `--connect-address' would be a good name for the option? It
would nicely parallel `--bind-address'.
I was wondering if it should be possible to pass more than one name
to address change (for
On Mon, 17 Nov 2003, Hrvoje Niksic wrote:
Come to think of it, I've had need for this before; the switch makes at
least as much sense as `--bind-address', which I've never needed myself.
Maybe `--connect-address' would be a good name for the option? It would
nicely parallel
On Sun, 16 Nov 2003, Hrvoje Niksic wrote:
You can do this now:
wget http://216.46.192.85/
Using DNS is just a convenience after all, not a requirement.
Unfortunately, widespread use of name-based virtual hosting made it a
requirement in practice. ISPs typically host a bunch of web
Maciej W. Rozycki [EMAIL PROTECTED] writes:
Hmm, couldn't --header "Host: hostname" work? I think it could,
but now wget appends it instead of replacing its own generated
one...
It's not very hard to fix `--header' to replace Wget-generated
values.
Is there consensus that this is a good
By the way, I did edit /etc/hosts to do one experiment
http://groups.google.com/groups?threadm=vrf7007pbg2136%40corp.supernews.com
i.e. [EMAIL PROTECTED]
to test an IP/name combination, without waiting for DNS servers to update.
Good thing I was root so I could do it.
I sure hope that when one sees
P == Post, Mark K [EMAIL PROTECTED] writes:
P You can do this now:
P wget http://216.46.192.85/
P Using DNS is just a convenience after all, not a requirement.
but then one doesn't get the HTTP Host field set to what he wants.
Dan Jacobson [EMAIL PROTECTED] writes:
I sure hope that when one sees
Connecting to jidanni.org[216.46.192.85]:80... connected.
that there is no interference along the way, that that IP is really
where we are going, to wget's best ability.
I can guarantee that much -- the entire point of
H It's not very hard to fix `--header' to replace Wget-generated
H values.
H Is there consensus that this is a good replacement for
H `--connect-address'?
I don't want to tamper with headers.
I want to be able to do experiments leaving all variables alone except
for IP address. Thus
Dan Jacobson [EMAIL PROTECTED] writes:
But I want a
--second-guess-the-dns=ADDRESS
Aside from `--second-guess-the-dns' being an awful name (sorry), what
is the usage scenario for this kind of option? I.e. why would anyone
want to use it?
Perhaps the user should do all this in the
Post, Mark K [EMAIL PROTECTED] writes:
You can do this now:
wget http://216.46.192.85/
Using DNS is just a convenience after all, not a requirement.
Unfortunately, widespread use of name-based virtual hosting made it a
requirement in practice. ISPs typically host a bunch of web sites on
I see there is
--bind-address=ADDRESS
When making client TCP/IP connections, bind() to ADDRESS on the local
machine.
ADDRESS may be specified as a hostname or IP address. This option can be
useful
if your machine is bound to multiple IPs.
But I want a
You can do this now:
wget http://216.46.192.85/
Using DNS is just a convenience after all, not a requirement.
Mark Post
-Original Message-
From: Dan Jacobson [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 15, 2003 4:00 PM
To: [EMAIL PROTECTED]
Subject: feature request: --second
Sent: Tuesday, June 17, 2003 11:13 PM
To: Aaron S. Hawley
Cc: [EMAIL PROTECTED]
Subject: Re: Feature Request: Fixed wait
--- Aaron S. Hawley [EMAIL PROTECTED] wrote:
how is your request different than --wait ?
I'm not in a position to verify right now, and it's been
a while
I'd like to request an additional (or modified) option
that waits for whatever time is specified by the user, no
more, no less (instead of the linear backoff of
--waitretry which is just a slightly less obnoxious
form of hammering). Looks like it would only take a
few lines of code but I can't figure
how is your request different than --wait ?
On Mon, 16 Jun 2003, Wu-Kung Sun wrote:
I'd like to request an additional (or modified) option
that waits for whatever time is specified by the user, no
more, no less (instead of the linear backoff of
--waitretry which is just a slightly less obnoxious
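For contrast, what --waitretry does today, per the manual: with --waitretry=10, wget waits 1 second after the first retry on a file, 2 after the second, and so on up to 10. A sketch (URL made up):

wget --waitretry=10 http://example.com/file.iso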
Hello all -
I tried to look through the archives, but couldn't find anything related to
what I need. We use wget for mirroring a dynamically generated site and it
works very well :)
Many of the pages we download contain queries, so a URL looks like:
http://somehost/page.html?a=v1&b=v2,
Sherwood Botsford wrote:
I wish I had a file exclude option.
I'm behind a firewall that doesn't allow ftp, so I have to find
sites that use http for file transfer.
I'm currently trying to install cygwin on my local LAN.
To do that, I'm using wget to mirror the remote site locally,
as I have a very slow transfer
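wget's existing -R/--reject may already cover the file-exclude case over http; a sketch (host and patterns are made up):

# Mirror the tree but skip ISO images and anything matching *-src*.
wget -m -R '*.iso,*-src*' http://mirror.example.com/cygwin/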
Hi!
Wget 1.5.3 uses /robots.txt to skip some parts of a web site. But it
doesn't use the <META NAME="ROBOTS" CONTENT="NOFOLLOW"> tag, which serves
the same purpose.
I believe that Wget must also parse and use <META NAME="ROBOTS" ...>
tags.
WBR
Stas
Greetings.
I have a frequent need to drop files into the /incoming directory on the
ftp server of a company I work with. Would it be possible to add an
oxymoron option, such that one could do:
wget --put file.gz ftp://server.company.com/incoming/file.gz
and have wget do an anonymous
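wget itself has no upload mode; the usual workaround for this exact task is curl's -T (server and path taken from the example above):

# Anonymous FTP upload of file.gz into /incoming.
curl -T file.gz ftp://server.company.com/incoming/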
Greetings:
There does not appear to be a way to refuse redirects in wget. This
is a problem because certain sites use local click-count CGIs which
return redirects to advertisers. A common form is
http://desired.web.site/clickcount.cgi?http://undesired.advertiser.site/,
which produces a
On 2002-06-29 21:09 -0400, Dang P. Tran wrote:
I use the -i option to download files from a URL list. The
server I use has a password that changes often. When I have a
large list, if the password changes while I'm downloading and gives a
401 error, I want wget to stop to prevent hammering the site
Hi,
I use the -i option to download files from a URL list.
The server I use has a password that changes often. When I have a large
list, if the password changes while I'm downloading and gives a 401
error, I want wget to stop to prevent hammering the site with a bad
password.
Thanks,
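A shell-level stop-on-failure, in the same spirit as the read-loop shown earlier in this digest (list file name is a placeholder):

# Abort the whole batch as soon as one URL fails (e.g. on a 401).
while read url; do
    wget "$url" || break
done < url-list.txt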
Hi,
I'm a happy user of GNU software, including the handy tool wget.
Unfortunately I'm missing a feature: When using wget for ftp downloads, I
cannot specify username and password, as I can when using it to download via
http - it's hardcoded, using anonymous and some password containing the
Hi,
I want to use wget in a cron script to regularly download stuff and back it up.
I don't want to get bugged by cron with the output from wget (Time URL:
xxx...) so I tried using the quiet mode -q.
This was fine until some error on the server occurred - I didn't spot it
because wget was too
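A cron-friendly sketch that stays quiet on success but surfaces failures, since cron only mails when a job produces output (URL made up):

wget -q http://example.com/backup.tar.gz || echo 'wget: download failed' >&2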
Hi Frederic!
I'd like to know if there is a simple way to 'mirror' only the images
from a gallery (i.e. without thumbnails).
[...]
I won't address the options you suggested, because I think they should
be evaluated by a developer/coder.
However, as I often download galleries (and have some
On Wed, Apr 24, 2002 at 07:21:06AM +0200, Frederic Lochon (crazyfred) wrote:
Hello,
I'd like to know if there is a simple way to 'mirror' only the images
from a gallery (i.e. without thumbnails).
Depends on what you'd call easy, I guess; apart from Jens' suggestion, you
might want to take a
It also seems these options are incompatible:
--continue with --recursive
This could be useful, imho.
JR How should wget decide if it needs to re-get or continue the file?
JR You could probably do smart guessing, but the chance of false decisions
JR persists.
Not wanting to repeat my post
Hi Brix!
It also seems these options are incompatible:
--continue with --recursive
[...]
JR How should wget decide if it needs to re-get or continue the file?
[...]
Brix:
Not wanting to repeat my post from a few days ago (but doing so
nevertheless), the one way without checking all files
Hello,
I'd like to know if there is a simple way to 'mirror' only the images
from a gallery (i.e. without thumbnails).
Maybe a new feature could be useful. This could be done in the following
ways:
- mirroring only images that are a link
- mirroring only 'last' links from a tree
- a more general
Subject: Re: Feedback feature request
On Mon, Apr 22, 2002 at 04:07:10PM +1000, [EMAIL PROTECTED]
wrote:
definition download points onto LAN servers. What I would like
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]
Sent: Monday, April 22, 2002 9:14 AM
To: [EMAIL PROTECTED]
Subject: Re: Feedback feature request
Thanks for the pointer. It looks like a good product
On Fri, Dec 07, 2001 at 12:00:44PM +0300, Valery Kondakoff wrote:
AE Run wget in the background and then do a 'tail -f logfile' in your
AE console window. Problem solved.
Thank you for your answer.
Unfortunately, I'm running Windows and your suggestion doesn't work on
this platform. :(
See
On Fri, Dec 07, 2001 at 12:09:27AM +0300, Valery Kondakoff wrote:
Hello, people at wget-list!
First of all let me thank you for your great utility.
Can I suggest a feature, that may further enhance wget?
I'm pretty unhappy with current wget logging possibilities, because I
need to log all
"Mordechai T. Abzug" wrote:
Sometimes, I run wget in background to download a file that will take
hours or days to complete. It would be handy to have an option for
wget to send me mail when it's done, so I can fire and forget.
Thanks!
- Morty
wget comes from the *nix world where
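The reply presumably continues toward the standard *nix composition; a sketch (URL and address are made up):

# Fire and forget: run in the background, send mail when the download ends.
( wget http://example.com/big.iso; echo 'wget finished' | mail -s 'wget done' you@example.com ) &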