Hi Brian,
maybe this helps:
--html-extension
If a file of type application/xhtml+xml or text/html is downloaded and
the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option
will cause the suffix .html to be appended to the local filename. This
is useful, for instance, when
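A complete invocation might look like this (just a sketch; the URL is a made-up placeholder, not from the original post):

wget --html-extension http://example.com/catalog.php

With that option the output is saved locally as catalog.php.html, so it is recognized as HTML afterwards.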
Hi list!
sadly I couldn't find Allan's e-mail address (maybe because I'm
attached via the news gateway), so this is a list-only post.
Micah Cowan wrote:
Hi Allan,
You'll generally get better results if you post to the mailing list
(wget@sunsite.dk). I've added it to the recipients list.
Coombe,
Liz Labbe wrote:
I just downloaded Wget and am trying to get it to work
under the Windows Vista operating system.
I cannot get it to connect: for example, I tried
wget http://www.yahoo.com/
wget force_html=on http://www.yahoo.com/
etc.
I consistently get the message
connecting to
Alexandru Tudor Constantinescu wrote:
I have the feeling wget is not really able to figure out which files
to download from some web sites, when css files are used.
That's right. As of wget 1.11 (released yesterday) there is no
support for parsing links out of CSS files.
Wayne Connolly wrote:
Micah,
Thanks mate- i know we chatted on IRC but just thought someone else may
be able to provide some insight.
Cheers and thanks,
Wayne
Well, it's just impossible to do this with wget, because wget only gets
the output a client would get when visiting your page with no
Micah Cowan schrieb:
Nichlas wrote:
Hi, I'm new to the list.
I'm currently trying to download about 600 PDFs linked to from
individual HTML pages on a site.
Problem is that when the PDFs get downloaded, they get names like
Hi,
we know this. This was just recently discussed on the mailing list and I
agree with you.
But there are two arguments why this is not the default:
a) It's quite a new feature for wget; making it the default would break
compatibility with prior versions, and any old script would need to be
rewritten.
b) It's
Tony Godshall wrote:
If it was me, I'd have it back off to 95% by default and have options
for more aggressive behavior, like multiple connections, etc.
I don't like a default back-off rule. I often encounter downloads whose
speed changes frequently. The idea that the
Micah Cowan wrote:
I'm kinda leaning toward the idea that we change the parser for
--limit-rate to something that takes a percentage, instead of adding a
new option. While it probably means a little extra coding, it handily
deals with broken cases like people specifying both --limit-rate and
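For reference, the released option takes an absolute rate, not a percentage; the percentage form is only being discussed in this thread. A sketch with made-up values:

wget --limit-rate=50k http://example.com/big.iso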
Micah Cowan schrieb:
Matthias Vill wrote:
I would appreciate having a --limit-rate N% option.
So now about those broken cases: you could do a least-of-both policy
(which would of course still need time for measuring and could only cut
afterwards).
Or otherwise you could use a non
Josh Williams schrieb:
On 10/15/07, Micah Cowan [EMAIL PROTECTED] wrote:
Note that this doesn't help him much if he's lost his registration e-mail.
Patrick, you'll probably have to go bug the staff at www.dotsrc.org, who
hosts this list; send an email to [EMAIL PROTECTED]
E-mail *address*
Micah Cowan schrieb:
Josh Williams wrote:
On 8/22/07, Micah Cowan [EMAIL PROTECTED] wrote:
What would be the appropriate behavior of -R then?
I think the default option should be to download the html files to
parse the links, but it should discard them afterwards if they do not
match the
Matthias Vill schrieb:
Micah Cowan schrieb:
Josh Williams wrote:
On 8/22/07, Micah Cowan [EMAIL PROTECTED] wrote:
What would be the appropriate behavior of -R then?
I think the default option should be to download the html files to
parse the links, but it should discard them afterwards
Barnett, Rodney schrieb:
-Original Message-
From: Matthias Vill [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 23, 2007 1:54 AM
To: wget@sunsite.dk
Subject: Re: -R and HTML files
Micah Cowan schrieb:
Josh Williams wrote:
On 8/22/07, Micah Cowan [EMAIL PROTECTED] wrote
Hi Josh,
the manual reads:
--spider
When invoked with this option, Wget will behave as a Web spider, which
means that it will not download the pages, just check that they are
there. For example, you can use Wget to check your bookmarks:
wget --spider --force-html -i
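A complete command might look like this (the input file name is a placeholder; the archive cut the original line off):

wget --spider --force-html -i bookmarks.html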
exceed memory.
I also guess that loading the whole 15M of a maximum-sized HTML file at
once looks ugly in memory and can lead to problems when you have
multiple wget processes at once.
Maybe wget should be optimized for HTML files with a max size of 4M and
parse anything larger in chunks.
Greetings
Matthias Vill
Hi Karthik,
Is there any way to find the file name of the downloaded file?
Of course there is. Normally wget will transform the last segment of
the URL into the filename by changing characters not supported by your
filesystem to similar characters, e.g. "?" is stored as %3F, which is
the
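As an illustration (the URL is invented):

wget "http://example.com/get.php?id=42"

The last URL segment becomes the local name, so the file is saved as get.php?id=42, with the "?" escaped (for example as %3F) where the filesystem does not allow it.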
Hi Jaymz,
Jaymz Goktug YUKSEL wrote:
process.php?no=0
and when this executes and finishes, it redirects itself to
process.php?no=1, and this is supposed to go on until it reaches 650.
However, when it gets to 19 it stops and says
Maximum redirection (20) reached. So it cannot
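If your wget is recent enough (1.11 and newer have a --max-redirect option), raising the limit may be an alternative; a sketch, with the host and the count chosen arbitrarily:

wget --max-redirect=700 "http://server/process.php?no=0"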
So you would suggest handling it in such a way that when I use
wget --base=/some/serverdir http://server/serverdir/
a link matching /.* will be interpreted as /some/.*, so if there is a
link like /serverdir/ it would go back to /some/serverdir, right?
I guess this would be OK. Just one question: if there is a link back
I think I got your point:
All in all this is still a matter of comparing the first URL against the
current URL and counting the common directories from the left side.
Then you compare that number (a) to the depth of the first URL (b) and
prepend b-a "../" so you get to the right position inside your base.
By that
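A small worked example (paths invented for illustration): if the first URL is /serverdir/docs/start.html, its depth is b = 2 directories (serverdir, docs); a current URL of /serverdir/img/logo.png shares a = 1 directory with it, so you prepend b - a = 1 "../" to land back at the right level inside your base.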
Mauro Tortonesi schrieb:
On Mon, 9 Jul 2007 15:06:52 +1200
[EMAIL PROTECTED] wrote:
wget under win2000/win XP
I get "No such file or directory" error messages when using the
following command line.
wget -s --save-headers
http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1class=Arc;
%1 = 212BI
Hi Miroslaw,
AFAIK there is no option that makes wget delete files it did not
download in that run. So if you want to use wget for mirroring, you
would need to re-download the whole server to a local dir and then do
something like
--- CUT here ---
cd new_dir
find . > ../new_files
cd ..
cp -af
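For completeness, a minimal sketch of that idea (the directory and file names are assumptions, since the snippet above is cut off):

wget -m -P new_dir http://server/
( cd new_dir && find . -type f | sort ) > new_files
( cd old_dir && find . -type f | sort ) > old_files
comm -23 old_files new_files   # paths only in the old copy, i.e. candidates for deletion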
Mario Ander schrieb:
Hi everybody,
I think there is a bug storing cookies with wget.
See this command line:
C:\Programme\wget\wget --user-agent=Opera/8.5 (X11;
U; en) --no-check-certificate --keep-session-cookies
--save-cookies=cookie.txt --output-document=-
--debug
Matthias Vill schrieb:
Mario Ander schrieb:
Hi everybody,
I think there is a bug storing cookies with wget.
See this command line:
C:\Programme\wget\wget --user-agent=Opera/8.5 (X11;
U; en) --no-check-certificate --keep-session-cookies
--save-cookies=cookie.txt --output-document
I don't know rapidshare, but I would guess it's either that you need to
supply wget with a rapidshare login cookie (have a look at your
browser's stored cookies for the rapidshare website),
or you can try adding user:password@ to your URL, like
http://user:[EMAIL PROTECTED]/file.ext
Hope this helps
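If the site actually uses HTTP authentication, the credentials can also be passed via options instead of in the URL (host, user and password here are placeholders):

wget --http-user=user --http-password=secret http://host/file.ext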
Hi there,
this is a post from November 2004, and yesterday I stumbled over a
situation myself where I don't want to get redirected.
I tried to script-download worksheets with incremental numbering. Easy,
you would say: 404 or success, so check $?.
The server does however try to be nice to me and always