Hello,
I'm trying to use the --directory-prefix=prefix option for wget on a
Windows system. My prefix has spaces in the path directories. Wget
appears to terminate the path at the first space encountered. In other
words, if my prefix is: c:/my prefix/ then wget copies files to c:/my.
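If the Windows shell is splitting the argument at the space before wget
ever sees it, quoting the whole option usually helps. A minimal sketch;
the URL is just a placeholder:

  wget --directory-prefix="c:/my prefix/" http://example.com/file.html

The short form -P "c:/my prefix/" behaves the same way.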
Frank McCown wrote:
It would be great if wget had a way of limiting the amount of time it
took to run, so it won't accidentally hammer on someone's web server for
an indefinite amount of time. I often need to let a crawler run for a
while on an unknown site, and I have to manually kill wget.
I think that a combination of --limit-rate and --wait parameters makes
this type of enhancement unnecessary, given that his stated purpose was
to not "hammer" a particular site.
Mark Post
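A sketch of that combination (the numbers are arbitrary; tune them for
the site):

  wget --limit-rate=20k --wait=2 -r http://example.com/

This caps transfer at roughly 20 KB/s and sleeps two seconds between
retrievals, so even a long crawl stays gentle on the server.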
From what I understand, killing wget processes may result in resource
leaks. If wget could self-terminate, it could do so more gracefully and
note the termination in its log file. Plus, writing scripts to launch
and kill processes is certainly not trivial.
Frank
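For what it's worth, the wrapper can be a one-liner on systems that ship
GNU coreutils' timeout(1) (an assumption; it isn't available everywhere):

  timeout 2h wget -r -o crawl.log http://example.com/

wget is still killed by a signal rather than stopping gracefully,
though, so this doesn't address the log-file point.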
Using these parameters will only slow the process of crawling a site.
The purpose is not only to avoid wasting a web server's resources but
also to quit when it appears the site is taking too long to download.
Since wget doesn't appear to contain any logic for detecting crawler
traps, a timed shutdown would at least bound how long a crawl can run.
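Until something like that exists, the current options can at least
bound a crawl by depth and by volume, if not by wall-clock time; a
sketch with arbitrary limits:

  wget -r -l 5 -Q 100m http://example.com/

Here -l caps the recursion depth and -Q makes wget stop retrieving once
the download quota is exceeded.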
I'm trying to use wget to do the following:
1. retrieve a single page
2. convert the links in the retrieved page to their full, absolute
addresses.
3. save the page with a file name that I specify
I thought this would do it:
wget -k -O test.html http://www.google.com
However, it doesn't convert the links.
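If the culprit is -k interacting badly with -O (some wget versions skip
link conversion when -O is given, because conversion is keyed to each
file's natural local name), a workaround sketch is to let wget pick the
name and rename afterwards:

  wget -k http://www.google.com/
  mv index.html test.html

This assumes the page is saved as index.html, which is what wget does
for a bare / URL.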
Frank McCown <[EMAIL PROTECTED]> writes:
> From what I understand, killing wget processes may result in resource
> leaks.
Really? What kind of resource leaks are you referring to? Wget does
not create temporary files, nor does it allocate external resources
other than dynamically allocated memory and network connections, both
of which will be released by the operating system when the process exits.
Hello. I would like to have a database within wget. The database
would let wget know what it has downloaded earlier. Wget could
download only new and changed files, and could continue the download
without keeping the old downloads on my disk.
The database would also be accessible to other programs.
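Part of this already exists as wget's timestamping: with -N it
re-downloads only files newer than the local copies, e.g.

  wget -N -r http://example.com/

It is not the database described above, though; -N needs the old files
still on disk to compare against.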
> 1. retrieve a single page
That worked.
> 2. convert the links in the retrieved page to their full, absolute
> addresses.
My "wget -h" output (Wget 1.10.2a1) says:
-k, --convert-links make links in downloaded HTML point to local files.
Wget 1.9.1e says:
-k, --convert-links
Steven M. Schweda wrote:
Not anything about converting relative links to absolute. I don't see
an option to do this automatically.
From the wget man page for --convert-links:
...if a linked file was downloaded, the link will refer to its local
name; if it was not downloaded, the link will refer to its full
Internet address rather than presenting a broken link.
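If the manual is right about that, fetching just the single page with -k
should leave every link absolute, since nothing else was downloaded. A
quick way to check (URL and pattern are only illustrative):

  wget -k http://www.google.com/
  grep -o 'href="[^"]*"' index.html | head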