I know I have a similar need, but wget doesn't work for me because the
links I need to follow are CGI, and wget doesn't follow CGI links. I
have tried several things, but so far I've had no luck getting things
to work.

-- 
<URL: http://wiki.tcl.tk/ > Indescribable, uncontainable, all
powerful, untameable
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.
<URL: mailto:[EMAIL PROTECTED] > <URL: http://www.purl.org/NET/lvirden/ >
 


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Konstantin
Khomoutov
Sent: Wednesday, March 29, 2006 10:30 AM
To: [email protected]
Subject: Help! How to capture HTML pages on the PC?


On Wed, Mar 29, 2006 at 03:56:30PM +0200, Enzo Telatin wrote:

> Can somebody help me?
> I would like to capture some HTML pages (with their links) from the
> Web onto the PC, so that I can edit/prepare/format these pages with
> an HTML editor for Plucker Desktop. A lot of space is wasted if you
> capture these pages directly with Plucker Desktop, and they are not
> so easy to read on the PDA.

I'm not quite sure what exactly you mean by "a lot of space", so let me
guess.

1) Have you played with the "maximum depth" parameter in your Plucker
spider? It's possible that it's set too high and you are simply fetching
too many linked pages. Try setting it to 2 (meaning "this page and all
pages it links to").

2) wget is your best friend for fetching sites. On Unices it is either
already installed or readily available, and good ports to Win32 exist
(google.com?q=wget+win32). To get you up quickly, this incantation
usually works:

$ wget -nd -nc -np -nH -k -r -l 2 http://www.site.com/root.html

which means:
* don't create URL-path-named intermediate directories;
* don't clobber existing files (on multiple runs);
* don't go up in the URL path if a link leads there;
* don't create host-named directories (recursion stays on the starting
  host by default);
* convert links in fetched files so that they reference each other;
* go recursively...
* ...with level 2 (this page and all pages it references).
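For readability, the same invocation can be spelled with GNU wget's long
options; this is just a sketch of the command above (the site URL is a
placeholder, and the flags assume a reasonably recent GNU wget):

```shell
# Mirror a page and everything it links to, two levels deep,
# into the current directory, rewriting links for local browsing.
wget --no-directories \
     --no-clobber \
     --no-parent \
     --no-host-directories \
     --convert-links \
     --recursive \
     --level=2 \
     http://www.site.com/root.html
```

The resulting flat directory of HTML files can then be cleaned up in an
HTML editor before feeding it to the Plucker desktop.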

Hope it helps.

_______________________________________________
plucker-list mailing list
[email protected]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
