On Thu, Dec 30, 2010 at 23:24, g4Ur4v <[email protected]> wrote:
> Is there a way to extract all the HTML pages from a website into a
> single HTML/PDF file? For example, the Dive into HTML5 website
> (www.diveintohtml5.org) contains the book in various HTML pages. Is
> there a way to combine them all into a single file?

You can mirror a site using the wget utility:

    wget --continue --mirror --convert-links --backup-converted \
         --page-requisites --timestamping --verbose --random-wait \
         http://www.diveintohtml5.org/

This will mirror the complete site into your current working directory.
Moreover, you can sync any updates by adding a cron job to your crontab,
if you prefer. The following entry re-runs the mirror every Sunday at
midnight:

    0 0 * * 0 wget --continue --mirror --convert-links --backup-converted --page-requisites --timestamping --verbose --random-wait http://www.diveintohtml5.org/

Arjun S R <[email protected]>
College Of Engineering, Trivandrum <http://www.cet.ac.in/home.php>
Facebook: http://www.facebook.com/Arjun.S.R
Twitter: http://twitter.com/Arjun_S_R
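Since the original question asked for a single HTML/PDF file: once the
site is mirrored, one possible approach is to feed the mirrored pages to
a tool such as wkhtmltopdf, which accepts multiple input HTML files and
writes them into one PDF. This is only a sketch, assuming wkhtmltopdf is
installed; the chapter filenames below are illustrative, since the real
names and their reading order depend on the site's table of contents:

    # Run from the directory wget created for the mirror.
    cd www.diveintohtml5.org

    # Render the listed pages, in order, into a single PDF.
    # Replace these filenames with the site's actual chapter pages;
    # a plain *.html glob would sort alphabetically, not in book order.
    wkhtmltopdf index.html introduction.html semantics.html \
        dive-into-html5.pdf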
