Re: [SLUG] Spider a website

2008-06-02 Thread Ycros
You could use wget to do this; it's installed on most distributions by default. Usually you'd run it like this: wget --mirror -np http://some.url/ (the -np tells it not to recurse up to the parent, which is useful if you only want to mirror a subdirectory; I add it out of habit). It's not always perfect, however, as it can sometimes mess the URLs up, but it's worth a try anyway.
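For concreteness, a slightly fuller sketch along the same lines; the URL is a placeholder, and the extra flags (-E and -p) are additions for the static-hosting case rather than part of Ycros's original command:

    # --mirror: recursive download with timestamping, unlimited depth
    # -np: don't ascend into the parent directory
    # -E: save e.g. page.php as page.php.html so a static server will serve it
    # -p: also fetch the images/CSS/JS each page needs to render properly
    wget --mirror -np -E -p http://some.url/subdir/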

Re: [SLUG] Spider a website

2008-06-02 Thread Richard Heycock
Excerpts from Peter Rundle's message of Tue Jun 03 14:20:08 +1000 2008: I'm looking for some recommendations for a *simple* Linux based tool to spider a web site and pull the content back into plain html files, images, js, css etc. I have a site written in PHP which needs to be hosted

Re: [SLUG] Spider a website

2008-06-02 Thread Jonathan Lange
On Tue, Jun 3, 2008 at 2:20 PM, Peter Rundle [EMAIL PROTECTED] wrote: I'm looking for some recommendations for a *simple* Linux based tool to spider a web site and pull the content back into plain html files, images, js, css etc. I have a site written in PHP which needs to be hosted

Re: [SLUG] Spider a website

2008-06-02 Thread Robert Collins
On Tue, 2008-06-03 at 14:20 +1000, Peter Rundle wrote: I'm looking for some recommendations for a *simple* Linux based tool to spider a web site and pull the content back into plain html files, images, js, css etc. I have a site written in PHP which needs to be hosted temporarily on a

[SLUG] Spider a website

2008-06-02 Thread Peter Rundle
I'm looking for some recommendations for a *simple* Linux based tool to spider a web site and pull the content back into plain html files, images, js, css etc. I have a site written in PHP which needs to be hosted temporarily on a server which is incapable (read: only does static content). This

Re: [SLUG] Spider a website

2008-06-02 Thread Ycros
On 03/06/2008, at 3:19 PM, Mary Gardiner wrote: On Tue, Jun 03, 2008, Ycros wrote: It's not always perfect however, as it can sometimes mess the URLs up, but it's worth a try anyway. The -k option to convert any absolute paths to relative ones can be helpful with this (depending on what you meant by "mess the URLs up").

Re: [SLUG] Spider a website

2008-06-02 Thread Mary Gardiner
On Tue, Jun 03, 2008, Ycros wrote: It's not always perfect however, as it can sometimes mess the URLs up, but it's worth a try anyway. The -k option to convert any absolute paths to relative ones can be helpful with this (depending on what you meant by "mess the URLs up"). -Mary
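To make that concrete, here's a sketch combining -k with the flags mentioned earlier in the thread; the URL is a placeholder, and -E is an extra suggestion for the PHP-to-static case rather than something Mary proposed:

    # -k (--convert-links): rewrite links in the downloaded pages to point at
    #   the local copies, so the mirror is browsable offline
    # -E: save PHP pages with a .html extension to match the rewritten links
    wget --mirror -np -k -E http://some.url/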

Re: [SLUG] Spider a website

2008-06-02 Thread Daniel Pittman
Peter Rundle [EMAIL PROTECTED] writes: I'm looking for some recommendations for a *simple* Linux based tool to spider a web site and pull the content back into plain html files, images, js, css etc. Others have suggested wget, which works very well. You might also consider 'puf': Package:

Re: [SLUG] Spider a website

2008-06-02 Thread James Polley
wget-smubble-yew-get. Wget works great for getting a single file or a very simple all-under-this-tree setup, but it can take forever. Try httrack - http://www.httrack.com/. Ignore the pretty little screenshots; the Linux command-line version does the same job, it just requires more command-line-fu.
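For anyone who hasn't used it, a typical httrack invocation looks something like the sketch below; the URL, output directory, and filter pattern are placeholders, so check httrack's own documentation for the exact filter syntax:

    # mirror the site into /tmp/mirror, only following links on the same host
    httrack "http://some.url/" -O /tmp/mirror "+some.url/*" -v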