Re: R: Using Nutch for only retrieving HTML

2009-09-30 Thread O. Olson
Best regards,
Magnus

On Tue, Sep 29, 2009 at 10:25 PM, Susam Pal susam@gmail.com wrote:
On Wed, Sep 30, 2009 at 1:39 AM, O. Olson olson_...@yahoo.it wrote:
Sorry for pushing this topic, but I would like to know if Nutch would help me get the raw HTML in my situation described

R: Using Nutch for only retrieving HTML

2009-09-29 Thread O. Olson
Sorry for pushing this topic, but I would like to know if Nutch would help me get the raw HTML in my situation described below. I am sure it would be a simple answer for those who know Nutch. If not, then I guess Nutch is the wrong tool for the job.

Thanks,
O. O.

--- Thu 24/9/09, O. Olson

Using Nutch for only retrieving HTML

2009-09-24 Thread O. Olson
Hi, I am new to Nutch. I would like to crawl an internal website completely and retrieve all of its HTML content. I don't intend to do any further processing with Nutch. The website and its content are rather large. By crawl, I mean that I would go to a page, download/archive the HTML, get the