RE: where nutch store crawled data

beansproud Tue, 17 Jun 2008 06:57:34 -0700

Oh, you are right.
Thanks.


POIRIER David wrote:
> 
> When executing a crawl, Nutch creates segments, based on the crawl
> depth if I'm not mistaken, in which the fetched content is stored. For
> example, if crawling a web site named site-xyz into the directory
> $nutch_home/crawls/crawl-xyz, you will find the segments in the
> following directory: $nutch_home/crawls/crawl-xyz/segments. In each
> segment directory you will find a content directory.
> 
> To be honest, I don't think you can directly access the stored content
> found in those directories, the idea being to index it and not
> necessarily store it.
> 
> David
> 
> 
> 
> -----Original Message-----
> From: beansproud [mailto:[EMAIL PROTECTED] 
> Sent: Monday, 16 June 2008 16:42
> To: [email protected]
> Subject: where nutch store crawled data
> 
> 
> Hi,
>     I'm new to Nutch. When I use Nutch to crawl pages, I can get the
> crawled data by using the command: nutch readseg.
>     My question is: can I get the data directly? I just can't find where
> Nutch puts it.
>     Can anybody tell me?
>     Thanks very much!
> -- 
> View this message in context:
> http://www.nabble.com/where-nutch-store-crawled-data-tp17865961p17865961.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
> 
> 
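
For reference, the directory layout David describes can be inspected with a
short script. This is only a minimal sketch: the function name list_segments
is hypothetical, and the crawl directory path is whatever was passed to the
crawl (crawls/crawl-xyz in David's example). It only lists the segment
directories and their subdirectories; it does not try to read the stored
content, which is still best viewed through nutch readseg.

```python
from pathlib import Path


def list_segments(crawl_dir):
    """Map each segment directory name to the sorted names of its subdirectories.

    Assumes the layout described in the thread: <crawl_dir>/segments/<segment>/...
    Returns an empty dict if there is no segments directory.
    """
    segments_root = Path(crawl_dir) / "segments"
    if not segments_root.is_dir():
        return {}

    layout = {}
    for segment in sorted(p for p in segments_root.iterdir() if p.is_dir()):
        # Each segment is expected to hold a "content" directory,
        # possibly alongside others; list whatever is actually there.
        layout[segment.name] = sorted(
            child.name for child in segment.iterdir() if child.is_dir()
        )
    return layout
```

Calling list_segments("crawls/crawl-xyz") from $nutch_home would return one
entry per segment, each with the names of the directories found inside it.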

-- 
View this message in context: 
http://www.nabble.com/where-nutch-store-crawled-data-tp17865961p17905486.html
Sent from the Nutch - User mailing list archive at Nabble.com.
