Hi Joshua,

I'm using Nutch only to fetch and parse before sending the results to a Solr 
instance. During my steps, such as creating a new segment from the crawldb, 
fetching, parsing, and reinserting new URLs into the crawldb, I can give any 
arbitrary path to a segment directory, so renaming would not be a problem at 
all in my case. I don't know if this holds if you use Nutch's built-in 
indexer, though.
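
For illustration only, a rename along those lines could be sketched as 
follows. This is not part of Nutch: the function name rename_segments and 
the "segment" prefix are hypothetical, and the sketch assumes the segment 
directories carry timestamp-style names, so that sorting by name keeps 
them in chronological order. Whether Nutch's built-in indexer accepts the 
renamed directories is untested here.

```python
import tempfile
from pathlib import Path


def rename_segments(segments_dir, prefix="segment"):
    """Rename each subdirectory of segments_dir to prefix1, prefix2, ...

    Directories are handled in sorted name order. Returns a list of
    (old_name, new_name) pairs. Hypothetical helper, not a Nutch API.
    """
    root = Path(segments_dir)
    dirs = sorted(p for p in root.iterdir() if p.is_dir())

    # First pass: move everything to temporary names, so a directory
    # that is already called e.g. "segment2" cannot be clobbered.
    staged = []
    for i, path in enumerate(dirs, start=1):
        tmp = root / f".renaming-{i}"
        path.rename(tmp)
        staged.append((path.name, tmp, root / f"{prefix}{i}"))

    # Second pass: move the temporary names to their final names.
    renamed = []
    for old_name, tmp, target in staged:
        tmp.rename(target)
        renamed.append((old_name, target.name))
    return renamed


if __name__ == "__main__":
    # Demo on a throwaway directory with made-up segment names.
    with tempfile.TemporaryDirectory() as d:
        for name in ("20100511194700", "20100510120000"):
            (Path(d) / name).mkdir()
        for old, new in rename_segments(d):
            print(old, "->", new)
```

Note that plain name sorting only matches crawl order for the dated 
names; once directories are called segment1 ... segment10, a second run 
would order them differently.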

By the way, I delete the newly created segment directories anyway after they've 
been sent to Solr.

Cheers,
 
-----Original message-----
From: Joshua J Pavel <[email protected]>
Sent: Tue 11-05-2010 19:47
To: [email protected]; 
Subject: Renaming segments?

Hi everyone!

I crawl often, and move my crawl to a different server to serve out the 
results, replacing the previous crawl's filesystem.  This can quickly lead 
to inactive segments accruing on the server running the web portion.

I would like to rename my segments to a standard, non-dated format (e.g., 
segment1, segment2, segment3, ...) to make it more portable.  Is this 
possible?

Thanks!
-Josh 
