liv wrote:
> I intend to use Nutch with a fairly complex structure of subcollections. I
> did some tests and the storage/search performs as expected; however, there is
> one aspect I may have neglected and cannot find an answer to.
> 
> How/at which stage are subcollections added to the index structure?

If you are talking about the subcollections generated by the
subcollection plugin, then the subcollection data is stored during the
indexing phase.

> I plan on crawling frequently, adding new sites to an existing repository,
> merging/reindexing as needed. However, if I need to change the subcollection
> structure (i.e. add a site to a newly created subcollection) I don't want to
> recrawl it. I hope it can be done by simply using the existing crawled
> data.

No need to recrawl. Unfortunately, you still need to reindex, because the
subcollection data is written into the index at indexing time.
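
To illustrate why reindexing is enough, here is a rough Python sketch of the
idea. It is a toy model only, not Nutch's code or API: the names CRAWLED,
build_index, old_defs and new_defs are made up for the example, and matching
by URL prefix is an assumption about how subcollections are defined. The
point is just that the subcollection tag is computed from the URL when the
index is built, so the already-crawled data can be reused as-is.

```python
from typing import Dict, List

# Hypothetical stand-in for already-crawled data (URL -> fetched text).
# Nothing below fetches anything; this data is only read.
CRAWLED: Dict[str, str] = {
    "http://a.example/page": "alpha",
    "http://b.example/page": "beta",
}


def build_index(crawled: Dict[str, str],
                subcollections: Dict[str, List[str]]) -> List[dict]:
    """Build a toy index, tagging each document at indexing time.

    A document belongs to a subcollection if its URL starts with one of
    that subcollection's prefixes (an assumption made for this sketch).
    """
    index = []
    for url, text in crawled.items():
        names = sorted(
            name
            for name, prefixes in subcollections.items()
            if any(url.startswith(prefix) for prefix in prefixes)
        )
        index.append({"url": url, "content": text, "subcollection": names})
    return index


# Original definitions: only one subcollection exists.
old_defs = {"first": ["http://a.example/"]}
# Changed definitions: a site is added to a newly created subcollection.
new_defs = {"first": ["http://a.example/"], "second": ["http://b.example/"]}

old_index = build_index(CRAWLED, old_defs)
# "Reindex": same crawled data, new definitions, no recrawl.
new_index = build_index(CRAWLED, new_defs)
```

In this model the old index keeps its old tags until build_index is run
again, which is the reindexing step.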

--
 Sami Siren
