Aha, thanks for the clarification!  :-)   The mergesegs command has a -i
option to index the output segment.  Perhaps the SegmentSlicer command
could be modified to optionally index the output segments, too?

New question: aside from slicing URLs by a Perl5 pattern, is there a way
to slice an index by the MD5 hash of the URL?  I'd like to split my big
index evenly and deterministically into smaller indexes, and applying a
Perl5 pattern to every URL sounds expensive.  :-)
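
To make the idea concrete, here is a rough sketch of the assignment I have
in mind.  It is plain Python rather than anything from Nutch; the names
slice_for_url and partition_urls are made up for illustration, and it only
shows how a URL would be mapped to a slice, not how the Lucene index itself
would be split.

```python
import hashlib


def slice_for_url(url: str, num_slices: int) -> int:
    """Map a URL to a slice number in [0, num_slices) via its MD5 digest.

    Hypothetical helper, not part of Nutch.
    """
    digest = hashlib.md5(url.encode("utf-8")).digest()
    # Treat the first 8 bytes of the digest as an unsigned integer and
    # reduce it modulo the slice count. The same URL always lands in the
    # same slice, and MD5 output spreads URLs roughly evenly.
    return int.from_bytes(digest[:8], "big") % num_slices


def partition_urls(urls, num_slices):
    """Group URLs into num_slices lists according to slice_for_url."""
    slices = [[] for _ in range(num_slices)]
    for url in urls:
        slices[slice_for_url(url, num_slices)].append(url)
    return slices
```

The assignment depends only on the URL and the number of slices, so it
would be repeatable across runs, but changing the number of slices would
move most URLs to a different slice.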

Thanks,
DaveG


-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 20, 2005 10:39 AM
To: [email protected]
Subject: Re: [Nutch-dev] distributed search

Goldschmidt, Dave wrote:

>Hi Rafi,
>
>Not sure if anyone answered this, but I think you're just after the
>segslice command:
>
>$ nutch segslice
>
>  
>

If I understand the original request, that's only half of the answer, 
but the right half.. ;-)

segslice doesn't slice the Lucene indexes, only the segment data. So, 
after you slice the segments you need to re-index them. Sorry.

I believe it's possible to do the index slicing, it's just not 
implemented yet...

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

