Aha, thanks for the clarification! :-) The mergesegs command has a -i option to index the output segment. Perhaps the SegmentSlicer command could be modified to optionally index the output segments, too?
New question: aside from slicing URLs by a Perl5 pattern, is there a way to slice an index by the MD5 hash of the URL? I'd like to split my big index evenly and deterministically into smaller indexes, and applying a Perl5 pattern sounds expensive. :-)

Thanks,
DaveG

-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 20, 2005 10:39 AM
To: [email protected]
Subject: Re: [Nutch-dev] distributed search

Goldschmidt, Dave wrote:

>Hi Rafi,
>
>Not sure if anyone answered this, but I think you're just after the
>segslice command:
>
>$ nutch segslice
>

If I understand the original request, that's only half of the answer, but the right half.. ;-) segslice doesn't slice the Lucene indexes, only the segment data. So, after you slice the segments you need to re-index them. Sorry.

I believe it's possible to do the index slicing, it's just not implemented yet...

--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
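
For reference, the MD5-based slicing asked about above could be sketched roughly as follows. This is only an illustration of the idea, not an existing Nutch feature: the class name UrlHashPartitioner and the method partition are hypothetical, and the sketch only computes which slice a URL would fall into. It does not read or write segments or Lucene indexes.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Nutch): maps a URL to one of
// numSlices buckets using the MD5 hash of the URL string.
class UrlHashPartitioner {

    // Returns a slice number in the range [0, numSlices).
    // The same URL always maps to the same slice, and MD5 output
    // spreads URLs roughly evenly across the slices.
    static int partition(String url, int numSlices) {
        if (numSlices <= 0) {
            throw new IllegalArgumentException("numSlices must be positive");
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(url.getBytes(StandardCharsets.UTF_8));
            // Interpret the 128-bit digest as a non-negative integer
            // and reduce it modulo the number of slices.
            return new BigInteger(1, digest)
                    .mod(BigInteger.valueOf(numSlices))
                    .intValue();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Example URLs are made up for illustration.
        String[] urls = {
            "http://example.com/",
            "http://example.com/a.html",
            "http://example.org/b.html"
        };
        for (String url : urls) {
            System.out.println(partition(url, 4) + "\t" + url);
        }
    }
}
```

Compared with a Perl5 pattern, this costs one hash computation per URL, and the slice assignment depends only on the URL string and the number of slices, so re-running it gives the same split.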
