Aha, thanks for the clarification! :-) The mergesegs command has a -i option to index the output segment. Perhaps the SegmentSlicer command could be modified to optionally index the output segments, too?
New question: aside from slicing URLs by a Perl5 pattern, is there a way to slice an index by the MD5 hash of the URL? I'd like to split my big index evenly and deterministically into smaller indexes, and applying a Perl5 pattern to every URL sounds expensive. :-)

Thanks,
DaveG

-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 20, 2005 10:39 AM
To: [email protected]
Subject: Re: [Nutch-dev] distributed search

Goldschmidt, Dave wrote:

>Hi Rafi,
>
>Not sure if anyone answered this, but I think you're just after the
>segslice command:
>
>$ nutch segslice
>
>

If I understand the original request, that's only half of the answer, but the right half.. ;-)

segslice doesn't slice the Lucene indexes, only the segment data. So, after you slice the segments you need to re-index them. Sorry.

I believe it's possible to do the index slicing, it's just not implemented yet...

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
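To make the MD5 question above concrete, here is a rough sketch of what I have in mind. It is not an existing Nutch command or API: the class name UrlHashPartitioner and the method sliceFor are made up for illustration, and it uses only the standard JDK MessageDigest class. Each URL is hashed with MD5 and the digest is reduced modulo the number of slices, so the same URL always lands in the same slice and the slices should come out roughly equal in size.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not part of Nutch): maps a URL to a slice number. */
public class UrlHashPartitioner {

    /** Returns a slice number in [0, numSlices) derived from the MD5 of the URL. */
    public static int sliceFor(String url, int numSlices) {
        if (numSlices <= 0) {
            throw new IllegalArgumentException("numSlices must be positive");
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(url.getBytes(StandardCharsets.UTF_8));
            // Treat the 16-byte digest as a non-negative integer,
            // then reduce it modulo the number of slices.
            BigInteger value = new BigInteger(1, digest);
            return value.mod(BigInteger.valueOf(numSlices)).intValue();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.sigram.com",
            "https://lists.sourceforge.net/lists/listinfo/nutch-developers"
        };
        for (String url : urls) {
            System.out.println(sliceFor(url, 4) + "\t" + url);
        }
    }
}
```

The hash is cheap compared to a regular-expression match per URL, and the assignment depends only on the URL string and the slice count, so re-running the split gives the same result. Changing the slice count reassigns most URLs, which would mean re-slicing everything.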
