Hi,

Maybe you can implement the SegmentMergeFilter interface to filter records while the segments are being merged.
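
A rough sketch of the per-domain check such a filter could apply is below. It is plain Java and deliberately does not touch the Nutch API: the class name DomainKeepSketch and the methods hostOf and keep are hypothetical names made up for illustration. In a real plugin this logic would be called from the filter method of org.apache.nutch.segment.SegmentMergeFilter, using the record's URL key; please check the Javadoc of your Nutch version for the exact signature and for how the plugin is registered.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Nutch: decides whether a URL belongs
// to the one domain we want to keep in the merged output.
final class DomainKeepSketch {

    // Returns the lower-cased host of the URL, or null if it cannot be parsed
    // or has no host component.
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // True if the URL's host is the domain itself or one of its subdomains.
    // A merge filter would return this value to keep or drop the record.
    static boolean keep(String url, String domain) {
        String host = hostOf(url);
        if (host == null) {
            return false; // unparseable URLs are dropped in this sketch
        }
        String d = domain.toLowerCase();
        return host.equals(d) || host.endsWith("." + d);
    }
}
```

Note that a filter like this keeps a single domain per merge run, so it would still mean one mergesegs run per domain, as Markus describes below.
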

On Wed, Mar 6, 2013 at 6:02 AM, Markus Jelsma <[email protected]> wrote:

> Hi
>
> You can't do this with -slice but you can merge segments and filter them.
> This would mean you'd have to merge the segments for each domain. But
> that's far too much work. Why do you want to do this? There may be better
> ways of achieving your goal.
>
> -----Original message-----
> > From: Jason S <[email protected]>
> > Sent: Tue 05-Mar-2013 22:18
> > To: [email protected]
> > Subject: keep all pages from a domain in one slice
> >
> > Hello,
> >
> > I seem to remember seeing a discussion about this in the past but I
> > can't seem to find it in the archives.
> >
> > When using mergesegs -slice, is it possible to keep all the pages from a
> > domain in the same slice? I have just been messing around with this
> > functionality (Nutch 1.6), and it seems like the records are simply split
> > after the counter has reached the slice size specified, sometimes splitting
> > the records from a single domain over multiple slices.
> >
> > How can I segregate a domain to a single slice?
> >
> > Thanks in advance,
> >
> > ~Jason

--
Don't Grow Old, Grow Up... :-)

