Hi,

I just committed a high-level API for working with segment data. The classes are in the net.nutch.segment.* package.

SegmentReader offers a superset of the functionality of the DumpSegment tool, so I'm removing that tool. Thanks to John Xing for providing the initial implementation! I'll update the command-line script shortly to reflect these changes.

SegmentReader also offers a function to fix partially corrupted segments.

SegmentSlicer provides a function to restructure your segment collection, i.e. to take a few input segments and write out a couple of output segments containing the same data.
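
To illustrate the idea of restructuring (this is only a conceptual sketch, not the SegmentSlicer API), the snippet below regroups entries from several in-memory "segments" into new ones. The class SliceSketch, the method reslice and the maxPerOutput limit are all hypothetical names made up for this example; the real classes work on segment data on disk and may split the output differently.

```java
import java.util.ArrayList;
import java.util.List;

public class SliceSketch {
    // Hypothetical helper, NOT part of net.nutch.segment: copies every entry
    // from the input "segments" into new output segments, each holding at
    // most maxPerOutput entries. Entry order is preserved.
    static List<List<String>> reslice(List<List<String>> inputs, int maxPerOutput) {
        if (maxPerOutput <= 0) {
            throw new IllegalArgumentException("maxPerOutput must be positive");
        }
        List<List<String>> outputs = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (List<String> segment : inputs) {
            for (String entry : segment) {
                // Start a new output segment once the current one is full.
                if (current.size() == maxPerOutput) {
                    outputs.add(current);
                    current = new ArrayList<>();
                }
                current.add(entry);
            }
        }
        // Keep the last, possibly partial, output segment.
        if (!current.isEmpty()) {
            outputs.add(current);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<List<String>> inputs = List.of(
                List.of("a", "b", "c"),
                List.of("d"),
                List.of("e", "f"));
        // Three input segments become two output segments with the same data.
        System.out.println(reslice(inputs, 4)); // prints [[a, b, c, d], [e, f]]
    }
}
```

The point is only that the output holds exactly the same entries as the input, just grouped differently.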

I will shortly provide an updated version of SegmentMergeTool (previously known as FastSegmentMergeTool), which makes use of this API.

--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)




_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
