If you run bin/nutch without any arguments or options, it prints its usage:

Usage: nutch [-core] COMMAND
where COMMAND is one of:
  ...
  parse             parse a segment's pages
  invertlinks       create a linkdb from parsed segments
  index             run the indexer on parsed segments and linkdb

There is no need to redo the whole crawl. You only need to reparse and reindex. You may have to delete some of the directories in the segments you want to reparse first (I guess parse_data and parse_text):

reinh...@thord:>ls crawl/segments/20091021095928/
content  crawl_fetch  crawl_generate  crawl_parse  parse_data  parse_text

Regards,
Reinhard

sprabhu_PN wrote:
> We have added a few plug-ins, such as a date-parsing plug-in, that get exercised
> during a Nutch crawl and update a field in each index record. Now we find
> that we need to improve the plug-in and re-run it. Is the only option to
> crawl the whole index once again? Is there any way we can do a recrawl
> that will just exercise the newer versions of the plug-ins and take less time?
>
> Thanks in advance.
>
> Regards
> Shreekanth Prabhu
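
For illustration, the reparse-and-reindex sequence from the reply might look roughly like the sketch below. It is not a tested recipe. It assumes a local filesystem, Nutch 1.0-style argument order for parse, invertlinks and index, and a crawl directory containing crawldb, linkdb and segments. The function names, the DRY_RUN and NUTCH variables, and the new index path are hypothetical. Removing crawl_parse as well goes beyond the guess in the reply and rests on the assumption that the parse step will not run while its earlier output is still present.

```shell
#!/bin/sh
# Sketch only: reparse and reindex one already-fetched segment without
# refetching. Paths and argument order are assumptions (see the text above).

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reparse_and_reindex CRAWL_DIR SEGMENT_NAME NEW_INDEX_DIR
reparse_and_reindex() {
    crawl=$1
    segment=$crawl/segments/$2
    new_index=$3
    nutch=${NUTCH:-bin/nutch}

    # Remove old parse output so the segment can be parsed again.
    # parse_data and parse_text are the guess from the reply; crawl_parse
    # is an extra assumption, since the parse step writes it too.
    for d in crawl_parse parse_data parse_text; do
        if [ -d "$segment/$d" ]; then
            run rm -r "$segment/$d"
        fi
    done

    # Reparse with the updated plug-ins, then rebuild the linkdb and index.
    run "$nutch" parse "$segment" &&
    run "$nutch" invertlinks "$crawl/linkdb" "$segment" &&
    run "$nutch" index "$new_index" "$crawl/crawldb" "$crawl/linkdb" "$segment"
}
```

Setting DRY_RUN=1 and calling reparse_and_reindex crawl 20091021095928 crawl/indexes-new (the last path being a made-up example) prints the commands instead of running them, which is a cheap way to check the paths before deleting anything.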