Hello,

Can I parse already-fetched segments more than once without having to fetch everything again?

When I first tried the "./bin/nutch parse ./path/to/an/already/parsed/segment" command, I got a Java exception explaining that the segment had already been parsed. Indeed, the following subdirectories could be found under the segment directory:

  segment/content
  segment/crawl_fetch
  segment/crawl_generate
  segment/crawl_parse
  segment/parse_data
  segment/parse_text

To try to force the parsing process, I renamed the last three subdirectories (crawl_parse, parse_data and parse_text) to something else and relaunched the "./bin/nutch parse" command. It has been running for more than 24 hours and is still not finished.

My idea is to recreate an index afterwards from the newly parsed segment. Is this the way to do it? Isn't there a simpler, and maybe quicker, way to reparse segments?

Thank you,
David
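
P.S. For reference, the renaming step I describe above amounts to roughly the following. This is only a sketch: the helper name set_aside_parse_output and the ".old" suffix are made up for illustration, and the segment path is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (the name and the ".old" suffix are illustrative only):
# move the parse output of a segment aside so the segment no longer
# looks "already parsed".
set_aside_parse_output() {
    segment="$1"
    suffix="${2:-.old}"
    # These are the three subdirectories I renamed; content, crawl_fetch
    # and crawl_generate are left untouched.
    for dir in crawl_parse parse_data parse_text; do
        if [ -d "$segment/$dir" ]; then
            mv "$segment/$dir" "$segment/$dir$suffix"
        fi
    done
}

# Intended use (placeholder path):
#   set_aside_parse_output ./path/to/segment
#   ./bin/nutch parse ./path/to/segment
```

The old parse output is kept under the new names rather than deleted, so it can be moved back if the new parse fails.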
