Thank you very much! I'm working on that now, but I have a few more questions. I'm not able to run the readseg command. I've been consulting some help forums, and the basic syntax is:

readseg <path of the file with the segments>

I have the segments in this path:

D:\nutch-0.9\crawl-20100420112025\segments

The directory named crawl-20100420112025 is the one where the segments are stored. I've tried to execute the command in the following ways, but none of them works:

readseg d/nutch-0.9/crawl-20100420112025/segments
readseg crawl-20100420112025/segments
readseg crawl-20100420112025
What am I doing wrong? When I try to execute it, I get:

bash: readseg: command not found

Any ideas? Thank you in advance.