Hi, Rajesh. Use the "prune" tool:
./nutch prune /path/to/segments/dir /path/to/file/with/rules
You wrote on 21 June 2006, 20:35:34:

> I would like to delete certain documents from the crawled documents
> depending on certain criteria. Is there a way to achieve this? My guess
> is that Nutch downloads all the files before parsing them.

--
Regards,
Dima
mailto:[EMAIL PROTECTED]

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
