Hi Tri,

The status db_fetched means that these URLs have been fetched and exist within your crawldb. Is this what you are after?
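If it helps, you can check how many URLs are in each state with the readdb command. This is only a rough sketch: the NUTCH_HOME and CRAWLDB values and the crawldb_dump output directory are assumptions for illustration, so adjust them to your own install and crawl directory.

```shell
# Assumed locations (not from the original thread); change to match your setup.
NUTCH_HOME="${NUTCH_HOME:-runtime/local}"
CRAWLDB="${CRAWLDB:-crawl/crawldb}"

if [ -x "$NUTCH_HOME/bin/nutch" ]; then
  # Print crawldb statistics, including a count of URLs per status
  # (db_unfetched, db_fetched, and so on).
  "$NUTCH_HOME/bin/nutch" readdb "$CRAWLDB" -stats

  # Dump every crawldb entry, with its status, as text into crawldb_dump.
  "$NUTCH_HOME/bin/nutch" readdb "$CRAWLDB" -dump crawldb_dump
else
  echo "bin/nutch not found under $NUTCH_HOME" >&2
fi
```

The -stats output should show whether the URLs you expect are counted as db_fetched, and the dump lets you look up the status of an individual URL.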
The CrawlDatum class [1] lists all of the possible states that a URL can be in within your crawldb.

[1] https://svn.apache.org/repos/asf/nutch/trunk/src/java/org/apache/nutch/crawl/CrawlDatum.java

On Mon, Oct 24, 2011 at 12:05 PM, Tri Nguyen <[email protected]> wrote:
> Dear Helpers,
>
> Could you please help me how we do this task in Nutch 1.3?
>
> Thank you so much for your help,
>
> Regards,
>
> Tri Nguyen.

--
*Lewis*

