Thank you for your help.
bin/nutch admin could be useful, but I need something based on the crawl date.

I checked the documentation again, and I think I will use this command:
bin/nutch segread segments/20050922091545 -dump | grep outlink

This way, I will be able to generate reports based on the dates of the crawls.
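
To turn that output into a "most linked pages" report, something like the Python sketch below might work. It is only an illustration: I am assuming that every dump line mentioning "outlink" carries the target URL as plain text somewhere on the line, and the exact layout of the segread dump is not confirmed here. The script name top_outlinks.py and the function names are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: the target URL appears as plain text on each outlink line.
# The real segread dump layout may differ, so adjust the pattern as needed.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def count_outlinks(lines):
    """Count how often each URL appears on lines that mention 'outlink'."""
    counts = Counter()
    for line in lines:
        if "outlink" not in line.lower():
            continue  # skip everything that is not an outlink record
        for url in URL_PATTERN.findall(line):
            counts[url] += 1
    return counts


def top_linked(lines, n=10):
    """Return the n most frequently linked URLs as (url, count) pairs."""
    return count_outlinks(lines).most_common(n)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical): python top_outlinks.py dump.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as dump:
        for url, hits in top_linked(dump, 20):
            print(f"{hits}\t{url}")
```

The dump would first be saved to a file (for example by redirecting the command above to dump.txt) and then passed to the script. Running it once per segment directory gives one report per crawl date.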

Richard Rodrigues
www.Kelforum.com


----- Original Message -----

From: "Michael Ji" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, September 22, 2005 11:28 PM
Subject: Re: Links in a segement


The simplest way is to use bin/nutch admin .. to dump
the webdb. From the dumped text file of null.link, you can
pick out the outlinks for a particular URL (or MD5).

Michael Ji,

--- Richard Rodrigues <[EMAIL PROTECTED]>
wrote:

Hello,

I am developing a search engine for Internet forums
using Nutch.
I would like to create a page with the most linked
pages in the last crawl.

I would like to know if there is a way to get all
outgoing links in a segment, or all the outgoing
links in the db (with a date condition).

Thanks in advance for any suggestions,

Best Regards,

Richard Rodrigues
www.Kelforum.com




