Thank you for your help.
bin/nutch admin could be useful, but I need something based on the crawl
date.
I checked the documentation again, and I think I will use this command:
bin/nutch segread segments/20050922091545 -dump | grep outlink
This way, I will be able to generate reports based on the dates of the
crawls.
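
For the "most linked pages" report, the grep output could be reduced to a ranked list. This is only a rough sketch: it assumes each outlink line in the dump has the form "outlink: toUrl: <url> anchor: <text>", which I have not checked against every Nutch version, and top_outlinks is a made-up helper name, not a Nutch command.

```shell
# Hypothetical helper (not part of Nutch): reads a segment dump on stdin and
# prints "<count><TAB><url>" for the most frequently linked target URLs.
# ASSUMPTION: outlink lines look like "outlink: toUrl: <url> anchor: <text>".
top_outlinks() {
  # $1 = how many URLs to print (defaults to 20)
  grep 'outlink: toUrl: ' \
    | sed -e 's/.*outlink: toUrl: //' -e 's/ anchor:.*//' \
    | LC_ALL=C sort \
    | uniq -c \
    | LC_ALL=C sort -k1,1nr -k2,2 \
    | awk '{ printf "%s\t%s\n", $1, $2 }' \
    | head -n "${1:-20}"
}

# Usage, with the segment from the command above:
#   bin/nutch segread segments/20050922091545 -dump | top_outlinks 10
```

Running it once per segment directory would give one report per crawl, since each segment is dumped separately.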
Richard Rodrigues
www.Kelforum.com
----- Original Message -----
From: "Michael Ji" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, September 22, 2005 11:28 PM
Subject: Re: Links in a segement
The simplest way is to use bin/nutch admin .. to dump the
webdb. From the dumped text file of null.link, you can
pick the outlinks for a particular URL (or MD5).
Michael Ji,
--- Richard Rodrigues <[EMAIL PROTECTED]>
wrote:
Hello,
I am developing a search engine for Internet forums
using Nutch.
I would like to create a page with the most linked
pages in the last crawl.
I would like to know if there is a way to get all
the outgoing links in a segment, or all the outgoing
links in the db (with a date condition).
Thanks in advance for any suggestions,
Best Regards,
Richard Rodrigues
www.Kelforum.com