Sami, thanks a lot. I would like to see a feature where a link leads to a web page showing all already indexed URLs.
That way other spiders can fetch this page and get the URLs that the open source Nutch already has to offer. So we need to start having not only open source code for the engine, but also every node offering an open, downloadable database of URLs. We also need a list of the other Nutch domains. With this list, each Nutch instance can crawl the URLs that the other Nutch instances publish on a web page.

As millions of URLs are a lot, I suggest having 26 pages from a to z: the page for the `word´ "a" displays all URLs for "a" and also carries the 25 links to the pages b-z. Then several Nutch nodes could use a small P2P feature, and the sister project YaCy could also fetch the URLs from a central open source point: all Nutch domains.

Would it be possible to generate a link, somewhere on the Nutch home page of each individual server installation, to a page with all URLs? Open source has to be founded on solidarity, so please make the Nutch URL database open to other open source search engine spiders as well, from central points.

Thanks

-------- Original Message --------
Date: Tue, 16 Jan 2007 17:53:41 +0200
From: Sami Siren <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org
Subject: Next Nutch release

> Hello,
>
> It has been a while since the previous release (0.8.1) and looking at the
> great fixes done in trunk I'd start thinking about baking a new release
> soon.
>
> Looking at the jira roadmaps there is 1 blocking issue (fixing the
> license headers) for 0.8.2 and two other blocking issues for 0.9.0 of
> which I think NUTCH-233 is safe to put in.
>
> The top 10 voted issues are currently:
>
> NUTCH-61 Adaptive re-fetch interval. Detecting unmodified content
> NUTCH-48 "Did you mean" query enhancement/refinement feature
> NUTCH-251 Administration GUI
> NUTCH-289 CrawlDatum should store IP address
> NUTCH-36 Chinese in Nutch
> NUTCH-185 XMLParser is configurable xml parser plugin.
> NUTCH-59 meta data support in webdb
> NUTCH-92 DistributedSearch incorrectly scores results
> NUTCH-68 A tool to generate arbitrary fetchlists
> NUTCH-87 Efficient site-specific crawling for a large number of sites
>
> Are there any opinions about issues that should go in before the next
> release (Answering yes means that you are willing to provide a patch for
> it).
>
> --
> Sami Siren
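
The a-z index pages suggested in the reply at the top of this message could be sketched roughly as below. This is only an illustration under assumptions, not an existing Nutch feature: the function names `bucket_letter` and `build_index_pages`, the file names `a.html` to `z.html`, and the choice to bucket each URL by the first letter of its host name (ignoring a leading `www.`) are all hypothetical, since the message does not say how a URL maps to a letter. The input is assumed to be a plain list of URLs already exported from the node's URL database.

```python
from collections import defaultdict
from html import escape
from string import ascii_lowercase
from urllib.parse import urlsplit


def bucket_letter(url):
    """Return the a-z bucket for a URL, or None if the host does not start with a-z.

    Assumption: URLs are bucketed by the first letter of the host name,
    with a leading "www." ignored.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    first = host[:1]
    return first if first and first in ascii_lowercase else None


def build_index_pages(urls):
    """Build 26 HTML pages (hypothetical names a.html .. z.html).

    Each page lists the URLs of its own letter and links to the other 25 pages,
    so a spider that finds one page can reach all of them.
    """
    buckets = defaultdict(set)
    for url in urls:
        letter = bucket_letter(url)
        if letter is not None:
            buckets[letter].add(url)  # the set removes duplicate URLs

    pages = {}
    for letter in ascii_lowercase:
        # Navigation: links to the 25 sibling pages.
        nav = " ".join(
            f'<a href="{other}.html">{other}</a>'
            for other in ascii_lowercase
            if other != letter
        )
        # One list item per indexed URL, sorted for stable output.
        items = "\n".join(
            f'<li><a href="{escape(u)}">{escape(u)}</a></li>'
            for u in sorted(buckets[letter])
        )
        pages[f"{letter}.html"] = (
            f"<html><body><h1>URLs: {letter}</h1>\n"
            f"<p>{nav}</p>\n<ul>\n{items}\n</ul></body></html>\n"
        )
    return pages
```

Each value in the returned dictionary can be written to a file of the same name in a directory served by the node's web server. URLs whose host starts with a digit (for example plain IP addresses) are skipped in this sketch; a real export would need an extra bucket for them, and probably paging within each letter once the lists grow into the millions.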