[Nutch Wiki] Update of PublicServers by KevinReader
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification.

The following page has been changed by KevinReader:
http://wiki.apache.org/nutch/PublicServers

--
  * [http://campusgw.library.cornell.edu/ Cornell University Library] is collaborating with the research group of Thorsten Joachims to develop a learning search engine for library web pages based on Nutch. The Nutch-based search engine is near the bottom of the page.
  * [http://search.creativecommons.org/ Creative Commons] is a search engine for Creative Commons licensed material.
+
+ * [http://www.dadi360.com/ Dadi360] Uses the Nutch search engine to provide search of Chinese-language websites in North America.
  * [http://www.ecolicommunity.org/Websearch Ecolhub Web Search] an E. coli specific search engine based on Nutch. EcoliHub WebSearch includes only those sites relevant to E. coli, thereby reducing the number of spurious hits. Searches can be optionally limited to your choice of resources. More than 110,000 pages to search. More resources getting added.
Re: [VOTE] Release Apache Nutch 1.0
I tried the latest RC and it seems to be downloading really slow compared to an earlier development version. 5 to 10 times slower actually - 0.6 pages/second per fetcher.

Thanks,
Cosmin

On 3/27/09 9:56 PM, Sami Siren ssi...@gmail.com wrote:

Thanks Andrzej,

This vote has passed, we now have a release with three binding +1 votes from:

-Andrzej Bialecki
-Dennis Kubes
-Sami Siren

I'll finalize the remaining tasks and do the announcement after the package has been mirrored.

ps. we should perhaps create jira issues for all the findings, small and big, so we can take care of them before next release.

--
Sami Siren

Andrzej Bialecki wrote:

Sami Siren wrote:

Hello,

I have packaged the third release candidate for Apache Nutch 1.0 release at
http://people.apache.org/~siren/nutch-1.0/rc2/

See the CHANGES.txt[1] file for details on release contents and latest changes. The release was made from tag:
http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc2/

The following issues that were discovered during the review of last rc have been fixed:

https://issues.apache.org/jira/browse/NUTCH-722
https://issues.apache.org/jira/browse/NUTCH-723
https://issues.apache.org/jira/browse/NUTCH-725
https://issues.apache.org/jira/browse/NUTCH-726
https://issues.apache.org/jira/browse/NUTCH-727

Please vote on releasing this package as Apache Nutch 1.0. The vote is open for the next 72 hours. Only votes from Lucene PMC members are binding, but everyone is welcome to check the release candidate and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast.

[ ] +1 Release the packages as Apache Nutch 1.0
[ ] -1 Do not release the packages because...

+1.

There's a minor issue when using the supplied build.xml to rebuild the sources - there are no conf/*.template files in the package, so Ant fails with an error. Creating an empty conf/dummy.template fixes this. IMHO this is a minor thing, so I vote for releasing the package as is.
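The build workaround described in the +1 message above (create an empty conf/dummy.template when the package ships no conf/*.template files) could be scripted roughly as follows. This is only a sketch: the function name ensure_template_placeholder is hypothetical, and the only detail taken from the thread is the conf/dummy.template path.

```python
from pathlib import Path


def ensure_template_placeholder(conf_dir: Path) -> bool:
    """Create an empty dummy.template in conf_dir if it holds no *.template file.

    Returns True when a placeholder was created, False when at least one
    template file was already present. The function name is hypothetical;
    only the dummy.template file name comes from the mailing-list message.
    """
    # Any existing *.template file means the workaround is unnecessary.
    if any(conf_dir.glob("*.template")):
        return False
    # touch() creates a zero-byte file, matching "an empty conf/dummy.template".
    (conf_dir / "dummy.template").touch()
    return True
```

Run from the root of the unpacked source package, ensure_template_placeholder(Path("conf")) would be called once before invoking Ant; calling it a second time is a no-op.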
[ANNOUNCE] Apache Nutch 1.0
I am pleased to announce the availability of Apache Nutch 1.0.

Apache Nutch, a subproject of Apache Lucene, is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats.

Apache Nutch 1.0 contains a number of bug fixes and improvements such as Solr Integration, new indexing framework and new scoring framework just to mention a few. Details can be found in the changes file:
http://svn.apache.org/repos/asf/lucene/nutch/tags/release-1.0/CHANGES.txt

Apache Nutch is available for download from the following download page:
http://www.apache.org/dyn/closer.cgi/lucene/nutch/nutch-1.0.tar.gz

When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site:
http://www.apache.org/dist/lucene/nutch/KEYS

For more information on Apache Nutch, visit the project home page:
http://lucene.apache.org/nutch

--
Sami Siren (on behalf of the Apache Nutch community)
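For readers who want to script the verification step mentioned in the announcement, one possible sketch is shown below. It assumes GnuPG is installed as gpg, that the KEYS file has already been downloaded from the Apache site, and that a detached signature sits next to the archive under the name nutch-1.0.tar.gz.asc; that signature file name is an assumption and does not appear in the announcement. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def verification_commands(archive: Path, keys_file: Path) -> List[List[str]]:
    """Build the gpg invocations for checking a downloaded archive.

    Assumption: the detached signature is named "<archive>.asc" and lives
    in the same directory as the archive.
    """
    signature = archive.with_name(archive.name + ".asc")
    return [
        # Import the release managers' public keys from the KEYS file.
        ["gpg", "--import", str(keys_file)],
        # Check the detached signature against the downloaded archive.
        ["gpg", "--verify", str(signature), str(archive)],
    ]


def verify(archive: Path, keys_file: Path) -> None:
    """Run each command in order; raises CalledProcessError if gpg fails."""
    for command in verification_commands(archive, keys_file):
        subprocess.run(command, check=True)
```

Calling verify(Path("nutch-1.0.tar.gz"), Path("KEYS")) would import the keys and then check the signature. Keeping command construction separate from execution lets the commands be inspected or logged before anything is run.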