[Nutch Wiki] Update of PublicServers by KevinReader

2009-03-28 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Nutch Wiki for change 
notification.

The following page has been changed by KevinReader:
http://wiki.apache.org/nutch/PublicServers

--
* [http://campusgw.library.cornell.edu/ Cornell University Library] is 
collaborating with the research group of Thorsten Joachims to develop a 
learning search engine for library web pages based on Nutch. The nutch-based 
search engine is near the bottom of the page.
  
* [http://search.creativecommons.org/ Creative Commons] is a search engine 
for creative commons licensed material.
+ 
+   * [http://www.dadi360.com/ Dadi360] Uses the Nutch search engine to provide 
search of Chinese-language websites in North America.
  
* [http://www.ecolicommunity.org/Websearch EcoliHub Web Search] is an E. coli 
specific search engine based on Nutch. EcoliHub WebSearch includes only those 
sites relevant to E. coli, thereby reducing the number of spurious hits. 
Searches can optionally be limited to your choice of resources. More than 
110,000 pages are searchable, and more resources are being added.
  


Re: [VOTE] Release Apache Nutch 1.0

2009-03-28 Thread Cosmin Lehene
I tried the latest RC and it seems to be downloading really slowly compared to 
an earlier development version: 5 to 10 times slower, at 0.6 pages/second 
per fetcher.

Thanks,
Cosmin

On 3/27/09 9:56 PM, Sami Siren ssi...@gmail.com wrote:

Thanks Andrzej,

This vote has passed, we now have a release with three binding +1 votes
from:

-Andrzej Bialecki
-Dennis Kubes
-Sami Siren

I'll finalize the remaining tasks and do the announcement after the
package has been mirrored.

P.S. We should perhaps create JIRA issues for all the findings, small and
big, so we can take care of them before the next release.

--
  Sami Siren



Andrzej Bialecki wrote:
 Sami Siren wrote:
 Hello,

 I have packaged the third release candidate for Apache Nutch 1.0
 release at http://people.apache.org/~siren/nutch-1.0/rc2/

 See the CHANGES.txt[1] file for details on release contents and latest
 changes. The release was made from tag:
 http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc2/

 The following issues that were discovered during the review of last rc
 have been fixed:

 https://issues.apache.org/jira/browse/NUTCH-722
 https://issues.apache.org/jira/browse/NUTCH-723
 https://issues.apache.org/jira/browse/NUTCH-725
 https://issues.apache.org/jira/browse/NUTCH-726
 https://issues.apache.org/jira/browse/NUTCH-727

 Please vote on releasing this package as Apache Nutch 1.0. The vote is
 open for the next 72 hours. Only votes from Lucene PMC members are
 binding, but everyone is welcome to check the release candidate and
 voice their approval or disapproval. The vote passes if at least
 three binding +1 votes are cast.

 [ ] +1 Release the packages as Apache Nutch 1.0
 [ ] -1 Do not release the packages because...

 +1. There's a minor issue when using the supplied build.xml to rebuild
 the sources - there are no conf/*.template files in the package, so Ant
 fails with an error. Creating an empty conf/dummy.template fixes this.
 IMHO this is a minor thing, so I vote for releasing the package as is.
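
 The workaround described above can be sketched as a small script. This is
 only an illustration, not part of the release: the function name
 ensure_template_placeholder and the source directory name are assumptions,
 and the only detail taken from the message is the empty conf/dummy.template
 file.

```python
from pathlib import Path


def ensure_template_placeholder(source_root: str) -> Path:
    """Make sure conf/ holds at least one *.template file.

    Hypothetical helper: if no template exists, it creates an empty
    conf/dummy.template, as suggested in the message above.
    """
    conf_dir = Path(source_root) / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)

    # If the package already ships a template, leave everything untouched.
    existing = sorted(conf_dir.glob("*.template"))
    if existing:
        return existing[0]

    # Otherwise create the empty placeholder file.
    placeholder = conf_dir / "dummy.template"
    placeholder.touch()
    return placeholder
```

 Calling ensure_template_placeholder("nutch-1.0") before running Ant would
 apply the workaround, assuming the sources were unpacked into a directory
 named nutch-1.0.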






[ANNOUNCE] Apache Nutch 1.0

2009-03-28 Thread Sami Siren

I am pleased to announce the availability of Apache Nutch 1.0.

Apache Nutch, a subproject of Apache Lucene, is open source web-search 
software. It builds on Lucene Java, adding web-specifics such as a 
crawler, a link-graph database, and parsers for HTML and other document 
formats.


Apache Nutch 1.0 contains a number of bug fixes and improvements, such as 
Solr integration, a new indexing framework, and a new scoring framework, 
to mention just a few. Details can be found in the changes file:


http://svn.apache.org/repos/asf/lucene/nutch/tags/release-1.0/CHANGES.txt

Apache Nutch is available for download from the following download page:
http://www.apache.org/dyn/closer.cgi/lucene/nutch/nutch-1.0.tar.gz

When downloading from a mirror site, please remember to verify the 
downloads using signatures found on the Apache site:

http://www.apache.org/dist/lucene/nutch/KEYS
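
A signature check of this kind is usually done with GnuPG. The sketch below 
only illustrates the idea: the helper names are hypothetical, it assumes the 
gpg command is installed, that the KEYS file has been saved locally as KEYS, 
and that the detached signature sits next to the archive with an .asc suffix 
(a common convention that the announcement itself does not state).

```python
import subprocess


def verification_commands(archive, signature=None, keys_file="KEYS"):
    """Return the gpg invocations needed to verify a downloaded archive.

    Assumption: the detached signature is named <archive>.asc unless a
    different path is passed in.
    """
    signature = signature or archive + ".asc"
    return [
        ["gpg", "--import", keys_file],             # load the release signing keys
        ["gpg", "--verify", signature, archive],    # check the signature against the archive
    ]


def verify(archive, signature=None, keys_file="KEYS"):
    """Run each command in turn; raises CalledProcessError if gpg reports a failure."""
    for command in verification_commands(archive, signature, keys_file):
        subprocess.run(command, check=True)
```

With those assumptions in place, verify("nutch-1.0.tar.gz") would import the 
keys and then check the archive against its signature.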

For more information on Apache Nutch, visit the project home page:
http://lucene.apache.org/nutch

-- Sami Siren (on behalf of the Apache Nutch community)