(Apologies for cross-posting...)

Good Evening Everyone,

The Apache Nutch PMC are very pleased to announce the release of Apache Nutch v2.0. This release offers users an edition focused on large-scale crawling which builds on storage abstraction (via Apache Gora™) for big data stores such as Apache Accumulo™, Apache Avro™, Apache Cassandra™, Apache HBase™, HDFS™, an in-memory data store and various high-profile SQL stores.

After some two years of development, Nutch v2.0 also offers all of the mainstream Nutch functionality. It builds on Apache Solr™, adding web-specifics such as a crawler, a link-graph database and parsing support handled by Apache Tika™ for HTML and an array of other document formats. Nutch v2.0 shadows the latest stable mainstream release (v1.5.X) based on Apache Hadoop™ and covers many use cases, from small crawls on a single machine to large-scale deployments on Hadoop clusters.

Please see the list of changes made in this version for a full breakdown:
http://www.apache.org/dist/nutch/2.0/CHANGES.txt

A full PMC release statement can be found here:
http://nutch.apache.org/#07+July+2012+-+Apache+Nutch+v2.0+Released

Nutch v2.0 is available in source (zip and tar.gz) from the following download page:
http://www.apache.org/dyn/closer.cgi/nutch/2.0

In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads against their signatures, using the keys found on the Apache site:
http://www.apache.org/dist/nutch/KEYS

For more information on Apache Nutch, visit the project home page:
http://nutch.apache.org

Thank you very much

Lewis John McGibbney
(on behalf of the Apache Nutch community)

--
Lewis