Author: lewismc
Date: Wed Jul 3 19:20:18 2013
New Revision: 1499527
URL: http://svn.apache.org/r1499527
Log:
NUTCH-1599
Modified:
nutch/site/forrest/src/documentation/content/xdocs/downloads.xml
nutch/site/forrest/src/documentation/content/xdocs/index.xml
nutch/site/publish/bot.html
nutch/site/publish/bot.pdf
nutch/site/publish/credits.html
nutch/site/publish/credits.pdf
nutch/site/publish/downloads.html
nutch/site/publish/downloads.pdf
nutch/site/publish/faq.html
nutch/site/publish/faq.pdf
nutch/site/publish/index.html
nutch/site/publish/index.pdf
nutch/site/publish/issue_tracking.html
nutch/site/publish/issue_tracking.pdf
nutch/site/publish/linkmap.html
nutch/site/publish/linkmap.pdf
nutch/site/publish/mailing_lists.html
nutch/site/publish/mailing_lists.pdf
nutch/site/publish/nightly.html
nutch/site/publish/nightly.pdf
nutch/site/publish/old_downloads.html
nutch/site/publish/old_downloads.pdf
nutch/site/publish/sonar.html
nutch/site/publish/sonar.pdf
nutch/site/publish/tutorial.html
nutch/site/publish/tutorial.pdf
nutch/site/publish/version_control.html
nutch/site/publish/version_control.pdf
nutch/site/publish/wiki.html
nutch/site/publish/wiki.pdf
Modified: nutch/site/forrest/src/documentation/content/xdocs/downloads.xml
URL:
http://svn.apache.org/viewvc/nutch/site/forrest/src/documentation/content/xdocs/downloads.xml?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/forrest/src/documentation/content/xdocs/downloads.xml (original)
+++ nutch/site/forrest/src/documentation/content/xdocs/downloads.xml Wed Jul 3 19:20:18 2013
@@ -32,8 +32,8 @@
<p> Apache Nutch 2.2.1 (src-tar and src-zip only) and 1.7 (src-tar,
src-zip, bin-tar and bin-zip) are now available. See
the
- <a href="http://apache.org/dist/nutch/2.1/CHANGES-2.2.1.txt">CHANGES-2.2.1.txt</a>, and
- <a href="http://apache.org/dist/nutch/1.6/CHANGES-1.7.txt">CHANGES-1.7.txt</a>
+ <a href="http://apache.org/dist/nutch/2.2.1/CHANGES-2.2.1.txt">CHANGES-2.2.1.txt</a>, and
+ <a href="http://apache.org/dist/nutch/1.7/CHANGES-1.7.txt">CHANGES-1.7.txt</a>
files for more information on the list of updates in these releases.
</p>
<p> All Apache Nutch distributions is distributed under the <a
href="http://www.apache.org/licenses/LICENSE-2.0.html">Apache License, version
2.0</a>.
Modified: nutch/site/forrest/src/documentation/content/xdocs/index.xml
URL:
http://svn.apache.org/viewvc/nutch/site/forrest/src/documentation/content/xdocs/index.xml?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/forrest/src/documentation/content/xdocs/index.xml (original)
+++ nutch/site/forrest/src/documentation/content/xdocs/index.xml Wed Jul 3 19:20:18 2013
@@ -25,19 +25,30 @@
<body>
<section>
<title>What is Apache Nutch?</title>
- <p>Apache Nutch is an open source web-search software project.
- Stemming from <a href="ext:lucene">Apache Lucene</a>, it now builds
- on <a href="ext:solr">Apache Solr</a> adding web-specifics, such as a crawler,
- a link-graph database and parsing support handled by <a href="ext:tika">Apache Tika</a>
- for HTML and and array other document formats.</p>
+ <p>Apache Nutch is a highly extensible and scalable open source web crawler software project.
+ Stemming from <a href="ext:lucene">Apache Lucene™</a>, the project has diversified and now comprises two
+ codebases, namely:</p>
+ <ol>
+ <li>Nutch 1.x: A well matured, production ready crawler. 1.x enables fine grained configuration,
+ relying on <a href="http://hadoop.apache.org">Apache Hadoop™</a> data structures, which are
+ great for batch processing.</li>
+ <li>Nutch 2.x: An emerging alternative taking direct inspiration from 1.x, but which differs in one key
+ area; storage is abstracted away from any specific underlying data store by using
+ <a href="http://gora.apache.org">Apache Gora™</a> for handling object to persistent mappings. This means
+ we can implement an extremely flexibile model/stack for storing everything (fetch time, status, content,
+ parsed text, outlinks, inlinks, etc.) into a number of NoSQL storage solutions.
+ </li>
+ </ol>
+ <p>
+ Being pluggable and modular of course has it's benefits, Nutch provides extensible interfaces such as Parse,
+ Index and ScoringFilter's for custom implementations e.g. <a href="http://tika.apache.org">Apache Tika™</a> for parsing.
+ Additonally, pluggable indexing exists for <a href="ext:solr">Apache Solr™</a>,
+ <a href="http://www.elasticsearch.org">Elastic Search</a>, etc.
+ </p>
<p>Nutch can run on a single machine, but gains a lot of its
strength from running in a <a href="ext:hadoop">Hadoop cluster</a></p>
- <p>The system can be enhanced (eg other document formats can be
- parsed) using a highly flexible, easily extensible and thoroughly maintained
- plugin infrastructure.</p>
-
<p>You can download Nutch <a href="./downloads.html">here</a>.</p>
<p>For more information about Apache Nutch, please see the <a
@@ -77,7 +88,7 @@
release includes library upgrades to <a
href="http://hadoop.apache.org">Apache Hadoop</a> 1.2.0 and
<a href="http://tika.apache.org">Apache Tika</a> 1.3, it is predominantly
a bug fix for
<a href="https://issues.apache.org/jira/browse/NUTCH-1591">NUTCH-1591 -
Incorrect conversion of ByteBuffer to String</a>.
- Please see the <a href="http://www.apache.org/dist/nutch/2.2.1/2.2.1-CHANGES.txt">list of changes</a> for a full
+ Please see the <a href="http://www.apache.org/dist/nutch/2.2.1/CHANGES-2.2.1.txt">list of changes</a> for a full
breakdown, or see the <a href="http://s.apache.org/PGa">release
report</a>.
As usual in the 2.x series, this release is made available only as
source, but is also available within
<a href="http://search.maven.org/">Maven Central</a>.
Modified: nutch/site/publish/bot.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/bot.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/bot.html (original)
+++ nutch/site/publish/bot.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Apache Nutch robot</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/bot.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/bot.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/credits.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/credits.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/credits.html (original)
+++ nutch/site/publish/credits.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Apache Nutch Credits</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/credits.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/credits.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/downloads.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/downloads.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/downloads.html (original)
+++ nutch/site/publish/downloads.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Nutch Downloads</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
@@ -274,8 +274,8 @@ document.write("Last Published: " + docu
<div class="section">
<p> Apache Nutch 2.2.1 (src-tar and src-zip only) and 1.7 (src-tar, src-zip,
bin-tar and bin-zip) are now available. See
the
- <a href="http://apache.org/dist/nutch/2.1/CHANGES-2.2.1.txt">CHANGES-2.2.1.txt</a>, and
- <a href="http://apache.org/dist/nutch/1.6/CHANGES-1.7.txt">CHANGES-1.7.txt</a>
+ <a href="http://apache.org/dist/nutch/2.2.1/CHANGES-2.2.1.txt">CHANGES-2.2.1.txt</a>, and
+ <a href="http://apache.org/dist/nutch/1.7/CHANGES-1.7.txt">CHANGES-1.7.txt</a>
files for more information on the list of updates in these releases.
</p>
<p> All Apache Nutch distributions is distributed under the <a
href="http://www.apache.org/licenses/LICENSE-2.0.html">Apache License, version
2.0</a>.
Modified: nutch/site/publish/downloads.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/downloads.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/faq.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/faq.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/faq.html (original)
+++ nutch/site/publish/faq.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Apache Nutch FAQ's</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/faq.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/faq.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/index.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/index.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/index.html (original)
+++ nutch/site/publish/index.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Welcome to Apache Nutch™</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
@@ -367,17 +367,32 @@ document.write("Last Published: " + docu
<a name="N1000E"></a><a name="What+is+Apache+Nutch%3F"></a>
<h2 class="h3">What is Apache Nutch?</h2>
<div class="section">
-<p>Apache Nutch is an open source web-search software project.
- Stemming from <a href="http://lucene.apache.org/java/">Apache Lucene</a>, it now builds
- on <a href="http://lucene.apache.org/solr/">Apache Solr</a> adding web-specifics, such as a crawler,
- a link-graph database and parsing support handled by <a href="http://tika.apache.org/">Apache Tika</a>
- for HTML and and array other document formats.</p>
+<p>Apache Nutch is a highly extensible and scalable open source web crawler software project.
+ Stemming from <a href="http://lucene.apache.org/java/">Apache Lucene™</a>, the project has diversified and now comprises two
+ codebases, namely:</p>
+<ol>
+
+<li>Nutch 1.x: A well matured, production ready crawler. 1.x enables fine grained configuration,
+ relying on <a href="http://hadoop.apache.org">Apache Hadoop™</a> data structures, which are
+ great for batch processing.</li>
+
+<li>Nutch 2.x: An emerging alternative taking direct inspiration from 1.x, but which differs in one key
+ area; storage is abstracted away from any specific underlying data store by using
+ <a href="http://gora.apache.org">Apache Gora™</a> for handling object to persistent mappings. This means
+ we can implement an extremely flexibile model/stack for storing everything (fetch time, status, content,
+ parsed text, outlinks, inlinks, etc.) into a number of NoSQL storage solutions.
+ </li>
+
+</ol>
+<p>
+ Being pluggable and modular of course has it's benefits, Nutch provides extensible interfaces such as Parse,
+ Index and ScoringFilter's for custom implementations e.g. <a href="http://tika.apache.org">Apache Tika™</a> for parsing.
+ Additonally, pluggable indexing exists for <a href="http://lucene.apache.org/solr/">Apache Solr™</a>,
+ <a href="http://www.elasticsearch.org">Elastic Search</a>, etc.
+ </p>
<p>Nutch can run on a single machine, but gains a lot of its
strength from running in a <a href="http://hadoop.apache.org/">Hadoop cluster</a>
</p>
-<p>The system can be enhanced (eg other document formats can be
- parsed) using a highly flexible, easily extensible and thoroughly maintained
- plugin infrastructure.</p>
<p>You can download Nutch <a href="./downloads.html">here</a>.</p>
<p>For more information about Apache Nutch, please see the <a
href="http://wiki.apache.org/nutch/">Nutch wiki.</a>
</p>
@@ -386,7 +401,7 @@ document.write("Last Published: " + docu
</div>
-<a name="N10041"></a><a name="Getting+Started"></a>
+<a name="N10056"></a><a name="Getting+Started"></a>
<h2 class="h3">Getting Started </h2>
<div class="section">
<p>To get started, begin here:</p>
@@ -408,7 +423,7 @@ document.write("Last Published: " + docu
</div>
-<a name="N10060"></a><a name="Download+Nutch"></a>
+<a name="N10075"></a><a name="Download+Nutch"></a>
<h2 class="h3">Download Nutch</h2>
<div class="section">
<p>
@@ -417,23 +432,23 @@ document.write("Last Published: " + docu
</div>
-<a name="N1006E"></a><a name="Apache+Nutch+News"></a>
+<a name="N10083"></a><a name="Apache+Nutch+News"></a>
<h2 class="h3">Apache Nutch News</h2>
<div class="section">
-<a name="N10074"></a><a name="02+July+2013+-+Apache+Nutch+v2.2.1+Released"></a>
+<a name="N10089"></a><a name="02+July+2013+-+Apache+Nutch+v2.2.1+Released"></a>
<h3 class="h4">02 July 2013 - Apache Nutch v2.2.1 Released</h3>
<p>The Apache Nutch PMC are pleased to announce the immediate release of
Apache Nutch v2.2.1, we advise all
current users and developers of the 2.X series to upgrade to this release
ASAP. Although this
release includes library upgrades to <a
href="http://hadoop.apache.org">Apache Hadoop</a> 1.2.0 and
<a href="http://tika.apache.org">Apache Tika</a> 1.3, it is predominantly
a bug fix for
<a href="https://issues.apache.org/jira/browse/NUTCH-1591">NUTCH-1591 -
Incorrect conversion of ByteBuffer to String</a>.
- Please see the <a href="http://www.apache.org/dist/nutch/2.2.1/2.2.1-CHANGES.txt">list of changes</a> for a full
+ Please see the <a href="http://www.apache.org/dist/nutch/2.2.1/CHANGES-2.2.1.txt">list of changes</a> for a full
breakdown, or see the <a href="http://s.apache.org/PGa">release
report</a>.
As usual in the 2.x series, this release is made available only as
source, but is also available within
<a href="http://search.maven.org/">Maven Central</a>.
The release is available <a
href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N1009A"></a><a name="24th+June+2013+-+Apache+Nutch+v1.7+Released"></a>
+<a name="N100AF"></a><a name="24th+June+2013+-+Apache+Nutch+v1.7+Released"></a>
<h3 class="h4">24th June 2013 - Apache Nutch v1.7 Released</h3>
<p>The Apache Nutch PMC are extremely pleased to announce the immediate
release of Apache Nutch v1.7. This
release includes over 20 bug fixes, as many improvements; most noticeably
featuring a new
@@ -449,7 +464,7 @@ document.write("Last Published: " + docu
<a href="http://search.maven.org/">Maven Central</a>.
The release is available <a
href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N100CC"></a><a name="08+June+2013+-+Apache+Nutch+v2.2+Released"></a>
+<a name="N100E1"></a><a name="08+June+2013+-+Apache+Nutch+v2.2+Released"></a>
<h3 class="h4">08 June 2013 - Apache Nutch v2.2 Released</h3>
<p>The Apache Nutch PMC are extremely pleased to announce the immediate
release of Apache Nutch v2.2. This
release includes over 30 bug fixes and over 25 improvements representing
the third release of increasingly
@@ -464,7 +479,7 @@ document.write("Last Published: " + docu
<a href="http://search.maven.org/">Maven Central</a>.
The release is available <a
href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N100FA"></a><a name="06+December+2012+-+Apache+Nutch+v1.6+Released"></a>
+<a name="N1010F"></a><a name="06+December+2012+-+Apache+Nutch+v1.6+Released"></a>
<h3 class="h4">06 December 2012 - Apache Nutch v1.6 Released</h3>
<p>The Apache Nutch PMC are extremely pleased to announce the release of
Apache Nutch v1.6. This
release includes over 20 bug fixes, the same in improvements, as well as
new functionalities including a new HostNormalizer,
@@ -476,7 +491,7 @@ document.write("Last Published: " + docu
in this version for a full breakdown. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N10118"></a><a name="05+October+2012+-+Apache+Nutch+v2.1+Released"></a>
+<a name="N1012D"></a><a name="05+October+2012+-+Apache+Nutch+v2.1+Released"></a>
<h3 class="h4">05 October 2012 - Apache Nutch v2.1 Released</h3>
<p>The Apache Nutch PMC are very pleased to announce the release of Apache
Nutch v2.1. This
release continues to provide Nutch users with a simplified Nutch
distribution building on the 2.x
@@ -486,18 +501,18 @@ document.write("Last Published: " + docu
in this version for a full breakdown. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N10136"></a><a name="10+August+2012+-+Happy+10th+Birthday+Apache+Nutch%21%21"></a>
+<a name="N1014B"></a><a name="10+August+2012+-+Happy+10th+Birthday+Apache+Nutch%21%21"></a>
<h3 class="h4">10 August 2012 - Happy 10th Birthday Apache Nutch!!</h3>
<p>It's official, Apache Nutch is now a decade old! The project has come a
long long way since inception, through <a
href="#January+2005%3A+Nutch+Joins+Apache+Incubator">acceptance into the Apache
Incubator</a> way back in Janurary 2005, to the <a
href="#21+April+2010+-+Apache+Nutch+graduates+to+TLP">Top Level Project</a> it
became on 21st April 2010. Happy birthday Nutch and thanks to all contributors
past and present! See <a
href="https://twitter.com/cutting/status/233415059798372353">Doug Cutting's
tweet</a>.
</p>
-<a name="N1014C"></a><a name="10+July+2012+-+Apache+Nutch+v1.5.1+Released"></a>
+<a name="N10161"></a><a name="10+July+2012+-+Apache+Nutch+v1.5.1+Released"></a>
<h3 class="h4">10 July 2012 - Apache Nutch v1.5.1 Released</h3>
<p>The Apache Nutch PMC are very pleased to announce the release of Apache
Nutch v1.5.1. This release is a maintainence release of the popular 1.5.X
mainstream version of Nutch which has been widely adopted within the community.
Please see the <a
href="http://www.apache.org/dist/nutch/1.5.1/CHANGES.txt">list of changes</a>
made
in this version for a full breakdown. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N1015E"></a><a name="07+July+2012+-+Apache+Nutch+v2.0+Released"></a>
+<a name="N10173"></a><a name="07+July+2012+-+Apache+Nutch+v2.0+Released"></a>
<h3 class="h4">07 July 2012 - Apache Nutch v2.0 Released</h3>
<p>The Apache Nutch PMC are very pleased to announce the release of Apache
Nutch v2.0. This
release offers users an edition focused on large scale crawling which
builds on storage abstraction
@@ -512,14 +527,14 @@ document.write("Last Published: " + docu
in this version for a full breakdown. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N10170"></a><a name="07+June+2012+-+Apache+Nutch+1.5+Released"></a>
+<a name="N10185"></a><a name="07+June+2012+-+Apache+Nutch+1.5+Released"></a>
<h3 class="h4">07 June 2012 - Apache Nutch 1.5 Released</h3>
<p>The 1.5 release of Nutch is now available. This release includes several
improvements
including upgrades of several major components including Tika 1.1 and
Hadoop 1.0.0, improvements to LinkRank and WebGraph elements as well as a
number of new plugins covering blacklisting, filering and parsing to name a
few. Please see the <a href="http://www.apache.org/dist/nutch/CHANGES-1.5.txt">
list of changes</a> made in this version for a full breakdown of the
50 odd improvements the release boasts. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N10182"></a><a name="26+November+2011+-+Apache+Nutch+1.4+Released"></a>
+<a name="N10197"></a><a name="26+November+2011+-+Apache+Nutch+1.4+Released"></a>
<h3 class="h4">26 November 2011 - Apache Nutch 1.4 Released</h3>
<p>The 1.4 release of Nutch is now available. This release includes several
improvements
including allowing Parsers to declare support for multiple MIME types,
configurable Fetcher
@@ -528,7 +543,7 @@ document.write("Last Published: " + docu
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N10194"></a><a name="23+September+2011+-+Apache+Nutch+focuses+on+1.x+series+for+main+development"></a>
+<a name="N101A9"></a><a name="23+September+2011+-+Apache+Nutch+focuses+on+1.x+series+for+main+development"></a>
<h3 class="h4">23 September 2011 - Apache Nutch focuses on 1.x series for main
development</h3>
<p>After some <a
href="http://www.mail-archive.com/[email protected]/msg03581.html">discussion</a>
and a <a
href="http://www.mail-archive.com/[email protected]/msg04348.html">vote</a>
about the
@@ -536,7 +551,7 @@ document.write("Last Published: " + docu
the 1.x series of Nutch, and to branch the now former Nutch trunk
based on Gora, allowing others to
try and improve it, while the mainline development goes on.
</p>
-<a name="N101A6"></a><a name="7+June+2011+-+Apache+Nutch+1.3+Released"></a>
+<a name="N101BB"></a><a name="7+June+2011+-+Apache+Nutch+1.3+Released"></a>
<h3 class="h4">7 June 2011 - Apache Nutch 1.3 Released</h3>
<p>The 1.3 release of Nutch is now available. This release includes several
improvements
(improved RSS parsing support, tighter integration with Apache Tika,
external parsing support,
@@ -545,7 +560,7 @@ document.write("Last Published: " + docu
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N101B8"></a><a name="24+September+2010+-+Apache+Nutch+1.2+Released"></a>
+<a name="N101CD"></a><a name="24+September+2010+-+Apache+Nutch+1.2+Released"></a>
<h3 class="h4">24 September 2010 - Apache Nutch 1.2 Released</h3>
<p>The 1.2 release of Nutch is now available. This release includes several
improvements (addition
of parse-html as a selectable parser again, configurable per-field
indexing),
@@ -555,14 +570,14 @@ document.write("Last Published: " + docu
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
</p>
-<a name="N101CA"></a><a name="06+June+2010+-+Apache+Nutch+1.1+Released"></a>
+<a name="N101DF"></a><a name="06+June+2010+-+Apache+Nutch+1.1+Released"></a>
<h3 class="h4">06 June 2010 - Apache Nutch 1.1 Released</h3>
<p>The 1.1 release of Nutch is now available. This release includes several
major upgrades of existing
libraries (Hadoop, Solr, Tika, etc.) on which Nutch depends. Various bug
fixes, and speedups (e.g., to
Fetcher2) have also been included. See <a
href="http://www.apache.org/dist/nutch/CHANGES-1.1.txt">
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N101DC"></a><a name="21+April+2010+-+Apache+Nutch+graduates+to+TLP"></a>
+<a name="N101F1"></a><a name="21+April+2010+-+Apache+Nutch+graduates+to+TLP"></a>
<h3 class="h4">21 April 2010 - Apache Nutch graduates to TLP</h3>
<p>
<a
href="http://www.apache.org/foundation/records/minutes/2010/board_minutes_2010_04_21.txt">Passed
by
@@ -570,7 +585,7 @@ document.write("Last Published: " + docu
the website, and moving things around, so if you notice anything out of
place, <a href="./mailing_lists.html">
please let us know.</a>
</p>
-<a name="N101EC"></a><a name="14+August+2009+-+Lucene+at+US+ApacheCon"></a>
+<a name="N10201"></a><a name="14+August+2009+-+Lucene+at+US+ApacheCon"></a>
<h3 class="h4">14 August 2009 - Lucene at US ApacheCon</h3>
<p>
@@ -674,14 +689,14 @@ document.write("Last Published: " + docu
</li>
</ul>
-<a name="N1026A"></a><a name="23+March+2009+-+Apache+Nutch+1.0+Released"></a>
+<a name="N1027F"></a><a name="23+March+2009+-+Apache+Nutch+1.0+Released"></a>
<h3 class="h4">23 March 2009 - Apache Nutch 1.0 Released</h3>
<p>The 1.0 release of Nutch is now available. This release includes several
major feature improvements
such as new indexing framework, new scoring framework, Apache Solr
integration just to mention a few.
See <a href="http://www.apache.org/dist/nutch/CHANGES-1.0.txt">
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N1027C"></a><a name="09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in%0A%09%09%09Amsterdam"></a>
+<a name="N10291"></a><a name="09+February+2009+-+Lucene+at+ApacheCon+Europe+2009+in%0A%09%09%09Amsterdam"></a>
<h3 class="h4">09 February 2009 - Lucene at ApacheCon Europe 2009 in
Amsterdam</h3>
<p>
@@ -724,7 +739,7 @@ document.write("Last Published: " + docu
</ul>
-<a name="N102C8"></a><a name="2+April+2007%3A+Nutch+0.9+Released"></a>
+<a name="N102DD"></a><a name="2+April+2007%3A+Nutch+0.9+Released"></a>
<h3 class="h4">2 April 2007: Nutch 0.9 Released</h3>
<p>The 0.9 release of Nutch is now available. This is the second release of
Nutch
based entirely on the underlying Hadoop platform. This release includes
several critical
@@ -733,41 +748,41 @@ document.write("Last Published: " + docu
See <a href="http://www.apache.org/dist/nutch/CHANGES-0.9.txt">
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N102DE"></a><a name="24+September+2006%3A+Nutch+0.8.1+Released"></a>
+<a name="N102F3"></a><a name="24+September+2006%3A+Nutch+0.8.1+Released"></a>
<h3 class="h4">24 September 2006: Nutch 0.8.1 Released</h3>
<p>The 0.8.1 release of Nutch is now available. This is a maintenance release
to 0.8 branch fixing many serous bugs found in version 0.8.
See <a href="http://www.apache.org/dist/nutch/CHANGES-0.8.1.txt">
list of changes</a> made in this version. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N102F0"></a><a name="25+July+2006%3A+Nutch+0.8+Released"></a>
+<a name="N10305"></a><a name="25+July+2006%3A+Nutch+0.8+Released"></a>
<h3 class="h4">25 July 2006: Nutch 0.8 Released</h3>
<p>The 0.8 release of Nutch is now available. This is the first release of
Nutch based on
hadoop architecure. See <a
href="http://svn.apache.org/viewvc/nutch/tags/release-0.8/CHANGES.txt?view=markup">
CHANGES.txt</a> for list of changes made in this version. The release is
available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N10302"></a><a name="31+March+2006%3A+Nutch+0.7.2+Released"></a>
+<a name="N10317"></a><a name="31+March+2006%3A+Nutch+0.7.2+Released"></a>
<h3 class="h4">31 March 2006: Nutch 0.7.2 Released</h3>
<p>The 0.7.2 release of Nutch is now available. This is a bug fix release for
0.7 branch. See
<a
href="http://svn.apache.org/viewcvs.cgi/nutch/branches/branch-0.7/CHANGES.txt?rev=390158">
CHANGES.txt</a> for details. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N10314"></a><a name="1+October+2005%3A+Nutch+0.7.1+Released"></a>
+<a name="N10329"></a><a name="1+October+2005%3A+Nutch+0.7.1+Released"></a>
<h3 class="h4">1 October 2005: Nutch 0.7.1 Released</h3>
<p>The 0.7.1 release of Nutch is now available. This is a bug fix release. See
<a
href="http://svn.apache.org/viewcvs.cgi/nutch/branches/branch-0.7/CHANGES.txt?rev=292986">
CHANGES.txt</a> for details. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N10326"></a><a name="17+August+2005%3A+Nutch+0.7+Released"></a>
+<a name="N1033B"></a><a name="17+August+2005%3A+Nutch+0.7+Released"></a>
<h3 class="h4">17 August 2005: Nutch 0.7 Released</h3>
<p>This is the first Nutch release as an Apache Lucene sub-project. See
<a
href="http://svn.apache.org/viewcvs.cgi/nutch/trunk/CHANGES.txt?rev=233150">
CHANGES.txt</a> for details. The release is available
<a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.</p>
-<a name="N10338"></a><a name="June+2005%3A+Nutch+graduates+from+Incubator"></a>
+<a name="N1034D"></a><a name="June+2005%3A+Nutch+graduates+from+Incubator"></a>
<h3 class="h4">June 2005: Nutch graduates from Incubator</h3>
<p>Nutch has now graduated from the Apache incubator, and is now
a Subproject of Lucene.</p>
-<a name="N10342"></a><a name="January+2005%3A+Nutch+Joins+Apache+Incubator"></a>
+<a name="N10357"></a><a name="January+2005%3A+Nutch+Joins+Apache+Incubator"></a>
<h3 class="h4">January 2005: Nutch Joins Apache Incubator</h3>
<p>Nutch is a two-year-old open source project, previously
hosted at Sourceforge and backed by its own non-profit
@@ -778,7 +793,7 @@ document.write("Last Published: " + docu
overhead of an independent non-profit organization. Nutch's
board of directors and its developers were both polled and
supported the move to the Apache foundation.</p>
-<a name="N1034C"></a><a name="September+2004%3A+Creative+Commons+launches+Nutch-based+Search"></a>
+<a name="N10361"></a><a name="September+2004%3A+Creative+Commons+launches+Nutch-based+Search"></a>
<h3 class="h4">September 2004: Creative Commons launches Nutch-based
Search</h3>
<p>Creative Commons unveiled a beta version of its search
engine, which scours the web for text, images, audio, and video
@@ -786,7 +801,7 @@ document.write("Last Published: " + docu
no other company or organization.</p>
<p>See the <a
href="http://creativecommons.org/press-releases/entry/5064">Creative
Commons Press Release</a> for more details.</p>
-<a name="N1035D"></a><a name="September+2004%3A+Oregon+State+University+switches+to+Nutch"></a>
+<a name="N10372"></a><a name="September+2004%3A+Oregon+State+University+switches+to+Nutch"></a>
<h3 class="h4">September 2004: Oregon State University switches to Nutch</h3>
<p>Oregon State University is converting its searching
infrastructure from Googletm to the open source project
@@ -796,7 +811,7 @@ document.write("Last Published: " + docu
search engine use and management.</p>
<p>For more details see the announcement by OSU's <a
href="http://osuosl.org/news_folder/nutch">Open Source
Lab</a>.</p>
-<a name="N1036E"></a><a name="Apache+Nutch+Trademark+Attributions"></a>
+<a name="N10383"></a><a name="Apache+Nutch+Trademark+Attributions"></a>
<h3 class="h4">Apache Nutch Trademark Attributions</h3>
<p>Apache Nutch, Nutch, Apache, the Apache feather logo, and the Apache Nutch
project logo are trademarks of The Apache Software Foundation.</p>
</div>
Modified: nutch/site/publish/index.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/index.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/issue_tracking.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/issue_tracking.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/issue_tracking.html (original)
+++ nutch/site/publish/issue_tracking.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Nutch Issue Tracking</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/issue_tracking.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/issue_tracking.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/linkmap.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/linkmap.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/linkmap.html (original)
+++ nutch/site/publish/linkmap.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Site Linkmap Table of Contents</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/linkmap.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/linkmap.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/mailing_lists.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/mailing_lists.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/mailing_lists.html (original)
+++ nutch/site/publish/mailing_lists.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Nutch Mailing Lists</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/mailing_lists.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/mailing_lists.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/nightly.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/nightly.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/nightly.html (original)
+++ nutch/site/publish/nightly.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Nightly builds</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/nightly.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/nightly.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/old_downloads.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/old_downloads.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/old_downloads.html (original)
+++ nutch/site/publish/old_downloads.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Older Downloads</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/old_downloads.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/old_downloads.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/sonar.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/sonar.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/sonar.html (original)
+++ nutch/site/publish/sonar.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Apache Nutch Sonar Analysis</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/sonar.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/sonar.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/tutorial.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/tutorial.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/tutorial.html (original)
+++ nutch/site/publish/tutorial.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Apache Nutch Tutorial</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/tutorial.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/tutorial.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/version_control.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/version_control.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/version_control.html (original)
+++ nutch/site/publish/version_control.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Nutch Version Control System</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/version_control.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/version_control.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.
Modified: nutch/site/publish/wiki.html
URL:
http://svn.apache.org/viewvc/nutch/site/publish/wiki.html?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
--- nutch/site/publish/wiki.html (original)
+++ nutch/site/publish/wiki.html Wed Jul 3 19:20:18 2013
@@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
-<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-version" content="0.10-dev">
<meta name="Forrest-skin-name" content="nutch">
<title>Apache Nutch Wiki Page</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
Modified: nutch/site/publish/wiki.pdf
URL:
http://svn.apache.org/viewvc/nutch/site/publish/wiki.pdf?rev=1499527&r1=1499526&r2=1499527&view=diff
==============================================================================
Binary files - no diff available.