Author: snagel
Date: Fri Jul 26 08:46:42 2019
New Revision: 1863783

URL: http://svn.apache.org/viewvc?rev=1863783&view=rev
Log:
Update links
- Wiki (MoinMoin -> Confluence migration)
- http:// -> https:// where applicable
- remove about.md (not related to Nutch)

Removed:
    nutch/cms_site/trunk/content/about.md
Modified:
    nutch/cms_site/trunk/content/index.md
    nutch/cms_site/trunk/content/javadoc.md
    nutch/cms_site/trunk/content/version_control.md

Modified: nutch/cms_site/trunk/content/index.md
URL: http://svn.apache.org/viewvc/nutch/cms_site/trunk/content/index.md?rev=1863783&r1=1863782&r2=1863783&view=diff
==============================================================================
--- nutch/cms_site/trunk/content/index.md (original)
+++ nutch/cms_site/trunk/content/index.md Fri Jul 26 08:46:42 2019
@@ -25,7 +25,7 @@ under the License.
 <div class="carousel-caption">
 <h1>Highly extensible, highly scalable Web crawler</h1>
 <p class="lead">Nutch is a well matured, production ready Web crawler. Nutch 1.x enables
- fine grained configuration, relying on <a href="http://hadoop.apache.org">Apache Hadoop™</a>
+ fine grained configuration, relying on <a href="https://hadoop.apache.org/">Apache Hadoop™</a>
 data structures, which are great for batch processing.</p>
 <a class="btn btn-large btn-primary" href="downloads.html">Download</a>
 </div>
@@ -38,10 +38,10 @@ under the License.
 <h1>Pluggable parsing, protocols, storage and indexing</h1>
 <p class="lead">Being pluggable and modular of course has it's benefits, Nutch provides
 extensible interfaces such as Parse, Index and ScoringFilter's for custom
- implementations e.g. <a href="http://tika.apache.org">Apache Tika™</a> for parsing.
- Additonally, pluggable indexing exists for <a href="http://lucene.apache.org/solr">Apache Solr™</a>,
- <a href="http://www.elasticsearch.org/">Elastic Search</a>, <a href="https://cwiki.apache.org/confluence/display/solr/SolrCloud">SolrCloud</a>, etc.</p>
- <a class="btn btn-large btn-primary" href="http://wiki.apache.org/nutch/">Learn About</a>
+ implementations e.g. <a href="https://tika.apache.org/">Apache Tika™</a> for parsing.
+ Additonally, pluggable indexing exists for <a href="https://lucene.apache.org/solr">Apache Solr™</a>,
+ <a href="https://www.elastic.co/">Elastic Search</a>, <a href="https://cwiki.apache.org/confluence/display/solr/SolrCloud">SolrCloud</a>, etc.</p>
+ <a class="btn btn-large btn-primary" href="https://cwiki.apache.org/confluence/display/NUTCH/Home">Learn About</a>
 </div>
 </div>
 </div>
@@ -53,7 +53,7 @@ under the License.
 <p class="lead">Nutch 2.X branch is becoming an emerging alternative taking direct inspiration
 from 1.X. 2.X differs in one key area; storage is abstracted away from any specific
 underlying data store by using
- <a href="http://gora.apache.org">Apache Gora™</a> for handling object to persistent
+ <a href="https://gora.apache.org/">Apache Gora™</a> for handling object to persistent
 data store mappings.</p>
 <a class="btn btn-large btn-primary" href="mailing_lists.html">Join the Community</a>
 </div>
@@ -106,6 +106,10 @@ under the License.

 #Apache Nutch News

+##26 July 2019 - Nutch Wiki Migrated
+
+The [Apache Nutch wiki](https://cwiki.apache.org/confluence/display/NUTCH/Home) has been migrated from MoinMoin to Confluence.
+
 ##9 August 2018 - Nutch 1.15 Release

 The Apache Nutch PMC are pleased to announce the immediate release of Apache Nutch v1.15, we advise all
@@ -115,8 +119,8 @@ An account of the CHANGES in this releas
 [release report](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=10680&version=12342302).

 As usual in the 1.X series, release artifacts are made available as both source and binary and also available within
-[Maven Central](http://search.maven.org/#search|gav|1|g%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/search?q=g:org.apache.nutch%20AND%20a:nutch&core=gav) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 ##23 December 2017 - Nutch 1.14 Release

@@ -127,8 +131,8 @@ An account of the CHANGES in this releas
 [release report](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=10680&version=12340218).

 As usual in the 1.X series, release artifacts are made available as both source and binary and also available within
-[Maven Central](http://search.maven.org/#search|gav|1|g%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/search?q=g:org.apache.nutch%20AND%20a:nutch&core=gav) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 ##02 April 2017 - Nutch 1.13 Release

@@ -139,8 +143,8 @@ An account of the CHANGES in this releas
 [release report](https://s.apache.org/wq3x).

 As usual in the 1.X series, release artifacts are made available as both source and binary and also available within
-[Maven Central](http://search.maven.org/#search|gav|1|g%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/search?q=g:org.apache.nutch%20AND%20a:nutch&core=gav) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 ##18 June 2016 - Nutch 1.12 Release

@@ -151,8 +155,8 @@ This release is the result of many month
 [release report](https://s.apache.org/nutch1.12).

 As usual in the 1.X series, release artifacts are made available as both source and binary and also available within
-[Maven Central](http://search.maven.org/#search|gav|1|g%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/search?q=g:org.apache.nutch%20AND%20a:nutch&core=gav) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 ##21 January 2016 - Nutch 2.3.1 Release

@@ -160,11 +164,11 @@ The Apache Nutch PMC are pleased to anno
 current users and developers of the 2.X series to upgrade to this release.
 This bug fix release contains around 40 issues addressed. For a complete
 overview of these issues please see the
-[release report](http://s.apache.org/nutch_2.3.1).
+[release report](https://s.apache.org/nutch_2.3.1).

 As usual in the 2.X series, release artifacts are made available as only source and also available within
-[Maven Central](http://search.maven.org/#search|gav|1|g%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/search?q=g:org.apache.nutch%20AND%20a:nutch&core=gav) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 The recommended Gora backends for this Nutch release are

@@ -185,11 +189,11 @@ The Apache Nutch PMC are pleased to anno
 current users and developers of the 1.X series to upgrade to this release.
 This release is the result of many months of work and around 100 issues addressed. For a complete
 overview of these issues please see the
-[release report](http://s.apache.org/nutch11).
+[release report](https://s.apache.org/nutch11).

 As usual in the 1.X series, release artifacts are made available as both source and binary and also available within
-[Maven Central](http://search.maven.org/#search|gav|1|g%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/search?q=g:org.apache.nutch%20AND%20a:nutch&core=gav) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 ##06 May 2015 - Nutch 1.10 Release

@@ -197,15 +201,15 @@ The Apache Nutch PMC are pleased to anno
 current users and developers of the 1.X series to upgrade to this release.
 This release is the result of many months of work and well over 100 issues addressed. For a complete
 overview of these issues please see the
-[release report](http://s.apache.org/nutch10).
+[release report](https://s.apache.org/nutch10).

 As usual in the 1.X series, release artifacts are made available as both source and binary and also available within
-[Maven Central](http://search.maven.org/) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

 ## 23 April 2015 - Apache Nutch Reaches 2000th Jira Issue

-<blockquote class="twitter-tweet" lang="en"><p><a href="https://twitter.com/ApacheNutch">@ApacheNutch</a> reaches 2000th issue on <a href="https://twitter.com/TheASF">@TheASF</a> <a href="https://twitter.com/hashtag/JIRA?src=hash">#JIRA</a> with over a decade of <a href="https://twitter.com/hashtag/opensource?src=hash">#opensource</a> crawling on the <a href="https://twitter.com/hashtag/www?src=hash">#www</a> <a href="http://t.co/k3VLhbJQhg">pic.twitter.com/k3VLhbJQhg</a></p>— Apache Nutch (@ApacheNutch) <a href="https://twitter.com/ApacheNutch/status/591359830171856896">April 23, 2015</a></blockquote>
+<blockquote class="twitter-tweet" lang="en"><p><a href="https://twitter.com/ApacheNutch">@ApacheNutch</a> reaches 2000th issue on <a href="https://twitter.com/TheASF">@TheASF</a> <a href="https://twitter.com/hashtag/JIRA?src=hash">#JIRA</a> with over a decade of <a href="https://twitter.com/hashtag/opensource?src=hash">#opensource</a> crawling on the <a href="https://twitter.com/hashtag/www?src=hash">#www</a> <a href="https://t.co/k3VLhbJQhg">pic.twitter.com/k3VLhbJQhg</a></p>— Apache Nutch (@ApacheNutch) <a href="https://twitter.com/ApacheNutch/status/591359830171856896">April 23, 2015</a></blockquote>
 <script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

 ##22 January 2015 - Nutch 2.3 Release
@@ -214,35 +218,35 @@ The Apache Nutch PMC are pleased to anno
 current users and developers of the 2.X series to upgrade to this release.
 After successful completion of the first [Nutch Google Summer of Code project](https://issues.apache.org/jira/browse/NUTCH-841)
 we are pleased to announce that Nutch 2.3 release now comes packaged with a self
-contained [Apache Wicket](http://wicket.apache.org)-based Web Application.
+contained [Apache Wicket](https://wicket.apache.org/)-based Web Application.
 This release is the result of many months of work and 143 issues addressed. For a complete
 overview of these issues please see the
-[release report](http://s.apache.org/nutch_2.3).
+[release report](https://s.apache.org/nutch_2.3).

 As usual in the 2.x series, this release is made available only as source, but is also available within
-[Maven Central](http://search.maven.org/) as a Maven dependency.
-The release is available from our [DOWNLAODS PAGE](http://nutch.apache.org/downloads.html).
+[Maven Central](https://search.maven.org/) as a Maven dependency.
+The release is available from our [DOWNLAODS PAGE](/downloads.html).

-The supported [Apache Gora](http://gora.apache.org) v0.5 backends are;
+The supported [Apache Gora](https://gora.apache.org/) v0.5 backends are;

- * [Apache Hadoop](http://hadoop.apache.org) 1.0.1 & 2.4.0
- * [Apache Cassandra](http://cassandra.apache.org) 2.0.2
- * [Apache HBase](http://hbase.apache.org) 0.94.14
- * [Apache Accumulo](http://accumulo.apache.org) 1.5.1
- * [MongoDB](http://mongodb.org) 2.12.2
- * [Apache Solr](http://lucene.apache.org/solr) 4.8.1
- * [Apache Avro](http://avro.apache.org) 1.7.6
+ * [Apache Hadoop](https://hadoop.apache.org/) 1.0.1 & 2.4.0
+ * [Apache Cassandra](https://cassandra.apache.org/) 2.0.2
+ * [Apache HBase](https://hbase.apache.org/) 0.94.14
+ * [Apache Accumulo](https://accumulo.apache.org/) 1.5.1
+ * [MongoDB](https://mongodb.org/) 2.12.2
+ * [Apache Solr](https://lucene.apache.org/solr) 4.8.1
+ * [Apache Avro](https://avro.apache.org/) 1.7.6

 Please note that the SQL backend for Gora has been deprecated.

 ##22 September 2014 - Wicket WebApp now part of Nutch 2.x Codebase

- <a title="Apache Wicket" href="http://wicket.apache.org">
- <img src="http://wicket.apache.org/guide/img/apache-wicket.png" class="float-right" alt="Apache Wicket Logo" height="100" width="400"/>
+ <a title="Apache Wicket" href="https://wicket.apache.org/">
+ <img src="https://wicket.apache.org/guide/img/apache-wicket.png" class="float-right" alt="Apache Wicket Logo" height="100" width="400"/>
 </a>

 After successful completion of the first [Nutch Google Summer of Code project](https://issues.apache.org/jira/browse/NUTCH-841)
 we are pleased to announce that Nutch 2.X branch now comes packaged with a self
-contained [Apache Wicket](http://wicket.apache.org)-based Web Application.
+contained [Apache Wicket](https://wicket.apache.org/)-based Web Application.

 This not only greatly lowers the barrier for direct interaction with the Nutch 2.X REST API
 but also provides a stepping stone from which we intend to backport this
@@ -251,8 +255,8 @@ work to the Nutch 1.X (trunk) series.
 Some of the Web Application features include:

  * Functionality to dynamically load seed URLs in order to bootstrap Nutch crawls
- * Browsable and dynamic editing of [Configuration overrides](http://wiki.apache.org/nutch/NutchPropertiesCompleteList)
- * Complete [REST API documentation](https://wiki.apache.org/nutch/NutchRESTAPI) and UML
+ * Browsable and dynamic editing of [Configuration overrides](https://cwiki.apache.org/nutch/NutchPropertiesCompleteList)
+ * Complete [REST API documentation](https://cwiki.apache.org/nutch/NutchRESTAPI) and UML
 model describing REST API calls, Administration and Job and Configuration Management.

 The new Web Application feature will be present within the upcoming Nutch 2.3 Release.
@@ -262,11 +266,11 @@ The new Web Application feature will be
 <p>The Apache Nutch PMC are pleased to announce the immediate release of Apache Nutch v1.9, we advise all
 current users and developers of the 1.X series to upgrade to this release.
 This release addressed no fewer than 55 issues in total.
- Please see the <a href="http://www.apache.org/dist/nutch/1.9/CHANGES.txt">list of changes</a> for a full
- breakdown, or see the <a href="http://s.apache.org/1.9-release">release report</a>.
+ Please see the <a href="https://www.apache.org/dist/nutch/1.9/CHANGES.txt">list of changes</a> for a full
+ breakdown, or see the <a href="https://s.apache.org/1.9-release">release report</a>.
 As usual in the 1.X series, this release is made available both as source and binary. Additionally developers
- can find Maven artifacts within <a href="http://search.maven.org/">Maven Central</a>.
- The release is available <a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
+ can find Maven artifacts within <a href="https://search.maven.org/">Maven Central</a>.
+ The release is available <a href="https://www.apache.org/dyn/closer.cgi/nutch/">here</a>.
 </p>
 </div>

@@ -308,7 +312,7 @@ The new Web Application feature will be
 <p>The Apache Nutch PMC are pleased to announce the immediate release of Apache Nutch v1.8, we advise all
 current users and developers of the 1.X series to upgrade to this release. Alhough this
 release includes library upgrades to <a href="http://code.google.com/p/crawler-commons/">Crawler Commons</a> 0.3 and
- <a href="http://tika.apache.org">Apache Tika</a> 1.5, it also provides over 30 bug fixes as well as 18 improvements.
+ <a href="https://tika.apache.org/">Apache Tika</a> 1.5, it also provides over 30 bug fixes as well as 18 improvements.
 Please see the <a href="http://www.apache.org/dist/nutch/1.8/CHANGES.txt">list of changes</a> for a full
 breakdown, or see the <a href="http://s.apache.org/oHY">release report</a>.
 As usual in the 1.X series, this release is made available both as source and binary. Additionally developers
@@ -322,8 +326,8 @@ The new Web Application feature will be
 <h2>02 July 2013 - Apache Nutch v2.2.1 Released</h2>
 <p>The Apache Nutch PMC are pleased to announce the immediate release of Apache Nutch v2.2.1, we advise all
 current users and developers of the 2.X series to upgrade to this release ASAP. Although this
- release includes library upgrades to <a href="http://hadoop.apache.org">Apache Hadoop</a> 1.2.0 and
- <a href="http://tika.apache.org">Apache Tika</a> 1.3, it is predominantly a bug fix for
+ release includes library upgrades to <a href="https://hadoop.apache.org/">Apache Hadoop</a> 1.2.0 and
+ <a href="https://tika.apache.org/">Apache Tika</a> 1.3, it is predominantly a bug fix for
 <a href="https://issues.apache.org/jira/browse/NUTCH-1591">NUTCH-1591 - Incorrect conversion of ByteBuffer to String</a>.
 Please see the <a href="http://www.apache.org/dist/nutch/2.2.1/CHANGES-2.2.1.txt">list of changes</a> for a full
 breakdown, or see the <a href="http://s.apache.org/PGa">release report</a>.
@@ -342,8 +346,8 @@ The new Web Application feature will be
 <a href="http://lucene.apache.org/solr">Apache Solr</a> and <a href="http://www.elasticsearch.org/">Elastic Search</a>.
 Shadowing the recent Nutch 2.2 release, parsing of Robots.txt is now delegated to <a href="http://code.google.com/p/crawler-commons/">
- Crawler-Commons</a>. Key library upgrades have been made to <a href="http://hadoop.apache.org">Apache Hadoop</a> 1.2.0
- and <a href="http://tika.apache.org">Apache Tika</a> 1.3. Please see the <a href="http://www.apache.org/dist/nutch/1.7/1.7-CHANGES.txt">list of
+ Crawler-Commons</a>. Key library upgrades have been made to <a href="https://hadoop.apache.org/">Apache Hadoop</a> 1.2.0
+ and <a href="https://tika.apache.org/">Apache Tika</a> 1.3. Please see the <a href="http://www.apache.org/dist/nutch/1.7/1.7-CHANGES.txt">list of
 changes</a> or the <a href="http://s.apache.org/1zE">release report</a> made in this version for a full breakdown.
 As usual in the 1.x series, the release is made available as binary and source (zip + tar.gz) and is also available within
@@ -359,8 +363,8 @@ The new Web Application feature will be
 release includes over 30 bug fixes and over 25 improvements representing the third release of increasingly popular 2.x Nutch series.
 This release features inclusion of <a href="http://code.google.com/p/crawler-commons/">
 Crawler-Commons</a> which Nutch now utilizes for improved robots.txt parsing, library upgrades to
- <a href="http://hadoop.apache.org">Apache Hadoop</a> 1.1.1, <a href="http://gora.apache.org">Apache Gora</a>
- 0.3, <a href="http://tika.apache.org">Apache Tika</a> 1.2 and <a href="http://www.brics.dk/automaton/automaton">
+ <a href="https://hadoop.apache.org/">Apache Hadoop</a> 1.1.1, <a href="https://gora.apache.org/">Apache Gora</a>
+ 0.3, <a href="https://tika.apache.org/">Apache Tika</a> 1.2 and <a href="http://www.brics.dk/automaton/automaton">
 Automaton</a> 1.11-8. Please see the <a href="http://www.apache.org/dist/nutch/2.2/2.2-CHANGES.txt">list of changes</a>
 or the <a href="http://s.apache.org/LPB">release report</a> made in this version for a full breakdown.
@@ -376,7 +380,7 @@ The new Web Application feature will be
 release includes over 20 bug fixes, the same in improvements, as well as new functionalities including a new HostNormalizer,
 the ability to dynamically set fetchInterval by MIME-type and functional enhancements to the Indexer API inluding the normalization of URL's
 and the deletion of robots noIndex documents. Other notable improvements include the upgrade of key dependencies to
- <a href="http://tika.apache.org/1.2/index.html">Tika 1.2</a> and <a href="http://www.brics.dk/automaton/">Automaton 1.11-8</a>.
+ <a href="https://tika.apache.org/1.2/index.html">Tika 1.2</a> and <a href="http://www.brics.dk/automaton/">Automaton 1.11-8</a>.
 Please see the <a href="http://www.apache.org/dist/nutch/1.6/CHANGES_1.6.txt">list of changes</a> or the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=10680&version=12319941">release report</a> made in this version for a full breakdown.
 The release is available
@@ -389,7 +393,7 @@ The new Web Application feature will be
 <p>The Apache Nutch PMC are very pleased to announce the release of Apache Nutch v2.1.
 This release continues to provide Nutch users with a simplified Nutch distribution building on the 2.x development drive
 which is growing in popularity amongst the community. As well as addressing ~20 bugs
- this release also offers improved properties for better <a href="http://lucene.apache.org/solr/">Solr</a> configuration, upgrades to various <a href="http://gora.apache.org">Gora</a> dependencies and the introduction of the option to build indexes in <a href="http://www.elasticsearch.org/">elastic search</a>.
+ this release also offers improved properties for better <a href="http://lucene.apache.org/solr/">Solr</a> configuration, upgrades to various <a href="https://gora.apache.org/">Gora</a> dependencies and the introduction of the option to build indexes in <a href="http://www.elasticsearch.org/">elastic search</a>.
 Please see the <a href="http://www.apache.org/dist/nutch/2.1/CHANGES-2.1.txt">list of changes</a> made in this version for a full breakdown.
 The release is available <a href="http://www.apache.org/dyn/closer.cgi/nutch/">here</a>.

Modified: nutch/cms_site/trunk/content/javadoc.md
URL: http://svn.apache.org/viewvc/nutch/cms_site/trunk/content/javadoc.md?rev=1863783&r1=1863782&r2=1863783&view=diff
==============================================================================
--- nutch/cms_site/trunk/content/javadoc.md (original)
+++ nutch/cms_site/trunk/content/javadoc.md Fri Jul 26 08:46:42 2019
@@ -35,7 +35,7 @@ under the License.
 <!-- div id="bodyColumn" class="span9"-->
 <p>It should ne noted that the 1.X branch is currently the Nutch trunk code base.</p>
 <p>Nutch 2.X is a different code base and uses different data structures. For more information
- on the 2.X branch, we urge users to approach the <a href="http://wiki.apache.org/nutch/#Nutch_2.x">wiki documentation</a>
+ on the 2.X branch, we urge users to approach the <a href="https://cwiki.apache.org/nutch/#Nutch_2.x">wiki documentation</a>
 </p>

 <div class="section">

Modified: nutch/cms_site/trunk/content/version_control.md
URL: http://svn.apache.org/viewvc/nutch/cms_site/trunk/content/version_control.md?rev=1863783&r1=1863782&r2=1863783&view=diff
==============================================================================
--- nutch/cms_site/trunk/content/version_control.md (original)
+++ nutch/cms_site/trunk/content/version_control.md Fri Jul 26 08:46:42 2019
@@ -4,4 +4,4 @@ Title: Nutch Version Control System

 Nutch uses the Apache Software Foundation Git writeable repositories as its master repository.

-You can find more information about Nutch source control [here](https://wiki.apache.org/nutch/UsingGit).
+You can find more information about Nutch source control [here](https://cwiki.apache.org/nutch/UsingGit).