This is an automated email from the ASF dual-hosted git repository. snagel pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/nutch.git
commit 75e4e63823e5ea2a086dca9b1ba5824c77058985
Author: Sebastian Nagel <sna...@apache.org>
AuthorDate: Tue Jun 9 12:28:00 2020 +0200

    NUTCH-2789 Documentation: update links to point to cwiki
---
 CHANGES.txt                                          |  2 +-
 conf/httpclient-auth.xml.template                    |  4 ++--
 conf/nutch-default.xml                               |  7 ++++---
 src/java/org/apache/nutch/plugin/Pluggable.java      |  2 +-
 src/java/org/apache/nutch/plugin/package.html        | 12 ++++++------
 src/java/org/apache/nutch/util/SitemapProcessor.java |  2 +-
 src/plugin/index-replace/README.txt                  |  2 +-
 src/plugin/indexer-cloudsearch/README.md             |  2 +-
 src/plugin/indexer-csv/README.md                     |  4 ++--
 src/plugin/indexer-dummy/README.md                   |  4 ++--
 src/plugin/indexer-elastic/README.md                 |  2 +-
 src/plugin/indexer-rabbit/README.md                  |  4 ++--
 src/plugin/indexer-solr/README.md                    |  4 ++--
 src/plugin/indexer-solr/schema.xml                   |  2 +-
 .../src/java/org/apache/nutch/protocol/httpclient/Http.java | 4 ++--
 15 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index e430a64..3f26a8d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -154,7 +154,7 @@ Release Report: https://s.apache.org/nczS
 Breaking Changes
 - indexer plugins are now configured in a single XML file (conf/index-writers.xml),
- see https://wiki.apache.org/nutch/IndexWriters - setting or overwriting configuration
+ see https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters - setting or overwriting configuration
 parameters via Nutch properties is not possible anymore.
 Bug
diff --git a/conf/httpclient-auth.xml.template b/conf/httpclient-auth.xml.template
index 9d23093..f9763bf 100644
--- a/conf/httpclient-auth.xml.template
+++ b/conf/httpclient-auth.xml.template
@@ -56,7 +56,7 @@
 specified as the value for 'realm' attribute in case of NTLM.
 More information on Basic, Digest and NTLM authentication
- support can be located at https://wiki.apache.org/nutch/HttpAuthenticationSchemes
+ support can be located at https://cwiki.apache.org/confluence/display/NUTCH/HttpAuthenticationSchemes
 HTTP-POST Authentication Support
 Http Form-based Authentication is a very common used authentication
@@ -106,7 +106,7 @@
 DEFAULT, RFC_2109, etc.
 More information on HTTP POST can be located at
- https://wiki.apache.org/nutch/HttpPostAuthentication
+ https://cwiki.apache.org/confluence/display/NUTCH/HttpPostAuthentication
 -->
diff --git a/conf/nutch-default.xml b/conf/nutch-default.xml
index b833288..23af74b 100644
--- a/conf/nutch-default.xml
+++ b/conf/nutch-default.xml
@@ -1818,7 +1818,7 @@ CAUTION: Set the parser.timeout to -1 or a bigger value than 30, when using this
 Add scoring-similarity to the list of active plugins
 in the parameter 'plugin.includes' in order to use it.
 For more detailed information on the working of this filter
-visit https://wiki.apache.org/nutch/SimilarityScoringFilter-->
+visit https://cwiki.apache.org/confluence/display/NUTCH/SimilarityScoringFilter -->
 <property>
 <name>scoring.similarity.model</name>
@@ -2060,8 +2060,9 @@ visit https://wiki.apache.org/nutch/SimilarityScoringFilter-->
 fldname1=/regexp/replacement/flags
 fldname2=/regexp/replacement/flags
- Field names would be one of those from https://wiki.apache.org/nutch/IndexStructure.
- See https://wiki.apache.org/nutch/IndexReplace for further details.
+ Field names would be one of those from
+ https://cwiki.apache.org/confluence/display/NUTCH/IndexStructure
+ See https://cwiki.apache.org/confluence/display/NUTCH/IndexReplace for further details.
 </description>
 </property>
diff --git a/src/java/org/apache/nutch/plugin/Pluggable.java b/src/java/org/apache/nutch/plugin/Pluggable.java
index 09aba30..6a66561 100644
--- a/src/java/org/apache/nutch/plugin/Pluggable.java
+++ b/src/java/org/apache/nutch/plugin/Pluggable.java
@@ -22,7 +22,7 @@ package org.apache.nutch.plugin;
  *
  * @author Jérôme Charron
  *
- * @see <a href="http://wiki.apache.org/nutch/AboutPlugins">About Plugins</a>
+ * @see <a href="https://cwiki.apache.org/confluence/display/NUTCH/AboutPlugins">About Plugins</a>
  * @see <a href="package-summary.html#package_description"> plugin package
  * description</a>
  */
diff --git a/src/java/org/apache/nutch/plugin/package.html b/src/java/org/apache/nutch/plugin/package.html
index 5ca4c9e..442ed09 100644
--- a/src/java/org/apache/nutch/plugin/package.html
+++ b/src/java/org/apache/nutch/plugin/package.html
@@ -18,22 +18,22 @@
 listed in the {@link org.apache.nutch.plugin.Pluggable} interface.
 @see <a href="./doc-files/plugin.dtd">Nutch plugin manifest DTD</a>
-@see <a href="http://wiki.apache.org/nutch/PluginCentral">
+@see <a href="https://cwiki.apache.org/confluence/display/NUTCH/PluginCentral">
 Plugin Central
 </a>
-@see <a href="http://wiki.apache.org/nutch/AboutPlugins">
+@see <a href="https://cwiki.apache.org/confluence/display/NUTCH/AboutPlugins">
 About Plugins
 </a>
-@see <a href="http://wiki.apache.org/nutch/WhyNutchHasAPluginSystem">
+@see <a href="https://cwiki.apache.org/confluence/display/NUTCH/WhyNutchHasAPluginSystem">
 Why Nutch has a Plugin System?
 </a>
-@see <a href="http://wiki.apache.org/nutch/WhichTechnicalConceptsAreBehindTheNutchPluginSystem">
+@see <a href="https://cwiki.apache.org/confluence/display/NUTCH/WhichTechnicalConceptsAreBehindTheNutchPluginSystem">
 Which technical concepts are behind the nutch plugin system?
 </a>
-@see <a href="http://wiki.apache.org/nutch/WhatsTheProblemWithPluginsAndClass-loading">
+@see <a href="https://cwiki.apache.org/confluence/display/NUTCH/WhatsTheProblemWithPluginsAndClass-loading">
 What's the problem with Plugins and Class loading?
 </a>
-@see <a href="http://wiki.apache.org/nutch/WritingPluginExample">
+@see <a href="https://cwiki.apache.org/confluence/display/NUTCH/WritingPluginExample">
 Writing Plugin Example
 </a>
 </body>
diff --git a/src/java/org/apache/nutch/util/SitemapProcessor.java b/src/java/org/apache/nutch/util/SitemapProcessor.java
index c686d6a..da5c7e7 100644
--- a/src/java/org/apache/nutch/util/SitemapProcessor.java
+++ b/src/java/org/apache/nutch/util/SitemapProcessor.java
@@ -78,7 +78,7 @@ import crawlercommons.sitemaps.SiteMapURL;
  * </ol>
  *
  * <p>For more details see:
- * https://wiki.apache.org/nutch/SitemapFeature </p>
+ * https://cwiki.apache.org/confluence/display/NUTCH/SitemapFeature </p>
  */
 public class SitemapProcessor extends Configured implements Tool {
 public static final Logger LOG = LoggerFactory.getLogger(SitemapProcessor.class);
diff --git a/src/plugin/index-replace/README.txt b/src/plugin/index-replace/README.txt
index 4c866a7..e136d47 100644
--- a/src/plugin/index-replace/README.txt
+++ b/src/plugin/index-replace/README.txt
@@ -13,7 +13,7 @@ Configuration Example
 Property format: index.replace.regexp
 The format of the property is a list of regexp replacements, one line per field being
- modified. Field names would be one of those from https://wiki.apache.org/nutch/IndexStructure.
+ modified. Field names would be one of those from https://cwiki.apache.org/confluence/display/NUTCH/IndexStructure
 The fieldname precedes the equal sign. The first character after the equal sign signifies
 the delimiter for the regexp, the replacement value and the flags.
diff --git a/src/plugin/indexer-cloudsearch/README.md b/src/plugin/indexer-cloudsearch/README.md
index ddef693..10b5daa 100644
--- a/src/plugin/indexer-cloudsearch/README.md
+++ b/src/plugin/indexer-cloudsearch/README.md
@@ -24,7 +24,7 @@ Each `<writer>` element has two mandatory attributes:
 ## Mapping
-The mapping section is explained [here](https://wiki.apache.org/nutch/IndexWriters#Mapping_section). The structure of this section is general for all index writers.
+The mapping section is explained [here](https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters#IndexWriters-Mappingsection). The structure of this section is general for all index writers.
 ## Parameters
diff --git a/src/plugin/indexer-csv/README.md b/src/plugin/indexer-csv/README.md
index 1eadea1..8022097 100644
--- a/src/plugin/indexer-csv/README.md
+++ b/src/plugin/indexer-csv/README.md
@@ -22,7 +22,7 @@ Each `<writer>` element has two mandatory attributes:
 ## Mapping
-The mapping section is explained [here](https://wiki.apache.org/nutch/IndexWriters#Mapping_section). The structure of this section is general for all index writers.
+The mapping section is explained [here](https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters#IndexWriters-Mappingsection). The structure of this section is general for all index writers.
 ## Parameters
@@ -39,4 +39,4 @@ escapechar | Escape character used to escape a quote character | "
 maxfieldlength | Max. length of a single field value in characters | 4096
 maxfieldvalues | Max. number of values of one field, useful for, e.g., the anchor texts field | 12
 header | Write CSV column headers | true
-outpath | Output path / directory (local filesystem path, relative to current working directory) | csvindexwriter
\ No newline at end of file
+outpath | Output path / directory (local filesystem path, relative to current working directory) | csvindexwriter
diff --git a/src/plugin/indexer-dummy/README.md b/src/plugin/indexer-dummy/README.md
index 0461789..2a4b2bd 100644
--- a/src/plugin/indexer-dummy/README.md
+++ b/src/plugin/indexer-dummy/README.md
@@ -22,7 +22,7 @@ Each `<writer>` element has two mandatory attributes:
 ## Mapping
-The mapping section is explained [here](https://wiki.apache.org/nutch/IndexWriters#Mapping_section). The structure of this section is general for all index writers.
+The mapping section is explained [here](https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters#IndexWriters-Mappingsection). The structure of this section is general for all index writers.
 ## Parameters
@@ -31,4 +31,4 @@ Each parameter has the form `<param name="<name>" value="<value>"/>` and the par
 Parameter Name | Description | Default value
 --|--|--
 path | Path where the file will be created. | ./dummy-index.txt
- delete | If delete operations should be written to the file. | false
\ No newline at end of file
+ delete | If delete operations should be written to the file. | false
diff --git a/src/plugin/indexer-elastic/README.md b/src/plugin/indexer-elastic/README.md
index bccadf7..12c9387 100644
--- a/src/plugin/indexer-elastic/README.md
+++ b/src/plugin/indexer-elastic/README.md
@@ -22,7 +22,7 @@ Each `<writer>` element has two mandatory attributes:
 ## Mapping
-The mapping section is explained [here](https://wiki.apache.org/nutch/IndexWriters#Mapping_section). The structure of this section is general for all index writers.
+The mapping section is explained [here](https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters#IndexWriters-Mappingsection). The structure of this section is general for all index writers.
 ## Parameters
diff --git a/src/plugin/indexer-rabbit/README.md b/src/plugin/indexer-rabbit/README.md
index ea043ed..6ea09a9 100644
--- a/src/plugin/indexer-rabbit/README.md
+++ b/src/plugin/indexer-rabbit/README.md
@@ -22,7 +22,7 @@ Each `<writer>` element has two mandatory attributes:
 ## Mapping
-The mapping section is explained [here](https://wiki.apache.org/nutch/IndexWriters#Mapping_section). The structure of this section is general for all index writers.
+The mapping section is explained [here](https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters#IndexWriters-Mappingsection). The structure of this section is general for all index writers.
 ## Parameters
@@ -41,4 +41,4 @@ routingkey | The routing key used to route messages in the exchange. It only mak
 commit.mode | **single** if a message contains only one document. In this case, a header with the action (write, update or delete) will be added. **multiple** if a message contains all documents. | multiple
 commit.size | Amount of documents to send into each message if the value of `commit.mode` property is **multiple**. In **single** mode this value represents the amount of messages to be sent. | 250
 headers.static | Headers to add to each message. It must have the form `key1=value1,key2=value2`. |
-headers.dynamic | Document's fields to add as headers to each message. It must have the form `field1,field2`. Only used when the value of `commit.mode` property is **single**. |
\ No newline at end of file
+headers.dynamic | Document's fields to add as headers to each message. It must have the form `field1,field2`. Only used when the value of `commit.mode` property is **single**. |
diff --git a/src/plugin/indexer-solr/README.md b/src/plugin/indexer-solr/README.md
index a5305ca..c3a4601 100644
--- a/src/plugin/indexer-solr/README.md
+++ b/src/plugin/indexer-solr/README.md
@@ -22,7 +22,7 @@ Each `<writer>` element has two mandatory attributes:
 ## Mapping
-The mapping section is explained [here](https://wiki.apache.org/nutch/IndexWriters#Mapping_section). The structure of this section is general for all index writers.
+The mapping section is explained [here](https://cwiki.apache.org/confluence/display/NUTCH/IndexWriters#IndexWriters-Mappingsection). The structure of this section is general for all index writers.
 ## Parameters
@@ -41,4 +41,4 @@ password | The password of Solr server. | password
 ## schema.xml
-In the distribution of the indexer-solr plugin there is a schema.xml file available. Nutch does not use this file, but it is provided to Solr users as a reference/guide to facilitate the configuration of Solr.
\ No newline at end of file
+In the distribution of the indexer-solr plugin there is a schema.xml file available. Nutch does not use this file, but it is provided to Solr users as a reference/guide to facilitate the configuration of Solr.
diff --git a/src/plugin/indexer-solr/schema.xml b/src/plugin/indexer-solr/schema.xml
index 57a44ac..6865eb0 100644
--- a/src/plugin/indexer-solr/schema.xml
+++ b/src/plugin/indexer-solr/schema.xml
@@ -103,7 +103,7 @@
 matching across fields.
 For more info on customizing your analyzer chain, please see
- http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
+ https://cwiki.apache.org/confluence/display/solr/AnalyzersTokenizersTokenFilters
 -->
 <!-- A general text field that has reasonable, generic
diff --git a/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java b/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
index 6ca8f42..cd188fb 100644
--- a/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
+++ b/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
@@ -65,12 +65,12 @@ import org.apache.nutch.util.NutchConfiguration;
  * </p>
  * <p>
  * Documentation can be found on the Nutch
- * <a href="https://wiki.apache.org/nutch/HttpAuthenticationSchemes" >
+ * <a href="https://cwiki.apache.org/confluence/display/NUTCH/HttpAuthenticationSchemes" >
  * HttpAuthenticationSchemes</a> wiki page.
  * </p>
  * <p>
  * The original description of the motivation to support
- * <a href="https://wiki.apache.org/nutch/HttpPostAuthentication" >
+ * <a href="https://cwiki.apache.org/confluence/display/NUTCH/HttpPostAuthentication" >
  * HttpPostAuthentication</a> is also included on the Nutch wiki. Additionally
  * HttpPostAuthentication development is documented at the
  * <a href="https://issues.apache.org/jira/browse/NUTCH-827">NUTCH-827</a> Jira