[ https://issues.apache.org/jira/browse/NUTCH-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315671#comment-17315671 ]
Hudson commented on NUTCH-2858:
-------------------------------
SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #31 (See
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/31/])
NUTCH-2858 urlnormalizer-protocol: URL port is lost during normalization
(snagel:
[https://github.com/apache/nutch/commit/c454a640f97efc9e16da1b4a910fab6d85103f95])
* (edit) src/plugin/urlnormalizer-protocol/src/test/org/apache/nutch/net/urlnormalizer/protocol/TestProtocolURLNormalizer.java
* (edit) src/plugin/urlnormalizer-protocol/src/java/org/apache/nutch/net/urlnormalizer/protocol/ProtocolURLNormalizer.java
* (edit) src/plugin/urlnormalizer-basic/src/test/org/apache/nutch/net/urlnormalizer/basic/TestBasicURLNormalizer.java
NUTCH-2858 urlnormalizer-protocol: URL port is lost during normalization
(snagel:
[https://github.com/apache/nutch/commit/d7499205a7ad5eb0d324af7e027b9fdd051d8d22])
* (edit) conf/protocols.txt.template
> urlnormalizer-protocol: URL port is lost during normalization
> -------------------------------------------------------------
>
> Key: NUTCH-2858
> URL: https://issues.apache.org/jira/browse/NUTCH-2858
> Project: Nutch
> Issue Type: Bug
> Components: plugin, urlnormalizer
> Affects Versions: 1.18
> Reporter: Sebastian Nagel
> Assignee: Sebastian Nagel
> Priority: Minor
> Fix For: 1.19
>
>
> If a URL includes a port, e.g. {{http://example.com:8080/}} or
> {{https://example.com:8443/}}, the port is removed when the URL is
> normalized by the urlnormalizer-protocol plugin.
> Instead, if the port is set,
> - the port should be kept as is and
> - the protocol should be unchanged
> -* keeping the port and changing the protocol might result in a connection
> failure
> -* unlike the default port mappings (80 (http) <> 443 (https)),
> non-default port mappings (8080 <> 8443) are risky and unlikely to work on
> every server setup.
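
The rule requested above can be illustrated with a minimal sketch. This is not the code from the commits listed in this message: the class name, the method signature, and the host-to-protocol map are hypothetical and chosen only for illustration. It assumes the normalizer is given a preferred protocol per host and shows that a URL with an explicit port is returned unchanged.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; not the ProtocolURLNormalizer implementation. */
public class PortAwareProtocolSketch {

  /**
   * Rewrites the protocol of a URL to the preferred one for its host,
   * unless the URL carries an explicit port.
   *
   * @param url            the URL to normalize
   * @param hostToProtocol hypothetical mapping: lowercase host -> preferred protocol
   */
  public static String normalize(String url, Map<String, String> hostToProtocol) {
    URI uri;
    try {
      uri = new URI(url);
    } catch (URISyntaxException e) {
      return url; // leave unparsable URLs untouched
    }
    String scheme = uri.getScheme();
    String host = uri.getHost();
    if (scheme == null || host == null) {
      return url;
    }
    // An explicit port means: keep the port AND keep the protocol as is.
    if (uri.getPort() != -1) {
      return url;
    }
    String preferred = hostToProtocol.get(host.toLowerCase(Locale.ROOT));
    if (preferred == null || preferred.equalsIgnoreCase(scheme)) {
      return url;
    }
    // Swap only the scheme; the rest of the URL string is copied verbatim.
    return preferred + url.substring(scheme.length());
  }

  public static void main(String[] args) {
    Map<String, String> rules = Map.of("example.com", "https");
    System.out.println(normalize("http://example.com/", rules));      // https://example.com/
    System.out.println(normalize("http://example.com:8080/", rules)); // http://example.com:8080/
  }
}
```

In this sketch the port check comes before the protocol lookup, so an explicit port always wins, even a default one such as 80.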
--
This message was sent by Atlassian Jira
(v8.3.4#803005)