sebastian-nagel opened a new pull request #575: URL: https://github.com/apache/nutch/pull/575
- If a URL includes a port, the protocol is not normalized.

Note that:
- urlnormalizer-basic removes default ports: `https://example.com:443/` is normalized to `https://example.com/`
- by chaining normalizers, there is no need to handle default ports in urlnormalizer-protocol
- non-default ports can always be mapped by urlnormalizer-regex; there shouldn't be many, so the price of more complex rules and slower execution is acceptable
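
The chaining described in this pull request can be illustrated with a minimal Python sketch. It is not the Nutch implementation (the plugins are written in Java). The function names, the `HOST_PROTOCOLS` table, and the `DEFAULT_PORTS` table are hypothetical and exist only for illustration. The sketch assumes a first step that drops default ports (the role of urlnormalizer-basic) and a second step that rewrites the protocol only when the URL carries no explicit port (the behavior proposed for urlnormalizer-protocol).

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: default ports per scheme, as a basic normalizer would know them.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Hypothetical host -> protocol mapping, standing in for a protocol rule file.
HOST_PROTOCOLS = {"example.com": "https"}


def strip_default_port(url):
    """Step 1 (basic normalizer role): drop the port if it is the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None and parts.port == DEFAULT_PORTS.get(parts.scheme):
        # The netloc ends in ":<port>" here, so cut at the last colon.
        netloc = parts.netloc.rsplit(":", 1)[0]
        return urlunsplit(parts._replace(netloc=netloc))
    return url


def normalize_protocol(url):
    """Step 2 (protocol normalizer role): skip any URL that has an explicit port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return url  # explicit port: leave the protocol untouched
    target = HOST_PROTOCOLS.get(parts.hostname)
    if target is None or target == parts.scheme:
        return url
    return urlunsplit(parts._replace(scheme=target))


def normalize(url):
    """Chain both steps: default ports are removed before the protocol is mapped."""
    return normalize_protocol(strip_default_port(url))


if __name__ == "__main__":
    for u in ("http://example.com/",
              "http://example.com:80/page",
              "http://example.com:8080/"):
        print(u, "->", normalize(u))
```

In this sketch a URL with a default port still has its protocol normalized, because the port is gone by the time the second step runs, while a URL with a non-default port passes through unchanged and would need an explicit rule elsewhere.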

