Gerard Bouchar created NUTCH-2555:
-------------------------------------
Summary: URL normalization problem: path not starting with a '/'
Key: NUTCH-2555
URL: https://issues.apache.org/jira/browse/NUTCH-2555
Project: Nutch
Issue Type: Sub-task
Reporter: Gerard Bouchar
When a URL has no path but does have GET parameters (for instance
'http://example.com?a=1'), it should be normalized by adding a '/' at the
beginning of the path (giving 'http://example.com/?a=1'). Our logs show that
non-normalized URLs reach protocol-http, which then tries to send an invalid
HTTP request:
GET ?a=1 HTTP/1.0
instead of
GET /?a=1 HTTP/1.0
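
The normalization described above could be sketched roughly as follows. This is only an illustration built on java.net.URI, not the code of any existing Nutch normalizer plugin; the class name EmptyPathNormalizer and the method normalize are hypothetical. It assumes the input is a syntactically valid absolute URL, and it also adds the '/' when there is no query at all, which goes slightly beyond the case reported here.

```java
import java.net.URI;

// Hypothetical helper, not part of Nutch: inserts "/" as the path when a
// hierarchical URL has an authority but an empty path.
final class EmptyPathNormalizer {

    static String normalize(String url) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(url);
        String scheme = uri.getScheme();
        String authority = uri.getRawAuthority();
        String path = uri.getRawPath();

        // Leave relative references, opaque URIs (e.g. mailto:) and URLs
        // that already have a path untouched.
        if (scheme == null || authority == null
                || (path != null && !path.isEmpty())) {
            return url;
        }

        // Rebuild from the raw (still percent-encoded) components so that
        // existing escapes in the query or fragment are not altered.
        StringBuilder sb = new StringBuilder();
        sb.append(scheme).append("://").append(authority).append('/');
        if (uri.getRawQuery() != null) {
            sb.append('?').append(uri.getRawQuery());
        }
        if (uri.getRawFragment() != null) {
            sb.append('#').append(uri.getRawFragment());
        }
        return sb.toString();
    }
}
```

With this sketch, EmptyPathNormalizer.normalize("http://example.com?a=1") gives "http://example.com/?a=1", so the request line built from path and query would start with "GET /?a=1".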
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)