Gerard Bouchar created NUTCH-2555:
-------------------------------------

             Summary: URL normalization problem: path not starting with a '/'
                 Key: NUTCH-2555
                 URL: https://issues.apache.org/jira/browse/NUTCH-2555
             Project: Nutch
          Issue Type: Sub-task
            Reporter: Gerard Bouchar


When a URL has no path but does have GET parameters (for instance 
'http://example.com?a=1'), it should be normalized by adding a '/' at the 
start of the path (giving 'http://example.com/?a=1'). Our logs show that 
non-normalized URLs reach protocol-http, which then tries to send an invalid 
HTTP request:

GET ?a=1 HTTP/1.0

instead of

GET /?a=1 HTTP/1.0
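
The expected normalization can be sketched roughly as follows. This is only an illustration built on java.net.URL, not the code of any Nutch normalizer plugin or of protocol-http; the class name EmptyPathNormalizer and the methods normalize and requestLine are hypothetical names chosen for this example.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, for illustration only (not part of Nutch).
public class EmptyPathNormalizer {

    // Returns the URL unchanged unless its path is empty, in which case
    // a '/' is inserted between the authority and the query string.
    static String normalize(String urlString) throws MalformedURLException {
        URL url = new URL(urlString);
        if (!url.getPath().isEmpty()) {
            return urlString;
        }
        // With an empty path, getFile() is either "" or "?query".
        StringBuilder sb = new StringBuilder();
        sb.append(url.getProtocol()).append("://").append(url.getAuthority());
        sb.append('/').append(url.getFile());
        if (url.getRef() != null) {
            sb.append('#').append(url.getRef());
        }
        return sb.toString();
    }

    // Assumed shape of the request line: "GET <file> HTTP/1.0".
    static String requestLine(String urlString) throws MalformedURLException {
        return "GET " + new URL(urlString).getFile() + " HTTP/1.0";
    }
}
```

Under these assumptions, requestLine("http://example.com?a=1") produces the invalid request line quoted above, while requestLine(normalize("http://example.com?a=1")) produces the valid one.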



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
