[jira] [Commented] (NUTCH-2560) protocol-http throws an error when an http header spans over multiple lines
[ https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509872#comment-16509872 ]

Hudson commented on NUTCH-2560:
---

SUCCESS: Integrated in Jenkins build Nutch-trunk #3534 (See [https://builds.apache.org/job/Nutch-trunk/3534/])
NUTCH-2560 protocol-http throws an error when an http header spans over (snagel: [https://github.com/apache/nutch/commit/a2771dc0d1f551b8dd1e07609ce978251a05f34a])
* (edit) src/plugin/protocol-http/src/test/org/apache/nutch/protocol/http/TestBadServerResponses.java

> protocol-http throws an error when an http header spans over multiple lines
> ---
>
> Key: NUTCH-2560
> URL: https://issues.apache.org/jira/browse/NUTCH-2560
> Project: Nutch
> Issue Type: Sub-task
> Affects Versions: 1.14
> Reporter: Gerard Bouchar
> Priority: Major
> Fix For: 1.15
>
> Some servers invalidly send headers that span over multiple lines. In that
> case, browsers simply ignore the subsequent lines, but protocol-http throws
> an error, thus preventing us from fetching the contents of the page.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NUTCH-2560) protocol-http throws an error when an http header spans over multiple lines
[ https://issues.apache.org/jira/browse/NUTCH-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508028#comment-16508028 ]

Sebastian Nagel commented on NUTCH-2560:

See [RFC 7230, section 3.2.4|https://tools.ietf.org/html/rfc7230#section-3.2.4]:
{quote}Historically, HTTP header field values could be extended over multiple lines by preceding each extra line with at least one space or horizontal tab (obs-fold). This specification deprecates such line folding{quote}

Actually, this seems to work if multi-line headers follow the spec (an extra space at the beginning of each continuation line): the unit test in [commit a2771dc|https://github.com/apache/nutch/pull/347/commits/a2771dc0d1f551b8dd1e07609ce978251a05f34a] passes if ported to Nutch 1.14.
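
To illustrate the two cases discussed in this thread, here is a minimal sketch of a lenient header parser. It is not the code from protocol-http or from the commit above; the class name `LenientHeaderParser` and its `parse` method are hypothetical and chosen only for illustration. It assumes the header section has already been split into lines without their CRLF terminators. Continuation lines that start with a space or horizontal tab (obs-fold) are joined to the previous value, while lines that neither start with whitespace nor contain a field name are skipped, which mirrors the browser behaviour described in the issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of Nutch: tolerant parsing of HTTP header lines. */
public class LenientHeaderParser {

  /**
   * Parses header lines (CRLF already stripped) into a map from lower-cased
   * field name to the list of values seen for that name.
   */
  public static Map<String, List<String>> parse(List<String> lines) {
    Map<String, List<String>> headers = new LinkedHashMap<>();
    String name = null;          // field name of the header being accumulated
    StringBuilder value = null;  // its value so far

    for (String line : lines) {
      if (line.isEmpty()) {
        break; // an empty line ends the header section
      }
      char first = line.charAt(0);
      if (first == ' ' || first == '\t') {
        // obs-fold (RFC 7230, section 3.2.4): continuation of the previous value
        String folded = line.trim();
        if (value != null && !folded.isEmpty()) {
          value.append(' ').append(folded);
        }
        continue;
      }
      int colon = line.indexOf(':');
      if (colon <= 0) {
        // No field name and no leading whitespace: an invalid continuation.
        // Assumption: ignore the line instead of failing the whole response.
        // (An invalid continuation that happens to contain a colon would
        // still be read as a separate header by this sketch.)
        continue;
      }
      store(headers, name, value);
      name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
      value = new StringBuilder(line.substring(colon + 1).trim());
    }
    store(headers, name, value);
    return headers;
  }

  /** Adds the accumulated header to the map, if there is one. */
  private static void store(Map<String, List<String>> headers,
                            String name, StringBuilder value) {
    if (name != null) {
      headers.computeIfAbsent(name, k -> new ArrayList<>()).add(value.toString());
    }
  }
}
```

Skipping invalid lines rather than throwing is a design choice in this sketch: it trades strictness for the ability to still fetch the page content, which is the concern raised by the reporter.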