[ 
https://issues.apache.org/jira/browse/NUTCH-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508096#comment-16508096
 ] 

ASF GitHub Bot commented on NUTCH-2576:
---------------------------------------

sebastian-nagel commented on issue #328: NUTCH-2576 HTTP protocol 
implementation based on okhttp
URL: https://github.com/apache/nutch/pull/328#issuecomment-396253528
 
 
   All TODOs are addressed in the last commits. The unit tests for NUTCH-2549 
are ported now. Some of them fail and are ignored for now. This seems acceptable because 
okhttp only supports HTTP/1.1 and HTTP/2 [but not 
HTTP/1.0](http://square.github.io/okhttp/3.x/okhttp/okhttp3/OkHttpClient.Builder.html#protocols-java.util.List-).
   - no HTTP status line (that conforms only to HTTP/0.9)
   - no multi-line headers (deprecated with HTTP/1.1)
   - the HTTP status line also needs to be syntactically correct
   - ignoring errors when reading a non-200 payload: the exception could 
eventually be caught
   ```
   % grep -A1 -i testcase build/protocol-okhttp/test/TEST-org.apache.nutch.protocol.okhttp.TestBadServerResponses.txt
   Testcase: testBadHttpServer took 0.444 sec
   Testcase: testNoStatusLine took 0.083 sec
           Caused an ERROR
   --
   Testcase: testOverlongHeader took 0.116 sec
   Testcase: testContentLengthNotANumber took 0.13 sec
   Testcase: testHeaderSpellChecking took 0.101 sec
   Testcase: testMultiLineHeader took 0.074 sec
           FAILED
   --
   Testcase: testHeaderWithColon took 0.103 sec
           Caused an ERROR
   --
   Testcase: testChunkedContent took 0.09 sec
   Testcase: testRequestNotStartingWithSlash took 0.077 sec
   Testcase: testIgnoreErrorInRedirectPayload took 0.073 sec
           Caused an ERROR
   ```
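
To make the first three restrictions above concrete, here is a minimal sketch of the two syntax checks a strict HTTP/1.1 parser applies. It is not okhttp's parser and not code from the pull request: the class `StatusLineCheck` and both method names are hypothetical, and the grammar is taken from RFC 7230 (status line in section 3.1.2, obsolete line folding in section 3.2.4).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Nutch or okhttp.
final class StatusLineCheck {

    // RFC 7230, section 3.1.2:
    //   status-line = HTTP-version SP status-code SP reason-phrase
    // The reason phrase may be empty, but the space before it is required.
    private static final Pattern STATUS_LINE =
        Pattern.compile("HTTP/\\d\\.\\d \\d{3} [\\t \\x21-\\x7E\\x80-\\xFF]*");

    // True if the first line of a response is a syntactically valid status line.
    static boolean isValidStatusLine(String line) {
        return line != null && STATUS_LINE.matcher(line).matches();
    }

    // True if a header line starts with a space or tab, i.e. it continues the
    // previous header ("multi-line header", obsolete since RFC 7230).
    static boolean isFoldedHeaderLine(String line) {
        return line != null && !line.isEmpty()
            && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
    }

    public static void main(String[] args) {
        System.out.println(isValidStatusLine("HTTP/1.1 200 OK"));
        System.out.println(isValidStatusLine("<html>body without status line</html>"));
        System.out.println(isFoldedHeaderLine("\tcontinued header value"));
    }
}
```

A response whose first line fails the first check corresponds to the "no status line" case, and a header line that passes the second check is a multi-line header. How okhttp itself reacts to such input is only indicated by the test output above.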

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> HTTP protocol plugin based on okhttp
> ------------------------------------
>
>                 Key: NUTCH-2576
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2576
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin, protocol
>            Reporter: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.15
>
>
> [Okhttp|http://square.github.io/okhttp/] is an Apache2-licensed http library 
> which supports HTTP/2. [~jnioche]'s implementation 
> [storm-crawler#443|https://github.com/DigitalPebble/storm-crawler/issues/443] 
> proves that it should be straightforward to implement a Nutch protocol plugin 
> using okhttp. A recent HTTP protocol implementation should also fix (most of) 
> the issues reported in NUTCH-2549.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
