[
https://issues.apache.org/jira/browse/NUTCH-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100818#comment-14100818
]
Phu Kieu edited comment on NUTCH-1825 at 8/18/14 4:48 PM:
----------------------------------------------------------
Attached is a node.js proxy that reproduces the problem. It happens with
any resource that has a Content-Length header.
I'm not sure of the exact mechanism, but my guess is that node.js keeps
the connection open. In any case, protocol-http should be able to handle
this.
This may be related to NUTCH-1342.
was (Author: pkieu):
Attached is a node.js proxy that reproduces the problem. It happens with
any resource that has a Content-Length header.
I'm not sure of the exact mechanism, but my guess is that node.js keeps
the connection open. In any case, protocol-http should be able to handle
this.
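To illustrate the guess above: if a client reads the body until end-of-stream, it will block for as long as the server keeps the connection open, even after every byte announced by Content-Length has arrived. A minimal sketch of the alternative, which stops reading once Content-Length bytes have been received, might look like the following. The class and method names (BoundedBodyReader, readBody) are hypothetical and for illustration only; this is not the code in Nutch's HttpResponse or in the attached patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Nutch: reads a response body whose
// length is known from the Content-Length header.
final class BoundedBodyReader {

    /**
     * Reads at most contentLength bytes from in and then returns,
     * without waiting for the peer to close the connection.
     */
    static byte[] readBody(InputStream in, int contentLength) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = contentLength;
        while (remaining > 0) {
            // Never ask for more than is still owed, so read() is not
            // called again once the announced body is complete.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // peer closed early; return what was received
            }
            body.write(buffer, 0, n);
            remaining -= n;
        }
        return body.toByteArray();
    }
}
```

With a kept-alive connection, the loop exits as soon as remaining reaches zero, so no further read() is issued against a socket that has nothing more to send. Setting a read timeout on the socket is a separate safeguard and is not shown here.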
> protocol-http may hang for certain web pages
> --------------------------------------------
>
> Key: NUTCH-1825
> URL: https://issues.apache.org/jira/browse/NUTCH-1825
> Project: Nutch
> Issue Type: Bug
> Components: protocol
> Affects Versions: 1.9
> Reporter: Phu Kieu
> Priority: Minor
> Attachments: HttpResponse.java.patch, proxy.js
>
>
> There is a rare case where protocol-http will wait for data even when all the
> data has been sent.
> Patch is attached; please test and confirm.