[
https://issues.apache.org/jira/browse/NUTCH-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471213#comment-16471213
]
Hudson commented on NUTCH-2575:
-------------------------------
SUCCESS: Integrated in Jenkins build Nutch-trunk #3524 (See
[https://builds.apache.org/job/Nutch-trunk/3524/])
NUTCH-2575 Storing total number of bytes read after every chunk
(omkarreddy2008:
[https://github.com/apache/nutch/commit/b541de8ff20b818667e2765664ae2f133b439dc3])
* (edit)
src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java
> protocol-http does not respect the maximum content-size for chunked responses
> -----------------------------------------------------------------------------
>
> Key: NUTCH-2575
> URL: https://issues.apache.org/jira/browse/NUTCH-2575
> Project: Nutch
> Issue Type: Sub-task
> Components: protocol
> Affects Versions: 1.14
> Reporter: Gerard Bouchar
> Priority: Critical
> Fix For: 1.15
>
>
> There is a bug in HttpResponse::readChunkedContent that prevents it from
> stopping once the content read exceeds the maximum allowed size.
> There [is a variable
> contentBytesRead|https://github.com/apache/nutch/blob/master/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java#L404]
> that is used to check how much content has been read, but it is never
> updated, so it always stays at zero, and [the size
> check|https://github.com/apache/nutch/blob/master/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java#L440-L442]
> always evaluates to false (unless a single chunk is larger than the maximum
> allowed content size).
> This allows any server to cause out-of-memory errors on our side.
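
The failure mode described above can be illustrated with a minimal sketch.
This is not the Nutch code: the class ChunkLimitSketch, the method
readChunks, and the convention that a negative limit means "no limit" are
all assumptions made for illustration, and the chunks are passed in already
decoded rather than parsed from a socket. It only shows why the running
total has to be updated after every chunk for the limit to hold across
chunks.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the actual HttpResponse implementation.
class ChunkLimitSketch {

  /**
   * Concatenates already-decoded chunks and stops once maxContent bytes
   * have been stored. A negative maxContent disables the limit
   * (an assumption of this sketch).
   */
  static byte[] readChunks(List<byte[]> chunks, int maxContent) {
    ByteArrayOutputStream content = new ByteArrayOutputStream();
    int contentBytesRead = 0; // running total across all chunks

    for (byte[] chunk : chunks) {
      int toCopy = chunk.length;
      if (maxContent >= 0 && contentBytesRead + toCopy > maxContent) {
        toCopy = maxContent - contentBytesRead; // keep only what still fits
      }
      content.write(chunk, 0, toCopy);

      // Without this update the total stays at 0, so the check above only
      // fires when a single chunk is larger than maxContent.
      contentBytesRead += toCopy;

      if (maxContent >= 0 && contentBytesRead >= maxContent) {
        break; // limit reached: stop consuming further chunks
      }
    }
    return content.toByteArray();
  }

  public static void main(String[] args) {
    List<byte[]> chunks = Arrays.asList(new byte[4], new byte[4], new byte[4]);
    System.out.println(readChunks(chunks, 10).length); // prints 10
    System.out.println(readChunks(chunks, -1).length); // prints 12
  }
}
```

In the example, no single 4-byte chunk exceeds the 10-byte limit, so a
per-chunk comparison alone would store all 12 bytes. With the cumulative
counter, the third chunk is truncated and reading stops.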
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)