[jira] [Updated] (NUTCH-2575) protocol-http does not respect the maximum content-size for chunked responses

2018-05-10 Thread Sebastian Nagel (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-2575:
---
Fix Version/s: 1.15

> protocol-http does not respect the maximum content-size for chunked responses
> -
>
> Key: NUTCH-2575
> URL: https://issues.apache.org/jira/browse/NUTCH-2575
> Project: Nutch
> Issue Type: Sub-task
> Components: protocol
> Affects Versions: 1.14
> Reporter: Gerard Bouchar
> Priority: Critical
> Fix For: 1.15
>
>
> There is a bug in HttpResponse::readChunkedContent that prevents it from 
> stopping once the content read exceeds the maximum allowed size.
> There [is a variable 
> contentBytesRead|https://github.com/apache/nutch/blob/master/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java#L404]
>  that is used to check how much content has been read, but it is never 
> updated, so it always stays at its initial value of 0, and [the size 
> check|https://github.com/apache/nutch/blob/master/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java#L440-L442]
>  always evaluates to false (unless a single chunk is larger than the maximum 
> allowed content size).
> This allows any server to cause out-of-memory errors on our side.
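
The fix the description points to can be illustrated with a minimal,
self-contained sketch. This is not the Nutch code and not the committed
patch: the class ChunkedLimitSketch, the methods readLine and readChunked,
and the parameter maxContent are hypothetical names chosen for illustration.
Only the counter name contentBytesRead is taken from the report. The sketch
assumes a well-formed chunked body and ignores trailers. The point is that
the counter is incremented after every read, so the limit applies to the
whole response rather than to each chunk separately.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedLimitSketch {

  /** Reads one line up to LF and returns it without CR/LF. */
  static String readLine(InputStream in) throws IOException {
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = in.read()) != -1 && c != '\n') {
      if (c != '\r') {
        sb.append((char) c);
      }
    }
    return sb.toString();
  }

  /**
   * Decodes a chunked body, keeping at most maxContent bytes.
   * A negative maxContent means "no limit" (an assumption of this sketch).
   */
  static byte[] readChunked(InputStream in, int maxContent) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    int contentBytesRead = 0; // total payload bytes kept so far

    while (true) {
      String sizeLine = readLine(in);
      int semi = sizeLine.indexOf(';'); // drop chunk extensions, if any
      if (semi >= 0) {
        sizeLine = sizeLine.substring(0, semi);
      }
      int remaining = Integer.parseInt(sizeLine.trim(), 16);
      if (remaining == 0) {
        break; // last chunk
      }
      while (remaining > 0) {
        int allowed = maxContent < 0
            ? remaining
            : Math.min(remaining, maxContent - contentBytesRead);
        if (allowed <= 0) {
          return out.toByteArray(); // limit reached: stop reading
        }
        int n = in.read(buf, 0, Math.min(buf.length, allowed));
        if (n == -1) {
          throw new IOException("unexpected end of stream inside chunk");
        }
        out.write(buf, 0, n);
        contentBytesRead += n; // the update the report says is missing
        remaining -= n;
      }
      readLine(in); // consume the CRLF that follows the chunk data
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] wire = "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
        .getBytes(StandardCharsets.US_ASCII);
    byte[] limited = readChunked(new ByteArrayInputStream(wire), 8);
    // prints "hello wo": 5 bytes from the first chunk, 3 from the second
    System.out.println(new String(limited, StandardCharsets.US_ASCII));
  }
}
```

Without the line that increments contentBytesRead, the limit would only
ever be compared against the size of the current chunk, which matches the
behaviour described above: a server sending many small chunks is never cut
off.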



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NUTCH-2575) protocol-http does not respect the maximum content-size for chunked responses

2018-05-10 Thread Sebastian Nagel (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-2575:
---
Affects Version/s: 1.14

> protocol-http does not respect the maximum content-size for chunked responses
> -
>
> Key: NUTCH-2575
> URL: https://issues.apache.org/jira/browse/NUTCH-2575
> Project: Nutch
> Issue Type: Sub-task
> Affects Versions: 1.14
> Reporter: Gerard Bouchar
> Priority: Critical





[jira] [Updated] (NUTCH-2575) protocol-http does not respect the maximum content-size for chunked responses

2018-05-10 Thread Sebastian Nagel (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel updated NUTCH-2575:
---
Summary: protocol-http does not respect the maximum content-size for 
chunked responses  (was: protocol-http does not respect the maximum 
content-size)

> protocol-http does not respect the maximum content-size for chunked responses
> -
>
> Key: NUTCH-2575
> URL: https://issues.apache.org/jira/browse/NUTCH-2575
> Project: Nutch
> Issue Type: Sub-task
> Reporter: Gerard Bouchar
> Priority: Critical


