[ https://issues.apache.org/jira/browse/NUTCH-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920360#comment-16920360 ]
Hudson commented on NUTCH-2729:
-------------------------------
SUCCESS: Integrated in Jenkins build Nutch-trunk #3639 (See
[https://builds.apache.org/job/Nutch-trunk/3639/])
NUTCH-2729 protocol-okhttp: fix marking of truncated content (snagel:
[https://github.com/apache/nutch/commit/a82a663881fc3bb05c6e8bf6bd80fae22e36a069])
* (edit) src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/OkHttpResponse.java
* (edit) src/plugin/protocol-okhttp/src/test/org/apache/nutch/protocol/okhttp/TestBadServerResponses.java
NUTCH-2729 protocol-okhttp: fix marking of truncated content - avoid (snagel:
[https://github.com/apache/nutch/commit/5c45172bc86f30e24c6dfe126b10274cd9f799bf])
* (edit) src/plugin/protocol-okhttp/src/test/org/apache/nutch/protocol/okhttp/TestBadServerResponses.java
* (edit) src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/OkHttpResponse.java
NUTCH-2729 protocol-okhttp: fix marking of truncated content - log (snagel:
[https://github.com/apache/nutch/commit/efcafb67b43938f70354ca57e816c708b92815fe])
* (edit) src/plugin/protocol-okhttp/src/test/org/apache/nutch/protocol/okhttp/TestBadServerResponses.java
* (edit) src/plugin/protocol-okhttp/src/java/org/apache/nutch/protocol/okhttp/OkHttpResponse.java
> protocol-okhttp: fix marking of truncated content
> -------------------------------------------------
>
> Key: NUTCH-2729
> URL: https://issues.apache.org/jira/browse/NUTCH-2729
> Project: Nutch
> Issue Type: Bug
> Components: plugin, protocol
> Affects Versions: 1.15
> Reporter: Sebastian Nagel
> Assignee: Sebastian Nagel
> Priority: Minor
> Fix For: 1.16
>
>
> The plugin protocol-okhttp marks content as "truncated" and records the reason
> for the truncation: content limit exceeded, time limit exceeded, or network
> disconnect during fetch.
> The detection of truncation by content limit has one bug: if the fetched
> content is exactly the size of the content limit, the loop that requests more
> content exits. It should instead continue and request one more byte to
> reliably detect whether the content is truncated.
> Note that the Content-Length header cannot be used to determine truncation
> reliably: it does not indicate the real content length for compressed or
> chunked content, and it might be wrong.
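The "request one byte more" idea from the description can be illustrated with a minimal sketch. This is not the code from OkHttpResponse.java; the names TruncationSketch, LimitedContent, readLimited and maxContent are hypothetical and chosen only for illustration. The sketch assumes a plain InputStream and a non-negative limit: it tries to read up to maxContent + 1 bytes, keeps at most maxContent, and reports truncation only if the extra byte actually arrived.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical illustration, not the Nutch implementation.
final class TruncationSketch {

  /** Kept bytes plus a flag telling whether the source had more data. */
  static final class LimitedContent {
    final byte[] bytes;
    final boolean truncated;

    LimitedContent(byte[] bytes, boolean truncated) {
      this.bytes = bytes;
      this.truncated = truncated;
    }
  }

  /**
   * Reads at most maxContent bytes from the stream (assumes maxContent >= 0).
   * The loop asks for one byte beyond the limit: only if that byte arrives
   * is the content known to be longer than the limit.
   */
  static LimitedContent readLimited(InputStream in, int maxContent)
      throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    // long avoids overflow if maxContent is Integer.MAX_VALUE
    long wanted = (long) maxContent + 1;
    while (out.size() < wanted) {
      int len = (int) Math.min(buf.length, wanted - out.size());
      int n = in.read(buf, 0, len);
      if (n == -1) {
        break; // end of stream reached before the extra byte
      }
      out.write(buf, 0, n);
    }
    byte[] all = out.toByteArray();
    if (all.length > maxContent) {
      // The extra byte was read: drop it and flag the content as truncated.
      return new LimitedContent(Arrays.copyOf(all, maxContent), true);
    }
    // Content is shorter than or exactly as long as the limit: complete.
    return new LimitedContent(all, false);
  }
}
```

With a limit of 10, a 10-byte stream comes back complete and unflagged, while an 11-byte stream comes back as 10 bytes flagged as truncated. Stopping the loop at exactly maxContent bytes could not tell these two cases apart, and the sketch deliberately does not consult Content-Length.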