-- Forwarded message --
From: Simone Frenzel psimon...@googlemail.com
Date: 2011/8/22
Subject: Patch for httpResponse
To: dev-subscr...@nutch.apache.org
Hi,
I tested Nutch on different web pages. For short gzip-compressed pages it
throws an IOException:
java.io.IOException: unzipBestEffort returned null
2011-08-19 17:06:55,190 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.processGzipEncoded(HttpBase.java:310)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.HttpResponse.init(HttpResponse.java:163)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:138)
2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:628)
A small change to HttpResponse solves the problem. There are no longer
any problems with zipped pages, Basic Auth and zipped pages ...
The patch is attached.
Greetings and thanks
Index: trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java
===
--- trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java (revision 1160266)
+++ trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java (working copy)
@@ -124,7 +124,7 @@
         int totalRead = 0;
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         while ((bufferFilled = in.read(buffer, 0, buffer.length)) != -1
-            && totalRead + bufferFilled < contentLength) {
+            && totalRead + bufferFilled <= contentLength) {
           totalRead += bufferFilled;
           out.write(buffer, 0, bufferFilled);
         }
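
To illustrate why the comparison matters, here is a minimal standalone
sketch of the same read loop. It is not the Nutch code itself: the class
BoundedReadDemo, the helper readUpTo and the "inclusive" flag are
hypothetical names used only for this example, and it assumes that
contentLength equals the exact size of the body. Under that assumption the
strict comparison discards the final chunk whenever that chunk brings the
total to exactly contentLength, which would presumably leave a truncated
gzip stream for the unzip step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BoundedReadDemo {

    // Hypothetical helper mirroring the loop in the patch.
    // inclusive == false uses "<" (before the patch), true uses "<=" (after).
    static byte[] readUpTo(InputStream in, int contentLength,
                           int bufferSize, boolean inclusive) {
        byte[] buffer = new byte[bufferSize];
        int bufferFilled;
        int totalRead = 0;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            while ((bufferFilled = in.read(buffer, 0, buffer.length)) != -1) {
                int next = totalRead + bufferFilled;
                boolean fits = inclusive ? next <= contentLength
                                         : next < contentLength;
                if (!fits) {
                    break; // this chunk is dropped, as in the original loop
                }
                totalRead += bufferFilled;
                out.write(buffer, 0, bufferFilled);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Assumption: a 100-byte body and contentLength == 100.
        byte[] body = new byte[100];
        int strict = readUpTo(new ByteArrayInputStream(body), 100, 64, false).length;
        int inclusive = readUpTo(new ByteArrayInputStream(body), 100, 64, true).length;
        System.out.println("strict (<):     " + strict + " bytes");    // 64
        System.out.println("inclusive (<=): " + inclusive + " bytes"); // 100
    }
}
```

With a 64-byte buffer the body arrives as chunks of 64 and 36 bytes. The
strict version rejects the second chunk because 64 + 36 is not less than
100, so only 64 bytes are kept; the inclusive version keeps all 100.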