When http.content.limit is set to -1 and Response.CONTENT_ENCODING is gzip or
x-gzip, nothing can be fetched.
---------------------------------------------------------------------------------------------------------------------

                 Key: NUTCH-374
                 URL: http://issues.apache.org/jira/browse/NUTCH-374
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 0.8.1, 0.8
            Reporter: King Kong


I set "http.content.limit" to -1 so that fetched content is not truncated.
However, if the response uses gzip or x-gzip encoding, the content cannot be uncompressed.

I found that the problem is in HttpBase.processGzipEncoded (plugin lib-http):
    ...
    byte[] content = GZIPUtils.unzipBestEffort(compressed, getMaxContent());
    ...
This call does not treat -1 as "no limit", so the code must be modified to handle that case:

    byte[] content;
    if (getMaxContent()>=0){
        content = GZIPUtils.unzipBestEffort(compressed, getMaxContent());
    }else{
        content = GZIPUtils.unzipBestEffort(compressed);
    }
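
To illustrate the idea behind the change above, here is a minimal, self-contained sketch that uses only java.util.zip. It is not the Nutch implementation: the class GzipLimitSketch and the helpers gunzip and gzip are hypothetical names introduced for this example, and the only convention taken from the report is that a negative limit means "do not truncate".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch class; not part of Nutch or lib-http.
class GzipLimitSketch {

    // Decompress gzip data. A negative limit means "no limit";
    // otherwise at most `limit` bytes of output are returned.
    static byte[] gunzip(byte[] compressed, int limit) throws IOException {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int written = 0;
            while (limit < 0 || written < limit) {
                // Never request more bytes than the remaining budget allows.
                int want = (limit < 0)
                    ? buf.length
                    : Math.min(buf.length, limit - written);
                int n = in.read(buf, 0, want);
                if (n == -1) {
                    break; // end of the compressed stream
                }
                out.write(buf, 0, n);
                written += n;
            }
            return out.toByteArray();
        }
    }

    // Helper used only to produce gzip input for trying out gunzip.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }
}
```

Handling the negative case inside the helper keeps the caller free of branching, whereas the patch above branches at the call site and leaves GZIPUtils unchanged. Either placement would presumably address the reported behaviour.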




        
