Hi all,

if (http.getMaxContent() >= 0
      && contentLength > http.getMaxContent())   // limit download size
      contentLength  = http.getMaxContent();
.......
    for (int i = in.read(bytes); i != -1 && length + i <= contentLength;
         i = in.read(bytes)) {
      out.write(bytes, 0, i);
      length += i;
    }

So Nutch works like this: if "http.content.limit < contentLength", the
content is truncated to the limit. Then, if isTruncated() is true in
ParserJob, the content is not parsed.

Why is the content read at all in that case? It will be truncated and then
never parsed, so the download is wasted. I think we should skip the fetch if
"http.content.limit < contentLength".
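
To make the idea concrete, here is a rough sketch of the check, not a patch
against the real class. ContentLimitSketch, exceedsLimit and
readUnlessTooLarge are hypothetical names that do not exist in Nutch, and the
sketch assumes contentLength comes from a Content-Length header the server
actually sent. The read loop is the same one quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch only: none of these names are part of Nutch.
final class ContentLimitSketch {

  /** True when the limit is enabled (>= 0) and the declared length exceeds it. */
  static boolean exceedsLimit(long contentLength, int maxContent) {
    return maxContent >= 0 && contentLength > maxContent;
  }

  /**
   * Returns the body, or null when the declared length exceeds the limit.
   * In the null case nothing is read from the stream.
   */
  static byte[] readUnlessTooLarge(InputStream in, long contentLength, int maxContent)
      throws IOException {
    if (exceedsLimit(contentLength, maxContent)) {
      // Assumption: truncated content would be rejected by the parser anyway,
      // so we skip the download instead of reading and truncating.
      return null;
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] bytes = new byte[4096];
    long length = 0;
    // Same loop shape as the existing code: stop at end of stream or
    // once the next chunk would go past contentLength.
    for (int i = in.read(bytes); i != -1 && length + i <= contentLength;
         i = in.read(bytes)) {
      out.write(bytes, 0, i);
      length += i;
    }
    return out.toByteArray();
  }
}
```

With a limit of 1000, a response declaring 2000 bytes would return null
without touching the stream, while a 5-byte response would be read in full.
How the caller should record a skipped page (status, retry policy) is left
open here.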

If you agree, I can implement a patch.
