[ https://issues.apache.org/jira/browse/NUTCH-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doğacan Güney closed NUTCH-560.
-------------------------------
Resolution: Fixed
Fix Version/s: 1.0.0
Assignee: Doğacan Güney
Fixed as part of NUTCH-559.
> protocol-httpclient reading more bytes than http.content.limit
> --------------------------------------------------------------
>
> Key: NUTCH-560
> URL: https://issues.apache.org/jira/browse/NUTCH-560
> Project: Nutch
> Issue Type: Bug
> Components: fetcher
> Affects Versions: 0.9.0, 1.0.0
> Reporter: Joseph M.
> Assignee: Doğacan Güney
> Fix For: 1.0.0
>
>
> I modified protocol-httpclient's HttpResponse.java to download files to the
> file system. If I set http.content.limit to 5000, it fetches around 5500 to
> 6000 bytes instead and writes them to the file system. There is a calculation
> mistake in the calculateTryToRead() function.
> {code}
> int tryAndRead = calculateTryToRead(totalRead);
> while ((bufferFilled = in.read(buffer, 0, buffer.length)) != -1
>     && tryAndRead > 0) {
>   totalRead += bufferFilled;
>   out.write(buffer, 0, bufferFilled);
>   tryAndRead = calculateTryToRead(totalRead);
> }
> {code}
> The while loop stops only when calculateTryToRead() returns a negative value or 0.
> {code}
> private int calculateTryToRead(int totalRead) {
>   int tryToRead = Http.BUFFER_SIZE;
>   if (http.getMaxContent() <= 0) {
>     return http.BUFFER_SIZE;
>   } else if (http.getMaxContent() - totalRead < http.BUFFER_SIZE) {
>     tryToRead = http.getMaxContent() - totalRead;
>   }
>   return tryToRead;
> }
> {code}
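> It returns a negative value only once totalRead already exceeds
> http.getMaxContent(). Because in.read() always requests buffer.length bytes
> regardless of tryAndRead, more bytes than http.content.limit are read before
> the while loop breaks.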
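
The overshoot described above can be avoided by passing the remaining byte budget to read() instead of buffer.length. The following is a minimal sketch of that idea, not the change actually committed under NUTCH-559. The class BoundedRead, the method readLimited, and the BUFFER_SIZE value of 1024 are hypothetical stand-ins for HttpResponse, its read loop, and Http.BUFFER_SIZE.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedRead {
    // Hypothetical stand-in for Http.BUFFER_SIZE; the real value is not stated in the report.
    static final int BUFFER_SIZE = 1024;

    /**
     * Copies at most maxContent bytes from in.
     * As in calculateTryToRead(), maxContent <= 0 is treated as "no limit".
     */
    public static byte[] readLimited(InputStream in, int maxContent) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[BUFFER_SIZE];
        int totalRead = 0;
        while (true) {
            int tryToRead = buffer.length;
            if (maxContent > 0) {
                // Never request more than the bytes still allowed by the limit.
                tryToRead = Math.min(buffer.length, maxContent - totalRead);
                if (tryToRead <= 0) {
                    break; // limit reached, stop before reading anything further
                }
            }
            int bufferFilled = in.read(buffer, 0, tryToRead);
            if (bufferFilled == -1) {
                break; // end of stream
            }
            out.write(buffer, 0, bufferFilled);
            totalRead += bufferFilled;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = new byte[6000]; // simulated response body
        byte[] limited = readLimited(new ByteArrayInputStream(body), 5000);
        System.out.println(limited.length); // prints 5000
    }
}
```

The limit is checked before each read rather than after it, so totalRead can never exceed maxContent, whatever the buffer size.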