[ 
https://issues.apache.org/jira/browse/NUTCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769338#comment-13769338
 ] 

Talat UYARER commented on NUTCH-1086:
-------------------------------------

Hi Markus,

Yes, I know that HttpClient is still in development as part of Apache 
HttpComponents. The second comment is very useful information for me. Actually, 
I asked that question because I found a small bug in protocol-http: even if 
http.content.limit is set, protocol-http fetches files of all sizes 
(larger files are fetched up to the limit and then truncated). 
At parse time, however, the parser skips truncated files (when 
parser.skip.truncated is enabled). Partially fetching content larger than the 
limit seems like wasted effort if it is not going to be parsed.
What do you think about this? I will upload a patch for this issue.
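
To illustrate the idea, here is a minimal sketch of the kind of check described above. It is not the actual patch and not existing Nutch code: the class ContentLimitCheck and the method shouldSkipFetch are hypothetical names, and the sketch assumes the server reports a Content-Length, that a negative http.content.limit means "no limit", and that the decision is made before the body is read.

```java
/**
 * Hypothetical helper (not part of Nutch): decides whether fetching a
 * response body is pointless because the result would be truncated and
 * the parser would skip it anyway.
 */
public class ContentLimitCheck {

    /**
     * @param contentLength value of the Content-Length header, or -1 if unknown
     * @param contentLimit  value of http.content.limit (assumed: negative = no limit)
     * @param skipTruncated value of parser.skip.truncated
     * @return true if the body should not be fetched at all
     */
    public static boolean shouldSkipFetch(long contentLength,
                                          int contentLimit,
                                          boolean skipTruncated) {
        if (!skipTruncated) {
            return false; // truncated content would still be parsed
        }
        if (contentLimit < 0) {
            return false; // no limit configured, nothing gets truncated
        }
        if (contentLength < 0) {
            return false; // length unknown, cannot decide up front
        }
        // The body would be cut at the limit and later skipped by the parser.
        return contentLength > contentLimit;
    }

    public static void main(String[] args) {
        // 200000-byte file, 65536-byte limit, parser skips truncated content
        System.out.println(shouldSkipFetch(200_000L, 65536, true));
        // Small file that fits within the limit
        System.out.println(shouldSkipFetch(1_000L, 65536, true));
    }
}
```

When Content-Length is missing (for example with chunked transfer encoding), a check like this cannot decide in advance, so the existing behaviour of fetching up to the limit would still apply in that case.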
                
> Rewrite protocol-httpclient
> ---------------------------
>
>                 Key: NUTCH-1086
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1086
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: nutchgora, 1.5
>            Reporter: Markus Jelsma
>            Priority: Critical
>             Fix For: 2.4
>
>
> There are several issues about protocol-httpclient and several comments about 
> rewriting the plugin with the new http client libraries. There is, however, 
> not yet an issue for rewriting/reimplementing protocol-httpclient.
> http://hc.apache.org/httpcomponents-client-ga/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
