Hey Earl, the Nutch-84 enhancement suggestion in JIRA does just this. There is 
also support for request pipelining, which, rather unfortunately, isn't a good 
idea when working with dynamic sites.

Check out a previous post on this: 
http://marc.theaimsgroup.com/?l=nutch-developers&m=112476980602585&w=2
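
For illustration only, here is a rough Python sketch of the group-by-host
idea with one reused HTTP/1.1 connection per host. It is not the Nutch-84
code; the names group_by_host and fetch_group are made up for this example,
and it only handles plain http (no https, redirects, or politeness delays).

```python
from collections import defaultdict
from urllib.parse import urlsplit
import http.client


def group_by_host(urls):
    """Bucket URLs by (scheme, host, port) so each bucket can share a connection."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # urlsplit lower-cases scheme and hostname; port is None if not given.
        groups[(parts.scheme, parts.hostname, parts.port)].append(url)
    return dict(groups)


def fetch_group(host, port, urls, timeout=10):
    """Fetch every URL in one host bucket over a single keep-alive connection.

    Hypothetical helper: plain http only, returns (url, status, body) tuples.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    results = []
    try:
        for url in urls:
            parts = urlsplit(url)
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            # The body must be read fully before the connection can be reused.
            body = resp.read()
            results.append((url, resp.status, body))
    finally:
        conn.close()
    return results
```

A crawl loop would then call fetch_group once per bucket returned by
group_by_host, so the handshake is paid once per host rather than once per
url.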

kelvin

On Fri, 16 Sep 2005 16:49:50 -0700 (PDT), Earl Cahill wrote:
> Maybe way ahead of me here, but it was just hitting me that it
> would be pretty cool to group urls to fetch by host and then
> perhaps use http 1.1 to reuse the connection and save initial
> handshaking overhead. Not a huge deal for a couple hits, but I
> think it would make sense for large crawls.
>
> Or maybe keep a pool of http connections to the last x sites open
> somewhere and check there first.
>
> Sound reasonable?  Already doing it?  I would be willing to help.
>
> Just a thought.
>
> Earl

