Hey Earl, the Nutch-84 enhancement suggestion in JIRA does just this. There is also support for request pipelining, which, rather unfortunately, isn't a good idea when working with dynamic sites.
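
For illustration only, here is a rough Python sketch of the general idea (group the URLs by host, then fetch each group over one persistent HTTP/1.1 connection). It is not the Nutch-84 code; the function names `group_by_host` and `fetch_group` are made up for this example, and it assumes plain `http://` URLs and does no pipelining: each response is read in full before the next request is sent.

```python
from collections import defaultdict
from http.client import HTTPConnection
from urllib.parse import urlsplit


def group_by_host(urls):
    """Bucket URLs by (host, port) so each bucket can share one connection."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # Assumption: plain http only, so the default port is 80.
        groups[(parts.hostname, parts.port or 80)].append(url)
    return dict(groups)


def fetch_group(host, port, urls, timeout=10):
    """Fetch every URL in one host bucket over a single keep-alive connection."""
    conn = HTTPConnection(host, port, timeout=timeout)
    results = {}
    try:
        for url in urls:
            parts = urlsplit(url)
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            # The body must be read completely before the connection is reused.
            results[url] = (resp.status, resp.read())
    finally:
        conn.close()
    return results
```

A crawl loop would then call `fetch_group(host, port, urls)` once per entry of `group_by_host(all_urls)`, so the TCP handshake is paid once per host rather than once per URL, as long as the server keeps the connection open.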
Check out a previous post on this:
http://marc.theaimsgroup.com/?l=nutch-developers&m=112476980602585&w=2

kelvin

On Fri, 16 Sep 2005 16:49:50 -0700 (PDT), Earl Cahill wrote:
> Maybe way ahead of me here, but it was just hitting me that it
> would be pretty cool to group urls to fetch by host and then
> perhaps use http 1.1 to reuse the connection and save initial
> handshaking overhead. Not a huge deal for a couple hits, but I
> think it would make sense for large crawls.
>
> Or maybe keep a pool of http connections to the last x sites open
> somewhere and check there first.
>
> Sound reasonable? Already doing it? I would be willing to help.
>
> Just a thought.
>
> Earl
