Maybe you're way ahead of me here, but it just hit me
that it would be pretty cool to group the URLs to fetch
by host and then use HTTP 1.1 keep-alive to reuse the
connection and save the initial handshaking overhead.
Not a huge deal for a couple of hits, but I think it
would make sense for large crawls.
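
Something like this rough sketch is what I have in mind
(class and method names are made up just to illustrate;
as far as I know the JDK's HttpURLConnection already does
the HTTP/1.1 keep-alive reuse under the hood, as long as
each response body is read fully and disconnect() is not
called):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.*;

    public class HostGroupedFetcher {

        // Group URLs by host so each host's pages can
        // share one persistent connection.
        static Map<String, List<URL>> groupByHost(List<URL> urls) {
            Map<String, List<URL>> byHost = new HashMap<>();
            for (URL u : urls) {
                byHost.computeIfAbsent(u.getHost(),
                        h -> new ArrayList<>()).add(u);
            }
            return byHost;
        }

        // Fetch one host's URLs in sequence. The JDK reuses
        // the keep-alive socket automatically, provided each
        // response body is fully consumed and closed.
        static void fetchGroup(List<URL> urls) throws Exception {
            byte[] buf = new byte[8192];
            for (URL u : urls) {
                HttpURLConnection conn =
                        (HttpURLConnection) u.openConnection();
                try (InputStream in = conn.getInputStream()) {
                    while (in.read(buf) != -1) { /* consume body */ }
                }
                // Do NOT call conn.disconnect() here; that closes
                // the socket and defeats keep-alive reuse.
            }
        }
    }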
Or maybe keep a pool of open HTTP connections to the
last x sites somewhere and check there first.
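
The pool could be as simple as an LRU map keyed by host,
e.g. built on LinkedHashMap (again just a sketch; the
connection type is left generic since I don't know what
Nutch uses internally):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // An LRU cache of open connections, keyed by host.
    // When the pool is full, the least-recently-used
    // connection is evicted (and should be closed).
    class ConnectionPool<C> extends LinkedHashMap<String, C> {
        private final int maxSize;

        ConnectionPool(int maxSize) {
            super(16, 0.75f, true); // accessOrder = true -> LRU
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, C> eldest) {
            if (size() > maxSize) {
                // close eldest.getValue() here before evicting
                return true;
            }
            return false;
        }
    }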
Sound reasonable? Already doing it? I would be
willing to help.
Just a thought.
Earl