Re: proposal: fetcher performance improvements

2008-12-11 Thread Andrzej Bialecki
Todd Lipcon wrote: The issue I see with decreasing max crawl delay is that it essentially blacklists those hosts. Even if I can only crawl these hosts 1/10th as fast, I'd still like to have them in my index. I guess this is where the hostdb will help once that jira is implemented, so this kind

Re: httpclient and cookies

2008-12-11 Thread Ryan Smith
One way is to enable debug logging in log4j so you can see the headers that httpclient is passing back and forth to the webserver. On Thu, Dec 11, 2008 at 10:29 AM, George Herlin [EMAIL PROTECTED] wrote: I have read that if one sets the plugin.includes property to use
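
A minimal sketch of the log4j change suggested above, assuming the plugin is built on Commons HttpClient 3.x (whose documented logger names are `httpclient.wire.header` and `org.apache.commons.httpclient`) and that the lines go into the log4j.properties file your Nutch install actually loads. Both assumptions should be checked against your setup; neither is confirmed in the thread.

```properties
# Assumption: Commons HttpClient 3.x logger names; adjust if your version differs.
# Log request and response header lines, including Cookie and Set-Cookie.
log4j.logger.httpclient.wire.header=DEBUG
# Log HttpClient's own internal activity (connection and state handling).
log4j.logger.org.apache.commons.httpclient=DEBUG
```

With this enabled, outgoing header lines should be marked with `>>` and incoming ones with `<<` in the log, so a missing `Cookie:` line on a follow-up request would suggest the cookie was not sent back.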

Re: httpclient and cookies

2008-12-11 Thread zhengsj03 User
You can print the request headers to verify the cookies. I have looked at the source code. You can add some code to the file src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpResponse.java. On Thu, 2008-12-11 at 16:29 +0100, George Herlin wrote: I have read that if one sets the