I'm using Nutch 1.11. The "plugin.includes" section of nutch-default.xml
still states that the protocol-httpclient plugin may present intermittent
problems. Is this still the case? What are the problems?

There doesn't appear to be any problem crawling HTTPS using the
protocol-http plugin. Why do I need to use protocol-httpclient for crawling
via HTTPS?

In short, I want to use the "correct" plugin because I am extending it to
perform a bit of extra work. "Correct" in this case means:
- Whichever of the two is "recommended"
- Whichever can crawl both HTTP and HTTPS connections
- Whichever performs better

Thanks,
Joe
