Chris Mattmann wrote:

Hi Michi,

Btw, wouldn't it make sense to add protocol-httpclient as default,
because I guess
I am not the only one trying to fetch pages using https?

Indeed. Some time ago, the powers that be decided that it probably made
sense to make protocol-httpclient the default. However, due to some
performance issues with the underlying Apache commons-httpclient library
(I think), it was decided to go with protocol-http, which turned out to be
much faster and more reliable, at the expense of not natively supporting
HTTPS.


OK. So what about adding a comment to nutch-site.xml, e.g.

<!-- NOTE: In order to use https please add protocol-httpclient, but be aware of possible performance problems! -->
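
For illustration, here is a minimal sketch of where such a note might sit in nutch-site.xml. It assumes the plugin list is set through the plugin.includes property and that protocol-httpclient takes the place of protocol-http. The plugin list is deliberately shortened and hypothetical; a real setup would also name its parse, index and query plugins.

```xml
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- NOTE: In order to use https please add protocol-httpclient,
         but be aware of possible performance problems! -->
    <!-- Illustrative value only: protocol-httpclient is assumed to
         replace protocol-http; other required plugins are omitted. -->
    <value>nutch-extensionpoints|protocol-file|protocol-httpclient</value>
  </property>
</configuration>
```

Whether protocol-http and protocol-httpclient can be listed side by side is not settled in this thread, so the sketch lists only one of them.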

Cheers

Michi

I wonder what the user community thinks about this now, though. Have the
issues with protocol-httpclient gone away, such that it makes sense to
enable it again?

Cheers,
 Chris

Thanks again

Michi

Thanks!

Cheers,
Chris



On 1/24/07 2:29 PM, "Michael Wechner" <[EMAIL PROTECTED]> wrote:



Hi

I am trying to fetch data from a website using https, and I have added

<value>nutch-extensionpoints|protocol-file|protocol-http|protocol-https

to nutch-site.xml

but still receive the following error

fetch of https://www.foo.bar/ failed with:
org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=https

Is there anything else one has to do?

I am using Nutch 0.8.x

Thanks

Michi








--
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
[EMAIL PROTECTED]                        [EMAIL PROTECTED]
+41 44 272 91 61
