Hello,
Nutch is not able to crawl this site. Are there any Nutch configuration changes
required for this site?
https://www.ich.org/
Thanks and Regards
Raj Chidara
Hello Markus
Sorry for the duplicate question. I added the selenium plugin in
conf/nutch-default.xml and included the following
plugin.includes
Yes, remove the other protocol-* plugins from the configuration. With all
three active it is not always determined which one is going to do the work.
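As a rough sketch of what that could look like, assuming the override goes in conf/nutch-site.xml (the usual place for local changes, rather than nutch-default.xml) and that protocol-selenium is the one protocol plugin you keep. The rest of the plugin list below is only illustrative and should be replaced with whatever your setup already uses:

```xml
<!-- conf/nutch-site.xml (assumed location for the override) -->
<property>
  <name>plugin.includes</name>
  <!-- Exactly one protocol-* entry; the remaining plugins are an
       illustrative assumption, keep your own list here. -->
  <value>protocol-selenium|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
</property>
```

With only one protocol-* plugin matched by the expression, there is no ambiguity about which plugin fetches the page.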
On Mon, 30 Jan 2023 at 12:50, Raj Chidara wrote:
>
> Hello Markus
> Sorry for duplicate question. I added selenium plugin in
>
Hello Raj,
I think the same question about the same site was asked here some time ago.
Anyway, this site loads its content via JavaScript. You will need a
protocol plugin that supports it, either protocol-htmlunit or
protocol-selenium, instead of protocol-http or any other.
Change the
Already unsubscribed. Why do I still get this email?
Thanks
Steven
On Mon, Jan 30, 2023 at 7:06 AM Markus Jelsma
wrote:
> Yes, remove the other protocol-* plugins from the configuration. With all
> three active it is not always determined which one is going to do the work.
>
> On Mon, 30 Jan.