Hi again,
Another issue has appeared with the introduction of the bidirectional URL
exemption filter.
Having
http://www.website.com/page1
and
http://website.com/page2
Before, as indexer output (let's say a text file), I had one
parent/host (www.website.com) with
Hi John,
the recent master has been upgraded to the new MapReduce API (NUTCH-2375).
It was a huge change that is already known to have introduced some issues.
For production, it's recommended to use 1.14 and patch it if necessary.
Could you open a new issue on
Hello,
I'm currently running Nutch under Amazon EMR 5.12.0 with Hadoop 2.8.3, using
S3 (EMRFS) as the filesystem. If I build the latest version from the
master branch and run a crawl in distributed mode, I get a fetcher error
like fetcher.Fetcher: Fetcher: java.lang.IllegalArgumentException: Wrong