This is a little embarrassing: I'm stuck on the very first step of the new-user installation.
I have apache-nutch-1.13-bin.tar on Ubuntu 16.04 with Oracle Java 8, and I am following the tutorial at wiki.apache.org/nutch/NutchTutorial (*). My urls/seed.txt file contains only http://www.hunchmanifest.com, but zero URLs get injected:

    $ nutch inject crawl/crawldb urls
    Injector: starting at 2017-07-21 16:17:18
    Injector: crawlDb: crawl/crawldb
    Injector: urlDir: urls
    Injector: Converting injected urls to crawl db entries.
    Injector: Total urls rejected by filters: 0
    Injector: Total urls injected after normalization and filtering: 0
    Injector: Total urls injected but already in CrawlDb: 0
    Injector: Total new urls injected: 0
    Injector: finished at 2017-07-21 16:17:20, elapsed: 00:00:01

I've tried other URLs, and none of them are excluded by the regex rules (the output above doesn't list any rejects either). What could be wrong with my installation?

There's nothing suspicious in the logs other than warnings for plugins not found:

    2017-07-21 14:38:49,966 INFO crawl.Injector - Injector: crawlDb: crawl/crawldb
    2017-07-21 14:38:49,966 INFO crawl.Injector - Injector: urlDir: urls
    2017-07-21 14:38:49,966 INFO crawl.Injector - Injector: Converting injected urls to crawl db entries.
    2017-07-21 14:38:50,047 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2017-07-21 14:38:50,762 WARN plugin.PluginRepository - Error while loading plugin `/opt/apache-nutch-1.13/plugins/parse-replace/plugin.xml` java.io.FileNotFoundException: /opt/apache-nutch-1.13/plugins/parse-replace/plugin.xml (No such file or directory)
    2017-07-21 14:38:50,775 WARN plugin.PluginRepository - Error while loading plugin `/opt/apache-nutch-1.13/plugins/plugin/plugin.xml` java.io.FileNotFoundException: /opt/apache-nutch-1.13/plugins/plugin/plugin.xml (No such file or directory)
    2017-07-21 14:38:50,791 WARN plugin.PluginRepository - Error while loading plugin `/opt/apache-nutch-1.13/plugins/publish-rabitmq/plugin.xml` java.io.FileNotFoundException: /opt/apache-nutch-1.13/plugins/publish-rabitmq/plugin.xml (No such file or directory)
    2017-07-21 14:38:50,861 WARN mapred.LocalJobRunner - job_local540893461_0001
    java.lang.Exception: java.lang.NullPointerException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)

That last error may be a catch-all, or it could be because there are zero URLs in the database. I get the same behavior if I run the crawler: the crawldb has no files, although the directories are created.

Could there be an environment variable needed? I am running from the nutch install directory with write permissions on all files. Is there something I've overlooked? Is there a -D debug switch I can use to gather more information?

(* Also, the content.rdf.u8.gz sample file cited in the NutchTutorial page no longer exists; DMOZ is shut down, and the archive site preserves the original link, which is now a 404.)
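For anyone trying to reproduce this, here is how the seed file itself can be ruled out as the cause. This is only a minimal sketch, not a Nutch feature: the function name count_seed_urls is hypothetical, and it assumes the seed file holds one URL per line, with blank lines and lines starting with # not counting as URLs.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch: count the lines in a seed file
# that look like http(s) URLs.
# Assumption: one URL per line; blank lines and '#' comment lines are skipped.
count_seed_urls() {
    # $1: path to the seed file, e.g. urls/seed.txt
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | grep -c -E '^https?://'
}
```

Calling `count_seed_urls urls/seed.txt` from the nutch install directory prints the number of candidate URLs. A count of 0 would point at the seed file rather than at the installation; a nonzero count, as expected here, suggests the problem lies elsewhere.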

