Hi everyone, I am having difficulty finding instructions I can follow to do the following. I feel I am missing something simple that I overlooked.
1. Crawl a website.
2. Index it in Solr.

For now, I am stuck on #1. I am running the following command from Cygwin (I am on Windows):

    bin/crawl -i urls/ TestCrawl/ 2

In my urls/seed.txt file I have the following entry:

    http://nutch.apache.org/

In my regex-urlfilter.txt I have the following entry:

    +^http://([a-z0-9]*\.)*nutch.apache.org/

I have set my JAVA_HOME environment variable and such.

The error I am getting is:

    $ bin/crawl -i urls/ TestCrawl/ 2
    Injecting seed URLs
    /cygdrive/c/apache-nutch-1.11/bin/nutch inject TestCrawl//crawldb urls/
    Injector: starting at 2016-06-08 11:04:45
    Injector: crawlDb: TestCrawl/crawldb
    Injector: urlDir: urls
    Injector: Converting injected urls to crawl db entries.
    Injector: java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
        at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:421)
        at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:281)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:348)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:323)
        at org.apache.nutch.crawl.Injector.run(Injector.java:379)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.Injector.main(Injector.java:369)
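
To rule out the URL filter as a cause, the filter entry above can be checked against the seed URL outside of Nutch. This is only a minimal sketch in Python, on the assumption that Python's `re` module treats this simple pattern the same way as the Java regex engine Nutch uses, and that the filter looks for the pattern anywhere in the URL (a "find", not a full match). The helper name `accepts` is hypothetical and not part of Nutch.

```python
import re

# Pattern copied from the regex-urlfilter.txt entry, without the leading "+".
# In the filter file, "+" marks a rule that accepts matching URLs.
PATTERN = re.compile(r"^http://([a-z0-9]*\.)*nutch.apache.org/")


def accepts(url: str) -> bool:
    """Hypothetical helper: True if the pattern is found in the URL.

    Assumption: a search ("find") is used rather than a full match.
    """
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    # The seed URL from urls/seed.txt, plus one URL that should be rejected.
    for url in ["http://nutch.apache.org/", "http://example.com/"]:
        print(url, accepts(url))
```

If the seed URL is accepted here, the filter entry is probably not what stops the inject step.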

