Re: Nutch 1.8 in pseudo dist error

2014-05-03 Thread Sebastian Nagel
Hi, it looks like the segment is not addressed properly: hdfs://localhost:54310/user/hduser/TestCrawl/segments/crawl_generate

Segments are named by a timestamp, e.g. .../TestCrawl/segments/20140502231126/, and crawl_generate is a subdirectory of the segment. Can you post the exact commands you used to run the crawler?
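To illustrate the naming rule above, here is a minimal sketch of a check that a path ends in a timestamp-named segment directory. The function name is_segment_path is hypothetical (it is not part of Nutch), and the check assumes segment names are exactly 14 digits, as in the example 20140502231126.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch: succeeds if the last path
# component looks like a segment name (assumed: 14 digits, yyyyMMddHHmmss).
is_segment_path() {
    name=$(basename "$1")          # basename also drops a trailing slash
    case "$name" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "${#name}" -eq 14 ]          # exactly 14 digits
}

base=hdfs://localhost:54310/user/hduser/TestCrawl/segments
for p in "$base/crawl_generate" "$base/20140502231126/"; do
    if is_segment_path "$p"; then
        echo "segment:       $p"
    else
        echo "not a segment: $p"
    fi
done
```

This only inspects the path string; on a running cluster, hadoop fs -ls on the segments directory would show which segment names actually exist.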

Re: Nutch 1.8 in pseudo dist error

2014-05-03 Thread BlackIce
Same as for Nutch 2.2.1 in pseudo-distributed mode: bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/ 10, run from within the deploy dir. However, I remember reading somewhere that deploy execution for the 1.x series is different from the 2.x series, and that some more files, besides the seed.txt, had to be