Running Nutch standalone (without Solr)

2013-06-12 Thread Peter Gaines
Hi there, Is it possible to run Nutch as a standalone crawler without integration with Solr? I need to do this in order to do a performance comparison of its raw crawling functionality. It seems like it may be possible using the bin/nutch crawl command, but this is now deprecated. Is there

RE: Running Nutch standalone (without Solr)

2013-06-12 Thread Markus Jelsma
Hi, Sure, you don't need to index the data and can use the individual commands or the new bin/crawl script. Cheers
-Original message- From: Peter Gaines pgai...@deveire.com Sent: Wed 12-Jun-2013 13:57 To: user@nutch.apache.org Subject: Running Nutch standalone (without Solr

Re: Running Nutch standalone (without Solr)

2013-06-12 Thread H. Coskun Gunduz
Hi Peter, Yes, it's possible. You'll need a data store (my personal recommendation is HBase). Depending on the Nutch version you use, you can follow these tutorials: Nutch 1.x: http://wiki.apache.org/nutch/NutchTutorial Nutch 2: http://wiki.apache.org/nutch/Nutch2Tutorial Happy crawling.
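
For readers following the thread, here is a rough sketch of what "the individual commands" could look like for one crawl round on Nutch 1.x, with no Solr indexing step. It is not taken from any of the messages above. The `NUTCH` variable, the `crawl_round` helper, the `crawl/` directory layout, and the `urls` seed directory are all assumptions made for illustration; the subcommands follow the Nutch 1.x tutorial linked above, and Nutch 2 uses different arguments.

```shell
#!/bin/sh
# Sketch only: one crawl round using the individual Nutch 1.x commands,
# skipping indexing entirely. NUTCH and crawl_round are hypothetical names.
NUTCH="${NUTCH:-bin/nutch}"

crawl_round() {
    crawldb="$1/crawldb"
    segments="$1/segments"

    # Select URLs due for fetching; this creates a new timestamped segment.
    "$NUTCH" generate "$crawldb" "$segments" || return 1

    # Pick the newest segment (names are timestamps, so they sort lexically).
    segment=$(ls -d "$segments"/2* | tail -n 1)

    # Fetch and parse the segment, then fold the results back into the crawldb.
    "$NUTCH" fetch "$segment" || return 1
    "$NUTCH" parse "$segment" || return 1
    "$NUTCH" updatedb "$crawldb" "$segment" || return 1
}

# Usage (not executed here; assumes a seed list in ./urls):
#   "$NUTCH" inject crawl/crawldb urls
#   crawl_round crawl
```

Calling `crawl_round` repeatedly deepens the crawl by one level each time. Since no index command is ever run, Solr is not involved, which should suit a raw crawling benchmark.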