Yep, after fetching and parsing the pages, you need to tell Nutch to
index the data in Solr, like:
./nutch solrindex http://localhost:8080/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
It's all explained in the wiki: http://wiki.apache.org/nutch/NutchTutorial
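As a rough sketch, the same command can be built from variables so the Solr URL and crawl directory are easy to change. SOLR_URL, CRAWL_DIR and solrindex_cmd are illustrative names only, not Nutch settings, and the function just prints the command rather than running it:

```shell
#!/bin/sh
# Sketch only: SOLR_URL and CRAWL_DIR are illustrative names, not Nutch settings.
# Defaults mirror the command quoted in this thread.
SOLR_URL="${SOLR_URL:-http://localhost:8080/solr/}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"

# Print the solrindex command line instead of running it,
# so it can be checked before anything is pushed to Solr.
# The * stays literal here because it is inside double quotes.
solrindex_cmd() {
  printf '%s\n' "./nutch solrindex $SOLR_URL $CRAWL_DIR/crawldb $CRAWL_DIR/linkdb $CRAWL_DIR/segments/*"
}

solrindex_cmd
```

Once the printed line looks right, it can be run from the directory containing the nutch script with eval "$(solrindex_cmd)", at which point the shell expands the segments wildcard.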
Best,
Elisabeth
On 27.09.2011 15:08, Bai Shen wrote:
I'm using Luke 3.3 and Nutch 1.3.
I didn't see any .fdt files. Are those created when you run the solrindex
command?
On Mon, Sep 26, 2011 at 10:11 AM, Elisabeth Adler wrote:
Which version of Luke and Nutch are you using? I had the same problem with
Luke 0.9 and Nutch 1.3 indices - I upgraded Luke to 3.3
(http://code.google.com/p/luke/) and it's working without problems now.
Btw, you need to select the directory "data/index" (containing .fdt and
more files).
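One way to locate such a directory is to search for Lucene's segments files. This is a minimal sketch, assuming a Lucene index directory holds a segments_N file (plus segments.gen in Lucene 3.x) next to files such as .fdt. find_lucene_indexes is a hypothetical helper, not part of Luke or Nutch:

```shell
#!/bin/sh
# Sketch only: find_lucene_indexes is a hypothetical helper, not part of Luke or Nutch.
# Assumption: a Lucene index directory contains a segments_N file
# (and segments.gen in Lucene 3.x) alongside files such as .fdt.
find_lucene_indexes() {
  root="${1:-.}"
  # Match the segments files, print their parent directory, and de-duplicate.
  find "$root" -type f \( -name 'segments_*' -o -name 'segments.gen' \) \
    -exec dirname {} \; | sort -u
}
```

Called with a starting directory as its only argument, it prints each candidate directory to open in Luke. If it prints nothing, there is probably no Lucene index under that directory yet.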
Hope this helps,
Elisabeth
On 26.09.2011 15:49, Bai Shen wrote:
So I used the tutorial to do some crawling with Nutch, and I've done
everything up to Step 4. I want to look at what I've indexed so far before
I import it into Solr, so I can make sure that everything is working
correctly.
But no matter which directory I use, Luke tells me that there's no valid
index. Do I need to run the solrindex command? And is there a way to do it
without pushing it to my Solr install?
Thanks.