Hi,

I'm using Nutch 1.7 and ElasticSearch (installed on the same CentOS machine). 

I was able to run a quick test crawl of nutch.apache.org with the 
"bin/nutch crawl …" command (using domain-urlfilter). The ElasticSearch indexer 
also runs successfully, but every entry in ElasticSearch has only the "segment", 
"digest", and "boost" fields in its "_source" object.  I would expect to see 
"content" as well, right?
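
To make the symptom concrete, here is a minimal sketch (Python, standard 
library only) of the kind of check that shows it. The sample response is made 
up to mirror the shape of a search response ("hits" -> "hits" -> "_source"); 
it is not real output from the index, and the helper names source_fields and 
missing_fields are only illustrative.

```python
import json

# Hypothetical sample shaped like an Elasticsearch search response.
# The ids and values are placeholders, not real output from the index.
SAMPLE_RESPONSE = json.loads("""
{
  "hits": {
    "hits": [
      {"_id": "doc-1", "_source": {"segment": "s1", "digest": "d1", "boost": "1.0"}},
      {"_id": "doc-2", "_source": {"segment": "s1", "digest": "d2", "boost": "1.0"}}
    ]
  }
}
""")


def source_fields(response):
    """Return the sorted union of field names found in the hits' _source objects."""
    fields = set()
    for hit in response.get("hits", {}).get("hits", []):
        fields.update(hit.get("_source", {}).keys())
    return sorted(fields)


def missing_fields(response, expected=("content",)):
    """Return the expected field names that appear in no hit's _source."""
    present = set(source_fields(response))
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    print(source_fields(SAMPLE_RESPONSE))
    print(missing_fields(SAMPLE_RESPONSE))
```

With a real response saved to a file, the same two functions could be applied 
to the parsed JSON instead of the sample.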

I know some content was parsed, because it shows up when running 
"bin/nutch readseg -get <seg> <url>".

I see that schema.xml is used as the mapping for Solr.  I think ElasticSearch 
doesn't need any predefined mapping, right?

The logs under $NUTCH_HOME/logs don't help: no errors, just one warning from 
NativeCodeLoader.

Am I missing some config somewhere?

(Yes, I'm new to Nutch and ElasticSearch.)

JM
