In my experience, the hardest (but most flexible) part is exactly what was
mentioned: processing the data.  Nutch does have a really easy plugin
interface that you can use, and the example plugin is a great place to
start.  Once you have the raw parsed text, you can do whatever you want
with it.  For example, I wrote a plugin to add geospatial information to my
NutchDocument.  You then map the fields you added to the NutchDocument to
the fields you want Solr to index.  In my case I created a geography
field where I put lat, lon info.  Then you add that same geography field
to the Nutch-to-Solr mapping file as well as to your Solr schema.xml file.
Then, when you run the crawl and tell it to use "solrindex", it will send the
documents to Solr to be indexed.  Since you have your new field in the
schema, Solr knows what to do with it at index time.  Now you can build a user
interface around what you want to do with that field.
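
To make the mapping step concrete, here is a rough sketch of the two config
entries for a field like mine.  Treat it as an illustration, not my exact
files: I am assuming the field is named "geography" on both sides, that the
mapping file is the conf/solrindex-mapping.xml that ships with Nutch 1.x, and
that a plain "string" type is good enough to hold the lat, lon value.  Pick
whatever field type suits your Solr version and your queries.

```xml
<!-- solrindex-mapping.xml: goes inside the existing <fields> element.
     "source" is the field name the plugin added to the NutchDocument,
     "dest" is the Solr field it is sent to (both names assumed here). -->
<field dest="geography" source="geography"/>
```

```xml
<!-- schema.xml: goes inside the existing <fields> element.
     The "string" type is an assumption for this sketch. -->
<field name="geography" type="string" indexed="true" stored="true"/>
```

If the names in the two files do not line up, Solr will most likely reject
the document or drop the field, depending on how your schema handles unknown
fields.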


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Newbie-need-a-point-in-the-right-direction-tp2031381p2033687.html
Sent from the Solr - User mailing list archive at Nabble.com.
