Hi:
I am a newcomer to the Solr/Nutch community and I have some questions.
I was able to hook up Nutch for search and Solr for indexing, but I
would like to know how (if it is possible) to surface something similar
to the Nutch result summary in Solr. Should I store the value of the
'content' field in Solr and create the summary from it?
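To make the idea concrete, here is a minimal sketch of what I mean by creating the summary from the stored content. It assumes the 'content' value comes back from Solr as a plain string; the class `SummarySketch` and the method `summarize` are hypothetical names of my own, not part of Nutch or Solr.

```java
/** Hypothetical helper for illustration; not part of Nutch or Solr. */
class SummarySketch {

    /** Case-insensitive search that avoids length changes from toLowerCase(). */
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns a window of at most `width` characters of `content`, roughly
     * centered on the first occurrence of `term`. Falls back to the start of
     * the content when the term is missing. "..." marks truncated ends.
     */
    static String summarize(String content, String term, int width) {
        if (content == null || content.isEmpty() || width < 1) {
            return "";
        }
        // Collapse runs of whitespace so the snippet reads as one line.
        String text = content.replaceAll("\\s+", " ").trim();
        int hit = (term == null || term.isEmpty()) ? -1 : indexOfIgnoreCase(text, term);
        int start = 0;
        if (hit >= 0) {
            int padding = Math.max(0, width - term.length()) / 2;
            start = Math.max(0, hit - padding);
        }
        int end = Math.min(text.length(), start + width);
        StringBuilder sb = new StringBuilder();
        if (start > 0) {
            sb.append("...");
        }
        sb.append(text, start, end);
        if (end < text.length()) {
            sb.append("...");
        }
        return sb.toString();
    }
}
```

With something like this, the search page would call `summarize(storedContent, queryTerm, 200)` for each hit, which is why I am asking whether storing 'content' is the right approach.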
Also, Nutch fetches some links that return a 404 error, and these are
then indexed by Solr. Is there some way that I can filter these results
in the SolrIndexer class before they are indexed? Is it possible to get
the Status, Metadata, or Signature in the SolrIndexer?
The last few fields I mentioned can be seen in a dump of the crawl
database, for example:
http://xxxx.xxxx..com/xxx/xxx-xxx Version: 6
Status: 1 (db_unfetched)
Fetch time: Tue Oct 21 10:45:36 EDT 2008
Modified time: Wed Dec 31 19:00:00 EST 1969
Retries since fetch: 1
Retry interval: 2592000 seconds (30 days)
Score: 7.0573883E-6
Signature: null
Metadata: _pst_:blocked(23), lastModified=0
http://xxxx.xxxx..com/xxx/xxx-xxx Version: 6
Status: 3 (db_gone)
Fetch time: Fri Dec 05 09:04:13 EST 2008
Modified time: Wed Dec 31 19:00:00 EST 1969
Retries since fetch: 0
Retry interval: 3888000 seconds (45 days)
Score: 6.4350065E-4
Signature: null
Metadata: _pst_:notfound(14), lastModified=0: http://xxxx.xxxx..com/xxx/xxx-xxx
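To illustrate the kind of filter I have in mind, here is a rough sketch that works only on the dump text shown above, not on any Nutch API. The class `CrawlDumpFilter` and the method `keepUrls` are hypothetical names, and the record layout (a URL followed by "Version:", then a "Status:" line) is assumed from the dump itself. In the SolrIndexer I would want the equivalent check against whatever status value is available there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; it parses dump text, not Nutch objects. */
class CrawlDumpFilter {

    // First line of a record, e.g. "http://host/path Version: 6".
    private static final Pattern HEAD =
            Pattern.compile("^(\\S+)\\s+Version:\\s*\\d+");

    // Status line, e.g. "Status: 3 (db_gone)"; group 2 is the status name.
    private static final Pattern STATUS =
            Pattern.compile("^Status:\\s*(\\d+)\\s*\\((\\w+)\\)");

    /**
     * Returns the URLs of all records whose status name is NOT in
     * `excluded`, in the order they appear in the dump.
     */
    static List<String> keepUrls(String dump, Set<String> excluded) {
        List<String> kept = new ArrayList<>();
        String url = null; // URL of the record currently being read
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            Matcher head = HEAD.matcher(line);
            if (head.find()) {
                url = head.group(1);
                continue;
            }
            Matcher status = STATUS.matcher(line);
            if (url != null && status.find()) {
                if (!excluded.contains(status.group(2))) {
                    kept.add(url);
                }
                url = null; // ignore the rest of this record
            }
        }
        return kept;
    }
}
```

For the two records above, calling `keepUrls(dump, excluded)` with a set containing only "db_gone" would keep the first URL and drop the second, which is the behavior I would like to have before the documents reach Solr.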
Thank you in advance for your help.
William J Ortiz