update (or whatever the actual name of the command is) after parsing?

On 25 June 2012 22:35, <[email protected]> wrote:
> Hello,
>
> I have tested nutch-2.0 with hbase and mysql trying to index only one url
> with depth 1.
>
> I tried to fetch an html tag value and parse it to metadata column in
> webpage object by adding parse-tag plugin. I saw there is no metadata
> member variable in Parse class, so I used putToMetadata function from
> Webpage class and it turned out that this function overwrites values for
> the same key, i.e, it keeps only the last tag value if there are multiple
> tags.
>
> Next
>
> bin/nutch solrindex http://127.0.0.1:8983/solr/ -all
> SolrIndexerJob: starting
> SolrIndexerJob: done.
>
> I did
> 1.bin/nutch inject
> 2.bin/nutch generate
> 3.bin/nutch fetch batchId
> 4.bin/nutch parse batchId
> 5.bin/nutch bin/nutch solrindex http://127.0.0.1:8983/solr/ -all
>
> There is no data added to solr index with the url I tried to index.
>
> Besides these, nutch-2.0 keeps content in the content column of webpage
> table if I put in the config
>
> <property>
> <name>fetcher.store.content</name>
> <value>false</value>
> <description>If true, fetcher will store content.</description>
> </property>
>
> Any ideas, what is done wrong or how to fix these issues are welcome.
>
> Thanks.
> Alex.

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

