Re: Adding a new field to Nutch + MongoDB datastore using plugin
Hi,

I have the same issue. I followed all the steps described above, and I can see my custom field in the Elasticsearch index, but I can't see it in MongoDB. I am using Nutch 2.3.1.

Your help is appreciated.

--
View this message in context: http://lucene.472066.n3.nabble.com/Adding-a-new-field-to-Nutch-MongoDB-datastore-using-plugin-tp4269632p4323198.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Re: Adding a new field to Nutch + MongoDB datastore using plugin
Hi jvence,

Please see my reply below.

On Wed, Apr 13, 2016 at 8:26 AM, <user-digest-h...@nutch.apache.org> wrote:

> From: jvence <jve...@gmail.com>
> To: user@nutch.apache.org
> Cc:
> Date: Tue, 12 Apr 2016 10:17:20 -0700 (MST)
> Subject: Adding a new field to Nutch + MongoDB datastore using plugin
>
> I am running Nutch 2.3.1 configured with MongoDB (using Gora) + Elasticsearch
> and would like to add a new field to the storage database NOT the index.

Cool. Please see below.

> I am able to add a field to the elasticsearch index using a custom plugin
> but would like to add it to the mongodb record for each website.
>
> I've added the field to the ./conf/schema.xml file and to

This relates to Solr only. If you have indexer-solr included in plugin.includes, then your field will be added to the index. This has nothing to do with the Gora DataStore, however.

> ./conf/gora-mongodb-mapping.xml - The field does appear in the index but not
> in the mongo record..

In addition to augmenting the mapping file, you need to augment webpage.avsc [0], as this essentially defines the data model you wish to persist into Gora. We call this the persistent class. If you add your data structure (in accordance with the Avro Specification [1]) and then run the following from $NUTCH_HOME, you will be good to go.

    ant generate-gora-src

Any issues, please let us know.

Thanks

[0] https://github.com/apache/nutch/blob/2.x/src/gora/webpage.avsc
[1] https://avro.apache.org/docs/current/spec.html
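To illustrate the webpage.avsc change described in the reply, an extra entry in the "fields" array of the WebPage record might look roughly like the sketch below. The field name myNewField and its optional-string type are assumptions chosen for illustration, not something taken from the thread; the entry itself follows the record-field form from the Avro specification (a union with "null" listed first so that the default can be null).

```json
{
  "name": "myNewField",
  "type": ["null", "string"],
  "default": null,
  "doc": "Hypothetical custom field; the name and type are assumptions for illustration"
}
```

The same field name would then also need a matching entry in ./conf/gora-mongodb-mapping.xml before running ant generate-gora-src. Regenerating should give the WebPage class accessors for the new field, but a plugin presumably still has to set a value on the WebPage for anything to reach MongoDB.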
Adding a new field to Nutch + MongoDB datastore using plugin
I am running Nutch 2.3.1 configured with MongoDB (using Gora) + Elasticsearch and would like to add a new field to the storage database, NOT the index.

I am able to add a field to the Elasticsearch index using a custom plugin, but would like to add it to the MongoDB record for each website.

I've added the field to the ./conf/schema.xml file and to ./conf/gora-mongodb-mapping.xml. The field does appear in the index but not in the Mongo record.

Here's a snapshot of my plugin:

    import org.apache.nutch.indexer.IndexingException;
    import org.apache.nutch.indexer.IndexingFilter;
    import org.apache.nutch.indexer.NutchDocument;
    import org.apache.nutch.storage.WebPage;

    public class AddNewField implements IndexingFilter {
      ...
      @Override
      public NutchDocument filter(NutchDocument doc, String url, WebPage page)
          throws IndexingException {
        // Adds the new field to the index document; the WebPage is not modified here
        doc.add("mynewField", "HelloWorld");
        return doc;
      }
    }

--
View this message in context: http://lucene.472066.n3.nabble.com/Adding-a-new-field-to-Nutch-MongoDB-datastore-using-plugin-tp4269632.html