I am running Nutch 2.3.1 configured with MongoDB (using Gora) and
Elasticsearch, and would like to add a new field to the storage
database, NOT the index.


I am able to add a field to the Elasticsearch index using a custom
plugin, but I would like to add it to the MongoDB record for each
website.


I've added the field to the ./conf/schema.xml file and to
./conf/gora-mongodb-mapping.xml. The field does appear in the index
but not in the MongoDB record.


Here's a snapshot of my plugin:


public class AddNewField implements IndexingFilter {

  ...

  @Override
  public NutchDocument filter(NutchDocument doc, String url, WebPage page)
      throws IndexingException {
    // adds the new field to the document
    doc.add("mynewField", "HelloWorld");
    return doc;
  }
}

Can this be achieved using a plugin or would I need to modify Nutch's
source code?

Thank you for any assistance you can provide.
